I'm trying to create an OpenOffice spreadsheet programmatically, but simply compressing a folder with all the necessary files makes OpenOffice flag the file as corrupted.
How did I get here? I started by creating a normal spreadsheet in OpenOffice with some values in it. After saving, I changed the extension to .zip, extracted it, and made a copy of the resulting folder. I then compressed the second folder using command-line zip and changed the file extension to .ods. When I try to open the resulting file, OpenOffice reports that the file is corrupt.
Does OpenOffice use a special compression algorithm? Running "file test.ods" shows it as a compressed zip, so what does OpenOffice add during the compression routine to make it work?
Section 17 of the OASIS OpenDocument specification defines how OpenDocument packages need to be packaged.
Section 17.4, MIME Type Stream, reads like this:
If a MIME type for a document that makes use of packages is existing, then the package SHOULD contain a stream called "mimetype". This stream SHOULD be first stream of the package's zip file, it MUST NOT be compressed, and it MUST NOT use an 'extra field' in its header (see [ZIP]).
The purpose is to allow packaged files to be identified through 'magic number' mechanisms, such as Unix's file/magic utility. If a ZIP file contains a stream at the beginning of the file that is uncompressed, and has no extra data in the header, then the stream name and the stream content can be found at fixed positions. More specifically, one will find:
- a string 'PK' at position 0 of all zip files
- a string 'mimetype' at position 30 of all such package files
- the mimetype itself at position 38 of such a package.
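Those fixed positions are easy to check from a shell. This is only a minimal sketch, assuming POSIX dd and od are available; odf_mimetype and byte_at are helper names made up for this example, and the length is read from the low byte of the size field in the ZIP local file header (enough for MIME types shorter than 256 bytes):

```sh
#!/bin/sh
# Print the decimal value of the single byte at offset $2 of file $1.
byte_at() { od -An -tu1 -j "$2" -N 1 "$1" | tr -d ' \n'; }

# Hypothetical helper: print the package MIME type if "mimetype" is the
# first entry, stored uncompressed and without an extra field.
odf_mimetype() {
    f=$1
    # 'PK' at position 0 of every zip file.
    [ "$(dd if="$f" bs=1 count=2 2>/dev/null)" = "PK" ] || return 1
    # 'mimetype' at position 30.
    [ "$(dd if="$f" bs=1 skip=30 count=8 2>/dev/null)" = "mimetype" ] || return 1
    # Compression method (offset 8) must be 0, i.e. stored.
    [ "$(byte_at "$f" 8)" = "0" ] || return 1
    # Extra field length (offset 28, low byte) must be 0.
    [ "$(byte_at "$f" 28)" = "0" ] || return 1
    # Uncompressed size (offset 22, low byte) is the MIME string length.
    len=$(byte_at "$f" 22)
    # The MIME type itself starts at position 38.
    dd if="$f" bs=1 skip=38 count="$len" 2>/dev/null
}
```

Running odf_mimetype test.ods on a file saved by OpenOffice should print the MIME type. On an archive made with a plain zip -r it will most likely return non-zero, because mimetype is compressed, carries an extra field, or is not the first entry.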
I tried tokland's suggestion, but I found that LibreOffice 4 requires a specific order (perhaps only for the first entries?):
mimetype (uncompressed)
meta.xml
settings.xml
content.xml
Thumbnails/thumbnail.png
Configurations2/images/Bitmaps/
Configurations2/popupmenu/
Configurations2/toolpanel/
Configurations2/statusbar/
Configurations2/progressbar/
Configurations2/toolbar/
Configurations2/menubar/
Configurations2/accelerator/current.xml
Configurations2/floater/
styles.xml
META-INF/manifest.xml
I wrote a script to do that, folder2od.sh:
#!/bin/sh
# Convert folder (unzipped OpenDocument file) to OpenDocument file (odt, ods, etc.)
# Usage: ./folder2od.sh "path/to/folder" "file.odt"
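# Remember where we started and resolve the output file to an absolute path
# before changing directory.
cmdfolder=$(cd "$(dirname "$0")" && pwd -P)
folder=$(cd "$(dirname "$2")" && pwd -P)
file=$(basename "$2")
absfile="$folder/$file"
cd "$1" || exit 1
# Start from a clean archive so that mimetype really is the first entry.
rm -f "$absfile"
# mimetype must be stored uncompressed (-0) and without extra fields (-X).
zip -0 -X "$absfile" "mimetype"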
list=$(cat <<'END_HEREDOC'
meta.xml
settings.xml
content.xml
Thumbnails/thumbnail.png
Configurations2/images/Bitmaps/
Configurations2/popupmenu/
Configurations2/toolpanel/
Configurations2/statusbar/
Configurations2/progressbar/
Configurations2/toolbar/
Configurations2/menubar/
Configurations2/accelerator/current.xml
Configurations2/floater/
styles.xml
META-INF/manifest.xml
END_HEREDOC
)
for f in $list
do
    zip "$absfile" "$f"
done
cd "$cmdfolder"
I found some useful information here: http://www.jejik.com/articles/2010/03/how_to_correctly_create_odf_documents_using_zip/
The shell script worked for me, too :) I had problems zipping an odt file back up after unzipping it. I guess the manifest part was what was missing.
The shell script above did not handle inline pictures/graphics, however, so I made some small adjustments which worked for me (also, the script had a bug in that END_HEREDOC was not on a dedicated line):
#!/bin/sh
# Convert folder (unzipped OpenDocument file) to OpenDocument file (odt, ods, etc.)
# Usage: ./folder2od.sh "path/to/folder" "file.odt"
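# Remember where we started and resolve the output file to an absolute path
# before changing directory.
cmdfolder=$(cd "$(dirname "$0")" && pwd -P)
folder=$(cd "$(dirname "$2")" && pwd -P)
file=$(basename "$2")
absfile="$folder/$file"
cd "$1" || exit 1
# Start from a clean archive so that mimetype really is the first entry.
rm -f "$absfile"
# mimetype must be stored uncompressed (-0) and without extra fields (-X).
zip -0 -X "$absfile" "mimetype"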
list=$(cat <<'END_HEREDOC'
meta.xml
settings.xml
content.xml
Pictures/
Thumbnails/
Configurations2/
styles.xml
manifest.rdf
META-INF/manifest.xml
END_HEREDOC
)
for f in $list
do
    zip -r "$absfile" "$f"
done
cd "$cmdfolder"
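If you don't want to maintain a hardcoded file list, the two steps can be written generically. This is only a sketch, assuming Info-ZIP's zip is installed and that the consumer only cares about mimetype being first and stored; it does not reproduce the exact entry order listed earlier, so it may not be enough if your LibreOffice version really insists on that order. pack_odf is a name made up for this example:

```sh
#!/bin/sh
# Hypothetical helper. Usage: pack_odf path/to/folder file.ods
pack_odf() {
    src=$1
    # Resolve the output path before changing directory.
    out="$(cd "$(dirname "$2")" && pwd -P)/$(basename "$2")"
    # Remove any stale archive so mimetype ends up as the first entry.
    rm -f "$out"
    (
        cd "$src" || exit 1
        # First entry: mimetype, stored (-0), no extra fields (-X).
        zip -q -0 -X "$out" mimetype || exit 1
        # Everything else, compressed, recursing into subfolders.
        zip -q -r -X "$out" * -x mimetype
    )
}
```

For example, pack_odf unpacked_folder test.ods writes test.ods to the current directory. The subshell keeps the cd from leaking into the caller, so there is no need to cd back afterwards.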