|
This article covers the basics of the OpenDocument format: what the format is, the XML structure of the styles and the content, Microsoft and OpenDocument, applications that support the OpenDocument format, and more.
About the OpenDocument format
by Lars Behrmann

What is the OpenDocument format?
The OpenDocument format is a new free file format for office documents. Unlike the formats used by Microsoft Office, such as .xls and .doc, OpenDocument is not a closed binary format. An OpenDocument file is a simple compressed (ZIP) archive that contains the document styles and the document content as plain XML files. Pictures used within the document are simply added to this archive. One advantage of this solution is that documents in the OpenDocument format are usually much smaller than their binary counterparts from Microsoft Office. Because all styles and content are stored in XML files, there has to be a way to make sure that the XML elements and attributes are used correctly: they have to match an XML schema definition. This schema is developed by the OASIS consortium (Organization for the Advancement of Structured Information Standards), in its technical committee for the Open Document Format for Office Applications, and is freely available for download by everyone.

The structure of the OpenDocument format
The file content
Every document in the OpenDocument format, whether text, spreadsheet, or another type, contains the same set of files and folders in its archive:
- Configurations2 (folder)
- META-INF (folder)
- Pictures (folder)
- Thumbnails (folder)
- preview pictures of the document
- content.xml
- meta.xml
- mimetype
- settings.xml
- styles.xml
This is what you see when you unzip an OpenDocument file to your hard disk. The files and folders are explained in detail below.

The Configurations2 folder
The Configurations2 folder holds additional configuration files. In most cases this folder is empty; even the popular office suite OpenOffice doesn't use it.

The META-INF folder
The META-INF folder contains the manifest.xml file. Put simply, the manifest describes the files and folders of the OpenDocument archive as XML elements, e.g. that there is a content.xml and that there are several pictures of a given image format.

The Pictures folder
As you might guess, this folder contains all pictures that are used within the document.

The Thumbnails folder
The Thumbnails folder contains small preview graphics (thumbnails) of the document.

The content.xml file
The content.xml file holds the actual content that you see when you open the document in an office application. Alongside the content elements it also contains all local style elements used by that content.

The mimetype file
The mimetype file is a plain text file that contains only the OpenDocument type, e.g. application/vnd.oasis.opendocument.text for an OpenDocument text document. Note that this file really has no extension!

The settings.xml file
The settings.xml file describes how the document behaves when it is loaded into an office application, e.g. view settings, printer settings, and forbidden characters.

The styles.xml file
The styles.xml file holds all global style elements used by the document content. In contrast to the local styles in content.xml, these styles affect the whole document, e.g. the default style of tables or the standard style for a paragraph.

The XML structure
This section explains the basic structure of the two most important files, styles.xml and content.xml.
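The archive layout described above can be checked with a few lines of Python. What follows is only a minimal sketch, assuming Python's standard zipfile and xml.etree.ElementTree modules: it builds a tiny stand-in archive in memory (mimetype, META-INF/manifest.xml and two placeholder files; a real document contains much more) and reads it back the way a real .odt file would be read. The helper names build_sample_archive and describe_archive are made up for this illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the elements in META-INF/manifest.xml.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"
ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def build_sample_archive() -> bytes:
    """Build a tiny stand-in for an OpenDocument text file (hypothetical helper)."""
    entries = [("/", ODT_MIMETYPE), ("content.xml", "text/xml"), ("styles.xml", "text/xml")]
    file_entries = "".join(
        f'<manifest:file-entry manifest:media-type="{media}" manifest:full-path="{path}"/>'
        for path, media in entries
    )
    manifest = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<manifest:manifest xmlns:manifest="{MANIFEST_NS}">{file_entries}</manifest:manifest>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype file is written first and stored without compression.
        archive.writestr("mimetype", ODT_MIMETYPE, compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/manifest.xml", manifest)
        # Placeholders only; real files hold the XML shown later in this article.
        archive.writestr("content.xml", "<placeholder/>")
        archive.writestr("styles.xml", "<placeholder/>")
    return buffer.getvalue()


def describe_archive(data: bytes) -> dict:
    """Return the file names, the mimetype and the manifest entries of an archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = archive.namelist()
        mimetype = archive.read("mimetype").decode("ascii").strip()
        root = ET.fromstring(archive.read("META-INF/manifest.xml"))
    manifest = {
        entry.get(f"{{{MANIFEST_NS}}}full-path"): entry.get(f"{{{MANIFEST_NS}}}media-type")
        for entry in root.findall(f"{{{MANIFEST_NS}}}file-entry")
    }
    return {"names": names, "mimetype": mimetype, "manifest": manifest}


if __name__ == "__main__":
    info = describe_archive(build_sample_archive())
    print(info["mimetype"])
    for path, media_type in info["manifest"].items():
        print(path, "->", media_type)
```

To inspect a real document, the bytes of an existing file can be passed instead, e.g. describe_archive(open("some.odt", "rb").read()), where some.odt is a placeholder name.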
The structure of the content.xml file
I think the easiest way to explain the structure of the content and styles XML files is to show the XML layout of these documents.

The content.xml layout:
The XML comments describe the meaning of the XML element that follows them. Note that most office applications remove such XML comments when they save the document. So don't expect to keep your own comments there, e.g. to identify your documents later! I use the comments only for explanation.
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Namespaces used by this document. -->
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
    xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
    xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
    xmlns:number="urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0"
    xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
    xmlns:chart="urn:oasis:names:tc:opendocument:xmlns:chart:1.0"
    xmlns:dr3d="urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0"
    xmlns:math="http://www.w3.org/1998/Math/MathML"
    xmlns:form="urn:oasis:names:tc:opendocument:xmlns:form:1.0"
    xmlns:script="urn:oasis:names:tc:opendocument:xmlns:script:1.0"
    xmlns:ooo="http://openoffice.org/2004/office"
    xmlns:ooow="http://openoffice.org/2004/writer"
    xmlns:oooc="http://openoffice.org/2004/calc"
    xmlns:dom="http://www.w3.org/2001/xml-events"
    xmlns:xforms="http://www.w3.org/2002/xforms"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    office:version="1.0">
  <!-- Scripts used by this document. -->
  <office:scripts />
  <!-- Fonts used by this document. -->
  <office:font-face-decls>
    <style:font-face style:name="StarSymbol" svg:font-family="StarSymbol" style:font-charset="x-symbol" />
    <style:font-face style:name="Tahoma1" svg:font-family="Tahoma" />
    <style:font-face style:name="Lucida Sans Unicode" svg:font-family="'Lucida Sans Unicode'" style:font-pitch="variable" />
    <style:font-face style:name="Tahoma" svg:font-family="Tahoma" style:font-pitch="variable" />
    <style:font-face style:name="Times New Roman" svg:font-family="'Times New Roman'" style:font-family-generic="roman" style:font-pitch="variable" />
  </office:font-face-decls>
  <!-- Local styles used by this document. -->
  <office:automatic-styles>
    <style:style style:name="fr1" style:family="graphic" style:parent-style-name="Graphics">
      <style:graphic-properties style:horizontal-pos="left" style:horizontal-rel="paragraph" style:mirror="none"
          fo:clip="rect(0cm 0cm 0cm 0cm)" draw:luminance="0%" draw:contrast="0%"
          draw:red="0%" draw:green="0%" draw:blue="0%" draw:gamma="100%"
          draw:color-inversion="false" draw:image-opacity="100%" draw:color-mode="standard" />
    </style:style>
  </office:automatic-styles>
  <office:body>
    <!-- The content displayed by this document. -->
    <office:text>
      <office:forms form:automatic-focus="false" form:apply-design-mode="false" />
      <text:sequence-decls>
        <text:sequence-decl text:display-outline-level="0" text:name="Illustration" />
        <text:sequence-decl text:display-outline-level="0" text:name="Table" />
        <text:sequence-decl text:display-outline-level="0" text:name="Text" />
        <text:sequence-decl text:display-outline-level="0" text:name="Drawing" />
      </text:sequence-decls>
      <text:p text:style-name="Standard">
        <draw:frame draw:style-name="fr1" draw:name="graphic1" text:anchor-type="paragraph" svg:width="2.646cm" svg:height="2.646cm" draw:z-index="0">
          <draw:image xlink:href="Pictures/100000000000006400000064D8E5DEA0.png" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad" />
        </draw:frame>
      </text:p>
    </office:text>
  </office:body>
</office:document-content>
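To see how an application might read such a file, here is a minimal sketch, assuming Python's standard xml.etree.ElementTree module. It parses a trimmed copy of the content.xml shown above (only the namespaces and elements needed for the example are kept) and collects the local styles with their parent style names as well as the picture references. The names CONTENT_XML, qname and summarize are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map, copied from the root element of content.xml.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "draw": "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Trimmed copy of the sample document; most attributes are left out.
CONTENT_XML = """<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    office:version="1.0">
  <office:automatic-styles>
    <style:style style:name="fr1" style:family="graphic" style:parent-style-name="Graphics" />
  </office:automatic-styles>
  <office:body>
    <office:text>
      <text:p text:style-name="Standard">
        <draw:frame draw:style-name="fr1" draw:name="graphic1">
          <draw:image xlink:href="Pictures/100000000000006400000064D8E5DEA0.png" />
        </draw:frame>
      </text:p>
    </office:text>
  </office:body>
</office:document-content>
"""


def qname(prefix: str, local: str) -> str:
    """Build the '{namespace}local' form ElementTree uses for attribute names."""
    return f"{{{NS[prefix]}}}{local}"


def summarize(content_xml: bytes):
    """Return (local style name -> parent style name, list of picture paths)."""
    root = ET.fromstring(content_xml)
    local_styles = {
        s.get(qname("style", "name")): s.get(qname("style", "parent-style-name"))
        for s in root.findall("office:automatic-styles/style:style", NS)
    }
    pictures = [
        image.get(qname("xlink", "href"))
        for image in root.iterfind(".//draw:image", NS)
    ]
    return local_styles, pictures


if __name__ == "__main__":
    styles, pictures = summarize(CONTENT_XML.encode("utf-8"))
    print(styles)
    print(pictures)
```

The parent style name found here ("Graphics") is not defined in content.xml itself; it is one of the global styles that live in styles.xml, and the picture path points into the Pictures folder of the archive.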
The structure of the styles.xml file
The styles.xml layout:
As before, the XML comments describe the meaning of the XML element that follows them. Most office applications remove such comments when they save the document, so I use them only for explanation.
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Namespaces used by this document. -->
<office:document-styles
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
    xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
    xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
    xmlns:number="urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0"
    xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
    xmlns:chart="urn:oasis:names:tc:opendocument:xmlns:chart:1.0"
    xmlns:dr3d="urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0"
    xmlns:math="http://www.w3.org/1998/Math/MathML"
    xmlns:form="urn:oasis:names:tc:opendocument:xmlns:form:1.0"
    xmlns:script="urn:oasis:names:tc:opendocument:xmlns:script:1.0"
    xmlns:ooo="http://openoffice.org/2004/office"
    xmlns:ooow="http://openoffice.org/2004/writer"
    xmlns:oooc="http://openoffice.org/2004/calc"
    xmlns:dom="http://www.w3.org/2001/xml-events"
    office:version="1.0">
  <office:font-face-decls>
    <!-- Fonts used by this document. -->
    <style:font-face style:name="StarSymbol" svg:font-family="StarSymbol" style:font-charset="x-symbol" />
    <style:font-face style:name="Tahoma1" svg:font-family="Tahoma" />
    <style:font-face style:name="Lucida Sans Unicode" svg:font-family="'Lucida Sans Unicode'" style:font-pitch="variable" />
    <style:font-face style:name="Tahoma" svg:font-family="Tahoma" style:font-pitch="variable" />
    <style:font-face style:name="Times New Roman" svg:font-family="'Times New Roman'" style:font-family-generic="roman" style:font-pitch="variable" />
  </office:font-face-decls>
  <office:styles>
    <!-- Default graphic styles used by this document. -->
    <style:default-style style:family="graphic">
      <style:graphic-properties draw:shadow-offset-x="0.3cm" draw:shadow-offset-y="0.3cm"
          draw:start-line-spacing-horizontal="0.283cm" draw:start-line-spacing-vertical="0.283cm"
          draw:end-line-spacing-horizontal="0.283cm" draw:end-line-spacing-vertical="0.283cm"
          style:flow-with-text="false" />
      <style:paragraph-properties style:text-autospace="ideograph-alpha" style:line-break="strict" style:writing-mode="lr-tb" style:font-independent-line-spacing="false">
        <style:tab-stops />
      </style:paragraph-properties>
      <style:text-properties style:use-window-font-color="true" fo:font-size="12pt" fo:language="de" fo:country="DE"
          style:font-size-asian="12pt" style:language-asian="none" style:country-asian="none"
          style:font-size-complex="12pt" style:language-complex="none" style:country-complex="none" />
    </style:default-style>
    <!-- Default paragraph styles used by this document. -->
    <style:default-style style:family="paragraph">
      <style:paragraph-properties fo:hyphenation-ladder-count="no-limit" style:text-autospace="ideograph-alpha" style:punctuation-wrap="hanging"
          style:line-break="strict" style:tab-stop-distance="1.251cm" style:writing-mode="page" />
      <style:text-properties style:use-window-font-color="true" style:font-name="Times New Roman" fo:font-size="12pt" fo:language="de" fo:country="DE"
          style:font-name-asian="Lucida Sans Unicode" style:font-size-asian="12pt" style:language-asian="none" style:country-asian="none"
          style:font-name-complex="Tahoma" style:font-size-complex="12pt" style:language-complex="none" style:country-complex="none"
          fo:hyphenate="false" fo:hyphenation-remain-char-count="2" fo:hyphenation-push-char-count="2" />
    </style:default-style>
    <!-- Default table styles used by this document. -->
    <style:default-style style:family="table">
      <style:table-properties table:border-model="collapsing" />
    </style:default-style>
    <!-- Default table row styles used by this document. -->
    <style:default-style style:family="table-row">
      <style:table-row-properties fo:keep-together="auto" />
    </style:default-style>
    <!-- Default text style standard used by this document. -->
    <style:style style:name="Standard" style:family="paragraph" style:class="text" />
    <!-- Default body text style used by this document. -->
    <style:style style:name="Text_20_body" style:display-name="Text body" style:family="paragraph" style:parent-style-name="Standard" style:class="text">
      <style:paragraph-properties fo:margin-top="0cm" fo:margin-bottom="0.212cm" />
    </style:style>
    <!-- Default heading styles used by this document. -->
    <style:style style:name="Heading" style:family="paragraph" style:parent-style-name="Standard" style:next-style-name="Text_20_body" style:class="text">
      <style:paragraph-properties fo:margin-top="0.423cm" fo:margin-bottom="0.212cm" fo:keep-with-next="always" />
      <style:text-properties fo:font-size="14pt" style:font-size-asian="14pt" style:font-size-complex="14pt" />
    </style:style>
    <!-- Default heading 1 styles used by this document. -->
    <style:style style:name="Heading_20_1" style:display-name="Heading 1" style:family="paragraph" style:parent-style-name="Heading" style:class="text" style:default-outline-level="1">
      <style:text-properties fo:font-size="115%" fo:font-weight="bold" style:font-size-asian="115%" style:font-weight-asian="bold" style:font-size-complex="115%" style:font-weight-complex="bold" />
    </style:style>
    <!-- Default heading 2 styles used by this document. -->
    <style:style style:name="Heading_20_2" style:display-name="Heading 2" style:family="paragraph" style:parent-style-name="Heading" style:class="text" style:default-outline-level="2">
      <style:text-properties fo:font-size="14pt" fo:font-style="italic" fo:font-weight="bold"
          style:font-size-asian="14pt" style:font-style-asian="italic" style:font-weight-asian="bold"
          style:font-size-complex="14pt" style:font-style-complex="italic" style:font-weight-complex="bold" />
    </style:style>
    <!-- Default heading 3 styles used by this document. -->
    <style:style style:name="Heading_20_3" style:display-name="Heading 3" style:family="paragraph" style:parent-style-name="Heading" style:class="text" style:default-outline-level="3">
      <style:text-properties fo:font-size="14pt" fo:font-weight="bold" style:font-size-asian="14pt" style:font-weight-asian="bold" style:font-size-complex="14pt" style:font-weight-complex="bold" />
    </style:style>
    <!-- Default heading 4 styles used by this document. -->
    <style:style style:name="Heading_20_4" style:display-name="Heading 4" style:family="paragraph" style:parent-style-name="Heading" style:class="text" style:default-outline-level="4">
      <style:text-properties fo:font-size="85%" fo:font-style="italic" fo:font-weight="bold"
          style:font-size-asian="85%" style:font-style-asian="italic" style:font-weight-asian="bold"
          style:font-size-complex="85%" style:font-style-complex="italic" style:font-weight-complex="bold" />
    </style:style>
    <!-- Default heading 5 styles used by this document. -->
    <style:style style:name="Heading_20_5" style:display-name="Heading 5" style:family="paragraph" style:parent-style-name="Heading" style:class="text">
      <style:text-properties fo:font-size="85%" fo:font-weight="bold" style:font-size-asian="85%" style:font-weight-asian="bold" style:font-size-complex="85%" style:font-weight-complex="bold" />
    </style:style>
    <!-- Default heading 6 styles used by this document. -->
    <style:style style:name="Heading_20_6" style:display-name="Heading 6" style:family="paragraph" style:parent-style-name="Heading" style:class="text">
      <style:text-properties fo:font-size="75%" fo:font-weight="bold" style:font-size-asian="75%" style:font-weight-asian="bold" style:font-size-complex="75%" style:font-weight-complex="bold" />
    </style:style>
    <!-- Default heading 7 styles used by this document. -->
    <style:style style:name="Heading_20_7" style:display-name="Heading 7" style:family="paragraph" style:parent-style-name="Heading" style:class="text">
      <style:text-properties fo:font-size="75%" fo:font-weight="bold" style:font-size-asian="75%" style:font-weight-asian="bold" style:font-size-complex="75%" style:font-weight-complex="bold" />
    </style:style>
    <!-- Default heading 8 styles used by this document. -->
    <style:style style:name="Heading_20_8" style:display-name="Heading 8" style:family="paragraph" style:parent-style-name="Heading" style:class="text">
      <style:text-properties fo:font-size="75%" fo:font-weight="bold" style:font-size-asian="75%" style:font-weight-asian="bold" style:font-size-complex="75%" style:font-weight-complex="bold" />
    </style:style>
    <!-- Default heading 9 styles used by this document. -->
    <style:style style:name="Heading_20_9" style:display-name="Heading 9" style:family="paragraph" style:parent-style-name="Heading" style:class="text">
      <style:text-properties fo:font-size="75%" fo:font-weight="bold" style:font-size-asian="75%" style:font-weight-asian="bold" style:font-size-complex="75%" style:font-weight-complex="bold" />
    </style:style>
    <!-- Default heading 10 styles used by this document. -->
    <style:style style:name="Heading_20_10" style:display-name="Heading 10" style:family="paragraph" style:parent-style-name="Heading" style:next-style-name="Text_20_body" style:class="text">
      <style:text-properties fo:font-size="75%" fo:font-weight="bold" style:font-size-asian="75%" style:font-weight-asian="bold" style:font-size-complex="75%" style:font-weight-complex="bold" />
    </style:style>
    <!-- Default list text styles used by this document. -->
    <style:style style:name="List" style:family="paragraph" style:parent-style-name="Text_20_body" style:class="list">
      <style:text-properties style:font-name-complex="Tahoma1" />
    </style:style>
    <!-- Default table content styles used by this document. -->
    <style:style style:name="Table_20_Contents" style:display-name="Table Contents" style:family="paragraph" style:parent-style-name="Standard" style:class="extra">
      <style:paragraph-properties text:number-lines="false" text:line-number="0" />
    </style:style>
    <!-- Default table heading styles used by this document. -->
    <style:style style:name="Table_20_Heading" style:display-name="Table Heading" style:family="paragraph" style:parent-style-name="Table_20_Contents" style:class="extra">
      <style:paragraph-properties fo:text-align="center" style:justify-single-word="false" text:number-lines="false" text:line-number="0" />
      <style:text-properties fo:font-style="italic" fo:font-weight="bold" style:font-style-asian="italic" style:font-weight-asian="bold" style:font-style-complex="italic" style:font-weight-complex="bold" />
    </style:style>
    <!-- Default caption styles used by this document. -->
    <style:style style:name="Caption" style:family="paragraph" style:parent-style-name="Standard" style:class="extra">
      <style:paragraph-properties fo:margin-top="0.212cm" fo:margin-bottom="0.212cm" text:number-lines="false" text:line-number="0" />
      <style:text-properties fo:font-size="10pt" fo:font-style="italic" style:font-size-asian="10pt" style:font-style-asian="italic"
          style:font-name-complex="Tahoma1" style:font-size-complex="10pt" style:font-style-complex="italic" />
    </style:style>
    <!-- Default index styles used by this document. -->
    <style:style style:name="Index" style:family="paragraph" style:parent-style-name="Standard" style:class="index">
      <style:paragraph-properties text:number-lines="false" text:line-number="0" />
      <style:text-properties style:font-name-complex="Tahoma1" />
    </style:style>
    <!-- Default list bullet styles used by this document. -->
    <style:style style:name="Bullet_20_Symbols" style:display-name="Bullet Symbols" style:family="text">
      <style:text-properties style:font-name="StarSymbol" fo:font-size="9pt" style:font-name-asian="StarSymbol" style:font-size-asian="9pt" style:font-name-complex="StarSymbol" style:font-size-complex="9pt" />
    </style:style>
    <!-- Default graphic styles used by this document. -->
    <style:style style:name="Graphics" style:family="graphic">
      <style:graphic-properties text:anchor-type="paragraph" svg:x="0cm" svg:y="0cm" style:wrap="none"
          style:vertical-pos="top" style:vertical-rel="paragraph" style:horizontal-pos="center" style:horizontal-rel="paragraph" />
    </style:style>
    <!-- Default outline level styles used with text by this document. -->
    <text:outline-style>
      <text:outline-level-style text:level="1" style:num-format="">
        <style:list-level-properties text:min-label-distance="0.381cm" />
      </text:outline-level-style>
      <text:outline-level-style text:level="2" style:num-format="">
        <style:list-level-properties text:min-label-distance="0.381cm" />
      </text:outline-level-style>
      <text:outline-level-style text:level="3" style:num-format="">
        <style:list-level-properties text:min-label-distance="0.381cm" />
      </text:outline-level-style>
      <text:outline-level-style text:level="4" style:num-format="">
        <style:list-level-properties text:min-label-distance="0.381cm" />
      </text:outline-level-style>
      <text:outline-level-style text:level="5" style:num-format="">
        <style:list-level-properties text:min-label-distance="0.381cm" />
      </text:outline-level-style>
      <text:outline-level-style text:level="6" style:num-format="">
        <style:list-level-properties text:min-label-distance="0.381cm" />
      </text:outline-level-style>
      <text:outline-level-style text:level="7" style:num-format="">
        <style:list-level-properties text:min-label-distance="0.381cm" />
      </text:outline-level-style>
      <text:outline-level-style text:level="8" style:num-format="">
        <style:list-level-properties text:min-label-distance="0.381cm" />
      </text:outline-level-style>
      <text:outline-level-style text:level="9" style:num-format="">
        <style:list-level-properties text:min-label-distance="0.381cm" />
      </text:outline-level-style>
      <text:outline-level-style text:level="10" style:num-format="">
        <style:list-level-properties text:min-label-distance="0.381cm" />
      </text:outline-level-style>
    </text:outline-style>
    <!-- Default footnote and endnote styles used by this document. -->
    <text:notes-configuration text:note-class="footnote" style:num-format="1" text:start-value="0" text:footnotes-position="page" text:start-numbering-at="document" />
    <text:notes-configuration text:note-class="endnote" style:num-format="i" text:start-value="0" />
    <!-- Default line numbering style used by this document.
    -->
    <text:linenumbering-configuration text:number-lines="false" text:offset="0.499cm" style:num-format="1" text:number-position="left" text:increment="5" />
  </office:styles>
  <office:automatic-styles>
    <!-- Default page layout style used by this document. -->
    <style:page-layout style:name="pm1">
      <style:page-layout-properties fo:page-width="20.999cm" fo:page-height="29.699cm" style:num-format="1" style:print-orientation="portrait"
          fo:margin-top="2cm" fo:margin-bottom="2cm" fo:margin-left="2cm" fo:margin-right="2cm"
          style:writing-mode="lr-tb" style:footnote-max-height="0cm">
        <style:footnote-sep style:width="0.018cm" style:distance-before-sep="0.101cm" style:distance-after-sep="0.101cm" style:adjustment="left" style:rel-width="25%" style:color="#000000" />
      </style:page-layout-properties>
      <style:header-style />
      <style:footer-style />
    </style:page-layout>
  </office:automatic-styles>
  <office:master-styles>
    <style:master-page style:name="Standard" style:page-layout-name="pm1" />
  </office:master-styles>
</office:document-styles>

Which office suites offer OpenDocument support?
More and more office suites support OpenDocument or make it their standard office document format. Here is a short list of the most important office suites with OpenDocument support. Yes, I know that there are more office suites that support OpenDocument, but I think it's enough to list the most important ones. Besides, the other applications often support only a part of the OpenDocument format.

Which programming libraries exist for the OpenDocument format?
So far, there are not many libraries available for OpenDocument. Besides my AODL, an OpenDocument library for C# and .NET, there is a Perl module for OpenDocument support.

Microsoft and the OpenDocument format
I think one of the most important questions about OpenDocument and Microsoft is whether Microsoft Office will support OpenDocument in future versions. So far, no support for OpenDocument is planned.
Instead of supporting the OpenDocument format, Microsoft has developed its own XML document format called Open XML. The basic idea of Open XML is the same as that of OpenDocument: write an XML schema that defines the structure of XML documents which can be displayed and edited by an office application. But there are some differences. Microsoft's XML schema is also freely available, but what does "free" mean here? In this case, the license you have to comply with if you plan to use this format seems to be incompatible with all known open source license models. Maybe I'm wrong about this; feel free to correct me if so. In any case, I think many open source projects would offer support for both Open XML and OpenDocument, but until the license question is settled, Open XML will go unused by most of these projects. Still, it wouldn't surprise me if, after one of the next big releases of Windows Vista and the next version of Microsoft Office, Microsoft implemented OpenDocument support in its Office products. But what should Windows users do until that day? One answer could be AODC, an OpenDocument converter.

AODC - An OpenDocument converter
AODC is a free OpenDocument converter written in C# for .NET. It's a small standalone application that can also be started from removable media such as USB sticks. The goal of AODC is to narrow the gap between the OpenDocument format and Microsoft users who have no application with OpenDocument support installed. To that end, it converts documents in the OpenDocument text format into HTML files that are optimized for display in all popular browsers and in Microsoft Word from version 2002 on. The next OpenDocument format that will be available for conversion is the spreadsheet format. More information and the download of AODC can be found at http://aodc.opendocument4all.com

Lars Behrmann,
26. Dec. 2005 |