Introduction to the OpenDocument File Format

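This article describes the internal structure of the OpenDocument file format, including the role of each XML file and how the files are organized. It focuses on how document content is represented in content.xml, how styles are defined in styles.xml, and the benefits of using a ZIP file.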

Format overview

Author: Daniel Carrera

An OpenDocument file is a ZIP archive that contains several XML files. The exact files and directories in the archive depend on the content of the document (images, macros, and so on). A typical document, when unzipped, has the following contents:

content.xml
META-INF/manifest.xml
meta.xml
mimetype
Pictures/
settings.xml
styles.xml
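
Because the container is an ordinary ZIP archive, a listing like the one above can be produced with any ZIP library. The following is a minimal sketch using Python's standard zipfile module. The helper names list_members and build_sample are hypothetical, and the in-memory archive is a tiny stand-in rather than a complete, valid document.

```python
import io
import zipfile


def list_members(odf_bytes):
    """Return the member names stored in an OpenDocument (ZIP) archive."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as archive:
        return archive.namelist()


def build_sample():
    """Build a tiny stand-in archive in memory (not a complete document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        archive.writestr("content.xml", "<placeholder/>")
        archive.writestr("META-INF/manifest.xml", "<placeholder/>")
    return buffer.getvalue()


names = list_members(build_sample())
for name in names:
    print(name)
```

For a real file on disk, zipfile.ZipFile accepts the file path directly in place of the BytesIO object.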

We look at these in turn:

content.xml

This is the most important file. It carries the actual content of the document (except for binary data, like images). The base format is inspired by HTML, and though far more complex, it should be reasonably legible to humans:

<text:h text:style-name="Heading_2">
  This is a title
</text:h>
<text:p text:style-name="Text_body"/>
<text:p text:style-name="Text_body">
  This is a paragraph. The formatting (font,
  colour, etc.) is specified in the Text_body
  style. The empty text:p tag above is a blank
  paragraph (i.e. an empty line).
</text:p>
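
A fragment like this can be read with a namespace-aware XML parser. The sketch below uses Python's xml.etree.ElementTree and assumes the standard OpenDocument text namespace URI. The wrapping office:text element and the shortened paragraph text are assumptions added so that the sample parses on its own, and the function name outline is hypothetical.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Trimmed sample: the wrapper element declares the namespaces that the
# fragment in the article takes for granted.
SAMPLE = f"""<office:text
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="{TEXT_NS}">
  <text:h text:style-name="Heading_2">This is a title</text:h>
  <text:p text:style-name="Text_body"/>
  <text:p text:style-name="Text_body">This is a paragraph.</text:p>
</office:text>"""


def outline(xml_text):
    """Return (kind, style name, text) for every heading and paragraph."""
    root = ET.fromstring(xml_text)
    rows = []
    for element in root.iter():
        if element.tag == f"{{{TEXT_NS}}}h":
            kind = "heading"
        elif element.tag == f"{{{TEXT_NS}}}p":
            kind = "paragraph"
        else:
            continue
        style = element.get(f"{{{TEXT_NS}}}style-name")
        text = "".join(element.itertext()).strip()
        rows.append((kind, style, text))
    return rows


rows = outline(SAMPLE)
for row in rows:
    print(row)
```

The empty paragraph comes back with an empty string as its text, which matches its role as a blank line.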

META-INF/manifest.xml

The manifest file contains a list of all the files in the ZIP archive. The contents might look like this:

<manifest:file-entry
 manifest:media-type="image/png" 
 manifest:full-path="Pictures/10ECF14403.png"/>
<manifest:file-entry 
 manifest:media-type="text/xml" 
 manifest:full-path="content.xml"/>
<manifest:file-entry 
 manifest:media-type="text/xml" 
 manifest:full-path="styles.xml"/>

The manifest lives in a META-INF directory, the same convention that JAR archives use. This is another example of OpenDocument reusing well-established standards instead of reinventing the wheel.
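
As an illustration, the manifest entries can be collected into a path-to-media-type mapping. This is a sketch only: it assumes the standard OpenDocument manifest namespace URI, wraps the entries shown above in a manifest:manifest root element so that they parse, and uses a hypothetical function name manifest_entries.

```python
import xml.etree.ElementTree as ET

MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"

# The three entries from the article, wrapped in a root element.
SAMPLE = f"""<manifest:manifest xmlns:manifest="{MANIFEST_NS}">
  <manifest:file-entry
   manifest:media-type="image/png"
   manifest:full-path="Pictures/10ECF14403.png"/>
  <manifest:file-entry
   manifest:media-type="text/xml"
   manifest:full-path="content.xml"/>
  <manifest:file-entry
   manifest:media-type="text/xml"
   manifest:full-path="styles.xml"/>
</manifest:manifest>"""


def manifest_entries(xml_text):
    """Map each listed path to its declared media type."""
    root = ET.fromstring(xml_text)
    entries = {}
    for entry in root.iter(f"{{{MANIFEST_NS}}}file-entry"):
        path = entry.get(f"{{{MANIFEST_NS}}}full-path")
        entries[path] = entry.get(f"{{{MANIFEST_NS}}}media-type")
    return entries


entries = manifest_entries(SAMPLE)
for path, media_type in entries.items():
    print(path, media_type)
```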

meta.xml

This file contains the document's metadata, such as the author, the name of the person who last modified the document, and the date of last modification. The contents look somewhat like this:

<meta:creation-date>
  2003-09-10T15:31:11
</meta:creation-date>
<dc:creator>Daniel Carrera</dc:creator>
<dc:date>
  2005-06-29T22:02:06
</dc:date>
<dc:language>es-ES</dc:language>
<meta:document-statistic
 meta:table-count="6" meta:object-count="0"
 meta:page-count="59" meta:paragraph-count="676"
 meta:image-count="2" meta:word-count="16701"
 meta:character-count="98757"/>

The dc: element names are taken from the Dublin Core metadata standard, and the dates follow the ISO 8601 standard.
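
Because the dates use the ISO format, they can be parsed without a custom pattern. The sketch below reads a few of the fields shown above. It assumes the standard namespace URIs for the office, meta, and dc prefixes, adds an office:meta wrapper so the sample parses on its own, and uses a hypothetical function name read_meta.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Trimmed version of the sample above, wrapped in a root element.
SAMPLE = """<office:meta
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <meta:creation-date>2003-09-10T15:31:11</meta:creation-date>
  <dc:creator>Daniel Carrera</dc:creator>
  <dc:date>2005-06-29T22:02:06</dc:date>
  <dc:language>es-ES</dc:language>
  <meta:document-statistic meta:page-count="59" meta:word-count="16701"/>
</office:meta>"""


def read_meta(xml_text):
    """Extract the creator, last-modified date and word count."""
    root = ET.fromstring(xml_text)
    stats = root.find("meta:document-statistic", NS)
    word_count_key = "{%s}word-count" % NS["meta"]
    modified_text = root.findtext("dc:date", namespaces=NS).strip()
    return {
        "creator": root.findtext("dc:creator", namespaces=NS).strip(),
        "modified": datetime.fromisoformat(modified_text),
        "words": int(stats.get(word_count_key)),
    }


info = read_meta(SAMPLE)
print(info)
```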

mimetype

This is a one-line file containing the MIME type of the document. For a text document that would be:

application/vnd.oasis.opendocument.text
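
A program can use this file to identify a document without parsing any XML. The following sketch reads the mimetype member from an archive held in memory. The helper names are hypothetical and the sample archive is a stand-in, not a complete document. Storing the mimetype member uncompressed is a choice made here so that the string stays readable in the raw bytes of the archive.

```python
import io
import zipfile

ODF_PREFIX = "application/vnd.oasis.opendocument."


def read_mimetype(odf_bytes):
    """Return the content of the 'mimetype' member as text."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as archive:
        return archive.read("mimetype").decode("ascii").strip()


def build_sample():
    """Build a tiny stand-in archive in memory (not a complete document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Stored without compression, so the string is visible as-is.
        archive.writestr(
            "mimetype",
            "application/vnd.oasis.opendocument.text",
            compress_type=zipfile.ZIP_STORED,
        )
        archive.writestr(
            "content.xml", "<placeholder/>", compress_type=zipfile.ZIP_DEFLATED
        )
    return buffer.getvalue()


sample = build_sample()
mimetype = read_mimetype(sample)
print(mimetype, mimetype.startswith(ODF_PREFIX))
```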

Pictures/

This is a directory that contains images in common image formats such as JPEG and PNG. They are referenced from content.xml in a way similar to the <img> tag in HTML.

<draw:image   
   xlink:href="Pictures/1D67595BF2E.png" 
   xlink:type="simple" xlink:show="embed" 
   xlink:actuate="onLoad"/>
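
To see which pictures a document uses, the xlink:href attributes of the draw:image elements can be collected. This is a minimal sketch that assumes the standard OpenDocument drawing namespace and the W3C XLink namespace. The draw:frame wrapper is added so that the sample parses on its own, and the function name image_paths is hypothetical.

```python
import xml.etree.ElementTree as ET

DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"

# The image element from the article, wrapped so the prefixes are declared.
SAMPLE = f"""<draw:frame xmlns:draw="{DRAW_NS}" xmlns:xlink="{XLINK_NS}">
  <draw:image
     xlink:href="Pictures/1D67595BF2E.png"
     xlink:type="simple" xlink:show="embed"
     xlink:actuate="onLoad"/>
</draw:frame>"""


def image_paths(xml_text):
    """Return the archive paths referenced by draw:image elements."""
    root = ET.fromstring(xml_text)
    return [
        image.get(f"{{{XLINK_NS}}}href")
        for image in root.iter(f"{{{DRAW_NS}}}image")
    ]


paths = image_paths(SAMPLE)
print(paths)
```

Each returned path is relative to the root of the archive, so it can be passed to a ZIP reader to extract the image bytes.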

settings.xml

This file holds settings such as the zoom factor or the cursor position. These are properties that belong to neither content nor layout.

styles.xml

This file contains style information. Styles include things like font size, colour, page width, and any kind of formatting.

OpenDocument provides a strong separation between content (in content.xml) and formatting (in styles.xml). The style types include:

  • Paragraph styles.
  • Page styles.
  • Character styles.
  • Frame styles.
  • List styles.

In OpenDocument all formatting is done through styles. Even "manual" formatting is implemented through styles (the application dynamically makes new styles as needed).
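
The article does not show the contents of styles.xml, so the sample below is a hypothetical, heavily trimmed stand-in. It assumes that named styles are declared as style:style elements carrying style:name and style:family attributes, with character styles using the family value "text". The sketch groups style names by family; the function name styles_by_family is hypothetical.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"

# Hypothetical stand-in for styles.xml; real files carry many more
# attributes and child elements describing the actual formatting.
SAMPLE = f"""<office:styles xmlns:office="{OFFICE_NS}" xmlns:style="{STYLE_NS}">
  <style:style style:name="Text_body" style:family="paragraph"/>
  <style:style style:name="Heading_2" style:family="paragraph"/>
  <style:style style:name="Emphasis" style:family="text"/>
</office:styles>"""


def styles_by_family(xml_text):
    """Group declared style names by their style family."""
    root = ET.fromstring(xml_text)
    families = {}
    for style in root.iter(f"{{{STYLE_NS}}}style"):
        family = style.get(f"{{{STYLE_NS}}}family")
        name = style.get(f"{{{STYLE_NS}}}name")
        families.setdefault(family, []).append(name)
    return families


families = styles_by_family(SAMPLE)
print(families)
```

The names in the "paragraph" group are the values that text:style-name attributes in content.xml refer to.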

Why use a ZIP file?

File size

The most obvious benefit is file size. Because they are compressed, OpenDocument files are normally a lot smaller than equivalent Microsoft .doc files. Furthermore, the longer the document, the greater the benefit from compression.
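
The effect of document length on compression can be illustrated with a rough experiment. The sketch below compresses synthetic markup with zlib, which implements the DEFLATE method commonly used inside ZIP archives. The generated text is an assumption: it is far more repetitive than real prose, so the ratios it yields are optimistic and only the trend is meaningful.

```python
import zlib


def sample_xml(paragraphs):
    """Generate synthetic, highly repetitive markup (not real prose)."""
    line = '<text:p text:style-name="Text_body">Paragraph number {}</text:p>\n'
    return "".join(line.format(i) for i in range(paragraphs)).encode("utf-8")


def compression_ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)


short_ratio = compression_ratio(sample_xml(10))
long_ratio = compression_ratio(sample_xml(1000))
print(f"10 paragraphs:   {short_ratio:.3f}")
print(f"1000 paragraphs: {long_ratio:.3f}")
```

The longer sample gets the smaller ratio, because the repeated tag and attribute names are encoded as references to earlier occurrences.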

Cleaner XML

You can embed binary data (like images) in a way that keeps the XML code clean and understandable. You can put an image in the Pictures directory and refer to it from inside the document with the syntax:

<draw:image
   xlink:href="Pictures/12010E7CF14403.png"
   xlink:type="simple" xlink:show="embed" 
   xlink:actuate="onLoad"/>

If the format didn't use a ZIP file, it would have to embed the binary data directly into the file. That might look somewhat like this:

<image content="AAAAB3NzaC1kc3MAAACBALhIE5ZbPWD
uB44Qo/+DGECA8u1Jl4QdwubYgiweQQX4ZeD6LduuZk+HMW
bfvGpADeOAzS7Aw1nBPbp1F7AKo9LpGBwv/70dX0HE5hm5X
2JKXhzom4M2IPtv9BV7qKXvqdibltAPX6kTWS7Bp/o3krNL
zNsV6zkuMEETFz3Rmt2hAAAAFQCfLFFL0ouPHx3wKtgyeL5
aUO8W+QAAAIBf7MGYYn8ylLOdAs4LX00pQpaAEuwjYalnxy
ZUMpBnBhwjOkY0OH10m3hASu/jnTvbJchm43NK0YyvW1zCa
YGKrUllFrfh4pamr4Ov3zmoL7BUK0zGwowrD6ILd3OroNch
pAetV0YMJ2FkSfPlDuBaddMhymtWoFDLQ9QEzkbaTgAAAIA
sxyNNHT7MN8VaF8GZWLUaq+dl9rj2wgSPYHkDV/EvGqgB0e
GNEXzty3X2GqAg9z10Qj1W2Ua7FnC57kUdSG6B68Ei1Qkv/
N0yGFSZ3xbnP9hxFa2H/2DDPAftuBJT8MUZJHKttXVZ6jJ/
3aXEMBcL8eXw6kWroOR/L7NUxHpvyw"/>

(Though in reality the code would be much longer.)
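
Inline embedding also costs space, not just readability. The sketch below assumes the binary data would be Base64-encoded, which is a common way to place bytes inside XML but is not something the article specifies. The payload is arbitrary filler standing in for image data.

```python
import base64

# 3072 bytes of arbitrary filler standing in for a small image.
payload = bytes(range(256)) * 12

encoded = base64.b64encode(payload)
overhead = len(encoded) / len(payload)

print(len(payload), len(encoded), round(overhead, 3))
```

Base64 emits four output characters for every three input bytes, so the inline form is about a third larger before any compression is applied.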

 