XML基础知识

本文介绍了XML的基础知识,包括XML的设计理念、与HTML的区别、基本语法、文档结构及常用规范等。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

大部分内容来自http://www.w3schools.com

本文主要是个人学习笔记,将替代手写的笔记。以此记录自己的学习过程,记录不懂的,可以逐渐增加新知识点。


What is XML?

  • XML stands for EXtensible Markup Language
  • XML is a markup language much like HTML
  • XML was designed to carry data, not to display data
  • XML tags are not predefined. You must define your own tags
  • XML is designed to be self-descriptive
  • XML is a W3C Recommendation

The Difference Between XML and HTML

XML is not a replacement for HTML.

XML and HTML were designed with different goals:

  • XML was designed to transport and store data, with focus on what data is
  • HTML was designed to display data, with focus on how data looks

HTML is about displaying information, while XML is about carrying information.

An Example XML Document

XML documents use a self-describing and simple syntax:

<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

The first line is the XML declaration. It defines the XML version (1.0) and the encoding used (ISO-8859-1 = Latin-1/West European character set).

The next line describes the root element of the document (like saying: "this document is a note"):

The next 4 lines describe 4  child elements  of the root (to, from, heading, and body):

XML Documents Form a Tree Structure

DOM node tree
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>


XML Syntax Rules

All XML Elements Must Have a Closing Tag

In XML, it is illegal to omit the closing tag. All elements  must  have a closing tag:

XML Tags are Case Sensitive

XML tags are case sensitive. The tag <Letter> is different from the tag <letter>.

XML Elements Must be Properly Nested

In XML, all elements  must  be properly nested within each other:
<b><i>This text is bold and italic</i></b>

In the example above, "Properly nested" simply means that since the <i> element is opened inside the <b> element, it must be closed inside the <b> element.

XML Documents Must Have a Root Element

XML documents must contain one element that is the  parent  of all other elements. This element is called the  root  element.
<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>

XML Attribute Values Must be Quoted

XML elements can have attributes in name/value pairs just like in HTML.

In XML, the attribute values must always be quoted.

Study the two XML documents below. The first one is incorrect, the second is correct:

<note date=12/11/2007>
  <to>Tove</to>
  <from>Jani</from>
</note>

<note date="12/11/2007">
  <to>Tove</to>
  <from>Jani</from>
</note>

Entity References

Some characters have a special meaning in XML.

If you place a character like "<" inside an XML element, it will generate an error because the parser interprets it as the start of a new element.

This will generate an XML error:

<message>if salary < 1000 then</message>

To avoid this error, replace the "<" character with an  entity reference :
<message>if salary < 1000 then</message>

There are 5 predefined entity references in XML:
&lt; < less than
&gt; > greater than
&amp; & ampersand 
&apos; ' apostrophe
&quot; " quotation mark

Comments in XML

The syntax for writing comments in XML is similar to that of HTML.

<!-- This is a comment -->

White-space is Preserved in XML

HTML truncates multiple white-space characters to one single white-space:
HTML: Hello           Tove
Output: Hello Tove
With XML, the white-space in a document is not truncated.

XML Stores New Line as LF

In Windows applications, a new line is normally stored as a pair of characters: carriage return (CR) and line feed (LF). In Unix applications, a new line is normally stored as an LF character. Macintosh applications also use an LF to store a new line.

XML stores a new line as LF.


XML Elements

What is an XML Element?

An XML element is everything from (including) the element's start tag to (including) the element's end tag.

An element can contain:

  • other elements
  • text
  • attributes
  • or a mix of all of the above...

XML Naming Rules

XML elements must follow these naming rules:

  • Names can contain letters, numbers, and other characters
  • Names cannot start with a number or punctuation character
  • Names cannot start with the letters xml (or XML, or Xml, etc)
  • Names cannot contain spaces

Any name can be used, no words are reserved.


Best Naming Practices

Make names descriptive. Names with an underscore separator are nice: <first_name>, <last_name>.

Names should be short and simple, like this: <book_title> not like this: <the_title_of_the_book>.

Avoid "-" characters. If you name something "first-name," some software may think you want to subtract name from first.

Avoid "." characters. If you name something "first.name," some software may think that "name" is a property of the object "first."

Avoid ":" characters. Colons are reserved to be used for something called namespaces (more later).

XML documents often have a corresponding database. A good practice is to use the naming rules of your database for the elements in the XML documents.

Non-English letters like éòá are perfectly legal in XML, but watch out for problems if your software vendor doesn't support them.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值