HTML Character Sets

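This article traces the history of character sets for HTML pages, from early ASCII through Windows-1252 and ISO-8859-1 to UTF-8, which HTML5 recommends. UTF-8 resolves many character encoding problems and has become the standard for modern web pages.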



To display an HTML page correctly, the browser must know what character set (encoding) to use:

Example

<meta charset="UTF-8">

The HTML5 specification encourages web developers to use the UTF-8 character set.

This has not always been the case. The character encoding for the early web was ASCII.

Later, from HTML 2.0 to HTML 4.01, ISO-8859-1 was considered the standard character set.

With XML and HTML5, UTF-8 finally arrived and solved a lot of character encoding problems.
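To make the kind of problem concrete, here is a minimal Python sketch of what happens when text saved as UTF-8 is read with the wrong character set. The sample string is an assumption chosen for illustration; `cp1252` is Python's name for Windows-1252.

```python
# Text as the author typed it, saved to disk as UTF-8.
original = "café €5"
raw_bytes = original.encode("utf-8")

# A reader that wrongly assumes Windows-1252 produces garbled text.
wrong = raw_bytes.decode("cp1252")

# A reader that uses the correct encoding recovers the text exactly.
right = raw_bytes.decode("utf-8")

print(wrong)
print(right)
```

The bytes are identical in both cases; only the character set used to interpret them differs, which is why a page needs to declare its encoding.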

In the Beginning: ASCII

Computers store data as binary codes (for example, 01000101).

To standardize how text is stored, the American Standard Code for Information Interchange (ASCII) was created. It defines a unique binary number for each character it supports: the digits 0-9, the upper and lower case English alphabet (a-z, A-Z), and special characters like ! $ + - ( ) @ < > , .

Because ASCII uses 7 bits per character, it can represent only 128 different characters.

The biggest weakness of ASCII is that it excludes non-English letters.
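The 7-bit limit can be checked directly. The sketch below uses only Python built-ins; the helper name `fits_in_ascii` is hypothetical and introduced here for illustration.

```python
# Every ASCII character has a code below 128, so it fits in 7 bits.
for ch in ("E", "a", "9", "@"):
    print(ch, ord(ch), format(ord(ch), "07b"))

# 'E' has code 69; padded to a full byte this is the 01000101 pattern.
ascii_bits = format(ord("E"), "07b")


def fits_in_ascii(text: str) -> bool:
    """Return True if every character of text can be encoded as ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(fits_in_ascii("Hello!"))
print(fits_in_ascii("naïve"))
```

Any accented or non-Latin letter makes the second check fail, which is the limitation the later character sets set out to remove.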

ASCII is still in use today: its 128 characters keep the same codes in Windows-1252, ISO-8859-1, and UTF-8.

For a closer look, see the Complete ASCII Reference.

In Windows: Windows-1252

Windows-1252 was the default character set in Windows, up to Windows 95.

It is an extension to ASCII, with added international characters. It uses a full byte (8 bits) per character, so it can encode up to 256 different characters.

Since Windows-1252 has been the default in Windows, it is supported by all browsers.
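As a small illustration of the one-byte-per-character design, the following Python sketch encodes a string with accented letters as Windows-1252 (`cp1252` in Python). The sample text is an assumption chosen for illustration.

```python
text = "Ærø café"
encoded = text.encode("cp1252")

# One byte per character: the byte count equals the character count.
one_byte_each = len(encoded) == len(text)

# The international characters use the upper half of the byte range (0x80-0xFF),
# while the plain English letters keep their ASCII codes.
high_bytes = [hex(b) for b in encoded if b >= 0x80]

print(one_byte_each)
print(high_bytes)
```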

For a closer look, see the Complete Windows-1252 Reference.

In HTML 4: ISO-8859-1

The character set most often used in HTML 4 was ISO-8859-1.

ISO-8859-1 is an extension to ASCII, with added international characters.

Example

<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">

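In HTML 4, a character set different from ISO-8859-1 can be specified in the <meta> tag:

Example (ISO-8859-8 is used here only as an illustration)

<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-8">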

All HTML 4 processors also support UTF-8:

Example

<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">

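When a browser detects ISO-8859-1, it normally treats the page as Windows-1252, because Windows-1252 assigns printable characters (such as the euro sign and curly quotes) to most of the 32 code positions, 0x80 to 0x9F, that ISO-8859-1 leaves to control codes.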
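The difference between the two character sets is confined to the byte range 0x80 to 0x9F, which the following Python sketch inspects. It relies on Python's `cp1252` and `latin-1` codecs; note that Python's `cp1252` codec treats a few positions in this range as undefined, so the loop skips them.

```python
# Collect the bytes in 0x80-0x9F that Windows-1252 maps to a printable character.
printable_in_cp1252 = {}
for byte in range(0x80, 0xA0):
    try:
        printable_in_cp1252[byte] = bytes([byte]).decode("cp1252")
    except UnicodeDecodeError:
        pass  # position left undefined by Python's cp1252 codec

# The same byte means different things in the two character sets.
euro_cp1252 = bytes([0x80]).decode("cp1252")    # euro sign
euro_latin1 = bytes([0x80]).decode("latin-1")   # an invisible control code

print(len(printable_in_cp1252))
print("".join(printable_in_cp1252.values()))
```

Outside that range the two encodings agree byte for byte, which is why decoding an ISO-8859-1 page as Windows-1252 loses nothing visible.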

For a closer look, see the Complete ISO-8859-1 Reference.

In HTML5: Unicode UTF-8

The HTML5 specification encourages web developers to use the UTF-8 character set.

Example

<meta charset="UTF-8">

A character set different from UTF-8 can be specified in the <meta> tag:

Example (ISO-8859-1 is used here only as an illustration)

<meta charset="ISO-8859-1">
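To show how a program might read such a declaration, here is a simplified Python sketch that looks for a charset in a <meta> tag, in either the HTML 4 or the HTML5 form. It is not the detection algorithm browsers actually use; the function name `declared_charset`, the regular expression, the 1024-byte search window, and the UTF-8 fallback are all assumptions made for illustration.

```python
import re

# Matches charset=... inside a <meta> tag, with or without quotes around the value.
META_CHARSET = re.compile(rb"<meta[^>]+charset=[\"']?([\w-]+)", re.IGNORECASE)


def declared_charset(html_bytes: bytes, default: str = "utf-8") -> str:
    """Return the charset named in a <meta> tag, or the default if none is found."""
    # Only the start of the document is searched (assumed window: 1024 bytes).
    match = META_CHARSET.search(html_bytes[:1024])
    if match is None:
        return default
    return match.group(1).decode("ascii").lower()


html5_page = b'<meta charset="UTF-8">'
html4_page = b'<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">'

print(declared_charset(html5_page))
print(declared_charset(html4_page))
```

The search runs on raw bytes because the declaration has to be found before the document can be decoded; this works since the tag itself consists of ASCII characters only.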

The Unicode Consortium develops the Unicode Standard, which defines the UTF-8 and UTF-16 encodings, because the ISO-8859 character sets are limited and not compatible with a multilingual environment.

The Unicode Standard covers (almost) all the characters, punctuation marks, and symbols in the world.
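A short Python sketch shows how UTF-8 and UTF-16 fit this much larger repertoire into bytes: characters take a variable number of bytes instead of a fixed one. The sample characters are assumptions chosen to cover different scripts.

```python
samples = ["A", "é", "€", "中", "😀"]

# UTF-8 uses 1 to 4 bytes per character.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

# UTF-16 uses 2 or 4 bytes per character (little-endian, no byte order mark).
utf16_lengths = {ch: len(ch.encode("utf-16-le")) for ch in samples}

for ch in samples:
    print(ch, utf8_lengths[ch], utf16_lengths[ch])

# ASCII text is encoded identically in UTF-8, so old ASCII files remain valid.
ascii_compatible = "Hello".encode("utf-8") == "Hello".encode("ascii")
```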

All HTML5 browsers support UTF-8, UTF-16, Windows-1252, and ISO-8859. XML processors are required to support only UTF-8 and UTF-16.

For a closer look, see the Complete Unicode Reference.
