character encodings in the Java 2 platform(Java2 平台的字符编码)

本文详细介绍了网页组件中请求、页面及响应的字符编码设置方法。涵盖了如何为JSP页面设置正确的字符集以确保跨语言环境中数据的正确解析与显示,并提供了具体的实现方式。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

Web components usually use PrintWriter to produce responses; PrintWriter automatically encodes using ISO-8859-1. Servlets can also output binary data using OutputStream classes, which perform no encoding. An application that uses a character set that cannot use the default encoding must explicitly set a different encoding.

 

For web components, three encodings must be considered:

  • Request

  • Page (JSP pages)

  • Response

Request Encoding

The request encoding is the character encoding in which parameters in an incoming request are interpreted. Currently, many browsers do not send a request encoding qualifier with the Content-Type header. In such cases, a web container will use the default encoding, ISO-8859-1, to parse request data.

If the client hasn’t set character encoding and the request data is encoded with a different encoding from the default, the data won’t be interpreted correctly. To remedy this situation, you can use the

ServletRequest.setCharacterEncoding(String enc) 

method to override the character encoding supplied by the container.

To control the request encoding from JSP pages, you can use the JSTL

 

fmt:requestEncoding 

 tag.

You must call the method or tag before parsing any request parameters or reading any input from the request. Calling the method or tag once data has been read will not affect the encoding.

Page Encoding

For JSP pages, the page encoding is the character encoding in which the file is encoded.

For JSP pages in standard syntax, the page encoding is determined from the following sources:

  • The page encoding value of a JSP property group (see Setting Properties for Groups of JSP Pages) whose URL pattern matches the page.

  • The pageEncoding attribute of the page directive of the page. It is a translation-time error to name different encodings in the pageEncoding attribute of the page directive of a JSP page and in a JSP property group.

  • The CHARSET value of the contentType attribute of the page directive.

If none of these is provided, ISO-8859-1 is used as the default page encoding.

 

The pageEncoding and contentType attributes determine the page character encoding of only the file that physically contains the page directive. A web container raises a translation-time error if an unsupported page encoding is specified.

Response Encoding

The response encoding is the character encoding of the textual response generated by a web component. The response encoding must be set appropriately so that the characters are rendered correctly for a given locale. A web container sets an initial response encoding for a JSP page from the following sources:

  • The CHARSET value of the contentType attribute of the page directive

  • The encoding specified by the pageEncoding attribute of the page directive

  • The page encoding value of a JSP property group whose URL pattern matches the page

If none of these is provided, ISO-8859-1 is used as the default response encoding.

The setCharacterEncoding, setContentType, and setLocale methods can be called repeatedly to change the character encoding. Calls made after the servlet response’s getWriter method has been called or after the response is committed have no effect on the character encoding. Data is sent to the response stream on buffer flushes (for buffered pages) or on encountering the first content on unbuffered pages.

 

Calls to setContentType set the character encoding only if the given content type string provides a value for the charset attribute. Calls to setLocale set the character encoding only if neither setCharacterEncoding nor setContentType has set the character encoding before. To control the response encoding from JSP pages, you can use the JSTL fmt.setLocale tag.

To obtain the character encoding for a locale, the setLocale method checks the locale encoding mapping for the web application. For example, to map Japanese to the Japanese-specific encoding Shift_JIS, follow these steps:

  1. Select the WAR.

  2. Click the Advanced Settings button.

  3. In the Locale Character Encoding table, Click the Add button.

  4. Enter ja in the Extension column.

  5. Enter Shift_JIS in the Character Encoding column.

If a mapping is not set for the web application, setLocale uses a Application Server mapping.

The first application in Chapter 5, JavaServer Pages Technology allows a user to choose an English string representation of a locale from all the locales available to the Java 2 platform and then outputs a date localized for that locale. To ensure that the characters in the date can be rendered correctly for a wide variety of character sets, the JSP page that generates the date sets the response encoding to UTF-8 by using the following directive:

<%@ page contentType="text/html; charset=UTF-8" %>

 

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值