Testing how mobile phones POST Chinese characters

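This article looks at the encoding problems that arise when different mobile browsers submit Chinese characters in a WAP project, compares conversions between several common encodings by experiment, and proposes a fix for garbled text on specific handsets.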


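I have recently been working on a WAP project and noticed that some phones submit Chinese characters via POST in an odd encoding. My environment is GB18030, which supports GBK/GB2312. I develop and test with Opera 7.60, and the development language is JSP/Java.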

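As far as I know, the default charset Java uses for transmission is 8859_1 (ISO-8859-1, a single-byte charset). More precisely, a servlet container falls back to ISO-8859-1 when decoding request parameters if the request does not declare a charset. Most phones use UTF-8 and a minority use GB2312.
For background on these charsets, see http://www.itican.net/lmy0083/?page_id=88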

I mainly test with the Chinese character "人" (ren):

GBK/GB2312(Hex) C8 CB
UTF-8(Hex) E4 BA BA
Unicode U+4EBA (&#20154;)
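
The byte values above can be reproduced with a few lines of Java. This is only a small sketch: the class name RenBytes and the helper hex are made up for illustration, and it assumes a JDK that ships the GBK and GB2312 charsets (a full JDK does; a stripped-down runtime may not).

```java
import java.nio.charset.Charset;

class RenBytes {
    // Returns the bytes of s in the named charset as upper-case hex, space separated.
    static String hex(String s, String charsetName) {
        byte[] bytes = s.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String ren = "\u4EBA"; // the character used throughout this test
        System.out.println("GBK     " + hex(ren, "GBK"));        // C8 CB
        System.out.println("GB2312  " + hex(ren, "GB2312"));     // C8 CB
        System.out.println("UTF-8   " + hex(ren, "UTF-8"));      // E4 BA BA
        // 8859_1 cannot represent the character, so Java substitutes '?' (3F).
        System.out.println("8859_1  " + hex(ren, "ISO-8859-1")); // 3F
    }
}
```

The 3F in the last line is the same question mark that shows up as %3F in the test output further down.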

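I develop and test with Opera 7.60 because it is convenient, and of course everything was developed around what this browser supports. For example, when I set Opera's charset to UTF-8, every POSTed Chinese character reaches my code as an 8859_1 string. (8859_1 is a single-byte charset and cannot contain Chinese characters; a Chinese character takes two bytes in GBK/GB2312 and three in UTF-8.) What happens is that each UTF-8 byte is decoded as one 8859_1 character, so to get the text back I have to write
String sPost = new String(request.getParameter("input_name").getBytes("8859_1"), "UTF-8");
Nothing else works for my application. I cannot use
request.getParameter("input_name")
directly, and of course I cannot write
String sPost = new String(request.getParameter("input_name").getBytes("8859_1"), "some other charset");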
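
To see why that conversion works, the round trip can be simulated without a servlet container. In the sketch below, decodeAsLatin1 stands in for what the container does to the raw request bytes; that helper, recoverUtf8 and the class name are hypothetical and exist only for this illustration.

```java
import java.nio.charset.StandardCharsets;

class Latin1RoundTrip {
    // Stand-in for the container: decodes the raw body bytes one byte per char as ISO-8859-1.
    static String decodeAsLatin1(byte[] wireBytes) {
        return new String(wireBytes, StandardCharsets.ISO_8859_1);
    }

    // Gets the original bytes back via ISO-8859-1, then decodes them as UTF-8.
    static String recoverUtf8(String misdecoded) {
        byte[] original = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] wire = "\u4EBA".getBytes(StandardCharsets.UTF_8); // E4 BA BA on the wire
        String param = decodeAsLatin1(wire);                      // U+00E4 U+00BA U+00BA
        System.out.println(param.length());                       // 3
        System.out.println(recoverUtf8(param).equals("\u4EBA"));  // true
    }
}
```

ISO-8859-1 maps every byte value to exactly one character, so getBytes("8859_1") loses nothing here. That lossless mapping is what makes the trick safe for this case.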


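Developed this way, the application works on some phones, for example the Nokia 6681. Real-device testing showed, however, that on others, such as the N800 and the Nokia 6670, Chinese input received this way comes out garbled.

So I ran tests with several browsers, emulators and phones to see which encoding I actually receive.

Test program (receiving side only; for the sending side, any form with an input named "input_name" will do):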

String sPost= request.getParameter("input_name");
if(sPost== null || sPost.length()==0){sPost= "0";}

String GB2312_TO_UTF_8 = new String(sPost.getBytes("GB2312"),"UTF-8");
String GB2312_TO_GBK = new String(sPost.getBytes("GB2312"),"GBK");
String GB2312_TO_8859_1 = new String(sPost.getBytes("GB2312"),"8859_1");

String GBK_TO_UTF_8 = new String(sPost.getBytes("GBK"),"UTF-8");
String GBK_TO_GB2312 = new String(sPost.getBytes("GBK"),"GB2312");
String GBK_TO_8859_1 = new String(sPost.getBytes("GBK"),"8859_1");

String ISO_8859_1_TO_UTF_8 = new String(sPost.getBytes("8859_1"),"UTF-8");
String ISO_8859_1_TO_GB2312 = new String(sPost.getBytes("8859_1"),"GB2312");
String ISO_8859_1_TO_GBK = new String(sPost.getBytes("8859_1"),"GBK");

String UTF_8_TO_8859_1 = new String(sPost.getBytes("UTF-8"),"8859_1");
String UTF_8_TO_GB2312 = new String(sPost.getBytes("UTF-8"),"GB2312");
String UTF_8_TO_GBK = new String(sPost.getBytes("UTF-8"),"GBK");


System.out.println("<WAP TEST>*********************************************************");
System.out.println("<WAP TEST>ren");
System.out.println("<WAP TEST>          UTF_8  |" + "E4BABA|" + java.net.URLEncoder.encode(sPost,"UTF-8"));
System.out.println("<WAP TEST>          GBK    |" + "C8CB  |" + java.net.URLEncoder.encode(sPost,"GBK"));
System.out.println("<WAP TEST>          GB2312 |" + "C8CB  |" + java.net.URLEncoder.encode(sPost,"GB2312"));
System.out.println("<WAP TEST>          8859_1 |" + "      |" + java.net.URLEncoder.encode(sPost,"8859_1"));

System.out.println("<WAP TEST>GB2312 TO UTF_8  |" + "E4BABA|" + java.net.URLEncoder.encode(GB2312_TO_UTF_8,"UTF-8"));
System.out.println("<WAP TEST>GB2312 TO GBK    |" + "C8CB  |" + java.net.URLEncoder.encode(GB2312_TO_GBK,"GBK"));
System.out.println("<WAP TEST>GB2312 TO 8859_1 |" + "C8CB  |" + java.net.URLEncoder.encode(GB2312_TO_8859_1,"8859_1"));

System.out.println("<WAP TEST>GBK    TO UTF_8  |" + "E4BABA|" + java.net.URLEncoder.encode(GBK_TO_UTF_8,"UTF-8"));
System.out.println("<WAP TEST>GBK    TO GB2312 |" + "C8CB  |" + java.net.URLEncoder.encode(GBK_TO_GB2312,"GB2312"));
System.out.println("<WAP TEST>GBK    TO 8859_1 |" + "C8CB  |" + java.net.URLEncoder.encode(GBK_TO_8859_1,"8859_1"));

System.out.println("<WAP TEST>8859_1 TO UTF_8  |" + "E4BABA|" + java.net.URLEncoder.encode(ISO_8859_1_TO_UTF_8,"UTF-8"));
System.out.println("<WAP TEST>8859_1 TO GB2312 |" + "C8CB  |" + java.net.URLEncoder.encode(ISO_8859_1_TO_GB2312,"GB2312"));
System.out.println("<WAP TEST>8859_1 TO GBK    |" + "C8CB  |" + java.net.URLEncoder.encode(ISO_8859_1_TO_GBK,"GBK"));

System.out.println("<WAP TEST>UTF_8  TO 8859_1 |" + "E4BABA|" + java.net.URLEncoder.encode(UTF_8_TO_8859_1,"8859_1"));
System.out.println("<WAP TEST>UTF_8  TO GB2312 |" + "C8CB  |" + java.net.URLEncoder.encode(UTF_8_TO_GB2312,"GB2312"));
System.out.println("<WAP TEST>UTF_8  TO GBK    |" + "C8CB  |" + java.net.URLEncoder.encode(UTF_8_TO_GBK,"GBK"));
The test results are as follows:

N800 Openwave V6.1

<WAP TEST>*********************************************************
<WAP TEST>ren
<WAP TEST>          UTF_8  |E4BABA|%E4%BA%BA
<WAP TEST>          GBK    |C8CB  |%C8%CB
<WAP TEST>          GB2312 |C8CB  |%C8%CB
<WAP TEST>          8859_1 |      |%3F
<WAP TEST>GB2312 TO UTF_8  |E4BABA|%EF%BF%BD%EF%BF%BD
<WAP TEST>GB2312 TO GBK    |C8CB  |%C8%CB
<WAP TEST>GB2312 TO 8859_1 |C8CB  |%C8%CB
<WAP TEST>GBK    TO UTF_8  |E4BABA|%EF%BF%BD%EF%BF%BD
<WAP TEST>GBK    TO GB2312 |C8CB  |%C8%CB
<WAP TEST>GBK    TO 8859_1 |C8CB  |%C8%CB
<WAP TEST>8859_1 TO UTF_8  |E4BABA|%3F
<WAP TEST>8859_1 TO GB2312 |C8CB  |%3F
<WAP TEST>8859_1 TO GBK    |C8CB  |%3F
<WAP TEST>UTF_8  TO 8859_1 |E4BABA|%E4%BA%BA
<WAP TEST>UTF_8  TO GB2312 |C8CB  |%E4%BA%3F
<WAP TEST>UTF_8  TO GBK    |C8CB  |%E4%BA%3F
Opera UTF-8

<WAP TEST>*********************************************************
<WAP TEST>ren
<WAP TEST>          UTF_8  |E4BABA|%C3%A4%C2%BA%C2%BA
<WAP TEST>          GBK    |C8CB  |%3F%3F%3F
<WAP TEST>          GB2312 |C8CB  |%3F%3F%3F
<WAP TEST>          8859_1 |      |%E4%BA%BA
<WAP TEST>GB2312 TO UTF_8  |E4BABA|%3F%3F%3F
<WAP TEST>GB2312 TO GBK    |C8CB  |%3F%3F%3F
<WAP TEST>GB2312 TO 8859_1 |C8CB  |%3F%3F%3F
<WAP TEST>GBK    TO UTF_8  |E4BABA|%3F%3F%3F
<WAP TEST>GBK    TO GB2312 |C8CB  |%3F%3F%3F
<WAP TEST>GBK    TO 8859_1 |C8CB  |%3F%3F%3F
<WAP TEST>8859_1 TO UTF_8  |E4BABA|%E4%BA%BA
<WAP TEST>8859_1 TO GB2312 |C8CB  |%E4%BA%3F
<WAP TEST>8859_1 TO GBK    |C8CB  |%E4%BA%3F
<WAP TEST>UTF_8  TO 8859_1 |E4BABA|%C3%A4%C2%BA%C2%BA
<WAP TEST>UTF_8  TO GB2312 |C8CB  |%C3%A4%C2%BA%C2%BA
<WAP TEST>UTF_8  TO GBK    |C8CB  |%C3%A4%C2%BA%C2%BA
Opera GBK/GB2312

<WAP TEST>*********************************************************
<WAP TEST>ren
<WAP TEST>          UTF_8  |E4BABA|%C3%88%C3%8B
<WAP TEST>          GBK    |C8CB  |%3F%3F
<WAP TEST>          GB2312 |C8CB  |%3F%3F
<WAP TEST>          8859_1 |      |%C8%CB
<WAP TEST>GB2312 TO UTF_8  |E4BABA|%3F%3F
<WAP TEST>GB2312 TO GBK    |C8CB  |%3F%3F
<WAP TEST>GB2312 TO 8859_1 |C8CB  |%3F%3F
<WAP TEST>GBK    TO UTF_8  |E4BABA|%3F%3F
<WAP TEST>GBK    TO GB2312 |C8CB  |%3F%3F
<WAP TEST>GBK    TO 8859_1 |C8CB  |%3F%3F
<WAP TEST>8859_1 TO UTF_8  |E4BABA|%EF%BF%BD%EF%BF%BD
<WAP TEST>8859_1 TO GB2312 |C8CB  |%C8%CB
<WAP TEST>8859_1 TO GBK    |C8CB  |%C8%CB
<WAP TEST>UTF_8  TO 8859_1 |E4BABA|%C3%88%C3%8B
<WAP TEST>UTF_8  TO GB2312 |C8CB  |%3F%3F%3F%3F
<WAP TEST>UTF_8  TO GBK    |C8CB  |%C3%88%C3%8B
Nokia 6681

<WAP TEST>*********************************************************
<WAP TEST>ren
<WAP TEST>          UTF_8  |E4BABA|%C3%A4%C2%BA%C2%BA
<WAP TEST>          GBK    |C8CB  |%3F%3F%3F
<WAP TEST>          GB2312 |C8CB  |%3F%3F%3F
<WAP TEST>          8859_1 |      |%E4%BA%BA
<WAP TEST>GB2312 TO UTF_8  |E4BABA|%3F%3F%3F
<WAP TEST>GB2312 TO GBK    |C8CB  |%3F%3F%3F
<WAP TEST>GB2312 TO 8859_1 |C8CB  |%3F%3F%3F
<WAP TEST>GBK    TO UTF_8  |E4BABA|%3F%3F%3F
<WAP TEST>GBK    TO GB2312 |C8CB  |%3F%3F%3F
<WAP TEST>GBK    TO 8859_1 |C8CB  |%3F%3F%3F
<WAP TEST>8859_1 TO UTF_8  |E4BABA|%E4%BA%BA
<WAP TEST>8859_1 TO GB2312 |C8CB  |%E4%BA%3F
<WAP TEST>8859_1 TO GBK    |C8CB  |%E4%BA%3F
<WAP TEST>UTF_8  TO 8859_1 |E4BABA|%C3%A4%C2%BA%C2%BA
<WAP TEST>UTF_8  TO GB2312 |C8CB  |%C3%A4%C2%BA%C2%BA
<WAP TEST>UTF_8  TO GBK    |C8CB  |%C3%A4%C2%BA%C2%BA
Openwave V7 Simulator

<WAP TEST>*********************************************************
<WAP TEST>ren
<WAP TEST>          UTF_8  |E4BABA|%E4%BA%BA
<WAP TEST>          GBK    |C8CB  |%C8%CB
<WAP TEST>          GB2312 |C8CB  |%C8%CB
<WAP TEST>          8859_1 |      |%3F
<WAP TEST>GB2312 TO UTF_8  |E4BABA|%EF%BF%BD%EF%BF%BD
<WAP TEST>GB2312 TO GBK    |C8CB  |%C8%CB
<WAP TEST>GB2312 TO 8859_1 |C8CB  |%C8%CB
<WAP TEST>GBK    TO UTF_8  |E4BABA|%EF%BF%BD%EF%BF%BD
<WAP TEST>GBK    TO GB2312 |C8CB  |%C8%CB
<WAP TEST>GBK    TO 8859_1 |C8CB  |%C8%CB
<WAP TEST>8859_1 TO UTF_8  |E4BABA|%3F
<WAP TEST>8859_1 TO GB2312 |C8CB  |%3F
<WAP TEST>8859_1 TO GBK    |C8CB  |%3F
<WAP TEST>UTF_8  TO 8859_1 |E4BABA|%E4%BA%BA
<WAP TEST>UTF_8  TO GB2312 |C8CB  |%E4%BA%3F
<WAP TEST>UTF_8  TO GBK    |C8CB  |%E4%BA%3F
Openwave V6.2.2 GB2312

<WAP TEST>*********************************************************
<WAP TEST>ren
<WAP TEST>          UTF_8  |E4BABA|%C3%88%C3%8B
<WAP TEST>          GBK    |C8CB  |%3F%3F
<WAP TEST>          GB2312 |C8CB  |%3F%3F
<WAP TEST>          8859_1 |      |%C8%CB
<WAP TEST>GB2312 TO UTF_8  |E4BABA|%3F%3F
<WAP TEST>GB2312 TO GBK    |C8CB  |%3F%3F
<WAP TEST>GB2312 TO 8859_1 |C8CB  |%3F%3F
<WAP TEST>GBK    TO UTF_8  |E4BABA|%3F%3F
<WAP TEST>GBK    TO GB2312 |C8CB  |%3F%3F
<WAP TEST>GBK    TO 8859_1 |C8CB  |%3F%3F
<WAP TEST>8859_1 TO UTF_8  |E4BABA|%EF%BF%BD%EF%BF%BD
<WAP TEST>8859_1 TO GB2312 |C8CB  |%C8%CB
<WAP TEST>8859_1 TO GBK    |C8CB  |%C8%CB
<WAP TEST>UTF_8  TO 8859_1 |E4BABA|%C3%88%C3%8B
<WAP TEST>UTF_8  TO GB2312 |C8CB  |%3F%3F%3F%3F
<WAP TEST>UTF_8  TO GBK    |C8CB  |%C3%88%C3%8B
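The results show:
1 N800 (Openwave V6.1): the received POST string is already decoded correctly, so it can be converted to UTF-8 or GB2312/GBK and every result is right.
2 Opera UTF-8: only the 8859_1-to-UTF-8 conversion works; some phones behave the same way and are supported.
3 Opera GBK/GB2312: only the 8859_1-to-GB2312/GBK conversion works; some phones behave the same way and are supported.
4 Nokia 6681: the key rows match the Opera UTF-8 results, which agrees with real-device testing.
5 Openwave V7 Simulator: same results as the N800 (Openwave V6.1).
6 Openwave V6.2.2 GB2312: same results as Opera GBK/GB2312.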
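
The three behaviours in the list above can be told apart in code, at least roughly. The sketch below is a heuristic, not something from the original test program: the class PostEncodingGuess, its method names and the labels it returns are all invented here. It also assumes the GBK charset is available in the JDK. Some GBK byte pairs are also valid UTF-8, so the guess can be wrong for short inputs.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class PostEncodingGuess {
    // True if the bytes form a well-formed sequence in the given charset.
    static boolean decodesCleanly(byte[] bytes, Charset cs) {
        try {
            cs.newDecoder()
              .onMalformedInput(CodingErrorAction.REPORT)
              .onUnmappableCharacter(CodingErrorAction.REPORT)
              .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Guesses which of the observed behaviours produced the received parameter.
    static String classify(String param) {
        boolean ascii = true;
        for (int i = 0; i < param.length(); i++) {
            char c = param.charAt(i);
            if (c > 0xFF) {
                return "DECODED";          // cases 1 and 5: already real characters
            }
            if (c > 0x7F) {
                ascii = false;
            }
        }
        if (ascii) {
            return "ASCII";                // nothing to decide
        }
        byte[] raw = param.getBytes(StandardCharsets.ISO_8859_1);
        if (decodesCleanly(raw, StandardCharsets.UTF_8)) {
            return "UTF8_AS_LATIN1";       // cases 2 and 4
        }
        if (decodesCleanly(raw, Charset.forName("GBK"))) {
            return "GBK_AS_LATIN1";        // cases 3 and 6
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        System.out.println(classify("\u4EBA"));             // DECODED
        System.out.println(classify("\u00E4\u00BA\u00BA")); // UTF8_AS_LATIN1
        System.out.println(classify("\u00C8\u00CB"));       // GBK_AS_LATIN1
    }
}
```

UTF-8 is tried before GBK because its byte patterns are stricter: C8 CB, for example, is not valid UTF-8, while E4 BA BA is.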

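However, I developed for the Opera UTF-8 mode (I did not think it through at the time), so my program only suits cases 2 and 4; the N800, at least, is not supported.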

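Solution:

if ("POST".equals(request.getMethod())) {
       // Details are not listed one by one.
       // This branch handles Chinese characters submitted via POST.
       // Approach: URL-encode the string as 8859_1 and inspect the hex. A %3F (question mark)
       // means the string holds characters 8859_1 cannot represent, so the rest of the
       // application would garble it. In that case convert it to UTF-8 bytes carried as 8859_1:
  if (java.net.URLEncoder.encode(sPost, "8859_1").indexOf("%3F") >= 0) {
    sPost = new String(sPost.getBytes("utf-8"), "8859_1");
  }
}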
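This workaround fixes garbled Chinese input on the N800, the Nokia 6670 and similar phones (a whole family of handsets).
Because I tested so few phones, I cannot say it fixes garbled Chinese on every phone; it is only a stopgap.
The results also depend on the system environment, the encoding setup of the program and many other factors.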
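
One weakness of matching %3F is that a user who really types a question mark triggers the conversion as well. A possible variant, sketched here under my own assumptions rather than taken from the tested code, looks for characters above U+00FF instead. PostNormalizer and its methods are hypothetical names, and downstream simply mirrors the 8859_1-to-UTF-8 conversion the application already performs.

```java
import java.nio.charset.StandardCharsets;

class PostNormalizer {
    // True when the string holds a character above U+00FF. Such a character cannot come
    // from a byte-per-character 8859_1 decode, so the string was already decoded properly.
    static boolean alreadyDecoded(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return true;
            }
        }
        return false;
    }

    // Same conversion as the workaround above, applied only to already-decoded strings.
    static String toLatin1Carrier(String s) {
        if (!alreadyDecoded(s)) {
            return s;
        }
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // The conversion the rest of the application performs (Opera UTF-8 mode).
    static String downstream(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String fromN800 = "\u4EBA";              // arrives already decoded
        String fromOpera = "\u00E4\u00BA\u00BA"; // UTF-8 bytes seen as 8859_1 characters
        System.out.println(downstream(toLatin1Carrier(fromN800)).equals("\u4EBA"));  // true
        System.out.println(downstream(toLatin1Carrier(fromOpera)).equals("\u4EBA")); // true
        System.out.println(toLatin1Carrier("?"));                                    // ?
    }
}
```

This is still a stopgap in the same spirit: it does nothing for phones that send GBK/GB2312 bytes, and an already-decoded string made up only of characters below U+0100 would slip past the check.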

Does anyone have better suggestions or approaches?
