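Notes on using HttpClient
Today I was looking into how to use HttpClient and found a few tricky points worth writing down.
One is garbled Chinese text in POST requests; the other is cookie management.
When HttpClient is called in the most basic way, it turns out that POST parameters are not sent as UTF-8, so the only option is to start from the source code (the code below is taken as-is from the library, so I won't add much to it).
First, in PostMethod I found the generateRequestEntity() method: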
    /**
     * Generates a request entity from the post parameters, if present. Calls
     * {@link EntityEnclosingMethod#generateRequestBody()} if parameters have not been set.
     *
     * @since 3.0
     */
    protected RequestEntity generateRequestEntity() {
        if (!this.params.isEmpty()) {
            // Use a ByteArrayRequestEntity instead of a StringRequestEntity.
            // This is to avoid potential encoding issues. Form url encoded strings
            // are ASCII by definition but the content type may not be. Treating the content
            // as bytes allows us to keep the current charset without worrying about how
            // this charset will effect the encoding of the form url encoded string.
            String content = EncodingUtil.formUrlEncode(getParameters(), getRequestCharSet());
            ByteArrayRequestEntity entity = new ByteArrayRequestEntity(
                EncodingUtil.getAsciiBytes(content),
                FORM_URL_ENCODED_CONTENT_TYPE
            );
            return entity;
        } else {
            return super.generateRequestEntity();
        }
    }
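To see why the charset handed to EncodingUtil.formUrlEncode matters, here is a minimal sketch that uses only the JDK's URLEncoder as a stand-in for HttpClient's own encoder. The names FormEncodingDemo and encodeField are made up for illustration and are not part of HttpClient. The same field is encoded once as UTF-8 and once as ISO-8859-1; ISO-8859-1 has no Chinese characters, so each one is replaced with '?' before percent-encoding, and the original text is lost before it ever reaches the server.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class, not part of HttpClient.
class FormEncodingDemo {

    // Builds "name=value" for an application/x-www-form-urlencoded body,
    // converting characters to bytes with the given charset first.
    static String encodeField(String name, String value, String charset) {
        try {
            return URLEncoder.encode(name, charset) + "=" + URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // the two characters used in the tests in this post
        System.out.println(encodeField("TEXT", text, "UTF-8"));      // TEXT=%E4%B8%AD%E6%96%87
        System.out.println(encodeField("TEXT", text, "ISO-8859-1")); // TEXT=%3F%3F
    }
}
```

The second line of output contains only encoded question marks, so no server-side setting could recover the text from it.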


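So the HTTP request parameters added through NameValuePair are all eventually converted into a RequestEntity and submitted to the HTTP server. Next, in EntityEnclosingMethod, the parent class of PostMethod, I found the following code: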



    /**
     * Returns the request's charset. The charset is parsed from the request entity's
     * content type, unless the content type header has been set manually.
     *
     * @see RequestEntity#getContentType()
     *
     * @since 3.0
     */
    public String getRequestCharSet() {
        if (getRequestHeader("Content-Type") == null) {
            // check the content type from request entity
            // We can't call getRequestEntity() since it will probably call
            // this method.
            if (this.requestEntity != null) {
                return getContentCharSet(
                    new Header("Content-Type", requestEntity.getContentType()));
            } else {
                return super.getRequestCharSet();
            }
        } else {
            return super.getRequestCharSet();
        }
    }

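Solution
The two snippets above show how HttpClient derives the request encoding (charset) from "Content-Type", and how that encoding is then applied when the submitted content is encoded. Following this principle, we only need to override getRequestCharSet() so that it returns the name of the encoding (charset) we want. That solves the garbled-text problem when submitting POST requests in UTF-8 or any other non-default encoding.
First test
First, deploy an application under Tomcat's webapps directory that contains a page test.jsp to serve as the test page. The main code is: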
    <%@ page contentType="text/html;charset=UTF-8" %>
    <%
        // request.setCharacterEncoding("UTF-8");
        System.out.println("Tomcat Console:###" + request.getCharacterEncoding());
        String val = request.getParameter("TEXT");
        System.out.println("Tomcat Console:### The result is " + val);
    %>
Next, write a test class. The main code is:
    package cn.edu.ctgu.ghl.test.main;

    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;

    import java.io.IOException;

    import org.apache.commons.httpclient.*;
    import org.apache.commons.httpclient.methods.PostMethod;

    public class Test {

        /**
         * Logger for this class
         */
        private static final Log logger = LogFactory.getLog(Test.class);

        public static void main(String[] args) throws Exception, IOException {
            String url = "http://localhost:8050/wap/test.jsp";
            PostMethod postMethod = getPostMehtod(url);
            // Fill in the value of each form field
            NameValuePair[] data = {
                new NameValuePair("TEXT", "中文"),
            };
            // Put the form values into postMethod
            postMethod.setRequestBody(data);
            // Execute postMethod
            HttpClient httpClient = new HttpClient();
            httpClient.executeMethod(postMethod);
            postMethod.releaseConnection();
        }

        private static PostMethod getPostMehtod(String url) {
            PostMethod postMethod = new PostMethod(url);
            //postMethod.addRequestHeader("Content-Type", "UTF-8");
            postMethod.setRequestHeader("Content-Type", "UTF-8");

            if (logger.isDebugEnabled()) {
                logger.debug("getPostMehtod(String) - Test Console:###" + postMethod.getRequestCharSet());
                logger.debug("getPostMehtod(String) - Test Console:###" + postMethod.getResponseCharSet()); //$NON-NLS-1$
            }
            return postMethod;
        }

        /* public static class UTF8PostMethod extends PostMethod {
            public UTF8PostMethod(String url) {
                super(url);
            }
            @Override
            public String getRequestCharSet() {
                //return super.getRequestCharSet();
                return "UTF-8";
            }
        } */
    }
Running this test program, the Tomcat console prints:
Tomcat Console:###null
Tomcat Console:### The result is null
and the Test console prints:
DEBUG [main] (Test.java:36) - getPostMehtod(String) - Test Console:###ISO-8859-1
DEBUG [main] (Test.java:37) - getPostMehtod(String) - Test Console:###ISO-8859-1
The Test output shows that postMethod.setRequestHeader("Content-Type", "UTF-8"); did not change the encoding HttpClient uses when sending parameters to the server to UTF-8; it is still the default ISO-8859-1. And since System.out.println("Tomcat Console:###" + request.getCharacterEncoding()); in test.jsp prints Tomcat Console:###null, the call evidently did not give the request a ("Content-Type", "UTF-8") header from which the server could read a charset. So I don't know how PostMethod.setRequestHeader or addRequestHeader has to be used to take effect. On top of that, Tomcat Console:### The result is null means the submitted data could not be read at all. I'm leaving this here as a marker; pointers from anyone who knows better are welcome.
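One possible reading of this output, offered as a guess rather than a confirmed diagnosis: a Content-Type value normally has the form media-type; charset=name, and the bare value "UTF-8" contains no charset parameter, so any lookup of the charset falls back to the default. The sketch below is a simplified, JDK-only stand-in for such a lookup and is not HttpClient's actual parser. ContentTypeCharsetDemo, charsetOf, and the hard-coded ISO-8859-1 default are assumptions made for illustration.

```java
import java.util.Locale;

// Hypothetical demo class; a simplified stand-in, not HttpClient's parser.
class ContentTypeCharsetDemo {

    // Assumed default, matching the ISO-8859-1 seen in the log output above.
    static final String DEFAULT_CHARSET = "ISO-8859-1";

    // Returns the charset parameter of a Content-Type value, or the default
    // when the value is null or has no charset parameter.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return DEFAULT_CHARSET;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1); // strip quotes
                }
                if (!value.isEmpty()) {
                    return value;
                }
            }
        }
        return DEFAULT_CHARSET;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("UTF-8"));
        System.out.println(charsetOf("application/x-www-form-urlencoded; charset=UTF-8"));
    }
}
```

Under that reading, a header value such as application/x-www-form-urlencoded; charset=UTF-8 would be the one to try with setRequestHeader, and it would also let the server recognize the body as form data. This has not been verified against the test setup above.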
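Second test
test.jsp is:
    <%@ page contentType="text/html;charset=UTF-8" %>
    <%
        // The next line matters a lot, yet it also leaves me puzzled. Most of the time we use HttpClient
        // to access sites written by someone else (PHP, JSP, ASP, ASPX, and so on). Without the source
        // code, how can I know which encoding the site expects for request parameters?
        // If the line below is commented out, the result is garbled text.
        request.setCharacterEncoding("UTF-8"); // added compared with the first test
        System.out.println("Tomcat Console:###" + request.getCharacterEncoding());
        String val = request.getParameter("TEXT");
        System.out.println("Tomcat Console:### The result is " + val);
    %>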
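The garbled text mentioned in the JSP comment can be reproduced without a server. The sketch below assumes the client sent UTF-8 percent-encoded bytes and the server decoded them as ISO-8859-1, which is assumed here to be the container's default when no encoding is set. FormDecodingDemo, decode, and repair are hypothetical names, and the JDK's URLDecoder stands in for the container's parameter parsing.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; URLDecoder stands in for the servlet container.
class FormDecodingDemo {

    // Decodes a percent-encoded form value using the given charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Recovers text that was UTF-8 on the wire but decoded as ISO-8859-1.
    // This works because ISO-8859-1 maps every byte to exactly one character.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String onTheWire = "%E4%B8%AD%E6%96%87"; // UTF-8 form encoding of the test value
        String right = decode(onTheWire, "UTF-8");
        String wrong = decode(onTheWire, "ISO-8859-1");
        System.out.println(right.length());                // 2 characters
        System.out.println(wrong.length());                // 6 characters of garbled text
        System.out.println(repair(wrong).equals(right));   // true
    }
}
```

The repair step only helps when the wrong decoding was ISO-8859-1; it is a workaround, and declaring the encoding correctly on both sides is the cleaner fix.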
Test2.java
    package cn.edu.ctgu.edu.ghl;

    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;

    import java.io.IOException;

    import org.apache.commons.httpclient.*;
    import org.apache.commons.httpclient.methods.PostMethod;

    public class Test2 {

        /**
         * Logger for this class
         */
        private static final Log logger = LogFactory.getLog(Test2.class);

        public static void main(String[] args) throws Exception, IOException {
            String url = "http://localhost:8050/wap/test.jsp";
            PostMethod postMethod = new UTF8PostMethod(url);
            // Fill in the value of each form field
            NameValuePair[] data = {
                new NameValuePair("TEXT", "中文"),
            };
            // Put the form values into postMethod
            postMethod.setRequestBody(data);

            if (logger.isDebugEnabled()) {
                logger.debug("main(String[]) - " + postMethod.getRequestCharSet()); //$NON-NLS-1$
            }
            // Execute postMethod
            HttpClient httpClient = new HttpClient();
            httpClient.executeMethod(postMethod);
            postMethod.releaseConnection();
        }

        // Inner class for UTF-8 support
        public static class UTF8PostMethod extends PostMethod {

            public UTF8PostMethod(String url) {
                super(url);
            }

            @Override
            public String getRequestCharSet() {
                //return super.getRequestCharSet();
                return "UTF-8";
            }
        }
    }
Running this test program, the Tomcat console correctly prints:
Tomcat Console:###UTF-8
Tomcat Console:### The result is 中文
and the Test2 console prints:
DEBUG [main] (Test2.java:26) - main(String[]) - ISO-8859-1
DEBUG [main] (Test2.java:27) - main(String[]) - UTF-8
So the second approach works, but it requires us to derive our own method class from PostMethod.
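The second issue is cookie management
I have a Cookie array cookie[]. Normally, setting the cookies should look like this:
Java code
    HttpState initialState = new HttpState();
    for (int i = 0; i < cookie.length; i++) {
        initialState.addCookie(cookie[i]);
    }
But with HttpClient I had to set them like this. The myName=debugcn part is entirely redundant; I only did it for convenience, and sending one extra cookie does no harm anyway.
Java code
    HttpState initialState = new HttpState();
    String cookieString = "debugcn";
    for (int i = 0; i < cookie.length; i++) {
        cookieString += ";" + cookie[i];
    }
    initialState.addCookie(new Cookie("www.debug.cn", "myName", cookieString, "/", null, false));
I don't know why it has to be this ridiculous. It took me a long time to get it working, so I'm writing it down for the record.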
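The workaround above effectively packs several cookies into a single string. For comparison, here is a minimal JDK-only sketch of how a Cookie request header value is conventionally laid out: name=value pairs joined with "; ". It does not use HttpClient at all, and CookieHeaderDemo, cookieHeaderValue, and the sample cookie names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; plain string handling, no HttpClient involved.
class CookieHeaderDemo {

    // Joins cookies into one header value: "name1=value1; name2=value2".
    // Iteration order follows the map, so a LinkedHashMap keeps insertion order.
    static String cookieHeaderValue(Map<String, String> cookies) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : cookies.entrySet()) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(entry.getKey()).append('=').append(entry.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> cookies = new LinkedHashMap<String, String>();
        cookies.put("myName", "debugcn");  // sample values, not real session data
        cookies.put("sessionId", "abc123");
        System.out.println(cookieHeaderValue(cookies)); // myName=debugcn; sessionId=abc123
    }
}
```

If building the header by hand turns out to be what is needed, HttpClient 3.x documents a manual mode in which the cookie policy is set to CookiePolicy.IGNORE_COOKIES and a "Cookie" request header is set directly. Whether that would replace the workaround in this particular case is untested.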