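[Writing crawlers with HttpClient] POSTing JSON data and plain key-value pairs

This article compares two common ways of submitting a POST request, key-value pairs and JSON data, and gives concrete example code for both using Apache HttpClient.

Introduction

When developing a crawler, you will sooner or later have to deal with POST submissions.
This article compares the two ways of submitting data and uses HttpClient to simulate a website's POST request for each of them.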

I have come across two kinds of POST submission:

  1. plain key-value pairs;
  2. JSON data.
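To make the difference concrete, here is a minimal sketch of what the request body looks like in each case, using only the JDK. The class name `PostBodyFormats` and its helper methods are illustrative and not part of HttpClient. Key-value pairs are normally sent with `Content-Type: application/x-www-form-urlencoded`, JSON with `Content-Type: application/json`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class PostBodyFormats {

    // Percent-encodes one name or value as UTF-8.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM
            throw new IllegalStateException(e);
        }
    }

    // Builds a form body: name=value pairs joined by '&'.
    static String formBody(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", "vip");
        fields.put("password", "secret");

        // Body of a key-value POST
        System.out.println(formBody(fields));

        // Body of a JSON POST carrying the same data
        System.out.println("{\"username\":\"vip\",\"password\":\"secret\"}");
    }
}
```

The server has to know which of the two formats it is receiving, which is why the `Content-Type` header matters as much as the body itself.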

The HttpClient version I use

<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.2</version>
</dependency>

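Submitting plain key-value pairs

import org.apache.http.Consts;
import org.apache.http.HttpEntity;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;

import java.util.ArrayList;
import java.util.List;

CloseableHttpClient httpclient = HttpClients.createDefault();

HttpPost httpPost = new HttpPost("http://targethost/login");
// Each form field becomes one name/value pair
List<NameValuePair> nvps = new ArrayList<NameValuePair>();
nvps.add(new BasicNameValuePair("username", "vip"));
nvps.add(new BasicNameValuePair("password", "secret"));
// Encode the pairs as UTF-8; without an explicit charset the entity falls back to ISO-8859-1
httpPost.setEntity(new UrlEncodedFormEntity(nvps, Consts.UTF_8));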
CloseableHttpResponse response2 = httpclient.execute(httpPost);

try {
    System.out.println(response2.getStatusLine());
    HttpEntity entity2 = response2.getEntity();
    // do something useful with the response body
    // and ensure it is fully consumed
    EntityUtils.consume(entity2);
} finally {
    response2.close();
}
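The charset used for the form body matters as soon as a value contains non-ASCII characters. As far as I know, `UrlEncodedFormEntity` in HttpClient 4.x falls back to ISO-8859-1 when no charset is passed. The sketch below uses only the JDK's `URLEncoder` to show how the same value is percent-encoded differently under the two charsets. The class `FormCharsetDemo` and the sample value are assumptions made for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class, for illustration only.
class FormCharsetDemo {

    // Percent-encodes a form value with the given charset name.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // A single CJK character, written as an escape so the source stays ASCII
        String value = "\u5f20";

        // UTF-8 keeps the character as three percent-encoded bytes
        System.out.println(encode(value, "UTF-8"));

        // ISO-8859-1 cannot represent the character, so the information is lost
        System.out.println(encode(value, "ISO-8859-1"));
    }
}
```

If the target site expects UTF-8, passing the charset explicitly to the entity avoids this kind of silent corruption.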

Submitting JSON data

Data to submit

{
    "username" : "vip",
    "password" : "secret"
}

Code

import org.apache.http.Consts;
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import java.io.IOException;

/**
 * Created by CarlZhang on 2017/1/1.
 */
public class PostJsonTest {
    public static void main(String[] args) {
        CloseableHttpClient httpclient = HttpClients.createDefault();
        try {

            HttpPost httpPost = new HttpPost("http://targethost/login");

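            // JSON payload: {"username":"vip","password":"secret"}
            String jsonStr  = "{\"username\":\"vip\",\"password\":\"secret\"}";

            // Encode the body as UTF-8 and declare it as JSON.
            // Content-Encoding is deliberately not set: that header describes
            // compression (e.g. gzip), not the character set.
            StringEntity se = new StringEntity(jsonStr, Consts.UTF_8);
            se.setContentType("application/json");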

            httpPost.setEntity(se);
            CloseableHttpResponse response2 = httpclient.execute(httpPost);

            try {
                System.out.println(response2.getStatusLine());
                HttpEntity entity2 = response2.getEntity();
                // EntityUtils.toString reads the whole response body, so the entity
                // is fully consumed afterwards. UTF-8 is only the fallback used
                // when the response does not declare a charset.
                String res = EntityUtils.toString(entity2, Consts.UTF_8);
                System.out.println(res);
            } finally {
                response2.close();
            }

        } catch (IOException e) {
            e.printStackTrace();
        }finally {
            try {
                httpclient.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
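Hand-writing the escaped JSON string works for fixed values, but it breaks as soon as a value contains a double quote or a backslash. Below is a minimal sketch of building the body from a map with proper escaping, using only the JDK. `JsonBodyBuilder` is a hypothetical helper that only handles flat string-to-string objects. In a real project a JSON library would normally do this job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class JsonBodyBuilder {

    // Wraps a string in double quotes and escapes characters JSON does not allow raw.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become a four-digit hex escape
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // Serializes a flat map of strings as a JSON object, keeping insertion order.
    static String toJson(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", "vip");
        fields.put("password", "secret");
        System.out.println(toJson(fields));
    }
}
```

The resulting string can be passed to `StringEntity` in place of the hard-coded `jsonStr`.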

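Example: JSON submission

Below is what I saw on a certain website after clicking the button to publish a post, with Chrome's debug tools open (press F12): the POST request I submitted is visible. The Form Data shown there is the JSON data submitted via POST.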
[Image: Chrome developer tools showing the submitted POST request]

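How does the back end get hold of this data?
The site uses the Requests class from the open-source library latke.
As you can see, it reads the stream data through a Reader object.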

    /**
     * Gets the request json object with the specified request.
     *
     * @param request the specified request
     * @param response the specified response, sets its content type with "application/json"
     * @return a json object
     * @throws ServletException servlet exception
     * @throws IOException io exception
     */
    public static JSONObject parseRequestJSONObject(final HttpServletRequest request, final HttpServletResponse response)
        throws ServletException, IOException {
        response.setContentType("application/json");

        final StringBuilder sb = new StringBuilder();
        BufferedReader reader;

        final String errMsg = "Can not parse request[requestURI=" + request.getRequestURI() + ", method=" + request.getMethod()
            + "], returns an empty json object";

        try {
            try {
                reader = request.getReader();
            } catch (final IllegalStateException illegalStateException) {
                reader = new BufferedReader(new InputStreamReader(request.getInputStream()));
            }

            String line = reader.readLine();

            while (null != line) {
                sb.append(line);
                line = reader.readLine();
            }
            reader.close();

            String tmp = sb.toString();

            if (Strings.isEmptyOrNull(tmp)) {
                tmp = "{}";
            }

            return new JSONObject(tmp);
        } catch (final Exception ex) {
            LOGGER.log(Level.ERROR, errMsg, ex);

            return new JSONObject();
        }
    }
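The method above shows the key difference on the server side: a JSON body has to be read as raw text and then parsed, whereas a form body is split into name/value pairs (servlet containers do this for you behind `request.getParameter`). The following JDK-only sketch illustrates both steps. `RequestBodyReading` and its methods are hypothetical and are not part of latke or the Servlet API.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class RequestBodyReading {

    // Reads the entire body as text, which is what a JSON endpoint has to do.
    static String readBody(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        try {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Splits a form body on '&' and '=' and percent-decodes each part as UTF-8.
    static Map<String, String> parseForm(String body) {
        Map<String, String> params = new LinkedHashMap<>();
        if (body.isEmpty()) {
            return params;
        }
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            params.put(decode(key), decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // JSON body: kept as one string and handed to a JSON parser afterwards
        String json = readBody(new StringReader("{\"username\":\"vip\",\"password\":\"secret\"}"));
        System.out.println(json);

        // Form body: already structured as name/value pairs
        System.out.println(parseForm("username=vip&password=secret"));
    }
}
```

This is why a crawler must imitate the body format the site actually uses: a server that reads the raw stream as JSON will not find anything useful in a form-encoded body, and vice versa.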

In addition, the front-end JS code submits the JSON data through the jQuery library.
[Image: the front-end jQuery code that submits the JSON data]

Example: plain POST submission

[Image: a plain key-value POST submission]

If you are interested, see my article "[JS library AngularJS] Solving the POST submission problem with Angular + Spring MVC".
