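// Imports assume Apache HttpClient 4.x (the original post does not state the version)
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

// Shared client that plays the role of the browser (global static constant)
private static final CloseableHttpClient httpClient = HttpClients.createDefault();

// Fetch a web page
public static void testGetUrl(String url) throws IOException {
    HttpGet httpGet = new HttpGet(url); // request method
    // The client sends the request and returns the response;
    // try-with-resources closes the response so the connection is released
    try (CloseableHttpResponse httpResponse = httpClient.execute(httpGet)) {
        System.out.println(httpResponse.getStatusLine()); // print the response status line
        HttpEntity entity = httpResponse.getEntity(); // get the response entity
        if (entity != null) { // some responses carry no body
            dump(entity); // print the page content
        }
    }
}

// Print the entity body line by line, decoded as UTF-8
private static void dump(HttpEntity entity) {
    try (BufferedReader br = new BufferedReader(
            new InputStreamReader(entity.getContent(), StandardCharsets.UTF_8))) {
        String str;
        while ((str = br.readLine()) != null) {
            System.out.println(str);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}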
Java Web Page Fetching Example
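This post provides sample Java code for fetching a web page. It creates a global static CloseableHttpClient instance to act as the client, then issues a GET request with HttpGet to retrieve the page content. The code shows how to inspect the HTTP response and read the data held in its entity.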
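
The step of reading the entity's data can also be tried without a network connection. The following is a minimal sketch, not the post's own code: `StreamText` and `readAll` are hypothetical names introduced here, and the method collects the lines into a String instead of printing them. It relies only on the JDK, so any InputStream can stand in for the stream returned by `entity.getContent()`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of the original post.
final class StreamText {
    private StreamText() {}

    // Reads the stream line by line with the given charset and returns the text.
    // Every line, including the last, is terminated with '\n' in the result.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader (and the underlying stream)
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }
}
```

With the HTTP code in this post, a call might look like `StreamText.readAll(entity.getContent(), StandardCharsets.UTF_8)`. Returning the text rather than printing it lets the caller decide what to do with the page, for example passing it to an HTML parser.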