How to fetch an HTTPS web page

Here's a simple Java HTTPS client that uses the HttpsURLConnection class to print the content of an HTTPS URL along with the server's certificate details.

Access https URL : https://www.google.com/

package com.mkyong.client;
 
import java.net.MalformedURLException;
import java.net.URL;
import java.security.cert.Certificate;
import java.io.*;
 
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLPeerUnverifiedException;
 
public class HttpsClient{
 
   public static void main(String[] args)
   {
        new HttpsClient().testIt();
   }
 
   private void testIt(){
 
      String https_url = "https://www.google.com/";
      URL url;
      try {
 
	     url = new URL(https_url);
	     HttpsURLConnection con = (HttpsURLConnection)url.openConnection();
 
	     //dump all cert info
	     print_https_cert(con);
 
	     //dump all the content
	     print_content(con);
 
      } catch (MalformedURLException e) {
	     e.printStackTrace();
      } catch (IOException e) {
	     e.printStackTrace();
      }
 
   }
 
   private void print_https_cert(HttpsURLConnection con){
 
    if(con!=null){
 
      try {
 
	System.out.println("Response Code : " + con.getResponseCode());
	System.out.println("Cipher Suite : " + con.getCipherSuite());
	System.out.println("\n");
 
	Certificate[] certs = con.getServerCertificates();
	for(Certificate cert : certs){
	   System.out.println("Cert Type : " + cert.getType());
	   System.out.println("Cert Hash Code : " + cert.hashCode());
	   System.out.println("Cert Public Key Algorithm : " + cert.getPublicKey().getAlgorithm());
	   System.out.println("Cert Public Key Format : " + cert.getPublicKey().getFormat());
	   System.out.println("\n");
	}
 
	} catch (SSLPeerUnverifiedException e) {
		e.printStackTrace();
	} catch (IOException e){
		e.printStackTrace();
	}
 
     }
 
   }
 
   private void print_content(HttpsURLConnection con){
	if(con!=null){
 
	System.out.println("****** Content of the URL ********");

	// try-with-resources closes the reader even if readLine() throws
	try (BufferedReader br =
		new BufferedReader(
			new InputStreamReader(con.getInputStream()))) {

	   String input;

	   while ((input = br.readLine()) != null){
	      System.out.println(input);
	   }

	} catch (IOException e) {
	   e.printStackTrace();
	}
 
       }
 
   }
 
}


Output…

Response Code : 200
Cipher Suite : SSL_RSA_WITH_RC4_128_SHA
 
Cert Type : X.509
Cert Hash Code : 7810131
Cert Public Key Algorithm : RSA
Cert Public Key Format : X.509
 
Cert Type : X.509
Cert Hash Code : 6042770
Cert Public Key Algorithm : RSA
Cert Public Key Format : X.509
 
****** Content of the URL ********
<!doctype html><html><head><meta http-equiv="content-type" ......
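
The output above shows only the certificate type and public key algorithm. When a certificate is X.509, it can be cast to X509Certificate to read the subject, issuer, and validity period. The following is a minimal sketch of that idea; the class name CertDescriber and the method describe are hypothetical names chosen for illustration and are not part of the original client.

```java
import java.io.IOException;
import java.net.URL;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;

import javax.net.ssl.HttpsURLConnection;

// Hypothetical helper class, not part of the original HttpsClient.
class CertDescriber {

    // Returns a one-line summary of a certificate.
    static String describe(Certificate cert) {
        if (cert instanceof X509Certificate) {
            X509Certificate x509 = (X509Certificate) cert;
            return "Subject: " + x509.getSubjectX500Principal().getName()
                 + " | Issuer: " + x509.getIssuerX500Principal().getName()
                 + " | Valid: " + x509.getNotBefore() + " to " + x509.getNotAfter()
                 + " | Signature: " + x509.getSigAlgName();
        }
        // Other certificate types expose only the generic fields.
        return "Type: " + cert.getType();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java CertDescriber <https-url>");
            return;
        }
        HttpsURLConnection con =
            (HttpsURLConnection) new URL(args[0]).openConnection();
        // The TLS handshake must complete before certificates are available.
        con.connect();
        for (Certificate cert : con.getServerCertificates()) {
            System.out.println(describe(cert));
        }
        con.disconnect();
    }
}
```

In the client above, the same method could be called inside the `for(Certificate cert : certs)` loop, for example `System.out.println(CertDescriber.describe(cert));`.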

Reposted from: http://www.blogjava.net/stevenjohn/archive/2012/08/16/385559.html

### Fetching web page data over HTTPS in Java

To fetch web page data over HTTPS in Java, you can use `HttpsURLConnection` (the HTTPS subclass of `HttpURLConnection`) or a higher-level library such as Apache HttpClient. The simple example below is based on `HttpsURLConnection`; it opens an SSL/TLS connection and reads the content of the given URL.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import javax.net.ssl.HttpsURLConnection;

public class HttpsExample {

    private static final String USER_AGENT = "Mozilla/5.0";

    public static void main(String[] args) throws Exception {
        // Create the target URL object; replace this with the real site address
        URL obj = new URL("https://example.com");

        // Open a connection to the URL and cast it to HttpsURLConnection
        // so that HTTPS-specific settings can be configured
        HttpsURLConnection con = (HttpsURLConnection) obj.openConnection();

        // Set request properties to mimic a browser
        con.setRequestMethod("GET");
        con.setRequestProperty("User-Agent", USER_AGENT);

        int responseCode = con.getResponseCode();
        System.out.println("Response Code : " + responseCode);

        StringBuilder response = new StringBuilder();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream()))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                // readLine() strips the line terminator, so add it back
                response.append(inputLine).append('\n');
            }
        }

        // Print the response body
        System.out.println(response.toString());
    }
}
```

The snippet above shows how to set up a basic HTTPS GET request[^1]. For more complex cases, such as pages whose content is rendered by JavaScript, you may need a tool like Selenium WebDriver to drive a real browser and execute the scripts[^2]; for parsing simple static HTML or calling an API, a plain HTTP client is enough. To simplify the workflow further and manage multi-threaded download tasks, consider a framework designed for web crawling, such as WebMagic[^3].
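
Two details are easy to overlook when fetching a page this way: the connection has no timeouts by default, and the response bytes are decoded with the platform charset unless one is given. The sketch below shows one possible way to handle both. The class name HttpsFetch, the method names open and readAll, the timeout values, and the choice of UTF-8 are all assumptions made for illustration; a real client would pick the charset from the Content-Type header.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

import javax.net.ssl.HttpsURLConnection;

// Hypothetical helper class; names and timeout values are assumptions.
class HttpsFetch {

    // Creates the connection object with timeouts set.
    // No network traffic happens until the response is requested.
    static HttpsURLConnection open(String address) throws IOException {
        URLConnection raw = new URL(address).openConnection();
        if (!(raw instanceof HttpsURLConnection)) {
            throw new IllegalArgumentException("Not an https URL: " + address);
        }
        HttpsURLConnection con = (HttpsURLConnection) raw;
        con.setConnectTimeout(5000);   // assumed value, in milliseconds
        con.setReadTimeout(10000);     // assumed value, in milliseconds
        return con;
    }

    // Reads the whole stream as UTF-8 text; every line ends with '\n'.
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Without arguments only the offline part runs (no network needed).
        String address = args.length > 0 ? args[0] : "https://example.com";
        HttpsURLConnection con = open(address);
        System.out.println("Connect timeout (ms): " + con.getConnectTimeout());
        if (args.length > 0) {
            System.out.println("Response Code : " + con.getResponseCode());
            System.out.print(readAll(con.getInputStream()));
        }
    }
}
```

Checking the connection type before the cast turns a plain `http://` address into a clear IllegalArgumentException instead of a ClassCastException.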