Common WebClient parameter settings
// 1. Create the client
WebClient webClient = new WebClient(BrowserVersion.CHROME);
// 2. Configure options
// Enable JavaScript
webClient.getOptions().setJavaScriptEnabled(true);
// Disable CSS rendering
webClient.getOptions().setCssEnabled(false);
// Follow redirects
webClient.getOptions().setRedirectEnabled(true);
// Enable cookie management
webClient.setCookieManager(new CookieManager());
// Use an AJAX controller that resynchronizes AJAX calls
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
// Whether to throw an exception on JavaScript runtime errors (false: do not throw)
webClient.getOptions().setThrowExceptionOnScriptError(false);
// 3. Fetch the page
HtmlPage page = webClient.getPage(url);
// Wait for background JavaScript to finish; waitime is the maximum wait in milliseconds
webClient.waitForBackgroundJavaScript(waitime);
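A single fixed wait can be too short for slow pages and wasteful for fast ones. One common alternative is to poll for a condition until it holds or a deadline passes. The sketch below uses only the Java standard library; `WaitUtil` and `waitUntil` are hypothetical names introduced for illustration and are not part of HtmlUnit. With HtmlUnit, the condition would typically check whether an expected element has appeared on the page.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a condition until it holds or a timeout elapses. */
final class WaitUtil {
    private WaitUtil() {}

    /**
     * Checks the condition immediately, then every intervalMs milliseconds.
     * Returns true as soon as the condition holds, false once timeoutMs has
     * elapsed or the thread is interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true about 200 ms after start
        boolean ready = waitUntil(() -> System.currentTimeMillis() - start >= 200, 2_000, 50);
        System.out.println("condition met: " + ready);
    }
}
```

The timeout bounds the total wait, so a page that never finishes loading cannot block the crawler indefinitely.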
Example: fetching a Douyu page:
public void getPage() {
    String url = Constants.DOU_YU_URL + XMLTools.getNodeValue(Constants.SETTINGS_PATH, "roomID", "");
    logger.info("Request URL: " + url);
    // String url = "http://www.douyu.com";
    try {
        HtmlPage page = webClient.getPage(url);
        // logger.info(page.asText());
        // Note: this looks up elements whose tag name is "class", not elements that have a class attribute
        DomNodeList<DomElement> domNodeList = page.getElementsByTagName("class");
        logger.info("Number of class elements on the page: " + domNodeList.size());
        logger.info("Page encoding: " + page.getPageEncoding());
        // List every link on the page
        List<HtmlAnchor> listAnchor = page.getAnchors();
        for (HtmlAnchor anchor : listAnchor) {
            logger.info("Link on the page: " + anchor.getHrefAttribute());
        }
    } catch (IOException e) {
        logger.info(e.getMessage());
    }
}
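The href values logged above are the raw attribute values from the page, so they may be relative paths, fragments, or `javascript:` pseudo-links. If the links are to be fetched later, they usually need to be resolved against the page URL first. A minimal sketch using `java.net.URI` from the standard library follows; `LinkResolver`, `resolveAll`, and the `example.com` URLs are hypothetical and only serve as an illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: turns raw href values into absolute URLs. */
final class LinkResolver {
    private LinkResolver() {}

    /** Resolves each href against pageUrl, skipping empty, fragment-only, and javascript: links. */
    static List<String> resolveAll(String pageUrl, List<String> hrefs) {
        URI base = URI.create(pageUrl);
        List<String> result = new ArrayList<>();
        for (String href : hrefs) {
            if (href == null) {
                continue;
            }
            String trimmed = href.trim();
            if (trimmed.isEmpty()
                    || trimmed.startsWith("#")
                    || trimmed.toLowerCase().startsWith("javascript:")) {
                continue;
            }
            try {
                result.add(base.resolve(trimmed).toString());
            } catch (IllegalArgumentException e) {
                // Malformed href (for example, one containing spaces): skip it
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder page URL and hrefs, not taken from a real page
        List<String> hrefs = Arrays.asList("/room/123", "detail", "#top", "javascript:void(0)");
        for (String link : resolveAll("https://example.com/rooms/list", hrefs)) {
            System.out.println(link);
        }
    }
}
```

In the loop above, the input list would be built from `anchor.getHrefAttribute()` for each anchor, with the requested page URL as the base.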
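Partial code for locating a drop-down list (select box) on a page and simulating a click event:
WebClient initialization, basic settings, and page fetching are the same as above.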
URL mainUrl = new URL("https://vendorcentral.amazon.cn/gp/vendor/members/obieeReports/araBasic/ara-basic?ref_=vc_ven-obiee-ara-basic-home_subNav");
HtmlPage mainHtml = webClient.getPage(mainUrl);
// Simulate a click that navigates to another page
HtmlSelect select = (HtmlSelect) mainHtml.getElementById("report-selector");
// select.setSelectedAttribute("/st/vendor/members/analytics/basic/productDetail", true);
// The option text must match the page exactly (it means "Sales and Inventory - Product Details")
HtmlOption option = select.getOptionByText("销售和库存 - 商品详情");
// click() returns the page that results from the click
HtmlPage page3 = option.click();