org.jsoup: scraping web page information into a Map
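This post only introduces what a single jar does and how to use it. It does not dig deep; these are brief usage notes so that I can get started quickly when I need the library again.
The jar is obtained from the Maven repository: https://mvnrepository.com/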
<!-- Convert HTML into a Map -->
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.10.3</version>
</dependency>
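With the dependency above on the classpath, parsing can be tried offline before fetching a live page. The following is a minimal sketch, not a definitive implementation: the class name OfflineLinkDemo, the helper collectLinks, the sample HTML, and the base URI https://example.com are all assumptions made for illustration. It parses an HTML string, keeps the links whose href contains a given substring, and resolves relative hrefs against the base URI.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; not part of the original post.
public class OfflineLinkDemo {

    // Collects link text -> absolute URL for every <a> whose href contains hrefPart.
    public static Map<String, String> collectLinks(String html, String baseUri, String hrefPart) {
        // The base URI lets jsoup resolve relative hrefs later via absUrl().
        Document doc = Jsoup.parse(html, baseUri);
        Map<String, String> links = new LinkedHashMap<>();
        for (Element a : doc.select("a[href]")) {
            String href = a.attr("href");
            if (href.contains(hrefPart)) {
                links.put(a.text(), a.absUrl("href"));
            }
        }
        return links;
    }

    public static void main(String[] args) {
        // Made-up sample markup: one article link and one unrelated link.
        String html = "<p><a href='/article/details/1'>First post</a>"
                + "<a href='/about'>About</a></p>";
        Map<String, String> links = collectLinks(html, "https://example.com", "article/details");
        links.forEach((text, url) -> System.out.println(text + "=" + url));
    }
}
```

A LinkedHashMap is used here only so that entries come back in document order; a plain HashMap works as well when order does not matter.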
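Parse one kind of tag in an HTML page, read its attributes, and collect the results in a Map (link text as the key, href as the value).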
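import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.junit.Test; // assumes JUnit 4

public class JsoupLinkTest { // class name is illustrative
    @Test
    public void getHtmlString() throws IOException {
        // Fetch and parse the blog home page
        Document doc = Jsoup.connect("https://blog.youkuaiyun.com/xiaoliang98").get();
        // Maps link text -> href
        Map<String, String> urls = new HashMap<>();
        Elements links = doc.getElementsByTag("a");
        for (Element link : links) {
            String linkHref = link.attr("href");
            // Keep only links that point to article pages
            if (linkHref.contains("article/details")) {
                String text = link.text();
                urls.put(text, linkHref);
            }
        }
        urls.entrySet().forEach(System.out::println);
        System.out.println(urls.size());
    }
}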
All links printed to the console:
原创 Log4J 日志介绍功能介绍与使用案例帮你简单理解log4j=https://blog.youkuaiyun.com/xiaoliang98/article/details/109673273
原创 java web 02 CSS 后端学习=https://blog.youkuaiyun.com/xiaoliang98/article/details/109564046
Maven基本项目模板配置,从小到大的配置 逐渐累加=https://blog.youkuaiyun.com/xiaoliang98/article/details/110138574
原创 javase 09 Stream=https://blog.youkuaiyun.com/xiaoliang98/article/details/109367329
java web 03 JavaScript基础 53=https://blog.youkuaiyun.com/xiaoliang98/article/details/109565597
原创 Javase 15 XML=https://blog.youkuaiyun.com/xiaoliang98/article/details/109477730
Mybatis SQL语句批量操作与多级缓存 774=https://blog.youkuaiyun.com/xiaoliang98/article/details/109540947
原创 javase 16 网络编程=https://blog.youkuaiyun.com/xiaoliang98/article/details/109547806
转载 10 Ajax 在jQuery中的使用=https://blog.youkuaiyun.com/xiaoliang98/article/details/109907269
原创 Servlet程序 Filter过滤器 图解=https://blog.youkuaiyun.com/xiaoliang98/article/details/109754506
Spring MVC controller数据接收与页面跳转=https://blog.youkuaiyun.com/xiaoliang98/article/details/110246248
java web 01 HTML适用于后端=https://blog.youkuaiyun.com/xiaoliang98/article/details/109563576#comments_13789154
原创 javase 06 枚举=https://blog.youkuaiyun.com/xiaoliang98/article/details/109345988
原创 java web 08 EL表达式 获取java中的对象属性值遍历在页面上=https://blog.youkuaiyun.com/xiaoliang98/article/details/109622634
原创 Mybatis SQL语句批量操作与多级缓存=https://blog.youkuaiyun.com/xiaoliang98/article/details/109540947
javase 17 内部类=https://blog.youkuaiyun.com/xiaoliang98/article/details/109548441#comments_13754566
原创 java web 04 jQuery jQuery api与jquery-1.7.2.js资料分享=https://blog.youkuaiyun.com/xiaoliang98/article/details/109605852
原创 javase 10 IO=https://blog.youkuaiyun.com/xiaoliang98/article/details/109369925
原创 javase 12 反射=https://blog.youkuaiyun.com/xiaoliang98/article/details/109393985
原创 AOP的原理与及三种实现方式=https://blog.youkuaiyun.com/xiaoliang98/article/details/110112244
原创 mybatis 进阶 动态sql与多表连接的案例=https://blog.youkuaiyun.com/xiaoliang98/article/details/109496749
Oracle 上下合并两张表格 207=https://blog.youkuaiyun.com/xiaoliang98/article/details/109424615
原创 Oracle 安装与学习 适合小白入手练习=https://blog.youkuaiyun.com/xiaoliang98/article/details/109407665
原创 Spring MVC controller数据接收与页面跳转=https://blog.youkuaiyun.com/xiaoliang98/article/details/110246248
原创 动态cs直观显示区别:static absolute relative fixed=https://blog.youkuaiyun.com/xiaoliang98/article/details/109608635
原创 jsoup 获取CSDN博客的所有链接=https://blog.youkuaiyun.com/xiaoliang98/article/details/110291877
原创 Maven的使用与及Maven的集成=https://blog.youkuaiyun.com/xiaoliang98/article/details/109906484
原创 mysql 的使用笔记 适合已经学过的=https://blog.youkuaiyun.com/xiaoliang98/article/details/109909934
java web 03 JavaScript基础=https://blog.youkuaiyun.com/xiaoliang98/article/details/109565597#comments_13788376
原创 javase 14 正则=https://blog.youkuaiyun.com/xiaoliang98/article/details/109436639
jsoup 获取CSDN博客的所有链接=https://blog.youkuaiyun.com/xiaoliang98/article/details/110291877
原创 javase 13 注解=https://blog.youkuaiyun.com/xiaoliang98/article/details/109436350
原创 逆向工程 包含配置文件=https://blog.youkuaiyun.com/xiaoliang98/article/details/109545074
原创 javase 07 容器=https://blog.youkuaiyun.com/xiaoliang98/article/details/109366768
原创 javase 17 内部类=https://blog.youkuaiyun.com/xiaoliang98/article/details/109548441
原创 java web 03 JavaScript基础=https://blog.youkuaiyun.com/xiaoliang98/article/details/109565597
原创 javase 05 泛型=https://blog.youkuaiyun.com/xiaoliang98/article/details/109345942
mybatis 进阶 动态sql与多表连接的案例=https://blog.youkuaiyun.com/xiaoliang98/article/details/109496749#comments_13743811
原创 Servlet 入门解释个人理解 后期会更新修改=https://blog.youkuaiyun.com/xiaoliang98/article/details/109659679
原创 动态代理三种实现方式 与及案例分析=https://blog.youkuaiyun.com/xiaoliang98/article/details/110005253
原创 mybatis 小白入门首选 简单无脑操作带你走进mybatis=https://blog.youkuaiyun.com/xiaoliang98/article/details/109479089
mybatis 进阶 动态sql与多表连接的案例 1035=https://blog.youkuaiyun.com/xiaoliang98/article/details/109496749
原创 Spring 使用c3p0连接mysql8.0版本的环境配置=https://blog.youkuaiyun.com/xiaoliang98/article/details/109708187
java web 04 jQuery jQuery api与jquery-1.7.2.js资料分享=https://blog.youkuaiyun.com/xiaoliang98/article/details/109605852#comments_13806891
原创 java web 01 HTML适用于后端=https://blog.youkuaiyun.com/xiaoliang98/article/details/109563576
原创 JS 登录验证 在前端验证代码格式=https://blog.youkuaiyun.com/xiaoliang98/article/details/109659916
mysql 的使用笔记 适合已经学过的 265=https://blog.youkuaiyun.com/xiaoliang98/article/details/109909934
原创 网页不能复制文本的解决办法=https://blog.youkuaiyun.com/xiaoliang98/article/details/109826696
原创 SpringIOC 原理详解与及注解配置 整合Junit=https://blog.youkuaiyun.com/xiaoliang98/article/details/109767260
原创 Oracle JDBC 详解类实现方法介绍 封装增删改查方法=https://blog.youkuaiyun.com/xiaoliang98/article/details/109458453
原创 Maven基本项目模板配置,从小到大的配置 逐渐累加=https://blog.youkuaiyun.com/xiaoliang98/article/details/110138574
原创 javase 11 多线程=https://blog.youkuaiyun.com/xiaoliang98/article/details/109393177
原创 Oracle 上下合并两张表格=https://blog.youkuaiyun.com/xiaoliang98/article/details/109424615
53
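The 53 entries above are not 53 distinct articles. The map is keyed by link text, and the same article shows up under several texts: with and without the 原创 prefix, with a trailing number, or with a #comments_ fragment on the URL. A minimal sketch of one way to get a unique list, assuming the goal is one entry per article: re-key the map by URL after dropping the fragment. The class LinkDedup and its methods are hypothetical names, and the link texts in main are shortened stand-ins; only the URLs are taken from the output above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; not part of the original post.
public class LinkDedup {

    // Drops a trailing "#fragment" (for example "#comments_13754566") from a URL.
    public static String stripFragment(String url) {
        int hash = url.indexOf('#');
        return hash >= 0 ? url.substring(0, hash) : url;
    }

    // Re-keys a text -> href map by normalized URL; the first text seen for a URL is kept.
    public static Map<String, String> byUrl(Map<String, String> textToHref) {
        Map<String, String> urlToText = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : textToHref.entrySet()) {
            urlToText.putIfAbsent(stripFragment(e.getValue()), e.getKey());
        }
        return urlToText;
    }

    public static void main(String[] args) {
        Map<String, String> urls = new LinkedHashMap<>();
        // Link texts are placeholders; URLs come from the console output.
        urls.put("post A", "https://blog.youkuaiyun.com/xiaoliang98/article/details/109548441");
        urls.put("post A (comment link)",
                "https://blog.youkuaiyun.com/xiaoliang98/article/details/109548441#comments_13754566");
        urls.put("post B", "https://blog.youkuaiyun.com/xiaoliang98/article/details/109345988");

        Map<String, String> unique = byUrl(urls);
        unique.forEach((url, text) -> System.out.println(url + " <- " + text));
        System.out.println(unique.size()); // prints 2
    }
}
```

Which text survives for a given URL depends on the iteration order of the input map, so with a HashMap as input the kept title is not predictable; only the set of URLs is.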