How to efficiently add 10000 li elements to a ul

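This article presents two ways to manipulate the DOM efficiently in web development. The first builds the elements inside a document fragment and appends them in bulk; the second assembles a string and assigns it to the innerHTML property in one step. The first suits adding large numbers of elements and gains efficiency by reducing the number of DOM operations; the second suits smaller numbers of elements and is concise and easy to read.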


Method 1: Document fragment

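<body>
<ul></ul>
<script>
  // Grab the first <ul> on the page.
  var oUl = document.getElementsByTagName('ul')[0];
  // Build the items off-document inside a fragment.
  var oFrag = document.createDocumentFragment();
  for (var i = 0; i < 10000; i++) {
    var oLi = document.createElement('li');
    oLi.innerHTML = i;
    oFrag.appendChild(oLi);
  }
  // A single append moves every <li> into the live list.
  oUl.appendChild(oFrag);
</script>
</body>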
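The fragment approach can also be wrapped in a reusable function. The following is a minimal sketch, not the author's code: the name `appendItems` and its `doc` parameter are hypothetical, introduced so the same logic can be exercised against any document-like object. It uses `textContent` instead of `innerHTML` on the assumption that the item labels are plain text.

```javascript
// Hypothetical helper (not from the original article): creates `count` <li>
// elements inside a DocumentFragment and attaches them to `list` in one call.
function appendItems(list, count, doc) {
  const fragment = doc.createDocumentFragment();
  for (let i = 0; i < count; i++) {
    const item = doc.createElement('li');
    item.textContent = String(i); // plain text, so no HTML parsing is needed
    fragment.appendChild(item);   // off-document; the live list is untouched
  }
  list.appendChild(fragment);     // the only insertion into the live list
  return count;
}

// Browser usage; skipped when no DOM is available (for example under Node).
if (typeof document !== 'undefined') {
  const list = document.querySelector('ul');
  if (list) {
    appendItems(list, 10000, document);
  }
}
```

Passing the document in as a parameter is a design choice made for this sketch only; in a page script, calling `document` directly as in the example above works just as well.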

 

Method 2: Writing a string directly

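<body>
<ul></ul>
<script>
  // Grab the first <ul> on the page.
  var oUl = document.getElementsByTagName('ul')[0];
  // Accumulate the markup for every item in one string.
  var str = '';
  // This example builds 100 items; raise the bound to 10000 to match the title's scenario.
  for (var i = 0; i < 100; i++) {
    str += '<li>' + i + '</li>';
  }
  // A single innerHTML assignment parses the string and replaces the list's contents.
  oUl.innerHTML = str;
</script>
</body>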
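When the item text comes from somewhere other than a loop counter, the string approach needs escaping, because innerHTML parses whatever it is given as markup. The sketch below is illustrative rather than part of the original article: `escapeHtml` and `buildListHtml` are hypothetical names, and the sample labels are made up.

```javascript
// Hypothetical helper: escapes the characters that are significant in HTML.
// The ampersand is replaced first so later replacements are not double-escaped.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: turns an array of labels into one string of <li> markup.
function buildListHtml(labels) {
  return labels
    .map(function (label) {
      return '<li>' + escapeHtml(label) + '</li>';
    })
    .join('');
}

// Browser usage; skipped when no DOM is available (for example under Node).
if (typeof document !== 'undefined') {
  const list = document.querySelector('ul');
  if (list) {
    list.innerHTML = buildListHtml(['a', '<b>', 'c & d']);
  }
}
```

Note that assigning to innerHTML replaces the existing children of the list, so this variant fits cases where the list is rebuilt from scratch rather than extended.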

<div data-v-4ef07c39="" class="table"><div data-v-40b64b12="" data-v-4ef07c39="" class="search"><div data-v-40b64b12="" class="left"><div data-v-40b64b12="" class="el-select el-select--medium" style="width: 12% !important;"><!----><div class="el-input el-input--medium el-input--suffix"><!----><input type="text" readonly="readonly" autocomplete="off" placeholder="请选择数据中心" class="el-input__inner"><!----><span class="el-input__suffix"><span class="el-input__suffix-inner"><i class="el-select__caret el-input__icon el-icon-arrow-up"></i><!----><!----><!----><!----><!----></span><!----></span><!----><!----></div><div class="el-select-dropdown el-popper" style="display: none; min-width: 118.859px;"><div class="el-scrollbar" style=""><div class="el-select-dropdown__wrap el-scrollbar__wrap el-scrollbar__wrap--hidden-default"><ul class="el-scrollbar__view el-select-dropdown__list"><!----><li data-v-40b64b12="" class="el-select-dropdown__item"><span>空</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>X1</span></li></ul></div><div class="el-scrollbar__bar is-horizontal"><div class="el-scrollbar__thumb" style="transform: translateX(0%);"></div></div><div class="el-scrollbar__bar is-vertical"><div class="el-scrollbar__thumb" style="transform: translateY(0%);"></div></div></div><!----></div></div> <div data-v-40b64b12="" class="el-select el-select--medium" style="width: 12% !important;"><!----><div class="el-input el-input--medium is-disabled el-input--suffix"><!----><input type="text" disabled="disabled" readonly="readonly" autocomplete="off" placeholder="请选择业务应用" class="el-input__inner"><!----><span class="el-input__suffix"><span class="el-input__suffix-inner"><i class="el-select__caret el-input__icon el-icon-arrow-up"></i><!----><!----><!----><!----><!----></span><!----></span><!----><!----></div><div class="el-select-dropdown el-popper" style="display: none; min-width: 118.859px;"><div class="el-scrollbar" style="display: none;"><div 
class="el-select-dropdown__wrap el-scrollbar__wrap el-scrollbar__wrap--hidden-default"><ul class="el-scrollbar__view el-select-dropdown__list"></ul></div><div class="el-scrollbar__bar is-horizontal"><div class="el-scrollbar__thumb" style="transform: translateX(0%);"></div></div><div class="el-scrollbar__bar is-vertical"><div class="el-scrollbar__thumb" style="transform: translateY(0%);"></div></div></div><p class="el-select-dropdown__empty"> 无数据 </p></div></div> <div data-v-40b64b12="" class="el-select el-select--medium" style="width: 12% !important;"><!----><div class="el-input el-input--medium is-disabled el-input--suffix"><!----><input type="text" disabled="disabled" readonly="readonly" autocomplete="off" placeholder="请选择子业务应用" class="el-input__inner"><!----><span class="el-input__suffix"><span class="el-input__suffix-inner"><i class="el-select__caret el-input__icon el-icon-arrow-up"></i><!----><!----><!----><!----><!----></span><!----></span><!----><!----></div><div class="el-select-dropdown el-popper" style="display: none; min-width: 118.859px;"><div class="el-scrollbar" style="display: none;"><div class="el-select-dropdown__wrap el-scrollbar__wrap el-scrollbar__wrap--hidden-default"><ul class="el-scrollbar__view el-select-dropdown__list"></ul></div><div class="el-scrollbar__bar is-horizontal"><div class="el-scrollbar__thumb" style="transform: translateX(0%);"></div></div><div class="el-scrollbar__bar is-vertical"><div class="el-scrollbar__thumb" style="transform: translateY(0%);"></div></div></div><p class="el-select-dropdown__empty"> 无数据 </p></div></div> <div data-v-40b64b12="" class="el-select el-select--medium" style="width: 12% !important;"><!----><div class="el-input el-input--medium el-input--suffix"><!----><input type="text" readonly="readonly" autocomplete="off" placeholder="请选择资产类别" class="el-input__inner"><!----><span class="el-input__suffix"><span class="el-input__suffix-inner"><i class="el-select__caret el-input__icon 
el-icon-arrow-up"></i><!----><!----><!----><!----><!----></span><!----></span><!----><!----></div><div class="el-select-dropdown el-popper" style="display: none; min-width: 118.859px;"><div class="el-scrollbar" style=""><div class="el-select-dropdown__wrap el-scrollbar__wrap el-scrollbar__wrap--hidden-default"><ul class="el-scrollbar__view el-select-dropdown__list"><!----><li data-v-40b64b12="" class="el-select-dropdown__item"><span>空</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>虚拟服务器</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>物理服务器</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>备份一体机</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>时钟服务器</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>数据库服务器</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>存储</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>网络设备</span></li></ul></div><div class="el-scrollbar__bar is-horizontal"><div class="el-scrollbar__thumb" style="transform: translateX(0%);"></div></div><div class="el-scrollbar__bar is-vertical"><div class="el-scrollbar__thumb" style="transform: translateY(0%);"></div></div></div><!----></div></div> <div data-v-40b64b12="" class="el-select el-select--medium" style="width: 12% !important;"><!----><div class="el-input el-input--medium el-input--suffix"><!----><input type="text" readonly="readonly" autocomplete="off" placeholder="请选择资产品牌" class="el-input__inner"><!----><span class="el-input__suffix"><span class="el-input__suffix-inner"><i class="el-select__caret el-input__icon el-icon-arrow-up"></i><!----><!----><!----><!----><!----></span><!----></span><!----><!----></div><div class="el-select-dropdown el-popper" style="display: none; min-width: 118.859px;"><div class="el-scrollbar" style=""><div class="el-select-dropdown__wrap el-scrollbar__wrap el-scrollbar__wrap--hidden-default"><ul 
class="el-scrollbar__view el-select-dropdown__list"><!----><li data-v-40b64b12="" class="el-select-dropdown__item"><span>空</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>华为</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>华三</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>中兴</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>迪普</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>锐捷</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>博科</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>同有</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>天融信</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>深信服</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>爱数</span></li><li data-v-40b64b12="" class="el-select-dropdown__item"><span>戴尔</span></li></ul></div><div class="el-scrollbar__bar is-horizontal"><div class="el-scrollbar__thumb" style="transform: translateX(0%);"></div></div><div class="el-scrollbar__bar is-vertical"><div class="el-scrollbar__thumb" style="transform: translateY(0%);"></div></div></div><!----></div></div> <div data-v-40b64b12="" class="el-select el-select--medium" style="width: 12% !important;"><!----><div class="el-input el-input--medium el-input--suffix"><!----><input type="text" autocomplete="off" placeholder="请选择状态" class="el-input__inner" readonly="readonly"><!----><span class="el-input__suffix"><span class="el-input__suffix-inner"><i class="el-select__caret el-input__icon el-icon-arrow-up"></i><!----><!----><!----><!----><!----></span><!----></span><!----><!----></div></div> <div data-v-40b64b12="" class="el-select el-select--medium" style="width: 12% !important;"><!----><div class="el-input el-input--medium el-input--suffix"><!----><input type="text" autocomplete="off" placeholder="请选择所在分区" class="el-input__inner" 
readonly="readonly"><!----><span class="el-input__suffix"><span class="el-input__suffix-inner"><i class="el-select__caret el-input__icon el-icon-arrow-up"></i><!----><!----><!----><!----><!----></span><!----></span><!----><!----></div></div> <button data-v-40b64b12="" type="button" class="el-button el-button--info btn-5 el-button--default el-button--medium"><!----><!----><span> 条件重置 </span></button></div> <div data-v-40b64b12="" class="right"><div data-v-40b64b12="" class="el-input el-input--medium" style="width: 80%;"><!----><input type="text" autocomplete="off" placeholder="请输入资产名称/IP/细分业务进行搜索" class="el-input__inner"><!----><!----><!----><!----></div> <button data-v-40b64b12="" type="button" class="el-button searchBtn el-button--default el-button--medium"><!----><i class="el-icon-search"></i><!----></button></div></div> <div data-v-4ef07c39="" class="el-table el-table--fit el-table--striped el-table--scrollable-x el-table--enable-row-hover el-table--enable-row-transition el-table--medium" style="width: 100%;"><div class="hidden-columns"><div data-v-4ef07c39=""></div> <div data-v-4ef07c39=""></div> <div data-v-4ef07c39=""></div> <div data-v-4ef07c39=""></div> <div data-v-4ef07c39=""></div> <div data-v-4ef07c39=""></div> <div data-v-4ef07c39=""></div> <div data-v-4ef07c39=""></div> <div data-v-4ef07c39=""></div> <div data-v-4ef07c39=""></div> <div data-v-4ef07c39=""></div></div><div class="el-table__header-wrapper"><table cellspacing="0" cellpadding="0" border="0" class="el-table__header" style="width: 1320px;"><colgroup><col name="el-table_1_column_1" width="50"><col name="el-table_1_column_2" width="230"><col name="el-table_1_column_3" width="120"><col name="el-table_1_column_4" width="180"><col name="el-table_1_column_5" width="160"><col name="el-table_1_column_6" width="80"><col name="el-table_1_column_7" width="120"><col name="el-table_1_column_8" width="100"><col name="el-table_1_column_9" width="100"><col name="el-table_1_column_10" width="80"><col 
name="el-table_1_column_11" width="100"></colgroup><thead class=""><tr class=""><th colspan="1" rowspan="1" class="el-table_1_column_1 is-leaf"><div class="cell">序号</div></th><th colspan="1" rowspan="1" class="el-table_1_column_2 is-leaf"><div class="cell">资产名称</div></th><th colspan="1" rowspan="1" class="el-table_1_column_3 is-leaf"><div class="cell">IP地址</div></th><th colspan="1" rowspan="1" class="el-table_1_column_4 is-leaf"><div class="cell">监控点名称</div></th><th colspan="1" rowspan="1" class="el-table_1_column_5 is-leaf"><div class="cell">告警时间</div></th><th colspan="1" rowspan="1" class="el-table_1_column_6 is-leaf"><div class="cell">重试次数</div></th><th colspan="1" rowspan="1" class="el-table_1_column_7 is-leaf"><div class="cell">运维单位</div></th><th colspan="1" rowspan="1" class="el-table_1_column_8 is-leaf"><div class="cell">负责人</div></th><th colspan="1" rowspan="1" class="el-table_1_column_9 is-leaf"><div class="cell">联系方式</div></th><th colspan="1" rowspan="1" class="el-table_1_column_10 is-leaf"><div class="cell">详细信息</div></th><th colspan="1" rowspan="1" class="el-table_1_column_11 is-leaf"><div class="cell">告警状态</div></th></tr></thead></table></div><div class="el-table__body-wrapper is-scrolling-left"><table cellspacing="0" cellpadding="0" border="0" class="el-table__body" style="width: 1320px;"><colgroup><col name="el-table_1_column_1" width="50"><col name="el-table_1_column_2" width="230"><col name="el-table_1_column_3" width="120"><col name="el-table_1_column_4" width="180"><col name="el-table_1_column_5" width="160"><col name="el-table_1_column_6" width="80"><col name="el-table_1_column_7" width="120"><col name="el-table_1_column_8" width="100"><col name="el-table_1_column_9" width="100"><col name="el-table_1_column_10" width="80"><col name="el-table_1_column_11" width="100"></colgroup><tbody><tr class="el-table__row"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">1</div></td><td rowspan="1" colspan="1" 
class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 16:31:05</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row el-table__row--striped"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">2</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 16:31:04</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" 
class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">3</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 16:31:04</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row el-table__row--striped"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">4</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 
16:31:03</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">5</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 16:31:03</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row el-table__row--striped"><td rowspan="1" colspan="1" 
class="el-table_1_column_1 "><div class="cell">6</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 16:31:02</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">7</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 16:31:01</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div 
class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row el-table__row--striped"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">8</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 16:31:01</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">9</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" 
class="el-table_1_column_5 "><div class="cell">2025-07-17 16:31:01</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row el-table__row--striped"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">10</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 13:11:05</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr 
class="el-table__row"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 13:11:04</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row el-table__row--striped"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">12</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 13:11:04</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div 
class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">13</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 13:11:03</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row el-table__row--striped"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">14</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" 
class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 13:11:02</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">15</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 13:11:02</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div 
class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row el-table__row--striped"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">16</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 13:11:01</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">17</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 13:11:01</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td 
rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><tr class="el-table__row el-table__row--striped"><td rowspan="1" colspan="1" class="el-table_1_column_1 "><div class="cell">18</div></td><td rowspan="1" colspan="1" class="el-table_1_column_2 "><div class="cell">mesdb1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_3 "><div class="cell">10.10.10.11</div></td><td rowspan="1" colspan="1" class="el-table_1_column_4 "><div class="cell">Oracle行锁争用检查</div></td><td rowspan="1" colspan="1" class="el-table_1_column_5 "><div class="cell">2025-07-17 13:11:01</div></td><td rowspan="1" colspan="1" class="el-table_1_column_6 "><div class="cell">0</div></td><td rowspan="1" colspan="1" class="el-table_1_column_7 "><div class="cell">Not Found</div></td><td rowspan="1" colspan="1" class="el-table_1_column_8 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_9 "><div class="cell">undefined</div></td><td rowspan="1" colspan="1" class="el-table_1_column_10 "><div class="cell">CRITICAL - select count(*) from v$session where event ='enq: tx - row lock contention' : 1</div></td><td rowspan="1" colspan="1" class="el-table_1_column_11 "><div class="cell"><div data-v-4ef07c39="" class="btn-2"> 严重告警 </div></div></td></tr><!----></tbody></table><!----><!----></div><!----><!----><!----><!----><div class="el-table__column-resize-proxy" style="display: none;"></div></div> <div data-v-4ef07c39="" class="page"><div data-v-4ef07c39="" class="el-pagination is-background"><span 
class="el-pagination__total">共 6024 条</span><button type="button" disabled="disabled" class="btn-prev"><i class="el-icon el-icon-arrow-left"></i></button><ul class="el-pager"><li class="number active">1</li><!----><li class="number">2</li><li class="number">3</li><li class="number">4</li><li class="number">5</li><li class="number">6</li><li class="el-icon more btn-quicknext el-icon-more"></li><li class="number">335</li></ul><button type="button" class="btn-next"><i class="el-icon el-icon-arrow-right"></i></button><span class="el-pagination__jump">前往<div class="el-input el-input--medium el-pagination__editor is-in-pagination"><!----><input type="number" autocomplete="off" min="1" max="335" class="el-input__inner"><!----><!----><!----><!----></div>页</span></div></div></div> Scrape all the data from this paginated table into an Excel file
07-19
# Scraping a paginated table with Java and exporting it to Excel

Below is a complete Java solution that scrapes the data from a paginated table and exports it to an Excel file. The steps and the full code follow.

## Solution steps

1. **Add the required dependencies** (Maven configuration)
2. **Fetch the paginated data** (pagination logic)
3. **Parse the HTML table** (extract the header and the row data)
4. **Export to Excel** (using Apache POI)
5. **Error handling and anti-blocking measures**

## Complete implementation

```java
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class PaginatedTableScraper {

    public static void main(String[] args) {
        String baseUrl = "https://example.com/alarm-table?page="; // replace with the real paginated URL
        int totalPages = 10; // assumed to be 10 pages; determine the real value from the page

        List<List<String>> allData = new ArrayList<>();
        List<String> headers = null;
        Random random = new Random();

        // 1. Fetch every page
        for (int page = 1; page <= totalPages; page++) {
            String url = baseUrl + page;
            System.out.println("Fetching page: " + url);

            try {
                // Random 1-3 second delay to reduce the risk of being blocked
                Thread.sleep(1000 + random.nextInt(2000));

                // Download the page
                Document doc = Jsoup.connect(url)
                        .userAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36")
                        .timeout(10000)
                        .get();

                // 2. Parse the table
                Element table = doc.select("table").first();
                if (table == null) {
                    System.out.println("No table found on page " + page);
                    continue;
                }

                // Extract the header once, from the first page that has a table
                if (headers == null) {
                    headers = new ArrayList<>();
                    Elements headerCells = table.select("thead th");
                    if (headerCells.isEmpty()) {
                        headerCells = table.select("tr:first-child th");
                    }
                    for (Element th : headerCells) {
                        headers.add(th.text().trim());
                    }
                    System.out.println("Header: " + headers);
                }

                // Extract the data rows
                Elements rows = table.select("tbody tr");
                if (rows.isEmpty()) {
                    rows = table.select("tr:not(thead tr)");
                }

                for (Element row : rows) {
                    List<String> rowData = new ArrayList<>();
                    for (Element cell : row.select("td")) {
                        rowData.add(cell.text().trim());
                    }
                    // Keep only rows whose column count matches the header
                    if (rowData.size() == headers.size()) {
                        allData.add(rowData);
                    }
                }
            } catch (IOException e) {
                // Log and skip a failed page so the pages already fetched are still exported
                System.err.println("Failed to fetch page " + page + ": " + e.getMessage());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }

        // 3. Export to Excel
        if (headers != null && !allData.isEmpty()) {
            exportToExcel(headers, allData, "alarm_data.xlsx");
            System.out.println("Export finished, " + allData.size() + " rows written");
        } else {
            System.out.println("No data retrieved");
        }
    }

    // Writes the header and all rows to an .xlsx file
    private static void exportToExcel(List<String> headers, List<List<String>> data, String fileName) {
        try (Workbook workbook = new XSSFWorkbook()) {
            Sheet sheet = workbook.createSheet("Alarm data");

            // Create the header style once and reuse it for every header cell
            CellStyle headerStyle = workbook.createCellStyle();
            headerStyle.setFillForegroundColor(IndexedColors.GREY_25_PERCENT.getIndex());
            headerStyle.setFillPattern(FillPatternType.SOLID_FOREGROUND);
            Font headerFont = workbook.createFont();
            headerFont.setBold(true);
            headerStyle.setFont(headerFont);

            // Header row
            Row headerRow = sheet.createRow(0);
            for (int i = 0; i < headers.size(); i++) {
                Cell cell = headerRow.createCell(i);
                cell.setCellValue(headers.get(i));
                cell.setCellStyle(headerStyle);
            }

            // Data rows
            for (int i = 0; i < data.size(); i++) {
                Row row = sheet.createRow(i + 1);
                List<String> rowData = data.get(i);
                for (int j = 0; j < rowData.size(); j++) {
                    row.createCell(j).setCellValue(rowData.get(j));
                }
            }

            // Auto-size the columns to fit their content
            for (int i = 0; i < headers.size(); i++) {
                sheet.autoSizeColumn(i);
            }

            // Write the file
            try (FileOutputStream outputStream = new FileOutputStream(fileName)) {
                workbook.write(outputStream);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```

## Maven dependencies

```xml
<dependencies>
    <!-- Jsoup HTML parser -->
    <dependency>
        <groupId>org.jsoup</groupId>
        <artifactId>jsoup</artifactId>
        <version>1.15.4</version>
    </dependency>

    <!-- Apache POI for Excel -->
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi</artifactId>
        <version>5.2.3</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>5.2.3</version>
    </dependency>
</dependencies>
```

## Key features

1. **Pagination**:
   - Builds each page URL in a loop (for example `?page=1`, `?page=2`)
   - Waits a random interval between requests to avoid triggering anti-scraping checks

2. **Table parsing**:
   - Locates the table element with Jsoup
   - Reads the header from `<thead>` and falls back to the first row if there is none
   - Extracts the text of every `<td>` element

3. **Excel export**:
   - Creates an XLSX file with Apache POI
   - Auto-sizes the columns to fit their content
   - Applies a bold, grey-filled style to the header row

4. **Anti-blocking measures**:
   - Sets a User-Agent that mimics a browser
   - Adds a random delay of 1-3 seconds
   - No proxy is configured in the code above; Jsoup's `Connection.proxy(host, port)` can add one if needed

## Suggested extensions

1. **Get the total page count dynamically**:

```java
// While parsing the first page, read the total page count from the pager.
// The ".pagination" selector and the "末页" ("last page") link text are assumptions;
// the markup in the question uses an Element UI pager (ul.el-pager li.number) instead.
Element pagination = doc.select(".pagination").first();
if (pagination != null) {
    Element lastPage = pagination.select("a:contains(末页)").first();
    if (lastPage != null) {
        String href = lastPage.attr("href");
        // Extract the total page count from href
    }
}
```

2. **Handle login authentication**:

```java
// Needs: import org.jsoup.Connection; import java.util.Map;
// Log in and keep the session cookies
Connection.Response loginRes = Jsoup.connect(loginUrl)
        .data("username", "your_user", "password", "your_pass")
        .method(Connection.Method.POST)
        .execute();
Map<String, String> cookies = loginRes.cookies();

// Send the cookies with later requests
Document doc = Jsoup.connect(targetUrl)
        .cookies(cookies)
        .get();
```

3. **Schedule the job**:

```java
// Spring's @Scheduled annotation is shown here; Quartz is an alternative.
@Scheduled(cron = "0 0 8 * * ?") // runs every day at 08:00
public void scheduledScrape() {
    // scraping and export logic
}
```

## Troubleshooting

| Problem | Solution |
|---------|----------|
| Table is inside an iframe | Locate the iframe, read its `src`, and request that URL separately |
| Content is loaded dynamically | Render the JavaScript with Selenium WebDriver |
| CAPTCHA | Integrate an OCR library or enter the CAPTCHA manually |
| Encrypted pagination parameters | Inspect the network requests to find the real pagination parameters |
| Very large data volume | Export in batches or store the rows directly in a database |
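For the pager pasted in the question, the total page count can be read from the markup itself instead of being hard-coded. The sketch below is one possible approach that uses only the Java standard library and two details visible in that snippet: the jump input's `max="335"` attribute and the `共 6024 条` ("6024 records in total") summary. The class name `PageCountSketch`, its method names, and the page size of 18 (inferred from the last visible row number on page 1) are assumptions made for illustration, not part of the original answer.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PageCountSketch {

    // Matches the pager's jump input, e.g. <input type="number" ... max="335" ...>
    private static final Pattern MAX_ATTR =
            Pattern.compile("<input[^>]*type=\"number\"[^>]*\\bmax=\"(\\d+)\"");

    // Matches the record summary "共 N 条" (\u5171 = 共, \u6761 = 条)
    private static final Pattern TOTAL_TEXT =
            Pattern.compile("\u5171\\s*(\\d+)\\s*\u6761");

    // Reads the page count from the jump input's max attribute, if present
    public static OptionalInt pageCountFromMaxAttr(String html) {
        Matcher m = MAX_ATTR.matcher(html);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    // Reads the total number of records from the summary text, if present
    public static OptionalInt totalRecords(String html) {
        Matcher m = TOTAL_TEXT.matcher(html);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    // Ceiling division: number of pages needed for totalRecords rows
    public static int pageCount(int totalRecords, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        return (totalRecords + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        // Shortened copy of the pager markup from the question
        String html = "<span class=\"el-pagination__total\">\u5171 6024 \u6761</span>"
                + "<input type=\"number\" autocomplete=\"off\" min=\"1\" max=\"335\" class=\"el-input__inner\">";

        int assumedPageSize = 18; // assumption: inferred from the visible rows, verify on the real page
        int fallback = pageCount(totalRecords(html).orElse(0), assumedPageSize);
        int totalPages = pageCountFromMaxAttr(html).orElse(fallback);
        System.out.println("Total pages: " + totalPages);
    }
}
```

Preferring the `max` attribute avoids guessing the page size; the record-count fallback only helps when the page size is known. Because this table is rendered by a front-end framework, the pager markup may only exist after JavaScript has run, so the HTML passed in may need to come from a rendered page rather than a plain HTTP fetch.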