Regular expression (下一页 is the "next page" link text): <a href="([^>]+)" class="up">下一页</a></div>
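[^>]+ (the part highlighted in red) matches one or more characters that are not ">". The captured href therefore cannot run past the end of its own tag, which filters out the extra links.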
Source HTML to match against:
<div class="page1 mt20"><a href="http://dili.xilu.com/20180821/1000010001055637.html" class="up" >首页</a> <a href="http://dili.xilu.com/20180821/1000010001055637.html" class="up">上一页</a> <a href="http://dili.xilu.com/20180821/1000010001055637.html" class="up2">1</a> <a class='up1' >2</b> <a href="http://dili.xilu.com/20180821/1000010001055637_3.html" class="up2">3</a> <a href="http://dili.xilu.com/20180821/1000010001055637_4.html" class="up2">4</a> <a href="http://dili.xilu.com/20180821/1000010001055637_5.html" class="up2">5</a> <a href="http://dili.xilu.com/20180821/1000010001055637_6.html" class="up2">6</a> <a href="http://dili.xilu.com/20180821/1000010001055637_3.html" class="up">下一页</a></div>
Match result (full match first, then capture group 1):
<a href="http://dili.xilu.com/20180821/1000010001055637_3.html" class="up">下一页</a></div>
http://dili.xilu.com/20180821/1000010001055637_3.html
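
As a rough illustration of why the negated character class matters, the pattern can be tried in Python with the standard re module. This is only a sketch: the markup is an abbreviated copy of the source above, and the names NEXT_PAGE, LOOSE and next_page_url are hypothetical, introduced here for the example. LOOSE is not part of the original note; it swaps [^>]+ for .+ purely for comparison.

```python
import re

# Abbreviated copy of the pagination markup shown above (a few links omitted).
html = (
    '<div class="page1 mt20">'
    '<a href="http://dili.xilu.com/20180821/1000010001055637.html" class="up" >首页</a> '
    '<a href="http://dili.xilu.com/20180821/1000010001055637.html" class="up">上一页</a> '
    '<a href="http://dili.xilu.com/20180821/1000010001055637_4.html" class="up2">4</a> '
    '<a href="http://dili.xilu.com/20180821/1000010001055637_3.html" class="up">下一页</a></div>'
)

# The pattern from the note: [^>]+ cannot cross a ">" character,
# so the capture stays inside a single <a ...> tag.
NEXT_PAGE = re.compile(r'<a href="([^>]+)" class="up">下一页</a></div>')

# Comparison pattern (an assumption for illustration only): "." may match ">",
# so the capture starts at the first link and swallows everything in between.
LOOSE = re.compile(r'<a href="(.+)" class="up">下一页</a></div>')


def next_page_url(page_html):
    """Return the href of the "下一页" (next page) link, or None if absent."""
    match = NEXT_PAGE.search(page_html)
    return match.group(1) if match else None


print(next_page_url(html))
# The loose capture begins at the first link and spans several tags.
print(LOOSE.search(html).group(1)[:80])
```

A non-greedy .+? would not help here either, since the regex engine still begins the match at the first <a href=" it finds and extends it until the tail of the pattern fits. Excluding ">" from the capture is what forces the match onto the last link.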