A pitfall when extracting a tags with bs4

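This post describes a problem I hit when using BeautifulSoup (bs4) to extract the href attribute of the a tags on a web page, and how I solved it. When some a tags have no href attribute, calling the get method instead of indexing with [] avoids the error.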


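Today I ran into a really nasty trap.

I was using bs4 (full name: BeautifulSoup) to extract the href attribute of every a tag on a web page.

The nasty part is that some a tags on that page have no href attribute, so every time I ran the loop that extracts href from the a tags it kept throwing errors (a KeyError, because the attribute is absent), which was very frustrating.
I tried to get around the trap by limiting the number of loop iterations and by replacing strings.

You know how it goes.

Everything failed.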
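The failure described above can be reproduced with a small sketch. The HTML snippet and variable names here are made up for illustration (the original page is not shown in this post); the second a tag deliberately has no href.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page: the middle <a> tag has no href attribute.
html = """
<a href="/home">Home</a>
<a onclick="alert('hi')">No href here</a>
<a href="/about">About</a>
"""

soup = BeautifulSoup(html, "html.parser")

collected = []
error = None
try:
    for a in soup.find_all("a"):
        # Indexing a tag like a dict raises KeyError when the attribute is absent.
        collected.append(a["href"])
except KeyError as exc:
    error = exc

print(collected)    # only the links gathered before the failure
print(repr(error))  # the exception raised by the tag without href
```

The loop stops at the first tag without href, so the links after it are never collected.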

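In the end,

I changed the way I extract href from tag['href'] to tag.get('href') and escaped the trap!!! Indexing with ['href'] raises an error when the attribute is missing, while get('href') simply returns None.

This bug!!! Such a trap!!!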
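Here is a minimal sketch of the fix, again on a made-up HTML snippet (the names html, hrefs, present and filtered are my own, not from the original page). It also shows one alternative, passing href=True to find_all, which only matches tags that actually have the attribute.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page: the middle <a> tag has no href attribute.
html = """
<a href="/home">Home</a>
<a onclick="alert('hi')">No href here</a>
<a href="/about">About</a>
"""

soup = BeautifulSoup(html, "html.parser")

# get() returns None instead of raising when the attribute is missing.
hrefs = [a.get("href") for a in soup.find_all("a")]

# Drop the missing ones if only real links are wanted.
present = [h for h in hrefs if h is not None]

# Alternative: ask find_all for <a> tags that have an href at all,
# so indexing with ["href"] is safe inside the comprehension.
filtered = [a["href"] for a in soup.find_all("a", href=True)]

print(hrefs)
print(present)
print(filtered)
```

Whether to use get() or href=True is mostly a matter of taste: get() keeps the tags without href in the loop (useful if you still want their text), while href=True skips them up front.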
