http://blog.youkuaiyun.com/cnweike/article/details/8076440
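
    # Python 2 script (cookielib was renamed http.cookiejar in Python 3).
    import mechanize
    import cookielib

    # Create a browser that keeps session cookies between requests.
    br = mechanize.Browser()
    cj = cookielib.LWPCookieJar()
    br.set_cookiejar(cj)

    # Browser behaviour: honour http-equiv headers, gzip, redirects and
    # Referer, ignore robots.txt, and follow refreshes of at most 1 second.
    br.set_handle_equiv(True)
    br.set_handle_gzip(True)
    br.set_handle_redirect(True)
    br.set_handle_referer(True)
    br.set_handle_robots(False)
    br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)
    br.set_debug_http(False)

    # Present a desktop Firefox User-Agent.
    br.addheaders = [('User-agent', 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20100101 Firefox/15.0.1')]

    # Log in through the form named 'loginFrm'.
    response = br.open('http://xxxx/signon')
    br.select_form(name='loginFrm')
    br.form['userName'] = 'xxx'
    br.form['password'] = 'yyy'
    br.submit()
    # Printed unconditionally: the script does not check the login response.
    print('login successful!')

    # Open the query page, make read-only fields writable, point the form
    # at the servlet, and request the data for one date.
    response = br.open('http://xxxx/app/application/attendmanage/vieworiginaldata.jsp')
    br.select_form(name='form1')
    br.form.set_all_readonly(False)
    br.form.action = 'http://xxxx/app/servlet/ViewOriginalDataServlet'
    br.form['fromdate'] = '2012-09-05'
    br.submit()
    print(br.response().read())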
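
The script prints 'login successful!' without inspecting the page returned by the login submit. One possible check, sketched below for Python 3 with only the standard library, is to look for the login form in the returned HTML. It assumes that a failed login shows the 'loginFrm' form again, which is a guess about the site and not something the original post states. The names `FormNameCollector` and `login_form_present` are hypothetical helpers introduced for illustration.

```python
from html.parser import HTMLParser


class FormNameCollector(HTMLParser):
    """Collect the name attribute of every <form> tag in a page."""

    def __init__(self):
        super().__init__()
        self.form_names = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "form":
            self.form_names.append(dict(attrs).get("name"))


def login_form_present(html, form_name="loginFrm"):
    """Return True if the page still contains the named login form.

    Assumption: the site re-displays the login form when sign-on fails.
    """
    collector = FormNameCollector()
    collector.feed(html)
    return form_name in collector.form_names
```

With the script above, the text of the page returned by `br.submit()` could be passed to `login_form_present`, and the success message printed only when it returns `False`. On Python 3 the response body is bytes and would need decoding first.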

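This article presents a Python script that uses the mechanize and cookielib libraries to log in automatically and scrape data from a specific website. The script configures the browser's behaviour, handles redirects, and fills in form fields to log in and then fetch the data for a specified date.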
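
For readers without mechanize, the final step (a cookie-aware client sending `fromdate` to the servlet URL) can be approximated with the Python 3 standard library. This is a rough sketch and not the author's method. It assumes the form is submitted with POST and that `fromdate` is the only field that matters. The original post confirms neither, and mechanize would also send any other fields present in 'form1'. The helper names `make_opener` and `build_form_request` are hypothetical.

```python
import http.cookiejar
import urllib.parse
import urllib.request

USER_AGENT = ("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) "
              "Gecko/20100101 Firefox/15.0.1")


def make_opener():
    """Build an opener that stores cookies, similar to br.set_cookiejar(cj)."""
    jar = http.cookiejar.LWPCookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-agent", USER_AGENT)]
    return opener, jar


def build_form_request(action_url, fields):
    """Encode form fields as a POST body for the given action URL.

    Assumption: the target form uses POST with URL-encoded fields.
    """
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(action_url, data=body)


opener, jar = make_opener()
request = build_form_request(
    "http://xxxx/app/servlet/ViewOriginalDataServlet",
    {"fromdate": "2012-09-05"})
# opener.open(request) would send the request. It is not called here
# because 'xxxx' is the placeholder host from the original script.
```

The same opener would have to perform the login request first, so that the session cookie is stored in `jar` before the data request is sent.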



