最近 requests 库的作者,有更新了一个库,requests-html for humans ,试用了一下,效果真的很好。
使得开发者使用的时候更方便调用。它依赖于 PyQuery、Requests、lxml 等库。
不信你也试试
如何安装呢?
pip3 install <span class="wp_keywordlink_affiliate"><a href="https://www.168seo.cn/tag/requests-html" title="View all posts in requests-html" target="_blank">requests-html</a></span>
|
1
|
pip3
install
requests
-
html
|
看下面的教程吧
>>> from requests_html import HTMLSession >>> session = HTMLSession() >>> r = session.get('https://<span class="wp_keywordlink"><a href="http://www.168seo.cn/python" title="python">python</a></span>.org/')
|
1
2
3
4
|
>>>
from
requests_html
import
HTMLSession
>>>
session
=
HTMLSession
(
)
>>>
r
=
session
.
get
(
'https://python.org/'
)
|
获取整个页面的links
>>> r.html.links {'//docs.python.org/3/tutorial/', '/about/apps/', 'https://github.com/python/pythondotorg/issues', '/accounts/login/', '/dev/peps/', '/about/legal/', '//docs.python.org/3/tutorial/introduction.html#lists', '/download/alternatives', 'http://feedproxy.google.com/~r/PythonInsider/~3/kihd2DW98YY/python-370a4-is-available-for-testing.html', '/download/other/', '/downloads/windows/', 'https://mail.python.org/mailman/listinfo/python-dev', '/doc/av', 'https://devguide.python.org/', '/about/success/#engineering', 'https://wiki.python.org/moin/PythonEventsCalendar#Submitting_an_Event', 'https://www.openstack.org', '/about/gettingstarted/', 'http://feedproxy.google.com/~r/PythonInsider/~3/AMoBel8b8Mc/python-3.html', '/success-stories/industrial-light-magic-runs-python/', 'http://docs.python.org/3/tutorial/introduction.html#using-python-as-a-calculator', '/', 'http://pyfound.blogspot.com/', '/events/python-events/past/', '/downloads/release/python-2714/', 'https://wiki.python.org/moin/PythonBooks', 'http://plus.google.com/+Python', 'https://wiki.python.org/moin/', 'https://status.python.org/', '/community/workshops/', '/community/lists/', 'http://buildbot.net/', '/community/awards', 'http://twitter.com/ThePSF', 'https://docs.python.org/3/license.html', '/psf/donations/', 'http://wiki.python.org/moin/Languages', '/dev/', '/events/python-user-group/', 'https://wiki.qt.io/PySide', '/community/sigs/', 'https://wiki.gnome.org/Projects/PyGObject', 'http://www.ansible.com', 'http://www.saltstack.com', 'http://planetpython.org/', '/events/python-events', '/about/help/', '/events/python-user-group/past/', '/about/success/', '/psf-landing/', '/about/apps', '/about/', 'http://www.wxpython.org/', '/events/python-user-group/665/', 'https://www.python.org/psf/codeofconduct/', '/dev/peps/peps.rss', '/downloads/source/', '/psf/sponsorship/sponsors/', 'http://bottlepy.org', 'http://roundup.sourceforge.net/', 'http://pandas.pydata.org/', 'http://brochure.getpython.info/', 'https://bugs.python.org/', '/community/merchandise/', 'http://tornadoweb.org', '/events/python-user-group/650/', 'http://flask.pocoo.org/', '/downloads/release/python-364/', '/events/python-user-group/660/', '/events/python-user-group/638/', '/psf/', '/doc/', 'http://blog.python.org', '/events/python-events/604/', '/about/success/#government', 'http://python.org/dev/peps/', 'https://docs.python.org', 'http://feedproxy.google.com/~r/PythonInsider/~3/zVC80sq9s00/python-364-is-now-available.html', '/users/membership/', '/about/success/#arts', 'https://wiki.python.org/moin/Python2orPython3', '/downloads/', '/jobs/', 'http://trac.edgewall.org/', 'http://feedproxy.google.com/~r/PythonInsider/~3/wh73_1A-N7Q/python-355rc1-and-python-348rc1-are-now.html', '/privacy/', 'https://pypi.python.org/', 'http://www.riverbankcomputing.co.uk/software/pyqt/intro', 'http://www.scipy.org', '/community/forums/', '/about/success/#scientific', '/about/success/#software-development', '/shell/', '/accounts/signup/', 'http://www.facebook.com/pythonlang?fref=ts', '/community/', 'https://kivy.org/', '/about/quotes/', 'http://www.web2py.com/', '/community/logos/', '/community/diversity/', '/events/calendars/', 'https://wiki.python.org/moin/BeginnersGuide', '/success-stories/', '/doc/essays/', '/dev/core-mentorship/', 'http://ipython.org', '/events/', '//docs.python.org/3/tutorial/controlflow.html', '/about/success/#education', '/blogs/', '/community/irc/', 'http://pycon.blogspot.com/', '//jobs.python.org', 'http://www.pylonsproject.org/', 'http://www.djangoproject.com/', '/downloads/mac-osx/', '/about/success/#business', 'http://feedproxy.google.com/~r/PythonInsider/~3/x_c9D0S-4C4/python-370b1-is-now-available-for.html', 'http://wiki.python.org/moin/TkInter', 'https://docs.python.org/faq/', '//docs.python.org/3/tutorial/controlflow.html#defining-functions'}
|
1
2
|
>>>
r
.
html
.
links
{
'//docs.python.org/3/tutorial/'
,
'/about/apps/'
,
'https://github.com/python/pythondotorg/issues'
,
'/accounts/login/'
,
'/dev/peps/'
,
'/about/legal/'
,
'//docs.python.org/3/tutorial/introduction.html#lists'
,
'/download/alternatives'
,
'http://feedproxy.google.com/~r/PythonInsider/~3/kihd2DW98YY/python-370a4-is-available-for-testing.html'
,
'/download/other/'
,
'/downloads/windows/'
,
'https://mail.python.org/mailman/listinfo/python-dev'
,
'/doc/av'
,
'https://devguide.python.org/'
,
'/about/success/#engineering'
,
'https://wiki.python.org/moin/PythonEventsCalendar#Submitting_an_Event'
,
'https://www.openstack.org'
,
'/about/gettingstarted/'
,
'http://feedproxy.google.com/~r/PythonInsider/~3/AMoBel8b8Mc/python-3.html'
,
'/success-stories/industrial-light-magic-runs-python/'
,
'http://docs.python.org/3/tutorial/introduction.html#using-python-as-a-calculator'
,
'/'
,
'http://pyfound.blogspot.com/'
,
'/events/python-events/past/'
,
'/downloads/release/python-2714/'
,
'https://wiki.python.org/moin/PythonBooks'
,
'http://plus.google.com/+Python'
,
'https://wiki.python.org/moin/'
,
'https://status.python.org/'
,
'/community/workshops/'
,
'/community/lists/'
,
'http://buildbot.net/'
,
'/community/awards'
,
'http://twitter.com/ThePSF'
,
'https://docs.python.org/3/license.html'
,
'/psf/donations/'
,
'http://wiki.python.org/moin/Languages'
,
'/dev/'
,
'/events/python-user-group/'
,
'https://wiki.qt.io/PySide'
,
'/community/sigs/'
,
'https://wiki.gnome.org/Projects/PyGObject'
,
'http://www.ansible.com'
,
'http://www.saltstack.com'
,
'http://planetpython.org/'
,
'/events/python-events'
,
'/about/help/'
,
'/events/python-user-group/past/'
,
'/about/success/'
,
'/psf-landing/'
,
'/about/apps'
,
'/about/'
,
'http://www.wxpython.org/'
,
'/events/python-user-group/665/'
,
'https://www.python.org/psf/codeofconduct/'
,
'/dev/peps/peps.rss'
,
'/downloads/source/'
,
'/psf/sponsorship/sponsors/'
,
'http://bottlepy.org'
,
'http://roundup.sourceforge.net/'
,
'http://pandas.pydata.org/'
,
'http://brochure.getpython.info/'
,
'https://bugs.python.org/'
,
'/community/merchandise/'
,
'http://tornadoweb.org'
,
'/events/python-user-group/650/'
,
'http://flask.pocoo.org/'
,
'/downloads/release/python-364/'
,
'/events/python-user-group/660/'
,
'/events/python-user-group/638/'
,
'/psf/'
,
'/doc/'
,
'http://blog.python.org'
,
'/events/python-events/604/'
,
'/about/success/#government'
,
'http://python.org/dev/peps/'
,
'https://docs.python.org'
,
'http://feedproxy.google.com/~r/PythonInsider/~3/zVC80sq9s00/python-364-is-now-available.html'
,
'/users/membership/'
,
'/about/success/#arts'
,
'https://wiki.python.org/moin/Python2orPython3'
,
'/downloads/'
,
'/jobs/'
,
'http://trac.edgewall.org/'
,
'http://feedproxy.google.com/~r/PythonInsider/~3/wh73_1A-N7Q/python-355rc1-and-python-348rc1-are-now.html'
,
'/privacy/'
,
'https://pypi.python.org/'
,
'http://www.riverbankcomputing.co.uk/software/pyqt/intro'
,
'http://www.scipy.org'
,
'/community/forums/'
,
'/about/success/#scientific'
,
'/about/success/#software-development'
,
'/shell/'
,
'/accounts/signup/'
,
'http://www.facebook.com/pythonlang?fref=ts'
,
'/community/'
,
'https://kivy.org/'
,
'/about/quotes/'
,
'http://www.web2py.com/'
,
'/community/logos/'
,
'/community/diversity/'
,
'/events/calendars/'
,
'https://wiki.python.org/moin/BeginnersGuide'
,
'/success-stories/'
,
'/doc/essays/'
,
'/dev/core-mentorship/'
,
'http://ipython.org'
,
'/events/'
,
'//docs.python.org/3/tutorial/controlflow.html'
,
'/about/success/#education'
,
'/blogs/'
,
'/community/irc/'
,
'http://pycon.blogspot.com/'
,
'//jobs.python.org'
,
'http://www.pylonsproject.org/'
,
'http://www.djangoproject.com/'
,
'/downloads/mac-osx/'
,
'/about/success/#business'
,
'http://feedproxy.google.com/~r/PythonInsider/~3/x_c9D0S-4C4/python-370b1-is-now-available-for.html'
,
'http://wiki.python.org/moin/TkInter'
,
'https://docs.python.org/faq/'
,
'//docs.python.org/3/tutorial/controlflow.html#defining-functions'
}
|
获取整个页面的绝对路径
>>> r.html.absolute_links {'https://github.com/python/pythondotorg/issues', 'https://docs.python.org/3/tutorial/', 'https://www.python.org/about/success/', 'http://feedproxy.google.com/~r/PythonInsider/~3/kihd2DW98YY/python-370a4-is-available-for-testing.html', 'https://www.python.org/dev/peps/', 'https://mail.python.org/mailman/listinfo/python-dev', 'https://www.python.org/doc/', 'https://www.python.org/', 'https://www.python.org/about/', 'https://www.python.org/events/python-events/past/', 'https://devguide.python.org/', 'https://wiki.python.org/moin/PythonEventsCalendar#Submitting_an_Event', 'https://www.openstack.org', 'http://feedproxy.google.com/~r/PythonInsider/~3/AMoBel8b8Mc/python-3.html', 'https://docs.python.org/3/tutorial/introduction.html#lists', 'http://docs.python.org/3/tutorial/introduction.html#using-python-as-a-calculator', 'http://pyfound.blogspot.com/', 'https://wiki.python.org/moin/PythonBooks', 'http://plus.google.com/+Python', 'https://wiki.python.org/moin/', 'https://www.python.org/events/python-events', 'https://status.python.org/', 'https://www.python.org/about/apps', 'https://www.python.org/downloads/release/python-2714/', 'https://www.python.org/psf/donations/', 'http://buildbot.net/', 'http://twitter.com/ThePSF', 'https://docs.python.org/3/license.html', 'http://wiki.python.org/moin/Languages', 'https://docs.python.org/faq/', 'https://jobs.python.org', 'https://www.python.org/about/success/#software-development', 'https://www.python.org/about/success/#education', 'https://www.python.org/community/logos/', 'https://www.python.org/doc/av', 'https://wiki.qt.io/PySide', 'https://www.python.org/events/python-user-group/660/', 'https://wiki.gnome.org/Projects/PyGObject', 'http://www.ansible.com', 'http://www.saltstack.com', 'https://www.python.org/dev/peps/peps.rss', 'http://planetpython.org/', 'https://www.python.org/events/python-user-group/past/', 'https://docs.python.org/3/tutorial/controlflow.html#defining-functions', 'https://www.python.org/community/diversity/', 'https://docs.python.org/3/tutorial/controlflow.html', 'https://www.python.org/community/awards', 'https://www.python.org/events/python-user-group/638/', 'https://www.python.org/about/legal/', 'https://www.python.org/dev/', 'https://www.python.org/download/alternatives', 'https://www.python.org/downloads/', 'https://www.python.org/community/lists/', 'http://www.wxpython.org/', 'https://www.python.org/about/success/#government', 'https://www.python.org/psf/', 'https://www.python.org/psf/codeofconduct/', 'http://bottlepy.org', 'http://roundup.sourceforge.net/', 'http://pandas.pydata.org/', 'http://brochure.getpython.info/', 'https://www.python.org/downloads/source/', 'https://bugs.python.org/', 'https://www.python.org/downloads/mac-osx/', 'https://www.python.org/about/help/', 'http://tornadoweb.org', 'http://flask.pocoo.org/', 'https://www.python.org/users/membership/', 'http://blog.python.org', 'https://www.python.org/privacy/', 'https://www.python.org/about/gettingstarted/', 'http://python.org/dev/peps/', 'https://www.python.org/about/apps/', 'https://docs.python.org', 'https://www.python.org/success-stories/', 'https://www.python.org/community/forums/', 'http://feedproxy.google.com/~r/PythonInsider/~3/zVC80sq9s00/python-364-is-now-available.html', 'https://www.python.org/community/merchandise/', 'https://www.python.org/about/success/#arts', 'https://wiki.python.org/moin/Python2orPython3', 'http://trac.edgewall.org/', 'http://feedproxy.google.com/~r/PythonInsider/~3/wh73_1A-N7Q/python-355rc1-and-python-348rc1-are-now.html', 'https://pypi.python.org/', 'https://www.python.org/events/python-user-group/650/', 'http://www.riverbankcomputing.co.uk/software/pyqt/intro', 'https://www.python.org/about/quotes/', 'https://www.python.org/downloads/windows/', 'https://www.python.org/events/calendars/', 'http://www.scipy.org', 'https://www.python.org/community/workshops/', 'https://www.python.org/blogs/', 'https://www.python.org/accounts/signup/', 'https://www.python.org/events/', 'https://kivy.org/', 'http://www.facebook.com/pythonlang?fref=ts', 'http://www.web2py.com/', 'https://www.python.org/psf/sponsorship/sponsors/', 'https://www.python.org/community/', 'https://www.python.org/download/other/', 'https://www.python.org/psf-landing/', 'https://www.python.org/events/python-user-group/665/', 'https://wiki.python.org/moin/BeginnersGuide', 'https://www.python.org/accounts/login/', 'https://www.python.org/downloads/release/python-364/', 'https://www.python.org/dev/core-mentorship/', 'https://www.python.org/about/success/#business', 'https://www.python.org/community/sigs/', 'https://www.python.org/events/python-user-group/', 'http://ipython.org', 'https://www.python.org/shell/', 'https://www.python.org/community/irc/', 'https://www.python.org/about/success/#engineering', 'http://www.pylonsproject.org/', 'http://pycon.blogspot.com/', 'https://www.python.org/about/success/#scientific', 'https://www.python.org/doc/essays/', 'http://www.djangoproject.com/', 'https://www.python.org/success-stories/industrial-light-magic-runs-python/', 'http://feedproxy.google.com/~r/PythonInsider/~3/x_c9D0S-4C4/python-370b1-is-now-available-for.html', 'http://wiki.python.org/moin/TkInter', 'https://www.python.org/jobs/', 'https://www.python.org/events/python-events/604/'}
|
1
2
|
>>>
r
.
html
.
absolute
_links
{
'https://github.com/python/pythondotorg/issues'
,
'https://docs.python.org/3/tutorial/'
,
'https://www.python.org/about/success/'
,
'http://feedproxy.google.com/~r/PythonInsider/~3/kihd2DW98YY/python-370a4-is-available-for-testing.html'
,
'https://www.python.org/dev/peps/'
,
'https://mail.python.org/mailman/listinfo/python-dev'
,
'https://www.python.org/doc/'
,
'https://www.python.org/'
,
'https://www.python.org/about/'
,
'https://www.python.org/events/python-events/past/'
,
'https://devguide.python.org/'
,
'https://wiki.python.org/moin/PythonEventsCalendar#Submitting_an_Event'
,
'https://www.openstack.org'
,
'http://feedproxy.google.com/~r/PythonInsider/~3/AMoBel8b8Mc/python-3.html'
,
'https://docs.python.org/3/tutorial/introduction.html#lists'
,
'http://docs.python.org/3/tutorial/introduction.html#using-python-as-a-calculator'
,
'http://pyfound.blogspot.com/'
,
'https://wiki.python.org/moin/PythonBooks'
,
'http://plus.google.com/+Python'
,
'https://wiki.python.org/moin/'
,
'https://www.python.org/events/python-events'
,
'https://status.python.org/'
,
'https://www.python.org/about/apps'
,
'https://www.python.org/downloads/release/python-2714/'
,
'https://www.python.org/psf/donations/'
,
'http://buildbot.net/'
,
'http://twitter.com/ThePSF'
,
'https://docs.python.org/3/license.html'
,
'http://wiki.python.org/moin/Languages'
,
'https://docs.python.org/faq/'
,
'https://jobs.python.org'
,
'https://www.python.org/about/success/#software-development'
,
'https://www.python.org/about/success/#education'
,
'https://www.python.org/community/logos/'
,
'https://www.python.org/doc/av'
,
'https://wiki.qt.io/PySide'
,
'https://www.python.org/events/python-user-group/660/'
,
'https://wiki.gnome.org/Projects/PyGObject'
,
'http://www.ansible.com'
,
'http://www.saltstack.com'
,
'https://www.python.org/dev/peps/peps.rss'
,
'http://planetpython.org/'
,
'https://www.python.org/events/python-user-group/past/'
,
'https://docs.python.org/3/tutorial/controlflow.html#defining-functions'
,
'https://www.python.org/community/diversity/'
,
'https://docs.python.org/3/tutorial/controlflow.html'
,
'https://www.python.org/community/awards'
,
'https://www.python.org/events/python-user-group/638/'
,
'https://www.python.org/about/legal/'
,
'https://www.python.org/dev/'
,
'https://www.python.org/download/alternatives'
,
'https://www.python.org/downloads/'
,
'https://www.python.org/community/lists/'
,
'http://www.wxpython.org/'
,
'https://www.python.org/about/success/#government'
,
'https://www.python.org/psf/'
,
'https://www.python.org/psf/codeofconduct/'
,
'http://bottlepy.org'
,
'http://roundup.sourceforge.net/'
,
'http://pandas.pydata.org/'
,
'http://brochure.getpython.info/'
,
'https://www.python.org/downloads/source/'
,
'https://bugs.python.org/'
,
'https://www.python.org/downloads/mac-osx/'
,
'https://www.python.org/about/help/'
,
'http://tornadoweb.org'
,
'http://flask.pocoo.org/'
,
'https://www.python.org/users/membership/'
,
'http://blog.python.org'
,
'https://www.python.org/privacy/'
,
'https://www.python.org/about/gettingstarted/'
,
'http://python.org/dev/peps/'
,
'https://www.python.org/about/apps/'
,
'https://docs.python.org'
,
'https://www.python.org/success-stories/'
,
'https://www.python.org/community/forums/'
,
'http://feedproxy.google.com/~r/PythonInsider/~3/zVC80sq9s00/python-364-is-now-available.html'
,
'https://www.python.org/community/merchandise/'
,
'https://www.python.org/about/success/#arts'
,
'https://wiki.python.org/moin/Python2orPython3'
,
'http://trac.edgewall.org/'
,
'http://feedproxy.google.com/~r/PythonInsider/~3/wh73_1A-N7Q/python-355rc1-and-python-348rc1-are-now.html'
,
'https://pypi.python.org/'
,
'https://www.python.org/events/python-user-group/650/'
,
'http://www.riverbankcomputing.co.uk/software/pyqt/intro'
,
'https://www.python.org/about/quotes/'
,
'https://www.python.org/downloads/windows/'
,
'https://www.python.org/events/calendars/'
,
'http://www.scipy.org'
,
'https://www.python.org/community/workshops/'
,
'https://www.python.org/blogs/'
,
'https://www.python.org/accounts/signup/'
,
'https://www.python.org/events/'
,
'https://kivy.org/'
,
'http://www.facebook.com/pythonlang?fref=ts'
,
'http://www.web2py.com/'
,
'https://www.python.org/psf/sponsorship/sponsors/'
,
'https://www.python.org/community/'
,
'https://www.python.org/download/other/'
,
'https://www.python.org/psf-landing/'
,
'https://www.python.org/events/python-user-group/665/'
,
'https://wiki.python.org/moin/BeginnersGuide'
,
'https://www.python.org/accounts/login/'
,
'https://www.python.org/downloads/release/python-364/'
,
'https://www.python.org/dev/core-mentorship/'
,
'https://www.python.org/about/success/#business'
,
'https://www.python.org/community/sigs/'
,
'https://www.python.org/events/python-user-group/'
,
'http://ipython.org'
,
'https://www.python.org/shell/'
,
'https://www.python.org/community/irc/'
,
'https://www.python.org/about/success/#engineering'
,
'http://www.pylonsproject.org/'
,
'http://pycon.blogspot.com/'
,
'https://www.python.org/about/success/#scientific'
,
'https://www.python.org/doc/essays/'
,
'http://www.djangoproject.com/'
,
'https://www.python.org/success-stories/industrial-light-magic-runs-python/'
,
'http://feedproxy.google.com/~r/PythonInsider/~3/x_c9D0S-4C4/python-370b1-is-now-available-for.html'
,
'http://wiki.python.org/moin/TkInter'
,
'https://www.python.org/jobs/'
,
'https://www.python.org/events/python-events/604/'
}
|
根据css选择器获取元素
>>> about = r.html.find('#about', first=True)
|
1
|
>>>
about
=
r
.
html
.
find
(
'#about'
,
first
=
True
)
|
获取文本
>>> print(about.text) About Applications Quotes Getting Started Help Python Brochure
|
1
2
3
4
5
6
7
|
>>>
print
(
about
.
text
)
About
Applications
Quotes
Getting
Started
Help
Python
Brochure
|
获取元素的属性
>>> about.attrs {'id': 'about', 'class': ('tier-1', 'element-1'), 'aria-haspopup': 'true'}
|
1
2
|
>>>
about
.
attrs
{
'id'
:
'about'
,
'class'
:
(
'tier-1'
,
'element-1'
)
,
'aria-haspopup'
:
'true'
}
|
获取元素的html
>>> about.html '<li aria-haspopup="true" class="tier-1 element-1 " id="about">\n<a class="" href="/about/" title="">About</a>\n<ul aria-hidden="true" class="subnav menu" role="menu">\n<li class="tier-2 element-1" role="treeitem"><a href="/about/apps/" title="">Applications</a></li>\n<li class="tier-2 element-2" role="treeitem"><a href="/about/quotes/" title="">Quotes</a></li>\n<li class="tier-2 element-3" role="treeitem"><a href="/about/gettingstarted/" title="">Getting Started</a></li>\n<li class="tier-2 element-4" role="treeitem"><a href="/about/help/" title="">Help</a></li>\n<li class="tier-2 element-5" role="treeitem"><a href="http://brochure.getpython.info/" title="">Python Brochure</a></li>\n</ul>\n</li>'
|
1
2
|
>>>
about
.
html
'<li aria-haspopup="true" class="tier-1 element-1 " id="about">\n<a class="" href="/about/" title="">About</a>\n<ul aria-hidden="true" class="subnav menu" role="menu">\n<li class="tier-2 element-1" role="treeitem"><a href="/about/apps/" title="">Applications</a></li>\n<li class="tier-2 element-2" role="treeitem"><a href="/about/quotes/" title="">Quotes</a></li>\n<li class="tier-2 element-3" role="treeitem"><a href="/about/gettingstarted/" title="">Getting Started</a></li>\n<li class="tier-2 element-4" role="treeitem"><a href="/about/help/" title="">Help</a></li>\n<li class="tier-2 element-5" role="treeitem"><a href="http://brochure.getpython.info/" title="">Python Brochure</a></li>\n</ul>\n</li>'
|
还有好的方法 大家自己去看吧
关键是支持 JavaScript,利用 Chromium 进行渲染
>>> r = session.get('http://python-requests.org/') >>> r.html.render() >>> r.html.search('Python 2 will retire in only {months} months!')['months'] '<time>25</time>'
|
1
2
3
4
5
6
|
>>>
r
=
session
.
get
(
'http://python-requests.org/'
)
>>>
r
.
html
.
render
(
)
>>>
r
.
html
.
search
(
'Python 2 will retire in only {months} months!'
)
[
'months'
]
'<time>25</time>'
|
支持翻页功能:
>>> r = session.get('https://reddit.com') >>> for html in r.html: ... print(html) <HTML url='https://www.reddit.com/'> <HTML url='https://www.reddit.com/?count=25&after=t3_81puu5'> <HTML url='https://www.reddit.com/?count=50&after=t3_81nevg'> <HTML url='https://www.reddit.com/?count=75&after=t3_81lqtp'> <HTML url='https://www.reddit.com/?count=100&after=t3_81k1c8'> <HTML url='https://www.reddit.com/?count=125&after=t3_81p438'> <HTML url='https://www.reddit.com/?count=150&after=t3_81nrcd'> … >>> r = session.get('https://reddit.com') >>> r.html.next() 'https://www.reddit.com/?count=25&after=t3_81pm82'
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
>>>
r
=
session
.
get
(
'https://reddit.com'
)
>>>
for
html
in
r
.
html
:
.
.
.
print
(
html
)
<
HTML
url
=
'https://www.reddit.com/'
>
<
HTML
url
=
'https://www.reddit.com/?count=25&after=t3_81puu5'
>
<
HTML
url
=
'https://www.reddit.com/?count=50&after=t3_81nevg'
>
<
HTML
url
=
'https://www.reddit.com/?count=75&after=t3_81lqtp'
>
<
HTML
url
=
'https://www.reddit.com/?count=100&after=t3_81k1c8'
>
<
HTML
url
=
'https://www.reddit.com/?count=125&after=t3_81p438'
>
<
HTML
url
=
'https://www.reddit.com/?count=150&after=t3_81nrcd'
>
…
>>>
r
=
session
.
get
(
'https://reddit.com'
)
>>>
r
.
html
.
next
(
)
'https://www.reddit.com/?count=25&after=t3_81pm82'
|
9699

被折叠的 条评论
为什么被折叠?



