Python---Web Crawling---Requests---Auto-completing URLs with urljoin

agsddd

License: CC 4.0 BY-SA

Category: Crawler Development

Article link: https://blog.youkuaiyun.com/weixin_41245276/article/details/87567139

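This article shows how to use the urljoin function from Python's urllib.parse module to resolve and combine URLs. The examples below cover relative paths, absolute paths (those beginning with /), and references to different directory levels, which makes them a handy reference for web crawlers and page navigation.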

>>> from urllib.parse import urljoin
>>> urljoin("http://www.chachabei.com/folder/currentpage.html", "anotherpage.html")
'http://www.chachabei.com/folder/anotherpage.html'
>>> urljoin("http://www.chachabei.com/folder/currentpage.html", "/anotherpage.html")
'http://www.chachabei.com/anotherpage.html'
>>> urljoin("http://www.chachabei.com/folder/currentpage.html", "folder2/anotherpage.html")
'http://www.chachabei.com/folder/folder2/anotherpage.html'
>>> urljoin("http://www.chachabei.com/folder/currentpage.html", "/folder2/anotherpage.html")
'http://www.chachabei.com/folder2/anotherpage.html'
>>> urljoin("http://www.chachabei.com/abc/folder/currentpage.html", "/folder2/anotherpage.html")
'http://www.chachabei.com/folder2/anotherpage.html'
>>> urljoin("http://www.chachabei.com/abc/folder/currentpage.html", "../anotherpage.html")
'http://www.chachabei.com/abc/anotherpage.html'
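
In a crawler, the same rules are typically applied to every href found on a page. The following is a minimal sketch of that idea, not part of the original post: the helper name resolve_links and the sample href list are hypothetical, and filtering on the http/https scheme is an assumption about what a crawler would want to keep.

```python
from urllib.parse import urljoin, urlparse


def resolve_links(page_url, hrefs):
    """Turn href values found on page_url into absolute http(s) URLs.

    resolve_links is a hypothetical helper written for illustration only.
    """
    resolved = []
    for href in hrefs:
        absolute = urljoin(page_url, href.strip())
        # Assumption: a crawler only follows web links, so skip
        # schemes such as mailto: or javascript:.
        if urlparse(absolute).scheme in ("http", "https"):
            resolved.append(absolute)
    return resolved


page_url = "http://www.chachabei.com/folder/currentpage.html"
hrefs = [
    "anotherpage.html",                   # relative to the current folder
    "/folder2/anotherpage.html",          # relative to the site root
    "../anotherpage.html",                # one directory up
    "https://example.com/external.html",  # already absolute, kept as-is
    "mailto:someone@example.com",         # not a web link, filtered out
]
links = resolve_links(page_url, hrefs)
for link in links:
    print(link)

# A base URL without a trailing slash treats its last segment as a file
# name, so that segment is replaced rather than extended.
no_slash = urljoin("http://www.chachabei.com/folder", "anotherpage.html")
with_slash = urljoin("http://www.chachabei.com/folder/", "anotherpage.html")
print(no_slash)
print(with_slash)
```

The last two calls highlight a common pitfall: when the base URL names a directory, it should end with a slash, otherwise urljoin drops the final path segment before appending the relative part.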

Original article: https://blog.youkuaiyun.com/mycms5/article/details/76902041