Python: Resolving a Relative URL Path to Its Absolute Path

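This article presents a Python function for handling URLs. It converts a relative URL into an absolute URL and handles URL components such as scheme and netloc. The function, urljoin, matches the one in the standard library's urllib.parse module. The full implementation below shows how the base URL and the URL to be joined are each parsed into their parts, how the path segments are then processed according to RFC 3986, and how elements that would produce redundant slashes are filtered out.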
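Before reading the implementation, it may help to see what the parsing step produces. The following is a minimal sketch, assuming the standard library's `urllib.parse.urlparse`; the two URLs are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlparse

# Hypothetical example URLs (not from the original article).
base = "http://example.com/docs/guide/index.html?lang=en#intro"
relative = "../images/logo.png"

# urlparse returns six components:
# (scheme, netloc, path, params, query, fragment)
b = urlparse(base)
r = urlparse(relative)

print(b.scheme, b.netloc, b.path, b.query, b.fragment)
# A relative reference has no scheme and no netloc of its own;
# only its path is populated here.
print(repr(r.scheme), repr(r.netloc), r.path)
```

The join function fills the empty scheme and netloc of the relative reference from the base, then resolves the two paths against each other.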

# The helpers used below live in the standard library's urllib.parse module.
# Note that _coerce_args is a private name there.
from urllib.parse import (_coerce_args, urlparse, urlunparse,
                          uses_netloc, uses_relative)


def urljoin(base, url, allow_fragments=True):
    """Join a base URL and a possibly relative URL to form an absolute
    interpretation of the latter."""
    if not base:
        return url
    if not url:
        return base

    base, url, _coerce_result = _coerce_args(base, url)
    bscheme, bnetloc, bpath, bparams, bquery, bfragment = \
        urlparse(base, '', allow_fragments)
    scheme, netloc, path, params, query, fragment = \
        urlparse(url, bscheme, allow_fragments)

    if scheme != bscheme or scheme not in uses_relative:
        return _coerce_result(url)
    if scheme in uses_netloc:
        if netloc:
            return _coerce_result(urlunparse((scheme, netloc, path,
                                              params, query, fragment)))
        netloc = bnetloc

    if not path and not params:
        path = bpath
        params = bparams
        if not query:
            query = bquery
        return _coerce_result(urlunparse((scheme, netloc, path,
                                          params, query, fragment)))

    base_parts = bpath.split('/')
    if base_parts[-1] != '':
        # the last item is not a directory, so will not be taken into account
        # in resolving the relative path
        del base_parts[-1]

    # for rfc3986, ignore all base path should the first character be root.
    if path[:1] == '/':
        segments = path.split('/')
    else:
        segments = base_parts + path.split('/')
        # filter out elements that would cause redundant slashes on re-joining
        # the resolved_path
        segments[1:-1] = filter(None, segments[1:-1])

    resolved_path = []

    for seg in segments:
        if seg == '..':
            try:
                resolved_path.pop()
            except IndexError:
                # ignore any .. segments that would otherwise cause an IndexError
                # when popped from resolved_path if resolving for rfc3986
                pass
        elif seg == '.':
            continue
        else:
            resolved_path.append(seg)

    if segments[-1] in ('.', '..'):
        # do some post-processing here. if the last segment was a relative dir,
        # then we need to append the trailing '/'
        resolved_path.append('')

    return _coerce_result(urlunparse((scheme, netloc, '/'.join(
        resolved_path) or '/', params, query, fragment)))
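To make the path-handling part easier to follow on its own, here is a rough sketch that isolates the segment logic into a separate helper and compares it with the standard library's `urllib.parse.urljoin`. The name `resolve_path` and the example URLs are assumptions made for illustration; they do not appear in the original code, and the helper covers only the path, not scheme, netloc, query, or fragment.

```python
from urllib.parse import urljoin, urlparse


def resolve_path(base_path, rel_path):
    """Hypothetical helper: resolve rel_path against base_path using the
    same segment steps as the function above (path component only)."""
    base_parts = base_path.split('/')
    if base_parts[-1] != '':
        # The last item is a file name, not a directory: drop it.
        del base_parts[-1]

    if rel_path[:1] == '/':
        # A rooted relative path ignores the base path entirely.
        segments = rel_path.split('/')
    else:
        segments = base_parts + rel_path.split('/')
        # Drop empty inner segments that would yield doubled slashes.
        segments[1:-1] = filter(None, segments[1:-1])

    resolved = []
    for seg in segments:
        if seg == '..':
            if resolved:          # ignore '..' when there is nothing to pop
                resolved.pop()
        elif seg == '.':
            continue
        else:
            resolved.append(seg)

    if segments[-1] in ('.', '..'):
        # A trailing '.' or '..' names a directory: keep the final slash.
        resolved.append('')

    return '/'.join(resolved) or '/'


# Hypothetical base URL used only for the comparison below.
BASE = "http://example.com/a/b/c"

for rel in ("d", "../d", "/d", ".", "x//y"):
    joined = urljoin(BASE, rel)
    print(rel, "->", joined)
    # The helper's result should equal the path of the stdlib result.
    assert resolve_path("/a/b/c", rel) == urlparse(joined).path
```

Cases that skip the segment logic altogether, such as a reference carrying its own netloc (`//other.example/x`) or only a query (`?q=1`), are decided by the earlier branches of the function and are not covered by this helper.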
