java公共域,获取URL的二级域(java)

博客探讨在Java中提取URL二级域名(SLD)的方法,提出是否有解析器、库或算法、正则表达式可用。给出示例代码,后指出可寻找公共后缀,其下一级域名即为所求,还介绍了Apache Http Component相关处理类。

I am wondering if there is a parser or library in java for extracting the second level domain (SLD) in an URL - or failing that an algo or regex for doing the same. For example:

URI uri = new URI("http://www.mydomain.ltd.uk/blah/some/page.html");

String host = uri.getHost();

System.out.println(host);

which prints:

mydomain.ltd.uk

Now what I'd like to do is robustly identify the SLD ("ltd.uk") component. Any ideas?

Edit: I'm ideally looking for a general solution, so I'd match ".uk" in "police.uk", ".co.uk" in "bbc.co.uk" and ".com" in "amazon.com".

Thanks

解决方案

Don't know your purpose but Second-Level Domain may not mean much to you. You probably need to find public suffix and the domain right below it is what you are looking for.

Apache Http Component (HttpClient 4) comes with classes to handle this,

org.apache.http.impl.cookie.PublicSuffixFilter

org.apache.http.impl.cookie.PublicSuffixListParser

You need to download the public suffix list from here,

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值