Java public suffix: getting the second-level domain of a URL (Java)

Reposted · Latest recommended article published 2025-10-10 12:00:47 · 278 views

Tags:

#java public domain

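This post looks at how to extract the second-level domain (SLD) of a URL in Java, asking whether a parser, library, algorithm, or regular expression exists for the job. It gives sample code, then points out that you can look up the public suffix, with the domain one level below it being the one you want, and introduces the relevant Apache HttpComponents classes.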

I am wondering whether there is a parser or library in Java for extracting the second-level domain (SLD) from a URL or, failing that, an algorithm or regex for doing the same. For example:

import java.net.URI;

// Note: new URI(...) throws the checked URISyntaxException.
URI uri = new URI("http://www.mydomain.ltd.uk/blah/some/page.html");
String host = uri.getHost();
System.out.println(host);

which prints:

www.mydomain.ltd.uk

Now what I'd like to do is robustly identify the SLD ("ltd.uk") component. Any ideas?

Edit: I'm ideally looking for a general solution, so I'd match ".uk" in "police.uk", ".co.uk" in "bbc.co.uk" and ".com" in "amazon.com".

Thanks

Solution

I don't know your purpose, but the second-level domain as such may not be what you need. You probably want to find the public suffix; the domain directly below it is the one you are looking for.
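To make the idea concrete, here is a minimal sketch of that lookup in plain Java. It assumes a tiny, hand-picked suffix set (`SAMPLE_SUFFIXES`) in place of the real public suffix list, and the class and method names (`PublicSuffixSketch`, `publicSuffix`, `registrableDomain`) are hypothetical. It does not handle the wildcard or exception rules found in the real list, so treat it as an illustration of the approach rather than a complete implementation.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class PublicSuffixSketch {
    // Hypothetical hand-picked sample; a real program would load the full public suffix list.
    static final Set<String> SAMPLE_SUFFIXES =
            new HashSet<>(Arrays.asList("com", "uk", "co.uk", "ltd.uk"));

    /** Returns the longest sample suffix the host ends with, or null if none matches. */
    static String publicSuffix(String host) {
        String candidate = host.toLowerCase();
        while (true) {
            if (SAMPLE_SUFFIXES.contains(candidate)) {
                return candidate;
            }
            // Drop the leftmost label and try again, so longer suffixes are tested first.
            int dot = candidate.indexOf('.');
            if (dot < 0) {
                return null;
            }
            candidate = candidate.substring(dot + 1);
        }
    }

    /** Returns the public suffix plus the single label directly below it, or null. */
    static String registrableDomain(String host) {
        String suffix = publicSuffix(host);
        if (suffix == null || suffix.length() == host.length()) {
            return null; // no known suffix, or the host is itself a suffix
        }
        // Everything to the left of ".<suffix>"
        String prefix = host.substring(0, host.length() - suffix.length() - 1);
        String label = prefix.substring(prefix.lastIndexOf('.') + 1);
        return (label + "." + suffix).toLowerCase();
    }

    public static void main(String[] args) throws Exception {
        String host = new URI("http://www.mydomain.ltd.uk/blah/some/page.html").getHost();
        System.out.println(publicSuffix(host));      // suffix found in the sample set
        System.out.println(registrableDomain(host)); // suffix plus one label
    }
}
```

With this sample set, "bbc.co.uk" resolves to the suffix "co.uk", "police.uk" to "uk", and "amazon.com" to "com", which matches the cases listed in the question. Checking the longest candidate first matters because "uk" alone would otherwise shadow "co.uk" and "ltd.uk".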

Apache HttpComponents (HttpClient 4) comes with classes to handle this:

org.apache.http.impl.cookie.PublicSuffixFilter

org.apache.http.impl.cookie.PublicSuffixListParser

You need to download the public suffix list from here,