Plug-in 22: Get Links From URL

This article presents a PHP plug-in that extracts all hyperlinks from the page at a given URL and returns them as an array.

<?php // Plug-in 22: Get Links From URL

// This is an executable example with additional code supplied
// To obtain just the plug-ins please click on the Download link

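$result = PIPHP_GetLinksFromURL("http://pluginphp.com");

// The plug-in returns NULL when the page cannot be fetched
if ($result === NULL) $result = array();

echo "<ul>";
for ($j = 0 ; $j < count($result) ; ++$j)
   echo "<li>" . htmlspecialchars($result[$j]) . "</li>";
echo "</ul>";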

function PIPHP_GetLinksFromURL($page)
{
   // Plug-in 22: Get Links From URL
   //
   // This plug-in accepts the URL of a web page and returns
   // an array of all the links found in it. The argument
   // required is:
   //
   //    $page: The web site's main URL

   $contents = @file_get_contents($page);
   if (!$contents) return NULL;
   
   $urls    = array();
   $dom     = new DOMDocument();
   @$dom    ->loadHTML($contents);
   $xpath   = new DOMXPath($dom);
   // Select only anchors that carry an href attribute
   $hrefs   = $xpath->evaluate("/html/body//a[@href]");

   for ($j = 0 ; $j < $hrefs->length ; $j++)
      $urls[$j] = PIPHP_RelToAbsURL($page,
         $hrefs->item($j)->getAttribute('href'));

   return $urls;
}

// The function below is repeated here to ensure that it's
// available to the main function, which relies on it

function PIPHP_RelToAbsURL($page, $url)
{
   // Plug-in 21: Relative To Absolute URL
   // This plug-in accepts the absolute URL of a web page
   // and a link featured within that page. The link is then
   // turned into an absolute URL which can be independently
   // accessed. Only applies to http:// URLs. Arguments are:
   //    $page: The web page containing the URL
   //    $url:  The URL to convert to absolute

   if (substr($page, 0, 7) != "http://") return $url;
   
   $parse = parse_url($page);
   $root  = $parse['scheme'] . "://" . $parse['host'];
   $p     = strrpos(substr($page, 7), '/');
   
   if ($p) $base = substr($page, 0, $p + 8);
   else $base = "$page/";
   
   if (substr($url, 0, 1) == '/')           $url = $root . $url;
   elseif (substr($url, 0, 7) != "http://") $url = $base . $url;
   
   return $url;
}
?>
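
As its own comment says, PIPHP_RelToAbsURL only applies to http:// pages, so links found on an https:// page are returned unresolved. The following is a minimal sketch of one way to cover both schemes, not the author's method. The function name sketch_rel_to_abs is hypothetical, and the sketch assumes that simple prefix joining is enough: it does not collapse "../" segments or treat "?query" and "#fragment" links specially.

```php
<?php
// Hypothetical helper (not part of the original plug-in): resolve $url
// against $page for both http:// and https:// pages.
function sketch_rel_to_abs(string $page, string $url): string
{
    $parts = parse_url($page);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url;                      // cannot resolve; leave unchanged
    }

    $scheme = strtolower($parts['scheme']);
    if ($scheme !== 'http' && $scheme !== 'https') {
        return $url;                      // only web pages are handled
    }

    // Link already has a scheme (http:, https:, mailto:, ...): keep it.
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $url)) {
        return $url;
    }

    // Protocol-relative link such as //cdn.example.com/lib.js
    if (substr($url, 0, 2) === '//') {
        return $scheme . ':' . $url;
    }

    $root = $scheme . '://' . $parts['host']
          . (isset($parts['port']) ? ':' . $parts['port'] : '');

    // Root-relative link such as /about
    if (substr($url, 0, 1) === '/') {
        return $root . $url;
    }

    // Document-relative link: keep the page path up to its last slash.
    $path  = $parts['path'] ?? '/';
    $slash = strrpos($path, '/');
    $dir   = ($slash === false) ? '/' : substr($path, 0, $slash + 1);

    return $root . $dir . $url;
}

echo sketch_rel_to_abs('https://example.com/docs/index.html', 'page2.html'), "\n";
echo sketch_rel_to_abs('https://example.com/docs/index.html', '/about'), "\n";
```

Using parse_url for the scheme, host and path avoids the fixed 7-character offsets in the original, which are tied to the length of "http://".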

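Plug-in description:

This plug-in accepts the URL of a web page, parses the page looking only for "<a href" anchor tags, and returns every link address it finds as an array. It takes one argument:

$page: The URL of a web page, including the leading "http://" and the domain name.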
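
To see the parsing step in isolation, without a network request, the sketch below runs the same DOMDocument and DOMXPath approach on an HTML string. The function name sketch_links_from_html and the sample markup are made up for illustration. Unlike the plug-in, it returns the href values exactly as written in the page and does not convert them to absolute URLs.

```php
<?php
// Hypothetical helper for illustration: collect the href values of all
// anchor tags in an HTML string.
function sketch_links_from_html(string $html): array
{
    if ($html === '') {
        return array();                   // nothing to parse
    }

    $dom = new DOMDocument();

    // Collect libxml warnings internally instead of printing them;
    // real-world pages are often not well-formed.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $links = array();

    // //a[@href] skips anchors that have no href attribute.
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }

    return $links;
}

$sample = '<html><body><p><a href="/about">About</a> '
        . '<a name="top">no href</a> '
        . '<a href="http://example.com/">Home</a></p></body></html>';

print_r(sketch_links_from_html($sample));
```

The sample contains three anchors, but only the two that have an href attribute end up in the result.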
