php查找txt文本里的关键词,利用php抓取批量关键词百度推广广告中网址保存在txt文件中...

本文介绍了一种使用PHP脚本分析特定关键词在搜索引擎中广告竞争程度的方法。通过抓取百度搜索结果页面并解析其中的广告数据,该脚本能够评估关键词的广告热度。此过程涉及到了网络爬虫技术的应用及正则表达式的使用。

结合服务器的定时任务可以定时查找关键词广告的竞争程度

php代码<?php

$fp = @fopen("semallurl.txt", "a+");

$kws1="上海酒店,北京酒店,广州酒店,天津酒店,广州酒店";

$kws=explode(",",$kws1);

foreach ($kws as $kw){

$keywords=$kw;

$enkeywords=urlencode($keywords);

$pageURL="http://www.baidu.com/s?word=$enkeywords";

$contents=fetch($pageURL); /*抓取页面*/

$contents=preg_replace ('/

$contents_left="";

$contents_right="";

$ads_left_green="";

$ads_left_white="";

$contentsbytwoside="";

$ads_right="";/*变量初始化*/

$contentsbytwoside=explode('

$contents_left=$contentsbytwoside[2];

$contents_left='

preg_match_all('/(

preg_match_all('(

]*class=\"EC_im EC_fr EC_PP EC_idea1017 \">.*?)',$contents_right,$ads_right);

echo "------------Keywords ads for".$kw."start ------------------------------------
" ;

fwrite($fp, "----------".$kw . " ads start------------------------- \r\n");

echo "left ads with green background is
";

/*print_r($ads_left_green[0]);*/

foreach ($ads_left_green[0] as $tg1)

{

preg_match('/.*?<\/span>/' , $tg1,$tg11);

fwrite($fp,strip_tags($tg11[0]) . "\r\n");

echo $tg11[0]."
";

};

echo "

-------------
" ;

echo "left ads with white background is
";

/*print_r($ads_left_white[0]);*/

foreach ($ads_left_white[0] as $tg2)

{

preg_match('/.*?<\/span>/' , $tg2,$tg22);

fwrite($fp,strip_tags($tg22[0]) . "\r\n");

echo $tg22[0]."
";

};

echo "

-------------
" ;

echo "right ads with is
";

/*print_r($ads_right[0]);*/

foreach ($ads_right[0] as $tg3)

{

preg_match('/(.*?<\/font>)/' , $tg3,$tg33);

fwrite($fp,strip_tags($tg33[0]) . "\r\n");

echo $tg33[0]."
";

};

echo "---------------Keywords ads for".$kw."END ------------------------------------
" ;

fwrite($fp, "----------".$kw . " ads End------------------------- \r\n");

};

fwrite($fp, date("Y-m-d H:i:s") . " PHP代码自动运行!\r\n");

fclose($fp);

function fetch($Date){

$ch = curl_init();

$timeout = 5;

curl_setopt ($ch, CURLOPT_URL, "$Date");

curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);

curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)");

curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);

$contents = curl_exec($ch);

curl_close($ch);

return $contents;

}

?>

article_wechat2021.jpg?1111

本文原创发布php中文网,转载请注明出处,感谢您的尊重!

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值