block_crawler
# save at /etc/nginx/block_crawler
# then load it with `include /etc/nginx/block_crawler;` inside a `server` block
set $fbd 0;
if ($http_user_agent ~* "yandex|Ahref|MJ12bot|XoviBot|SemrushBot|AhrefsBot|Twitterbot|Claritybot|Crawler|Python") {
set $fbd 1;
}
location ~* \/(plus|data|trust|include|shtml|bbs|rank|rxcq|tager) {
set $fbd 1;
}
location ~ ^/(wp-admin|wp-login\.php) {
set $fbd 1;
}
if ($fbd = 1) {
return 403;
}
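Loaded from a `server` block, the include might look like this (a minimal sketch; the `server_name` and `root` values are placeholders, not from the original post):

```nginx
server {
    listen 80;
    server_name example.com;   # placeholder domain

    # pulls in the $fbd checks and the `return 403` rule defined above
    include /etc/nginx/block_crawler;

    location / {
        root /var/www/html;    # placeholder document root
    }
}
```

Because the file contains `location` blocks, it must be included at `server` level, not inside another `location`.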
This article describes a way to block specific crawlers with an Nginx configuration file: by testing the User-Agent request header and certain URL paths, it returns 403 to crawlers such as Yandex and Ahrefs and to requests for sensitive paths, protecting the site's resources.
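To see which requests the rules above would reject, the same regexes can be replayed outside Nginx. The sketch below mirrors the config's patterns in Python; the `is_blocked` helper is illustrative only and not part of Nginx:

```python
import re

# User-Agent list from the config; nginx `~*` is case-insensitive
UA_PATTERN = re.compile(
    r"yandex|Ahref|MJ12bot|XoviBot|SemrushBot|AhrefsBot|Twitterbot|"
    r"Claritybot|Crawler|Python",
    re.IGNORECASE,
)
# blocked path fragments; also matched case-insensitively (`~*`)
PATH_PATTERN = re.compile(
    r"/(plus|data|trust|include|shtml|bbs|rank|rxcq|tager)",
    re.IGNORECASE,
)
# WordPress admin paths; nginx `~` is case-sensitive, so no IGNORECASE here
WP_PATTERN = re.compile(r"^/(wp-admin|wp-login\.php)")

def is_blocked(user_agent: str, path: str) -> bool:
    """True if the request would receive 403 under the config above."""
    return bool(
        UA_PATTERN.search(user_agent)
        or PATH_PATTERN.search(path)
        or WP_PATTERN.search(path)
    )

print(is_blocked("Mozilla/5.0 (compatible; AhrefsBot/7.0)", "/index.html"))  # True
print(is_blocked("Mozilla/5.0 Chrome", "/wp-login.php"))                     # True
print(is_blocked("Mozilla/5.0 Chrome", "/blog/post-1"))                      # False
```

Note that `Python` in the User-Agent list also blocks tools like python-requests, which may be undesirable if legitimate scripts consume the site.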
