XCTF (攻防世界) - T1 Training-WWW-Robots

Latest recommended article published on 2025-10-15 16:03:02

License: CC 4.0 BY-SA

Tags: #web-security #php

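This article explains how a robots.txt file controls crawler access to a website, covering the User-agent and Disallow directives and a practical example in which a specific file, /fl0g.php, is disallowed.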

Contents

Step 1
Step 2
Conclusion

Step 1

Read the text -> extract the useful information -> exploit the useful information

Text: In this little training challenge, you are going to learn about the Robots_exclusion_standard.
The robots.txt file is used by web crawlers to check if they are allowed to crawl and index your website or only parts of it.
Sometimes these files reveal the directory structure instead protecting the content from being crawled.

Enjoy!

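Useful information: robots.txt

Exploit: the useful information is the name of a text file, so try requesting /robots.txt on the challenge site.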
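The request can also be made from a script instead of the browser. The following is a minimal sketch, assuming the challenge is served over plain HTTP; the address `http://challenge.example:8080/index.html` and the helper names `robots_url` and `fetch_text` are hypothetical and not part of the original write-up.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL at the root of the site serving page_url."""
    # The leading slash makes urljoin resolve against the site root,
    # not against the directory of the current page.
    return urljoin(page_url, "/robots.txt")


def fetch_text(url: str, timeout: float = 5.0) -> str:
    """Download url and decode the body as text (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Hypothetical address; replace it with the URL of your challenge instance.
    target = "http://challenge.example:8080/index.html"
    print(robots_url(target))
    # print(fetch_text(robots_url(target)))  # uncomment against a real target
```

robots.txt always sits at the site root, which is why the helper discards the page path and query string before appending the file name.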

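Result: the page shows the contents of robots.txt. Its directives read as follows:
User-agent: names the crawler that the following rules apply to
User-agent: * means the rules apply to all crawlers
Disallow: asks those crawlers not to access the given path
Disallow: /fl0g.php means the file /fl0g.php should not be crawled
Disallow: * is meant to block every file (the conventional way to write this is Disallow: /)
These rules are only a request to well-behaved crawlers, not access control, so a listed path can still be requested directly.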
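To make the reading of these directives concrete, here is a minimal sketch of a parser that groups Disallow paths by User-agent. The `SAMPLE` text is an assumption modeled on the directives described above, not a verbatim copy of the challenge file, and `parse_robots` is a hypothetical helper that handles only these two fields.

```python
def parse_robots(text: str) -> dict[str, list[str]]:
    """Map each User-agent value to the paths it is asked not to crawl."""
    rules: dict[str, list[str]] = {}
    current_agents: list[str] = []
    last_was_agent = False
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()  # field names are case-insensitive
        if field == "user-agent":
            if not last_was_agent:
                current_agents = []  # a new group of rules starts here
            current_agents.append(value)
            rules.setdefault(value, [])
            last_was_agent = True
        elif field == "disallow":
            last_was_agent = False
            if value:  # an empty Disallow blocks nothing
                for agent in current_agents:
                    rules[agent].append(value)
        else:
            last_was_agent = False
    return rules


# Assumed sample content, modeled on the directives explained above.
SAMPLE = """\
User-agent: *
Disallow: /fl0g.php
"""

if __name__ == "__main__":
    for agent, paths in parse_robots(SAMPLE).items():
        for path in paths:
            print(f"{agent}: candidate path {path}")
```

Each disallowed path is a candidate worth requesting by hand, since the file only asks crawlers to stay away and, as the challenge text notes, can end up revealing the directory structure.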