What is Methabot?

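Methabot is an open-source, high-speed web crawler and command-line tool. It supports a variety of customization options, such as scripted file parsing and user-defined filetype filtering. Its features include JavaScript scriptability, multi-threading, and MySQL support.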

The Methabot Project - Index

What is Methabot?

Methabot is an open source web crawler and command line tool optimized for speed. It supports scripted filetype parsing and a wide variety of customization options, and is easily configured to fit anyone's particular needs.

WEBSITE MOVED: This project has moved to a new website: http://metha-sys.org/

Features

Methabot has a rich feature set. Some of its features, but not all, are listed below.

  • It's fast, designed from the ground up with speed optimization in mind.
  • Scriptable through JavaScript with E4X
  • User-defined filetype filtering (according to MIME type, file extension or UMEX expression)
  • Multi-threaded
  • Highly configurable from command line
  • Extensible module system, supporting custom data parsers, filters and protocol handlers.
  • MySQL support through the JavaScript-MySQL binding (lmm_mysql).
  • Simple yet powerful filtering of URLs through UMEX.
  • Automated downloading
  • Support for automatic cookie handling when running over HTTP
  • Robots Exclusion Standard
  • Reliable, fault-tolerant networking, redirect-loop detection and some spider trap detection
  • Parser chaining: share data easily between C and JavaScript parsers
  • Unix-friendly interface, piping in and out data for parsing and crawling
  • HTML to XML/XHTML conversion
  • Portable: tested successfully on 32-bit/64-bit Linux 2.6, 32-bit/64-bit FreeBSD 6.x/7.0 and Mac OS X. It should work on almost any Unix-like OS and has partial support for Windows. Old versions of Methabot have full support for Windows.
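
To make the "Robots Exclusion Standard" item above concrete, here is a minimal sketch of what honoring robots.txt rules looks like in a crawler. It does not show Methabot's own implementation or API. It uses Python's standard-library `urllib.robotparser` instead, and the user agent name `example-crawler`, the `example.com` URLs and the sample rules are all assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download this
# from http://<host>/robots.txt before fetching other pages on the host.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set that can be queried per URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "example-crawler" is an illustrative user agent name, not Methabot's.
allowed = parser.can_fetch("example-crawler", "http://example.com/index.html")
blocked = parser.can_fetch("example-crawler", "http://example.com/private/data.html")

print("index.html allowed:", allowed)
print("private/data.html allowed:", blocked)
```

A crawler would typically run this check once per candidate URL, before the URL is queued for download, and cache the parsed rules per host.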