Ebot
http://www.redaelli.org/matteo-blog/projects/ebot/
Erlang Bot (Ebot) is an open source web crawler built on Erlang, a NoSQL database (Apache CouchDB or Riak), RabbitMQ, Webmachine (Mochiweb), RRDtool, and other components. Because it uses a NoSQL database instead of a relational one, Ebot can grow easily and cheaply. Ebot is a solid, highly scalable, distributed, and customizable web crawler.

The Ebot crawler project is hosted at http://github.com/matteoredaelli/ebot
Working on the Ebot crawler has improved my knowledge of Erlang, the AMQP protocol (RabbitMQ), and NoSQL databases (Apache CouchDB and Riak), including their distributed map/reduce queries.
Below is an example of a URL document generated by the Ebot crawler (with the Apache CouchDB backend).

Below is a sample image of the statistics generated by the Ebot web crawler using RRDtool.

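In summary, Erlang Bot is an open source web crawler built with Erlang, a NoSQL database (such as Apache CouchDB or Riak), RabbitMQ, Webmachine (Mochiweb), and RRDtool, and it is easy to scale and deploy. The project is hosted on GitHub, and working on it deepened the author's knowledge of distributed map/reduce queries. With the Apache CouchDB backend, Ebot generates URL documents, and it uses RRDtool to produce statistics.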






