I have collected some material on Hadoop and organized it below:
Quoted from: http://hadoop.apache.org/
The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing, including:
1. Hadoop Core, our flagship sub-project, provides a distributed filesystem (HDFS) and support for the MapReduce distributed computing metaphor. HBase builds on Hadoop Core to provide a scalable, distributed database.
2. Pig is a high-level data-flow language and execution framework for parallel computation. It is built on top of Hadoop Core.
3.
English:
2. http://en.wikipedia.org/wiki/Hadoop
3. http://radar.oreilly.com/archives/2007/08/yahoos-bet-on-hadoop.html
4. http://code.google.com/edu/submissions/uwspr2007_clustercourse/listing.html
5. Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, Andrew Tomkins. “Pig Latin: A Not-So-Foreign Language for Data Processing”. SIGMOD'08
6. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. “Bigtable: A Distributed Storage System for Structured Data”. OSDI'06
7. The Hadoop Distributed File System: Architecture and Design. Dhruba Borthakur
8. Cluster Computing for Web-Scale Data Processing. Aaron Kimball, Sierra Michels-Slettvet. SIGCSE'08
9. Exploring Large-Data Issues in the Curriculum: A Case Study with MapReduce. Jimmy Lin
Chinese:
Cluster configuration and usage tips for Hadoop
http://hi.baidu.com/shirdrn/blog/item/ddcf7e0319a47a8ed53f7c7e.html
The Hadoop Distributed File System: Architecture and Design (Chinese translation)
http://www.javaeye.com/topic/200508
http://blog.youkuaiyun.com/daidodo/archive/2008/02/24/2116761.aspx