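I remember using it as far back as version 0.1, when it was still plain Apache Hadoop; now it ships as its own enhanced distribution, which is really nice.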
HDFS – Self-healing distributed file system
MapReduce – Powerful, parallel data processing framework
Hadoop Common – a set of utilities that support the Hadoop subprojects
HBase – Hadoop database for random read/write access
Hive – SQL-like queries and tables on large datasets
Pig – Dataflow language and compiler
Oozie – Workflow for interdependent Hadoop jobs
Sqoop – Integrate databases and data warehouses with Hadoop
Flume – Highly reliable, configurable streaming data collection
ZooKeeper – Coordination service for distributed applications
Hue – User interface framework and SDK for visual Hadoop applications
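
To make the MapReduce entry above a bit more concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps applied to word counting. It is plain Python and does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and a real job would run these steps in parallel over data stored in HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, roughly what happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in order, in one process (illustration only)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Hadoop HDFS", "hadoop Hive"]))
```

The point of splitting the work this way is that each `map_phase` call and each `reduce_phase` call is independent of the others, which is what lets a framework spread them across many machines.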
Download: http://www.cloudera.com/downloads/
Introduction to Hadoop: http://www.sfbayacm.org/wp/wp-content/uploads/2010/01/amr-hadoop-acm-dm-sig-jan2010.pdf





