Run Hadoop examples

weixin_34186128

License: CC 4.0 BY-SA

Tags: big data, databases

Original link: http://www.cnblogs.com/licheng/archive/2011/11/08/2241780.html

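This article lists the example programs that ship with Hadoop MapReduce, including word count (wordcount), aggregate word count (aggregatewordcount), and database count (dbcount), and shows how to run wordcount on a file copied into the Hadoop file system.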

root@u1:/home/sa/hod/hadoop-0.20.1# bin/hadoop fs -put ./conf/core-site.xml /input
root@u1:/home/sa/hod/hadoop-0.20.1# bin/hadoop jar hadoop-*-examples.jar wordcount /input /output2
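
The two commands above copy `core-site.xml` to `/input` in the Hadoop file system and run `wordcount` over it, writing the results to `/output2`. As a rough local cross-check, the sketch below approximates what such a job computes using standard Unix tools. It assumes that words are split on whitespace only and that each result line is a tab-separated word and count, sorted by byte order; the function name `local_wordcount` is hypothetical and not part of Hadoop.

```shell
# Local approximation of a word count (an assumption-based sketch, not Hadoop code).
# Usage: local_wordcount <file>
local_wordcount() {
  # 1. Turn every run of whitespace into a single newline (one word per line).
  # 2. Drop empty lines caused by leading whitespace.
  # 3. Sort in byte order so identical words become adjacent.
  # 4. Count adjacent duplicates, then print "word<TAB>count".
  tr -s '[:space:]' '\n' < "$1" \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2 "\t" $1 }'
}
```

For instance, the output of `local_wordcount ./conf/core-site.xml` could be compared with the job's result files under `/output2`, which can be printed with `bin/hadoop fs -cat`.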

root@u1:/home/sa/hod/hadoop-0.20.1# bin/hadoop jar hadoop-*-examples.jar
An example program must be given as the first argument.
Valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
dbcount: An example job that count the pageview counts from a database.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using monte-carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sleep: A job that sleeps at each map and reduce task.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files.
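
The listing describes `pi` as estimating Pi with a Monte Carlo method. To illustrate the idea only, here is a minimal local sketch of a plain Monte Carlo estimate: sample random points in the unit square and take four times the fraction that falls inside the quarter circle. The function name `estimate_pi` and its two parameters (sample count and seed) are hypothetical; the bundled program's actual sampling scheme and arguments are not shown in the source and may differ.

```shell
# Plain Monte Carlo estimate of Pi (illustrative sketch, not Hadoop's implementation).
# Usage: estimate_pi <samples> <seed>
estimate_pi() {
  awk -v n="$1" -v seed="$2" 'BEGIN {
    srand(seed)                       # seed the generator for repeatable runs
    inside = 0
    for (i = 0; i < n; i++) {
      x = rand(); y = rand()          # random point in the unit square
      if (x * x + y * y <= 1) inside++  # point lies inside the quarter circle
    }
    # area of quarter circle / area of square = Pi / 4
    printf "%.4f\n", 4 * inside / n
  }'
}

estimate_pi 100000 42
```

In a map/reduce setting, the sampling loop is the part that could be split across map tasks, with a single reduce step summing the `inside` counts.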