MapReduce WordCount

Hadoop WordCount Implementation
This article walks through implementing WordCount with Hadoop MapReduce: writing the Mapper and the Reducer, and configuring and running the job.

Step 1:
Add all of the Hadoop MapReduce jar packages to the project.

Step 2: The WordCount Mapper

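import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WCMapper extends Mapper<LongWritable, Text, Text, IntWritable>{

    // Every word is emitted with a count of 1; one instance is reused for all records.
    IntWritable v = new IntWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        // key is the byte offset of the line in the input file; value is the line itself.
        String tmp = value.toString();
        String[] arr = tmp.split(" ");
        for(String s : arr){
            // Consecutive spaces produce empty tokens, which are skipped.
            if(!"".equals(s)){
                Text k = new Text(s);
                context.write(k, v);
            }
        }
    }
}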
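
The tokenizing logic of the Mapper can be tried out without a cluster. The following is a minimal sketch in plain Java, not Hadoop code: the class MapPhaseSketch and its map method are hypothetical names introduced only for illustration, and a list of String/Integer pairs stands in for the calls to context.write.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the map phase; not part of the Hadoop API.
class MapPhaseSketch {

    // Emits one (word, 1) pair per non-empty token, mirroring WCMapper.map.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String s : line.split(" ")) {
            if (!s.isEmpty()) {
                out.add(new SimpleEntry<>(s, 1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The double space yields an empty token, which is skipped.
        System.out.println(map("hello  world hello"));
        // prints [hello=1, world=1, hello=1]
    }
}
```

Note that the word "hello" appears twice in the output. The map phase does no aggregation; adding up the counts is left to the Reducer.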

Step 3: The WordCount Reducer

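import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WCReducer extends Reducer<Text, IntWritable, Text, IntWritable>{

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {

        // values holds every count emitted for this word; add them up.
        int sum = 0;
        for(IntWritable value : values){
            sum += value.get();
        }
        // Emit the word together with its total count.
        context.write(key, new IntWritable(sum));
    }
}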
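
The job driver in Step 4 registers WCReducer as the combiner as well as the reducer. This works because addition is associative and commutative: summing partial sums gives the same total as summing every 1 directly. The following is a minimal sketch in plain Java that illustrates this property; ReducePhaseSketch and its reduce method are hypothetical names, and a List of Integer stands in for the Iterable of IntWritable.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the reduce phase; not part of the Hadoop API.
class ReducePhaseSketch {

    // Sums the counts for one key, mirroring WCReducer.reduce.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Without a combiner: the reducer sees every 1 emitted for a word.
        int direct = reduce(Arrays.asList(1, 1, 1, 1, 1));

        // With a combiner: each map task pre-sums its own output,
        // and the reducer then adds up the partial totals.
        int partialA = reduce(Arrays.asList(1, 1, 1));
        int partialB = reduce(Arrays.asList(1, 1));
        int combined = reduce(Arrays.asList(partialA, partialB));

        System.out.println(direct + " " + combined); // prints "5 5"
    }
}
```

A reducer that computed something like an average could not be reused as a combiner in this way, because an average of partial averages is not in general the overall average.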

Step 4: The WordCount Main Class (RunJob)

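import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RunJob {

    public static void main(String[] args) {
        try {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Pass conf so the job uses the same configuration as the FileSystem.
            Job job = Job.getInstance(conf);
            job.setJarByClass(RunJob.class);
            job.setJobName("wordcount");

            job.setMapperClass(WCMapper.class);
            job.setReducerClass(WCReducer.class);
            // The reducer only sums counts, so it can also serve as the combiner.
            job.setCombinerClass(WCReducer.class);

            // Key/value types written by the mapper.
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);

            // Key/value types written by the reducers; three reduce tasks run in parallel.
            job.setNumReduceTasks(3);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            FileInputFormat.addInputPath(job, new Path("/data/"));

            // The job fails if the output directory already exists, so delete it first.
            Path output = new Path("/wc");
            if(fs.exists(output)){
                fs.delete(output, true);
            }
            FileOutputFormat.setOutputPath(job, output);

            // Submit the job and block until it completes, printing progress.
            boolean flag = job.waitForCompletion(true);
            if(flag){
                System.out.println("Job finished!");
            }

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}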
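
Because the job sets three reduce tasks, every word has to be routed to exactly one of them, and the output directory /wc ends up with three result files (part-r-00000 to part-r-00002). The following is a minimal sketch of that routing in plain Java, using the same formula as Hadoop's default HashPartitioner. PartitionSketch is a hypothetical name, and String.hashCode is used here as a stand-in: Text computes its hash differently, so the actual partition numbers on a cluster will not match those printed below.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of key partitioning; not part of the Hadoop API.
class PartitionSketch {

    // Same formula as the default HashPartitioner: clear the sign bit so the
    // result is never negative, then take the remainder.
    static int partition(int keyHash, int numReduceTasks) {
        return (keyHash & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 3; // matches job.setNumReduceTasks(3)
        List<String> words = Arrays.asList("hello", "world", "hadoop", "hello");
        for (String w : words) {
            // Assumption: String.hashCode stands in for Text.hashCode.
            System.out.println(w + " -> reducer " + partition(w.hashCode(), numReduceTasks));
        }
    }
}
```

Both occurrences of "hello" map to the same reducer, which is what guarantees that each word's total is computed in one place. As a consequence, each output file is sorted by key on its own, but the three files together are not globally sorted.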