Word Count Example: Processing Data with MapReduce and Writing the Result to HBase

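This article presents an example application that runs WordCount on Hadoop and stores the result in an HBase table. It covers the Hadoop job configuration, the Mapper and Reducer that count the words in a text file, and how the counts are written to HBase.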

  • WordCountDemo
package com.word;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

// Processes the data with MapReduce, then writes the result to HBase
public class WordCountDemo {
	public static void main(String[] args) {
		Configuration configuration = new Configuration();
		configuration.set("fs.defaultFS","hdfs://node01:8020");// active NameNode
		configuration.set("yarn.resourcemanager.hostname", "node03");// YARN ResourceManager host (host name only, no port)
		configuration.set("hbase.zookeeper.quorum", "node01,node02,node03");// ZooKeeper quorum
		
		try {
			// Configure the job
			Job job = Job.getInstance(configuration);
			job.setJarByClass(WordCountDemo.class);
			
			job.setMapperClass(WCMapper.class);
			job.setMapOutputKeyClass(Text.class);
			job.setMapOutputValueClass(IntWritable.class);
			
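			// Input file on HDFS
			FileInputFormat.addInputPath(job, new Path("/WC/input/word.txt"));
			// Output goes to the HBase table "wc", which must already exist
			// with the column family "cf1" that WCReducer writes to
			TableMapReduceUtil.initTableReducerJob("wc", WCReducer.class, job);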
			
			if(job.waitForCompletion(true)){
				System.out.println("~~~~~~~ ok ~~~~~~~");
			}
		} catch (IOException | ClassNotFoundException | InterruptedException e) {
			e.printStackTrace();
		}
		
	}
}

  • WCMapper
package com.word;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

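public class WCMapper extends Mapper<LongWritable, Text, Text, IntWritable>{
	// Reused across calls to avoid allocating two objects per token
	private static final IntWritable ONE = new IntWritable(1);
	private final Text word = new Text();

	@Override
	protected void map(LongWritable key, Text value, Context context)
			throws IOException, InterruptedException {
		// value holds one line of the input file; split it on whitespace
		String line = value.toString();
		StringTokenizer words = new StringTokenizer(line);
		while(words.hasMoreTokens()){
			// Emit (word, 1) for every token
			word.set(words.nextToken());
			context.write(word, ONE);
		}
	}
}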
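
To see what the mapper emits without starting a cluster, the tokenizing step can be tried in plain Java. The following is only a minimal sketch and not part of the original job: the class `TokenizeSketch` and the method `emitPairs` are hypothetical names, and a `Map.Entry<String, Integer>` stands in for the `(Text, IntWritable)` pair that the real mapper writes to the context.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical stand-alone sketch; mirrors the loop in WCMapper.map without Hadoop types.
class TokenizeSketch {

    // Splits one input line on whitespace (StringTokenizer's default delimiters)
    // and returns one (word, 1) pair per token, in order of appearance.
    static List<Map.Entry<String, Integer>> emitPairs(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer words = new StringTokenizer(line);
        while (words.hasMoreTokens()) {
            pairs.add(new SimpleEntry<>(words.nextToken(), 1));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // A repeated word is emitted once per occurrence; the summing happens in the reducer.
        for (Map.Entry<String, Integer> pair : emitPairs("hello hbase  hello")) {
            System.out.println(pair.getKey() + "\t" + pair.getValue());
        }
    }
}
```

Because the split is on whitespace only, punctuation and letter case are kept as they are, so `Hello,` and `hello` would end up as two different row keys in HBase.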

  • WCReducer
package com.word;

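import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

public class WCReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {

	@Override
	protected void reduce(Text text, Iterable<IntWritable> iterable,
			Context context) throws IOException, InterruptedException {

		// Sum the 1s emitted by the mapper for this word
		int sum = 0;
		for (IntWritable i : iterable) {
			sum += i.get();
		}
		System.out.println("============="+text.toString());

		// Row key = the word; the count is stored as decimal text in cf1:count.
		// Bytes.toBytes encodes as UTF-8 instead of the platform default charset.
		Put put = new Put(Bytes.toBytes(text.toString()));
		// Put.add(byte[], byte[], byte[]) is deprecated since HBase 1.0; use addColumn there
		put.add(Bytes.toBytes("cf1"), Bytes.toBytes("count"), Bytes.toBytes(String.valueOf(sum)));

		// TableOutputFormat ignores the key, so null is acceptable here
		context.write(null, put);
	}

}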
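
The reducer stores each count as the bytes of its decimal text, not as a 4-byte integer, so a client reading `cf1:count` has to parse a string. The sketch below illustrates this in plain Java under that assumption. It is not part of the original job: `ReduceSketch`, `sum`, `encodeCount` and `decodeCount` are hypothetical names, and a `List<Integer>` stands in for the `Iterable<IntWritable>` the real reducer receives.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; mirrors WCReducer.reduce without Hadoop or HBase types.
class ReduceSketch {

    // Sums the per-word counts, as the loop over Iterable<IntWritable> does.
    static int sum(List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Encodes the count the way the reducer does: decimal text, not a 4-byte int.
    static byte[] encodeCount(int count) {
        return String.valueOf(count).getBytes(StandardCharsets.UTF_8);
    }

    // Decodes a cell value that was written by encodeCount.
    static int decodeCount(byte[] cell) {
        return Integer.parseInt(new String(cell, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        int total = sum(Arrays.asList(1, 1, 1));
        byte[] cell = encodeCount(total);
        System.out.println(total + " is stored in " + cell.length + " byte(s)");
        System.out.println("decoded: " + decodeCount(cell));
    }
}
```

Storing text keeps the value readable in the HBase shell; the trade-off is that the cell length varies with the number of digits and the value has to be parsed on every read.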