Traffic Summation: Implementing Hadoop's Serialization Interface for Custom Types in MapReduce (Day Eight)

Requirement: from the input log, output for every user (phone number) the upstream and downstream traffic consumed visiting each website, plus their total.


 * What this example demonstrates: how a custom data type implements Hadoop's serialization interface
1. The class must keep a no-arg constructor (Hadoop instantiates it by reflection during deserialization)
2. The order in which the write method emits the fields' binary data must match the order in which the readFields method reads them back

Workflow:

1. Create a FlowBean to hold one access record's traffic

2. Write the Mapper

3. Write the Reducer

4. Write a JobSubmitter to run and test the job

 

FlowBean:


import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

public class FlowBean implements Writable {

	private int upFlow;
	private int dFlow;
	private String phone;
	private int amountFlow;

	public FlowBean(){}
	
	public FlowBean(String phone, int upFlow, int dFlow) {
		this.phone = phone;
		this.upFlow = upFlow;
		this.dFlow = dFlow;
		this.amountFlow = upFlow + dFlow;
	}

	public String getPhone() {
		return phone;
	}

	public void setPhone(String phone) {
		this.phone = phone;
	}

	public int getUpFlow() {
		return upFlow;
	}

	public void setUpFlow(int upFlow) {
		this.upFlow = upFlow;
	}

	public int getdFlow() {
		return dFlow;
	}

	public void setdFlow(int dFlow) {
		this.dFlow = dFlow;
	}

	public int getAmountFlow() {
		return amountFlow;
	}

	public void setAmountFlow(int amountFlow) {
		this.amountFlow = amountFlow;
	}

	/**
	 * Called by the Hadoop framework when serializing an object of this class
	 */
	@Override
	public void write(DataOutput out) throws IOException {

		out.writeInt(upFlow);
		out.writeUTF(phone);
		out.writeInt(dFlow);
		out.writeInt(amountFlow);

	}

	/**
	 * Called by the Hadoop framework when deserializing an object of this class
	 */
	@Override
	public void readFields(DataInput in) throws IOException {
		this.upFlow = in.readInt();
		this.phone = in.readUTF();
		this.dFlow = in.readInt();
		this.amountFlow = in.readInt();
	}

	@Override
	public String toString() {
		return this.phone + "," + this.upFlow + "," + this.dFlow + "," + this.amountFlow;
	}
	
}
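
The write/readFields symmetry can be checked without a cluster. The sketch below is a stand-alone round trip using only plain JDK streams (no Hadoop dependency); the `FlowBean` inside it is a minimal copy of the class above, and the phone number and flow values are made-up sample data:

```java
import java.io.*;

// Round-trip demo: serialize with write(), deserialize with readFields(),
// and confirm the fields survive intact because both use the same order.
public class FlowBeanRoundTrip {
    static class FlowBean {
        int upFlow, dFlow, amountFlow;
        String phone;

        void write(DataOutput out) throws IOException {
            out.writeInt(upFlow);
            out.writeUTF(phone);
            out.writeInt(dFlow);
            out.writeInt(amountFlow);
        }

        void readFields(DataInput in) throws IOException {
            this.upFlow = in.readInt();   // must match write()'s order exactly
            this.phone = in.readUTF();
            this.dFlow = in.readInt();
            this.amountFlow = in.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        FlowBean src = new FlowBean();
        src.phone = "13512345678";  // sample data
        src.upFlow = 100;
        src.dFlow = 200;
        src.amountFlow = 300;

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        src.write(new DataOutputStream(buf));

        FlowBean dst = new FlowBean();
        dst.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));

        System.out.println(dst.phone + "," + dst.upFlow + "," + dst.dFlow + "," + dst.amountFlow);
    }
}
```

If readFields ever read the ints before the UTF string, the round trip would throw or produce garbage, which is exactly the mistake rule 2 above guards against.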

Mapper:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class FlowCountMapper extends Mapper<LongWritable, Text, Text, FlowBean>{
	
	
	@Override
	protected void map(LongWritable key, Text value, Context context)
			throws IOException, InterruptedException {// adapt this parsing logic to your own log format; this is only an example
		
		String line = value.toString();
		String[] fields = line.split("\t");
		
		String phone = fields[1];
		
		int upFlow = Integer.parseInt(fields[fields.length-4]);
		int dFlow = Integer.parseInt(fields[fields.length-3]);
		
		context.write(new Text(phone), new FlowBean(phone, upFlow, dFlow));
	}
}
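
The parsing logic above indexes the up/down flow columns from the end of the line (`fields.length-4`, `fields.length-3`) because trailing columns can vary per record. A quick stand-alone check of that indexing, using a hypothetical tab-separated log line (the field layout here is assumed for illustration, not taken from a real dataset):

```java
// Demo of the mapper's field extraction on one assumed log line:
// phone at index 1, upFlow and dFlow counted from the end of the record.
public class LineParseDemo {
    public static void main(String[] args) {
        String line = "1363157985066\t13726230503\t120.196.100.82\tvideo.site.com\tvideo\t2481\t24681\t200\t0";
        String[] fields = line.split("\t");

        String phone = fields[1];
        int upFlow = Integer.parseInt(fields[fields.length - 4]);
        int dFlow = Integer.parseInt(fields[fields.length - 3]);

        System.out.println(phone + "," + upFlow + "," + dFlow + "," + (upFlow + dFlow));
    }
}
```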

Reducer:

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class FlowCountReducer extends Reducer<Text, FlowBean, Text, FlowBean>{
	/**
	 *  key: a phone number
	 *  values: the traffic data from all access records produced by that phone number
	 *
	 *  <135,flowBean1><135,flowBean2><135,flowBean3><135,flowBean4>
	 */
	@Override
	protected void reduce(Text key, Iterable<FlowBean> values, Reducer<Text, FlowBean, Text, FlowBean>.Context context)
			throws IOException, InterruptedException {
		int upSum = 0;
		int dSum = 0;
		
		for(FlowBean value:values){
			upSum += value.getUpFlow();
			dSum += value.getdFlow();
		}
		context.write(key, new FlowBean(key.toString(), upSum, dSum));
	}
}
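
The effect of the shuffle plus the reduce loop can be sketched in plain Java: group mapper-style (phone, up, down) records by phone and total each group. The records below are made-up sample values; a TreeMap stands in for the framework's sorted grouping:

```java
import java.util.Map;
import java.util.TreeMap;

// Cluster-free sketch of the reduce phase: group by phone, sum up/down flow.
public class ReduceSumDemo {
    public static void main(String[] args) {
        // (phone, upFlow, dFlow) records, as the mapper would emit them
        String[][] records = {
            {"13726230503", "2481", "24681"},
            {"13726230503", "100", "200"},
            {"13560439658", "918", "4938"}
        };

        // the shuffle/sort groups values by key; a TreeMap plays that role here
        Map<String, int[]> sums = new TreeMap<>();
        for (String[] r : records) {
            int[] s = sums.computeIfAbsent(r[0], k -> new int[2]);
            s[0] += Integer.parseInt(r[1]);
            s[1] += Integer.parseInt(r[2]);
        }

        // one output line per phone: phone,upSum,dSum,total
        for (Map.Entry<String, int[]> e : sums.entrySet()) {
            int up = e.getValue()[0], down = e.getValue()[1];
            System.out.println(e.getKey() + "," + up + "," + down + "," + (up + down));
        }
    }
}
```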

JobSubmitter:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class JobSubmitter {

	public static void main(String[] args) throws Exception {

		Configuration conf = new Configuration();

		Job job = Job.getInstance(conf);

		job.setJarByClass(JobSubmitter.class);

		job.setMapperClass(FlowCountMapper.class);
		job.setReducerClass(FlowCountReducer.class);
		
		job.setMapOutputKeyClass(Text.class);
		job.setMapOutputValueClass(FlowBean.class);
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(FlowBean.class);

		FileInputFormat.setInputPaths(job, new Path("F:\\mrdata\\flow\\input"));
		FileOutputFormat.setOutputPath(job, new Path("F:\\mrdata\\flow\\province-output"));

		job.waitForCompletion(true);

	}
}

 
