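MapReduce worked examples, how MapTask runs, how ReduceTask runs, the MapReduce execution flow, Hadoop data compression, implementing Join

This article works through practical MapReduce examples, such as sorting by upstream traffic in descending order and partitioning by phone number. It then examines how MapTask and ReduceTask run, including data partitioning and the shuffle phase of the MapReduce execution flow. It also covers configuring data compression in Hadoop and implementing Join, comparing the reduce-side and map-side strategies and their advantages.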


MapReduce worked examples

Sorting by upstream traffic in descending order

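  • Step 1: define FlowBean and implement WritableComparable so that records can be compared and sorted
    Notes on Java's compareTo method
    compareTo compares the current object with the argument passed to the method.
    It returns 0 if the current object is equal to the argument.
    It returns a negative number (conventionally -1) if the current object is less than the argument.
    It returns a positive number (conventionally 1) if the current object is greater than the argument.
    For example: o1.compareTo(o2);
    If the result is positive, the current object (o1, the one compareTo is called on) is placed after the argument (o2); if the result is negative, it is placed before. To sort in descending order, the comparison is therefore reversed so that the larger value yields a negative result.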
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

public class FlowBean implements WritableComparable<FlowBean> {
    private Integer upFlow;
    private Integer downFlow;
    private Integer upCountFlow;   // upstream traffic, used as the sort key
    private Integer downCountFlow;

    // Hadoop needs a no-argument constructor to create the bean during deserialization.
    public FlowBean() {
    }

    public FlowBean(Integer upFlow, Integer downFlow, Integer upCountFlow, Integer downCountFlow) {
        this.upFlow = upFlow;
        this.downFlow = downFlow;
        this.upCountFlow = upCountFlow;
        this.downCountFlow = downCountFlow;
    }

    // Serialization: the field order here must match the order in readFields.
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(upFlow);
        out.writeInt(downFlow);
        out.writeInt(upCountFlow);
        out.writeInt(downCountFlow);
    }

    // Deserialization: read the fields back in the order they were written.
    @Override
    public void readFields(DataInput in) throws IOException {
        upFlow = in.readInt();
        downFlow = in.readInt();
        upCountFlow = in.readInt();
        downCountFlow = in.readInt();
    }

    public Integer getUpFlow() {
        return upFlow;
    }

    public void setUpFlow(Integer upFlow) {
        this.upFlow = upFlow;
    }

    public Integer getDownFlow() {
        return downFlow;
    }

    public void setDownFlow(Integer downFlow) {
        this.downFlow = downFlow;
    }

    public Integer getUpCountFlow() {
        return upCountFlow;
    }

    public void setUpCountFlow(Integer upCountFlow) {
        this.upCountFlow = upCountFlow;
    }

    public Integer getDownCountFlow() {
        return downCountFlow;
    }

    public void setDownCountFlow(Integer downCountFlow) {
        this.downCountFlow = downCountFlow;
    }

    // Tab-separated so the bean can be written directly as an output column group.
    @Override
    public String toString() {
        return upFlow + "\t" + downFlow + "\t" + upCountFlow + "\t" + downCountFlow;
    }
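    // Descending order by upCountFlow: the arguments are swapped so that the
    // larger value sorts first. Unlike "a > b ? -1 : 1", this returns 0 for
    // equal values, as the compareTo contract requires. Keys that compare as
    // equal are grouped into one reduce call, so a reducer must iterate over
    // all of its values to avoid dropping records.
    @Override
    public int compareTo(FlowBean o) {
        return Integer.compare(o.upCountFlow, this.upCountFlow);
    }
}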
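The ordering rule from step 1 can be tried without a cluster. The following is a minimal sketch in plain Java, assuming a hypothetical UpFlowRecord class that stands in for FlowBean and keeps only the sort field; the class names and sample phone numbers are illustrative and not part of the original code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for FlowBean: only the sort field, no Hadoop types.
class UpFlowRecord implements Comparable<UpFlowRecord> {
    final String phone;
    final int upCountFlow;

    UpFlowRecord(String phone, int upCountFlow) {
        this.phone = phone;
        this.upCountFlow = upCountFlow;
    }

    // Arguments are swapped, so a larger upCountFlow yields a negative
    // result and is placed first (descending order).
    @Override
    public int compareTo(UpFlowRecord other) {
        return Integer.compare(other.upCountFlow, this.upCountFlow);
    }
}

class DescendingSortDemo {
    // Returns the phone numbers ordered by upCountFlow, largest first.
    static List<String> phonesByUpFlowDesc(List<UpFlowRecord> records) {
        List<UpFlowRecord> sorted = new ArrayList<>(records);
        Collections.sort(sorted); // uses UpFlowRecord.compareTo
        List<String> phones = new ArrayList<>();
        for (UpFlowRecord r : sorted) {
            phones.add(r.phone);
        }
        return phones;
    }

    public static void main(String[] args) {
        List<UpFlowRecord> records = new ArrayList<>();
        records.add(new UpFlowRecord("13500000001", 120));
        records.add(new UpFlowRecord("13500000002", 980));
        records.add(new UpFlowRecord("13500000003", 450));
        System.out.println(phonesByUpFlowDesc(records));
    }
}
```

With the sample data above, the record with 980 comes first and the one with 120 comes last. Two records with the same upCountFlow compare as 0, which is what lets a sort treat them as ties.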
  • Step 2: define FlowMapper
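import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class FlowMapper extends Mapper<LongWritable, Text, FlowBean, Text> {
    // Reused across map() calls. Despite its name, outKey holds the output
    // value (the phone number); the FlowBean is the output key.
    private final Text outKey = new Text();
    private final FlowBean flowBean = new FlowBean();

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Columns: phone number, upFlow, downFlow, upCountFlow, downCountFlow
        String[] split = value.toString().split("\t");
        flowBean.setUpFlow(Integer.parseInt(split[1]));
        flowBean.setDownFlow(Integer.parseInt(split[2]));
        flowBean.setUpCountFlow(Integer.parseInt(split[3]));
        flowBean.setDownCountFlow(Integer.parseInt(split[4]));
        outKey.set(split[0]);
        // Emit the bean as the key so the shuffle sorts by FlowBean.compareTo.
        context.write(flowBean, outKey);
    }
}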
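The mapper in step 2 relies on each input line having five tab-separated columns. As a rough sketch of that parsing step in isolation, the hypothetical helper below (FlowLineParser is not part of the original code) splits a line the same way and, as an added assumption, returns null for malformed lines instead of throwing.

```java
// Hypothetical helper, shown only to illustrate the column layout the mapper expects.
class FlowLineParser {
    // Parses "phone\tupFlow\tdownFlow\tupCountFlow\tdownCountFlow" and returns
    // the four numeric columns in that order, or null if the line is malformed.
    static int[] parseFlows(String line) {
        String[] split = line.split("\t");
        if (split.length < 5) {
            return null; // too few columns
        }
        int[] flows = new int[4];
        try {
            for (int i = 0; i < 4; i++) {
                flows[i] = Integer.parseInt(split[i + 1].trim());
            }
        } catch (NumberFormatException e) {
            return null; // a non-numeric column
        }
        return flows;
    }

    public static void main(String[] args) {
        int[] flows = parseFlows("13500000001\t3\t5\t120\t900");
        // flows[2] is upCountFlow, the column the job sorts on.
        System.out.println(flows[2]);
    }
}
```

Whether a real job should skip bad lines, count them, or fail fast is a design choice the original does not address; the mapper as written throws on malformed input.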
  • Step 3: define FlowReducer
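import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class FlowReducer extends Reducer<FlowBean, Text, Text, FlowBean> {
    @Override
    protected void reduce(FlowBean key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        // Keys that compare as equal arrive in the same reduce call, so write
        // every phone number rather than only the first one.
        for (Text phone : values) {
            context.write(phone, key);
        }
    }
}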
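Between map and reduce, the shuffle sorts records by key and hands each group of equal keys to one reduce call. The sketch below approximates that behavior with a TreeMap in plain Java; ShuffleGroupingDemo and its sample records are hypothetical, and real Hadoop sorts and groups serialized records on disk and in memory rather than in a map like this.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of sort-and-group; not how Hadoop implements shuffle.
class ShuffleGroupingDemo {
    // Each record is {phone, upCountFlow}. Keys are sorted in descending
    // order and records with the same key end up in the same group.
    static Map<Integer, List<String>> groupByUpFlowDesc(String[][] records) {
        Map<Integer, List<String>> groups = new TreeMap<>(Comparator.reverseOrder());
        for (String[] record : records) {
            int upCountFlow = Integer.parseInt(record[1]);
            groups.computeIfAbsent(upCountFlow, k -> new ArrayList<>()).add(record[0]);
        }
        return groups;
    }

    // Plays the role of the reducer: one output line per value in each group.
    static List<String> reduceAll(Map<Integer, List<String>> groups) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> group : groups.entrySet()) {
            for (String phone : group.getValue()) {
                out.add(phone + "\t" + group.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] records = {
            {"13500000001", "120"},
            {"13500000002", "980"},
            {"13500000003", "120"},
        };
        for (String line : reduceAll(groupByUpFlowDesc(records))) {
            System.out.println(line);
        }
    }
}
```

In the sample, two phone numbers share the value 120 and land in one group. A reducer that wrote only the first value of each group would silently lose the second phone number, which is why iterating over all values matters once equal keys are grouped.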
  • Step 4: the program's main entry point
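import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;

public class FlowMain extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // Reuse the Configuration supplied to this Tool.
        Configuration conf = super.getConf();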
        conf.set(