2.28 Common MapReduce Optimizations in Practice

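This article covers the key points of Hadoop performance tuning: choosing a reasonable number of reduce tasks, compressing map task output, adjusting shuffle phase parameters, and allocating virtual CPUs to map and reduce tasks. It aims to help readers decide which of these settings to adjust for a given workload.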

I. Optimization Points

  • Number of reduce tasks
  • Map task output compression
  • Shuffle phase parameters
  • Virtual CPUs allocated to map and reduce tasks

 

II. Number of Reduce Tasks

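By default, a job runs with a single reduce task.

More reduce tasks are not always better. In practice, test different values and adjust until you find the best count for the job. The count is set in the driver as follows: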

job.setNumReduceTasks(2);
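
The same setting can also be supplied as configuration instead of driver code, through the mapreduce.job.reduces property (default 1). The fragment below is a minimal sketch, assuming the job picks up its settings from mapred-site.xml or a job-specific configuration file. The value 2 only mirrors the driver call above and is not a recommendation.

```xml
<configuration>
  <property>
    <name>mapreduce.job.reduces</name>
    <!-- Illustrative value; measure the job with several counts before settling on one. -->
    <value>2</value>
  </property>
</configuration>
```

If the driver also calls job.setNumReduceTasks, that call writes the same property into the job configuration, so the value set in code is the one the job uses.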

 

III. Map Task Output Compression

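This was covered in the previous section. In short, compressing the intermediate map output reduces the amount of data written to local disk and transferred to the reducers during the shuffle.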
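
As a quick reminder, map output compression is controlled by two properties in mapred-default.xml: mapreduce.map.output.compress (default false) and mapreduce.map.output.compress.codec (default org.apache.hadoop.io.compress.DefaultCodec). The sketch below assumes the Snappy codec is available on every node of the cluster; that is an assumption of this example, not something the original article states.

```xml
<configuration>
  <property>
    <name>mapreduce.map.output.compress</name>
    <!-- Turn on compression of intermediate map output. -->
    <value>true</value>
  </property>
  <property>
    <name>mapreduce.map.output.compress.codec</name>
    <!-- Assumed codec; requires Snappy support on the cluster nodes. -->
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
</configuration>
```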

 

IV. Shuffle Phase Parameters

For details, see mapred-default.xml.

The following parameters can be tuned:

mapreduce.task.io.sort.factor:

<property>
  <name>mapreduce.task.io.sort.factor</name>
  <value>10</value>
  <description>The number of streams to merge at once while sorting
  files.  This determines the number of open file handles.</description>
</property>

 

mapreduce.task.io.sort.mb:

<property>
  <name>mapreduce.task.io.sort.mb</name>
  <value>100</value>
  <description>The total amount of buffer memory to use while sorting 
  files, in megabytes.  By default, gives each merge stream 1MB, which
  should minimize seeks.</description>
</property>

 

mapreduce.map.sort.spill.percent:

<property>
  <name>mapreduce.map.sort.spill.percent</name>
  <value>0.80</value>
  <description>The soft limit in the serialization buffer. Once reached, a
  thread will begin to spill the contents to disk in the background. Note that
  collection will not block if this threshold is exceeded while a spill is
  already in progress, so spills may be larger than this threshold when it is
  set to less than .5</description>
</property>
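
With the defaults shown above, a map task starts spilling to disk once its sort buffer holds roughly 100 MB × 0.80 = 80 MB of data. The fragment below is a sketch of how these three settings might be overridden in mapred-site.xml for a job with large map outputs. The specific values are illustrative assumptions, not recommendations from the original article, and should be validated by measuring spill counts on the actual job.

```xml
<configuration>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <!-- Larger sort buffer (illustrative); it must fit inside the map task's JVM heap. -->
    <value>200</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <!-- Merge more streams per pass (illustrative); this also raises the number of open file handles. -->
    <value>50</value>
  </property>
  <property>
    <name>mapreduce.map.sort.spill.percent</name>
    <!-- Left at the default; with a 200 MB buffer, spilling starts at about 160 MB. -->
    <value>0.80</value>
  </property>
</configuration>
```

A larger buffer generally means fewer spill files per map task, and a larger merge factor means fewer merge passes, at the cost of more memory and more open files.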

 

V. Virtual CPUs Allocated to Map and Reduce Tasks

By default, each map task and each reduce task requests one virtual core; this can be adjusted in practice.

1. map

mapreduce.map.cpu.vcores:

<property>
  <name>mapreduce.map.cpu.vcores</name>
  <value>1</value>
  <description>
      The number of virtual cores required for each map task.
  </description>
</property>

 

2. reduce

mapreduce.reduce.cpu.vcores:

<property>
  <name>mapreduce.reduce.cpu.vcores</name>
  <value>1</value>
  <description>
      The number of virtual cores required for each reduce task.
  </description>
</property>
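
To request more than one virtual core per task, both properties can be overridden in mapred-site.xml or in the job configuration. The fragment below is a sketch with illustrative values, assuming the tasks are CPU-bound enough to benefit and that the cluster permits the request (YARN caps it with yarn.scheduler.maximum-allocation-vcores). Whether the scheduler actually takes vcores into account depends on how its resource calculator is configured, so treat this as a starting point to verify on your own cluster.

```xml
<configuration>
  <property>
    <name>mapreduce.map.cpu.vcores</name>
    <!-- Illustrative value: each map task requests two virtual cores. -->
    <value>2</value>
  </property>
  <property>
    <name>mapreduce.reduce.cpu.vcores</name>
    <!-- Illustrative value: each reduce task requests two virtual cores. -->
    <value>2</value>
  </property>
</configuration>
```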

Reposted from: https://www.cnblogs.com/weiyiming007/p/10717143.html
