Number of Maps and Reduces

This article explains how, in Hadoop, the number of map tasks for a job is determined by the number of input splits, and points out that the mapred.map.tasks parameter is only a hint to the InputFormat about the number of maps. It also describes how to control how many map tasks run in parallel on each task tracker.

The number of map tasks for a given job is driven by the number of input splits, not by the mapred.map.tasks parameter. For each input split, a map task is spawned, so over the lifetime of a MapReduce job the number of map tasks equals the number of input splits. mapred.map.tasks is just a hint to the InputFormat about the number of maps.
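To make this concrete, here is a minimal driver sketch in Java (new MapReduce API; the class name and the 64/128 MB figures are illustrative assumptions, not values from the original question). Setting mapred.map.tasks is only a hint, while the min/max split size settings actually change how many splits, and therefore how many map tasks, the job gets:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SplitCountDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Only a hint: FileInputFormat is free to ignore it.
        conf.setInt("mapred.map.tasks", 10);

        Job job = Job.getInstance(conf, "split-count-demo");
        job.setJarByClass(SplitCountDemo.class);
        // job.setMapperClass(...); job.setReducerClass(...);  // application-specific

        // These settings really influence the number of splits (and therefore maps):
        // splitSize = max(minSize, min(maxSize, blockSize))
        FileInputFormat.setMinInputSplitSize(job, 64L * 1024 * 1024);   // 64 MB
        FileInputFormat.setMaxInputSplitSize(job, 128L * 1024 * 1024);  // 128 MB

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Because FileInputFormat computes splitSize = max(minSize, min(maxSize, blockSize)) per file, lowering the maximum split size below the HDFS block size is the usual way to get more maps, and raising the minimum split size above it is the usual way to get fewer.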

In your example, Hadoop has determined that there are 24 input splits and will spawn 24 map tasks in total. What you can control is how many of those map tasks each task tracker executes in parallel.
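As a back-of-the-envelope illustration of where a figure like 24 can come from, the sketch below reproduces FileInputFormat's split arithmetic for a hypothetical 3 GB of input stored in 128 MB HDFS blocks (the sizes are assumptions chosen so the numbers work out; they are not taken from the original example). The per-node parallelism, by contrast, is a daemon-side setting in classic MapReduce, mapred.tasktracker.map.tasks.maximum in each task tracker's mapred-site.xml, not something the job submits.

public class SplitMath {
    public static void main(String[] args) {
        long blockSize  = 128L * 1024 * 1024;       // dfs.blocksize on the cluster
        long minSize    = 1L;                       // mapreduce.input.fileinputformat.split.minsize
        long maxSize    = Long.MAX_VALUE;           // mapreduce.input.fileinputformat.split.maxsize
        long totalInput = 3L * 1024 * 1024 * 1024;  // e.g. 3 GB spread over the input files

        // Same formula FileInputFormat uses: max(minSize, min(maxSize, blockSize))
        long splitSize = Math.max(minSize, Math.min(maxSize, blockSize));
        long mapTasks  = (totalInput + splitSize - 1) / splitSize;  // ceiling division

        System.out.println("split size = " + splitSize + " bytes");
        System.out.println("map tasks  = " + mapTasks);             // 24 for these numbers
    }
}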

For more information on the number of map and reduce tasks, see the URL below:

http://wiki.apache.org/hadoop/HowManyMapsAndReduces

References

http://stackoverflow.com/questions/6885441/setting-the-number-of-map-tasks-and-reduce-tasks
