LocalJobRunner

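This article walks through how a Hadoop job executes in local mode, covering the key steps: JobClient initialization, job submission through LocalJobRunner, running Map and Reduce tasks via a thread pool, and the final cleanup.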


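Hadoop jobs run in one of two modes: local or distributed. When JobClient is initialized, it reads the configuration property mapred.job.tracker (default: local). If the value is local, Hadoop runs the job in local mode; otherwise it runs the job in distributed mode. Local mode uses LocalJobRunner to submit and execute the job. Calling submitJob() on a LocalJobRunner instance creates an instance of Job (an inner class of LocalJobRunner), and that instance carries out the job.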
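The mode decision described above can be sketched in a few lines of Java. This is an illustration only, not Hadoop's actual code: the class name ExecutionModeSketch and the method isLocalMode are hypothetical, and a plain Map stands in for Hadoop's Configuration object. Only the property name mapred.job.tracker and its default value local come from the text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a plain Map stands in for Hadoop's Configuration.
class ExecutionModeSketch {
    // Property name and default value as described in the article.
    static final String JOB_TRACKER_KEY = "mapred.job.tracker";
    static final String LOCAL = "local";

    /** Returns true when the job should run in local mode. */
    static boolean isLocalMode(Map<String, String> conf) {
        // A missing property falls back to the default, "local".
        String tracker = conf.getOrDefault(JOB_TRACKER_KEY, LOCAL);
        return LOCAL.equals(tracker);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(isLocalMode(conf)); // true: property unset, default applies

        // Hypothetical tracker address, used only to show the other branch.
        conf.put(JOB_TRACKER_KEY, "jobtracker.example.com:8021");
        System.out.println(isLocalMode(conf)); // false: distributed mode
    }
}
```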
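As the figure shows, Job, the inner class of LocalJobRunner, is itself a thread, and a local-mode MapReduce job is in fact carried out by that thread. Job's inner class MapTaskRunnable implements the Runnable interface and represents a map task; each input split corresponds to one MapTaskRunnable instance. Job runs the MapTaskRunnable instances on an ExecutorService thread pool from the Java concurrency package. The pool size is the smaller of the number of splits and the value of the mapreduce.local.map.tasks.maximum property, with a minimum of 1. Once the pool is created, all MapTaskRunnable instances are submitted to it; the pool then stops accepting new tasks and waits for the running ones to finish.

After the threads finish, Job checks each MapTaskRunnable instance for an exception. If any instance has one, the map phase is considered failed and the exception is thrown immediately, ending execution. If none has, all map tasks are considered successful and execution continues with reduce. Local mode allows only 0 or 1 reduce task. ReduceTask represents a reduce task: it reads data from the map output files, performs the reduce operation, and writes the result to the specified directory.

After the reduce task finishes, some cleanup takes place: the intermediate map output is deleted, the job submission directory and the job configuration files in it are deleted, and the job's local copies of files are deleted.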
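The map phase described above follows a common thread-pool pattern, which the sketch below reproduces with the standard library only. It is not the LocalJobRunner source: the names LocalMapPhaseSketch, MapTaskSketch, poolSize, and runMapPhase are hypothetical, and the configuredMax parameter stands in for the value of mapreduce.local.map.tasks.maximum.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the map phase; no Hadoop classes are used.
class LocalMapPhaseSketch {

    /** Stand-in for MapTaskRunnable: runs one split and remembers any failure. */
    static class MapTaskSketch implements Runnable {
        private final Runnable work;
        volatile Throwable storedException; // set only if the task fails

        MapTaskSketch(Runnable work) {
            this.work = work;
        }

        @Override
        public void run() {
            try {
                work.run();
            } catch (Throwable t) {
                storedException = t; // checked later by the job thread
            }
        }
    }

    /** Pool size: the smaller of split count and configured maximum, at least 1. */
    static int poolSize(int numSplits, int configuredMax) {
        return Math.max(1, Math.min(numSplits, configuredMax));
    }

    /** Runs all map tasks, then throws if any task recorded an exception. */
    static void runMapPhase(List<MapTaskSketch> tasks, int configuredMax) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize(tasks.size(), configuredMax));
        for (MapTaskSketch task : tasks) {
            pool.submit(task);
        }
        pool.shutdown(); // stop accepting new tasks
        pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS); // wait for running ones

        // Check each task in turn; the first recorded failure aborts the job.
        for (MapTaskSketch task : tasks) {
            if (task.storedException != null) {
                throw new Exception(task.storedException);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        List<MapTaskSketch> tasks = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            final int split = i;
            tasks.add(new MapTaskSketch(() -> System.out.println("map split " + split)));
        }
        runMapPhase(tasks, 2); // 4 splits, maximum of 2 threads
        System.out.println("all map tasks succeeded; reduce would run next");
    }
}
```

Recording the exception inside the task, rather than letting it escape, is what allows the job thread to inspect every task after the pool has drained and fail the job with the original cause.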

Local execution mode:

The source code is not reproduced here; see LocalJobRunner for details.
