[MapReduce] Streaming Job Failed!

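While trying to run a MapReduce program written in Python, an error came up in the Hadoop environment. After checking, the error could have come from the MR scripts, the environment configuration, or the jar. It was finally resolved by finding the correct location of hadoop-streaming.jar and updating the jar. Lessons learned: a ClassNotFound exception may mean the jar does not match the environment, and reading the run logs is essential for locating the problem.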
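Since reading the logs is what located the problem, here is a minimal sketch of pulling Java exception names out of job log text. `find_exceptions` and the regular expression are my own illustrative helpers, not part of Hadoop, and the pattern is a rough heuristic that assumes exception class names end in `Exception` or `Error`.

```python
import re
import sys

# Heuristic: a Java class name (optionally package-qualified) that ends in
# "Exception" or "Error", e.g. java.lang.ClassNotFoundException.
EXCEPTION_RE = re.compile(r"[\w.$]+(?:Exception|Error)\b")


def find_exceptions(log_text):
    """Return (line_number, exception_name) for the first hit on each log line."""
    hits = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        match = EXCEPTION_RE.search(line)
        if match:
            hits.append((lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    # Read log text from stdin and print every line that names an exception.
    for lineno, name in find_exceptions(sys.stdin.read()):
        print(f"line {lineno}: {name}")
```

One way to use it is to pipe the job's console output or container logs into the script; a `ClassNotFoundException` in the result is a hint to check whether the jar matches the installed Hadoop version.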


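How the error occurred:

I wrote an MR program in Python, and it tested fine locally in a Linux environment.
Testing it in the Hadoop environment produced an error.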
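For reference, a local test of a streaming job usually means imitating the map, sort, reduce pipeline on one machine. The sketch below does that in a single Python process. The original mapper.py and reducer.py are not shown in this post, so the word-count logic and the names `mapper`, `reducer`, and `run_local` are assumptions made purely for illustration.

```python
from itertools import groupby


def mapper(lines):
    """Emit (word, 1) for every whitespace-separated token (assumed logic)."""
    for line in lines:
        for word in line.split():
            yield word, 1


def reducer(pairs):
    """Sum counts per key; pairs must arrive sorted by key, as after the shuffle."""
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(count for _, count in group)


def run_local(lines):
    """Imitate `cat data | ./mapper.py | sort | ./reducer.py` in one process."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    for word, total in run_local(["a b a", "b c"]):
        print(f"{word}\t{total}")
```

A local run like this only checks the script logic. It says nothing about whether the cluster can find the streaming jar or the interpreter, which is why a job can pass locally and still fail on Hadoop.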

My environment:

$hadoop version
Hadoop 2.5.2
...

Command executed:

hadoop jar $HADOOP_INSTALL_HOME/contrib/streaming/hadoop-*streaming*.jar   \
-file ./mapper.py -mapper ./mapper.py \
-file ./reducer.py -reducer ./reducer.py \
-input /data/poem/data_test \
-output /data/poem/result
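The command above depends on a glob under `contrib/streaming` matching a jar. If it does not, or it matches a jar from another Hadoop version, the real path has to be found first. Here is a minimal sketch of searching an install directory for streaming jars. `find_streaming_jars` is a hypothetical helper of mine, and falling back to the `HADOOP_HOME` variable is an assumption about the environment. In Hadoop 2.x binary distributions the jar is commonly found under `share/hadoop/tools/lib/`, but the search does not rely on that.

```python
import fnmatch
import os


def find_streaming_jars(hadoop_home):
    """Return sorted paths of every hadoop streaming jar under hadoop_home."""
    matches = []
    for root, _dirs, files in os.walk(hadoop_home):
        for name in files:
            # Same pattern as the glob used in the hadoop jar command.
            if fnmatch.fnmatch(name, "hadoop-*streaming*.jar"):
                matches.append(os.path.join(root, name))
    return sorted(matches)


if __name__ == "__main__":
    # HADOOP_INSTALL_HOME comes from the command above; HADOOP_HOME is an assumed fallback.
    home = os.environ.get("HADOOP_INSTALL_HOME") or os.environ.get("HADOOP_HOME", "")
    for path in find_streaming_jars(home):
        print(path)
```

The search may also list source or test jars, so pick the one whose version matches the output of `hadoop version`.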

Error output:

packageJobJar: [mapper.py, reducer.py] [] /tmp/streamjob4957099323859594325.jar tmpDir=null
17/04/13 15:10:52 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/04/13 15:10:53 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/04/13 15:10:56 INFO mapred.FileInputFormat: Total input paths to process : 2
17/04/13 15:10:56 INFO mapreduce.JobSubmitter: number of splits:2
17/04/13 15:10:57 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1492067422224_0001
17/04/13 15