Java heap space error
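During the experiment, a task failed while the map phase was running, with the error `java heap space`. The cause is that the JVM heap is too small for what the task needs. The usual fixes are to improve the program so it consumes less memory, or to increase the JVM memory available to tasks on the datanode machines.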
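Running `java -Xmx` on the datanode failed with an error saying the virtual machine could not be created (I don't know why), so the only option left was to change the default setting in the mapred configuration.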
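Looking in mapred-default.xml, the maximum JVM heap for task child processes defaults to 200 MB. That is not enough here, so we raise it to 512 MB. The default entry is: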
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx200m</value>
<description>Java opts for the task tracker child processes.
The following symbol, if present, will be interpolated: @taskid@ is replaced
by current TaskID. Any other occurrences of '@' will go unchanged.
For example, to enable verbose gc logging to a file named for the taskid in
/tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc
The configuration variable mapred.child.ulimit can be used to control the
maximum virtual memory of the child processes.
</description>
</property>
Then add the following to mapred-site.xml, which overrides the default:
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx512m</value>
<description>Java opts for the task tracker child processes.
The following symbol, if present, will be interpolated: @taskid@ is replaced
by current TaskID. Any other occurrences of '@' will go unchanged.
For example, to enable verbose gc logging to a file named for the taskid in
/tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
-Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc
The configuration variable mapred.child.ulimit can be used to control the
maximum virtual memory of the child processes.
</description>
</property>
That is all that is needed.
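For reference, here is a minimal sketch of what the whole mapred-site.xml might look like after this change. It assumes the file holds no other properties, which is unlikely on a real cluster, so merge the entry into whatever is already there. Properties in Hadoop site files sit inside a `<configuration>` root element, and the `<description>` is only documentation, so it is left out here.

```xml
<?xml version="1.0"?>
<!-- mapred-site.xml: site-specific overrides of mapred-default.xml.
     Assumption: no other properties are configured in this file. -->
<configuration>
  <property>
    <!-- JVM options passed to each map/reduce child process -->
    <name>mapred.child.java.opts</name>
    <!-- Raise the maximum heap from the 200 MB default to 512 MB -->
    <value>-Xmx512m</value>
  </property>
</configuration>
```

The value is an ordinary JVM option string, so further flags such as the `-verbose:gc` example from the description can be appended after `-Xmx512m`, separated by spaces.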