After the job was launched, the map phase failed before even reaching 0% progress, immediately throwing:
org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
This reply gave me the hint I needed:
For us to see why your job is running out of memory we would probably need to see your code. Perhaps you are creating memory-intensive objects every map() that could instead be created once in setup() and re-used every map()?
I went back and checked my code, and sure enough, a HashSet was being initialized inside map(). Moving the initialization to setup() and calling clear() before each use in map() fixed the problem.
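A minimal sketch of the change, assuming a hypothetical mapper (DedupTokenMapper) that de-duplicates tokens per input line; the class and field names are illustrative, not the original code. The key point is that the HashSet is allocated once in setup() and only cleared in map(), instead of being constructed anew for every record:

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class DedupTokenMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    // Allocated once per task attempt and re-used for every record.
    private Set<String> seen;

    @Override
    protected void setup(Context context) {
        seen = new HashSet<>();   // was previously: new HashSet<>() inside map()
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        seen.clear();             // reset state instead of re-allocating per record
        for (String token : value.toString().split("\\s+")) {
            if (seen.add(token)) {
                // emit each distinct token of this line exactly once
                context.write(new Text(token), NullWritable.get());
            }
        }
    }
}
```

Re-using one collection per task keeps allocation pressure on the young generation low and avoids building up large temporary objects for every map() call.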
In short: this post walked through an out-of-memory failure hit while processing large data with the MapReduce framework, and the fix of initializing objects in setup() and re-using them. Moving the HashSet initialization from map() to setup() and clearing the set before each use resolved the task failures caused by exhausted heap space.