Memory Problems Caused by Hive Dynamic Partitioning

License: CC 4.0 BY-SA

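This article looks at the memory problems caused by Hive dynamic partitioning, in particular the out-of-memory errors and YARN container kills that occur when a job creates a large number of dynamic partitions. Setting hive.optimize.sort.dynamic.partition to true makes Hive globally sort the rows by the dynamic partition columns, so that each reducer needs only one open record writer per partition value, which effectively relieves the memory pressure.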

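## Memory Problems Caused by Hive Dynamic Partitioning
In Hive 0.14.0 and later, hive.optimize.sort.dynamic.partition defaults to false. When Hive runs a dynamic-partition insert with this default, the task log keeps printing FileSinkOperator: New Final Path, once for every new partition the task starts writing to. Each of these paths keeps its own writer open, so the task holds too many file handles and too much buffered data in memory. It then either fails with an OOM error or is killed by YARN for exceeding its memory limit. Enabling the parameter below relieves the memory pressure.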

hive.optimize.sort.dynamic.partition

Default Value: true in Hive 0.13.0 and 0.13.1; false in Hive 0.14.0 and later (HIVE-8151)
Added In: Hive 0.13.0 with HIVE-6455
When enabled, dynamic partitioning column will be globally sorted. This way we can keep only one record writer open for each partition value in the reducer thereby reducing the memory pressure on reducers.
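
As a minimal sketch of how the setting might be applied, the statements below enable it for the current session before a dynamic-partition insert. The target table and the partition columns dayno and own_library_flag are inferred from the log paths in this article. The source table browser.t_document_src and its column layout are hypothetical placeholders, and the two hive.exec.dynamic.partition settings are assumed to be required in your environment.

```sql
-- Allow dynamic partitioning with every partition column resolved at run time.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Sort rows by the dynamic partition columns so that each reducer
-- keeps only one record writer open per partition value.
SET hive.optimize.sort.dynamic.partition=true;

-- Hypothetical insert: browser.t_document_src is an assumed source table
-- whose last two columns are dayno and own_library_flag.
INSERT OVERWRITE TABLE browser.t_xqh_temp_20190417_document
PARTITION (dayno, own_library_flag)
SELECT *
FROM browser.t_document_src;
```

With the setting enabled, Hive should add a shuffle on the partition columns, so a job that was previously map-only gains a reduce stage. That trades some extra shuffle cost for lower memory use per task.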

Container killed on request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal

2019-04-17 16:16:58,358 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS hdfs://alg-hdfs/warehouse/browser/browser.db/t_xqh_temp_20190417_document/.hive-staging_hive_2019-04-17_16-16-01_858_4125803238649808238-15659/_tmp.-ext-10002/dayno=20181009/own_library_flag=0/000000_0
2019-04-17 16:16:58,374 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: Final Path: FS hdfs://alg-hdfs/warehouse/browser/browser.db/t_xqh_temp_20190417_document/.hive-staging_hive_2019-04-17_16-16-01_858_4125803238649808238-15659/_tmp.-ext-10002/dayno=20180909/own_library_flag=0/000000_0
2019-04-17 16:16:58,374 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://alg-hdfs/warehouse/browser/browser.db/t_xqh_temp_20190417_document/.hive-staging_hive_2019-04-17_16-16-01_858_4125803238649808238-15659/_task_tmp.-ext-10002/dayno=20180909/own_library_flag=0/_tmp.000000_0
2019-04-17 16:16:58,374 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS hdfs://alg-hdfs/warehouse/browser/browser.db/t_xqh_temp_20190417_document/.hive-staging_hive_2019-04-17_16-16-01_858_4125803238649808238-15659/_tmp.-ext-10002/dayno=20180909/own_library_flag=0/000000_0
2019-04-17 16:16:58,388 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: Final Path: FS hdfs://alg-hdfs/warehouse/browser/browser.db/t_xqh_temp_20190417_document/.hive-staging_hive_2019-04-17_16-16-01_858_4125803238649808238-15659/_tmp.-ext-10002/dayno=20181001/own_library_flag=0/000000_0
2019-04-17 16:16:58,388 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://alg-hdfs/warehouse/browser/browser.db/t_xqh_temp_20190417_document/.hive-staging_hive_2019-04-17_16-16-01_858_4125803238649808238-15659/_task_tmp.-ext-10002/dayno=20181001/own_library_flag=0/_tmp.000000_0
2019-04-17 16:16:58,388 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS hdfs://alg-hdfs/warehouse/browser/browser.db/t_xqh_temp_20190417_document/.hive-staging_hive_2019-04-17_16-16-01_858_4125803238649808238-15659/_tmp.-ext-10002/dayno=20181001/own_library_flag=0/000000_0
2019-04-17 16:16:58,396 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: Final Path: FS hdfs://alg-hdfs/warehouse/browser/browser.db/t_xqh_temp_20190417_document/.hive-staging_hive_2019-04-17_16-16-01_858_4125803238649808238-15659/_tmp.-ext-10002/dayno=20180906/own_library_flag=0/000000_0
2019-04-17 16:16:58,396 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://alg-hdfs/warehouse/browser/browser.db/t_xqh_temp_20190417_document/.hive-staging_hive_2019-04-17_16-16-01_858_4125803238649808238-15659/_task_tmp.-ext-10002/dayno=20180906/own_library_flag=0/_tmp.000000_0
2019-04-17 16:16:58,396 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS hdfs://alg-hdfs/warehouse/browser/browser.db/t_xqh_temp_20190417_document/.hive-staging_hive_2019-04-17_16-16-01_858_4125803238649808238-15659/_tmp.-ext-10002/dayno=20180906/own_library_flag=0/000000_0
2019-04-17 16:16:58,405 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: Final Path: FS hdfs://alg-hdfs/warehouse/browser/browser.db/t_xqh_temp_20190417_document/.hive-staging_hive_2019-04-17_16-16-01_858_4125803238649808238-15659/_tmp.-ext-10002/dayno=20181127/own_library_flag=0/000000_0
2019-04-17 16:16:58,405 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://alg-hdfs/warehouse/browser/browser.db/t_xqh_temp_20190417_document/.hive-staging_hive_2019-04-17_16-16-01_858_4125803238649808238-15659/_task_tmp.-ext-10002/dayno=20181127/own_library_flag=0/_tmp.000000_0
2019-04-17 16:16:58,405 INFO [main] org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS hdfs://alg-hdfs/warehouse/browser/browser.db/t_xqh_temp_20190417_document/.hive-staging_hive_2019-04-17_16-16-01_858_4125803238649808238-15659/_tmp.-ext-10002/dayno=20181127/own_library_flag=0/000000_0
2019-04-17 16:16:58,436 FATAL [main] org.apache.hadoop.hive.ql.exec.mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"doc_id":"V_01KGpTaP","id":-1,"outid":"V_01KGpTaP","source":"yidian","gid":null,"title":"福克斯RS到底跑多快","style":null,"cover1":"","cover2":"","cover3":"","doctype":-1,"article_type":"视频","summary":"","author":"绝对的汽车控","publishtime":-1,"expiretime":-1,"content":null,"url":null,"imgcount":-1,"videoduration":-1,"originlevel":100,"computerlevel":-1,"artificiallevel":-1,"topcategory":"汽车","secondcategory":null,"origintags":"","computertags":"","artificialtags":"","secondresource":null,"origindata":"","status":-1,"mark":"","clickcount":-1,"upscount":-1,"commentcount":-1,"sharecount":-1,"threadcount":-1,"clickusercount":-1,"createtime":1550938435,"createtime_st":"20190224 00:13:55","updatetime":-1,"updatetime_st":"-1","openid":-1,"sourceen":"","base62id":"","docattr":"","docflag":"","similarid":"","author_name":"绝对的汽车控","author_level":-1,"author_type":-1,"own_library_flag":0,"import_time":null}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChildMapOpCtx.forward(MapOperator.java:157)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
… 9 more