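A detailed record and analysis of an Impala query error

This post records a MemoryLimitExceeded error on an Impala cluster. It gives the error message and the query that triggered it, then describes how the problem was tracked down and fixed.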


Preface

A record of one failure on an Impala cluster, with the fix and the reasoning that led to it.

Error record

Error message

Memory limit exceeded Cannot perform hash aggregation. Partitioned input data too many times. This could mean there is too much skew in the data or the memory limit is set too low.

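Query information

The query is as long as shown below: it joins more than ten tables and pulls a wide range of columns. It is only one of many similar SQL statements.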

CREATE TABLE test.cp_ag_info AS
SELECT a1.id cid, hr_num, position_num, available_po_num, rs_num, auto_filter_num, read_num, see_num, manual_refuse_num, it_num, auto_refuse_num, forward_num, get_rs_po_num, get_read_rs_po_num, get_see_rs_po_num, get_it_rs_po_num
FROM mysql.cp a1
LEFT JOIN (
SELECT cid, COUNT(DISTINCT uid) hr_num
FROM (
SELECT id uid, testid cid
FROM mysql.dante_user

......

UNION
SELECT a1.user_id uid, a2.dante_cp_id cid
FROM mds.t_cp_user a1
LEFT JOIN mds.t_cp a2
ON a1.cp_id=a2.id
WHERE a1.is_del='false' AND a2.is_del='false') f
GROUP BY cid) a6
ON CAST(a1.id AS STRING)= a6.cid
LEFT JOIN (
SELECT testid cid, COUNT(1) position_num, COUNT(CASE WHEN isenable!=0 AND isexpired!=1 

......

COUNT(CASE WHEN a1.DELIVER_AUTO_FILTER=1 THEN a1.orderid END) auto_filter_num,
COUNT(CASE WHEN a1.READ_rs=1 THEN a1.orderid END) read_num,
COUNT(CASE WHEN a1.READ_CONTACT=1 THEN a1.orderid END) see_num,
COUNT(CASE WHEN a1.MANUAL_REFUSE=1 THEN a1.orderid END) manual_refuse_num,
COUNT(CASE WHEN a1.ONLINE_it=1 OR a1.OFFLINE_it=1 THEN a1.orderid END) it_num,
COUNT(CASE WHEN a1.AUTO_REFUSE=1 THEN orderid END) auto_refuse_num,
COUNT(CASE WHEN a1.AUTO_FORWARD=1 OR a1.MANUAL_FORWARD=1 THEN orderid END) forward_num
FROM test.ur a1
GROUP BY a1.testid) a8
ON a1.id=a8.cid
LEFT JOIN (
SELECT a1.testid cid,

......

a1.READ_rs=1 THEN a1.positionid END) get_read_rs_po_num
FROM test.ur a1
GROUP BY testid) a10
ON a1.id=a10.cid
LEFT JOIN (
SELECT a1.testid cid,
COUNT(DISTINCT CASE WHEN a1.READ_CONTACT=1 THEN a1.positionid END) get_see_rs_po_num
FROM test.ur a1
GROUP BY testid) a11
ON a1.id=a11.cid
LEFT JOIN (
SELECT a1.testid cid,
COUNT(DISTINCT CASE WHEN a1.ONLINE_it=1 OR a1.OFFLINE_it=1 THEN a1.positionid END) 
......

ON a1.id=a12.cid

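Symptoms and fix

The cause of this error turned out to be rather odd. My best guess is that it was triggered when I added the Llama resource manager to the cluster today: enabling Llama and then disabling it changes the memory limit of impalad. The limit used to be 32G and ended up at 8G. I did not know it had been changed, and it took me a long time to notice.

After testing Llama in production today, I turned it off because it did not seem like a good fit. Right after that, all kinds of Memory Limit errors started to appear, and large queries that had always run without errors began failing on the cluster. Once I had located the cause, restoring the memory limit was all it took.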
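The post does not show how the limit was checked or restored. As a rough sketch only: Impala has a daemon-level limit (the `mem_limit` startup flag of impalad, which is the one described above as dropping from 32G to 8G) and a per-query `MEM_LIMIT` query option. The statements below, run in impala-shell, only inspect and set the per-query option for the current session. The `32g` value is an illustrative assumption taken from the figure quoted above, and a per-query setting cannot make up for a daemon limit that is too low.

```sql
-- Run inside impala-shell.
-- List the query options in effect for this session, including MEM_LIMIT.
SET;

-- Assumption for illustration: set the per-query memory limit for this
-- session. This does not change the impalad daemon's own mem_limit.
SET MEM_LIMIT=32g;

-- Then rerun the failing CREATE TABLE ... AS SELECT statement.
```

If the daemon-level limit is the one that changed, as it was here, it has to be corrected in the impalad startup configuration and the daemons restarted; the session-level option is mainly useful for ruling out a per-query cap.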

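After this problem was fixed, a different error showed up once. It happened only that one time and I do not know what caused it. Since it never recurred, I did not look into it further.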

Memory limit exceeded The memory limit is set too low initialize the spilling operator. The minimum required memory to spill this operator is 528.00 MB.

2016-04-07 19:53:00

Reposted from: https://www.cnblogs.com/dantezhao/p/5365118.html
