1. When a Spark job is submitted to YARN it runs fine in client mode, but in cluster mode it stays stuck in the ACCEPTED state.
Background: this test environment has very limited resources, and the job sat in ACCEPTED after submission, so the job's own resource settings were also kept small.
submit command:
spark-submit \
--class a.kafka_streaming.KafkaConsumer \
--master yarn \
--deploy-mode cluster \
--driver-memory 1G \
--num-executors 1 \
--executor-cores 1 \
--executor-memory 1G \
--jars spark-streaming-kafka_2.10-1.6.2.jar,kafka_2.10-0.8.2.1.jar,metrics-core-2.2.0.jar \
my_streaming.jar
2. The submission client keeps logging:
18/03/13 09:51:57 INFO Client: Application report for application_1520510149375_0015 (state: ACCEPTED)
18/03/13 09:51:58 INFO Client: Application report for application_1520510149375_0015 (state: ACCEPTED)
18/03/13 09:51:59 INFO Client: Application report for application_1520510149375_0015 (state: ACCEPTED)
18/03/13 09:52:00 INFO Client: Application report for application_1520510149375_0015 (state: ACCEPTED)
18/03/13 09:52:01 INFO Client: Application report for application_1520510149375_0015 (state: ACCEPTED)
18/03/13 09:52:02 INFO Client: Application report for application_1520510149375_0015 (state: ACCEPTED)
3. Cause: the cluster has too few resources. In cluster mode the driver runs inside the YARN ApplicationMaster, so the AM container must be big enough for --driver-memory plus overhead; in client mode the AM is only a small stub, which is why client mode still worked.
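Before changing any configuration, it helps to confirm what the ResourceManager actually has free. A sketch of the checks (the queue name `default` is an assumption; substitute your own queue):

```shell
# List live NodeManagers with their total memory and vcores
yarn node -list

# Show the target queue's configured capacity and current usage
# (queue name "default" is an assumption)
yarn queue -status default

# Read the pending application's diagnostics, which usually state
# why the scheduler cannot place its ApplicationMaster
yarn application -status application_1520510149375_0015
```

If the diagnostics mention the AM resource limit, the capacity-scheduler change in step 2) below is the relevant one.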
4. Measures tried (in the end the problem was still not solved):
1) Lower the following setting, which had been 1 GB (the value is in MB):
yarn.scheduler.minimum-allocation-mb: 256
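For reference, this property lives in yarn-site.xml; a minimal fragment with the value from the text might look like:

```xml
<!-- yarn-site.xml: smallest container the scheduler will allocate.
     Every request is rounded up to a multiple of this value, so a
     large minimum wastes memory on a small cluster. -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>256</value>
</property>
```

The ResourceManager must be restarted for this change to take effect.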
2) Edit capacity-scheduler.xml: change yarn.scheduler.capacity.maximum-am-resource-percent from 0.1 to 0.5.
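A sketch of the corresponding fragment in capacity-scheduler.xml:

```xml
<!-- capacity-scheduler.xml: fraction of cluster resources that may be
     used by ApplicationMasters. At the default 0.1, a very small
     cluster may be unable to fit even one AM container, leaving every
     application stuck in ACCEPTED. -->
<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>0.5</value>
</property>
```

After editing, `yarn rmadmin -refreshQueues` reloads the scheduler configuration without a full restart.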
3) Set the driver and executor memory to an appropriate size (neither too large nor too small).
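As a rough sizing sketch (the 512M figures are illustrative, not from the text): in Spark 1.6 on YARN, each container is sized as the requested heap plus spark.yarn.driver.memoryOverhead / spark.yarn.executor.memoryOverhead (default max(384 MB, 10% of the heap)), rounded up to a multiple of yarn.scheduler.minimum-allocation-mb. With 1 GB heaps this job therefore needs roughly two ~1.4 GB containers (AM/driver plus one executor), which must fit in the queue at the same time:

```shell
# Halving the heaps shrinks both containers; everything else as in the
# original command (jars omitted here for brevity)
spark-submit \
  --class a.kafka_streaming.KafkaConsumer \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 512M \
  --executor-memory 512M \
  --num-executors 1 \
  --executor-cores 1 \
  my_streaming.jar
```

Going too small risks OutOfMemoryError in the streaming job, which is why the text says "neither too large nor too small".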