Prediction(5) Cluster Troubleshooting

I ran into some issues running Zeppelin 0.6.0 locally against Spark 1.5.0 and Hadoop 2.7.1.
It looked like a memory issue, but configuring Zeppelin (conf/zeppelin-env.sh) as follows did not help by itself:
export MASTER="yarn-client"
export HADOOP_CONF_DIR="/opt/hadoop/etc/hadoop/"

export SPARK_HOME="/opt/spark"
. ${SPARK_HOME}/conf/spark-env.sh
export ZEPPELIN_CLASSPATH="${SPARK_CLASSPATH}"
export ZEPPELIN_JAVA_OPTS="-Dspark.yarn.driver.memoryOverhead=512 \
  -Dspark.yarn.executor.memoryOverhead=512 \
  -Dspark.akka.frameSize=100 \
  -Dspark.executor.instances=2 \
  -Dspark.driver.memory=3g \
  -Dspark.storage.memoryFraction=0.7 \
  -Dspark.core.connection.ack.wait.timeout=800 \
  -Dspark.rdd.compress=true \
  -Dspark.default.parallelism=18 \
  -Dspark.executor.memory=3g"
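
These -D flags are plain Spark configuration properties passed as JVM system properties. As a minimal sketch, the same tuning could live in ${SPARK_HOME}/conf/spark-defaults.conf instead (assuming Zeppelin launches the interpreter through spark-submit so that file is honored):

spark.master                        yarn-client
spark.driver.memory                 3g
spark.executor.memory               3g
spark.executor.instances           2
spark.yarn.driver.memoryOverhead    512
spark.yarn.executor.memoryOverhead  512
spark.default.parallelism           18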

And Spark (conf/spark-env.sh) as follows:
export SPARK_DAEMON_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70"
export HADOOP_CONF_DIR="/opt/hadoop/etc/hadoop"
#export SPARK_WORKER_MEMORY=1024m
#export SPARK_JAVA_OPTS="-Dbuild.env=lmm.sparkvm"
export USER=carl
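
When a run still fails, the container logs from YARN are the quickest way to confirm whether memory is really the problem (assuming log aggregation is enabled; the application id below is a placeholder):

> yarn logs -applicationId application_1234567890123_0001 | grep -i -E "OutOfMemory|beyond.*memory limits|Killing container"

Containers killed by YARN for exceeding their limit log a line like "is running beyond virtual memory limits", which points back at the memoryOverhead settings above.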

Install phantomjs
> sudo apt-get install phantomjs
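
As far as I can tell, phantomjs is needed by the zeppelin-web module during the Maven build. A quick sanity check after installing:

> phantomjs --version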

Build Zeppelin Again
> mvn clean package -Pspark-1.5 -Dspark.version=1.5.0 -Dhadoop.version=2.7.1 -Phadoop-2.6 -Pyarn -DskipTests
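
Once the build succeeds, Zeppelin can be started and checked with the daemon script shipped in the distribution:

> bin/zeppelin-daemon.sh start
> bin/zeppelin-daemon.sh status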

I set up ubuntu-pilot to run Zeppelin and Spark; ubuntu-master, ubuntu-dev1, and ubuntu-dev2 form the YARN cluster. Everything works fine now. To verify the cluster, see the check below.
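
To confirm that the NodeManagers registered with the ResourceManager, run this from any node with the Hadoop client configured:

> yarn node -list

It should list the dev nodes in RUNNING state (assuming ubuntu-master only runs the ResourceManager).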
