Setting Up a Spark Environment

Extract the Spark package into /opt/module/ and rename the extracted directory to spark.

Set the environment variables
vim /etc/profile

export SPARK_HOME=/opt/module/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$PYTHONPATH

Edit the configuration files
cp ./conf/spark-env.sh.template ./conf/spark-env.sh

Contents of spark-env.sh

export SPARK_DIST_CLASSPATH=$(/opt/module/hadoop-2.7.2/bin/hadoop classpath)
export SPARK_MASTER_IP=192.168.106.101
export HADOOP_CONF_DIR=/opt/module/hadoop-2.7.2/etc/hadoop
export PYSPARK_PYTHON=/usr/bin/python
export PYSPARK_DRIVER_PYTHON=/usr/bin/python
export JAVA_HOME=/opt/module/jdk1.8.0_144
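
A wrong path in spark-env.sh usually only shows up when Spark is started. As a minimal sketch, the paths listed above can be checked beforehand. The helper name report_missing_paths is hypothetical and not part of Spark.

```shell
# Hypothetical helper, not part of Spark: print every argument that is
# not an existing file or directory; return non-zero if any is missing.
report_missing_paths() {
    missing=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            printf 'missing: %s\n' "$p"
            missing=1
        fi
    done
    return "$missing"
}

# Usage with the paths from this guide (adjust to your installation):
# report_missing_paths /opt/module/hadoop-2.7.2/bin/hadoop \
#     /opt/module/hadoop-2.7.2/etc/hadoop /usr/bin/python /opt/module/jdk1.8.0_144
```

No output and a zero exit status mean that every path exists.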

Contents of slaves
cp ./conf/slaves.template ./conf/slaves

Add the IP addresses of the other hosts to this file, one per line.
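
The slaves file is plain text with one worker address per line. A minimal sketch of generating it from a list of hosts follows. The helper name write_slaves_file and the addresses in the usage comment are placeholders, not values from this guide.

```shell
# Hypothetical helper: write one worker address per line to a slaves file.
write_slaves_file() {
    target="$1"
    shift
    : > "$target"    # create the file, or empty it if it already exists
    for host in "$@"; do
        printf '%s\n' "$host" >> "$target"
    done
}

# Example (placeholder addresses, replace with your own worker hosts):
# write_slaves_file ./conf/slaves 192.168.106.102 192.168.106.103
```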

Testing Spark

cd /opt/module/spark
./bin/run-example SparkPi
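
SparkPi writes a lot of log output, and its result is a single line starting with "Pi is roughly". A minimal sketch for filtering out that line is shown here. The helper name extract_pi_line is hypothetical.

```shell
# Hypothetical helper: read Spark output on stdin and print only the
# SparkPi result line; exits non-zero if no such line is present.
extract_pi_line() {
    grep 'Pi is roughly'
}

# Usage (run from the Spark directory):
# ./bin/run-example SparkPi 2>&1 | extract_pi_line
```

If nothing is printed, the example did not finish, and the full log should be read instead.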

Common Errors

"ImportError: No module named 'py4j'" when using the PySpark API

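Make sure /etc/profile contains:
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$PYTHONPATH
Then go to $SPARK_HOME/python/lib and check that the py4j zip file there (py4j-0.10.4-src.zip for Spark 2.1.1) has the same name as the one in PYTHONPATH. A version mismatch between the two causes this error.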
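
Because the py4j version in the file name changes between Spark releases, one way to avoid the mismatch is to look the file up instead of hard-coding its name. This is a minimal sketch, and the helper name find_py4j_zip is hypothetical.

```shell
# Hypothetical helper: print the path of the py4j source zip that ships
# with the Spark installation at $1; return non-zero if none is found.
find_py4j_zip() {
    spark_home="$1"
    for zip in "$spark_home"/python/lib/py4j-*-src.zip; do
        # If the glob matches nothing it stays literal, so test for a real file.
        if [ -f "$zip" ]; then
            printf '%s\n' "$zip"
            return 0
        fi
    done
    return 1
}

# Usage in /etc/profile, after SPARK_HOME is set:
# export PYTHONPATH="$(find_py4j_zip "$SPARK_HOME"):$PYTHONPATH"
```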

spark-shell fails at startup with: Yarn application has already ended! It might have been killed or unable to launch application master

The ResourceManager page shows: (Current usage: 58.4 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.)

Add the following configuration to yarn-site.xml:

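<!-- The settings below were added to fix the error that occurs when spark-shell runs in YARN client mode; spark-submit probably has the same problem. Either one of the two settings is enough to fix it, and setting both does no harm. -->
<!-- Whether the virtual memory limit is enforced. If a container's actual virtual memory exceeds the limit, Spark running in client mode may fail with "Yarn application has already ended! It might have been killed or unable to l" -->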
<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
    <description>Whether virtual memory limits will be enforced for containers</description>
</property>
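<!-- Ratio of virtual memory to physical memory. The default is 2.1, and physical memory presumably defaults to 1 GB, so the virtual memory limit is 2.1 GB. -->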
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
    <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>
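
The numbers in the ResourceManager message follow from this ratio: the virtual memory limit of a container is its physical memory multiplied by yarn.nodemanager.vmem-pmem-ratio. A minimal sketch of that arithmetic is shown here. The helper name vmem_limit_mb is hypothetical, and 1024 MB is the 1 GB container from the message above.

```shell
# Hypothetical helper: virtual memory limit in MB for a container, computed
# as physical memory in MB ($1) times yarn.nodemanager.vmem-pmem-ratio ($2).
vmem_limit_mb() {
    awk -v pmem="$1" -v ratio="$2" 'BEGIN { printf "%.1f\n", pmem * ratio }'
}

vmem_limit_mb 1024 2.1   # default ratio: prints 2150.4
vmem_limit_mb 1024 4     # ratio set above: prints 4096.0
```

With the default ratio the limit is about 2.1 GB, so the 2.2 GB of virtual memory reported above gets the container killed. With a ratio of 4 the limit rises to 4 GB and the same container fits.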

