How do I set the driver's Python version in Spark?

Latest recommended article published on 2025-03-24 23:38:24

伙伴几时见

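This article describes how to make sure Python 3 is used correctly in a Spark environment. It covers setting environment variables, configuring the spark-env.sh file, and specifying the Python path by modifying .bashrc.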

You need to make sure the standalone project you're launching is launched with Python 3. If you are submitting your standalone program through spark-submit, it should work fine, but if you are launching it with python, make sure you use python3 to start your app.
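When the app is started directly with python3 rather than through spark-submit, one option is to set PYSPARK_PYTHON from inside the script so the workers use the same interpreter as the driver. The following is a minimal sketch, not an official recipe: `pin_worker_python` is a hypothetical helper name (it is not part of PySpark), and the sketch assumes the variable is set before any SparkContext is created.

```python
import os
import sys


def pin_worker_python(executable=None):
    """Point PYSPARK_PYTHON at an interpreter and return the value in effect.

    Hypothetical helper, not part of PySpark. It defaults to the interpreter
    running this script and leaves an already exported PYSPARK_PYTHON alone.
    Call it before creating the SparkContext.
    """
    chosen = executable or sys.executable
    # setdefault keeps any value already exported in the shell or spark-env.sh
    os.environ.setdefault("PYSPARK_PYTHON", chosen)
    return os.environ["PYSPARK_PYTHON"]


if __name__ == "__main__":
    print("workers will use:", pin_worker_python())
    # Assumption: pyspark is installed. Create the context only after the
    # variable is set, for example:
    # from pyspark import SparkContext
    # sc = SparkContext(appName="example")
```

This assumes local mode, or a cluster where the same interpreter path exists on every worker node. On a cluster with different layouts, exporting the variable in spark-env.sh is the safer route.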

Also make sure you have set your env variables in ./conf/spark-env.sh (if it doesn't exist, you can use spark-env.sh.template as a base).

@Kevin - I am having the same problem. Could you please post your solution, including what change you made in spark-env.sh? – Dev Patel Jun 22 '15 at 17:14

This is the right way of inducing PATH variables to Spark, instead of modifying .bashrc. – CᴴᴀZ Aug 3 at 12:31

Why is using python 3 required @Holden? – jerzy Aug 17 at 19:44

Spark can run in Python 2, but in this case the user was trying to specify Python 3 in their question. Whichever Python version you choose, it needs to be used consistently for both the driver and the workers. – Holden Aug 26 at 9:45
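The consistency point above can be checked before launching a job. This is a rough sketch under stated assumptions: `python_version_of` and `versions_match` are hypothetical helper names, both environment variables are assumed to name plain Python interpreters that accept `-c`, and an unset variable falls back to the interpreter running the check.

```python
import os
import subprocess
import sys

# One-liner run inside each interpreter to report its "major.minor" version.
_PROBE = "import sys; print('%d.%d' % sys.version_info[:2])"


def python_version_of(executable):
    """Return the 'major.minor' version string reported by an interpreter."""
    out = subprocess.check_output([executable, "-c", _PROBE])
    return out.decode().strip()


def versions_match(driver, worker):
    """True when both interpreters report the same major.minor version."""
    return python_version_of(driver) == python_version_of(worker)


if __name__ == "__main__":
    # Assumption: unset variables fall back to the current interpreter.
    driver = os.environ.get("PYSPARK_DRIVER_PYTHON", sys.executable)
    worker = os.environ.get("PYSPARK_PYTHON", sys.executable)
    if versions_match(driver, worker):
        print("driver and worker Python versions match")
    else:
        print("mismatch:", python_version_of(driver), "vs", python_version_of(worker))
```

The check only covers the machine it runs on. On a cluster, the worker interpreter would need to be checked on the worker nodes as well.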

up vote 14 down vote

Setting both PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON to python3 works for me. I did this using export in my .bashrc. In the end, these are the variables I created:

export SPARK_HOME="$HOME/Downloads/spark-1.4.0-bin-hadoop2.4"
export IPYTHON=1
export PYSPARK_PYTHON=/usr/bin/python3
export PYSPARK_DRIVER_PYTHON=ipython3
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"