Statistics

This post examines the specific cause of a Py4JJavaError that appears when a Python job is deployed to a Spark cluster even though it runs fine locally. It analyzes how an inconsistent pyspark version can trigger the problem and outlines the corresponding fix.


py4j.protocol.Py4JJavaError: An error occurred while calling o39.colStats.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/opt/modules/spark-2.2.0/python/lib/pyspark.zip/pyspark/worker.py", line 166, in main
    func, profiler, deserializer, serializer = read_command(pickleSer, infile)
  File "/opt/modules/spark-2.2.0/python/lib/pyspark.zip/pyspark/worker.py", line 55, in read_command
    command = serializer._read_with_length(file)
  File "/opt/modules/spark-2.2.0/python/lib/pyspark.zip/pyspark/serializers.py", line 169, in _read_with_length
    return self.loads(obj)
  File "/opt/modules/spark-2.2.0/python/lib/pyspark.zip/pyspark/serializers.py", line 451, in loads
    return pickle.loads(obj, encoding=encoding)
  File "/opt/modules/spark-2.2.0/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 784, in _make_skel_func
    closure = _reconstruct_closure(closures) if closures else None
  File "/opt/modules/spark-2.2.0/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 776, in _reconstruct_closure
    return tuple([_make_cell(v) for v in values])

TypeError: 'int' object is not iterable

--------------------------------------------

Problem description:

The code runs normally in a pyspark shell session opened locally from the terminal, but once written up as a Spark application and run on the cluster it fails with the error above.
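
The failing code itself is not shown, but the o39.colStats call in the traceback points at pyspark.mllib.stat.Statistics.colStats; a minimal sketch that exercises the same path (the app name and data below are made up for illustration):

from pyspark import SparkContext
from pyspark.mllib.linalg import Vectors
from pyspark.mllib.stat import Statistics

sc = SparkContext(appName="colStatsRepro")  # hypothetical app name
# Column statistics over a small two-column dataset; the colStats job is
# what fails on the cluster in the traceback above.
rdd = sc.parallelize([Vectors.dense([1.0, 2.0]), Vectors.dense([3.0, 4.0])])
summary = Statistics.colStats(rdd)
print(summary.mean(), summary.variance())
sc.stop()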

Solution:

Most likely, installing the pyspark package via pip pulled in the latest release by default. The pyspark shell launched from the terminal uses the copy bundled inside the cluster's Spark installation (spark-2.2.0 above), whereas the submitted application picks up the newer pip-installed pyspark. The driver then pickles task closures in a cloudpickle format that the 2.2.0 workers cannot read back, which surfaces as the TypeError: 'int' object is not iterable. Making the two versions match resolves the error.
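
A quick way to see which copy each side is actually using (the version numbers and paths here are only illustrative):

# Run this both in the terminal pyspark shell and at the top of the submitted
# application; if the two outputs differ, the driver and the cluster disagree.
import pyspark
print(pyspark.__version__)  # e.g. a newer pip release vs. the cluster's 2.2.0
print(pyspark.__file__)     # a site-packages path means pip; a $SPARK_HOME path means the bundled copy

If the application resolves to a newer site-packages copy, either pin the pip package to the cluster version (pip install pyspark==2.2.0) or export SPARK_HOME=/opt/modules/spark-2.2.0 and put $SPARK_HOME/python together with the py4j zip under $SPARK_HOME/python/lib on PYTHONPATH, so that the bundled pyspark is imported instead.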

