Python multiprocessing: PicklingError: Can't pickle

License: CC 4.0 BY-SA


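When a PicklingError shows up in Python's multiprocessing, it is usually because the function or the arguments passed to the process cannot be serialized (pickled). Picklable types include None, booleans, numbers, and strings. In the Python 2 case covered here, a method of a class cannot be pickled directly, so the function has to be defined at the top level of a module. The fix is to make sure every function passed to a process is defined at module top level and every argument is picklable.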


http://bbs.chinaunix.net/thread-4111379-1-1.html

import multiprocessing

class someClass(object):
    def __init__(self):
        pass

    def f(self, x):
        return x*x

    def go(self):
        pool = multiprocessing.Pool(processes=4)
        # result = pool.apply_async(self.f, [10])
        # print(result.get(timeout=1))
        # self.f is a bound method; Python 2 cannot pickle it for the workers
        print(pool.map(self.f, range(10)))

Note: this code may fail with the following error (the message shown is from Python 2):
PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

Data serialization:
https://zhidao.baidu.com/question/525917268.html
https://www.liaoxuefeng.com/wiki/001374738125095c955c1e6d8bb493182103fac9270762a000/00138683221577998e407bb309542d9b6a68d9276bc3dbe000
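Roughly speaking, serialization takes a data structure, such as a binary tree, and turns it into a char array or a string so that it is easy to write to a file or send over the network. Restoring it is "deserialization": the char array or string read from the file or received from the network is turned back into the binary tree, or whatever it was. The main purpose is convenient storage.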
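
As a rough illustration of that round trip, the sketch below runs a small binary tree through Python's pickle module. The tree layout (nested dicts with value, left and right keys) is an assumption made for this example and does not come from the linked pages.

```python
import pickle

# A tiny binary tree built from plain dicts (layout assumed for illustration).
tree = {
    "value": 1,
    "left": {"value": 2, "left": None, "right": None},
    "right": {"value": 3, "left": None, "right": None},
}

# Serialize: the tree becomes a byte string that can be written to a file
# or sent over a socket.
data = pickle.dumps(tree)

# Deserialize: the byte string is turned back into an equal but separate tree.
restored = pickle.loads(data)

print(restored == tree)   # same content
print(restored is tree)   # but not the same object
```

multiprocessing does this same dumps/loads step behind the scenes for every function and argument it hands to a worker process, which is why anything that cannot be pickled triggers the error discussed here.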

Types that can be pickled:
https://zhidao.baidu.com/question/619353578954760372.html
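* None, True, and False;
* integers, floating-point numbers, complex numbers;
* strings, bytes, bytearrays;
* tuples, lists, sets, and dictionaries that contain only picklable objects;
* functions defined at the top level of a module;
* built-in functions defined at the top level of a module;
* classes defined at the top level of a module;
* instances of custom classes whose __dict__ is picklable, or that define __getstate__() / __setstate__().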
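
The list above can be checked by simply trying to pickle a few objects. The sketch below is one way to do that; is_picklable and make_local_function are hypothetical helper names introduced for this example, and json.dumps is used only as a convenient function that is known to live at the top level of a module.

```python
import json
import pickle


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
    except Exception:  # PicklingError, TypeError, AttributeError, ...
        return False
    return True


def make_local_function():
    # A function defined inside another function is not at module top level.
    def local(x):
        return x * x
    return local


checks = {
    "None": is_picklable(None),
    "nested containers": is_picklable({"a": [1, 2.5, 3 + 4j], "b": {b"x", "y"}}),
    "built-in function len": is_picklable(len),
    "top-level function json.dumps": is_picklable(json.dumps),
    "lambda": is_picklable(lambda x: x * x),
    "local function": is_picklable(make_local_function()),
}

for name, ok in checks.items():
    print(name, "->", ok)
```

Functions are pickled by reference (module name plus function name), not by value, so a lambda or a nested function has no importable name the worker process could look up.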

https://stackoverflow.com/questions/8804830/python-multiprocessing-pickling-error

The problem is that the pool methods all use a queue.Queue to pass tasks to the worker processes. Everything that goes through the queue.Queue must be picklable, and foo.work is not picklable since it is not defined at the top level of the module.
It can be fixed by defining a function at the top level.

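In short: a method defined in a class cannot be pickled (on Python 2), while the function and arguments given to a process must be pickled, hence the error.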

Solution:

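import multiprocessing

# Defined at module top level, so it can be pickled by reference
def func(x):
    return x*x

class someClass(object):
    def __init__(self, func):
        self.f = func

    def go(self):
        pool = multiprocessing.Pool(processes=4)
        # result = pool.apply_async(self.f, [10])
        # print(result.get(timeout=1))
        print(pool.map(self.f, range(10)))
        pool.close()
        pool.join()

# The guard keeps worker processes from re-running this part on import
if __name__ == '__main__':
    a = someClass(func)
    a.go()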

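In other words, "defining a function at the top level": the function that the worker processes need to call becomes a function defined at the top level of the module.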

Another problem that came up while fixing this:
1. con = multiprocessing.Process(target=self.connect, args=(k, metrics, request, return_dict, ))
This raised the same PicklingError: Can't pickle error.
The cause:

type(metrics)
<type 'dict'>

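Although metrics itself is a dict, its values look like this:
'vmware_vm_net_transmitted_average': <prometheus_client.core.GaugeMetricFamily object at 0x7161650>
and passing it still raises PicklingError: Can't pickle.

The likely cause is that the <prometheus_client.core.GaugeMetricFamily object at 0x7161650> values inside the dict cannot be pickled.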
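
One way to confirm a suspicion like this is to try pickling each value of the dict separately and see which keys fail. The sketch below does that; find_unpicklable_keys is a hypothetical helper written for this example, and a threading.Lock stands in for the prometheus_client object so the snippet runs without that library. Whether GaugeMetricFamily really fails to pickle in a given setup is exactly what such a check would show, so it is not asserted here.

```python
import pickle
import threading


def find_unpicklable_keys(mapping):
    """Return the keys of `mapping` whose values fail pickle.dumps()."""
    bad_keys = []
    for key, value in mapping.items():
        try:
            pickle.dumps(value)
        except Exception:  # PicklingError, TypeError, AttributeError, ...
            bad_keys.append(key)
    return bad_keys


# Stand-in data: the lock plays the role of an object that cannot be pickled.
sample = {
    "name": "vm-01",
    "samples": [1.0, 2.5],
    "lock": threading.Lock(),
}

print(find_unpicklable_keys(sample))
```

Running the same helper on metrics before creating the Process would list the offending keys. If any show up, a common way out is to pass only plain data (names, numbers, strings) to the child process and build the library objects inside it.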