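Note: when starting processes in batches, check that the `if` condition guarding the batch can actually become true. If it never does, p.start() is never executed. Stepping through the loop with PyCharm's debugger makes this easy to spot.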
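from multiprocessing.pool import ThreadPool

def run_url(arg):
    return arg + 'vv'

class TempMul(object):
    @staticmethod
    def run():
        pool = ThreadPool(20)
        return pool.map(run_url, ['1', '2', '3', '4'])

if __name__ == '__main__':
    # TestUtils is a helper defined elsewhere in the project (not shown here).
    TestUtils.my_print(TempMul.run())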
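# The same call made at module level instead of from a static method.
from multiprocessing.pool import ThreadPool

def run_url(arg):
    return arg + 'vv'

if __name__ == '__main__':
    pool = ThreadPool(20)
    re = pool.map(run_url, ['1', '2', '3', '4'])
    TestUtils.my_print(re)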
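Neither snippet above closes its pool. A minimal sketch of the same call with the pool used as a context manager, assuming Python 3 (the `run_all` wrapper and the `workers` parameter are illustrative names, not part of the original code):

```python
from multiprocessing.pool import ThreadPool


def run_url(arg):
    return arg + 'vv'


def run_all(args, workers=4):
    # The with-block shuts the pool down once map() has returned.
    with ThreadPool(workers) as pool:
        # map() returns results in the same order as the inputs.
        return pool.map(run_url, args)


print(run_all(['1', '2', '3', '4']))  # ['1vv', '2vv', '3vv', '4vv']
```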
Incorrect:
def test_save_features_to_db(self):
    df1 = pd.read_csv('/home/sc/PycharmProjects/risk-model/xg_test/statis_data/shixin_company.csv')
    com_list = df1['company_name'].values.tolist()
    p_list = []
    i = 1
    # BUG: p_size is the length of the full list, but the loop only sees com_list[1:4].
    p_size = len(com_list)
    for company_name in com_list[1:4]:
        p = Process(target=self.__save_data_iter_method, args=[company_name])
        p_list.append(p)
        # Start the batch every 20 items, or on the last item.
        if i % 20 == 0 or i == p_size:
            for p in p_list:
                p.start()
            for p in p_list:
                p.join()
            p_list = []
        i += 1
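Cause: p_size = len(com_list) is taken from the full list, but the loop runs over com_list[1:4], which has at most three items. So i never reaches p_size and i == p_size never holds. Since i never reaches 20 either, the whole condition stays false and p.start() is never called.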
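The counter logic can be checked in isolation, without starting any processes. The sketch below only replays the loop counter and records when the batch would have been started; `flush_points` and the list size of 1000 are made up for illustration.

```python
def flush_points(total_items, looped_items, batch_size=20):
    """Return the counter values at which a batch would be started."""
    p_size = total_items  # length the condition compares against
    points = []
    i = 1
    for _ in range(looped_items):  # number of items the loop really visits
        if i % batch_size == 0 or i == p_size:
            points.append(i)
        i += 1
    return points


# Length from the full list (say 1000), loop over a 3-item slice: never starts.
print(flush_points(total_items=1000, looped_items=3))  # []
# Length and loop taken from the same 3-item list: starts on the last item.
print(flush_points(total_items=3, looped_items=3))     # [3]
```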
Corrected:
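# Module-level imports needed by the method below.
import os
import pandas as pd
from multiprocessing import Process

def test_add_save_features_to_db(self):
    df1 = pd.read_csv('/home/sc/PycharmProjects/risk-model/xg_test/statis_data/shixin_company.csv')
    com_list = df1['company_name'].values.tolist()
    p_list = []
    i = 1
    # Slice first, then measure, so p_size matches what the loop iterates over.
    com_list = com_list[1:4]
    p_size = len(com_list)
    for company_name in com_list:
        p = Process(target=self.__add_save_data_iter_method, args=[company_name])
        # This runs in the parent, so os.getpid() is the parent's pid.
        print(company_name, os.getpid())
        p_list.append(p)
        # Start the batch every 20 items, or on the last item.
        if i % 20 == 0 or i == p_size:
            for p in p_list:
                p.start()
            for p in p_list:
                p.join()
            p_list = []
        i += 1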
The key change is to slice first and take the length afterwards:
    com_list = com_list[1:4]
    p_size = len(com_list)
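One way to sidestep the counter condition altogether is to cut the list into fixed-size batches up front. This is only a sketch of that alternative, not the approach used in these notes; `chunks` and `run_in_batches` are hypothetical helper names.

```python
from multiprocessing import Process


def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(target, items, batch_size=20):
    """Start one process per item, one batch at a time."""
    for batch in chunks(items, batch_size):
        procs = [Process(target=target, args=(item,)) for item in batch]
        for p in procs:
            p.start()
        for p in procs:
            p.join()  # wait for the whole batch before starting the next


print([len(batch) for batch in chunks(list(range(45)), 20)])  # [20, 20, 5]
```

The last batch is simply shorter, so no special "last item" check is needed. `run_in_batches` should be called from under an `if __name__ == '__main__':` guard, with `target` being a function that the child processes can import.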
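**Open questions:
Multiprocessing: why does a callback run fine when called on its own, but not once it is handed to multiprocessing?
How do you tell which process is the parent and which is the child?
With pool = ThreadPool(20) and re = pool.map(run_url, ['1', '2', '3', '4']), it seemed that the callback cannot be a method of a class, and that it cannot call methods of other classes either. So even after moving the function out of the class, it still failed if it called another class's method. A likely explanation is pickling: a process-based Pool has to pickle the callable, and Python 2 cannot pickle bound methods, whereas a thread pool does no pickling.
By contrast, p = Process(target=self.__add_save_data_iter_method, args=[company_name]) can target a method of its own class, and that method can in turn call methods of other classes.**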
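On the parent/child question, a minimal sketch assuming Python 3.8 or later, where `multiprocessing.parent_process()` is available: it returns `None` in the main process and a process object in a child. `describe_process`, `worker`, and the company name are illustrative only.

```python
import multiprocessing
import os


def describe_process():
    """Return a small dict identifying the process this call runs in."""
    current = multiprocessing.current_process()
    # parent_process() is None only in the main (parent) process.
    parent = multiprocessing.parent_process()
    return {
        "name": current.name,  # 'MainProcess' in the parent
        "pid": os.getpid(),
        "is_child": parent is not None,
    }


def worker(company_name):
    # Hypothetical stand-in for the real per-company work.
    print(company_name, describe_process())


if __name__ == "__main__":
    print("parent", describe_process())
    p = multiprocessing.Process(target=worker, args=("company-a",))
    p.start()
    p.join()
```

Printing `os.getpid()` next to `Process(...)` in the loop, as in the corrected version, always shows the parent's pid, because that line runs before the child is started. To see the child's pid, the call has to happen inside the target function.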