1. Background
The previous post covered using celery + django + redis + supervisor to run scheduled scripts in Django. The get_version.py script logs in to each server one by one to collect service version information; with a large number of servers and a large amount of data, a full run takes over an hour. This post introduces multithreading to optimize that script.
2. Writing a test script
vim /opt/scripts/thread1.py
import threading
import time

list1 = []
threads = []

def worker(n):
    # Each worker simulates a slow task with a 10-second sleep,
    # then records its argument in the shared list.
    print("before")
    time.sleep(10)
    print("finished")
    list1.append(n)

n = 1
while n < 10000:
    t = threading.Thread(target=worker, args=(n,))
    t.start()
    threads.append(t)
    n += 1

# Wait for every thread to finish before printing the results.
for t in threads:
    t.join()
print(list1)
python3 /opt/scripts/thread1.py
The script finishes in roughly 15 seconds, a large saving compared with running the same tasks serially. Next, apply the same idea to get_version.py.
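To verify the timing yourself, a minimal sketch (hypothetical, not part of the original script) wraps the start/join loop with time.perf_counter:

import threading
import time

def worker(n):
    time.sleep(10)          # simulate a 10-second task

start = time.perf_counter()
threads = [threading.Thread(target=worker, args=(i,)) for i in range(1, 10000)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Elapsed time stays close to the single 10-second sleep, not 9999 x 10 seconds.
print(f"elapsed: {time.perf_counter() - start:.1f}s")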
3. Optimizing the get_version.py script
vim get_version.py
import paramiko
import requests
import datetime
import redis
import json
import os
import subprocess
from threading import Timer
import logging
import threading

logger = logging.getLogger('mtzlogger')

class GetVersion():
    def __init__(self):
        pass

    def par_ver(self, host0, app_name):
        # SSH to the host and grep the deployed version out of the deploy script.
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect(hostname=host0, port=2222, username='dc')
        stdin, stdout, stderr = client.exec_command(
            "awk '/jfrog/' /data/scripts/deploy_%s.sh | tail -1 | awk '{print $4}' | awk -F/ '{print $4}'" % (app_name))
        out = stdout.read().decode('utf-8')
        err = stderr.read().decode('utf-8')
        if out == '':
            out = '0'
        client.close()
        out = out.strip()
        return out

    def chaoshi(self, args, timeout):
        # Run an external command and kill it if it exceeds the timeout;
        # return True when the command produced any output on stdout.
        p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        timer = Timer(timeout, lambda process: process.kill(), [p])
        try:
            timer.start()
            stdout, stderr = p.communicate()
            return_code = p.returncode
            if stdout != b'':
                return True
            else:
                return False
        finally:
            timer.cancel()

    def worker(self, host0, app_name, objs, i, center, hosts_str, env):
        # Probe the host first; unreachable hosts are logged and skipped.
        result = self.chaoshi(['telnet', host0, '53742'], 2)
        if not result:
            os.system('echo %s >> /tmp/hosts_questions.txt' % (host0))
            return
        ver = self.par_ver(host0, app_name)
        now = datetime.datetime.now().strftime('%Y-%m-%d-%H:%M')
        objs.append([i, center, app_name, ver, hosts_str, now, env])

    def main(self):
        os.system('rm -f /tmp/hosts_questions.txt')
        all_keys = requests.get("http://10.0.0.10:10080/assets/inventory/--list/None/")
        all_objs = all_keys.json()
        i = 1
        objs = []
        threads = []
        for item in all_objs:
            if item == 'all' or item == '_meta':
                continue
            # Parse environment, center and application name out of the inventory group name.
            if 'ktz_data_apps' in item:
                env = item.split('_ktz_data_apps_')[0]
                center = 'ktz_data_apps'
                app_name = item.split('_ktz_data_apps_')[-1]
            elif 'ktz_m' in item:
                env = item.split('_ktz_m_')[0]
                center = 'mtz_m'
                app_name = item.split('_ktz_m_')[-1]
            else:
                env = item.split('_')[0]
                center = item.split('_')[1]
                app_name = item.split('_')[-1]
            hosts = all_objs[item]['hosts']
            hosts_str = ''
            for host in hosts:
                hosts_str += host + ','
            host0 = all_objs[item]['hosts'][0]
            # One thread per application group: collect versions in parallel.
            t = threading.Thread(target=self.worker, args=(host0, app_name, objs, i, center, hosts_str, env))
            t.start()
            threads.append(t)
            i += 1
        for t in threads:
            t.join()
        red = redis.Redis(host='localhost', port=6379, db=1)
        objs_json = json.dumps(objs)
        red.set('versions', objs_json)

if __name__ == '__main__':
    gv = GetVersion()
    gv.main()
1. With the optimization in place, the run time drops from over an hour to about 20 seconds.
2. Redis's set command overwrites the value when the key already exists and creates it when it does not, so there is no need to delete the key and recreate it.
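A quick sketch of that SET-overwrite behavior with redis-py (assuming a local Redis on the default port and db 1, as in the script):

import redis

red = redis.Redis(host='localhost', port=6379, db=1)
red.set('versions', '["old"]')   # creates the key if it does not exist
red.set('versions', '["new"]')   # the same call overwrites the existing value
print(red.get('versions'))       # b'["new"]'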
4. Summary
Whenever a task is time-consuming, try to process it in parallel; this saves a great deal of time and keeps the job from dragging on long enough for things to go wrong.
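One caveat: the script above starts one thread per inventory group with no upper bound, which can exhaust system resources on very large inventories. A possible alternative, not used in the original script, is concurrent.futures.ThreadPoolExecutor, which caps the number of concurrent workers; a minimal sketch with hypothetical hosts:

from concurrent.futures import ThreadPoolExecutor

def collect(host):
    # placeholder for the per-host work (port probe + SSH + version parsing)
    return host

hosts = ['10.0.0.11', '10.0.0.12']   # hypothetical host list
with ThreadPoolExecutor(max_workers=50) as pool:
    results = list(pool.map(collect, hosts))
print(results)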