Python Multithreaded Programming

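This article introduces two ways to create threads in Python: the start_new_thread function of the thread module, and instances of the Thread class from the threading module. It then uses examples to show how a mutex solves data synchronization problems between threads.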

Today I will summarize multithreaded programming in Python.

Ways to create threads

start new thread

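Calling thread.start_new_thread is the simplest way to create a thread. (The listings in this article are Python 2; in Python 3 the thread module was renamed _thread.)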

#!/usr/bin/python

import thread
import time

# Define a function for the thread
def print_time(threadName, delay):
   count = 0
   while count < 5:
      time.sleep(delay)
      count += 1
      print "%s: %s"%(threadName, time.ctime(time.time()))

# Create two threads as follows
try:
   thread.start_new_thread(print_time, ("Thread-1", 2,))
   thread.start_new_thread(print_time, ("Thread-2", 4,))
except:
   print "Error: unable to start thread"

while 1:
   pass

Output:

Thread-1: Wed Sep 27 19:52:57 2017
Thread-2: Wed Sep 27 19:52:59 2017
Thread-1: Wed Sep 27 19:52:59 2017
Thread-1: Wed Sep 27 19:53:01 2017
Thread-1: Wed Sep 27 19:53:03 2017
Thread-2: Wed Sep 27 19:53:03 2017
Thread-1: Wed Sep 27 19:53:05 2017
Thread-2: Wed Sep 27 19:53:07 2017
Thread-2: Wed Sep 27 19:53:11 2017
Thread-2: Wed Sep 27 19:53:15 2017

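Without the infinite while loop at the end of the program, the main thread (the one outside Thread-1 and Thread-2) would finish immediately and the program would exit, so nothing would be printed.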
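The busy loop keeps the main thread alive but burns CPU and never exits. A minimal Python 3 sketch of one alternative is shown below: each worker releases a lock when it finishes, and the main thread blocks on those locks. The helper name run_and_wait, the log list, and the short delays are assumptions made for this illustration and are not part of the original listing.

```python
import _thread  # Python 3 name of the Python 2 `thread` module
import time


def print_time(thread_name, delay, count, done_lock, log):
    """Record `count` timestamps, then release `done_lock` to signal completion."""
    try:
        for _ in range(count):
            time.sleep(delay)
            log.append((thread_name, time.ctime()))
    finally:
        done_lock.release()


def run_and_wait(specs, count=3):
    """Start one low-level thread per (name, delay) pair and wait for all of them."""
    log = []
    locks = []
    for name, delay in specs:
        lock = _thread.allocate_lock()
        lock.acquire()  # held until the worker releases it
        locks.append(lock)
        _thread.start_new_thread(print_time, (name, delay, count, lock, log))
    for lock in locks:
        lock.acquire()  # blocks until the matching worker has finished
    return log


log = run_and_wait([("Thread-1", 0.01), ("Thread-2", 0.02)])
for name, stamp in log:
    print("%s: %s" % (name, stamp))
```

Unlike the while loop, this version returns as soon as both workers are done.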

threading module

#!/usr/bin/python

import threading
import time

exitFlag = 0

class myThread(threading.Thread):

   def __init__(self, threadID, name, delay):
      threading.Thread.__init__(self)
      self.threadID = threadID
      self.name = name
      self.delay = delay

   def run(self):
      print "Starting " + self.name
      print_time(self.name, 5, self.delay)
      print "Exiting " + self.name

def print_time(threadName, counter, delay):
   while counter:
      if exitFlag:
         # threadName is a str and has no exit() method; returning ends the thread
         return
      time.sleep(delay)
      print "%s: %s" % (threadName, time.ctime(time.time()))
      counter -= 1

# Create new threads
thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)

# Start new Threads
thread1.start()
thread2.start()

print "Exiting Main Thread"

Output:

Starting Thread-1
Starting Thread-2
Exiting Main Thread
Thread-1: Wed Sep 27 20:00:16 2017
Thread-2: Wed Sep 27 20:00:17 2017
Thread-1: Wed Sep 27 20:00:17 2017
Thread-1: Wed Sep 27 20:00:18 2017
Thread-2: Wed Sep 27 20:00:19 2017
Thread-1: Wed Sep 27 20:00:19 2017
Thread-1: Wed Sep 27 20:00:20 2017
Exiting Thread-1
Thread-2: Wed Sep 27 20:00:21 2017
Thread-2: Wed Sep 27 20:00:23 2017
Thread-2: Wed Sep 27 20:00:25 2017
Exiting Thread-2

You can also collect the threads in a list and make the main thread wait until they have all finished before it exits:

#!/usr/bin/python

import threading
import time

exitFlag = 0

class myThread(threading.Thread):

   def __init__(self, threadID, name, delay):
      threading.Thread.__init__(self)
      self.threadID = threadID
      self.name = name
      self.delay = delay

   def run(self):
      print "Starting " + self.name
      print_time(self.name, 5, self.delay)
      print "Exiting " + self.name

def print_time(threadName, counter, delay):
   while counter:
      if exitFlag:
         # threadName is a str and has no exit() method; returning ends the thread
         return
      time.sleep(delay)
      print "%s: %s" % (threadName, time.ctime(time.time()))
      counter -= 1

# Create new threads
thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)

# Start new Threads
thread1.start()
thread2.start()

# Add threads to thread list
threads = []
threads.append(thread1)
threads.append(thread2)

# Wait for all threads to complete
for t in threads:
    t.join()

print "Exiting Main Thread"

Output:

Starting Thread-1
Starting Thread-2
Thread-1: Wed Sep 27 20:04:21 2017
Thread-1: Wed Sep 27 20:04:22 2017
Thread-2: Wed Sep 27 20:04:22 2017
Thread-1: Wed Sep 27 20:04:23 2017
Thread-2: Wed Sep 27 20:04:24 2017
Thread-1: Wed Sep 27 20:04:24 2017
Thread-1: Wed Sep 27 20:04:25 2017
Exiting Thread-1
Thread-2: Wed Sep 27 20:04:26 2017
Thread-2: Wed Sep 27 20:04:28 2017
Thread-2: Wed Sep 27 20:04:30 2017
Exiting Thread-2
Exiting Main Thread
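The two runs above differ only in where "Exiting Main Thread" appears. The Python 3 sketch below reproduces that difference in one script by recording events in a list instead of printing them. The run function, the wait_for_workers flag, and the 0.1 second delays are assumptions made for this illustration.

```python
import threading
import time


def run(wait_for_workers):
    """Start two workers and return the order in which the threads finished."""
    events = []

    def worker(name, delay):
        time.sleep(delay)
        events.append("Exiting " + name)

    threads = [
        threading.Thread(target=worker, args=("Thread-%d" % i, 0.1 * i))
        for i in (1, 2)
    ]
    for t in threads:
        t.start()
    if wait_for_workers:
        for t in threads:
            t.join()  # block until this worker has finished
    events.append("Exiting Main Thread")
    # Joined here in both modes only so that the returned list is complete
    for t in threads:
        t.join()
    return events


no_join = run(False)
with_join = run(True)
print(no_join)
print(with_join)
```

Without join() the main thread's entry comes first; with join() it comes last.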

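Synchronizing threads with a mutex

The problem without a lock

When different threads operate on the same data, unexpected problems can occur. For example: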

# encoding: UTF-8
import threading
import time

class MyThread(threading.Thread):
    def run(self):
        global num
        time.sleep(1)
        num = num+1
        msg = self.name+' set num to '+str(num)
        print msg
num = 0
threads = []
def test():
    for i in range(1000):
        t = MyThread()
        t.start()
        # Append inside the loop so that every thread is joined later
        threads.append(t)
if __name__ == '__main__':
    test()
    for t in threads:
        t.join()
    print num

In theory, 1000 threads each incrementing num by 1 should produce 1000, but the output is not 1000 (and it differs from run to run):

............
Thread-998 set num to 905
Thread-988 set num to 906
Thread-994 set num to 904
Thread-992 set num to 901
Thread-990 set num to 907
Thread-1000 set num to 908
908

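This happens because several threads modify the same data at the same time without any synchronization. A mutex is needed to synchronize the threads.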
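The statement num = num+1 is a read followed by a write, and another thread can run between the two steps. The Python 3 sketch below makes that window explicit by sleeping between the read and the write, so updates are lost reliably. The function name unsafe_increment, the thread count of 50, and the 0.05 second pause are assumptions chosen for this illustration.

```python
import threading
import time

num = 0


def unsafe_increment():
    """Increment num in two separate steps with no lock."""
    global num
    current = num      # step 1: read the shared value
    time.sleep(0.05)   # widen the window in which other threads can interleave
    num = current + 1  # step 2: write back a value that may already be stale


threads = [threading.Thread(target=unsafe_increment) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Many threads read the same old value, so the total is well below 50
print(num)
```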

The effect of adding a lock

# encoding: UTF-8
import threading
import time

class MyThread(threading.Thread):
    def run(self):
        global num
        time.sleep(1)
        # Only one thread at a time may update num
        threadLock.acquire()
        num = num+1
        threadLock.release()
        msg = self.name+' set num to '+str(num)
        print msg
num = 0
threadLock = threading.Lock()
threads = []
def test():
    for i in range(1000):
        t = MyThread()
        t.start()
        # Append inside the loop so that every thread is joined later
        threads.append(t)
if __name__ == '__main__':
    test()
    for t in threads:
        t.join()
    print num

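With the lock in place, the data can be updated by only one thread at any moment, and a thread that cannot acquire the lock blocks until it is released. The output looks like this: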

.........
Thread-994 set num to 994
Thread-995 set num to 995
Thread-993 set num to 993
Thread-996 set num to 996
Thread-999 set num to 997
Thread-1000 set num to 998
Thread-998 set num to 999
Thread-997 set num to 1000
1000

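As the output shows, threads do not gain access to the data in the order they were created, but the lock guarantees that they update it one at a time, so num is incremented exactly 1000 times.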
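In Python 3 the same critical section is often written with the lock as a context manager, which releases it even if the body raises an exception. The sketch below is one way to do that. The names safe_increment and seen_values are assumptions made for this illustration; copying the value inside the lock also ensures each thread reports the value it actually set.

```python
import threading

num = 0
lock = threading.Lock()
seen_values = []


def safe_increment():
    """Increment num under the lock and remember the value this thread produced."""
    global num
    with lock:      # acquired on entry, released on exit
        num += 1
        seen = num  # copy inside the critical section
    seen_values.append(seen)


threads = [threading.Thread(target=safe_increment) for _ in range(1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(num)  # 1000
```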

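In Python, multithreading is a common form of concurrent programming and is particularly well suited to I/O-bound tasks. The standard library's `threading` module provides support for it, making threads easy to create and manage.

### Basic ways to implement multithreading

#### Creating a thread

A new thread is created by instantiating `threading.Thread`. Here is a simple example:

```python
import threading

def worker():
    print("Worker thread is running")

# Create the thread
thread = threading.Thread(target=worker)
# Start the thread
thread.start()
# Wait for the thread to finish
thread.join()
```

#### A multithreaded function

Real applications usually have several tasks to process. The following example starts one thread per URL:

```python
import threading
import time

def task(url):
    print(f"Processing {url}")
    time.sleep(1)

urls = ["http://example.com", "http://example.org", "http://example.net"]

def multi_thread():
    threads = []
    for url in urls:
        thread = threading.Thread(target=task, args=(url,))
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()

multi_thread()
```

### Thread synchronization

To avoid data races when several threads access a shared resource at the same time, multithreaded code needs a synchronization mechanism. Python provides several, such as locks (`threading.Lock`) and condition variables (`threading.Condition`).

#### Using a lock

A lock is a simple synchronization mechanism that ensures only one thread accesses the shared resource at a time:

```python
import threading

lock = threading.Lock()
shared_resource = 0

def increment():
    global shared_resource
    for _ in range(100000):
        lock.acquire()
        shared_resource += 1
        lock.release()

threads = []
for _ in range(4):
    thread = threading.Thread(target=increment)
    threads.append(thread)
    thread.start()

for thread in threads:
    thread.join()

print(f"Shared resource value: {shared_resource}")
```

### Thread pools

A thread pool reuses a fixed set of threads, which reduces the overhead of creating and destroying them. Python provides `concurrent.futures.ThreadPoolExecutor` to simplify working with thread pools:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def task(url):
    print(f"Processing {url}")
    time.sleep(1)

urls = ["http://example.com", "http://example.org", "http://example.net"]

with ThreadPoolExecutor(max_workers=5) as executor:
    executor.map(task, urls)
```

### Practical use cases

Multithreading is useful in the following situations:

- **Network requests**: for example, fetching data from several websites at once.
- **File processing**: for example, processing several files or directories at once.
- **GUI applications**: keeping the main thread unblocked to improve responsiveness.

#### Example: a multithreaded crawler

The following simple crawler uses threads to speed up page fetching (it depends on the third-party `requests` package):

```python
import threading
import requests
import time

def fetch_url(url):
    response = requests.get(url)
    print(f"Fetched {url} with status code {response.status_code}")

urls = [
    "https://example.com",
    "https://example.org",
    "https://example.net"
]

def multi_thread_crawl():
    threads = []
    for url in urls:
        thread = threading.Thread(target=fetch_url, args=(url,))
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()

start_time = time.time()
multi_thread_crawl()
end_time = time.time()
print(f"Total time taken: {end_time - start_time} seconds")
```

### Performance tips

- **Work around the global interpreter lock (GIL)**: the GIL prevents threads from executing Python bytecode in parallel, so for CPU-bound tasks consider multiple processes. Asynchronous programming is an alternative for I/O-bound work, but it does not bypass the GIL.
- **Use a thread pool**: it reduces the overhead of creating and destroying threads.
- **Choose the number of threads sensibly**: too many threads increase context-switching overhead.

With these methods and tips, developers can use Python multithreading effectively to improve a program's concurrency and responsiveness.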
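The thread pool tip can be taken one step further by collecting each task's return value. The following is a minimal sketch under stated assumptions: the task body is a stand-in that sleeps instead of doing real I/O, and the results dictionary and the choice of three workers are illustrative, not taken from the examples above.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def task(url):
    """Stand-in for an I/O-bound job; returns the URL and a fake payload size."""
    time.sleep(0.05)
    return url, len(url)


urls = ["http://example.com", "http://example.org", "http://example.net"]
results = {}

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(task, url) for url in urls]
    # as_completed yields each future as soon as its task finishes
    for future in as_completed(futures):
        url, size = future.result()
        results[url] = size

print(results)
```

Because `future.result()` re-raises any exception from the worker, failures surface in the main thread instead of being silently dropped.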