A Few Small Things About Thread in Python

License: CC 4.0 BY-SA

Original link: https://juejin.im/post/5b405b94f265da0f7e626c69

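I recently implemented aiosqlite, an asynchronous library that thinly wraps sqlite3 so that it can be called asynchronously. Because the target is Python 2.7, whose standard library has no native module comparable to asyncio, it depends on the third-party tornado library.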

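sqlite3's queries against the database file are blocking, so asynchronous calls have to be built on threads: each query statement runs in a worker thread, which gives a pseudo-asynchronous effect.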
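The idea of pushing a blocking query onto a worker thread can be sketched as follows. This is not the aiosqlite code described here: it assumes Python 3's concurrent.futures in place of tornado, and the helper name run_query is made up for illustration.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

def run_query(db_path, sql, params=()):
    """Hypothetical helper: open a connection in the worker thread and run one query."""
    # sqlite3 connections are bound to the thread that created them by default,
    # so the connection is opened and closed inside the worker.
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(sql, params).fetchall()
    finally:
        conn.close()

executor = ThreadPoolExecutor(max_workers=2)

# submit() returns immediately with a Future; the blocking work happens
# in a pool thread while the caller is free to do something else.
future = executor.submit(run_query, ":memory:", "SELECT 1 + 1")
rows = future.result()  # wait for the worker and collect the rows
print(rows)

executor.shutdown()
```

The caller only blocks at future.result(); an event loop such as tornado's would instead be notified when the Future completes.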

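While working with threads I happened to chat with a colleague about a few multithreading questions. They may be basics, but I had never really thought them through, so it seemed worth writing them down.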

All code below is based on Python 2.7.

In Python, is a thread destroyed once its task has finished?

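I took it for granted that it is. My colleague's view was that the thread exits after finishing its task but is not actually destroyed, that Python itself has no facility for destroying threads, and that destroying one requires the interfaces provided by the operating system.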

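We normally call start() to begin a thread's task and finally join() to wait for it to end, so I looked at the source of threading.Thread in CPython. The documentation of the join() method does not explicitly say anything about thread destruction either.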

In the end, only an experiment can settle it. I wrote some test code on CentOS 7 x64 to check. The number of threads in a process can be seen with cat /proc/<pid>/status|grep Thread.

import time
from threading import Thread

def t1():
    print('Thread 1 start')
    time.sleep(10)
    print('Thread 1 done')

def t2():
    print('Thread 2 start')
    time.sleep(30)
    print('Thread 2 done')

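# Thread() has no daemon keyword in Python 2.7, so set the attribute instead.
# A separate list keeps the Thread objects from shadowing the functions t1 and t2.
tasks = [Thread(target=t1), Thread(target=t2)]

for task in tasks:
    task.daemon = True
    task.start()

for task in tasks:
    task.join()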

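Right after starting, the thread count shown is 3 (the main thread plus the two workers). Checking again after Thread 1 has finished, the count has dropped to 2. So the thread is indeed destroyed once its task ends.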
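The same check can be done from inside the process instead of running cat by hand. The sketch below is only an illustration: os_thread_count is a made-up helper that reads /proc/self/status (Linux only, returns None elsewhere), and threading.active_count() gives the interpreter's own view.

```python
import threading
import time

def os_thread_count():
    """Hypothetical helper: the 'Threads:' value from /proc/self/status, or None."""
    try:
        with open('/proc/self/status') as status:
            for line in status:
                if line.startswith('Threads:'):
                    return int(line.split()[1])
    except IOError:
        pass  # no /proc on this platform
    return None

def short_task():
    time.sleep(0.2)

baseline = threading.active_count()

worker = threading.Thread(target=short_task)
worker.start()
during = threading.active_count()  # worker is still sleeping here
print('OS threads while running: %s' % os_thread_count())

worker.join()
after = threading.active_count()   # worker has finished and been removed
print('OS threads after join: %s' % os_thread_count())
```

The OS-level number may lag join() by a moment, because the native thread finishes tearing down slightly after the Python-level bookkeeping is released.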

If one thread blocks, do the other threads keep running normally?

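On this one I look a bit foolish. My head was full of the GIL: only one thread can run at any moment, so if that thread blocks, the others surely cannot make progress either. I therefore assumed that when one thread blocks, all the others block too. My colleague's view was that they certainly do not, otherwise why call it multithreading.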

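A demo showed that one blocked thread has no effect on the others. My half-understanding of the GIL caused the mistake. After reading up on it: Python threads are created through the operating system's threading interfaces, and the GIL is just a global lock. A thread acquires the GIL to execute, and when it hits blocking I/O it releases the GIL, so another thread gets its turn. There is therefore no situation where one blocked thread makes the others block as well.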
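A minimal sketch of such a demo, assuming time.sleep() as the blocking call (it releases the GIL while waiting); the function names are made up for illustration:

```python
import threading
import time

events = []  # list.append is safe to call from several threads

def blocker():
    # Blocks for half a second; the GIL is released while sleeping.
    time.sleep(0.5)
    events.append('blocker done')

def worker():
    # Keeps making progress while the other thread is blocked.
    for i in range(3):
        events.append('worker tick %d' % i)
        time.sleep(0.05)

threads = [threading.Thread(target=blocker), threading.Thread(target=worker)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(events)
```

All three worker ticks are recorded before the blocker finishes, which would be impossible if the blocked thread held up the others.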

A really basic mistake...

When running multiple threads, does the whole process exit if one thread calls sys.exit()?

One colleague thought it would; another colleague and I were not so sure. Time to run a demo.

import sys
import time
from threading import Thread

def t1():
    print('Thread 1 start')
    sys.exit()
    print('Thread 1 done')

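def t2():
    k = 10
    # Loop ten times; with `if` the body would run only once.
    while k:
        print('Thread 2 is running')
        time.sleep(3)
        k -= 1

# Thread() has no daemon keyword in Python 2.7, so set the attribute instead.
tasks = [Thread(target=t1), Thread(target=t2)]

for task in tasks:
    task.daemon = True
    task.start()

for task in tasks:
    task.join()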

The result: the process exits only after t2 has finished running. The sys.exit() call in t1 does not make the whole process exit.

Look at the source, sysmodule.c:

static PyObject *
sys_exit(PyObject *self, PyObject *args)
{
    PyObject *exit_code = 0;
    if (!PyArg_UnpackTuple(args, "exit", 0, 1, &exit_code))
        return NULL;
    /* Raise SystemExit so callers may catch it or clean up. */
    PyErr_SetObject(PyExc_SystemExit, exit_code);
    return NULL;
}

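As the code shows, the return value is always NULL. Whenever argument parsing succeeds, the function sets a PyExc_SystemExit exception carrying exit_code as its value, whatever that value is. In other words, sys.exit() only raises an exception and does not terminate the process directly. Searching the code base for PyExc_SystemExit turns up the following in _threadmodule.c: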

...
PyDoc_STRVAR(start_new_doc,
"start_new_thread(function, args[, kwargs])\n\
(start_new() is an obsolete synonym)\n\
\n\
Start a new thread and return its identifier.  The thread will call the\n\
function with positional arguments from the tuple args and keyword arguments\n\
taken from the optional dictionary kwargs.  The thread exits when the\n\
function returns; the return value is ignored.  The thread will also exit\n\
when the function raises an unhandled exception; a stack trace will be\n\
printed unless the exception is SystemExit.\n");

static PyObject *
thread_PyThread_exit_thread(PyObject *self)
{
    PyErr_SetNone(PyExc_SystemExit);
    return NULL;
}

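So thread.exit() likewise just sets PyExc_SystemExit, and as the docstring above says, a thread whose function raises an unhandled SystemExit simply exits without printing a stack trace. That is why calling sys.exit() inside a thread ends only that thread and does not make the whole process exit.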
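Both points can be seen from Python itself. The following is a small illustrative sketch (variable and function names are made up): sys.exit() raises SystemExit, which can be caught like any other exception, and in a worker thread it silently ends only that thread.

```python
import sys
import threading

# In the calling thread, sys.exit() is just an exception that can be caught;
# the argument ends up in the exception's `code` attribute.
try:
    sys.exit(3)
except SystemExit as exc:
    caught_code = exc.code

log = []

def exits_early():
    log.append('before exit')
    sys.exit(1)               # raises SystemExit inside the worker thread
    log.append('after exit')  # never reached

worker = threading.Thread(target=exits_early)
worker.start()
worker.join()

# The main thread carries on after the worker has exited.
log.append('main thread still running')
print(caught_code, log)
```

Only when SystemExit propagates out of the main thread does the interpreter turn it into a process exit.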

The above is only my personal understanding. If anything is wrong, corrections are welcome. Thanks.

References:

What is Python's GIL

python GIL

Python/sysmodule.c

Modules/_threadmodule.c