C++ Thread Local Storage

Latest recommended article published on 2025-09-16 10:02:56

License: CC 4.0 BY-SA

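This article looks at thread-local storage (TLS) and how it is used in multithreaded programs. It starts with why TLS came about, describes the TLS interface provided by POSIX and how it works, and compares the strengths and weaknesses of explicit and implicit TLS variables. It also discusses how the thread_local keyword introduced in C++11 addresses the lack of support for non-POD types.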

Quote
- From : http://ju.outofmemory.cn/entry/66238

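After looking through the relevant material, I found that thread-local storage (TLS) is a latecomer that appeared after the concept of multithreading. In the early days of software, library functions often used global variables to store global state, errno being one example. Once multithreaded programs appeared, the global errno became a single variable shared by all threads, whereas each thread really wants to maintain its own errno, isolated from the other threads.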

At that point nobody wanted to change the library interfaces, and so thread-local storage was born. In Wikipedia's words:

Thread-local storage (TLS) is a computer programming method that uses static or global memory local to a thread.

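So that thread-local variables could be used on every platform, POSIX Threads defines a set of interfaces for creating and using thread-local storage explicitly.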

#include <pthread.h>
int pthread_key_create(pthread_key_t *key, void (*destructor)(void*));
int pthread_key_delete(pthread_key_t key);
void *pthread_getspecific(pthread_key_t key);
int pthread_setspecific(pthread_key_t key, const void *value);

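Creating thread-local variables explicitly has one notable advantage: objects of any type can be registered, built-in or user-defined. The way an object is destroyed can also be passed explicitly to pthread_key_create as destructor, so that when a thread exits its thread-local variable is destroyed properly and no memory is leaked.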
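
As a rough illustration of the explicit interface, the sketch below has three threads each store a private heap-allocated int under one key and counts how often the destructor runs. The names g_key, g_destroyed, destroy_value, worker, and run_pthread_key_demo are made up for this example and do not come from the original article.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstdint>

// Hypothetical names, for illustration only.
static pthread_key_t g_key;
static int g_destroyed = 0;  // number of destructor calls, guarded by g_lock
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;

// Called by the pthread runtime at thread exit for each non-NULL value.
static void destroy_value(void* p) {
    delete static_cast<int*>(p);
    pthread_mutex_lock(&g_lock);
    ++g_destroyed;
    pthread_mutex_unlock(&g_lock);
}

static void* worker(void* arg) {
    int* slot = new int(static_cast<int>(reinterpret_cast<std::intptr_t>(arg)));
    pthread_setspecific(g_key, slot);  // bind this thread's private value
    // Any code running on this thread now gets this thread's own copy back.
    int* mine = static_cast<int*>(pthread_getspecific(g_key));
    return reinterpret_cast<void*>(static_cast<std::intptr_t>(*mine * 10));
}

// Returns how many times the destructor ran once all threads have exited.
int run_pthread_key_demo() {
    g_destroyed = 0;
    pthread_key_create(&g_key, destroy_value);
    pthread_t threads[3];
    for (std::intptr_t i = 0; i < 3; ++i)
        pthread_create(&threads[i], nullptr, worker,
                       reinterpret_cast<void*>(i + 1));
    for (int i = 0; i < 3; ++i) {
        void* ret = nullptr;
        pthread_join(threads[i], &ret);
        assert(reinterpret_cast<std::intptr_t>(ret) == (i + 1) * 10);
    }
    pthread_key_delete(g_key);
    return g_destroyed;
}
```

Calling run_pthread_key_demo() from main should report one destructor call per worker thread, since each thread set a non-NULL value before exiting.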

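For all its advantages, this interface is simply too awkward to use, so people set out to add compiler support for a dedicated keyword, __thread, that creates thread-local variables implicitly: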

 __thread int i;
 extern __thread struct state s;
 static __thread char *p;
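
To show what the implicit form buys in practice, here is a minimal sketch assuming GCC or Clang (where __thread is available as an extension). The names tls_counter, bump, and run_thread_keyword_demo are hypothetical.

```cpp
#include <thread>

// GCC/Clang extension: every thread gets its own zero-initialized copy.
static __thread int tls_counter;

// Hypothetical helper: increment the calling thread's copy n times.
static int bump(int n) {
    for (int i = 0; i < n; ++i) ++tls_counter;
    return tls_counter;
}

// Returns true if each thread saw only its own increments.
bool run_thread_keyword_demo() {
    const int before = tls_counter;  // the calling thread's copy
    int a = 0, b = 0;
    std::thread t1([&] { a = bump(5); });
    std::thread t2([&] { b = bump(7); });
    t1.join();
    t2.join();
    // The workers started from 0 and never touched the caller's copy.
    return a == 5 && b == 7 && tls_counter == before;
}
```

No key, no setter, and no cast are needed: the variable is used like an ordinary global, which is the convenience the keyword was introduced for.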

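This is convenient to use, but it requires matching changes in the operating system, the compiler, the linker, and glibc, and even the ELF file format has to be adjusted. Ulrich Drepper describes this in detail in tls.pdf.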

On the other hand, __thread supports only POD types and cannot be used with STL containers and classes such as std::string. If you insist, the compiler reports errors:

main.cpp:8: error: ‘a’ cannot be thread-local because it has non-POD type ‘std::string’
main.cpp:8: error: ‘a’ is thread-local and so cannot be dynamically initialized

The GCC documentation also has a section on Thread-Local, which notes that a variable declared with __thread can only be statically initialized:

In C++, if an initializer is present for a thread-local variable, it must be a constant-expression, as defined in 5.19.2 of the ANSI/ISO C++ standard.

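Since there are two ways to use thread-local storage, each with its own strengths and weaknesses, some people have proposed combining them into a library that is easier to use and also supports non-POD types. One example is the blog post "Thread-Local Variables and __thread".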
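
One way such a combination might look is a small template that hides a pthread key behind a typed accessor. This is only a sketch of the general idea, not the implementation from the blog post mentioned above; ThreadLocal, value, and run_wrapper_demo are hypothetical names.

```cpp
#include <pthread.h>
#include <string>
#include <thread>

// Hypothetical wrapper: one pthread key per instance, with the per-thread
// object created lazily on first access so non-POD types are supported.
template <typename T>
class ThreadLocal {
public:
    ThreadLocal() { pthread_key_create(&key_, &ThreadLocal::destroy); }
    ~ThreadLocal() {
        // pthread_key_delete does not run destructors, so free the calling
        // thread's copy here. Copies owned by other live threads would leak.
        delete static_cast<T*>(pthread_getspecific(key_));
        pthread_key_delete(key_);
    }
    ThreadLocal(const ThreadLocal&) = delete;
    ThreadLocal& operator=(const ThreadLocal&) = delete;

    // Returns the calling thread's object, default-constructing it if needed.
    T& value() {
        T* p = static_cast<T*>(pthread_getspecific(key_));
        if (!p) {
            p = new T();
            pthread_setspecific(key_, p);
        }
        return *p;
    }

private:
    // Runs at thread exit for threads that created a value.
    static void destroy(void* p) { delete static_cast<T*>(p); }
    pthread_key_t key_;
};

// Returns true if the worker thread got its own empty string.
bool run_wrapper_demo() {
    ThreadLocal<std::string> name;
    name.value() = "main";
    std::string seen;
    std::thread t([&] {
        seen = name.value();      // fresh, empty string in the new thread
        name.value() = "worker";  // does not affect the main thread's copy
    });
    t.join();
    return seen.empty() && name.value() == "main";
}
```

The wrapper keeps the destructor registration of the explicit interface while giving callers something close to the ergonomics of a plain variable.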

C++11 recognized this problem as well and introduced a new keyword, thread_local. The article Destructor support for thread_local variables explains:

One of the key features is that they support non-trivial constructors and destructors that are called on first-use and thread exit respectively.
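
The behavior described in the quote can be observed with a small sketch like the one below. Tracker, tracker, the two counters, and run_thread_local_demo are hypothetical names used only for this illustration.

```cpp
#include <atomic>
#include <string>
#include <thread>

static std::atomic<int> g_constructed{0};
static std::atomic<int> g_destroyed{0};

// Hypothetical non-POD type with a non-trivial constructor and destructor.
struct Tracker {
    std::string label;
    Tracker() : label("fresh") { ++g_constructed; }
    ~Tracker() { ++g_destroyed; }
};

// Accepted by C++11 thread_local; the same declaration with __thread is not.
static thread_local Tracker tracker;

// Returns true if each worker constructed and destroyed its own Tracker.
bool run_thread_local_demo() {
    const int c0 = g_constructed;
    const int d0 = g_destroyed;
    std::thread t1([] { tracker.label += "-1"; });  // first use constructs
    std::thread t2([] { tracker.label += "-2"; });
    t1.join();  // each worker's Tracker is destroyed as that thread exits
    t2.join();
    return g_constructed - c0 == 2 && g_destroyed - d0 == 2;
}
```

The calling thread never touches tracker inside the function, so only the two workers contribute to the counts.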

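Besides supporting thread-local variables of non-POD types, it also covers the problem of thread-local variables in dynamically loaded shared objects (.so files):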

The compiler can handle constructors, but destructors need more runtime context. It could be possible that a dynamically loaded library defines and constructs a thread_local variable, but is dlclose()’d before thread exit. The destructor of this variable will then have the rug pulled from under it and crash.

The approach taken is to implement a function, __cxa_thread_atexit_impl(), for libstdc++ to call when it constructs the object:

int __cxa_thread_atexit_impl (void (*dtor) (void *), void *obj,
                          void *dso_symbol);

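The dynamic linker (ld.so) keeps a reference count for the shared object that dso_symbol belongs to, tracking the number of thread-local variables it defines. Each time one of those thread-local variables is destroyed, the count is decremented by 1, and dlclose() can unload the shared object only once the count reaches 0.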
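
The counting rule can be pictured with a toy model. The code below is purely illustrative and does not mirror glibc's internal data structures or functions; ToyDso, toy_register_dtor, toy_run_dtor, and toy_dlclose are all invented names.

```cpp
// Toy model only: hypothetical types and functions, not glibc internals.
struct ToyDso {
    int pending_tls_dtors = 0;  // thread-local objects not yet destroyed
    bool loaded = true;
};

// Models registering a destructor for a thread_local object defined in dso.
void toy_register_dtor(ToyDso& dso) { ++dso.pending_tls_dtors; }

// Models that destructor running at thread exit.
void toy_run_dtor(ToyDso& dso) { --dso.pending_tls_dtors; }

// Models dlclose(): the object is unloaded only when nothing is pending.
// Returns true if the toy object was actually unloaded.
bool toy_dlclose(ToyDso& dso) {
    if (dso.pending_tls_dtors > 0) return false;  // keep the code mapped
    dso.loaded = false;
    return true;
}
```

In this model, a toy_dlclose() call made while a destructor is still registered leaves the object loaded, and a later call after toy_run_dtor() unloads it, which is the ordering guarantee the reference count is meant to provide.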

Other perspectives

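In his article It's Not Always Nice To Share, Walter Bright argues that none of the existing thread-local variable implementations is friendly. In a multithreaded environment, static and global variables should be TLS by default rather than shared, so that code from the single-core era, such as the C runtime library, can run in a multithreaded environment without modification. If an application really needs a globally shared variable, it should say so explicitly with a shared keyword.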

Summary

To recap the advice for cases where thread-local variables really must be used in a dynamically loaded shared object:

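Explicit thread-local variables:
- If pthread_key_create() registered a destructor, make sure pthread_key_delete() is called to delete the thread-local variable before dlclose() is called.
- Set the destructor passed to pthread_key_create() to NULL.
Implicit thread-local variables: because only POD types are supported, they can be used in a dynamically loaded shared object.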
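
The first option for explicit thread-local variables could be sketched as follows, assuming the library exposes an init and a teardown function that the host calls around dlopen() and dlclose(). The names lib_init, lib_thread_buffer, lib_fini, g_lib_key, and free_buffer are hypothetical.

```cpp
#include <pthread.h>

// Hypothetical library-private state, for illustration only.
static pthread_key_t g_lib_key;
static bool g_lib_key_valid = false;

// Destructor that lives inside the shared object's code.
static void free_buffer(void* p) { delete[] static_cast<char*>(p); }

// Called once after the library is loaded. Returns 0 on success.
int lib_init() {
    int rc = pthread_key_create(&g_lib_key, free_buffer);
    g_lib_key_valid = (rc == 0);
    return rc;
}

// Returns the calling thread's 64-byte scratch buffer, created on demand.
char* lib_thread_buffer() {
    char* buf = static_cast<char*>(pthread_getspecific(g_lib_key));
    if (!buf) {
        buf = new char[64]();
        pthread_setspecific(g_lib_key, buf);
    }
    return buf;
}

// Called before dlclose(). After the key is deleted, the runtime no longer
// calls free_buffer at thread exit, so unmapped code is never jumped into.
int lib_fini() {
    if (!g_lib_key_valid) return 0;
    // pthread_key_delete runs no destructors, so free the caller's buffer
    // here. Buffers owned by other live threads would leak in this sketch.
    delete[] static_cast<char*>(pthread_getspecific(g_lib_key));
    g_lib_key_valid = false;
    return pthread_key_delete(g_lib_key);
}
```

With GCC or Clang, lib_fini could also be marked with __attribute__((destructor)) so that it runs automatically when the shared object is unloaded; whether that fits depends on how the host manages the library.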