An Analysis of the ThreadLocal Mechanism

License: CC 4.0 BY-SA

Original link: https://my.oschina.net/u/3729778/blog/1612304


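This article explains how ThreadLocal works and how it can lead to memory leaks. It describes how ThreadLocal gives each thread an independent copy of a variable, how that copy is read and modified through get and set, and then examines the root cause of ThreadLocal memory leaks and how to avoid them.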


1. Introduction to ThreadLocal

ThreadLocal is used in many places. Let's look at how it is actually implemented, starting with the class Javadoc:

This class provides thread-local variables. These variables differ from their normal counterparts in that each thread that accesses one (via its get or set method) has its own, independently initialized copy of the variable. ThreadLocal instances are typically private static fields in classes that wish to associate state with a thread (e.g., a user ID or Transaction ID).
For example, the class below generates unique identifiers local to each thread. A thread's id is assigned the first time it invokes ThreadId.get() and remains unchanged on subsequent calls.
  
Each thread holds an implicit reference to its copy of a thread-local variable as long as the thread is alive and the ThreadLocal instance is accessible; after a thread goes away, all of its copies of thread-local instances are subject to garbage collection (unless other references to these copies exist).

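In other words, the class provides thread-local variables. Unlike ordinary variables, each thread that accesses one through get or set works with its own independently initialized copy, isolated from the copies held by other threads. ThreadLocal instances are typically private static fields in classes that need to associate state with a thread, such as a user ID or a transaction ID.

Each thread holds an implicit reference to its copy for as long as the thread is alive and the ThreadLocal instance is reachable. Once the thread terminates, all of its thread-local copies become eligible for garbage collection, unless something else still references them.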

The example from the JDK Javadoc is reproduced below.

import java.util.concurrent.atomic.AtomicInteger;

public class ThreadId {
    // Atomic integer containing the next thread ID to be assigned
    private static final AtomicInteger nextId = new AtomicInteger(0);

    // Thread local variable containing each thread's ID
    private static final ThreadLocal<Integer> threadId =
        new ThreadLocal<Integer>() {
            @Override protected Integer initialValue() {
                return nextId.getAndIncrement();
            }
        };

    // Returns the current thread's unique ID, assigning it if necessary
    public static int get() {
        return threadId.get();
    }
}
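
As a rough illustration of the Javadoc example, the sketch below starts a few threads and records the ID each one observes. `ThreadIdDemo` and `idsSeenBy` are names made up for this sketch, and the nested `ThreadId` is a compact copy of the class above written with `ThreadLocal.withInitial` (Java 8 or later is assumed).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ThreadIdDemo {
    // Compact copy of the JDK Javadoc example (Java 8+ withInitial form).
    static final class ThreadId {
        private static final AtomicInteger nextId = new AtomicInteger(0);
        private static final ThreadLocal<Integer> threadId =
                ThreadLocal.withInitial(nextId::getAndIncrement);

        static int get() {
            return threadId.get();
        }
    }

    // Hypothetical helper: starts `threads` threads and collects the ID each one observes.
    static Set<Integer> idsSeenBy(int threads) {
        Set<Integer> ids = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                int first = ThreadId.get();   // assigned on the first call in this thread
                int second = ThreadId.get();  // later calls return the same value
                if (first == second) {
                    ids.add(first);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Number of distinct IDs observed by four threads.
        System.out.println(idsSeenBy(4).size());
    }
}
```

Because every thread triggers its own initial-value computation, the number of distinct IDs should equal the number of threads started.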

2. How ThreadLocal Works

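ThreadLocal can be thought of as a container that holds variables belonging to the current thread. Users work with four methods of the class: three public methods and one protected hook meant to be overridden.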

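// Returns the current thread's copy of this thread-local variable.
public T get() {}
// Sets the current thread's copy of this thread-local variable.
public void set(T value) {}
// Removes the current thread's value to reduce memory usage; added in JDK 5.0.
// When a thread terminates, its thread-local values become eligible for garbage
// collection automatically, so an explicit call is not strictly required in that
// case, but it speeds up reclamation. For long-lived threads such as pool workers
// it matters much more (see the memory leak discussion later in this article).
public void remove() {}
// Returns the initial value for the current thread. It is protected because it is
// meant to be overridden by subclasses. It is invoked lazily, on a thread's first
// get() unless that thread called set(T) beforehand, and normally at most once per
// thread (it runs again only after remove() followed by get()). The default
// implementation returns null.
protected T initialValue() {}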

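(1) get() returns the copy of the variable that ThreadLocal holds for the current thread.

(2) set() sets the current thread's copy of the variable.

(3) remove() removes the current thread's copy of the variable.

(4) initialValue() is a protected method that is usually overridden when needed; it is evaluated lazily.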
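
The following sketch walks through the four operations on a single thread. `ThreadLocalBasics`, `NAME`, `initCalls`, and `walkthrough` are illustrative names only, and `ThreadLocal.withInitial` (Java 8+) stands in for overriding `initialValue()`.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadLocalBasics {
    // Counts how often the initial-value supplier runs (for illustration only).
    static final AtomicInteger initCalls = new AtomicInteger();

    static final ThreadLocal<String> NAME = ThreadLocal.withInitial(() -> {
        initCalls.incrementAndGet();
        return "default";
    });

    // Hypothetical walkthrough; returns the value observed at each step.
    static String[] walkthrough() {
        String first = NAME.get();    // no value yet, so the initial value is computed
        NAME.set("custom");           // overwrite this thread's copy
        String second = NAME.get();   // reads the value just set
        NAME.remove();                // drop this thread's copy
        String third = NAME.get();    // the initial value is computed again
        NAME.remove();                // leave the thread clean
        return new String[] {first, second, third};
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", walkthrough()));
        System.out.println(initCalls.get());
    }
}
```

The supplier runs once for the first `get()` and once more for the `get()` that follows `remove()`, which is the lazy behavior described in (4).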

Let's first look at the implementation of get:

/**
 * Returns the value in the current thread's copy of this
 * thread-local variable.  If the variable has no value for the
 * current thread, it is first initialized to the value returned
 * by an invocation of the {@link #initialValue} method.
 *
 * @return the current thread's value of this thread-local
 */
public T get() {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null) {
        ThreadLocalMap.Entry e = map.getEntry(this);
        if (e != null)
            return (T)e.value;
    }
    return setInitialValue();
}

The first statement obtains the current thread, and getMap(t) then returns a map, as shown below:

/**
 * Get the map associated with a ThreadLocal. Overridden in
 * InheritableThreadLocal.
 *
 * @param  t the current thread
 * @return the map
 */
ThreadLocalMap getMap(Thread t) {
    return t.threadLocals;
}

The map is of type ThreadLocalMap. getMap simply returns the threadLocals field of the current thread t. Looking at the Thread class, we find:

/* ThreadLocal values pertaining to this thread. This map is maintained
 * by the ThreadLocal class. */
ThreadLocal.ThreadLocalMap threadLocals = null;

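So Thread has a field threadLocals of type ThreadLocal.ThreadLocalMap.

Next, get looks up the <key, value> entry. Note that the key passed in is this (the ThreadLocal instance), not the current thread t.

If the lookup succeeds, the value is returned.

If the map is null, or it has no entry for this ThreadLocal, setInitialValue is called and its result is returned.

ThreadLocalMap, mentioned above, is a static nested class of ThreadLocal. Let's look at its implementation: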

/**
 * ThreadLocalMap is a customized hash map suitable only for
 * maintaining thread local values. No operations are exported
 * outside of the ThreadLocal class. The class is package private to
 * allow declaration of fields in class Thread.  To help deal with
 * very large and long-lived usages, the hash table entries use
 * WeakReferences for keys. However, since reference queues are not
 * used, stale entries are guaranteed to be removed only when
 * the table starts running out of space.
 */
static class ThreadLocalMap {

    /**
     * The entries in this hash map extend WeakReference, using
     * its main ref field as the key (which is always a
     * ThreadLocal object).  Note that null keys (i.e. entry.get()
     * == null) mean that the key is no longer referenced, so the
     * entry can be expunged from table.  Such entries are referred to
     * as "stale entries" in the code that follows.
     */
    // Each Entry in the map uses a ThreadLocal as its key, held through a weak reference; this design is at the center of the memory leak discussion later on.
    static class Entry extends WeakReference<ThreadLocal> {
        /** The value associated with this ThreadLocal. */

        Object value;

        Entry(ThreadLocal k, Object v) {
            super(k);
            value = v;
        }
    }

    /**
     * The initial capacity -- MUST be a power of two.
     */
    private static final int INITIAL_CAPACITY = 16;

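    /**
     * The table, resized as necessary.
     * table.length MUST always be a power of two.
     * This array is the actual storage for a thread's thread-local values;
     * each ThreadLocal and its value are wrapped in one Entry.
     */
    private Entry[] table;

    // ... (rest of the class omitted)
}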

As we can see, ThreadLocalMap.Entry extends WeakReference and uses the ThreadLocal as its key.

Next, the implementation of setInitialValue:

/**
 * Variant of set() to establish initialValue. Used instead
 * of set() in case user has overridden the set() method.
 *
 * @return the initial value
 */
private T setInitialValue() {
    T value = initialValue();
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
    return value;
}

This is easy to follow: if the map already exists, the key-value pair is stored in it; otherwise a new map is created. Here is createMap:

/**
 * Create the map associated with a ThreadLocal. Overridden in
 * InheritableThreadLocal.
 *
 * @param t the current thread
 * @param firstValue value for the initial entry of the map
 * @param map the map to store.
 */
void createMap(Thread t, T firstValue) {
    t.threadLocals = new ThreadLocalMap(this, firstValue);
}

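This shows how ThreadLocal gives each thread its own copy of a variable.

First, every Thread has a field threadLocals of type ThreadLocal.ThreadLocalMap. This map stores the actual copies: the key is the ThreadLocal variable and the value is the copy (the object of type T).

Initially threadLocals is null. The first time get() or set() is called on a ThreadLocal from that thread, threadLocals is initialized, and the copy is stored in it with the ThreadLocal as the key.

From then on, the thread retrieves its copy by calling get(), which looks it up in threadLocals.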
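
The lookup path described above can be modeled in a few lines. This is a simplified sketch, not the JDK implementation: `MiniThreadLocal` and `readInNewThread` are made-up names, and because user code cannot add a field to `Thread`, a static `WeakHashMap` keyed by `Thread` is assumed as a stand-in for the `threadLocals` field.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Supplier;

class MiniThreadLocal<T> {
    // Stand-in for the Thread.threadLocals field: one private map per thread.
    // (The real JDK stores this map inside the Thread object itself.)
    private static final Map<Thread, Map<MiniThreadLocal<?>, Object>> PER_THREAD =
            Collections.synchronizedMap(new WeakHashMap<>());

    private final Supplier<? extends T> initial;

    MiniThreadLocal(Supplier<? extends T> initial) {
        this.initial = initial;
    }

    // Plays the role of getMap(t) plus createMap(t, ...).
    private static Map<MiniThreadLocal<?>, Object> mapOf(Thread t) {
        return PER_THREAD.computeIfAbsent(t, k -> new HashMap<>());
    }

    @SuppressWarnings("unchecked")
    T get() {
        Map<MiniThreadLocal<?>, Object> map = mapOf(Thread.currentThread());
        if (map.containsKey(this)) {
            return (T) map.get(this);   // the key is this variable, not the thread
        }
        T value = initial.get();        // same idea as setInitialValue()
        map.put(this, value);
        return value;
    }

    void set(T value) {
        mapOf(Thread.currentThread()).put(this, value);
    }

    void remove() {
        mapOf(Thread.currentThread()).remove(this);
    }

    // Hypothetical helper: reads the variable from a freshly started thread.
    static <V> V readInNewThread(MiniThreadLocal<V> local) {
        Object[] seen = new Object[1];
        Thread t = new Thread(() -> seen[0] = local.get());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        @SuppressWarnings("unchecked")
        V value = (V) seen[0];
        return value;
    }

    public static void main(String[] args) {
        MiniThreadLocal<String> who = new MiniThreadLocal<>(() -> "unset");
        who.set("main");
        System.out.println(who.get() + " / " + readInNewThread(who));
    }
}
```

The real ThreadLocalMap differs in important ways (an open-addressed table, weakly referenced keys), but the two-level lookup, first by thread and then by ThreadLocal instance, is the same idea.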

Here is a simple example:

package com.jason.threadlocal;

public class Context {

    private String traceId;
    private String ordNo;

    public String getTraceId() {
        return traceId;
    }

    public void setTraceId(String traceId) {
        this.traceId = traceId;
    }

    public String getOrdNo() {
        return ordNo;
    }

    public void setOrdNo(String ordNo) {
        this.ordNo = ordNo;
    }

    @Override
    public String toString() {
        return "Context{" +
                "traceId='" + traceId + '\'' +
                ", ordNo='" + ordNo + '\'' +
                '}';
    }
}

package com.jason.threadlocal;

public class TestThreadLocal {

    private static final ThreadLocal<Context> contextThreadLocal = new ThreadLocal<>();

    public static void main(String[] args) throws InterruptedException {
        Context context = new Context();
        context.setTraceId(Thread.currentThread().getName());
        context.setOrdNo(Thread.currentThread().getId() + "");
        contextThreadLocal.set(context);
        System.out.println(context);
        Thread thread1 = new Thread(new Runnable() {

            @Override
            public void run() {
                Context context = new Context();
                context.setTraceId(Thread.currentThread().getName());
                context.setOrdNo(Thread.currentThread().getId() + "");
                contextThreadLocal.set(context);
                System.out.println(context);

            }

        });
        thread1.start();
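        // Wait for thread1 to terminate
        thread1.join();
        // Read back through the ThreadLocal: the main thread still sees its own copy
        System.out.println(contextThreadLocal.get());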
    }
}

It prints output like the following (the thread ID printed for Thread-0 depends on the JVM):

Context{traceId='main', ordNo='1'}
Context{traceId='Thread-0', ordNo='9'}
Context{traceId='main', ordNo='1'}

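The example shows that the main thread and thread1 each hold a different value in contextThreadLocal. The main thread's value is printed once more at the end to confirm that the shared variable contextThreadLocal still holds the main thread's own copy, untouched by thread1.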

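To summarize:

(1) The copies created through ThreadLocal are actually stored in each thread's threadLocals map.

(2) The key of ThreadLocalMap, the type of threadLocals, is the ThreadLocal object because a single thread can have several ThreadLocal variables.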
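
Point (2) can be seen in a small sketch in which one thread uses two ThreadLocal variables at once. `TwoLocalsDemo`, `USER`, `RETRIES`, and `describe` are names invented for this illustration.

```java
class TwoLocalsDemo {
    static final ThreadLocal<String> USER = new ThreadLocal<>();
    static final ThreadLocal<Integer> RETRIES = new ThreadLocal<>();

    // Hypothetical helper: stores two values for the current thread and reads them back.
    static String describe(String user, int retries) {
        USER.set(user);
        RETRIES.set(retries);
        try {
            // Both lookups hit the same per-thread map, each with a different key.
            return USER.get() + ":" + RETRIES.get();
        } finally {
            USER.remove();
            RETRIES.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("alice", 3));
    }
}
```

Both values live in the current thread's single map and are told apart only by which ThreadLocal instance is used as the key.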

3. ThreadLocal Memory Leaks

[Figure: ThreadLocal reference diagram]

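ThreadLocal is implemented as follows: each Thread maintains a ThreadLocalMap whose keys are ThreadLocal instances and whose values are the objects actually being stored.

In other words, a ThreadLocal does not store the value itself; it only serves as the key with which a thread looks up the value in its ThreadLocalMap. Note the dashed line in the figure: it indicates that ThreadLocalMap holds its ThreadLocal keys through weak references, and an object that is only weakly reachable is reclaimed at the next GC.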

Why `ThreadLocal` can leak memory

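ThreadLocalMap uses a weak reference to the ThreadLocal as the key. If no strong reference to a ThreadLocal remains outside the map, the ThreadLocal is reclaimed at the next GC, which leaves an Entry whose key is null in the ThreadLocalMap. The value of such an Entry can no longer be reached through any key, yet if the thread stays alive for a long time, a chain of strong references to it remains: Thread Ref -> Thread -> ThreadLocalMap -> Entry -> value. The value therefore cannot be collected, and memory leaks.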
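
A stale entry of this kind can be mimicked without waiting for the garbage collector. In the sketch below, `StaleEntryDemo`, `isStale`, and `expunge` are hypothetical names, the `Entry` class is a simplified stand-in for `ThreadLocalMap.Entry`, and `WeakReference.clear()` is used to simulate the GC clearing the key.

```java
import java.lang.ref.WeakReference;

class StaleEntryDemo {
    // Simplified stand-in for ThreadLocalMap.Entry: weak key, strong value.
    static final class Entry extends WeakReference<Object> {
        Object value;

        Entry(Object key, Object value) {
            super(key);
            this.value = value;
        }
    }

    // Keeps the key strongly reachable so that only clear() can empty the reference.
    static final Object KEY = new Object();

    // A stale entry is one whose key is gone while the value is still strongly held.
    static boolean isStale(Entry e) {
        return e.get() == null && e.value != null;
    }

    // Roughly what expunging does for one entry: drop the strong reference to the value.
    static void expunge(Entry e) {
        e.value = null;
    }

    public static void main(String[] args) {
        Entry entry = new Entry(KEY, new byte[1024]);
        System.out.println(isStale(entry));   // key still present
        entry.clear();                        // simulates the GC clearing the weak key
        System.out.println(isStale(entry));   // key gone, value still strongly held
        expunge(entry);
        System.out.println(isStale(entry));   // value released
    }
}
```

Between `clear()` and `expunge()`, the value is unreachable by key but still strongly referenced by the entry, which is exactly the state that leaks when the owning thread lives on.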

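The design of ThreadLocalMap does take this into account and includes some safeguards: calls to get(), set(), and remove() expunge Entry objects with null keys that they come across in the thread's ThreadLocalMap, releasing the corresponding values. This cleanup is opportunistic, so a single call is not guaranteed to clear every stale entry.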

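These passive safeguards, however, cannot guarantee that no memory leaks:

(1) A static ThreadLocal extends the lifetime of the ThreadLocal itself, which may lead to leaks (see the article "ThreadLocal 内存泄露的实例分析", a case study of a ThreadLocal memory leak).
(2) If a ThreadLocal is allocated and used, and get(), set(), or remove() is never called again afterwards, the leak remains.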
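
The effect of a long-lived thread is easy to observe with a thread pool. The sketch below is an illustration under assumptions: `PooledThreadRetentionDemo` and `valueSeenByNextTask` are invented names, and a single-thread executor is used so that both tasks run on the same worker.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledThreadRetentionDemo {
    static final ThreadLocal<String> REQUEST = new ThreadLocal<>();

    // Hypothetical helper: runs two tasks on the same pooled thread and returns
    // what the second task finds in the ThreadLocal.
    static String valueSeenByNextTask(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Runnable write = () -> {
            REQUEST.set("request-1");
            if (cleanUp) {
                REQUEST.remove();
            }
        };
        Callable<String> read = REQUEST::get;
        try {
            pool.submit(write).get();        // first task stores a value
            return pool.submit(read).get();  // second task runs on the same worker
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(valueSeenByNextTask(false));
        System.out.println(valueSeenByNextTask(true));
    }
}
```

Without cleanup, the value set by the first task is still attached to the worker thread when the second task runs; the worker, not the task, determines how long the value lives.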

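Why weak references are used

On the surface, the root of the leak appears to be the use of weak references. Most articles online focus on how ThreadLocal's weak references cause memory leaks, but another question deserves equal attention: why use weak references rather than strong ones?

Let's first see what the official documentation says: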

To help deal with very large and long-lived usages, the hash table entries use WeakReferences for keys.
That is, weak keys are meant to cope with maps that become very large and live for a long time.

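Consider the two cases separately:

Strong key: the application drops its reference to the ThreadLocal, but ThreadLocalMap still holds a strong reference to it. Unless the entry is removed manually, the ThreadLocal is never collected, and the whole Entry leaks.
Weak key: the application drops its reference to the ThreadLocal, and because ThreadLocalMap only holds a weak reference, the ThreadLocal is collected even without manual removal. The value is cleared on a later call to set, get, or remove on the ThreadLocalMap.

Comparing the two cases: because a ThreadLocalMap lives as long as its Thread, failing to remove the key manually can leak memory either way. Weak references simply add one more layer of protection: the weakly referenced ThreadLocal itself does not leak, and its value is cleared on a later call to set, get, or remove on the ThreadLocalMap.

The root cause of ThreadLocal memory leaks is therefore that ThreadLocalMap has the same lifetime as its Thread, so a value that is never removed manually stays alive with the thread. Weak references are not the cause.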
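
One common way to follow this advice is to pair every set() with a remove() in a finally block. The sketch below shows that pattern; `ScopedContext`, `withTraceId`, and `currentTraceId` are hypothetical names chosen for illustration, not part of any library.

```java
import java.util.function.Supplier;

class ScopedContext {
    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    static String currentTraceId() {
        return TRACE_ID.get();
    }

    // Hypothetical helper: binds traceId while body runs, then always removes it.
    static <R> R withTraceId(String traceId, Supplier<R> body) {
        TRACE_ID.set(traceId);
        try {
            return body.get();
        } finally {
            TRACE_ID.remove();   // runs even if body throws
        }
    }

    public static void main(String[] args) {
        String seen = withTraceId("trace-42", () -> "handling " + currentTraceId());
        System.out.println(seen);
        System.out.println(currentTraceId());
    }
}
```

This simple version does not support nesting, since the inner call's remove() would also discard the outer value; a fuller implementation would save and restore the previous value instead.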