Java Memory Model and Thread Synchronization: Understanding Cache Coherence and False Sharing

Article link: https://blog.youkuaiyun.com/qq_28136919/article/details/125139109

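I. The Java Memory Model and Threads

1.1. The Java Memory Model

1.1.1. The memory model of modern computers

During computation the CPU has to read data from main memory, but a memory access is several orders of magnitude slower than the CPU's own arithmetic.

Modern computer systems therefore place a layer of cache between the CPU and main memory, with read and write speeds as close as possible to the CPU's speed, to act as a buffer.

Cache-based storage resolves the speed mismatch between processor and memory well, but it introduces a new problem: cache coherence. When several cores each hold a cached copy of the same memory location, those copies can disagree.

To keep the caches coherent, every core follows a coherence protocol when it accesses its cache, such as MSI or MESI.

The relationship between the CPU, the caches, and main memory is shown in the figure below.

[Figure: CPU, cache, and main memory]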

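1.1.1.1. Caches

Caches are usually organized in several levels, as shown below:

[Figure: multi-level cache hierarchy]
(Image source: https://blog.youkuaiyun.com/weixin_42523774/article/details/123203289)

In modern processors the cache is split into three levels, L1, L2, and L3, in order of increasing size and decreasing speed. L3 is the largest and slowest level (third-generation Ryzen CPUs have an L3 cache of up to 64 MB). L2 and L1 are much smaller and faster than L3, and each core has its own. Older processors had no L3 cache, and system memory talked directly to the L2 cache.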

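The main characteristics of the L1, L2, and L3 caches are:

L1
1. The L1 cache is further split into two parts: the L1 data cache and the L1 instruction cache. The latter holds the instructions the CPU is about to execute, the former holds the data those instructions read and write.
2. L1 is small, typically tens of kilobytes per core (commonly up to about 64 KB).
L2: The L2 cache is much larger than L1 but also slower. On flagship CPUs it totals 4-8 MB (512 KB per core). Each core has its own L1 and L2 caches, while the last-level L3 cache is shared by all cores on the die.
L3: The L3 cache is the last level of cache, the one furthest from the core. It ranges from 10 MB to 64 MB; server chips have up to 256 MB of L3 cache.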

Checking the L1 cache size on Linux:

Path: /sys/devices/system/cpu/cpu0/cache

[root@localhost cache]# pwd
/sys/devices/system/cpu/cpu0/cache
[root@localhost cache]# ls
index0  index1  index2  index3

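Where (on this machine; the type and level files inside each directory are what actually identify the cache):

index0: L1 data cache
index1: L1 instruction cache
index2: L2 cache
index3: L3 cache

The contents of the index0 directory: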

[root@localhost index0]# pwd
/sys/devices/system/cpu/cpu0/cache/index0
[root@localhost index0]# ls
coherency_line_size  level           physical_line_partition  shared_cpu_map  type
id                   number_of_sets  shared_cpu_list          size            ways_of_associativity
[root@localhost index0]# cat type 
Data
[root@localhost index0]# cat size 
48K
[root@localhost index0]# cat level 
1
[root@localhost index0]# cat ways_of_associativity 
12
[root@localhost index0]# cat coherency_line_size 
64

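Where:

type: the type of this cache (Data, Instruction, or Unified)
size: the total size, in KB
level: the cache level
coherency_line_size: the cache line size in bytes, 64 bytes here
ways_of_associativity: the number of ways per set. Together with the line size it gives the number of sets: number_of_sets = size / (ways_of_associativity * coherency_line_size) = 48 KB / (12 * 64 B) = 64

The L2 and L3 cache sizes can be inspected the same way.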
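The arithmetic behind these sysfs values can be checked with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and the numbers are the ones printed in the transcript above (48K, 12 ways, 64-byte lines).

```java
// Sketch: derive cache geometry from the values that sysfs exposes.
// CacheGeometry and its method names are illustrative, not a real API.
final class CacheGeometry {
    /** sets = total size / (ways * line size), all sizes in bytes. */
    static int numberOfSets(int sizeBytes, int ways, int lineSizeBytes) {
        return sizeBytes / (ways * lineSizeBytes);
    }

    /** Total number of cache lines the cache can hold. */
    static int numberOfLines(int sizeBytes, int lineSizeBytes) {
        return sizeBytes / lineSizeBytes;
    }

    public static void main(String[] args) {
        int size = 48 * 1024; // "48K" from the size file
        int ways = 12;        // ways_of_associativity
        int line = 64;        // coherency_line_size, the cache line size itself
        System.out.println("sets  = " + numberOfSets(size, ways, line)); // sets  = 64
        System.out.println("lines = " + numberOfLines(size, line));      // lines = 768
    }
}
```

The line size is read directly from coherency_line_size; the division only tells you how many sets the cache is organized into.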

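1.1.1.1.1. Cache Line

What is a cache line?

A cache line is the smallest unit the CPU cache works with. When the CPU loads data from memory into the cache it does not read byte by byte but block by block, and one such block is called a cache line.

On today's mainstream CPUs a cache line is 64 bytes. Suppose we had a 512-byte L1 cache: with 64-byte lines it could hold 512 / 64 = 8 cache lines. In other words the L1 cache would hold 8 cache lines, and every fetch from L2 brings in at least one whole cache line.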

1.1.1.1.2. False sharing

In the JVM a primitive variable occupies at most 8 bytes (a long, for example). Suppose we have the following definition:

public class A{
  private long a;
  private long b;
}

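In an instance of class A, say objA, the fields a and b are normally laid out next to each other. If the cache line is 64 bytes and field a sits at the start of a line, both fields fall in the same cache line. Because the cache line is the unit in which the CPU moves data from memory into the cache, the two fields are always loaded together, and each core that touches either of them gets both in its own cache.

[Figure: fields a and b sharing one cache line in the caches of two cores]
(Image source: https://zhuanlan.zhihu.com/p/458926355)

Suppose two threads, A and B, run on CPU core 1 and CPU core 2 respectively and both work on objA, so each core has loaded a and b into the same cache line of its own cache.

Now thread A modifies a while thread B modifies b. What goes wrong?

Under the MESI protocol, when thread A writes a, the copy of the line held by core 2 is invalidated, and when thread B writes b, the copy held by core 1 is invalidated. The two cores keep invalidating each other's line, and each has to fetch it again before its next access, even though the threads never touch the same variable.

This loss of cache efficiency, caused by several threads reading and writing different variables that happen to share a cache line, is called false sharing.

How to fix it: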

public class A{
  private long a;
  private long l1;
  private long l2;
  private long l3;
  private long l4;
  private long l5;
  private long l6;
  private long l7;
  private long b;
}

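One option is to add unused padding fields before and after a hot variable, so that variables likely to be modified by different threads at the same time do not end up in the same cache line. The JVM is free to choose the field layout, so manual padding is a convention rather than a guarantee.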
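The effect of padding can be observed with a rough micro-benchmark. The following is a minimal sketch, not a rigorous measurement: the class names, the iteration count, and the seven padding fields are assumptions chosen for illustration, and the timings depend on the CPU, the JVM, and the field layout the JVM actually picks. On JDK 8 and later the JVM also offers a `@Contended` annotation (`sun.misc.Contended` in JDK 8, `jdk.internal.vm.annotation.Contended` from JDK 9, enabled for application classes with `-XX:-RestrictContended`) as an alternative to manual padding.

```java
// Sketch of a false-sharing micro-benchmark. Class names, the iteration count
// and the seven padding fields are illustrative assumptions.
final class FalseSharingDemo {
    static final class Unpadded {
        volatile long a;
        volatile long b; // very likely in the same cache line as a
    }

    static final class Padded {
        volatile long a;
        long p1, p2, p3, p4, p5, p6, p7; // 56 bytes intended to push b into another line
        volatile long b;
    }

    /** Runs both tasks on their own threads and returns the elapsed nanoseconds. */
    static long timeInParallel(Runnable first, Runnable second) {
        Thread t1 = new Thread(first, "writer-a");
        Thread t2 = new Thread(second, "writer-b");
        long start = System.nanoTime();
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return System.nanoTime() - start;
    }

    // Each field has exactly one writer thread, so the non-atomic ++ is safe here.
    static long runUnpadded(Unpadded u, long n) {
        return timeInParallel(
                () -> { for (long i = 0; i < n; i++) u.a++; },
                () -> { for (long i = 0; i < n; i++) u.b++; });
    }

    static long runPadded(Padded p, long n) {
        return timeInParallel(
                () -> { for (long i = 0; i < n; i++) p.a++; },
                () -> { for (long i = 0; i < n; i++) p.b++; });
    }

    public static void main(String[] args) {
        long n = 50_000_000L;
        System.out.println("unpadded: " + runUnpadded(new Unpadded(), n) / 1_000_000 + " ms");
        System.out.println("padded:   " + runPadded(new Padded(), n) / 1_000_000 + " ms");
    }
}
```

If the padded run is not noticeably faster on your machine, the JVM may have laid the fields out differently, which is exactly why padding by hand is fragile.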

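1.1.2. The Java Memory Model (JMM)

1.1.2.1. JMM: the relationship between main memory and working memory

The main goal of the Java Memory Model is to define the access rules for the variables in a program, that is, the low-level details of how the virtual machine stores variables to memory and loads them back.

[Figure: threads, working memory, and main memory]

The JMM specifies that:

All variables are stored in main memory.
Each thread has its own working memory. Every operation a thread performs on a variable must take place in its working memory; a thread cannot read or write variables in main memory directly.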

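1.1.2.2. Interaction between the memories

The JMM defines 8 operations for the interaction between main memory and working memory:

Operation	Description
lock	Acts on a main-memory variable; marks it as exclusively owned by one thread
unlock	Releases the thread's exclusive ownership
read	Reads the value from main memory and transfers it to working memory, for use by load
load	Puts the value that read brought from main memory into the working-memory copy
use	Passes the variable's value to the execution engine
assign	Receives a value from the execution engine and assigns it to the working-memory copy
store	Transfers the variable's value from working memory to main memory, for use by write
write	Writes the value transferred from working memory into main memory

How the operations fit together is shown below:

[Figure: the eight operations between main memory and working memory]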

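The eight operations must obey the following rules:

read and load, and likewise store and write, must not appear on their own: a read must be followed by a load, and a store by a write.
A thread must not discard its most recent assign: once a variable has changed in working memory, the change has to be synchronized back to main memory.
Without an assign, a thread must not synchronize a variable from working memory back to main memory.
A new variable can only be born in main memory. Working memory must not use an uninitialized variable directly: before use or store is applied to a variable, an assign or load must have been performed on it.
Only one thread at a time may lock a variable. The same thread may lock it repeatedly, and must then perform the same number of unlock operations.
Locking a variable clears its value in working memory; a load or assign must initialize the value again before it is used.
A variable that has not been locked must not be unlocked.
Before unlocking a variable, its value must be synchronized back to main memory.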

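1.1.2.3. Atomicity, visibility, and ordering

The Java Memory Model is built around how three properties are handled under concurrency: atomicity, visibility, and ordering.

Atomicity:

read, load, use, assign, store, and write are themselves atomic, so reads and writes of primitive types are atomic. The one exception is non-volatile long and double, which the specification allows to be split into two 32-bit operations.
Atomicity over a larger scope is provided by lock and unlock, which are exposed to programs as the synchronized keyword.

Visibility:

The volatile keyword guarantees that a write to a variable is visible to other threads. synchronized and final also provide visibility guarantees.

Ordering:

as-if-serial: observed from within a thread, all of its operations appear in order.
The volatile and synchronized keywords provide ordering between threads. volatile forbids instruction reordering across the access; synchronized allows only one thread at a time into a block guarded by the same lock.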
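As a small illustration of visibility and atomicity, the sketch below uses a volatile flag to stop a spinning worker and a synchronized method to keep a shared counter exact. The class and method names are hypothetical and exist only for this example.

```java
// Sketch: volatile for visibility, synchronized for atomicity.
// JmmDemo and its methods are illustrative names.
final class JmmDemo {
    private volatile boolean running = true; // volatile: the worker sees the write
    private int count;                        // guarded by the monitor of "this"

    synchronized void increment() { count++; } // read-modify-write made atomic by the lock
    synchronized int count() { return count; }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
    }

    /** Starts a worker that spins on the volatile flag, then stops it. */
    static int stopWithVolatileFlag(long runMillis) {
        JmmDemo demo = new JmmDemo();
        Thread worker = new Thread(() -> {
            while (demo.running) {   // volatile read on every iteration
                demo.increment();
            }
        }, "worker");
        worker.start();
        try {
            Thread.sleep(runMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        demo.running = false;        // volatile write, visible to the worker
        joinQuietly(worker);
        return demo.count();
    }

    /** Several threads increment one counter; synchronized keeps the total exact. */
    static int parallelIncrements(int threads, int perThread) {
        JmmDemo demo = new JmmDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    demo.increment();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            joinQuietly(w);
        }
        return demo.count();
    }

    public static void main(String[] args) {
        System.out.println("increments before stop: " + stopWithVolatileFlag(10));
        System.out.println("parallel total: " + parallelIncrements(4, 10_000)); // parallel total: 40000
    }
}
```

Without volatile on the flag the worker would be allowed to keep running indefinitely, and without synchronized on increment the parallel total could come out below 40000.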

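II. Thread Safety and Locks

2.1. Threads in Java

2.1.1. How Java threads relate to operating-system threads

In mainstream JVMs such as HotSpot, each Java thread is mapped directly onto one native operating-system thread. The JVM does not interfere with thread scheduling; it is left entirely to the operating system.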

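2.1.2. Java thread state transitions

Java defines 6 thread states:

New: a thread that has been created but not yet started, i.e. start() has not been called.
Runnable: covers both Running and Ready in operating-system terms. A thread in this state may be executing, or may be waiting for the operating system to give it CPU time.
Waiting (indefinite wait): the thread is given no CPU time and must be woken explicitly by another thread. The following methods put a thread into this state:
1. Object::wait without a timeout
2. Thread::join without a timeout
3. LockSupport::park
Timed Waiting (bounded wait): the thread is given no CPU time, but is woken automatically by the system once the given time has elapsed. The following methods put a thread into this state:
1. Thread.sleep()
2. Object::wait with a timeout
3. Thread::join with a timeout
4. LockSupport.parkNanos(long nanos)
5. LockSupport.parkUntil(long deadline)
Blocked: differs from the waiting states in what the thread is waiting for. A waiting thread waits for a period of time to pass or for another thread to wake it explicitly, whereas a blocked thread is waiting to acquire an exclusive lock and resumes once another thread has released that lock.
Terminated: the thread has finished executing.

[Figure: thread state transitions]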
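Several of these states can be observed from code with Thread.getState(). The sketch below walks one thread through NEW, BLOCKED, WAITING, and TERMINATED; the class name, the helper awaitState, and the polling approach are illustrative choices, not part of any standard API.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: observe thread states with Thread.getState().
// ThreadStateDemo and awaitState are illustrative names.
final class ThreadStateDemo {
    /** Busy-waits until the thread is observed in the wanted state, then returns it. */
    static Thread.State awaitState(Thread t, Thread.State wanted) {
        while (t.getState() != wanted) {
            Thread.yield();
        }
        return wanted;
    }

    static List<Thread.State> observe() {
        final Object lock = new Object();
        final boolean[] released = { false }; // guarded by lock
        List<Thread.State> seen = new ArrayList<>();

        Thread t = new Thread(() -> {
            synchronized (lock) {             // BLOCKED while the caller holds the lock
                while (!released[0]) {
                    try {
                        lock.wait();          // WAITING until notified
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        }, "observed");

        seen.add(t.getState());               // NEW: created, not started
        synchronized (lock) {
            t.start();
            seen.add(awaitState(t, Thread.State.BLOCKED));
        }
        seen.add(awaitState(t, Thread.State.WAITING));
        synchronized (lock) {
            released[0] = true;
            lock.notifyAll();                 // wake the waiting thread
        }
        seen.add(awaitState(t, Thread.State.TERMINATED));
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observe()); // [NEW, BLOCKED, WAITING, TERMINATED]
    }
}
```

Runnable and Timed Waiting are left out only to keep the example short; a thread inside Thread.sleep would report TIMED_WAITING in the same way.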

2.1.2. Using Java threads

2.1.2.1. The join method

Usage:

 public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(()->{
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.println("thread "+Thread.currentThread().getName());
        },"t_1");

        Thread t2 = new Thread(()->{
            try {
                t.join();   // wait for t_1 to finish first
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.println("thread "+Thread.currentThread().getName());
        },"t_2");
        Thread t3 = new Thread(()->{
            try {
                t2.join();  // wait for t_2 to finish first
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.println("thread "+Thread.currentThread().getName());
        },"t_3");

        t.start();
        t2.start();
        t3.start();

        System.out.println("done");
    }

done
thread t_1
thread t_2
thread t_3

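join waits for the thread it is called on to finish.

If the run method of thread B calls A.join(), B blocks at A.join() until thread A has finished, and only then continues.

Thread provides three forms of join:

join(): waits until the target thread finishes
join(long millis): waits at most the given number of milliseconds
join(long millis, int nanos): waits at most the given time; in the source below the nanoseconds are only used to round the milliseconds up

Here is the source of the join methods: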


  public final void join() throws InterruptedException {
        join(0);
    }
    
  public final synchronized void join(long millis, int nanos)
    throws InterruptedException {

        if (millis < 0) {
            throw new IllegalArgumentException("timeout value is negative");
        }

        if (nanos < 0 || nanos > 999999) {
            throw new IllegalArgumentException(
                                "nanosecond timeout value out of range");
        }

        if (nanos >= 500000 || (nanos != 0 && millis == 0)) {
            millis++;
        }

        join(millis);
    }
  public final synchronized void join(long millis)
    throws InterruptedException {
        long base = System.currentTimeMillis();
        long now = 0;

        if (millis < 0) {
            throw new IllegalArgumentException("timeout value is negative");
        }

        if (millis == 0) {
            while (isAlive()) {
                wait(0);
            }
        } else {
            while (isAlive()) {
                long delay = millis - now;
                if (delay <= 0) {
                    break;
                }
                wait(delay);
                now = System.currentTimeMillis() - base;
            }
        }
    }

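The core is join(long millis), which works as follows:

If the timeout is negative, an exception is thrown.
If the timeout is 0, it waits for as long as the target thread is alive (wait(0) means wait indefinitely).
If the timeout is positive, it waits up to that long and returns once the time is exceeded.

The waiting thread is woken when the target thread terminates, because notifyAll is invoked on the Thread object at that point.

Summary: when A calls B's join method, A waits for B. A thread that calls join on itself waits forever, until it is interrupted.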
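The difference between a timed join and an untimed join can be shown with a worker that stays alive until it is released. This is a sketch under assumptions: the class name, the latch used to hold the worker, and the 50 ms timeout are all illustrative.

```java
import java.util.concurrent.CountDownLatch;

// Sketch: join(millis) gives up after the timeout, join() waits for termination.
// JoinTimeoutDemo is an illustrative name.
final class JoinTimeoutDemo {
    /**
     * Returns { aliveAfterTimedJoin, aliveAfterFullJoin }.
     * timeoutMillis must be positive: join(0) would wait forever here.
     */
    static boolean[] run(long timeoutMillis) {
        CountDownLatch release = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                release.await();              // stay alive until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "worker");
        worker.start();
        try {
            worker.join(timeoutMillis);       // returns after the timeout, worker still running
            boolean aliveAfterTimeout = worker.isAlive();
            release.countDown();              // let the worker finish
            worker.join();                    // same as join(0): wait until it terminates
            return new boolean[] { aliveAfterTimeout, worker.isAlive() };
        } catch (InterruptedException e) {
            release.countDown();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run(50);
        System.out.println("alive after join(50): " + r[0]); // alive after join(50): true
        System.out.println("alive after join():   " + r[1]); // alive after join():   false
    }
}
```

A timed join therefore says nothing about whether the target finished; callers that care have to check isAlive() afterwards.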

2.1.2.2. interrupt vs. interrupted vs. isInterrupted

The differences between these three methods are easiest to see in the source:

public void interrupt() {
        if (this != Thread.currentThread())
            checkAccess();

        synchronized (blockerLock) {
            Interruptible b = blocker;
            if (b != null) {
                interrupt0();           // Just to set the interrupt flag
                b.interrupt(this);
                return;
            }
        }
        interrupt0();
  }

 public static boolean interrupted() {
        return currentThread().isInterrupted(true);
    }
    
    

  public boolean isInterrupted() {
        return isInterrupted(false);
    }

/**
     * Tests if some Thread has been interrupted.  The interrupted state
     * is reset or not based on the value of ClearInterrupted that is
     * passed.
     */
 private native boolean isInterrupted(boolean ClearInterrupted);
 private native void interrupt0();

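Summary:

interrupt: calls the native method interrupt0 to set the interrupt flag. Presumably the JVM's internal thread structure keeps an interrupted state, and interrupt sets it to true to mark the target thread as interrupted.
interrupted: returns the interrupted state of the current thread and then resets it. If the state was true, it is false afterwards.
isInterrupted: returns the interrupted state of the thread it is called on, without resetting it.

Notes on interrupt

If the target thread is blocked in a waiting call, such as wait(), wait(long), wait(long, int), join(), join(long), join(long, int), sleep(long), or sleep(long, int), its interrupt status is cleared and it receives an InterruptedException.
If the target thread is blocked in an I/O operation on an interruptible channel, the channel is closed, the thread's interrupt status is set, and the thread receives a ClosedByInterruptException.

Notes on interrupted

It returns the interrupt status of the current thread, that is, the thread executing the call, not the status of the Thread instance it appears to be invoked on.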

 public static boolean interrupted() {
        return currentThread().isInterrupted(true);
    }

It is a static method, not an instance method.

Example 1:

  public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(()->{
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                System.out.println("exception:"+Thread.currentThread().isInterrupted());
                e.printStackTrace();

            }
            System.out.println("thread "+Thread.currentThread().getName());
        },"t_1");
        t.start();
        System.out.println("interrupting");
        t.interrupt();
    }


interrupting
exception:false
thread t_1
java.lang.InterruptedException: sleep interrupted
	at java.lang.Thread.sleep(Native Method)
	at com.study.jvm.thread.ThreadTest.lambda$main$0(ThreadTest.java:13)
	at java.lang.Thread.run(Thread.java:748)

Example 2: interrupted

   public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(()->{
            for(int i=0;i<1000;i++){
                System.out.println("i="+i);
            }
        },"t_1");
        t.start();
        System.out.println("interrupting");
        t.interrupt();
        Thread.sleep(1);
        // interrupted() is static: called through t it still tests the current (main) thread
        System.out.println("interrupt-1:"+t.interrupted());
        System.out.println("interrupt-2:"+t.interrupted());
        Thread.currentThread().interrupt();
        System.out.println("main-interrupt-1:"+t.interrupted());
        System.out.println("main-interrupt-2:"+t.interrupted());
    }



interrupting
i=0
...
i=80
interrupt-1:false
interrupt-2:false
i=81
i=82
main-interrupt-1:true
...
main-interrupt-2:false
i=114
i=115
...
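The output of example 2 follows from interrupted() being static: t.interrupted() tests the main thread, so it prints false until main interrupts itself, then true once, then false again because the first call cleared the flag. The same behaviour can be reproduced deterministically on a single thread. The sketch below is illustrative; the class and method names are made up for this example.

```java
import java.util.Arrays;

// Sketch: isInterrupted() only reads the flag, Thread.interrupted() reads and clears it.
// InterruptFlagDemo is an illustrative name.
final class InterruptFlagDemo {
    static boolean[] probe() {
        Thread.currentThread().interrupt();                      // set the flag on the calling thread
        boolean first = Thread.currentThread().isInterrupted();  // true, flag left untouched
        boolean second = Thread.interrupted();                   // true, and the flag is cleared
        boolean third = Thread.interrupted();                    // false, already cleared
        boolean fourth = Thread.currentThread().isInterrupted(); // false
        return new boolean[] { first, second, third, fourth };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(probe())); // [true, true, false, false]
    }
}
```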

2.1.2.3. Stopping a thread gracefully

Example 1: a loop

public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(()->{
            for(int i=0;i<1000;i++){
                if(Thread.currentThread().isInterrupted()){
                    System.out.println("interrupt detected");
                    break;
                }
                System.out.println("i="+i);
            }

        },"t_1");
        t.start();
        Thread.sleep(1);
       t.interrupt();
    }

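Example 2: a thread in a waiting state needs special handling. The InterruptedException clears the interrupt flag, so the catch (InterruptedException) block has to interrupt the thread again to preserve it.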

 public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(()->{
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                e.printStackTrace();
            }
        },"t_1");
        t.start();
        Thread.sleep(1);
       t.interrupt();
    }
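The two examples can be combined into one worker that checks the flag in its loop and restores it when a blocking call throws. This is a minimal sketch: the class name, the 10 ms sleep standing in for real work, and the cleanedUp field are assumptions made for illustration.

```java
// Sketch: a worker that stops cooperatively when interrupted.
// StoppableWorker is an illustrative name.
final class StoppableWorker implements Runnable {
    private volatile boolean cleanedUp;

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                Thread.sleep(10); // stands in for a blocking unit of work
            } catch (InterruptedException e) {
                // sleep cleared the flag when it threw; set it again so the loop condition sees it
                Thread.currentThread().interrupt();
            }
        }
        cleanedUp = true; // release resources here before the thread ends
    }

    boolean cleanedUp() {
        return cleanedUp;
    }

    /** Starts a worker, interrupts it, and reports whether it ran its cleanup. */
    static boolean startAndStop() {
        StoppableWorker worker = new StoppableWorker();
        Thread t = new Thread(worker, "worker");
        t.start();
        t.interrupt();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return worker.cleanedUp();
    }

    public static void main(String[] args) {
        System.out.println("stopped cleanly: " + startAndStop()); // stopped cleanly: true
    }
}
```

Whether the interrupt arrives before the loop starts or during sleep, the loop condition ends up seeing the flag and the thread exits through its normal cleanup path.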

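2.1.2.4. Daemon threads

The Java Virtual Machine exits when the only threads running are all daemon threads.

In other words, once no non-daemon thread is running in the JVM, the JVM exits.

How to mark a thread as a daemon (this must be done before start() is called):

thread.setDaemon(true);

Characteristics of daemon threads:

A daemon thread does not keep the JVM alive: when the last non-daemon thread finishes, the JVM exits and any remaining daemon threads end with it. Non-daemon threads do not have this property.

The garbage-collection threads are typical daemon threads.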
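The ordering constraint on setDaemon is easy to get wrong, so here is a small sketch of it. The class name and the latch that keeps the thread alive are illustrative assumptions; the behaviour shown (IllegalThreadStateException when the thread is already alive) is what the Thread API documents.

```java
import java.util.concurrent.CountDownLatch;

// Sketch: setDaemon must be called before start().
// DaemonDemo is an illustrative name.
final class DaemonDemo {
    /** Returns { isDaemonBeforeStart, lateSetDaemonRejected }. */
    static boolean[] run() {
        CountDownLatch release = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                release.await();           // keep the thread alive for the demonstration
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "background");

        t.setDaemon(true);                 // allowed: the thread has not been started yet
        boolean daemon = t.isDaemon();
        t.start();

        boolean rejected = false;
        try {
            t.setDaemon(false);            // too late: the thread is already alive
        } catch (IllegalThreadStateException e) {
            rejected = true;
        }
        release.countDown();               // let the thread finish
        return new boolean[] { daemon, rejected };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("daemon: " + r[0]);            // daemon: true
        System.out.println("late change rejected: " + r[1]); // late change rejected: true
    }
}
```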

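2.2. Thread safety

2.2.1. What is thread safety

When multiple threads access an object at the same time, if calling the object yields correct results without having to consider how those threads are scheduled and interleaved at run time, without additional synchronization, and without any other coordination on the caller's side, then the object is thread-safe. (Quoted from 深入理解Java虚拟机:JVM高级特性与最佳实践.)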

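2.2.2. Thread safety in the Java language

Data shared between operations in Java can be divided into the following five categories:

Immutable
1. String objects
2. Some subclasses of Number
  1. Long\Double\Float\Integer\Short\Byte\BigDecimal\BigInteger
3. Enums
Absolutely thread-safe
1. Java has practically no absolutely thread-safe classes
Relatively thread-safe
1. Most Java classes that are described as thread-safe (Vector and Hashtable, for example) are relatively thread-safe
Thread-compatible: the object itself is not thread-safe, but it can be used safely from multiple threads if the caller applies synchronization correctly
Thread-hostile: code that cannot be used concurrently in a multithreaded environment, whether or not the caller takes synchronization measures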
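The thread-compatible category is the one most application code lives in. As a sketch, ArrayList is not thread-safe on its own, but it can be shared safely when every access goes through the same lock on the caller's side. The class name and the thread and element counts below are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a thread-compatible class made safe by caller-side synchronization.
// ThreadCompatibleDemo is an illustrative name.
final class ThreadCompatibleDemo {
    /** Fills one ArrayList from several threads and returns its final size. */
    static int fill(int threads, int perThread) {
        List<Integer> list = new ArrayList<>(); // not thread-safe by itself
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    synchronized (list) {       // the caller supplies the synchronization
                        list.add(j);
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        synchronized (list) {
            return list.size();
        }
    }

    public static void main(String[] args) {
        System.out.println("size = " + fill(4, 1000)); // size = 4000
    }
}
```

Removing the synchronized block turns this into a data race: elements can be lost or the list can throw, which is what distinguishes a thread-compatible class from a relatively thread-safe one such as Vector.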