Multithreading: Do You Really Understand Spin_lock? (Part 4)

License: CC 4.0 BY-SA

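This article examines the difference between Pthread_spin_lock and Spin_lock and where each one applies, explains why Spin_lock is meaningless on a single-core non-preemptive kernel, and compares the performance of Spin_lock and Mutex experimentally.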

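1. Before We Begin

If you can already answer the questions below, feel free to skim this article.

What is the difference between Pthread_spin_lock and Spin_lock?
Why is Spin_lock meaningless on a single-core, non-preemptive kernel?
Does an interrupt handler run in kernel mode or in user mode?

If you can answer all three questions, you already know most of what this article covers. The rest of the article uses Pthread_spin_lock to explore how spin locks are used and runs a few experiments with it.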

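2. Basic Concepts

First, don't mix up task scheduling and kernel preemption; the two terms describe things from different angles. Kernel preemption describes whether a process that is currently executing in kernel mode can be scheduled out, while task scheduling, as the term is used here, refers to the scheduling that happens while a process is in user mode. Preemptive and non-preemptive kernels schedule user-mode execution the same way; what separates them is how they treat scheduling while in kernel mode. The following description is taken from Understanding the Linux Kernel, 3rd edition (ULK):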

in nonpreemptive kernels, the current process cannot be replaced unless it is about to switch to User Mode.

The following is excerpted from Wikipedia:

Kernel preemption is a method used mainly in monolithic and hybrid kernels where all or most device drivers are run in kernel space, whereby the scheduler is permitted to forcibly perform a context switch (i.e. preemptively schedule; on behalf of a runnable and higher priority process) on a driver or other part of the kernel during its execution, rather than co-operatively waiting for the driver or kernel function (such as a system call) to complete its execution and return control of the processor to the scheduler.

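This article does not cover the kernel scheduling mechanism, but the basic idea must be clear: in a preemptible kernel, a reschedule can happen in the middle of kernel-mode execution; in a non-preemptive kernel, a reschedule has to wait until the kernel-mode execution path finishes. Don't confuse this with the reschedule that happens when a process uses up its time slice under the scheduling algorithm: for user-mode execution, preemptive and non-preemptive kernels should behave the same. They differ in only one situation: how a reschedule is handled while in kernel mode!
Now back to the first question: what is the difference between Pthread_spin_lock and Spin_lock? The two are distinct yet related: the former is for user-space programming and the latter for kernel-space programming. When developing applications, the former is how we get a spin lock; when writing drivers or other kernel-mode code, the latter is.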

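The second question: why is Spin_lock meaningless on a single-core, non-preemptive kernel?

The material below comes from Quora. The answer is excellent; since it is long, I quote only part of it.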

Time Sliced Scheduling: In this scenario what happens is, Process A acquires the spin lock for resource X and then having exhausted its time slice is swapped for Process B. The Process B attempts to acquire the spin lock and starts busy-waiting for the lock to be free’d. After a while, it completes its time slice and is Process A is swapped back on the CPU and continues processing. This cycle repeats until process A completes its task and releases the lock at which point Process B can now acquire the lock and start working. Do you see how wasteful this scenario was?

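If you'd rather not read the English, here is my summary: some people say a spin lock on a single core will deadlock, but that is not accurate. It depends on the scheduling algorithm and on whether the spin lock is used in kernel mode or in user mode. With the CFS scheduler and a spin lock in user mode, the waiter burns its whole time slice spinning while the holder is off the CPU, which severely degrades system performance. Under some scheduling algorithms, EDF for example, it can end in deadlock.
The summary above still says nothing about preemptive versus non-preemptive kernels. If we use spin_lock in kernel mode, for example in a driver, and the kernel is non-preemptive and the system has a single core, then any real busy-wait is a disaster: nothing can take the CPU away from the spinning code, so the holder never runs again to release the lock. In that configuration a spin lock has no useful work to do, which is why it is called meaningless there.
The third question: does an interrupt handler run in kernel mode or in user mode? The answer is kernel mode, one hundred percent. In addition, if the interrupted process was in user mode, a reschedule can take place when the handler returns.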
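To make "busy-waiting" concrete, here is a minimal sketch of a user-space spin lock built on C11 `atomic_flag`. It is a toy for illustration only, not how glibc implements `pthread_spin_lock` and not the kernel's `spin_lock`; the names `toy_spin_lock`, `toy_spin_unlock`, `toy_worker` and `toy_run` are all made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define TOY_MAX_THREADS 16              /* arbitrary cap for this sketch */

static atomic_flag toy_flag = ATOMIC_FLAG_INIT;   /* clear = unlocked */
static long toy_counter;                          /* protected by toy_flag */

/* Spin until the flag was previously clear, i.e. until we set it ourselves. */
static void toy_spin_lock(atomic_flag *f)
{
    while (atomic_flag_test_and_set_explicit(f, memory_order_acquire))
        ;   /* busy-wait: the CPU stays occupied the whole time */
}

static void toy_spin_unlock(atomic_flag *f)
{
    atomic_flag_clear_explicit(f, memory_order_release);
}

static void *toy_worker(void *arg)
{
    long iters = *(const long *)arg;
    for (long i = 0; i < iters; i++) {
        toy_spin_lock(&toy_flag);
        toy_counter++;
        toy_spin_unlock(&toy_flag);
    }
    return NULL;
}

/* Start nthreads workers that each add iters; return the final counter. */
long toy_run(int nthreads, long iters)
{
    pthread_t tid[TOY_MAX_THREADS];

    assert(nthreads >= 0 && nthreads <= TOY_MAX_THREADS);
    toy_counter = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, toy_worker, &iters);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return toy_counter;
}
```

Calling `toy_run(4, 50000)` should return 200000 when the lock works. The empty loop body in `toy_spin_lock` is the whole point: a waiter never sleeps, so if the holder is off the CPU, the waiter's time slice is simply wasted.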

3. Main API

#include <pthread.h>
int pthread_spin_init(pthread_spinlock_t *lock, int pshared);
int pthread_spin_destroy(pthread_spinlock_t *lock);
Both return: 0 if OK, error number on failure

int pthread_spin_lock(pthread_spinlock_t *lock);
int pthread_spin_trylock(pthread_spinlock_t *lock);
int pthread_spin_unlock(pthread_spinlock_t *lock);
All return: 0 if OK, error number on failure
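Of the calls listed above, `pthread_spin_trylock` is the one that never spins: it either takes the lock or fails immediately with `EBUSY`. A small sketch of that behaviour follows; the helper name `try_once` and its `hold_first` parameter are hypothetical and exist only for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Returns the result of a single pthread_spin_trylock() call.
 * If hold_first is nonzero, the lock is already held when we try.
 */
int try_once(int hold_first)
{
    pthread_spinlock_t lock;
    int rc, result;

    rc = pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);
    assert(rc == 0);

    if (hold_first) {
        rc = pthread_spin_lock(&lock);
        assert(rc == 0);
    }

    /* trylock returns at once instead of busy-waiting */
    result = pthread_spin_trylock(&lock);

    /* Release whatever is held: from lock(), or from a successful trylock(). */
    if (hold_first || result == 0)
        pthread_spin_unlock(&lock);

    pthread_spin_destroy(&lock);
    (void)rc;
    return result;
}
```

With the lock free, `try_once(0)` returns 0; with the lock already held, `try_once(1)` returns `EBUSY`. This makes trylock handy when a thread has something else useful to do instead of spinning.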

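A brief description of how pthread_spin_lock is used:

pthread_spin_init initializes a lock, and pthread_spin_lock acquires it for the calling thread. If the lock is not available, the thread busy-waits, keeping its CPU fully occupied until the lock is acquired.
Under the CFS scheduler, spin_lock may degrade system performance. It is up to us to weigh the cost of a context switch against the cost of spinning.

NOTE: the description above assumes an SMP processor architecture.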
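The full life cycle (init, lock, unlock, destroy) can be sketched as below. This is a minimal example under my own assumptions, not the code used in the experiments: the thread count, iteration count and the names `counter_lock`, `add_worker` and `run_spin_counter` are invented for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

#define NTHREADS 4          /* assumed thread count for this sketch */
#define NITERS   100000L    /* assumed increments per thread */

static pthread_spinlock_t counter_lock;
static long counter;        /* protected by counter_lock */

static void *add_worker(void *arg)
{
    (void)arg;
    for (long i = 0; i < NITERS; i++) {
        pthread_spin_lock(&counter_lock);
        counter++;                      /* keep the critical section tiny */
        pthread_spin_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs NTHREADS workers and returns the final counter value. */
long run_spin_counter(void)
{
    pthread_t tid[NTHREADS];
    int rc;

    /* PTHREAD_PROCESS_PRIVATE: the lock is shared only by threads of this process */
    rc = pthread_spin_init(&counter_lock, PTHREAD_PROCESS_PRIVATE);
    assert(rc == 0);

    counter = 0;
    for (int i = 0; i < NTHREADS; i++) {
        rc = pthread_create(&tid[i], NULL, add_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);

    pthread_spin_destroy(&counter_lock);
    (void)rc;
    return counter;
}
```

If the lock does its job, `run_spin_counter()` returns `NTHREADS * NITERS`. The critical section is a single increment on purpose: a spin lock only pays off when the lock is held for a very short time.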

4. Experiments

What happens if we simply replace the Mutex used in the previous article with a spin_lock?

4.1 Test 1:

[root@localhost ~]# diff -Naur 11_4.c 11_6.c
--- 11_4.c      2018-02-09 17:48:17.463201599 +0800
+++ 11_6.c      2018-02-09 20:09:33.666472377 +0800
@@ -13,7 +13,7 @@
 };
 
 struct foo *fh[NHASH];
-pthread_mutex_t hashlock = PTHREAD_MUTEX_INITIALIZER;
+pthread_spinlock_t spin;
 pthread_mutex_t hashr_mutex = PTHREAD_MUTEX_INITIALIZER;
 pthread_cond_t hashr_cond = PTHREAD_COND_INITIALIZER;
 
@@ -28,12 +28,12 @@
         if ((fp = malloc(sizeof(struct foo))) != NULL)
         {
                 fp->f_id = id;
-                foo_printf(fp);
                 idx = HASH(id);
-                pthread_mutex_lock(&hashlock);
+                pthread_spin_lock(&spin);
                 fp->f_next = fh[idx];
                 fh[idx] = fp;
-                pthread_mutex_unlock(&hashlock);
+                foo_printf(fp);
+                pthread_spin_unlock(&spin);
         }
         return(fp);
 }
@@ -41,7 +41,7 @@
 struct foo * foo_find(int id)
 {
         struct foo * fp;
-        pthread_mutex_lock(&hashlock);
+        pthread_spin_lock(&spin);
         for (fp = fh[HASH(id)]; fp != NULL; fp = fp->f_next)
         {
                 if (fp->f_id == id)
@@ -49,7 +49,7 @@
                         break;
                 }
         }
-        pthread_mutex_unlock(&hashlock);
+        pthread_spin_unlock(&spin);
         return(fp);
 }
 
@@ -57,18 +57,18 @@
 {
         struct foo * fp;
        struct foo * fp1;
-        pthread_mutex_lock(&hashlock);
+        pthread_spin_lock(&spin);
         for (fp = fh[HASH(id)]; fp != NULL; fp = fp->f_next)
         {
                 if (fp->f_id == id)
                 {
-                        if( fp == fh[HASH(id)])
+                        if(fp == fh[HASH(id)])
                         {
                                 fh[HASH(id)] = fp->f_next;
                         }
                         else
                         {
-                            fp1=fh[HASH(id)];
+                                fp1=fh[HASH(id)];
                                 while (fp1->f_next != fp)
                                         fp1 = fp1->f_next;
                                 fp1->f_next = fp->f_next;
@@ -76,7 +76,7 @@
                         break;
                 }
         }
-        pthread_mutex_unlock(&hashlock);
+        pthread_spin_unlock(&spin);
         if(fp == NULL)
                 return 1;
         free(fp);
@@ -124,8 +124,9 @@
         int done[3]={4,4,4};
         pthread_t tid;
         char i=0;
+       pthread_spin_init(&spin,PTHREAD_PROCESS_PRIVATE);
         for( num=0;num<NHASH;num++)
-        {                                                                                                                                                                                                             
+        {
                 err = pthread_create(&tid, NULL, thr_fn,(void *)(num*NHASH));
                 if (err != 0)
                         errx(1, "can’t create thread:%d\n",num);

The result:

[root@localhost ~]# ./11_6
thread 140351228299008, foo_id:0
thread 140351219906304, foo_id:3
thread 140351236589312, foo_id:0
thread 140351142622976, foo_id:6
thread 140351236589312, foo_id:3
thread 140351236589312, foo_id:6

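The test completed normally; a straightforward replacement just worked. We still need a simple timing test, though. I added timing code (not shown, to save space) and compared against the mutex version from the previous experiment. The results:

//spin_lock version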
[root@localhost ~]# ./11_6_1                           
thread 140228922906368, foo_id:0
thread 140228914513664, foo_id:3
thread 140228931196672, foo_id:0
thread 140228906120960, foo_id:6
thread 140228931196672, foo_id:3
thread 140228931196672, foo_id:6
run time:3769

//mutex version
[root@localhost ~]# ./11_6_2
thread 140255847200512, foo_id:0
thread 140255830415104, foo_id:6
thread 140255838807808, foo_id:3
thread 140255855490816, foo_id:0
thread 140255855490816, foo_id:3
thread 140255855490816, foo_id:6
run time:2139

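The Spin_lock version takes even longer to run than the mutex version, so spin_lock is not suited to this situation! A rough explanation is that the cost of busy-waiting on the spin_lock exceeds the cost of the context switches caused by the mutex. Put plainly, the code protected by the spin lock runs for too long; in this version the locked region even includes the foo_printf call.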
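Since the timing code was left out, here is one way such a comparison could be set up. This is a sketch under my own assumptions, not the code behind the numbers above: it times a shared counter rather than the hash-table program, uses `clock_gettime(CLOCK_MONOTONIC)`, and the names `bench_run` and `bench_worker` as well as the thread and iteration counts are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

#define BENCH_THREADS 4         /* assumed for this sketch */
#define BENCH_ITERS   200000L   /* assumed for this sketch */

static pthread_spinlock_t bench_spin;
static pthread_mutex_t bench_mutex = PTHREAD_MUTEX_INITIALIZER;
static long bench_counter;
static int bench_use_spin;      /* set before the workers start */

static void *bench_worker(void *arg)
{
    (void)arg;
    for (long i = 0; i < BENCH_ITERS; i++) {
        if (bench_use_spin) {
            pthread_spin_lock(&bench_spin);
            bench_counter++;
            pthread_spin_unlock(&bench_spin);
        } else {
            pthread_mutex_lock(&bench_mutex);
            bench_counter++;
            pthread_mutex_unlock(&bench_mutex);
        }
    }
    return NULL;
}

/* Returns elapsed wall-clock microseconds; stores the final count in *count. */
long long bench_run(int use_spin, long *count)
{
    pthread_t tid[BENCH_THREADS];
    struct timespec t0, t1;
    int rc;

    bench_use_spin = use_spin;
    bench_counter = 0;
    if (use_spin) {
        rc = pthread_spin_init(&bench_spin, PTHREAD_PROCESS_PRIVATE);
        assert(rc == 0);
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < BENCH_THREADS; i++) {
        rc = pthread_create(&tid[i], NULL, bench_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < BENCH_THREADS; i++)
        pthread_join(tid[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    if (use_spin)
        pthread_spin_destroy(&bench_spin);
    (void)rc;
    *count = bench_counter;
    return ((long long)(t1.tv_sec - t0.tv_sec) * 1000000000LL
            + (t1.tv_nsec - t0.tv_nsec)) / 1000LL;
}
```

Call `bench_run(1, &n)` for the spin lock and `bench_run(0, &n)` for the mutex and compare the two return values. Which one wins depends on the core count, the number of threads and how long the lock is held, so treat any single run as a hint rather than a verdict.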

4.2 Test 2:

[root@localhost ~]# diff -Naur 11_6.c 11_6_3.c
--- 11_6.c      2018-02-09 20:09:33.666472377 +0800
+++ 11_6_3.c    2018-02-09 20:12:12.235815697 +0800
@@ -33,7 +33,7 @@
                 fp->f_next = fh[idx];
                 fh[idx] = fp;
                 foo_printf(fp);
-                pthread_spin_unlock(&spin);
+               // pthread_spin_unlock(&spin); deliberately not releasing the lock
         }
         return(fp);
 }

Result:

[root@localhost ~]# ./11_6_3                                                      
thread 139762245465856, foo_id:0
//the program hangs here

[Figure: output of the top command while the program is hung]
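Three cores are deadlocked, each spinning at full load. One thread called foo_alloc and acquired the lock first, then went on to call pthread_cond_wait and put itself to sleep while still holding it. The other two threads then spin on the lock forever, and the main thread does the same once it calls foo_find. That is why the top output looks the way it does above. As a side observation, this also shows that the unit of scheduling in Linux is the thread; Linux essentially treats threads as processes. For the details, you will need to read up on kernel scheduling.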
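The hang can be reproduced in miniature without actually freezing anything, by letting the waiter give up after a bounded number of tries. The sketch below is my own reduction, not the 11_6_3.c program: a holder thread takes the spin lock and then sleeps in `pthread_cond_wait` while still owning it. All names (`sleepy_holder`, `bounded_spin_lock`, `sleepy_holder_demo`) and the retry limit are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_spinlock_t demo_spin;
static pthread_mutex_t gate_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_cond = PTHREAD_COND_INITIALIZER;
static int holder_ready;    /* holder owns the spin lock */
static int release_now;     /* main allows the holder to release it */

static void *sleepy_holder(void *arg)
{
    (void)arg;
    pthread_spin_lock(&demo_spin);
    pthread_mutex_lock(&gate_mutex);
    holder_ready = 1;
    pthread_cond_broadcast(&gate_cond);
    while (!release_now)    /* asleep here, spin lock still held */
        pthread_cond_wait(&gate_cond, &gate_mutex);
    pthread_mutex_unlock(&gate_mutex);
    pthread_spin_unlock(&demo_spin);
    return NULL;
}

/* Spin at most max_tries times; return 0 on success, EBUSY on giving up. */
static int bounded_spin_lock(pthread_spinlock_t *lock, long max_tries)
{
    for (long i = 0; i < max_tries; i++)
        if (pthread_spin_trylock(lock) == 0)
            return 0;
    return EBUSY;
}

/* out[0]: attempt while the holder sleeps; out[1]: attempt after release. */
void sleepy_holder_demo(int out[2])
{
    pthread_t tid;
    int rc = pthread_spin_init(&demo_spin, PTHREAD_PROCESS_PRIVATE);
    assert(rc == 0);
    holder_ready = release_now = 0;
    rc = pthread_create(&tid, NULL, sleepy_holder, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&gate_mutex);
    while (!holder_ready)   /* wait until the holder owns the lock */
        pthread_cond_wait(&gate_cond, &gate_mutex);
    pthread_mutex_unlock(&gate_mutex);

    out[0] = bounded_spin_lock(&demo_spin, 1000000L);

    pthread_mutex_lock(&gate_mutex);
    release_now = 1;        /* wake the holder so it can unlock */
    pthread_cond_broadcast(&gate_cond);
    pthread_mutex_unlock(&gate_mutex);
    pthread_join(tid, NULL);

    out[1] = bounded_spin_lock(&demo_spin, 1000000L);
    if (out[1] == 0)
        pthread_spin_unlock(&demo_spin);
    pthread_spin_destroy(&demo_spin);
}
```

While the holder sleeps, every one of the waiter's tries fails, so `out[0]` is `EBUSY`; a plain `pthread_spin_lock` in its place would spin forever, which is exactly what pins the cores in Test 2. Once the holder is woken and unlocks, `out[1]` is 0.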

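5. Closing Thoughts

Working with spin_lock made me realize how shallow my fundamentals are. I plan to go back and read the kernel books properly, because some of these questions are basics! I also noticed that driver programming and application-level programming differ quite noticeably. When programming against the POSIX API, you shouldn't have to think at the kernel level. So my conclusion is: when doing POSIX programming, ignore kernel-level characteristics and consider only the characteristics of the API itself. For example, with multithreading, just think of multiple execution flows running at the same time, and don't get lost in scheduling, kernel mode, user mode and the like.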