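Deadlock Detection and Memory Leaks
Deadlock Detection
Deadlock
A deadlock occurs when concurrently running threads each hold a resource while waiting for one held by another, so none of them can make progress. A thread or process that stays in the BLOCKED state for a long time is a possible sign of deadlock.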
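Preventing Deadlock
1. Avoid the circular-wait condition: acquire resources in a fixed order so that no wait cycle can form.
2. Use timeouts: set a reasonable timeout when trying to acquire a lock; if it expires, release the resources already held.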
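The two prevention rules can be sketched in C++ as follows. This is a minimal illustration, not code from the program discussed later: the names `account_a`, `account_b`, `with_both_locked`, and `try_with_timeout` are hypothetical, and ordering locks by address is just one possible choice of fixed order.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical shared state guarded by two locks.
std::mutex account_a;
std::mutex account_b;

// Rule 1: fixed lock order. Whatever order the caller names the mutexes in,
// the one at the lower address is locked first, so two threads can never
// each hold one lock while waiting for the other.
void with_both_locked(std::mutex& x, std::mutex& y,
                      const std::function<void()>& work) {
    assert(&x != &y);  // locking the same std::mutex twice is undefined behavior
    std::mutex* first = &x;
    std::mutex* second = &y;
    if (std::less<std::mutex*>{}(second, first)) {
        std::swap(first, second);
    }
    std::lock_guard<std::mutex> g1(*first);
    std::lock_guard<std::mutex> g2(*second);
    work();
}

// Rule 2: timeouts. Each acquisition waits at most `timeout`. If the second
// lock cannot be taken in time, the function returns false and the
// unique_lock destructor releases the first lock, so the caller can retry.
bool try_with_timeout(std::timed_mutex& first, std::timed_mutex& second,
                      std::chrono::milliseconds timeout,
                      const std::function<void()>& work) {
    std::unique_lock<std::timed_mutex> l1(first, timeout);  // uses try_lock_for
    if (!l1.owns_lock()) {
        return false;
    }
    std::unique_lock<std::timed_mutex> l2(second, timeout);
    if (!l2.owns_lock()) {
        return false;  // l1 is released here
    }
    work();
    return true;
}
```

With `with_both_locked`, one thread may call it as `(account_a, account_b)` and another as `(account_b, account_a)` without risk of a wait cycle. With `try_with_timeout`, a `false` result means no lock is held on return, so the caller is free to back off and try again.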
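Detecting Deadlock
For a program or process that is already running, external tools can be used to observe the process state and its system calls, and to check whether any thread is stuck in a lock operation:
Debugging with gdb:
1. Run info threads to list every thread together with its current frame, and look for threads blocked on a mutex.
2. Run thread thread_id to switch to a thread, then bt or where to print its call stack.
3. Look for threads waiting on a lock or condition variable; if several threads are each waiting for a resource held by another of them, the program is deadlocked.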
(gdb) info thread
Id Target Id Frame
* 1 Thread 0x7ffff7a563c0 (LWP 55512) "deadlock" __futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0,
futex_word=0x7ffff7a52910) at ./nptl/futex-internal.c:57
2 Thread 0x7ffff7a52640 (LWP 55515) "deadlock" futex_wait (private=0, expected=2, futex_word=0x55555555a1a0 <mutex2>)
at ../sysdeps/nptl/futex-internal.h:146
3 Thread 0x7ffff7251640 (LWP 55516) "deadlock" futex_wait (private=0, expected=2, futex_word=0x55555555a1e0 <mutex3>)
at ../sysdeps/nptl/futex-internal.h:146
4 Thread 0x7ffff6a50640 (LWP 55517) "deadlock" futex_wait (private=0, expected=2, futex_word=0x55555555a160 <mutex1>)
at ../sysdeps/nptl/futex-internal.h:146
(gdb) thread 2
[Switching to thread 2 (Thread 0x7ffff7a52640 (LWP 55515))]
#0 futex_wait (private=0, expected=2, futex_word=0x55555555a1a0 <mutex2>) at ../sysdeps/nptl/futex-internal.h:146
146 ../sysdeps/nptl/futex-internal.h: No such file or directory.
(gdb) bt
#0 futex_wait (private=0, expected=2, futex_word=0x55555555a1a0 <mutex2>) at ../sysdeps/nptl/futex-internal.h:146
#1 __GI___lll_lock_wait (futex=futex@entry=0x55555555a1a0 <mutex2>, private=0) at ./nptl/lowlevellock.c:49
#2 0x00007ffff7bd6002 in lll_mutex_lock_optimized (mutex=0x55555555a1a0 <mutex2>) at ./nptl/pthread_mutex_lock.c:48
#3 ___pthread_mutex_lock (mutex=0x55555555a1a0 <mutex2>) at ./nptl/pthread_mutex_lock.c:93
#4 0x000055555555583a in __gthread_mutex_lock (__mutex=0x55555555a1a0 <mutex2>) at /usr/include/x86_64-linux-gnu/c++/11/b
#5 0x00005555555559b0 in std::mutex::lock (this=0x55555555a1a0 <mutex2>) at /usr/include/c++/11/bits/std_mutex.h:100
#6 0x0000555555555a2a in std::lock_guard<std::mutex>::lock_guard (this=0x7ffff7a51d68, __m=...) at /usr/include/c++/11/bi
#7 0x00005555555553a2 in FuncA () at deadlock.cpp:14
#8 0x00005555555564cb in std::__invoke_impl<void, void (*)()> (__f=@0x55555556ceb8: 0x555555555334 <FuncA()>) at /usr/inc
#9 0x0000555555556477 in std::__invoke<void (*)()> (__fn=@0x55555556ceb8: 0x555555555334 <FuncA()>) at /usr/include/c++/1
#10 0x0000555555556418 in std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul> (this=0x55555556ceb8)
at /usr/include/c++/11/bits/std_thread.h:259
#11 0x00005555555563e8 in std::thread::_Invoker<std::tuple<void (*)()> >::operator() (this=0x55555556ceb8) at /usr/include
#12 0x00005555555563c8 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> > >::_M_run (this=0x555555
at /usr/include/c++/11/bits/std_thread.h:211
#13 0x00007ffff7e63253 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#14 0x00007ffff7bd2ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#15 0x00007ffff7c64850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
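As shown above, info thread lists all threads in the process. Threads 2, 3, and 4 are each blocked in futex_wait on a different mutex: thread 2 on mutex2 (0x55555555a1a0), thread 3 on mutex3 (0x55555555a1e0), and thread 4 on mutex1 (0x55555555a160).
After switching to thread 2 and printing its call stack, the backtrace shows that the thread is blocked in FuncA, at line 14 of deadlock.cpp, while trying to lock mutex2 through a std::lock_guard. Repeating this for threads 3 and 4, and working out which thread holds each mutex, confirms whether the waits form a cycle; with glibc and libstdc++ the owner's LWP id can usually be read with a command such as p mutex2._M_mutex.__data.__owner.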
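The manual reasoning in step 3 (follow each blocked thread to the owner of the mutex it waits for, and see whether the chain returns to where it started) can be sketched as a small C++ helper. The `BlockedThread` type and `find_wait_cycle` function are hypothetical names introduced for illustration. The "waits for" data matches the gdb session above, but the ownership table has to be filled in by hand, since the output shown does not say which thread holds which mutex.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// What is known about one blocked thread after inspecting it in gdb.
struct BlockedThread {
    int thread_id;          // gdb thread number
    std::string waits_for;  // name of the mutex the thread is blocked on
};

// Returns the thread ids that form a wait cycle, or an empty vector if the
// recorded waits do not close into a cycle. `owner_of` maps a mutex name to
// the thread currently holding it.
std::vector<int> find_wait_cycle(const std::vector<BlockedThread>& blocked,
                                 const std::map<std::string, int>& owner_of) {
    // waits_on[t] is the thread that owns the mutex thread t is blocked on.
    std::map<int, int> waits_on;
    for (const BlockedThread& b : blocked) {
        auto it = owner_of.find(b.waits_for);
        if (it != owner_of.end()) {
            waits_on[b.thread_id] = it->second;
        }
    }
    // A blocked thread waits on exactly one mutex, so every node has at most
    // one outgoing edge and a cycle can be found by following the chain.
    for (const auto& entry : waits_on) {
        const int start = entry.first;
        std::vector<int> path;
        std::set<int> seen;
        int cur = start;
        while (waits_on.count(cur) != 0 && seen.count(cur) == 0) {
            seen.insert(cur);
            path.push_back(cur);
            cur = waits_on.at(cur);
        }
        if (cur == start) {  // the chain led back to where it began
            assert(!path.empty());
            return path;
        }
    }
    return {};
}
```

For example, if thread 2 is assumed to hold mutex1, thread 3 to hold mutex2, and thread 4 to hold mutex3, then combining that table with the three waits seen in gdb gives the cycle 2, 3, 4, which is a deadlock. If any owner turns out to be a thread that is not blocked, no cycle is reported.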
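The pstack command:
1. pstack prints the call stacks of all threads in a process. Take several snapshots a short time apart and check whether the stacks of some threads never change, and whether those threads are waiting for resources held by each other.
2. pstack pid prints the stacks for the process with the given process ID.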
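The strace command:
1. strace is a powerful system call tracer; it traces the system calls a process makes and the signals it receives.
2. When using strace to look for a deadlock, focus on the futex system call, which is commonly used to implement thread synchronization primitives such as pthread_mutex_lock.
3. Run strace -p pid and inspect the output (add -f to trace every thread of the process). If threads are stuck in, or keep retrying, a futex wait that never succeeds and no thread ever releases the lock, a deadlock is likely.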
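Monitoring with perf:
perf is the standard Linux performance analysis tool, used mainly to find performance bottlenecks. It can indicate a deadlock only indirectly, by monitoring thread states and CPU usage: threads that stay blocked and consume almost no CPU time are suspects.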
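valgrind + helgrind:
- valgrind is a memory error detection framework, and helgrind is one of the tools bundled with it; helgrind detects race conditions and potential deadlocks in multithreaded programs.
- helgrind records every lock acquisition and release and checks whether threads take locks in an inconsistent order, which exposes programming errors that can lead to deadlock.
- Usage: valgrind --tool=helgrind ./your_program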
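To see what kind of error this targets, here is a small sketch of a program with an inconsistent lock order. The names `lock_a`, `lock_b`, and `run_inconsistent_order_demo` are hypothetical. The two threads run one after the other, so this particular run cannot deadlock, yet the two functions still take the locks in opposite orders, which is the pattern helgrind's lock-order checking is designed to flag.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex lock_a;
std::mutex lock_b;
int shared_counter = 0;

// Takes lock_a first, then lock_b.
void take_a_then_b() {
    std::lock_guard<std::mutex> ga(lock_a);
    std::lock_guard<std::mutex> gb(lock_b);
    ++shared_counter;
}

// Takes the same two locks in the opposite order. If this ran concurrently
// with take_a_then_b, each thread could end up holding one lock and waiting
// forever for the other.
void take_b_then_a() {
    std::lock_guard<std::mutex> gb(lock_b);
    std::lock_guard<std::mutex> ga(lock_a);
    ++shared_counter;
}

// Runs the two functions on separate threads, strictly one after the other,
// so the program always terminates. Returns the number of increments made.
int run_inconsistent_order_demo() {
    shared_counter = 0;
    std::thread t1(take_a_then_b);
    t1.join();  // t1 has released both locks before t2 starts
    std::thread t2(take_b_then_a);
    t2.join();
    assert(shared_counter == 2);
    return shared_counter;
}
```

Calling `run_inconsistent_order_demo()` from `main`, building with debug information and thread support (for example with `-g -pthread`), and running the binary under `valgrind --tool=helgrind` should produce a lock-order report even though the program exits normally. That is the practical advantage over the attach-and-inspect tools above, which only help once a deadlock has actually happened.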
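Programmatic detection: hook the lock functions to record every lock acquisition and release, then analyze the records to find the thread functions involved in a deadlock.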
// build: gcc -o deadlock deadlock.c -lpthread -ldl
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <stdint.h>
#if 1
typedef unsigned long int uint64;
#define MAX 100

enum Type { PROCESS, RESOURCE };

struct source_type {
    uint64 id;
    enum Type type;
    uint64 lock_id;
    int degress;
};

// One node of an adjacency list.
struct vertex {
    struct source_type s;
    struct vertex *next;
};

struct task_graph {
    struct vertex list[MAX];   // list[i] is the head node for vertex i
    int num;                   // number of entries of list in use
    struct source_type locklist[MAX];
    int lockidx;
    pthread_mutex_t mutex;
};

struct task_graph *tg = NULL;
int path[MAX+1];
int visited[MAX];
int k = 0;
int deadlock = 0;
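// Allocate a detached list node that carries a copy of type.
struct vertex *create_vertex(struct source_type type) {
    struct vertex *tex = (struct vertex *)malloc(sizeof(struct vertex));
    tex->s = type;
    tex->next = NULL;
    return tex;
}

// Return the index in tg->list of the vertex with the same type and id,
// or -1 if there is no such vertex.
int search_vertex(struct source_type type) {
    int i = 0;
    for (i = 0; i < tg->num; i++) {
        if (tg->list[i].s.type == type.type && tg->list[i].s.id == type.id) {
            return i;
        }
    }
    return -1;
}

// Append the vertex to tg->list unless it is already present.
// Note: there is no bounds check, so at most MAX vertices are supported.
void add_vertex(struct source_type type) {
    if (search_vertex(type) == -1) {
        tg->list[tg->num].s = type;
        tg->list[tg->num].next = NULL;
        tg->num++;
    }
}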
int add_edge(struct source_type from, struct source_type to) {
add_vertex(from);
add_vertex(to)