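The most widely used thread package on Unix is Pthreads, the POSIX standard. Pthreads uses preemptive thread scheduling: any thread in a program may be interrupted by another thread at any moment. As a result, some bugs in applications built with Pthreads depend on timing and are hard to reproduce.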
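Summary of GDB thread-related commands
- info threads: lists information about all current threads
- thread n: switches to thread n, i.e. enters thread n's stack so it can be inspected
- break line_num thread n: stops execution when thread n reaches source line line_num
- break line_num thread n if expression: the same as the previous command, with a condition added to the breakpoint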
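If you suspect a deadlock between threads, you can use gdb to locate it. The procedure is roughly:
- Start the program under gdb, or attach gdb to the already running process
- When the program hangs, interrupt it by pressing Ctrl+C
- Run info threads to see what every thread is doing, then pick out your own worker threads (remember to exclude the main thread and the pthreads manager thread)
- Inspect each worker thread in turn: switch to it with thread n, then use bt (backtrace) to view its stack frames
- Look for frames such as __pthread_wait_for_restart_signal() and lock-related functions; if the source code is available, it is fairly easy to trace these back to the exact line that causes the problem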
Below is a simple example. If the lock and unlock calls in the worker thread are not paired, a deadlock occurs.

    // finds the primes between 2 and n; uses the Sieve of Eratosthenes,
    // deleting all multiples of 2, all multiples of 3, all multiples of 5,
    // etc.; not efficient, e.g. each thread should do deleting for a whole
    // block of values of base before going to nextbase for more

    // usage: sieve n nthreads
    // where nthreads is the number of worker threads

    #include <stdio.h>
    #include <stdlib.h>   // atoi()
    #include <stdint.h>   // intptr_t
    #include <math.h>
    #include <pthread.h>

    #define MAX_N 100000000
    #define MAX_THREADS 100

    // shared variables
    int nthreads,        // number of threads (not counting main())
        n,               // upper bound of range in which to find primes
        prime[MAX_N+1],  // in the end, prime[i] = 1 if i prime, else 0
        nextbase;        // next sieve multiplier to be used

    int work[MAX_THREADS];  // to measure how much work each thread does,
                            // in terms of number of sieve multipliers checked

    // lock for the shared variable nextbase
    pthread_mutex_t nextbaselock = PTHREAD_MUTEX_INITIALIZER;

    // ID structs for the threads
    pthread_t id[MAX_THREADS];

    // "crosses out" all multiples of k, from k*k on
    void crossout(int k)
    {
        int i;

        for (i = k; i*k <= n; i++) {
            prime[i*k] = 0;
        }
    }

    // worker thread routine; arg carries the thread number (0,1,...)
    void *worker(void *arg)
    {
        int tn = (int)(intptr_t)arg;
        int lim, base;

        // no need to check multipliers bigger than sqrt(n)
        lim = sqrt(n);

        do {
            // get next sieve multiplier, avoiding duplication across threads
            pthread_mutex_lock(&nextbaselock);
            base = nextbase += 2;
            // if this unlock is left out, the other threads block forever
            // in pthread_mutex_lock(): that is the deadlock discussed above
            pthread_mutex_unlock(&nextbaselock);
            if (base <= lim) {
                work[tn]++;  // log work done by this thread
                // don't bother with crossing out if base is known to be
                // composite
                if (prime[base])
                    crossout(base);
            }
            else return NULL;
        } while (1);
    }

    int main(int argc, char **argv)
    {
        int nprimes,  // number of primes found
            totwork,  // number of base values checked
            i;
        void *p;

        if (argc < 3) {
            fprintf(stderr, "usage: %s n nthreads\n", argv[0]);
            return 1;
        }
        n = atoi(argv[1]);
        nthreads = atoi(argv[2]);
        if (n < 2 || n > MAX_N || nthreads < 1 || nthreads > MAX_THREADS) {
            fprintf(stderr, "n or nthreads out of range\n");
            return 1;
        }
        for (i = 2; i <= n; i++)
            prime[i] = 1;
        crossout(2);
        nextbase = 1;
        // get threads started
        for (i = 0; i < nthreads; i++) {
            pthread_create(&id[i], NULL, worker, (void *)(intptr_t)i);
        }

        // wait for all done
        totwork = 0;
        for (i = 0; i < nthreads; i++) {
            pthread_join(id[i], &p);
            printf("%d values of base done\n", work[i]);
            totwork += work[i];
        }
        printf("%d total values of base done\n", totwork);

        // report results
        nprimes = 0;
        for (i = 2; i <= n; i++)
            if (prime[i]) nprimes++;
        printf("the number of primes found was %d\n", nprimes);
        return 0;
    }