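GDB is the basic debugging tool on Linux. It feels a bit spartan next to WinDbg, but it is the de facto standard debugger on Linux, so it is the one to learn. I will keep adding to this article as my knowledge builds up; there is no rush.
What I learned this weekend:
gdb, gcc and g++ need to be installed first; I won't go into detail. Most distributions ship them or provide them as packages, and upgrading by hand is painful, so just use the distribution's versions.
- gcc: C toolchain: sudo apt-get install gcc
- g++: C++ toolchain: sudo apt-get install g++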
------------------------------------------
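Lecture 1: Program control commands
GDB command format:
- gdb [option] [exec or process id]
Loading and exiting:
- gdb <program>: load the executable directly when starting gdb
- file <program>: load a program to debug from inside gdb
- kill: terminate the program being debugged
- quit: exit gdb
Program control commands:
- run: start the program
- continue: resume execution until the next breakpoint
- next: execute the next source line, stepping over function calls
- step: execute the next source line, stepping into function calls
A demo program to practice on:
gehello.c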
#include <cstdlib>   // atoi
#include <iostream>
#include <unistd.h>  // sleep
using namespace std;
int main(int argc, char* argv[]) {
    int loops = 100;
    // An optional first argument overrides the number of iterations.
    if (argc > 1)
        loops = atoi(argv[1]);
    for (int i = 0; i < loops; i++) {
        sleep(1);
        cout << "Hello world from gedu!" << endl;
    }
    return 0;
}
The program just sleeps in a loop so that it stays alive long enough to be debugged.
g++ -g -o gegdb gehello.c
gdb gegdb
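r -- run: start the program
ctrl+c -- interrupt the running program and return to the gdb prompt
c -- continue: resume execution
ctrl+c
bt -- backtrace: show the call stack
Each line shows the frame number, the address, and the function.
frame 1 -- switch to frame 1
frame 0 -- switch back to frame 0
disassemble
From left to right: memory address, offset within the function, instruction mnemonic.
stp (store pair) pushes state onto the stack; arm64 stores a pair of registers at a time, here the frame pointer and the link register (lr).
frame 3 -- our own code
l -- list: show the source around the current line
disassemble
The instruction at offset 80 is bl: branch and link.
Run bt again:
The address shown for our frame is where sleep returns to, that is, the instruction right after the bl.
kill pid -- a signal sent from another terminal also stops the program in gdb
kill -s sigint 10310
kill -n 2 10321 -- signal 2 is SIGINT
info signal
Lists every signal and how gdb handles it.
info handle
handle sigint nostop -- the lower-case name is not accepted; use upper case
handle SIGINT nostop
Now ctrl-c no longer stops the program.
kill -s SIGSEGV pid -- send a segmentation fault signal
bt
gdb reports the segmentation fault.
handle SIGINT stop -- ctrl-c stops the program again
bt<Tab> -- Tab completes command names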
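To see these settings from the program's side, it can help to give the demo its own SIGINT handler. The following is a minimal sketch, not part of gehello.c: g_sigint_count, on_sigint and install_sigint_counter are hypothetical names. Whether the handler actually runs under gdb depends on the "Pass to program" column that info signal prints for SIGINT.

```cpp
#include <csignal>

// Number of SIGINTs delivered to this process so far.
// sig_atomic_t is the only integer type that is safe to write from a handler.
volatile std::sig_atomic_t g_sigint_count = 0;

// Signal handler (hypothetical name): only bumps the counter.
extern "C" void on_sigint(int) {
    g_sigint_count = g_sigint_count + 1;
}

// Installs the handler for SIGINT; returns true on success.
bool install_sigint_counter() {
    return std::signal(SIGINT, on_sigint) != SIG_ERR;
}
```

Calling install_sigint_counter() at the top of main and printing g_sigint_count inside the loop shows whether a kill -s SIGINT reached the program or was intercepted by gdb.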
----------------------------------
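Lecture 2: Setting breakpoints
- break <function> or break <line number>: set a breakpoint
- info break: list breakpoints
- delete breakpoints <number>: delete a breakpoint
- clear <function> or clear <line number>: delete the breakpoints at that location (clear takes a location, not a breakpoint number)
- disable breakpoints <number>: disable a breakpoint
- enable breakpoints <number>: re-enable a breakpoint
- save breakpoints aa.txt: save the breakpoints to a text file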
------------------------------------
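Lecture 3: Inspecting variables
- list: show source code
- watch <variable>: stop whenever the variable's value changes
- print <variable>: print the variable's value
- whatis <variable or function>: show its type
- ptype: show the definition of a data structure
- set args: set the program's run arguments
- show args: show the run arguments
gex.c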
#include <iostream>
#include <string>
using namespace std;
struct Person {
    string name;
    int age;
    float height;
};
int main()
{
    int var = 10;
    float floatvar = 3.14f;
    char charvar = 'A';
    string stringvar = "hello";
    int intArray[5] = {1, 2, 3, 4, 5};
    Person person = {"John", 30, 1.85f};  // string literal, not the multi-character constant 'John'
    int *ptrvar = new int(42);
    cout << "integer:" << var << endl;
    cout << "float:" << floatvar << endl;
    cout << "char:" << charvar << endl;
    cout << "string:" << stringvar << endl;
    cout << "person name:" << person.name << endl;
    cout << "pointer value:" << *ptrvar << endl;
    delete ptrvar;  // release the heap allocation
    return 0;
}
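b main -- break at main
i b -- list breakpoints
r -- run
n -- next line
p var -- the initializer has not run yet, so this shows whatever was left on the stack
n
p var -- now it holds its value
l -- look at the source
p floatvar
What are the $1, $2, ... in the output? They are entries in gdb's value history: gdb numbers every printed value so it can be referred to later.
p $$ -- the history entry before the most recent one ($ alone is the most recent)
p charvar
p &charvar -- gdb prints a char* as a string, so it shows A followed by whatever bytes come next in memory; here that happened to be \n
p stringvar
p &stringvar[0] -- the character buffer of the STL string
pt stringvar -- ptype: the variable's type, std::string
p intArray
p intArray[0]
p intArray[1]@2 -- print 2 elements starting at index 1; @ selects a range
p person
p person.name
p ptrvar
info locals -- show all local variables
p *ptrvar
bt -- the line shown for the current frame is the one about to execute
n
p *ptrvar -- correct once the line has executed
p ptrvar -- the pointer value has changed too
display var -- unlike print, which prints once, display is like a picture hung on the wall or a shop window: the value is shown again every time the program stops.
Useful for variables that change often.
Commonly used: display $pc -- the program counter changes constantly as execution moves along.
Modifying variable values
set variable var=15
p var
info locals
set floatvar=5
set var charvar='b'
p $pc -- show the current program counter
l -- list the source
bt -- shows the current line number
Modifying a variable again
set var var=1588
p var
display
info display -- what is currently in the shop window
Examining memory: look at the raw bytes
x /4xw &var -- look at raw data on the stack. 4 is the number of units, x means hexadecimal, w means word. In gdb a word is always 4 bytes (b is 1 byte, h is 2, g is 8) regardless of platform, unlike the Windows WORD type, which is 16 bits.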
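What x/4xw prints can be reproduced inside the program, which makes the unit size concrete. This is a rough sketch assuming a 4-byte unit, as with gdb's w; dump_words is a hypothetical helper, not a gdb or library function.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Formats `count` 4-byte words starting at `addr` as hex, separated by spaces,
// similar in spirit to the data columns of gdb's x/<count>xw.
std::string dump_words(const void* addr, std::size_t count) {
    std::string out;
    const unsigned char* p = static_cast<const unsigned char*>(addr);
    for (std::size_t i = 0; i < count; ++i) {
        std::uint32_t word;
        // memcpy reads the bytes in native order without aliasing problems.
        std::memcpy(&word, p + i * sizeof word, sizeof word);
        char buf[16];
        std::snprintf(buf, sizeof buf, "0x%08x", static_cast<unsigned>(word));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

Printing dump_words(&var, 4) from the program and comparing it with x /4xw &var in gdb should show the same four words, as long as both look at the same address.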
----------------------------------
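Lecture 4: Generating core files
Core files are not generated by default; two settings need to change:
1) cat /proc/sys/kernel/core_pattern
If the value starts with a pipe, |xxxx, the kernel pipes the core dump to that program instead of writing a core file. Change it as follows:
sudo su
echo "core-%e-%t" >/proc/sys/kernel/core_pattern
2) ulimit -c
0 -- by default no core file is written; change it to:
ulimit -c unlimited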
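ulimit -c reads and sets the shell's RLIMIT_CORE resource limit, and a program can do the same for itself through getrlimit and setrlimit. A minimal sketch under that assumption; core_dumps_enabled and enable_core_dumps are hypothetical helper names.

```cpp
#include <sys/resource.h>

// Returns true if the soft core-size limit is non-zero,
// i.e. the kernel is allowed to write a core for this process.
bool core_dumps_enabled() {
    rlimit lim{};
    if (getrlimit(RLIMIT_CORE, &lim) != 0) return false;
    return lim.rlim_cur != 0;
}

// Raises the soft limit to the hard limit (the most an unprivileged
// process may ask for). Returns true on success.
bool enable_core_dumps() {
    rlimit lim{};
    if (getrlimit(RLIMIT_CORE, &lim) != 0) return false;
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Calling enable_core_dumps() early in main has an effect only for that process and its children; the core_pattern setting from step 1 still decides where the dump goes.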
Analyzing a core file
Create a null pointer dereference, gearr.c:
#include <iostream>
int main() {
    int* ptr = nullptr;
    *ptr = 10;  // write through a null pointer: SIGSEGV
    return 0;
}
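After loading the core file in gdb, bt shows only one line, which is not very informative; bt -past-main on shows more:
Inspect a variable: p <variable>
Show 10 instructions starting at the program counter: x/10i $pc
$pc is the address of the current instruction.
Show registers: i r
Show every thread's stack: thread apply all bt
There is only one thread here, so the output is unexciting. Real problems often involve many threads, and this is the classic way to print all of their stacks.
Another exercise, an out-of-bounds array write: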
/* array out of bound */
#include <stdio.h>
int main()
{
int arr[5]={1,2,3,4,5};
printf("indx 4=%d\n", arr[4]);
printf("now attemp illegal access...\n");
arr[10]=666;
return 0;
}
~/gegdb$ g++ -g -ogearr2 gearr2.c
~/gegdb$ ./gearr2
indx 4=5
now attemp illegal access...
Segmentation fault (core dumped)
~/gegdb$
This time the backtrace shows no symbols:
(gdb) bt
#0 0x00007fb70000029a in ?? ()
#1 0x00007ffd14a0c0c0 in ?? ()
#2 0x00007ffd14a0c198 in ?? ()
#3 0x00000001f7b81040 in ?? ()
#4 0x00005577f7b82189 in frame_dummy ()
#5 0x00007ffd14a0c198 in ?? ()
#6 0xfee4326495406c11 in ?? ()
#7 0x0000000000000001 in ?? ()
#8 0x0000000000000000 in ?? ()
(gdb) x/10i $pc
=> 0x7fb70000029a: Cannot access memory at address 0x7fb70000029a
The stack has been corrupted. Dump the data on the stack:
(gdb) x/16gx $sp
0x7ffd14a0c080: 0x00007ffd14a0c0c0 0x00007ffd14a0c198
0x7ffd14a0c090: 0x00000001f7b81040 0x00005577f7b82189
0x7ffd14a0c0a0: 0x00007ffd14a0c198 0xfee4326495406c11
0x7ffd14a0c0b0: 0x0000000000000001 0x0000000000000000
0x7ffd14a0c0c0: 0x00005577f7b84db0 0x00007fb70d6a9000
0x7ffd14a0c0d0: 0xfee4326496606c11 0xfe7001ab14626c11
0x7ffd14a0c0e0: 0x00007ffd00000000 0x0000000000000000
0x7ffd14a0c0f0: 0x0000000000000000 0x0000000000000001
Keep reading further up the stack:
(gdb) x/100ga
0x7ffd14a0c100: 0x7ffd14a0c190 0xd1a13c2606089400
0x7ffd14a0c110: 0x7ffd14a0c170 0x7fb70d47028b <__libc_start_main_impl+139>
0x7ffd14a0c120: 0x7ffd14a0c1a8 0x5577f7b84db0
0x7ffd14a0c130: 0x7ffd14a0c1a8 0x5577f7b82189 <main()>
0x7ffd14a0c140: 0x0 0x0
0x7ffd14a0c150: 0x5577f7b820a0 <_start> 0x7ffd14a0c190
0x7ffd14a0c160: 0x0 0x0
0x7ffd14a0c170: 0x0 0x5577f7b820c5 <_start+37>
0x7ffd14a0c180: 0x7ffd14a0c188 0x38
0x7ffd14a0c190: 0x1 0x7ffd14a0e0b8
0x7ffd14a0c1a0: 0x0 0x7ffd14a0e0c1
0x7ffd14a0c1b0: 0x7ffd14a0e0d1 0x7ffd14a0e11d
0x7ffd14a0c1c0: 0x7ffd14a0e130 0x7ffd14a0e144
The value of the program counter:
(gdb) p $pc
$1 = (void (*)(void)) 0x7fb70000029a
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
0x00007fb70d46e800 0x00007fb70d5f5c8d Yes /lib/x86_64-linux-gnu/libc.so.6
0x00007fb70d672000 0x00007fb70d69c195 Yes /lib64/ld-linux-x86-64.so.2
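The program counter is not inside any code range (compare the From/To columns above); execution has run wild. A common cause is a stack overflow that overwrote a return address.
Quit, then single-step gearr2 in a live session.
arr[10] overlaps main's return address, which is what gets overwritten.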
(gdb) p &arr[10]
$2 = (int *) 0x7fffffffdbe8
(gdb) x/gx &arr[10]
0x7fffffffdbe8: 0x00007ffff7dbe1ca
(gdb) info symbol 0x00007ffff7dbe1ca
__libc_start_call_main + 122 in section .text of /lib/x86_64-linux-gnu/libc.so.6
(gdb) x/10i 0x00007ffff7dbe1ca
0x7ffff7dbe1ca <__libc_start_call_main+122>: mov %eax,%edi
0x7ffff7dbe1cc <__libc_start_call_main+124>: call 0x7ffff7ddbb90 <__GI_exit>
0x7ffff7dbe1d1 <__libc_start_call_main+129>:
call 0x7ffff7e2d280 <__GI___nptl_deallocate_tsd>
0x7ffff7dbe1d6 <__libc_start_call_main+134>:
lock subl $0x1,0x1d8ef2(%rip) # 0x7ffff7f970d0 <__nptl_nthreads>
0x7ffff7dbe1de <__libc_start_call_main+142>:
je 0x7ffff7dbe1f0 <__libc_start_call_main+160>
0x7ffff7dbe1e0 <__libc_start_call_main+144>: mov $0x3c,%edx
0x7ffff7dbe1e5 <__libc_start_call_main+149>: nopl (%rax)
0x7ffff7dbe1e8 <__libc_start_call_main+152>: xor %edi,%edi
0x7ffff7dbe1ea <__libc_start_call_main+154>: mov %edx,%eax
0x7ffff7dbe1ec <__libc_start_call_main+156>: syscall
(gdb) x/20ga $sp
0x7fffffffdbc0: 0x200000001 0x400000003
0x7fffffffdbd0: 0x7fff00000005 0x8931b396f301d300
0x7fffffffdbe0: 0x7fffffffdc80 0x7ffff7dbe1ca <__libc_start_call_main+122>
0x7fffffffdbf0: 0x7fffffffdc30 0x7fffffffdd08
0x7fffffffdc00: 0x155554040 0x555555555189 <main()>
0x7fffffffdc10: 0x7fffffffdd08 0x7c20670a3ee608f
0x7fffffffdc20: 0x1 0x0
0x7fffffffdc30: 0x555555557db0 0x7ffff7ffd000 <_rtld_global>
0x7fffffffdc40: 0x7c20670ad0e608f 0x7c21638d9ec608f
0x7fffffffdc50: 0x7fff00000000 0x0
(gdb) n
9 return 0;
(gdb) x/20ga $sp
0x7fffffffdbc0: 0x200000001 0x400000003
0x7fffffffdbd0: 0x7fff00000005 0x8931b396f301d300
0x7fffffffdbe0: 0x7fffffffdc80 0x7fff0000029a
0x7fffffffdbf0: 0x7fffffffdc30 0x7fffffffdd08
0x7fffffffdc00: 0x155554040 0x555555555189 <main()>
0x7fffffffdc10: 0x7fffffffdd08 0x7c20670a3ee608f
0x7fffffffdc20: 0x1 0x0
0x7fffffffdc30: 0x555555557db0 0x7ffff7ffd000 <_rtld_global>
0x7fffffffdc40: 0x7c20670ad0e608f 0x7c21638d9ec608f
0x7fffffffdc50: 0x7fff00000000 0x0
(gdb) p/x 666
$3 = 0x29a
(gdb) bt -past-main on
#0 main () at gearr2.c:9
#1 0x00007fff0000029a in ?? ()
#2 0x00007fffffffdc30 in ?? ()
(gdb) x/10i $pc
=> 0x5555555551f6 <main()+109>: mov $0x0,%eax
0x5555555551fb <main()+114>: mov -0x8(%rbp),%rdx
0x5555555551ff <main()+118>: sub %fs:0x28,%rdx
0x555555555208 <main()+127>: je 0x55555555520f <main()+134>
0x55555555520a <main()+129>: call 0x555555555080 <__stack_chk_fail@plt>
0x55555555520f <main()+134>: leave
0x555555555210 <main()+135>: ret
0x555555555211: add %al,(%rax)
0x555555555213: add %dh,%bl
0x555555555215 <_fini+1>: nop %edx
(gdb) disp/3i $pc
1: x/3i $pc
=> 0x5555555551f6 <main()+109>: mov $0x0,%eax
0x5555555551fb <main()+114>: mov -0x8(%rbp),%rdx
0x5555555551ff <main()+118>: sub %fs:0x28,%rdx
(gdb) ni
10 }
1: x/3i $pc
=> 0x5555555551fb <main()+114>: mov -0x8(%rbp),%rdx
0x5555555551ff <main()+118>: sub %fs:0x28,%rdx
0x555555555208 <main()+127>: je 0x55555555520f <main()+134>
(gdb) ni
0x00005555555551ff 10 }
1: x/3i $pc
=> 0x5555555551ff <main()+118>: sub %fs:0x28,%rdx
0x555555555208 <main()+127>: je 0x55555555520f <main()+134>
0x55555555520a <main()+129>: call 0x555555555080 <__stack_chk_fail@plt>
(gdb)
0x0000555555555208 10 }
1: x/3i $pc
=> 0x555555555208 <main()+127>: je 0x55555555520f <main()+134>
0x55555555520a <main()+129>: call 0x555555555080 <__stack_chk_fail@plt>
0x55555555520f <main()+134>: leave
(gdb)
0x000055555555520f 10 }
1: x/3i $pc
=> 0x55555555520f <main()+134>: leave
0x555555555210 <main()+135>: ret
0x555555555211: add %al,(%rax)
(gdb)
0x0000555555555210 10 }
1: x/3i $pc
=> 0x555555555210 <main()+135>: ret
0x555555555211: add %al,(%rax)
0x555555555213: add %dh,%bl
(gdb) x/1gx $sp
0x7fffffffdbe8: 0x00007fff0000029a
(gdb) ni
0x00007fff0000029a in ?? ()
1: x/3i $pc
=> 0x7fff0000029a: <error: Cannot access memory at address 0x7fff0000029a>
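The corrupted return address can be checked with plain arithmetic. The sketch below assumes a little-endian x86-64 target with a 4-byte int, as in the session above; element_address and overwrite_low_int are hypothetical helpers written only for this check.

```cpp
#include <cstddef>
#include <cstdint>

// Address of arr[index] for an int array that starts at `base`.
std::uint64_t element_address(std::uint64_t base, std::size_t index) {
    return base + index * sizeof(std::int32_t);
}

// Models a 4-byte int store into an 8-byte stack slot. On a little-endian
// machine the int lands in the low 32 bits; the high 32 bits are untouched.
std::uint64_t overwrite_low_int(std::uint64_t slot, std::int32_t value) {
    const std::uint64_t high = slot & 0xffffffff00000000ULL;
    return high | static_cast<std::uint32_t>(value);
}
```

With the values from the session, element_address(0x7fffffffdbc0, 10) gives 0x7fffffffdbe8, the slot that x/gx &arr[10] showed, and overwrite_low_int(0x00007ffff7dbe1ca, 666) gives 0x00007fff0000029a, the address the ret jumped to.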
----------------------------------------------------
Lecture 5: Multithreaded debugging
Example program that starts two threads:
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
int x=0,y=0;
pthread_t pthid1,pthid2;
void *pth1_main(void *arg);
void *pth2_main(void *arg);
int main()
{
if(pthread_create(&pthid1,NULL,pth1_main,(void*)0)!=0)
{
printf("create pthid1 failed\n");
return -1;
}
if(pthread_create(&pthid2,NULL,pth2_main,(void*)0)!=0)
{
printf("create pthid2 failed\n");
return -1;
}
printf("thread1\n");
pthread_join(pthid1,NULL);
printf("thread2\n");
pthread_join(pthid2,NULL);
printf("waiting...\n");
return 0;
}
void *pth1_main(void *arg)
{
for(x=0;x<100;x++)
{
printf("x=%d\n",x);
sleep(1);
}
pthread_exit(NULL);
}
void *pth2_main(void *arg)
{
for(y=0;y<100;y++)
{
printf("y=%d\n",y);
sleep(1);
}
pthread_exit(NULL);
}
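ps aux | grep gethread -- find the process
ps -aL | grep gethread -- list its threads
pstree -p pid -- show the relationship between the main thread and the child threads
gdb -p pid
info threads -- the * marks gdb's current thread, the one being examined; bt, for example, applies to it
In the main thread, frame 3 is the join: it waits for the other two threads to finish before continuing.
Frame 1: wait_common
thread 2
info threads -- the * is now on thread 2
frame 3 -- source for frame 3, thread_start
frame 1 -- source for frame 1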
thread apply all bt
b 32
b 13
b 41
i b
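gdb was not stopped inside our own program, so the line numbers resolved to a system source file.
Run b main first and set the other breakpoints after stopping there; then the line numbers refer to our own program.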
clear
b main
r
b 13
b 32
b 41
r
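set scheduler-locking on -- only the current thread (thread 2 here) runs, so you can concentrate on debugging one thread.
watch y
Thread 2 is the one executing now.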
thread 2
bt
frame 3
p x
set scheduler-locking off -- all threads run again
thread apply 3 n
This single-steps thread 3, but the other threads actually run during the step as well.
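For practicing thread, watch and scheduler-locking without waiting on sleep(1), a shorter-running variant of the same structure can be handy. This is a sketch using std::thread rather than the pthread calls of gethread.c; count_to and run_two_counters are hypothetical names, and each thread writes only its own counter.

```cpp
#include <functional>
#include <thread>

int x = 0, y = 0;  // one counter per thread, as in gethread.c

// Counts `counter` from 0 up to `limit`, yielding so the threads interleave.
void count_to(int& counter, int limit) {
    for (counter = 0; counter < limit; ++counter) {
        std::this_thread::yield();
    }
}

// Starts two counting threads, waits for both, and returns x + y.
int run_two_counters(int limit) {
    std::thread t1(count_to, std::ref(x), limit);
    std::thread t2(count_to, std::ref(y), limit);
    t1.join();  // the caller blocks here, like pthread_join in main
    t2.join();
    return x + y;
}
```

Built with g++ -g -pthread and called from main, a breakpoint inside count_to stops once per thread, and watch y fires only for the second thread's counter.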
Measuring the run time
time ./gethread
qwq:~/gegdb$ ps aux|grep gethread
xxx 12264 0.0 0.0 19072 1036 pts/0 Sl+ 22:15 0:00 ./gethread
xxx 12268 0.0 0.0 12260 2188 pts/1 S+ 22:15 0:00 grep --color=auto gethread
qwq:~/gegdb$ ps -aL|grep gethread
12264 12264 pts/0 00:00:00 gethread
12264 12265 pts/0 00:00:00 gethread
12264 12266 pts/0 00:00:00 gethread
@qwq:~/gegdb$ pstree 12264
@qwq:~/gegdb$ pstree -p 12264
@qwq:~/gegdb$ ps -aL|grep gethread
12285 12285 pts/0 00:00:00 gethread
12285 12286 pts/0 00:00:00 gethread
12285 12287 pts/0 00:00:00 gethread
@qwq:~/gegdb$ pstree -p 12285
gethread(12285)─┬─{gethread}(12286)
└─{gethread}(12287)
qwq:~/gegdb$ gdb -p 12303
@qwq:~/gegdb$ sudo gdb -p 12321
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7fdb36f68740 (LWP 12321) "gethread" 0x00007fdb37003d61 in __futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265, expected=12322,
futex_word=0x7fdb36f67990) at ./nptl/futex-internal.c:57
2 Thread 0x7fdb367666c0 (LWP 12323) "gethread" 0x00007fdb37057adf in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7fdb36765e70,
rem=rem@entry=0x7fdb36765e70) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
3 Thread 0x7fdb36f676c0 (LWP 12322) "gethread" 0x00007fdb37057adf in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7fdb36f66e70,
rem=rem@entry=0x7fdb36f66e70) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
Stacks of all threads:
(gdb) thread apply all bt
Thread 3 (Thread 0x7fdb36f676c0 (LWP 12322) "gethread"):
#0 0x00007fdb37057adf in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7fdb36f66e70, rem=rem@entry=0x7fdb36f66e70) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
#1 0x00007fdb37064a27 in __GI___nanosleep (req=req@entry=0x7fdb36f66e70, rem=rem@entry=0x7fdb36f66e70) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2 0x00007fdb37079c63 in __sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
#3 0x0000563bd815530c in pth1_main (arg=0x0) at gethread.c:35
#4 0x00007fdb37007a94 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#5 0x00007fdb37094c3c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
Thread 2 (Thread 0x7fdb367666c0 (LWP 12323) "gethread"):
#0 0x00007fdb37057adf in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7fdb36765e70, rem=rem@entry=0x7fdb36765e70) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
#1 0x00007fdb37064a27 in __GI___nanosleep (req=req@entry=0x7fdb36765e70, rem=rem@entry=0x7fdb36765e70) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2 0x00007fdb37079c63 in __sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
#3 0x0000563bd8155372 in pth2_main (arg=0x0) at gethread.c:44
#4 0x00007fdb37007a94 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#5 0x00007fdb37094c3c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
Thread 1 (Thread 0x7fdb36f68740 (LWP 12321) "gethread"):
#0 0x00007fdb37003d61 in __futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265, expected=12322, futex_word=0x7fdb36f67990) at ./nptl/futex-internal.c:57
#1 __futex_abstimed_wait_common (cancel=true, private=128, abstime=0x0, clockid=0, expected=12322, futex_word=0x7fdb36f67990) at ./nptl/futex-internal.c:87
#2 __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7fdb36f67990, expected=12322, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=128) at ./nptl/futex-internal.c:139
#3 0x00007fdb37009793 in __pthread_clockjoin_ex (threadid=140579496687296, thread_return=0x0, clockid=0, abstime=0x0, block=<optimized out>) at ./nptl/pthread_join_common.c:102
#4 0x0000563bd8155291 in main () at gethread.c:24