Using gcc and gdb, with a hands-on exercise (bomblab)

Article link: https://blog.youkuaiyun.com/weixin_44520881/article/details/108244830

gcc

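Introduction

GCC (GNU Compiler Collection) is the compiler suite developed by the GNU project. gcc and g++ are its C and C++ compiler commands respectively. (The gcc command can compile C++ source files, but it does not automatically link against the libraries that C++ programs use. For that reason C++ programs are normally compiled and linked with the g++ command, which runs the same compiler back end and adds the C++ library at link time.)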

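Compilation process

Taking a program that prints "hello world" as an example, compilation consists of the following four steps (as illustrated in CSAPP):

Preprocessing, which produces a .i file.
Translating the preprocessed file into assembly language, which produces a .s file. [compiler cc1]
Turning the assembly into object code (machine code), which produces a .o file. [assembler as]
Linking the object code into an executable program. [linker ld]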

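Options

-x language filename: states the source language explicitly instead of inferring it from the file extension, for example gcc -x c hello.jpg. C++ code should still be built with g++.
-c: preprocess, compile and assemble only, producing a .o file from the source without linking.
-S (capital S, frequently used): preprocess and compile only, producing assembly code.
-E: run the preprocessor only. This does not create a file; the result goes to standard output, so redirect it to a file, for example: gcc -E hello.c > pianoapan.txt
-o (frequently used): sets the name of the generated executable; without it the executable gets the unattractive default name a.out.
-O0, -O1, -O2, -O3 (frequently used): four optimization levels. -O0 means no optimization and is the default; -O3 is the highest level.
-g (frequently used): generates debugging information that GNU debuggers such as gdb can use.

For the remaining options, see "GCC参数详解 | 菜鸟教程" (a GCC option reference on runoob).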

gdb

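Debugging

You can start a program inside gdb to debug it, or attach to a program that is already running. There are other ways as well, which I have not looked into yet.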

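Arguments

gdb <program>
program is the executable file, usually located in the current directory.
gdb <program> <PID>
If the program is a service, you can give the process ID it is running under. gdb attaches to that process automatically and debugs it. program should be findable through the PATH environment variable.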

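Commands

Each of the commands below can be shortened to a prefix, for example l, n, r.

list: show the source code
Enter (an empty line): repeat the previous command
run: start the program
break line: set a breakpoint at line number line (usually break first, then run); a function name can be given instead to break at that function.
next: execute one source line, stepping over function calls
p var: print the value of the variable var
continue: a program is paused as soon as gdb attaches to it; to resume it, use continue or c.

There are many more commands, which I will not list one by one. A short test follows.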

A simple session debugging a running program

test.c

#include<stdio.h>


void test() {
    int a = 1;
    int b = 2;
    int t  = 1;
    while(t) {
        a++;
        b++;
    }
}

int main() {
    test();
    return 0;
}

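Compile it into the executable test: gcc -g test.c -o test
With the program running, find its process ID with ps -aux | grep test, then enter gdb test <pid> to attach to the process and debug it.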

Use list to view the source code with its line numbers.
Use next to keep stepping forward and p to inspect the values of variables.

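Disassembly

Syntax

disassemble: with no argument, disassembles the function of the current stack frame. The process must therefore already be running (started inside gdb or attached to) before this form can be used.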
disassemble [Function]
disassemble [Address]
disassemble [Start],[End]
disassemble [Function],+[Length]
disassemble [Address],+[Length]
disassemble /m [...]
disassemble /r [...]

Parameters

Function
Specifies the function to disassemble. If specified, the disassemble command will produce the disassembly output of the entire function.
Address
Specifies the address inside a function to disassemble. Note that when only one address is specified, this command will disassemble the entire function that includes the given address, including the instructions above it.
Start/End
Specifies starting and ending addresses to disassemble. If this form is used, the command won’t disassemble the entire function, but only the instructions between the starting and ending addresses.
Length
Specifies the amount of bytes to disassemble starting from the given address or function.
/m
When this option is specified, the disassemble command will show the source lines that correspond to the disassembled instructions.
/r
When this option is specified, the disassemble command will show the raw byte values of all disassembled instructions.

Note:
If you try to use the disassemble command on code outside any known function, it fails. Use the x/i command instead.

x/5i funcname

See the reference page for the disassemble command.

objdump

objdump -d can also be used for disassembly.

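Assembly code format

The assembly produced by the commands above is in AT&T format, which is the format used by GCC and OBJDUMP. The Microsoft tools I had used before all use Intel format instead. With the following command line, GCC produces Intel-format code for the multstore function.
linux> gcc -Og -S -masm=intel mstore.c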
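For readers who do not have the book at hand, here is a sketch of what mstore.c looks like, written from my recollection of the CSAPP example, so treat the exact text as an assumption. In the book mult2 lives in a separate file; it is defined here only so that the fragment compiles and links on its own.

```c
/* mstore.c, modeled on the CSAPP example (recalled, not copied verbatim). */
long mult2(long a, long b);

/* Stores the product of x and y through dest. */
void multstore(long x, long y, long *dest)
{
    long t = mult2(x, y);
    *dest = t;
}

/* Assumption: defined in this file for self-containment; the book keeps it
   in a separate source file. */
long mult2(long a, long b)
{
    long s = a * b;
    return s;
}
```

Compiling this file once with -S alone and once with -S -masm=intel makes the operand-order difference between the two formats easy to compare side by side.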
CSAPP also lists the integer registers together with their conventional roles.

Hands-on: the CSAPP bomblab

The exercise comes from the CSAPP lab.

The main function is given

#include <stdio.h>
#include <stdlib.h>
#include "support.h"
#include "phases.h"


FILE *infile;

int main(int argc, char *argv[]) {
    char *input;
    
    if (argc == 1) {  // read from the terminal
		infile = stdin;
    } else if (argc == 2) {	// read from a file
		if (!(infile = fopen(argv[1], "r"))) {
		    printf("%s: Error: Couldn't open %s\n", argv[0], argv[1]);
		    exit(8);
		}
    } else {
		printf("Usage: %s [<input_file>]\n", argv[0]);
		exit(8);
    }
    
    initialize_bomb();

    printf("Welcome to my fiendish little bomb. You have 6 phases with\n");
    printf("which to blow yourself up. Have a nice day!\n");


    input = read_line();        
    phase_1(input);	// check whether the password is correct
    phase_defused();                 

    printf("Phase 1 defused. How about the next one?\n");

    input = read_line();
    phase_2(input);	// check whether the password is correct
    phase_defused();
    printf("That's number 2.  Keep going!\n");

    input = read_line();
    phase_3(input);	// check whether the password is correct
    phase_defused();
    printf("Halfway there!\n");

    input = read_line();
    phase_4(input);	// check whether the password is correct
    phase_defused();
    printf("So you got that one.  Try this one.\n");
    
    
    input = read_line();
    phase_5(input);	// check whether the password is correct
    phase_defused();
    printf("Good work!  On to the next...\n");
    input = read_line();	
    phase_6(input);	// check whether the password is correct
    phase_defused();
    return 0;
}

Disassembly

Run gdb bomb, then work out the passwords one at a time from the assembly code of the functions phase_1 through phase_6.

phase_1

(gdb) disassemble phase_1

   0x0000000000400ee0 <+0>:	sub    $0x8,%rsp
   0x0000000000400ee4 <+4>:	mov    $0x402400,%esi		//esi = 0x402400 (second argument: address of the reference string)
   0x0000000000400ee9 <+9>:	callq  0x401338 <strings_not_equal>
   0x0000000000400eee <+14>:	test   %eax,%eax				//eax & eax, sets ZF when eax == 0
   0x0000000000400ef0 <+16>:	je     0x400ef7 <phase_1+23>	//jump if eax == 0 (the strings are equal)
   0x0000000000400ef2 <+18>:	callq  0x40143a <explode_bomb>	//eax != 0: the bomb goes off
   0x0000000000400ef7 <+23>:	add    $0x8,%rsp	//safe
   0x0000000000400efb <+27>:	retq

(gdb) disassemble strings_not_equal

   0x0000000000401338 <+0>:	push   %r12
   0x000000000040133a <+2>:	push   %rbp
   0x000000000040133b <+3>:	push   %rbx			//save callee-saved registers
   0x000000000040133c <+4>:	mov    %rdi,%rbx	//rbx = address of input
   0x000000000040133f <+7>:	mov    %rsi,%rbp	//rbp = 0x402400
   0x0000000000401342 <+10>:	callq  0x40131b <string_length>	//rdi still holds input
   0x0000000000401347 <+15>:	mov    %eax,%r12d	//r12d = length of input
   0x000000000040134a <+18>:	mov    %rbp,%rdi		//rdi = 0x402400 (argument)
   0x000000000040134d <+21>:	callq  0x40131b <string_length>	//eax = length of the string at 0x402400
   0x0000000000401352 <+26>:	mov    $0x1,%edx	//edx = 1
   0x0000000000401357 <+31>:	cmp    %eax,%r12d	//compare the two lengths
   0x000000000040135a <+34>:	jne    0x40139b <strings_not_equal+99>	//lengths differ: return 1 (bomb explodes)
   0x000000000040135c <+36>:	movzbl (%rbx),%eax	//eax = input[0]
   0x000000000040135f <+39>:	test   %al,%al	
   0x0000000000401361 <+41>:	je     0x401388 <strings_not_equal+80>	//input is empty (lengths already match): stop and return 0
   0x0000000000401363 <+43>:	cmp    0x0(%rbp),%al	//compare the first characters of the two strings
   0x0000000000401366 <+46>:	je     0x401372 <strings_not_equal+58>	//equal
   0x0000000000401368 <+48>:	jmp    0x40138f <strings_not_equal+87>	//not equal: return 1 (bomb explodes)
   0x000000000040136a <+50>:	cmp    0x0(%rbp),%al	//compare the current characters
   0x000000000040136d <+53>:	nopl   (%rax)	//multi-byte nop used for alignment; it does not change the flags
   0x0000000000401370 <+56>:	jne    0x401396 <strings_not_equal+94>	//not equal: return 1
   0x0000000000401372 <+58>:	add    $0x1,%rbx	//advance the pointer into input
   0x0000000000401376 <+62>:	add    $0x1,%rbp	//advance the pointer into the string at 0x402400
   0x000000000040137a <+66>:	movzbl (%rbx),%eax	//eax = next character of input
   0x000000000040137d <+69>:	test   %al,%al	//has input reached its terminating 0?
   0x000000000040137f <+71>:	jne    0x40136a <strings_not_equal+50>	//not 0: jump back to <+50> and keep comparing
   0x0000000000401381 <+73>:	mov    $0x0,%edx	//edx = 0 (safe)
   0x0000000000401386 <+78>:	jmp    0x40139b <strings_not_equal+99>
   0x0000000000401388 <+80>:	mov    $0x0,%edx	//edx = 0
   0x000000000040138d <+85>:	jmp    0x40139b <strings_not_equal+99>
   0x000000000040138f <+87>:	mov    $0x1,%edx	//edx = 1 (bomb explodes)
   0x0000000000401394 <+92>:	jmp    0x40139b <strings_not_equal+99>
   0x0000000000401396 <+94>:	mov    $0x1,%edx
   0x000000000040139b <+99>:	mov    %edx,%eax	//return edx: 0 means equal (safe), 1 means not equal
   0x000000000040139d <+101>:	pop    %rbx
   0x000000000040139e <+102>:	pop    %rbp
   0x000000000040139f <+103>:	pop    %r12
   0x00000000004013a1 <+105>:	retq

(gdb) disassemble string_length

   0x000000000040131b <+0>:	cmpb   $0x0,(%rdi)	//is the first byte 0?
   0x000000000040131e <+3>:	je     0x401332 <string_length+23>	//empty string: return eax = 0
   0x0000000000401320 <+5>:	mov    %rdi,%rdx	//rdx = start address of the string
   0x0000000000401323 <+8>:	add    $0x1,%rdx	//rdx += 1 on every iteration
   0x0000000000401327 <+12>:	mov    %edx,%eax	//eax = edx (low 32 bits of the moving pointer)
   0x0000000000401329 <+14>:	sub    %edi,%eax	//eax -= start address, i.e. the number of bytes scanned so far
   0x000000000040132b <+16>:	cmpb   $0x0,(%rdx)	//has rdx reached the terminating 0? if so, eax is the length
   0x000000000040132e <+19>:	jne    0x401323 <string_length+8>	//byte at rdx != 0: loop
   0x0000000000401330 <+21>:	repz retq 	//return (eax = length of the string)
   0x0000000000401332 <+23>:	mov    $0x0,%eax	
   0x0000000000401337 <+28>:	retq  	//return eax = 0

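The code is long, but the analysis shows that the input only has to be identical to the string stored at 0x402400, which can be printed in gdb with x/s 0x402400.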
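To make the control flow of the three listings easier to follow, here is a rough C reconstruction. It is my own reading of the assembly, not the lab's actual source; phase_1_would_explode and its secret parameter are hypothetical stand-ins for phase_1 and the string at 0x402400.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors string_length: counts bytes up to, but not including, the '\0'. */
int string_length(const char *s)
{
    assert(s != NULL);
    const char *p = s;
    while (*p != '\0')
        p++;
    return (int)(p - s);
}

/* Mirrors strings_not_equal: returns 0 when the strings match, 1 otherwise. */
int strings_not_equal(const char *a, const char *b)
{
    if (string_length(a) != string_length(b))
        return 1;               /* <+31>, <+34>: lengths differ */
    while (*a != '\0') {        /* <+39>, <+69>: stop at the end of input */
        if (*a != *b)
            return 1;           /* <+87>, <+94>: characters differ */
        a++;                    /* <+58> */
        b++;                    /* <+62> */
    }
    return 0;                   /* <+73>, <+80>: every character matched */
}

/* Hypothetical stand-in for phase_1: returns 1 where the real code would
   call explode_bomb, 0 where it returns normally. */
int phase_1_would_explode(const char *input, const char *secret)
{
    return strings_not_equal(input, secret) != 0;
}
```

Because the lengths are compared first, the character loop only has to watch for the end of one string, which is why the assembly tests just the byte loaded from input.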

phase_2

(gdb) disassemble phase_2

   0x0000000000400efc <+0>:	push   %rbp
   0x0000000000400efd <+1>:	push   %rbx
   0x0000000000400efe <+2>:	sub    $0x28,%rsp	//reserve 0x28 = 40 bytes on the stack
   0x0000000000400f02 <+6>:	mov    %rsp,%rsi	//rsi = current stack pointer (second argument)
   0x0000000000400f05 <+9>:	callq  0x40145c <read_six_numbers>	//judging by the name, this reads six numbers
   0x0000000000400f0a <+14>:	cmpl   $0x1,(%rsp)	//compare the first number, (%rsp), with 1
   0x0000000000400f0e <+18>:	je     0x400f30 <phase_2+52>	//equal: jump; otherwise the next instruction explodes
   0x0000000000400f10 <+20>:	callq  0x40143a <explode_bomb>	//boom!!
   0x0000000000400f15 <+25>:	jmp    0x400f30 <phase_2+52>
   0x0000000000400f17 <+27>:	mov    -0x4(%rbx),%eax	//eax = value at rbx-4 (the previous number)
   0x0000000000400f1a <+30>:	add    %eax,%eax	//eax = eax * 2, doubling each time
   0x0000000000400f1c <+32>:	cmp    %eax,(%rbx)	//is the value at rbx equal to eax?
   0x0000000000400f1e <+34>:	je     0x400f25 <phase_2+41>	//must be equal, otherwise the next instruction explodes
   0x0000000000400f20 <+36>:	callq  0x40143a <explode_bomb>
   0x0000000000400f25 <+41>:	add    $0x4,%rbx	//rbx += 4, ready for the next int (4 bytes)
   0x0000000000400f29 <+45>:	cmp    %rbp,%rbx	//leave the loop once rbx == rbp
   0x0000000000400f2c <+48>:	jne    0x400f17 <phase_2+27>
   0x0000000000400f2e <+50>:	jmp    0x400f3c <phase_2+64>
   0x0000000000400f30 <+52>:	lea    0x4(%rsp),%rbx	//rbx = rsp + 4 (address of the second number)
   0x0000000000400f35 <+57>:	lea    0x18(%rsp),%rbp	//rbp = rsp + 0x18 (24 in decimal), one past the sixth number
   0x0000000000400f3a <+62>:	jmp    0x400f17 <phase_2+27>
   0x0000000000400f3c <+64>:	add    $0x28,%rsp
   0x0000000000400f40 <+68>:	pop    %rbx
   0x0000000000400f41 <+69>:	pop    %rbp
   0x0000000000400f42 <+70>:	retq

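From the assembly of phase_2 one can guess that the six int values stored at increasing addresses starting from rsp are the six numbers we typed in. (On x86-64 the stack grows toward lower addresses, so rsp, the top of the stack, is the lowest address of the frame.) The logic of the code then makes it easy to see that the six values must be 1, 2, 4, 8, 16, 32 in that order. I did not read read_six_numbers in detail to find out how the six input numbers end up on the stack. Instead, I set a breakpoint in the assembly to check whether the values there really are the numbers I typed.
References for single-stepping through assembly:
Setting breakpoints on assembly instructions
 Single-stepping assembly
 Using the x command (print the contents of a given address)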
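With a breakpoint on the first instruction after the call to read_six_numbers (break *0x400f0a) and a look at the stack there (for example x/6dw $rsp), it turns out that the numbers have all been read onto the stack, which matches the guess.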
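The doubling check can also be written out in C. This is a sketch of my own reading of the phase_2 listing, not the lab's source; phase_2_ok is a hypothetical name, and it returns 0 at the points where the real code calls explode_bomb.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the six numbers pass the phase_2 check, 0 where the real
   code would call explode_bomb. nums plays the role of the array at rsp. */
int phase_2_ok(const int nums[6])
{
    assert(nums != NULL);
    if (nums[0] != 1)                   /* <+14>: cmpl $0x1,(%rsp) */
        return 0;
    /* rbx starts at &nums[1] (rsp + 4); rbp is the end marker
       &nums[6] (rsp + 0x18). */
    for (const int *p = nums + 1; p != nums + 6; p++) {
        if (*p != 2 * p[-1])            /* <+27> to <+32> */
            return 0;
    }
    return 1;
}
```

The only sequence that passes starts at 1 and doubles five times, which gives 1, 2, 4, 8, 16, 32.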