《程序员的自我修养》 ch3: Object files

Parsing ELF files

Reference: 《程序员的自我修养》, ch3

1. Types of object files

>> file hello.o
hello.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped

>> file /bin/ls
/bin/ls: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.8, stripped

>> file /lib/tls/i686/cmov/librt-2.9.so
/lib/tls/i686/cmov/librt-2.9.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, stripped

2. The ELF format

>> cat ch3.c
#include <stdio.h>

int global_init_var = 84;
int global_uninit_var;


void func1(int i)
{
        printf("%d\n",i);
}


int main()
{
        static int static_var = 85;
        static int static_var2;


        int a = 1;
        int b;


        func1(static_var +static_var2 + a + b);
        return 0;
}

Use objdump to view the sections:

>> objdump -h ch3.o

ch3.o:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         0000005d  00000000  00000000  00000034  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000008  00000000  00000000  00000094  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000004  00000000  00000000  0000009c  2**2
                  ALLOC
  3 .rodata       00000004  00000000  00000000  0000009c  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .comment      00000024  00000000  00000000  000000a0  2**0
                  CONTENTS, READONLY
  5 .note.GNU-stack 00000000  00000000  00000000  000000c4  2**0
                  CONTENTS, READONLY

View the size of each section:

>> size ch3.o
   text    data     bss     dec     hex filename
     97       8       4     109      6d ch3.o


2.1 .text

>> objdump -s -d ch3.o

ch3.o:     file format elf32-i386

Contents of section .text:
 0000 5589e583 ec088b45 08894424 04c70424  U......E..D$...$
 0010 00000000 e8fcffff ffc9c38d 4c240483  ............L$..
 0020 e4f0ff71 fc5589e5 5183ec14 c745f801  ...q.U..Q....E..
 0030 0000008b 15040000 00a10000 00008d04  ................
 0040 020345f8 0345f489 0424e8fc ffffffb8  ..E..E...$......
 0050 00000000 83c41459 5d8d61fc c3        .......Y].a..
Contents of section .data:
 0000 54000000 55000000                    T...U...
Contents of section .rodata:
 0000 25640a00                             %d..
Contents of section .comment:
 0000 00474343 3a202855 62756e74 7520342e  .GCC: (Ubuntu 4.
 0010 332e332d 35756275 6e747534 2920342e  3.3-5ubuntu4) 4.
 0020 332e3300                             3.3.

Disassembly of section .text:

00000000 <func1>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 ec 08                sub    $0x8,%esp
   6:   8b 45 08                mov    0x8(%ebp),%eax
   9:   89 44 24 04             mov    %eax,0x4(%esp)
   d:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
  14:   e8 fc ff ff ff          call   15 <func1+0x15>
  19:   c9                      leave
  1a:   c3                      ret

0000001b <main>:
  1b:   8d 4c 24 04             lea    0x4(%esp),%ecx
  1f:   83 e4 f0                and    $0xfffffff0,%esp
  22:   ff 71 fc                pushl  -0x4(%ecx)
  25:   55                      push   %ebp
  26:   89 e5                   mov    %esp,%ebp
  28:   51                      push   %ecx
  29:   83 ec 14                sub    $0x14,%esp
  2c:   c7 45 f8 01 00 00 00    movl   $0x1,-0x8(%ebp)
  33:   8b 15 04 00 00 00       mov    0x4,%edx
  39:   a1 00 00 00 00          mov    0x0,%eax
  3e:   8d 04 02                lea    (%edx,%eax,1),%eax
  41:   03 45 f8                add    -0x8(%ebp),%eax
  44:   03 45 f4                add    -0xc(%ebp),%eax
  47:   89 04 24                mov    %eax,(%esp)
  4a:   e8 fc ff ff ff          call   4b <main+0x30>
  4f:   b8 00 00 00 00          mov    $0x0,%eax
  54:   83 c4 14                add    $0x14,%esp
  57:   59                      pop    %ecx
  58:   5d                      pop    %ebp
  59:   8d 61 fc                lea    -0x4(%ecx),%esp
  5c:   c3                      ret

2.2 .data

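.data: holds initialized global variables and initialized local static variables.
The bytes "54000000" are 84 (0x54) stored least significant byte first, so the file is LSB (little-endian). On an MSB (big-endian) target the bytes would be "00000054". (This is another way to find out a target's endianness: there is no need to run a program, only to compile one.)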
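
objcopy - copy and translate object files. It can also wrap an arbitrary binary file, here a JPEG image, in a relocatable object whose .data section holds the file's contents: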

>> objcopy -I binary -O elf32-i386 -B i386 background.jpg b.o
>> file b.o
b.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped
>> objdump -ht b.o

b.o:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         0000d8e1  00000000  00000000  00000034  2**0
                  CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
00000000 l    d  .data  00000000 .data
00000000 g       .data  00000000 _binary_background_jpg_start
0000d8e1 g       .data  00000000 _binary_background_jpg_end
0000d8e1 g       *ABS*  00000000 _binary_background_jpg_size

2.3 Using a custom section

>> cat ch3_1.c
__attribute__((section("FOO"))) int global = 42;

>> objdump -d -s ch3_1.o

ch3_1.o:     file format elf32-i386

Contents of section FOO:
 0000 2a000000                             *...
Contents of section .comment:
 0000 00474343 3a202855 62756e74 7520342e  .GCC: (Ubuntu 4.
 0010 332e332d 35756275 6e747534 2920342e  3.3-5ubuntu4) 4.
 0020 332e3300                             3.3.

3. The ELF file header

>> readelf -h ch3.o
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          280 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           40 (bytes)
  Number of section headers:         11
  Section header string table index: 8

The format is defined in /usr/include/elf.h (http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/elf.h):
/* The ELF file header.  This appears at the start of every ELF file.  */

#define EI_NIDENT (16)

typedef struct
{
  unsigned char	e_ident[EI_NIDENT];	/* Magic number and other info */
  Elf32_Half	e_type;			/* Object file type */
  Elf32_Half	e_machine;		/* Architecture */
  Elf32_Word	e_version;		/* Object file version */
  Elf32_Addr	e_entry;		/* Entry point virtual address */
  Elf32_Off	e_phoff;		/* Program header table file offset */
  Elf32_Off	e_shoff;		/* Section header table file offset */
  Elf32_Word	e_flags;		/* Processor-specific flags */
  Elf32_Half	e_ehsize;		/* ELF header size in bytes */
  Elf32_Half	e_phentsize;		/* Program header table entry size */
  Elf32_Half	e_phnum;		/* Program header table entry count */
  Elf32_Half	e_shentsize;		/* Section header table entry size */
  Elf32_Half	e_shnum;		/* Section header table entry count */
  Elf32_Half	e_shstrndx;		/* Section header string table index */
} Elf32_Ehdr;

4. The section header table

e_shoff gives the file offset of the section header table. "objdump -h" shows only the important sections; use "readelf -S" to list all of them.

>> readelf.exe -S ch3.o
There are 11 section headers, starting at offset 0x118:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 00005d 00  AX  0   0  4
  [ 2] .rel.text         REL             00000000 00041c 000028 08      9   1  4
  [ 3] .data             PROGBITS        00000000 000094 000008 00  WA  0   0  4
  [ 4] .bss              NOBITS          00000000 00009c 000004 00  WA  0   0  4
  [ 5] .rodata           PROGBITS        00000000 00009c 000004 00   A  0   0  1
  [ 6] .comment          PROGBITS        00000000 0000a0 000024 00      0   0  1
  [ 7] .note.GNU-stack   PROGBITS        00000000 0000c4 000000 00      0   0  1
  [ 8] .shstrtab         STRTAB          00000000 0000c4 000051 00      0   0  1
  [ 9] .symtab           SYMTAB          00000000 0002d0 0000f0 10     10  10  4
  [10] .strtab           STRTAB          00000000 0003c0 00005c 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

The section header is defined as:

/* Section header.  */

typedef struct
{
  Elf32_Word	sh_name;		/* Section name (string tbl index) */
  Elf32_Word	sh_type;		/* Section type */
  Elf32_Word	sh_flags;		/* Section flags */
  Elf32_Addr	sh_addr;		/* Section virtual addr at execution */
  Elf32_Off	sh_offset;		/* Section file offset */
  Elf32_Word	sh_size;		/* Section size in bytes */
  Elf32_Word	sh_link;		/* Link to another section */
  Elf32_Word	sh_info;		/* Additional section information */
  Elf32_Word	sh_addralign;		/* Section alignment */
  Elf32_Word	sh_entsize;		/* Entry size if section holds table */
} Elf32_Shdr;

.rel.text : relocation table for .text
.shstrtab : section header string table
.strtab   : string table

4.1 Symbol table (.symtab)

>> nm -C ch3.o
00000000 T func1
00000000 D global_init_var
00000004 C global_uninit_var
0000001b T main
         U printf
00000004 d static_var.1198
00000000 b static_var2.1199

A symbol table entry is defined as:

/* Symbol table entry.  */

typedef struct
{
  Elf32_Word	st_name;		/* Symbol name (string tbl index) */
  Elf32_Addr	st_value;		/* Symbol value */
  Elf32_Word	st_size;		/* Symbol size */
  unsigned char	st_info;		/* Symbol type and binding */
  unsigned char	st_other;		/* Symbol visibility */
  Elf32_Section	st_shndx;		/* Section index */
} Elf32_Sym;

List all symbols in an object file:

>> readelf -s ch3.o

Symbol table '.symtab' contains 15 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FILE    LOCAL  DEFAULT  ABS ch3.c
     2: 00000000     0 SECTION LOCAL  DEFAULT    1
     3: 00000000     0 SECTION LOCAL  DEFAULT    3
     4: 00000000     0 SECTION LOCAL  DEFAULT    4
     5: 00000000     0 SECTION LOCAL  DEFAULT    5
     6: 00000000     4 OBJECT  LOCAL  DEFAULT    4 static_var2.1199
     7: 00000004     4 OBJECT  LOCAL  DEFAULT    3 static_var.1198
     8: 00000000     0 SECTION LOCAL  DEFAULT    7
     9: 00000000     0 SECTION LOCAL  DEFAULT    6
    10: 00000000     4 OBJECT  GLOBAL DEFAULT    3 global_init_var
    11: 00000000    27 FUNC    GLOBAL DEFAULT    1 func1
    12: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND printf
    13: 0000001b    66 FUNC    GLOBAL DEFAULT    1 main
    14: 00000004     4 OBJECT  GLOBAL DEFAULT  COM global_uninit_var

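4.1.1 Name decoration and function signatures

With -fleading-underscore, GCC prefixes each C symbol name with an underscore: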

>> gcc -fleading-underscore -c ch3.c

>> nm -C ch3.o
00000000 T _func1
00000000 D _global_init_var
00000004 C _global_uninit_var
0000001b T _main
         U _printf
00000004 d _static_var.1198
00000000 b _static_var2.1199

Demangling C++ symbols with c++filt:

>> /usr/bin/c++filt _ZN1N1C4funcEi
N::C::func(int)
>> /usr/bin/c++filt _ZZ4mainE3foo
main::foo
>> c++filt _ZZ4funcvE3foo
func()::foo


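Because the linker sees only the decorated name, a variable declared with C linkage under that name refers to myname::var:

>> cat  ch3_5.c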
#include <stdio.h>
namespace myname {
        int var = 42;
}

extern "C" int _ZN6myname3varE;

int main()
{
        printf( "%d\n", _ZN6myname3varE );
        return 0;
}

>> g++ ch3_5.c
>> ./a.out
42

4.1.2 What extern "C" does

>> cat func.c
#include <cstdio>
using namespace std;

int func()
{
        printf("Hello world!\n");
        return 0;
}

>> g++ -c func.c
>> readelf -s func.o
     9: 0000000000000000    21 FUNC    GLOBAL DEFAULT    1 _Z4funcv

In the example above, after the file is compiled as C++, the symbol name of func() becomes _Z4funcv.

With the extern "C" specifier:

>> cat func.c
#include <cstdio>
using namespace std;

extern "C" {
int func()
{
        printf("Hello world!\n");
        return 0;
}
}
>> g++ -c func.c
>> readelf -s func.o
     9: 0000000000000000    21 FUNC    GLOBAL DEFAULT    1 func

More generally, /usr/include/string.h wraps its declarations in the following macros so that the header can be used from both C and C++:

__BEGIN_DECLS

...
__END_DECLS


These two macros are defined in /usr/include/sys/cdefs.h:

/* C++ needs to know that types and declarations are C, not C++.  */
#ifdef  __cplusplus
# define __BEGIN_DECLS  extern "C" {
# define __END_DECLS    }
#else
# define __BEGIN_DECLS
# define __END_DECLS
#endif

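4.1.3 Weak symbols and strong symbols
For C/C++, the compiler treats functions and initialized global variables as strong symbols by default, and uninitialized global variables as weak symbols. Any strong symbol can also be defined as weak with GCC's "__attribute__((weak))". Note that strong and weak apply to symbol definitions, not to symbol references.

Given these two kinds of symbol, the linker handles a global symbol that is defined more than once according to the following rules:

Rule 1: A strong symbol must not be defined more than once (that is, different object files cannot contain strong symbols with the same name); if there are several strong definitions, the linker reports a duplicate symbol definition error.
Rule 2: If a symbol is strong in one object file and weak in all the others, the strong one is chosen.
Rule 3: If a symbol is weak in every object file, the one that occupies the most space is chosen. For example, if object file A defines the global variable global as an int (4 bytes) and object file B defines global as a double (8 bytes), then after A and B are linked the symbol global occupies 8 bytes. (Avoid defining several weak symbols of different types; it easily leads to bugs that are very hard to find.)

For example, two .c files can each define the same uninitialized global variable and still link, because both definitions are weak. (This relies on GCC's -fcommon behavior; GCC 10 and later default to -fno-common, under which the duplicate definition is a link error.)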

4.1.4 Weak references and strong references

>> cat weakref.c
//__attribute__((weakref)) void foo();
void foo() __attribute__((weak));

int main()
{
        foo();
}

>> gcc weakref.c
>> ./a.out
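Memory fault(coredump)

foo is only weakly referenced and no definition is linked in, so the linker does not report an undefined symbol. It resolves foo to address 0 instead, and the call faults at run time.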

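In Linux program design, a program that is meant to support both single-threaded and multi-threaded modes can use a weak reference to find out whether it was linked against the single-threaded or the multi-threaded Glibc (that is, whether -lpthread was given at link time), and then run the single-threaded or the multi-threaded version of its code.
For example: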

>> cat p.c
#include <stdio.h>
#include <pthread.h>
int pthread_create(
        pthread_t*,
        const pthread_attr_t*,
        void* (*)(void*),
        void*) __attribute__ ((weak));

int main()
{
        if(pthread_create) {
                printf("This is multi-thread version!\n");
                // run the multi-thread version
                // main_multi_thread()
        } else {
                printf("This is single-thread version!\n");
                // run the single-thread version
                // main_single_thread()
        }
}

Results:

>> gcc  p.c
>> ./a.out
This is single-thread version!
>> gcc  -lpthread p.c
>> ./a.out
This is multi-thread version!

4.1.5 Debug information (-g)

http://www.ibm.com/developerworks/opensource/library/os-debugging/index.html?ca=drs-
DWARF (debug with arbitrary record format) and STAB are the most widely used debugging formats for the executable and linking format (ELF).
