Introduction to x64 debugging, part 2

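This article covers debugging techniques on the x64 architecture, focusing on how the x64 calling convention simplifies debugging. It compares how parameters are passed on x64 and x86, and highlights how the new stack unwinding model provides more reliable stack traces.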

Last time, I talked about some of the basic differences you’ll see when switching to an x64 system if you are doing debugging using the Debugging Tools for Windows package.  In this installment, I’ll run through some of the other differences with debugging that you’ll likely run into – in particular, how changes to the x64 calling convention will make your life much easier when debugging.

Although the x64 architecture is in many respects very similar to x86, many of the conventions of x86-Win32 that you might be familiar with have changed.  Microsoft took the opportunity to “clean house” with many aspects of Win64, since for native x64 programs, there is no concern of backwards binary compatibility.

One of the major changes that you will quickly discover is that the calling conventions that x86 used (__fastcall, __cdecl, __stdcall) are not applicable to x64.  Instead of many different calling conventions, x64 unifies everything into a single calling convention that all functions use.  You can read the full details of the new calling convention on MSDN, but I’ll give you the executive summary as it applies to debugging programs here.

  • The first four arguments of a function are passed in registers: rcx, rdx, r8, and r9, respectively.  Subsequent arguments are passed on the stack.
  • The caller allocates the space on the stack for parameter passing, like for __stdcall on x86.  However, the caller must allocate at least 32 bytes of stack space for the callee to use as a “register home space” for the first four parameters (or as scratch space).  This must be done even if the callee has no arguments or fewer than four arguments.
  • The caller always cleans the stack of arguments passed (like __cdecl on x86) if necessary.
  • Stack unwinding and exception handling are significantly different on x64; more details on that later.  The new stack unwinding model is data-driven, rather than code-driven as it was on x86.
  • Except for dynamic stack adjustments (like _alloca), all stack space must be allocated in the prologue.  Effectively, for most functions, the stack pointer will remain constant throughout the execution process.
  • The rax register is used for return values.  For return values larger than 64 bits, a hidden pointer argument is used.  There is no more spillover into a second register for large return values (like edx:eax, on x86).
  • The rax, rcx, rdx, r8, r9, r10, and r11 registers are volatile; all other registers must be preserved.  For floating point usage, the xmm0, xmm1, xmm2, xmm3, xmm4, and xmm5 registers are volatile, and the other registers must be preserved.
  • For floating point arguments, the xmm0 through xmm3 registers are used for the first four arguments, after which stack spillover is performed.
  • The instructions permitted in function prologues and epilogues are highly restricted to a very small subset of the instruction set to facilitate unwinding operations.

The main takeaways here from a debugging perspective are thus:

  • Even though a register calling convention like __fastcall is used, the register arguments are often spilled to the “home area” and so are typically visible in call stacks, especially in debug builds.
  • Due to the nature of parameter passing on x64, the “push” instruction is seldom used for setting up arguments.  Instead, the compiler allocates all space up front (like for local variables on x86) and uses the “mov” instruction to write stack parameters onto the stack for function calls.  This also means that you typically will not see an “add rsp” (or equivalent) after each function call, despite the fact that the caller cleans the stack space.
  • The first stack arguments (argument 5, etc) will appear at [rsp+28h] instead of [rsp+08h], because of the mandatory register home area.  This is a departure from how __fastcall worked on x86, where the first stack argument would be at [esp+04h].
  • Because of the data-driven unwind semantics, you will see perfect stack unwinding even without symbols.  This means that even if you don’t have any symbols at all for a third party binary, you should always get a complete stack trace all the way back to the thread start routine.  As a side effect, this means that the stack traces captured by PageHeap or handle traces will be much more reliable than on x86, where they tended to break at the first function that did not use ebp (because those stack traces never used symbols).
  • Because of the restrictions on the prologue and epilogue instruction usage, it is very easy to recognize where the actual important function code begins and the boilerplate prologue/epilogue code ends.

If you’ve been debugging on x86 for a long time, then you are probably pretty excited about the features of the new calling convention.  Because of the perfect unwind semantics and the model of a constant stack pointer throughout function execution, debugging code that you don’t have symbols for (and using the built-in heap and handle verification utilities) is much more reliable than on x86.  Additionally, compiler generated code is usually easier to understand, because you don’t have to manually track the value of the stack pointer as it changes throughout the function, like you often did with x86 functions compiled with frame pointer omission (FPO) optimizations.

There are some exceptions to the rules I laid out above for the x64 calling convention.  For functions that do not call any other functions (called “leaf” functions), it is permissible to utilize custom calling conventions so long as the stack pointer (rsp) is not modified.  If the stack pointer is modified then regular calling convention semantics are required.

Next time, I’ll go into more detail on how exception handling and unwinding differ on x64, what those changes mean to you if you are debugging programs, and how you can access some of the metadata associated with unwinding/exception handling and use it to your advantage within the debugger.
