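Reversing Microsoft Visual C++ Part I: Exception Handling
Abstract
Microsoft Visual C++ is the most widely used compiler for Win32, so it is important for the Win32 reverser to be familiar with its inner workings. Being able to recognize the compiler-generated glue code helps to quickly concentrate on the actual code written by the programmer. It also helps in recovering the high-level structure of the program.
In Part I of this two-part article (see also Part II: Classes, Methods and RTTI), I will concentrate on the stack layout, exception handling and related structures in MSVC-compiled programs. Some familiarity with assembler, registers, calling conventions etc. is assumed.
Terms:
- Stack frame: A fragment of the stack segment used by a function. It usually contains function arguments, the return-to-caller address, saved registers, local variables and other data specific to this function. On x86 (and most other architectures) caller and callee stack frames are contiguous.
- Frame pointer: A register or other variable that points to a fixed location inside the stack frame. Usually all data inside the stack frame is addressed relative to the frame pointer. On x86 it is usually ebp, and it usually points to the saved ebp slot, at the address immediately below the return address.
- Object: An instance of a (C++) class.
- Unwindable object: A local object with the auto storage-class specifier that is allocated on the stack and needs to be destructed when it goes out of scope.
- Stack unwinding: Automatic destruction of such objects that happens when control leaves the scope due to an exception.
There are two types of exceptions that can be used in a C or C++ program.
- SEH exceptions (from "Structured Exception Handling"). Also known as Win32 or system exceptions. These are exhaustively covered in the famous Matt Pietrek article [1]. They are the only exceptions available to C programs. The compiler-level support includes the keywords __try, __except, __finally and a few others.
- C++ exceptions (sometimes referred to as "EH"). Implemented on top of SEH, C++ exceptions allow throwing and catching of arbitrary types. A very important feature of C++ is automatic stack unwinding during exception processing, and MSVC uses a pretty complex underlying framework to ensure that it works properly in all cases.
In the following diagrams memory addresses increase from top to bottom, so the stack grows "up". This is the way the stack is presented in IDA, and it is the opposite of most other publications.
Basic Frame Layout
The most basic stack frame looks like the following: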
...
Local variables
Other saved registers
Saved ebp
Return address
Function arguments
...
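Note: If frame pointer omission is enabled, the saved ebp might be absent.
SEH
In cases where compiler-level SEH (__try/__except/__finally) is used, the stack layout gets a little more complicated.
SEH3 Stack Layout
When there are no __except blocks in a function (only __finally), Saved ESP is not used. The scopetable is an array of records, each of which describes one __try block and its relationship to the others:
struct _SCOPETABLE_ENTRY {
    DWORD EnclosingLevel;
    void* FilterFunc;
    void* HandlerFunc;
};
For more details on the SEH implementation see [1]. To recover try blocks, watch how the try level variable is updated. Each try block is assigned a unique number, and nesting is described by the relationships between scopetable entries. E.g. if scopetable entry i has EnclosingLevel=j, then try block j encloses try block i. The function body is considered to have try level -1. See Appendix I for an example.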
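The nesting rule above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: ScopeEntry, nestingDepth, blockKind and sampleTable are hypothetical names, the addresses are made up, and pointers are modelled as plain 32-bit values because the sketch only inspects them.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Simplified model of a recovered SEH3 scopetable record (hypothetical type).
struct ScopeEntry {
    int32_t  enclosingLevel;  // index of the enclosing __try block, -1 = function body
    uint32_t filterFunc;      // 0 means the handler is a __finally block
    uint32_t handlerFunc;
};

// Nesting depth of try block `index`: 0 for a block directly in the function
// body, 1 for a block nested inside one other block, and so on.
int nestingDepth(const std::vector<ScopeEntry>& table, int index) {
    int depth = 0;
    for (int level = table[index].enclosingLevel; level != -1;
         level = table[level].enclosingLevel) {
        ++depth;
    }
    return depth;
}

// A record without a filter describes __try/__finally, otherwise __try/__except.
std::string blockKind(const ScopeEntry& entry) {
    return entry.filterFunc == 0 ? "__finally" : "__except";
}

// A two-entry table shaped like the one in Appendix I (addresses are invented).
std::vector<ScopeEntry> sampleTable() {
    return {
        {-1, 0,        0x401080},  // try block 0: __finally, in the function body
        { 0, 0x401040, 0x401060},  // try block 1: __except, inside try block 0
    };
}
```

Walking EnclosingLevel until -1 is reached gives the chain of enclosing blocks, which is usually enough to rebuild the __try structure by hand.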
Buffer Overrun Protection
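The Whidbey (MSVC 2005) compiler adds some buffer overrun protection for SEH frames. The full stack frame layout with it looks like the following:
SEH4 Stack Layout
The GS cookie is present only if the function was compiled with the /GS switch. The EH cookie is always present. The SEH4 scopetable is basically the same as the SEH3 one, only with an added header:
struct _EH4_SCOPETABLE {
    DWORD GSCookieOffset;
    DWORD GSCookieXOROffset;
    DWORD EHCookieOffset;
    DWORD EHCookieXOROffset;
    _EH4_SCOPETABLE_RECORD ScopeRecord[1];
};
struct _EH4_SCOPETABLE_RECORD {
    DWORD EnclosingLevel;
    long (*FilterFunc)();
    union {
        void (*HandlerAddress)();
        void (*FinallyFunc)();
    };
};
GSCookieOffset = -2 means that the GS cookie is not used. Offsets are relative to ebp. The check is done the following way: (ebp+CookieXOROffset) ^ [ebp+CookieOffset] == __security_cookie. The pointer to the scopetable stored in the stack is XORed with __security_cookie too. Also, in SEH4 the outermost scope level is -2, not -1 as in SEH3.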
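To make the cookie check concrete, here is a minimal sketch that models a frame as a map from ebp-relative offsets to dwords. Frame, hasGsCookie, encodeCookie and cookieIntact are hypothetical names. The encoding step is inferred from the check formula and is an assumption about what the prolog stores, not code taken from the CRT.

```cpp
#include <cstdint>
#include <map>

// Toy model of a 32-bit stack frame: ebp-relative offset -> stored dword.
// This is an illustration only, not how the CRT represents a frame.
struct Frame {
    uint32_t ebp;
    std::map<int32_t, uint32_t> slots;
};

// GSCookieOffset value that means "this function has no GS cookie".
constexpr int32_t kNoGsCookie = -2;

bool hasGsCookie(int32_t gsCookieOffset) {
    return gsCookieOffset != kNoGsCookie;
}

// Value assumed to be stored in the cookie slot, chosen so that the check
// below succeeds: __security_cookie ^ (ebp + CookieXOROffset).
uint32_t encodeCookie(uint32_t ebp, int32_t xorOffset, uint32_t securityCookie) {
    return securityCookie ^ (ebp + static_cast<uint32_t>(xorOffset));
}

// The check from the text:
// (ebp + CookieXOROffset) ^ [ebp + CookieOffset] == __security_cookie
bool cookieIntact(const Frame& frame, int32_t cookieOffset, int32_t xorOffset,
                  uint32_t securityCookie) {
    const uint32_t stored = frame.slots.at(cookieOffset);
    return ((frame.ebp + static_cast<uint32_t>(xorOffset)) ^ stored) == securityCookie;
}
```

Overwriting the cookie slot, as a stack buffer overrun would, makes cookieIntact return false. When reading a real scopetable, call hasGsCookie on GSCookieOffset first, since the value -2 is a marker and not an offset.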
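C++ Exception Model Implementation
When C++ exception handling (try/catch) or unwindable objects are present in the function, things get pretty complex.
C++ EH Stack Layout
The EH handler is different for each function (unlike the SEH case) and usually looks like this: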
(VC7+)
mov eax, OFFSET __ehfuncinfo
jmp ___CxxFrameHandler
__ehfuncinfo is a structure of type FuncInfo which fully describes all try/catch blocks and unwindable objects in the function.
struct FuncInfo {
    // compiler version.
    // 0x19930520: up to VC6, 0x19930521: VC7.x (2002-2003), 0x19930522: VC8 (2005)
    DWORD magicNumber;
    // number of entries in unwind table
    int maxState;
    // table of unwind destructors
    UnwindMapEntry* pUnwindMap;
    // number of try blocks in the function
    DWORD nTryBlocks;
    // mapping of catch blocks to try blocks
    TryBlockMapEntry* pTryBlockMap;
    // not used on x86
    DWORD nIPMapEntries;
    // not used on x86
    void* pIPtoStateMap;
    // VC7+ only, expected exceptions list (function "throw" specifier)
    ESTypeList* pESTypeList;
    // VC8+ only, bit 0 set if function was compiled with /EHs
    int EHFlags;
};
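As a small illustration of how the first FuncInfo fields might be read from raw dwords in a dump, the following sketch maps the magic number to the compiler generations listed in the comment above. FuncInfoHeader, compilerFromMagic and readFuncInfoHeader are hypothetical helper names, and pointers are kept as 32-bit values because x86 images are assumed.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// The first five FuncInfo fields, with pointers kept as 32-bit addresses.
struct FuncInfoHeader {
    uint32_t magicNumber;
    int32_t  maxState;
    uint32_t pUnwindMap;
    uint32_t nTryBlocks;
    uint32_t pTryBlockMap;
};

// Map the magic number to the compiler generations named in the article.
std::string compilerFromMagic(uint32_t magic) {
    switch (magic) {
        case 0x19930520u: return "up to VC6";
        case 0x19930521u: return "VC7.x (2002-2003)";
        case 0x19930522u: return "VC8 (2005)";
        default:          return "unknown";
    }
}

// Interpret the first five dwords of a dumped FuncInfo.
FuncInfoHeader readFuncInfoHeader(const std::vector<uint32_t>& dwords) {
    if (dwords.size() < 5) {
        throw std::invalid_argument("FuncInfo needs at least 5 dwords");
    }
    return FuncInfoHeader{dwords[0], static_cast<int32_t>(dwords[1]),
                          dwords[2], dwords[3], dwords[4]};
}
```

Checking the magic number first is a cheap sanity test before trusting the pointers that follow it.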
The unwind map is similar to the SEH scopetable, only without filter functions:
struct UnwindMapEntry {
    int toState;        // target state
    void (*action)();   // action to perform (unwind funclet address)
};
Try block descriptor. Describes a try{} block with its associated catches.
struct TryBlockMapEntry {
    int tryLow;
    int tryHigh;    // this try {} covers states ranging from tryLow to tryHigh
    int catchHigh;  // highest state inside catch handlers of this try
    int nCatches;   // number of catch handlers
    HandlerType* pHandlerArray; // catch handlers table
};
Catch block descriptor. Describes a single catch() of a try block (one try block can have several catch blocks).
struct HandlerType {
    // 0x01: const, 0x02: volatile, 0x08: reference
    DWORD adjectives;
    // RTTI descriptor of the exception type. 0=any (ellipsis)
    TypeDescriptor* pType;
    // ebp-based offset of the exception object in the function stack.
    // 0 = no object (catch by type)
    int dispCatchObj;
    // address of the catch handler code.
    // returns the address where execution continues (i.e. the code after the try block)
    void* addressOfHandler;
};
List of expected exceptions (implemented but not enabled in MSVC by default, use /d1ESrt to enable).
struct ESTypeList {
    // number of entries in the list
    int nCount;
    // list of exceptions; it seems only pType field in HandlerType is used
    HandlerType* pTypeArray;
};
RTTI type descriptor. Describes a single C++ type. Used here to match the thrown exception type with the catch type.
struct TypeDescriptor {
    // vtable of type_info class
    const void * pVFTable;
    // used to keep the demangled name returned by type_info::name()
    void* spare;
    // mangled type name, e.g. ".H" = "int", ".?AUA@@" = "struct A", ".?AVA@@" = "class A"
    char name[0];
};
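The mangled names in TypeDescriptor can often be read by eye. As a rough sketch, the function below recognizes only the forms quoted in this article (".H", ".?AU...", ".?AV...", and the ".PAD" code for char * that appears in Appendix II). describeTypeName is a hypothetical helper, and it deliberately ignores namespaces, templates and every other mangling form.

```cpp
#include <string>

// Classify a mangled TypeDescriptor name. Only the handful of forms shown in
// this article are handled; anything else is reported as "unknown".
std::string describeTypeName(const std::string& name) {
    if (name == ".H")   return "int";
    if (name == ".PAD") return "char *";
    if (name.size() > 4 && name.compare(0, 3, ".?A") == 0) {
        const char kind = name[3];                        // 'U' = struct, 'V' = class
        const std::string::size_type end = name.find('@', 4);
        if (end != std::string::npos && end > 4) {
            const std::string id = name.substr(4, end - 4);  // unqualified name only
            if (kind == 'U') return "struct " + id;
            if (kind == 'V') return "class " + id;
        }
    }
    return "unknown";
}
```

For names with namespaces or templates a real demangler is the better tool. This sketch is only meant to show how the prefix encodes the kind of type.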
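Unlike SEH, each try block does not have a single associated state value. The compiler changes the state value not only on entering and leaving a try block, but also for each constructed or destroyed object. That way it is possible to know which objects need unwinding when an exception happens. You can still recover try block boundaries by inspecting the associated state range and the addresses returned by catch handlers (see Appendix II).
Throwing C++ Exceptions
throw statements are converted into calls to _CxxThrowException(), which actually raises a Win32 (SEH) exception with the code 0xE06D7363 ('msc'|0xE0000000). The custom parameters of the Win32 exception include pointers to the exception object and its ThrowInfo structure, which the exception handler uses to match the thrown exception type against the types expected by catch handlers.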
struct ThrowInfo {
    // 0x01: const, 0x02: volatile
    DWORD attributes;
    // exception destructor
    void (*pmfnUnwind)();
    // forward compatibility handler
    int (*pForwardCompat)();
    // list of types that can catch this exception.
    // i.e. the actual type and all its ancestors.
    CatchableTypeArray* pCatchableTypeArray;
};
struct CatchableTypeArray {
    // number of entries in the following array
    int nCatchableTypes;
    CatchableType* arrayOfCatchableTypes[0];
};
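The exception code used for C++ exceptions is easy to verify: it is the three ASCII characters 'm', 's', 'c' packed into the low 24 bits, ORed with 0xE0000000. A minimal sketch, with cxxExceptionCode and isCxxException as hypothetical helper names:

```cpp
#include <cstdint>

// Build the code from its parts: 'm' = 0x6D, 's' = 0x73, 'c' = 0x63.
constexpr uint32_t cxxExceptionCode() {
    return 0xE0000000u
         | (static_cast<uint32_t>('m') << 16)
         | (static_cast<uint32_t>('s') << 8)
         |  static_cast<uint32_t>('c');
}

static_assert(cxxExceptionCode() == 0xE06D7363u, "matches the value quoted in the text");

// True when an SEH exception code identifies an MSVC C++ exception.
bool isCxxException(uint32_t exceptionCode) {
    return exceptionCode == cxxExceptionCode();
}
```

In a debugger or a crash log, seeing 0xE06D7363 as the exception code is therefore a strong hint that an unhandled C++ throw, and not a hardware fault, is involved.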
Describes a type that can catch this exception.
struct CatchableType {
    // 0x01: simple type (can be copied by memmove), 0x02: can be caught by reference only, 0x04: has virtual bases
    DWORD properties;
    // see above
    TypeDescriptor* pType;
    // how to cast the thrown object to this type
    PMD thisDisplacement;
    // object size
    int sizeOrOffset;
    // copy constructor address
    void (*copyFunction)();
};
// Pointer-to-member descriptor.
struct PMD {
    // member offset
    int mdisp;
    // offset of the vbtable (-1 if not a virtual base)
    int pdisp;
    // offset to the displacement value inside the vbtable
    int vdisp;
};
We'll delve more into this in the next article.
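To tie HandlerType and CatchableTypeArray together, here is a deliberately simplified model of how a catch handler could be selected. It compares mangled type names only and treats an empty name as the ellipsis handler (pType == 0). The real runtime also takes the adjectives and properties flags into account, so this is a sketch under those assumptions. CatchHandler, handlerMatches and selectHandler are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for HandlerType: only the mangled type name is modelled.
// An empty name plays the role of pType == 0, i.e. catch (...).
struct CatchHandler {
    std::string typeName;
};

// A handler matches if it is catch (...) or if its type appears in the list
// of catchable types of the thrown object (its own type and all ancestors).
bool handlerMatches(const CatchHandler& handler,
                    const std::vector<std::string>& catchableTypes) {
    if (handler.typeName.empty()) return true;
    for (const std::string& type : catchableTypes) {
        if (type == handler.typeName) return true;
    }
    return false;
}

// Index of the first matching handler in declaration order, or -1 if none.
int selectHandler(const std::vector<CatchHandler>& handlers,
                  const std::vector<std::string>& catchableTypes) {
    for (std::size_t i = 0; i < handlers.size(); ++i) {
        if (handlerMatches(handlers[i], catchableTypes)) return static_cast<int>(i);
    }
    return -1;
}
```

For example, throwing a class B derived from A gives the catchable list {".?AVB@@", ".?AVA@@"}, so a catch (A&) handler matches even though the thrown type is B.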
Prologs and Epilogs
Instead of emitting the code for setting up the stack frame in the function body, the compiler might choose to call specific prolog and epilog functions. There are several variants, each used for a specific function type:
Name | Type | EH Cookie | GS Cookie | Catch Handlers |
_SEH_prolog/_SEH_epilog | SEH3 | - | - | |
_SEH_prolog4/_SEH_epilog4 | SEH4 | + | - | |
_SEH_prolog4_GS/_SEH_epilog4_GS | SEH4 | + | + | |
_EH_prolog | C++ EH | - | - | +/- |
_EH_prolog3/_EH_epilog3 | C++ EH | + | - | - |
_EH_prolog3_catch/_EH_epilog3 | C++ EH | + | - | + |
_EH_prolog3_GS/_EH_epilog3_GS | C++ EH | + | + | - |
_EH_prolog3_catch_GS/_EH_epilog3_catch_GS | C++ EH | + | + | + |
SEH2
Apparently this layout was used by MSVC 1.XX (exported by crtdll.dll). It can be encountered in some old NT programs.
...
Saved edi
Saved esi
Saved ebx
Next SEH frame
Current SEH handler (__except_handler2)
Pointer to the scopetable
Try level
Saved ebp (of this function)
Exception pointers
Local variables
Saved ESP
Local variables
Callee EBP
Return address
Function arguments
...
Appendix I: Sample SEH Program
Let's consider the following sample disassembly.
func1 proc near
_excCode = dword ptr -28h
buf = byte ptr -24h
_saved_esp = dword ptr -18h
_exception_info = dword ptr -14h
_next = dword ptr -10h
_handler = dword ptr -0Ch
_scopetable = dword ptr -8
_trylevel = dword ptr -4
str = dword ptr 8
push ebp
mov ebp, esp
push -1
push offset _func1_scopetable
push offset _except_handler3
mov eax, large fs:0
push eax
mov large fs:0, esp
add esp, -18h
push ebx
push esi
push edi
;--- end of prolog ---
mov [ebp+_trylevel], 0 ; trylevel -1 -> 0: beginning of try block 0
mov [ebp+_trylevel], 1 ; trylevel 0 -> 1: beginning of try block 1
mov large dword ptr ds:123, 456
mov [ebp+_trylevel], 0 ; trylevel 1 -> 0: end of try block 1
jmp short _endoftry1
_func1_filter1: ; __except() filter of try block 1
mov ecx, [ebp+_exception_info]
mov edx, [ecx+EXCEPTION_POINTERS.ExceptionRecord]
mov eax, [edx+EXCEPTION_RECORD.ExceptionCode]
mov [ebp+_excCode], eax
mov ecx, [ebp+_excCode]
xor eax, eax
cmp ecx, EXCEPTION_ACCESS_VIOLATION
setz al
retn
_func1_handler1: ; beginning of handler for try block 1
mov esp, [ebp+_saved_esp]
push offset aAccessViolatio ; "Access violation"
call _printf
add esp, 4
mov [ebp+_trylevel], 0 ; trylevel 1 -> 0: end of try block 1
_endoftry1:
mov edx, [ebp+str]
push edx
lea eax, [ebp+buf]
push eax
call _strcpy
add esp, 8
mov [ebp+_trylevel], -1 ; trylevel 0 -> -1: end of try block 0
call _func1_handler0 ; execute __finally of try block 0
jmp short _endoftry0
_func1_handler0: ; __finally handler of try block 0
push offset aInFinally ; "in finally"
call _puts
add esp, 4
retn
_endoftry0:
;--- epilog ---
mov ecx, [ebp+_next]
mov large fs:0, ecx
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
retn
func1 endp
_func1_scopetable
; try block 0
dd -1 ; EnclosingLevel
dd 0 ; FilterFunc
dd offset _func1_handler0 ; HandlerFunc
; try block 1
dd 0 ; EnclosingLevel
dd offset _func1_filter1 ; FilterFunc
dd offset _func1_handler1 ; HandlerFunc
Try block 0 has no filter, therefore its handler is a __finally{} block. The EnclosingLevel of try block 1 is 0, so it is placed inside try block 0. Considering this, we can try to reconstruct the function structure:
void func1 (char* str)
{
    char buf[12];
    __try // try block 0
    {
        __try // try block 1
        {
            *(int*)123=456;
        }
        __except(GetExceptionCode() == EXCEPTION_ACCESS_VIOLATION)
        {
            printf("Access violation");
        }
        strcpy(buf,str);
    }
    __finally
    {
        puts("in finally");
    }
}
Appendix II: Sample Program with C++ Exceptions
func1 proc near
_a1 = dword ptr -24h
_exc = dword ptr -20h
e = dword ptr -1Ch
a2 = dword ptr -18h
a1 = dword ptr -14h
_saved_esp = dword ptr -10h
_next = dword ptr -0Ch
_handler = dword ptr -8
_state = dword ptr -4
push ebp
mov ebp, esp
push 0FFFFFFFFh
push offset func1_ehhandler
mov eax, large fs:0
push eax
mov large fs:0, esp
push ecx
sub esp, 14h
push ebx
push esi
push edi
mov [ebp+_saved_esp], esp
;--- end of prolog ---
lea ecx, [ebp+a1]
call A::A(void)
mov [ebp+_state], 0 ; state -1 -> 0: a1 constructed
mov [ebp+a1], 1 ; a1.m1 = 1
mov byte ptr [ebp+_state], 1 ; state 0 -> 1: try {
lea ecx, [ebp+a2]
call A::A(void)
mov [ebp+_a1], eax
mov byte ptr [ebp+_state], 2 ; state 1 -> 2: a2 constructed
mov [ebp+a2], 2 ; a2.m1 = 2
mov eax, [ebp+a1]
cmp eax, [ebp+a2] ; a1.m1 == a2.m1?
jnz short loc_40109F
mov [ebp+_exc], offset aAbc ; _exc = "abc"
push offset __TI1?PAD ; char *
lea ecx, [ebp+_exc]
push ecx
call _CxxThrowException ; throw "abc";
loc_40109F:
mov byte ptr [ebp+_state], 1 ; state 2 -> 1: destruct a2
lea ecx, [ebp+a2]
call A::~A(void)
jmp short func1_try0end
; catch (char * e)
func1_try0handler_pchar:
mov edx, [ebp+e]
push edx
push offset aCaughtS ; "Caught %s\n"
call ds:printf ;
add esp, 8
mov eax, offset func1_try0end
retn
; catch (...)
func1_try0handler_ellipsis:
push offset aCaught___ ; "Caught ...\n"
call ds:printf
add esp, 4
mov eax, offset func1_try0end
retn
func1_try0end:
mov [ebp+_state], 0 ; state 1 -> 0: }//try
push offset aAfterTry ; "after try\n"
call ds:printf
add esp, 4
mov [ebp+_state], -1 ; state 0 -> -1: destruct a1
lea ecx, [ebp+a1]
call A::~A(void)
;--- epilog ---
mov ecx, [ebp+_next]
mov large fs:0, ecx
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
retn
func1 endp
func1_ehhandler proc near
mov eax, offset func1_funcinfo
jmp __CxxFrameHandler
func1_ehhandler endp
func1_funcinfo
dd 19930520h ; magicNumber
dd 4 ; maxState
dd offset func1_unwindmap ; pUnwindMap
dd 1 ; nTryBlocks
dd offset func1_trymap ; pTryBlockMap
dd 0 ; nIPMapEntries
dd 0 ; pIPtoStateMap
dd 0 ; pESTypeList
func1_unwindmap
dd -1 ; toState
dd offset func1_unwind_1tobase ; action
dd 0 ; toState
dd 0 ; action
dd 1 ; toState
dd offset func1_unwind_2to1 ; action
dd 0 ; toState
dd 0 ; action
func1_trymap
dd 1 ; tryLow
dd 2 ; tryHigh
dd 3 ; catchHigh
dd 2 ; nCatches
dd offset func1_tryhandlers_0 ; pHandlerArray
dd 0
func1_tryhandlers_0
dd 0 ; adjectives
dd offset char * `RTTI Type Descriptor' ; pType
dd -1Ch ; dispCatchObj
dd offset func1_try0handler_pchar ; addressOfHandler
dd 0 ; adjectives
dd 0 ; pType
dd 0 ; dispCatchObj
dd offset func1_try0handler_ellipsis ; addressOfHandler
func1_unwind_1tobase proc near
a1 = byte ptr -14h
lea ecx, [ebp+a1]
call A::~A(void)
retn
func1_unwind_1tobase endp
func1_unwind_2to1 proc near
a2 = byte ptr -18h
lea ecx, [ebp+a2]
call A::~A(void)
retn
func1_unwind_2to1 endp
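Let's see what we can find out here. The maxState field in the FuncInfo structure is 4, which means we have four entries in the unwind map, from 0 to 3. Examining the map, we see that the following actions are executed during unwinding:
- state 3 -> state 0 (no action)
- state 2 -> state 1 (destruct a2)
- state 1 -> state 0 (no action)
- state 0 -> state -1 (destruct a1)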
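The unwinding steps listed above can be reproduced with a short sketch that walks the unwind map from the current state towards a target state. UnwindStep, sampleUnwindMap and unwind are hypothetical names, and the actions are kept as descriptive strings instead of funclet addresses.

```cpp
#include <string>
#include <vector>

// One unwind map entry; `action` is a description, empty means "no action".
struct UnwindStep {
    int toState;
    std::string action;
};

// The unwind map of func1, indexed by state.
std::vector<UnwindStep> sampleUnwindMap() {
    return {
        {-1, "destruct a1"},  // state 0
        { 0, ""},             // state 1
        { 1, "destruct a2"},  // state 2
        { 0, ""},             // state 3
    };
}

// Follow toState links from `state` until `targetState` (or -1) is reached,
// collecting the actions that would be executed on the way.
std::vector<std::string> unwind(const std::vector<UnwindStep>& map,
                                int state, int targetState) {
    std::vector<std::string> performed;
    while (state != targetState && state != -1) {
        const UnwindStep& step = map[state];
        if (!step.action.empty()) performed.push_back(step.action);
        state = step.toState;
    }
    return performed;
}
```

Unwinding from state 2 all the way to -1 destructs a2 and then a1, while unwinding from state 2 only down to state 1 destructs a2 alone.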
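Looking at the try map, we can infer that states 1 and 2 correspond to the try block body and state 3 to the catch block bodies. Thus, a change from state 0 to state 1 marks the beginning of the try block, and a change from 1 to 0 marks its end. From the function code we can also see that -1 -> 0 is the construction of a1, and 1 -> 2 is the construction of a2. So the state diagram contains the transitions -1 -> 0 -> 1 -> 2 and back, plus an arrow from state 1 to state 3 (figure not reproduced).
Where did the arrow 1->3 come from? We cannot see it in the function code or in the FuncInfo structure, since it is done by the exception handler. If an exception happens inside the try block, the exception handler first unwinds the stack to the tryLow value (1 in our case) and then sets the state value to tryHigh+1 (2+1=3) before calling the catch handler.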
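The state arithmetic described above fits in a few lines. This sketch assumes only the three TryBlockMapEntry fields discussed in the text; TryBlock, stateInTry, stateInCatch and catchState are hypothetical names.

```cpp
// The state-related fields of a TryBlockMapEntry.
struct TryBlock {
    int tryLow;
    int tryHigh;
    int catchHigh;
};

// True if `state` lies inside the body of the try block.
bool stateInTry(const TryBlock& block, int state) {
    return state >= block.tryLow && state <= block.tryHigh;
}

// True if `state` lies inside one of the catch handlers of the try block.
bool stateInCatch(const TryBlock& block, int state) {
    return state > block.tryHigh && state <= block.catchHigh;
}

// State value set by the exception handler before it calls a catch handler.
int catchState(const TryBlock& block) {
    return block.tryHigh + 1;
}
```

With the values from func1 (tryLow=1, tryHigh=2, catchHigh=3), states 1 and 2 are inside the try block and catchState gives 3, which is the target of the 1->3 arrow.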
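The try block has two catch handlers. The first one has a catch type (char*) and gets the exception object on the stack (-1Ch = e). The second one has no type (i.e. it is an ellipsis catch). Both handlers return the address where execution resumes, i.e. the position just after the try block. Now we can recover the function code:
void func1 ()
{
    A a1;
    a1.m1 = 1;
    try {
        A a2;
        a2.m1 = 2;
        if (a1.m1 == a2.m1) throw "abc";
    }
    catch(char* e)
    {
        printf("Caught %s\n",e);
    }
    catch(...)
    {
        printf("Caught ...\n");
    }
    printf("after try\n");
}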
Appendix III: IDC Helper Script
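I wrote an IDC script to help with the reversing of MSVC programs. It scans the whole program for typical SEH/EH code sequences and comments all related structures and fields. Stack variables, exception handlers, exception types and other items are commented. It also tries to fix function boundaries that are sometimes incorrectly determined by IDA. You can download it from MS SEH/EH Helper.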
Links and References
[1] Matt Pietrek. A Crash Course on the Depths of Win32 Structured Exception Handling.
http://www.microsoft.com/msj/0197/exception/exception.aspx
Still THE definitive guide on the implementation of SEH in Win32.
[2] Brandon Bray. Security Improvements to the Whidbey Compiler.
http://blogs.msdn.com/branbray/archive/2003/11/11/51012.aspx
Short description on changes in the stack layout for cookie checks.
[3] Chris Brumme. The Exception Model.
http://blogs.msdn.com/cbrumme/archive/2003/10/01/51524.aspx
Mostly about .NET exceptions, but still contains a good deal of information about SEH and C++ exceptions.
[4] Vishal Kochhar. How a C++ compiler implements exception handling.
http://www.codeproject.com/cpp/exceptionhandler.asp
An overview of C++ exceptions implementation.
[5] Calling Standard for Alpha Systems. Chapter 5. Event Processing.
http://www.cs.arizona.edu/computer.help/policy/DIGITAL_unix/AA-PY8AC-TET1_html/callCH5.html
Win32 takes a lot from the way Alpha handles exceptions and this manual has a very detailed description of how it happens.
Structure definitions and flag values were also recovered from the following sources:
- VC8 CRT debug information (many structure definitions)
- VC8 assembly output (/FAs)
- VC8 WinCE CRT source
Original article: http://www.openrce.org/articles/full_view/21