Use a wrapper to align the stack for the GCC calling convention
As mentioned in “Do not hybrid compile and link source code by using VC and GCC”, there are some gaps between VC and GCC in the calling convention. If code built by both compilers is mixed, the program may crash at run time, especially when SSE instructions are used.
To work around this issue, this article experiments with wrapping the function call. The approach is not a complete solution, but it shows a workable method for readers who run into trouble in this environment. Before introducing it, we should review some basic concepts of how C functions are called.
1. How parameters are pushed when a function is invoked. For example, for void func( int arg1, int arg2 ) the pseudo code of a call is:

    push arg2
    push arg1
    call func

Parameters are pushed from right to left.
2. How a function returns its value. For instance, int func( void ) should return an integer when it finishes, and the return value is stored in eax. The C and ASM source code are listed below.

C source code:

    int func( void )
    {
        return 100;
    }

ASM source code:

    Disassembly of section .text:

    00000000 <_func>:
       0:   55                      push   ebp
       1:   89 e5                   mov    ebp,esp
       3:   b8 64 00 00 00          mov    eax,0x64
       8:   c9                      leave
       9:   c3                      ret
       a:   90                      nop
       b:   90                      nop
3. The naming rule for C functions: a ‘_’ prefix is added to the name in the ASM source code. The example above shows func becoming _func.
With these basics in place, let us turn to the topic of this article. GCC assumes that the stack is 16-byte aligned at every call site, and relies on that when it allocates aligned variables on the stack. This is a static strategy, which means the program can crash if the rule is broken. VC, on the other hand, aligns the stack dynamically. If SSE instructions need to access variables that live on the stack, VC emits prologue code like:
    …
    sub esp, 16                     ; Allocate a variable on the stack
    and esp, ~15                    ; Align the address to 16 bytes
    mov XMMWORD PTR [ esp ], xmm0
    …
If SSE instructions reside in functions compiled by GCC, those functions carry no such dynamic prologue, and VC, unaware of the requirement, does nothing to align the stack before calling them. The result is a crash, because VC does not guarantee a 16-byte-aligned stack address when it invokes these functions. Readers may have guessed my idea by now, and they are right: the purpose of the wrapper function is to align the stack when VC calls GCC functions. Please refer to the following ASM source code.
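    SECTION .text align=16

    ;
    ; void gcc_stack_align( void (*func)( int arg ), int arg )
    ;
    global _gcc_stack_align
    align 16
    _gcc_stack_align:
        push ebp                    ; save the caller's frame pointer
        mov  ebp, esp
        sub  esp, 8                 ; room for arg and the saved eax
        and  esp, ~15               ; align esp to 16 bytes
        mov  [ esp + 4 ], eax       ; save eax
        mov  eax, [ ebp + 12 ]      ; eax = arg
        mov  [ esp ], eax           ; pass arg to func
        mov  eax, [ ebp + 8 ]       ; eax = address of func
        call eax
        mov  eax, [ esp + 4 ]       ; restore eax (func returns void)
        mov  esp, ebp               ; drop the aligned frame
        pop  ebp
        ret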
1. “push ebp”, “mov ebp, esp” and “sub esp, 8” form a standard prologue, equivalent to a non-nested ENTER instruction.
2. “and esp, ~15” aligns the stack pointer to 16 bytes.
3. [esp + 4] stores the value of eax, which is restored into eax after the call returns.
4. [ebp + 12] holds arg, the parameter to pass to func.
5. [ebp + 8] holds the address of func.
6. To call func, its parameter must be copied from [ebp + 12] to [esp]. esp is 16-byte aligned at that point, which matches the calling convention of GCC.
7. “mov esp, ebp” and “pop ebp” restore the stack context. They can be replaced by a “leave” instruction.
8. If “void foo_gcc( int i )” needs to be invoked from VC, call it as “gcc_stack_align( foo_gcc, i )”.
9. The ASM uses yasm syntax; please use “yasm -f win32 foo_gcc_stack_align.asm” to build it.
[Summary]
1. It points out a direction for manually bridging the gaps between calling conventions.
2. The sample is only a partial solution; readers can extend it along the same lines.
3. “Do not hybrid compile and link source code by using VC and GCC”, unless you know what you are doing.
[Appendix]
The test files used to confirm this idea:
foo_gcc.h

    #ifndef _FOO_GCC_H
    #define _FOO_GCC_H

    #if defined( __cplusplus ) || defined( c_plusplus )
    extern "C" {
    #endif

    #ifdef BUILD_DLL
    #define GCC_API __declspec( dllexport )
    #else
    #define GCC_API __declspec( dllimport )
    #endif

    GCC_API void foo_gcc( int i );

    #if defined( __cplusplus ) || defined( c_plusplus )
    }
    #endif

    #endif
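foo_gcc.c

    #include "foo_gcc.h"

    GCC_API void foo_gcc( int i )
    {
        /* GCC places this buffer assuming the incoming stack is 16-byte aligned */
        char unused[ 16 ] __attribute__ ((__aligned__ ( 16 )));

        /* movdqa faults if its memory operand is not 16-byte aligned */
        __asm__ __volatile__( "movdqa %%xmm0, %0\n\t" : "=m"(unused) );
    }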
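foo_vc.c

    #include <stdio.h>
    #include "foo_gcc.h"

    /* Implemented in the ASM file */
    void gcc_stack_align( void (*func)( int arg ), int arg );

    void foo_vc( int aligned )
    {
        __declspec( align( 16 ) ) char unused[ 16 ];

        _asm
        {
            lea esi, unused
            movdqa xmm0, xmmword ptr[ esi ]
        }

        if ( aligned )
            gcc_stack_align( foo_gcc, aligned );  /* call through the aligning wrapper */
        else
            foo_gcc( aligned );                   /* direct call, stack may be misaligned */
    }

    int main( int argc, char *argv[] )
    {
        int aligned = (argc > 1);

        printf( "aligned = %d\n", aligned );

        foo_vc( aligned );

        return 0;
    }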
gcc_stack_align.asm

    SECTION .text align=16

    ;
    ; void gcc_stack_align( void (*func)( int arg ), int arg )
    ;
    global _gcc_stack_align
    align 16
    _gcc_stack_align:
        push ebp
        mov  ebp, esp
        sub  esp, 8
        and  esp, ~15
        mov  [ esp + 4 ], eax
        mov  eax, [ ebp + 12 ]
        mov  [ esp ], eax
        mov  eax, [ ebp + 8 ]
        call eax
        mov  eax, [ esp + 4 ]
        mov  esp, ebp
        pop  ebp
        ret
How to build?
    #!/bin/sh

    rm ./foo_gcc.o
    rm ./foo_gcc.dll
    rm ./foo_gcc.asm
    rm ./foo_gcc.lib
    rm ./libfoo_gcc.a
    rm ./foo_gcc.exp
    rm ./foo_gcc.def
    gcc -c -O3 -DBUILD_DLL ./foo_gcc.c
    objdump -m i386:intel -D ./foo_gcc.o > foo_gcc.asm
    gcc -shared -o ./foo_gcc.dll ./foo_gcc.o -Wl,--out-implib,./libfoo_gcc.a
    pexports ./foo_gcc.dll > ./foo_gcc.def
    lib /machine:i386 /def:foo_gcc.def /out:foo_gcc.lib

    rm -f ./foo_stack_align.obj
    yasm -f win32 ./foo_stack_align.asm

    rm ./foo_vc.obj
    rm ./foo_vc.asm
    rm ./foo_vc.exe
    cl -c -O2 ./foo_vc.c
    objdump -m i386:intel -D ./foo_vc.obj > foo_vc.asm
    cl ./foo_vc.obj ./foo_stack_align.obj ./foo_gcc.lib
[Reference]
1. How to build the environment of MSYS for the crossing compiler?
2. How to update YASM from 0.7.2 to 0.8.0 in MinGW?
3. Do not hybrid compile and link source code by using VC and GCC!
4. How to use GCC to build DLL by DEF file in MinGW?
5. How to generate DLL files by GCC in the MinGW?