C++/Constructors of Global Object

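This article explains in detail how global objects are initialised in C++ and in what order, including the difference between static and dynamic initialisation. It also discusses why the initialisation order of global objects defined in different source files is unpredictable, and presents a solution.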
Answer
This behaviour is defined in the ISO C++ standard in section "3.6.2 Initialisation of non-local objects" and covers over a page.

1/ Storage for such objects is first zero-initialised.

Zero initialisation and initialisation with a constant expression are called static initialisation. All other types of initialisation are called dynamic initialisation. 

Since you ask when global objects have their constructors called, it is dynamic initialisation we are interested in: those constructors are called as part of the dynamic initialisation of the global objects that require it.

2/ These points are important: 

Global objects (i.e. non-local objects with static storage duration) defined in namespace scope (i.e. not in a function) in the same translation unit (source file) and dynamically initialised shall be initialised in the order in which their definitions appear in the translation unit.

That is, provided your global objects are in the same source file then they are dynamically initialised in the order they appear in the source file.

If however your global objects are in different translation units (i.e. different source files, as you indicate in your question) then the order of initialisation is _undefined_. That is, there is _no_ predictable order of initialisation for this case. The order can differ not only between builds of a project using different tools (compilers, linkers etc.), or differing versions of the same tool set, but even between different builds of the same project using exactly the same tools!

So say you think you have a related set of global objects initialised in a sensible sequence, then maybe you modify some code, re-build your project, and crunch! The order of global object initialisation has changed and one object is now initialised _before_ some other object it relies on. It may even be that just re-building the whole project from scratch could cause a change that breaks the order of initialisation. This I think is due to the possible ways the compiled code (in object code files) could be linked together by a linker. A partial re-build could cause a different global object initialisation sequence to that produced by a full re-build - as maybe a quick rebuild uses incremental linking and a full rebuild requires a full link.

Generally such dynamic initialisation is done before the first statement of main. However a compiler is permitted to defer the dynamic initialisation of a global object until after the first statement of main, provided it happens before the first use of any object or function defined in the _same_ translation unit as that global object. Again, such compiler-controlled initialisation is restricted to a single translation unit. I suspect this is because the language, which is implemented by compilers, has little or no say over the way linkers function, or where it does, it only covers existing linker technology.

So, in short, the constructor of a global object is called either before the first statement of main or may be deferred until after the first statement of main but before the first use of any object or function in the same translation unit as the global object is defined in.

Only global objects in the same translation unit have a predictable sequence of initialisation, which matches the order they are defined in the translation unit.

There is _no_ defined or predictable order of initialisation between global objects defined in different translation units. Code relying on such orderings is _extremely_ fragile and likely to break for no obvious reason. For example: it works for you, you check the code in, and then it fails to work for anyone else...

This problem of undefined initialisation order of global objects is well known and so is a solution - see Scott Meyers' excellent book "Effective C++" - every serious C++ developer should read this book. Of particular interest here is "Item 47 Ensure that non-local static objects are initialised before they're used".

The answer is to replace such objects with functions that return a reference to a local static object, so for example rather than:

In a.cpp:

   SomeClass gSomeObject( gSomeOtherObject );

In b.cpp:

   SomeOtherClass gSomeOtherObject( 1, 2.0, 4 );

Where it is undefined whether gSomeOtherObject defined in b.cpp will be initialised before it is used to construct gSomeObject in a.cpp, use something like:

In a.cpp:

   SomeClass & GSomeObject()
   {
       static SomeClass obj( GSomeOtherObject() );

       return obj;
   }

In b.cpp:

   SomeOtherClass & GSomeOtherObject()
   {
       static SomeOtherClass obj( 1, 2.0, 4 );

       return obj;
   }

Obviously a.cpp and b.cpp would need to include appropriate headers containing the relevant class definitions and function declarations.

The reason this fixes the initialisation order problem is that the point of initialisation for _local_ static objects (those defined as static in functions) _is_ well defined - it occurs when execution first passes through the static object declaration. 

Thus suppose the static SomeOtherClass obj in GSomeOtherObject() has not yet been initialised when GSomeObject() is first called. The initialisation of obj in GSomeObject() then calls GSomeOtherObject(), which causes its static obj to be initialised, and a reference to it (now initialised) is returned for use in the initialisation of SomeClass obj in GSomeObject().

As someone I worked with once pointed out, this does not help so much with the associated problem of the order of static object destruction. However, local static objects are destroyed as a result of returning from main, in the reverse of the order in which they were initialised, so hopefully this will be sufficient in most cases.

For more sophisticated schemes see for example Andrei Alexandrescu's book "Modern C++ Design", specifically chapter 6 "Implementing Singletons".

Finally I should of course note that reliance on (too much) global data (many would say this includes singletons!) is considered bad design/development practice. It is worth considering if other techniques might be applicable, such as the parameterisation from above (PfA) pattern - see http://accu.org/index.php/journals/1411, http://accu.org/index.php/journals/1420 and http://www.software-architect.co.uk/slides/sa07-KevlinHenney-Selfish_Object.pdf, then maybe read http://accu.org/index.php/journals/1327 for some balance!


Hope this helps  

### Calling global C++ object constructors from ARM startup code

In an ARM embedded system the startup code is responsible for calling the constructors of global C++ objects before `main()` runs. This is an essential step in initialising the C++ runtime environment. The compiler places the addresses of these constructors in a dedicated section (such as `.ctors` or `.init_array`), and the startup code walks the resulting table of pointers between `__ctors_start__` and `__ctors_end__`, calling each entry in turn. Each pointer is assumed to be 4 bytes, as on 32-bit ARM. A complete implementation and explanation follow.

#### Complete constructor-calling implementation

```armasm
    .section .text.boot
    .global _start
_start:
    /* Initialise the stack pointer */
    ldr     sp, =stack_top

    /* Zero the BSS section */
    ldr     r0, =__bss_start__
    ldr     r1, =__bss_end__
    mov     r2, #0
bss_clear_loop:
    cmp     r0, r1
    strlo   r2, [r0], #4            @ unsigned compare: addresses are unsigned
    blo     bss_clear_loop

    /* Call the global C++ constructors */
    ldr     r0, =__ctors_start__    @ R0 = start of the constructor pointer array
    ldr     r1, =__ctors_end__      @ R1 = end of the constructor pointer array
ctors_loop:
    cmp     r0, r1                  @ reached the end?
    beq     ctors_done              @ if equal, we are done
    ldr     r2, [r0], #4            @ load constructor address into R2, R0 += 4
    stmfd   sp!, {r0, r1}           @ save R0/R1, which the callee may clobber
    blx     r2                      @ call the constructor
    ldmfd   sp!, {r0, r1}           @ restore R0/R1
    b       ctors_loop              @ next entry
ctors_done:
    /* Call main */
    bl      main

    /* Exit handling */
    b       .                       @ loop forever (optionally: soft reset)

    .size _start, . - _start
```

#### Key instructions

1. **`ldr r0, =__ctors_start__`**
   - Loads the start address of the constructor pointer table into R0.
   - `__ctors_start__` is defined by the linker script and points to the beginning of the `.ctors` or `.init_array` section.
2. **`ldr r1, =__ctors_end__`**
   - Loads the end address of the constructor pointer table into R1.
   - This points one position past the last entry of the array.
3. **`ldr r2, [r0], #4`**
   - Loads the function pointer at the current address into R2.
   - Post-indexed addressing: after the read, R0 is advanced by 4 bytes (the size of a pointer).
4. **`blx r2`**
   - Branches to the constructor and executes it.
   - The return address is stored in LR, and the callee may overwrite R0-R3, which is why R0 and R1 are saved around the call.

#### Linker script definitions (linker.ld)

```ld
SECTIONS
{
    .text : {
        *(.text.boot)
        *(.text)
    }
    .rodata : { *(.rodata) }
    .data : { *(.data) }
    .bss : {
        __bss_start__ = .;
        *(.bss)
        *(COMMON)
        __bss_end__ = .;
    }
    .ctors : {
        __ctors_start__ = .;
        KEEP(*(.ctors))
        KEEP(*(.init_array))
        __ctors_end__ = .;
    }
    .stack (NOLOAD) : {
        . = . + 0x1000;
        stack_top = .;
    }
    /DISCARD/ : { *(.comment) }
}
```

#### How the constructors are stored

```
Example memory layout:
0x8000: __ctors_start__
  +0x00: address of Constructor1  -> called via [blx r2]
  +0x04: address of Constructor2
  +0x08: address of Constructor3
  ...
0x8100: __ctors_end__ (address of the last constructor pointer + 4)
```

#### A typical global C++ object

```cpp
// Define a global object
class SystemLogger {
public:
    SystemLogger() {
        // Initialise hardware such as the serial port
    }
};

SystemLogger globalLogger;  // its constructor is registered in the .ctors section

int main() {
    // globalLogger has already been initialised here
    while (1);
}
```

### Advanced use: constructors with priorities

Some systems need initialisation to happen in priority order. With GCC, a lower priority number runs earlier:

```cpp
__attribute__((constructor(101)))
void highPriInit() {
    // Higher-priority initialisation function (runs first)
}

__attribute__((constructor(102)))
void lowPriInit() {
    // Lower-priority initialisation function (runs second)
}
```

The corresponding change to the linker script:

```ld
.init_array : {
    __ctors_start__ = .;
    KEEP(*(SORT(.init_array.*)))  /* sorted by priority */
    KEEP(*(.init_array))
    __ctors_end__ = .;
}
```