non-virtual thunk for Virtual Function in multiple inheritance

Reposted from: http://thomas-sanchez.net/computer-sciences/2011/08/15/what-every-c-programmer-should-know-the-hard-part/

What every C++ programmer should know, The hard part

Previously, I explained how C++ handles classes and the inheritance between them, but I did not cover how virtual is handled.

Virtual adds a lot of complexity. C++ is compiled, and when a binary is linked against a library the two have to speak the same language: they have to share the same ABI. The C++ creators had to find a way to carry metadata about the manipulated classes throughout the program's lifetime.

They chose the Virtual Tables.

The Virtual Table

When a C++ program is compiled, the binary embeds some information about the classes the program manipulates. When a class inherits from an interface, the actual implementation of each method must always be reachable. The Virtual Tables (VTables) are generated during compilation; they can be seen as arrays of method pointers.

Let’s take an example:

#include <iostream>

struct Interface
{
    Interface() : i(0x424242) {}
    virtual void test_method() = 0;
    virtual ~Interface() {}
    int i;
};

struct Daughter : public Interface
{
    void test_method()
    {
        std::cout << "This is a call to the method" << std::endl;
        std::cout << "This: " << this << std::endl;
    }
};

int main()
{
    Daughter* d = new Daughter;
    Interface* i = d;

    i->test_method();

    // Size of the object, then its first two pointer-sized words.
    std::cout << sizeof(Daughter) << std::endl;
    std::cout << *((void**)i) << std::endl;
    std::cout << ((void**)i)[1] << std::endl;

    delete d;
}

As a reminder, all the tests were run on 64-bit Linux.

The size of a Daughter instance is not 8 bytes, as we might expect, but 16. The memory dump shows that the first field of the object is not the value of i but a strange value, and our field comes right after it. This ‘strange’ value is actually a pointer, and it points inside our binary.

nm -C test | grep 400d
0000000000400de0 V vtable for Daughter

I will explain later why there is a difference of a few bytes between the two addresses. This pointer gives the location of the Daughter VTable, and we can now check its content.

As I said, a VTable is a kind of array of method pointers.

Getting a pointer to it is simple:

size_t** vtable = *(size_t***)i;
std::cout << vtable[0] << std::endl;

If we look up the address printed on the output, we can see that it is indeed the address of our method.

nm -C test | grep -E 400c6a
0000000000400c6a W Daughter::test_method()

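We can play a little more to dig deeper. Note that calling a method through a plain function pointer relies on the platform's calling convention (this passed as the first argument) and is not portable C++: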

typedef void (*VtablePtr) (Daughter*);
VtablePtr ptr = (VtablePtr)vtable[0];
ptr(d);

VTables are built during compilation. When the compiler sees a virtual method in a class, it starts constructing a VTable associated with that class. When another class inherits from it, the derived class automatically gets its own copy of the table, and its instances receive a pointer to that copy. Each entry of the VTable is filled in when the actual definition of the method is encountered; it is always the last (most derived) definition that is kept.

The index of a method in the VTable follows its order of declaration in the source file. That is why it is very important that every part of a project is compiled against consistent headers: it is always embarrassing when the wrong method gets called and nobody knows why…
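To make the "array of method pointers" idea concrete, here is a minimal hand-written model of the mechanism. It is only a sketch: Object, Slot, call and the two tables are hypothetical names invented for the illustration, and a real compiler's tables contain more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical hand-written model of a VTable: an array of function
// pointers, one slot per virtual method, in declaration order.
struct Object;                                  // forward declaration
using Method = std::string (*)(const Object*);

enum Slot : std::size_t { kName = 0, kSound = 1, kSlotCount = 2 };

struct Object {
    const Method* vptr;                         // points at the class's table
};

std::string base_name(const Object*)     { return "Base"; }
std::string base_sound(const Object*)    { return "..."; }
std::string derived_sound(const Object*) { return "woof"; }

// The base class table: every slot points at the base definition.
const Method kBaseTable[kSlotCount] = { base_name, base_sound };

// The derived table starts as a copy of the base table; only the
// overridden slot (kSound) is replaced by the most derived definition.
const Method kDerivedTable[kSlotCount] = { base_name, derived_sound };

// A "virtual call": load the table through vptr, index it, call the slot.
std::string call(const Object* obj, Slot slot) {
    assert(slot < kSlotCount);
    return obj->vptr[slot](obj);
}
```

In this model, swapping the order of kName and kSound in one translation unit but not in another would make call() pick the wrong function, which is the header-consistency problem described above.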

Here is the complete code:

#include <cstddef>
#include <iostream>

struct Interface
{
    Interface() : i(0x424242) {}
    virtual void test_method() = 0;
    virtual ~Interface() {}
    int i;
};

struct Daughter : public Interface
{
    void test_method()
    {
        std::cout << "This is a call to the method" << std::endl;
        std::cout << "This: " << this << std::endl;
    }
};

int main()
{
    Daughter* d = new Daughter;
    Interface* i = d;

    i->test_method();

    // Size of the object, then its first two pointer-sized words.
    std::cout << sizeof(Daughter) << std::endl;
    std::cout << *((void**)i) << std::endl;
    std::cout << ((void**)i)[1] << std::endl;

    // First word of the object: the VTable pointer.
    size_t** vtable = *(size_t***)i;
    std::cout << vtable[0] << std::endl;

    // Call the first VTable entry by hand (not portable).
    typedef void (*VtablePtr) (Daughter*);
    VtablePtr ptr = (VtablePtr)vtable[0];
    ptr(d);

    delete d;
}

In conclusion, once virtual appears, an instance should be seen like this:

VPTR
Base1
Daughter

The instance is also heavier by sizeof(void*) * nb_of_vptr bytes, plus any alignment padding.
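The size cost can be checked without looking at raw memory. The following sketch compares two otherwise identical structs; Plain, WithVirtual and vptr_overhead are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

struct Plain {            // no virtual method: no hidden pointer
    int i;
};

struct WithVirtual {      // one virtual method: one hidden VTable pointer
    virtual ~WithVirtual() {}
    int i;
};

// Extra bytes paid for the virtual machinery: the hidden pointer plus
// any padding the compiler adds to keep that pointer aligned.
std::size_t vptr_overhead() {
    assert(sizeof(WithVirtual) > sizeof(Plain));
    return sizeof(WithVirtual) - sizeof(Plain);
}
```

On the 64-bit Linux setup used in this article, the overhead should be larger than sizeof(void*) because of padding, which is consistent with the 16 bytes reported for Daughter earlier.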

Virtual in multiple inheritance

As usual, we are going to start with some simple code:

#include <iostream>

struct Mother
{
    virtual void mother() = 0;
    virtual ~Mother() {}
    int i;
};

struct Father
{
    virtual void father() = 0;
    virtual ~Father() {}
    int j;
};

struct Daughter : public Mother, public Father
{
    void mother()
    { std::cout << "Mother: " << this << std::endl; }

    void father()
    { std::cout << "Father: " << this << std::endl; }

    int k;
};

int main()
{
    Daughter* d = new Daughter;
    Mother* m = d;
    Father* f = d;
    (void)m; // unused, kept to show the implicit upcast

    std::cout << "Daughter: " << (void*)d << std::endl;
    std::cout << "Father  : " << (void*)f << std::endl;
    std::cout << sizeof(*d) << std::endl;

    // VTable pointer seen through each view of the object.
    std::cout << *((void**)d) << std::endl;
    std::cout << *((void**)f) << std::endl;

    delete d;
}

As you can see, the two tables used are different. When objects are manipulated, it is not always (never?) through the concrete type but through the abstract one. With multiple inheritance that can be either a Mother or a Father, so when a Father is used and the actual implementation lives in Daughter, the method must still be reachable. That is why there is a second VTable pointer.

However, when a Daughter instance is used through a Father pointer, a Daughter method cannot be called directly: the instance pointer first needs to be adjusted so that it points to the Daughter instance. Thunk functions solve this problem.
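The adjustment itself is visible from ordinary C++, because the compiler applies it on every cast between Daughter and Father. The sketch below reuses the class names from the example; father_offset and round_trip_ok are hypothetical helpers added for the illustration.

```cpp
#include <cassert>
#include <cstddef>

struct Mother { virtual void mother() = 0; virtual ~Mother() {} int i = 0; };
struct Father { virtual void father() = 0; virtual ~Father() {} int j = 0; };

struct Daughter : public Mother, public Father {
    void mother() override {}
    void father() override {}
    int k = 0;
};

// Distance in bytes between the start of the Daughter object and the
// Father sub-object inside it.
std::ptrdiff_t father_offset(Daughter* d) {
    assert(d != nullptr);
    Father* f = d;   // implicit upcast: the compiler adjusts the address
    return reinterpret_cast<char*>(f) - reinterpret_cast<char*>(d);
}

// The downcast undoes the adjustment and recovers the original address.
bool round_trip_ok(Daughter* d) {
    Father* f = d;
    return static_cast<Daughter*>(f) == d;
}
```

A thunk performs the same subtraction as the static_cast in round_trip_ok, except that it happens at call time, when only a Father pointer is available.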

If we print the first entry of the VTable and disassemble the code at this location, we get this:

0000000000400cf4 <non-virtual thunk to Daughter::father()>:
  400cf4:       48 83 ef 10             sub    $0x10,%rdi
  400cf8:       eb 00                   jmp    400cfa <Daughter::father()>

These two instructions adjust the pointer by subtracting the size of the Mother class, so that it matches the Daughter instance, and then jump to the real method. Therefore, with multiple inheritance, a virtual call can easily pick up extra indirection:

  • Get the VTable;
  • Move to the wanted method (apply an offset to the VTable pointer, for example 8 to get the second method);
  • Call the entry found there (the thunk);
  • Adjust the this pointer;
  • Jump to the actual method definition.
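The last two steps can be imitated by hand. This is only a model under stated assumptions: it uses plain structs and composition instead of real inheritance, and MotherPart, FatherPart, DaughterModel, daughter_father and father_thunk are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model: the two "base" parts are laid out one after the
// other inside the full object, as with multiple inheritance.
struct MotherPart { int i; };
struct FatherPart { int j; };

struct DaughterModel {
    MotherPart mother;
    FatherPart father;
    int k;
};

// The real method: expects a pointer to the whole object.
std::string daughter_father(DaughterModel* self) {
    return "k=" + std::to_string(self->k);
}

// Hand-written "thunk": receives a pointer to the FatherPart, moves it
// back to the start of the enclosing object, then forwards the call.
// Only valid when part really is the father member of a DaughterModel.
std::string father_thunk(FatherPart* part) {
    assert(part != nullptr);
    char* raw = reinterpret_cast<char*>(part) - offsetof(DaughterModel, father);
    return daughter_father(reinterpret_cast<DaughterModel*>(raw));
}
```

The subtraction in father_thunk plays the role of the sub instruction in the disassembly, and the forwarded call plays the role of the jmp.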

 

Method Pointer

Yes, method pointers have a cost. Unlike C, where function pointers have no overhead, C++ has to take two things into account:

  • From which instance the method is accessed;
  • Is the method virtual?

The first point requires a pointer adjustment. The second point requires, well, a lot of things.

First, the size of a method pointer is 16 bytes (against 8 for a C function pointer). A method pointer has three parts:

  1. Offset
  2. Address/index
  3. virtual?

The first part takes 8 bytes, and so does the second. The third part is a single bit merged into the second one: with the ABI used by GCC on 64-bit Linux, if the lowest bit of that field is set, the field should be read as an offset into the VTable plus one; otherwise it is the address of the method.

Therefore, calling through a method pointer requires about 20 asm instructions in the worst case:

  1. Get the offset to apply on the instance pointer;
  2. Apply it;
  3. Check if we call a virtual member function;
  4. If yes, subtract 1;
  5. Get the VTable;
  6. Get the method address;
  7. Call the method.
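From the source code's point of view, all of these steps hide behind the ->* operator. The sketch below shows a method pointer used on both a virtual and a non-virtual method; Shape, Circle, invoke and method_pointer_ratio are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct Shape {
    virtual ~Shape() {}
    virtual std::string name() const { return "shape"; }   // virtual
    std::string kind() const { return "generic"; }         // non-virtual
};

struct Circle : public Shape {
    std::string name() const override { return "circle"; }
};

using ShapeMethod  = std::string (Shape::*)() const;  // pointer to method
using FreeFunction = std::string (*)();               // plain function pointer

// Call through a method pointer: the compiler emits the offset and
// virtual checks listed above behind the ->* operator.
std::string invoke(const Shape* s, ShapeMethod m) {
    assert(s != nullptr && m != nullptr);
    return (s->*m)();
}

// How many plain function pointers fit in one method pointer.
std::size_t method_pointer_ratio() {
    return sizeof(ShapeMethod) / sizeof(FreeFunction);
}
```

Calling invoke(&circle, &Shape::name) still reaches Circle::name, which shows that the virtual lookup is done at call time. With the 16-byte method pointers described above, method_pointer_ratio() would be expected to return 2, but that value depends on the ABI.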

Conclusion

In a future article I'll cover the VTable prefix and virtual inheritance, although they are less common in C++ code. In these two articles I have tried to shed some light on C++'s internal mechanisms. C++ is a fast language, but complex class relationships can make it much less efficient. I am not saying "don't use virtual and method pointers"; I think programmers should simply be aware of these costs.

I think readability is more important than performance. Yes, you can have a lot of overhead in C++, but it will still be more efficient than a lot of languages. Sometimes, though, you can avoid virtual dispatch. For example, the usual way for a beginner (and sometimes for more experienced C++ programmers) to build an abstraction is to define an interface and then, for each implementation, define a new class that inherits from it.

Sometimes that is the right thing to do, sometimes not. If you are asked to write a filesystem abstraction for Linux and Windows and you follow that approach, you'll write an iFS interface, a WindowsFS and a LinuxFS. It will work well, but you can do even better: write WindowsFS and LinuxFS, then define a new type FS according to the platform the code is compiled for. On Linux we could imagine something like this:

typedef LinuxFS FS;

With code like this, you avoid the overhead due to the interface. It works well for abstracting platform-specific features, but it does not work for data abstraction, where you will still need an interface.
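A slightly fuller version of that idea might look like the sketch below. The separator() method and the join() helper are assumptions made up for the example, and the _WIN32 macro is used here as one common way to detect a Windows build.

```cpp
#include <cassert>
#include <string>

// Hypothetical back ends: same method names, no common base class.
struct LinuxFS {
    std::string separator() const { return "/"; }
};

struct WindowsFS {
    std::string separator() const { return "\\"; }
};

// Pick the implementation at compile time; no VTable is involved.
#if defined(_WIN32)
typedef WindowsFS FS;
#else
typedef LinuxFS FS;
#endif

// Client code only ever names FS.
std::string join(const FS& fs, const std::string& dir, const std::string& file) {
    assert(!file.empty());
    return dir + fs.separator() + file;
}
```

Because FS is resolved by the compiler, the call to separator() is an ordinary direct call that can be inlined. The price is that both back ends must expose the same method names by convention, since no interface enforces it.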
