Hou Jie C++ Course Study Notes: STL Source Analysis and Memory Management

Article link: https://blog.youkuaiyun.com/z_mz11/article/details/146498522

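Preface

Of everything in Hou Jie's C++ course series, what impressed me most was his analysis of the STL source code and his explanation of its memory management. This material helped me understand the design philosophy of the C++ standard library and the techniques behind efficient memory management. In this post I share what I learned about STL memory management, following the course content.

1. Design Philosophy of STL Allocators

As Hou Jie puts it in the course: "To understand the STL, you have to start with the allocator."

1.1 Why a custom allocator is needed

A conventional new/delete expression performs two steps:

Allocate/release memory
Call the constructor/destructor

The STL designers separated these two steps, which makes memory management more flexible. In the course, Hou Jie illustrates this separation by walking through a simplified implementation of std::allocator: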

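#include <cstddef>  // std::size_t
#include <new>      // ::operator new, ::operator delete, placement new

template <class T>
class allocator {
public:
    // Allocate raw memory for n objects without constructing them
    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    // Release memory without destroying objects (the count is unused here)
    void deallocate(T* p, std::size_t /*n*/) {
        ::operator delete(p);
    }

    // Construct an object in already allocated memory
    void construct(T* p, const T& val) {
        ::new (static_cast<void*>(p)) T(val);  // placement new
    }

    // Destroy an object without releasing its memory
    void destroy(T* p) {
        p->~T();
    }
};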

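This design made me realize that separating memory management from object lifetime management is the key to implementing efficient containers.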
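To make the four steps concrete, here is a minimal sketch of my own (it is not from the course). It uses the standard std::allocator through std::allocator_traits; the function name total_length_of_copies is a hypothetical helper made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical helper: builds `count` copies of `value` in raw storage,
// sums their lengths, then tears everything down by hand.
std::size_t total_length_of_copies(const std::string& value, std::size_t count) {
    using Alloc = std::allocator<std::string>;
    using Traits = std::allocator_traits<Alloc>;

    Alloc alloc;
    // Step 1: obtain raw memory only; no std::string constructor runs here.
    std::string* storage = Traits::allocate(alloc, count);

    std::size_t constructed = 0;
    std::size_t total = 0;
    try {
        // Step 2: construct each object in place.
        for (; constructed < count; ++constructed) {
            Traits::construct(alloc, storage + constructed, value);
        }
        assert(constructed == count);
        for (std::size_t i = 0; i < count; ++i) {
            total += storage[i].size();
        }
    } catch (...) {
        // Roll back only what was actually constructed.
        for (std::size_t i = 0; i < constructed; ++i) {
            Traits::destroy(alloc, storage + i);
        }
        Traits::deallocate(alloc, storage, count);
        throw;
    }

    // Step 3: run the destructors; the memory is still owned by us.
    for (std::size_t i = 0; i < count; ++i) {
        Traits::destroy(alloc, storage + i);
    }
    // Step 4: hand the raw memory back.
    Traits::deallocate(alloc, storage, count);
    return total;
}
```

Calling total_length_of_copies("abc", 4) walks through all four steps for four strings. A container does the same thing, except that it keeps the raw storage around between steps instead of releasing it immediately.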

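1.2 The two-level allocator in SGI STL

Hou Jie also explains in detail what makes the two-level allocator design in SGI STL so clever:

First-level allocator: uses malloc/free directly
Second-level allocator: maintains free lists and manages small blocks of memory

Requests larger than 128 bytes go to the first-level allocator; requests of 128 bytes or less go to the second-level allocator. Serving small requests from free lists reduces memory fragmentation and avoids paying the cost of a malloc call for every small allocation.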

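// Core of the second-level allocator (excerpt)
#include <cstddef>  // std::size_t

class __pool_alloc {
private:
    static const int __ALIGN = 8;           // small blocks are rounded up to a multiple of this
    static const int __MAX_BYTES = 128;     // upper bound for a small block
    static const int __NFREELISTS = 16;     // number of free lists (__MAX_BYTES / __ALIGN)

    // A free-list node: while a block is free, its first bytes hold the link
    union obj {
        union obj* free_list_link;
        char client_data[1];
    };

    // The 16 free lists, for block sizes 8, 16, ..., 128
    static obj* volatile free_list[__NFREELISTS];

    // Pick the free list for a request of `bytes` bytes (bytes must be > 0)
    static std::size_t FREELIST_INDEX(std::size_t bytes) {
        return ((bytes + __ALIGN-1) / __ALIGN - 1);
    }
  
};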

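After working through this code, I implemented a similar memory pool in one of my own projects and saw a clear performance improvement.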
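The index arithmetic is easier to follow with a few concrete numbers. The sketch below is my own reconstruction, not SGI code: round_up, freelist_index, FreeNode, push_block and pop_block are illustrative names, and the two list helpers only show the idea of threading a link through the free block itself.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

constexpr std::size_t kAlign = 8;  // same value as __ALIGN above

// Round a request up to the next multiple of kAlign (13 -> 16).
constexpr std::size_t round_up(std::size_t bytes) {
    return (bytes + kAlign - 1) & ~(kAlign - 1);
}

// Map a request size (must be > 0) to its free-list number (1..8 -> 0, 9..16 -> 1).
constexpr std::size_t freelist_index(std::size_t bytes) {
    return (bytes + kAlign - 1) / kAlign - 1;
}

// While a block is free, its first bytes store the link to the next free block.
struct FreeNode {
    FreeNode* next;
};

// Return a block to the front of a free list.
inline void push_block(FreeNode*& head, void* block) {
    head = ::new (block) FreeNode{head};
}

// Take the most recently returned block off a free list.
inline void* pop_block(FreeNode*& head) {
    assert(head != nullptr);
    FreeNode* node = head;
    head = node->next;
    return node;
}
```

With these definitions a 13-byte request is rounded up to 16 bytes and served from list 1, and a 128-byte request lands on list 15, the last of the 16 lists. Because the link lives inside the free block, the lists themselves cost no extra memory.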

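2. Memory Management Strategies of Containers

2.1 The growth strategy of vector

Hou Jie observes that "the implementation of vector embodies the idea of trading space for time". Reading the source shows how vector manages its memory: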

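// Excerpt: start, finish and end_of_storage are the three pointer members of vector
template <class T>
void vector<T>::push_back(const T& x) {
    if (finish != end_of_storage) {  // spare capacity is available
        construct(finish, x);        // construct the object in the spare space
        ++finish;                    // move the high-water mark
    }
    else {  // no spare capacity left
        const size_type old_size = size();
        // Allocate twice the old size (other growth strategies are possible)
        const size_type len = old_size != 0 ? 2 * old_size : 1;
        
        T* new_start = allocate(len);    // allocate the new storage
        T* new_finish = new_start;
        
        try {
            // Copy the old contents into the new storage
            new_finish = uninitialized_copy(start, finish, new_start);
            // Construct the new element in the new storage
            construct(new_finish, x);
            ++new_finish;
        }
        catch(...) {
            // Roll back: destroy whatever was constructed, then free the new storage
            destroy(new_start, new_finish);
            deallocate(new_start, len);
            throw;
        }
        
        // Destroy the old elements and release the old storage
        destroy(start, finish);
        deallocate(start, end_of_storage - start);
        
        // Point the members at the new storage
        start = new_start;
        finish = new_finish;
        end_of_storage = new_start + len;
    }
}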

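In the course, Hou Jie puts particular emphasis on vector's capacity growth strategy (usually doubling) and its trade-offs:

Advantage: push_back has amortized O(1) complexity, and frequent memory allocations are avoided
Disadvantage: memory may be wasted, and relocating the elements on each growth is expensive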
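The amortized O(1) claim can be checked with a small simulation. This is a sketch of my own that only counts events under the doubling policy shown above; it does not measure any real std::vector, whose growth factor is implementation-defined. GrowthStats and simulate_doubling are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

struct GrowthStats {
    std::size_t reallocations;   // how many times the storage was replaced
    std::size_t element_copies;  // how many elements were moved to new storage
};

// Replays `pushes` push_back calls under the "double when full" policy.
GrowthStats simulate_doubling(std::size_t pushes) {
    GrowthStats stats{0, 0};
    std::size_t size = 0;
    std::size_t capacity = 0;
    for (std::size_t i = 0; i < pushes; ++i) {
        if (size == capacity) {
            // Same rule as in push_back above: 0 -> 1, otherwise double.
            capacity = capacity != 0 ? 2 * capacity : 1;
            stats.element_copies += size;  // every existing element is relocated
            ++stats.reallocations;
        }
        ++size;
    }
    assert(size <= capacity);
    return stats;
}
```

For 1000 pushes the simulation reports 11 reallocations and 1023 relocated elements in total, which is fewer than 2 copies per push. The same run ends with a capacity of 1024 for 1000 elements, and right after a growth step up to half of the storage is unused, which is the space side of the trade-off.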

2.2 The node allocator of list

Unlike vector, list allocates its memory one node at a time, so its storage is not contiguous:

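template <class T>
struct __list_node {
    __list_node<T>* prev;
    __list_node<T>* next;
    T data;
};

template <class T, class Alloc = allocator<T>>
class list {
protected:
    // Dedicated node allocator: every request is for exactly one node
    typedef allocator<__list_node<T>> node_allocator;
    
    // allocate/deallocate are non-static members of the allocator in 1.1,
    // so a (stateless) allocator object is created for each call
    __list_node<T>* get_node() {
        return node_allocator().allocate(1);
    }
    
    void put_node(__list_node<T>* p) {
        node_allocator().deallocate(p, 1);
    }
    
    // ... rest of the implementation ...
};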

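In the course, Hou Jie points out that because list allocates and releases nodes so frequently, it needs an efficient memory management strategy. This is also why each STL container has its own memory management strategy: their usage scenarios and operation characteristics differ.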
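To show what "one allocation per node" looks like in practice, here is a small sketch of my own built on the standard std::allocator. ListNode, create_node, destroy_node and linked_payload_size are hypothetical names for illustration; this is not the SGI implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>

template <class T>
struct ListNode {
    ListNode* prev;
    ListNode* next;
    T data;
};

// Allocate storage for exactly one node, then construct the node in it.
template <class T>
ListNode<T>* create_node(const T& value) {
    using Alloc = std::allocator<ListNode<T>>;
    Alloc alloc;
    ListNode<T>* p = std::allocator_traits<Alloc>::allocate(alloc, 1);
    try {
        ::new (static_cast<void*>(p)) ListNode<T>{nullptr, nullptr, value};
    } catch (...) {
        std::allocator_traits<Alloc>::deallocate(alloc, p, 1);
        throw;
    }
    return p;
}

// Destroy the node, then return its storage.
template <class T>
void destroy_node(ListNode<T>* p) {
    using Alloc = std::allocator<ListNode<T>>;
    Alloc alloc;
    std::allocator_traits<Alloc>::destroy(alloc, p);
    std::allocator_traits<Alloc>::deallocate(alloc, p, 1);
}

// Links two nodes, walks the chain, and cleans up; returns the summed string lengths.
inline std::size_t linked_payload_size() {
    ListNode<std::string>* a = create_node(std::string("ab"));
    ListNode<std::string>* b = create_node(std::string("cde"));
    a->next = b;
    b->prev = a;
    std::size_t total = 0;
    for (ListNode<std::string>* n = a; n != nullptr; n = n->next) {
        total += n->data.size();
    }
    assert(b->prev == a);
    destroy_node(b);
    destroy_node(a);
    return total;
}
```

Every insertion costs one allocation and every erasure one deallocation, so a list that churns through many elements hits the allocator far more often than a vector does. That is the workload a free-list allocator like the one in 1.2 is meant to speed up.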

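3. A Closer Look at Memory Pools

3.1 How a memory pool works

Hou Jie's course goes into the implementation of memory pools in depth. Based on his explanation, here is a simplified memory pool for reference: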

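#include <cstddef>  // std::size_t
#include <cstdlib>  // std::malloc, std::free
#include <new>      // std::bad_alloc

template <typename T, std::size_t BlockSize = 4096>
class MemoryPool {
private:
    // Memory block header; blocks are chained so the destructor can free them
    struct Block {
        Block* next;
    };
    
    // Memory slot: holds an element while in use, a free-list link while free
    union Slot {
        T element;
        Slot* next;
    };
    
    // Offset of the first slot, rounded up so that every slot is properly aligned
    static constexpr std::size_t kHeaderSize =
        (sizeof(Block) + alignof(Slot) - 1) / alignof(Slot) * alignof(Slot);
    static_assert(BlockSize >= kHeaderSize + sizeof(Slot),
                  "BlockSize is too small to hold a single slot");
    // Number of whole slots that fit in one block
    static constexpr std::size_t kSlotsPerBlock =
        (BlockSize - kHeaderSize) / sizeof(Slot);
    
    Block* currentBlock_ = nullptr;   // most recently allocated block
    Slot* currentSlot_ = nullptr;     // next never-used slot in that block
    Slot* lastSlot_ = nullptr;        // one past the last whole slot
    Slot* freeSlots_ = nullptr;       // list of released slots
    
public:
    MemoryPool() = default;
    MemoryPool(const MemoryPool&) = delete;
    MemoryPool& operator=(const MemoryPool&) = delete;
    
    // Release every block (objects still alive in the pool are not destroyed)
    ~MemoryPool() {
        while (currentBlock_) {
            Block* next = currentBlock_->next;
            std::free(currentBlock_);
            currentBlock_ = next;
        }
    }
    
    // Allocate memory for one object (no constructor is called)
    T* allocate() {
        if (freeSlots_) {  // reuse a released slot first
            T* result = reinterpret_cast<T*>(freeSlots_);
            freeSlots_ = freeSlots_->next;
            return result;
        }
        
        if (currentSlot_ >= lastSlot_) {
            // The current block is exhausted: allocate a new one
            void* raw = std::malloc(BlockSize);
            if (!raw) throw std::bad_alloc();
            Block* newBlock = static_cast<Block*>(raw);
            newBlock->next = currentBlock_;
            currentBlock_ = newBlock;
            
            // Set up the slot pointers; never step past the last whole slot
            currentSlot_ = reinterpret_cast<Slot*>(
                static_cast<char*>(raw) + kHeaderSize);
            lastSlot_ = currentSlot_ + kSlotsPerBlock;
        }
        
        return reinterpret_cast<T*>(currentSlot_++);
    }
    
    // Release memory for one object (no destructor is called)
    void deallocate(T* p) {
        if (p) {
            reinterpret_cast<Slot*>(p)->next = freeSlots_;
            freeSlots_ = reinterpret_cast<Slot*>(p);
        }
    }
    
};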

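An efficient memory pool usually has to balance the following concerns:

Reducing system calls
Avoiding memory fragmentation
Improving locality
Thread safety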
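The MemoryPool above only hands out raw memory, so the caller still has to construct and destroy objects. The sketch below is my own and is deliberately smaller than MemoryPool: FixedPool is a hypothetical fixed-capacity pool that pairs slot reuse with placement new. It makes no attempt at thread safety; a real pool shared between threads would need a lock or per-thread free lists.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// Hypothetical fixed-capacity pool: all slots live inside the pool object.
template <typename T, std::size_t Capacity>
class FixedPool {
    union Slot {
        Slot* next;                                   // link while the slot is free
        alignas(T) unsigned char storage[sizeof(T)];  // object bytes while in use
    };

    Slot slots_[Capacity];
    Slot* free_ = nullptr;

public:
    FixedPool() {
        // Thread every slot onto the free list.
        for (std::size_t i = 0; i < Capacity; ++i) {
            slots_[i].next = free_;
            free_ = &slots_[i];
        }
    }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Take a slot and construct a T in it.
    template <typename... Args>
    T* create(Args&&... args) {
        if (!free_) throw std::bad_alloc();
        Slot* slot = free_;
        free_ = slot->next;
        T* object = nullptr;
        try {
            object = ::new (static_cast<void*>(slot->storage))
                T(std::forward<Args>(args)...);
        } catch (...) {
            slot->next = free_;  // give the slot back if the constructor throws
            free_ = slot;
            throw;
        }
        return object;
    }

    // Destroy the object and put its slot back at the front of the free list.
    void destroy(T* p) {
        if (!p) return;
        p->~T();
        Slot* slot = reinterpret_cast<Slot*>(p);
        slot->next = free_;
        free_ = slot;
    }
};
```

Because released slots go back to the front of the free list, the next create call reuses the slot that was freed last. That slot is the one most likely to still be in cache, which is where the locality benefit in the list above comes from.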

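3.2 Performance comparison with and without a memory pool

A simple performance test that compares the standard allocation path with the memory pool: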

#include <chrono>
#include <iostream>
#include <vector>

// Note: MemoryPool is the class template from section 3.1

// Test the standard allocation path (new/delete)
void test_standard_allocator() {
    auto start = std::chrono::high_resolution_clock::now();
    
    for (int i = 0; i < 1000000; ++i) {
        int* p = new int(i);
        delete p;
    }
    
    auto end = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> diff = end - start;
    std::cout << "Standard allocator time: " << diff.count() << " s\n";
}

// Test the memory pool
void test_memory_pool() {
    auto start = std::chrono::high_resolution_clock::now();
    
    MemoryPool<int> pool;
    for (int i = 0; i < 1000000; ++i) {
        int* p = pool.allocate();
        *p = i;
        pool.deallocate(p);
    }
    
    auto end = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> diff = end - start;
    std::cout << "Memory pool time: " << diff.count() << " s\n";
}
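One caveat about micro-benchmarks like this: an optimizing compiler is allowed to remove a new/delete pair whose result is never observed, so the first loop may end up measuring very little. The sketch below is my own suggestion for making the measurement more robust. seconds_taken and churn_heap are hypothetical helper names, and the volatile pointer is only one possible way to keep each allocation observable.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs `work` once and returns the elapsed time in seconds,
// measured on a monotonic clock.
template <typename F>
double seconds_taken(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}

// Allocates and frees `iterations` ints and returns a checksum, so the result
// of every allocation is actually used.
inline long long churn_heap(int iterations) {
    assert(iterations >= 0);
    long long checksum = 0;
    for (int i = 0; i < iterations; ++i) {
        int* volatile p = new int(i);  // volatile: the pointer must really be stored and reloaded
        checksum += *p;
        delete p;
    }
    return checksum;
}
```

A call such as seconds_taken([] { churn_heap(1000000); }) then times the heap path, and the same helper can wrap a loop over pool.allocate() and pool.deallocate(). Running each variant several times and comparing the fastest runs gives steadier numbers than a single measurement.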