Pool allocator
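This post covers the implementation details of my memory management toolkit, which includes memory allocation, a memory pool, and garbage collection.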
Overview
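A pool allocator generally allows O(1) allocation: a free block is found right away, without searching a free list. To achieve this fast allocation, a pool allocator usually works with blocks of a predefined size. The idea is similar to a segregated list, but with even faster block determination.
This approach can greatly improve performance in systems that work with many objects of a predefined shape. For example, a game may need to allocate hundreds or even thousands of objects of the same type. In that case a fragmented malloc heap can become a source of slow allocation, which is why game consoles actively use memory pools for such objects.
So, let's dive into the details and implement one.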
Blocks and Chunks
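A pool allocator operates with the concepts of blocks and of chunks within each block.
Each chunk has a predefined size and encodes the object header, which stores the meta-information needed by the allocator or the collector.
Let's start with chunks.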
Chunks: individual objects
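Since the size is predefined, we don't need to store it in the header; we only keep a reference to the next object. We represent an allocation as a Chunk structure: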
struct Chunk {
  /**
   * When a chunk is free, the `next` contains the
   * address of the next chunk in a list.
   *
   * When it's allocated, this space is used by
   * the user.
   */
  Chunk *next;
};
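When a chunk is allocated, the mutator (user code) can fully occupy it, including the space initially taken by our next pointer. For this reason we don't even need a flag saying whether a chunk is free: that is handled by the allocation pointer, which always points to the current free chunk, as we'll see shortly.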
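To make this storage reuse concrete, here is a minimal sketch. The names Payload and reuseChunkStorage are hypothetical and exist only for this illustration; the sketch assumes the caller passes suitably aligned storage of at least sizeof(Payload) bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

struct Chunk {
  Chunk *next;
};

// Hypothetical user type, 16 bytes like the objects used later on.
struct Payload {
  std::uint64_t data[2];
};

// A chunk must be large enough to hold the `next` pointer while free.
static_assert(sizeof(Payload) >= sizeof(Chunk), "chunk too small");

// Hypothetical helper: treats `storage` first as a free chunk, then
// hands the very same bytes to the user. `storage` is assumed to be
// suitably aligned and at least `sizeof(Payload)` bytes long.
inline Payload *reuseChunkStorage(void *storage) {
  assert(storage != nullptr);

  // Free state: the first bytes hold the link to the next free chunk.
  new (storage) Chunk{nullptr};

  // Allocated state: the user overwrites everything, link included.
  return new (storage) Payload{{1, 2}};
}
```

The returned Payload lives at exactly the address that held the free chunk, so no per-chunk header survives once the chunk is handed out.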
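In memory, the free chunks form a simple linked list of all available chunks. Note again how the allocation pointer is set to the current free chunk, which can therefore be found immediately on an allocation request.
Once some objects are allocated, the allocation pointer advances accordingly and still points to the current free chunk, ready to be returned right away.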
Blocks: groups of chunks
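To support such fast allocation, the memory for the chunks has to be preallocated. This preallocation is exactly what is called an object pool, and in our implementation we call it a block.
However, a block is not represented as a separate structure in our code. It is a rather abstract concept that groups chunks simply by allocating enough space to store the required number of them.
The size of a block is determined by the number of chunks per block.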
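As a quick worked example of this sizing rule, the sketch below computes how many bytes one block needs. The helpers effectiveChunkSize and blockBytes are hypothetical names, and rounding a too-small chunk up to sizeof(Chunk) is an added safety assumption rather than something the allocator in this article does.

```cpp
#include <cassert>
#include <cstddef>

struct Chunk {
  Chunk *next;
};

// Hypothetical helper: a free chunk has to hold the `next` pointer,
// so never use a chunk smaller than `sizeof(Chunk)`.
inline std::size_t effectiveChunkSize(std::size_t requested) {
  return requested < sizeof(Chunk) ? sizeof(Chunk) : requested;
}

// Hypothetical helper: total number of bytes to request for one block.
inline std::size_t blockBytes(std::size_t chunksPerBlock,
                              std::size_t chunkSize) {
  assert(chunksPerBlock > 0 && "a block needs at least one chunk");
  return chunksPerBlock * effectiveChunkSize(chunkSize);
}
```

With 8 chunks of 16 bytes each, blockBytes(8, 16) gives 128 bytes per block.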
Let's start defining our PoolAllocator class, which accepts the number of chunks per block as a parameter:
#include <cstddef>   // size_t
#include <cstdlib>   // malloc
#include <iostream>  // std::cout
#include <new>       // std::bad_alloc

/**
 * The allocator class.
 *
 * Features:
 *
 *   - Parametrized by number of chunks per block
 *   - Keeps track of the allocation pointer
 *   - Bump-allocates chunks
 *   - Requests a new larger block when needed
 *
 */
class PoolAllocator {
 public:
  PoolAllocator(size_t chunksPerBlock)
      : mChunksPerBlock(chunksPerBlock) {}

  void *allocate(size_t size);
  void deallocate(void *ptr, size_t size);

 private:
  /**
   * Number of chunks per larger block.
   */
  size_t mChunksPerBlock;

  /**
   * Allocation pointer.
   */
  Chunk *mAlloc = nullptr;

  /**
   * Allocates a larger block (pool) for chunks of the given size.
   */
  Chunk *allocateBlock(size_t chunkSize);
};
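As we can see, the class keeps track of the allocation pointer (mAlloc), has a private routine allocateBlock for when a new block is needed, and provides the standard allocate and deallocate methods as its public API.
Let's focus on allocation first.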
Allocation
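To satisfy an allocation request, we need to return a pointer to a free chunk within the current block. However, when there are no chunks left in the current block, or when we don't have any block yet, we first need to allocate the block itself via the standard malloc mechanism. The signal for this is the allocation pointer mAlloc being set to nullptr.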
void *PoolAllocator::allocate(size_t size) {
  // No chunks left in the current block, or no block
  // exists yet. Allocate a new one, passing the chunk size:
  if (mAlloc == nullptr) {
    mAlloc = allocateBlock(size);
  }

  // ...
}
Now let's look at allocateBlock.
Block allocation
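The size of a block is the number of chunks per block multiplied by the chunk size. Once the block is allocated, we also need to chain all the chunks in it, so that the next pointer of every chunk is set and easy to follow.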
/**
 * Allocates a new block from OS.
 * Needs <cstdlib>, <iostream>, and <new>.
 *
 * Returns a Chunk pointer set to the beginning of the block.
 */
Chunk *PoolAllocator::allocateBlock(size_t chunkSize) {
  std::cout << "\nAllocating block (" << mChunksPerBlock << " chunks):\n\n";

  size_t blockSize = mChunksPerBlock * chunkSize;

  // The first chunk of the new block.
  Chunk *blockBegin = reinterpret_cast<Chunk *>(malloc(blockSize));
  if (blockBegin == nullptr) {
    throw std::bad_alloc();
  }

  // Once the block is allocated, we need to chain all
  // the chunks in this block:
  Chunk *chunk = blockBegin;

  // `i + 1 < n` avoids unsigned wrap-around and a signed/unsigned compare.
  for (size_t i = 0; i + 1 < mChunksPerBlock; ++i) {
    chunk->next =
        reinterpret_cast<Chunk *>(reinterpret_cast<char *>(chunk) + chunkSize);
    chunk = chunk->next;
  }

  // The last chunk terminates the list.
  chunk->next = nullptr;

  return blockBegin;
}
As a result, we return a Chunk pointer to the beginning of the block, blockBegin, and mAlloc is set to this value in the allocate function.
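The chaining step can also be checked in isolation. The following sketch pulls the loop out into a free function and walks the resulting list; linkChunks and freeListLength are hypothetical names, and the sketch assumes the chunk size is a multiple of alignof(Chunk).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct Chunk {
  Chunk *next;
};

// Hypothetical free function mirroring the chaining loop: links
// `count` chunks of `chunkSize` bytes that sit back to back in `block`.
inline Chunk *linkChunks(void *block, std::size_t count,
                         std::size_t chunkSize) {
  assert(block != nullptr && count > 0);
  assert(chunkSize >= sizeof(Chunk) && chunkSize % alignof(Chunk) == 0);

  Chunk *first = static_cast<Chunk *>(block);
  Chunk *chunk = first;

  for (std::size_t i = 0; i + 1 < count; ++i) {
    chunk->next =
        reinterpret_cast<Chunk *>(reinterpret_cast<char *>(chunk) + chunkSize);
    chunk = chunk->next;
  }

  // The last chunk terminates the list.
  chunk->next = nullptr;
  return first;
}

// Hypothetical checker: counts the chunks reachable from `head`.
inline std::size_t freeListLength(const Chunk *head) {
  std::size_t length = 0;
  for (; head != nullptr; head = head->next) {
    ++length;
  }
  return length;
}
```

Linking a 128-byte block as 8 chunks of 16 bytes should yield a list of length 8 in which neighbouring chunks are exactly 16 bytes apart.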
Now let's get back to allocate and handle chunk allocation within a block.
Chunk allocation
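OK, so we have allocated a block, and mAlloc is no longer nullptr. In this case we return the free chunk at the current position of the allocation pointer mAlloc. We also advance (bump) the allocation pointer for future allocation requests.
Let's look at the full implementation of the allocate function: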
/**
 * Returns the first free chunk in the block.
 *
 * If there are no chunks left in the block,
 * allocates a new block.
 */
void *PoolAllocator::allocate(size_t size) {
  // No chunks left in the current block, or no block
  // exists yet. Allocate a new one, passing the chunk size:
  if (mAlloc == nullptr) {
    mAlloc = allocateBlock(size);
  }

  // The return value is the current position of
  // the allocation pointer:
  Chunk *freeChunk = mAlloc;

  // Advance (bump) the allocation pointer to the next chunk.
  //
  // When no chunks are left, `mAlloc` will be set to `nullptr`, and
  // this will cause allocation of a new block on the next request:
  mAlloc = mAlloc->next;

  return freeChunk;
}
OK, we can now bump-allocate chunks within a block and malloc blocks from the OS. Now let's look at deallocation.
Deallocation
Deallocating a chunk is simpler: we just put it back at the front of the chunks list and set mAlloc to point to it.
Here is the full code of the deallocate function:
/**
 * Puts the chunk into the front of the chunks list.
 */
void PoolAllocator::deallocate(void *chunk, size_t size) {
  // The freed chunk's next pointer points to the
  // current allocation pointer:
  reinterpret_cast<Chunk *>(chunk)->next = mAlloc;

  // And the allocation pointer is now set
  // to the returned (free) chunk:
  mAlloc = reinterpret_cast<Chunk *>(chunk);
}
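After chunk A is freed, the allocation pointer is adjusted to point to it, and the next pointer of the returned chunk A now points to the previous position of mAlloc.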
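Because freed chunks are pushed to the front of the list, the most recently freed chunk is the first one handed out again. A minimal sketch of this last-in, first-out behavior, using a hypothetical FreeList wrapper that is not part of the allocator above:

```cpp
#include <cassert>

struct Chunk {
  Chunk *next;
};

// Hypothetical wrapper around the allocation pointer, used only to
// illustrate the order in which freed chunks are reused.
struct FreeList {
  Chunk *head = nullptr;

  // Same steps as `deallocate`: the freed chunk becomes the new head.
  void push(void *ptr) {
    assert(ptr != nullptr);
    Chunk *chunk = static_cast<Chunk *>(ptr);
    chunk->next = head;
    head = chunk;
  }

  // Same steps as `allocate` once a block exists: return the head and
  // advance. Returns `nullptr` when the list is empty.
  void *pop() {
    Chunk *chunk = head;
    if (chunk != nullptr) {
      head = chunk->next;
    }
    return chunk;
  }
};
```

Pushing chunk a and then chunk b makes the next two pops return b first and a second.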
Now that we can both allocate and deallocate, let's create a class that uses our custom pool allocator and see it in action.
Objects with custom allocator
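C++ allows overriding the default behavior of the new and delete operators. We take advantage of this to plug in our pool allocator, which will handle the allocation requests.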
#include <cstdint>  // uint64_t

/**
 * The `Object` structure uses custom allocator,
 * overloading `new`, and `delete` operators.
 */
struct Object {
  // Object data, 16 bytes:
  uint64_t data[2];

  // Declare our custom allocator for
  // the `Object` structure:
  static PoolAllocator allocator;

  static void *operator new(size_t size) {
    return allocator.allocate(size);
  }

  static void operator delete(void *ptr, size_t size) {
    allocator.deallocate(ptr, size);
  }
};

// Instantiate our allocator, using 8 chunks per block:
PoolAllocator Object::allocator{8};
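Once the allocator is declared and instantiated for our Object class, we can create instances of Object as usual, and the allocation requests are routed to our pool allocator.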
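To see the routing on its own, independent of the pool, the sketch below uses a hypothetical Tracked type whose class-level operators only count calls and delegate to malloc and free. It is an illustration of the language mechanism, not part of the allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical type: counts how often its class-level operators run.
struct Tracked {
  int value = 0;

  static int newCalls;
  static int deleteCalls;

  // Called for every `new Tracked`, instead of the global operator.
  static void *operator new(std::size_t size) {
    ++newCalls;
    void *ptr = std::malloc(size);
    if (ptr == nullptr) {
      throw std::bad_alloc();
    }
    return ptr;
  }

  // Called for every `delete` of a `Tracked *`; the compiler passes
  // the object size as the second argument.
  static void operator delete(void *ptr, std::size_t size) {
    assert(size == sizeof(Tracked));
    ++deleteCalls;
    std::free(ptr);
  }
};

int Tracked::newCalls = 0;
int Tracked::deleteCalls = 0;
```

A single new Tracked() followed by delete bumps each counter once, which confirms that the class-level operators, and not the global ones, handle the request.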
Usage and testing
Finally, let's test our allocator and see the management of blocks and chunks in action.
#include <iostream>

using std::cout;
using std::endl;

int main(int argc, char const *argv[]) {
  // Allocate 10 pointers to our `Object` instances:
  constexpr int arraySize = 10;

  Object *objects[arraySize];

  // Two `uint64_t`, 16 bytes.
  cout << "size(Object) = " << sizeof(Object) << endl << endl;

  // Allocate 10 objects. This causes allocating two larger
  // blocks, since we store only 8 chunks per block:
  cout << "About to allocate " << arraySize << " objects" << endl;

  for (int i = 0; i < arraySize; ++i) {
    objects[i] = new Object();
    cout << "new [" << i << "] = " << objects[i] << endl;
  }

  cout << endl;

  // Deallocate all the objects in reverse order. Start at the last
  // valid index, `arraySize - 1`, to stay within the array bounds:
  for (int i = arraySize - 1; i >= 0; --i) {
    cout << "delete [" << i << "] = " << objects[i] << endl;
    delete objects[i];
  }

  cout << endl;

  // New object reuses previous block:
  objects[0] = new Object();
  cout << "new [0] = " << objects[0] << endl << endl;
}
As a result of this execution you should see the following output:
size(Object) = 16
About to allocate 10 objects
Allocating block (8 chunks):
new [0] = 0x7fb266402ae0
new [1] = 0x7fb266402af0
new [2] = 0x7fb266402b00
new [3] = 0x7fb266402b10
new [4] = 0x7fb266402b20
new [5] = 0x7fb266402b30
new [6] = 0x7fb266402b40
new [7] = 0x7fb266402b50
Allocating block (8 chunks):
new [8] = 0x7fb266402b60
new [9] = 0x7fb266402b70
delete [9] = 0x7fb266402b70
delete [8] = 0x7fb266402b60
delete [7] = 0x7fb266402b50
delete [6] = 0x7fb266402b40
delete [5] = 0x7fb266402b30
delete [4] = 0x7fb266402b20
delete [3] = 0x7fb266402b10
delete [2] = 0x7fb266402b00
delete [1] = 0x7fb266402af0
delete [0] = 0x7fb266402ae0
new [0] = 0x7fb266402ae0
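Note that the specific addresses may differ on your machine. What matters here is that the objects (chunks) within a block are allocated densely, one after another, by the bump allocator.
Also notice how we start reusing the address 0x7fb266402ae0 once all the objects have been deallocated and a new allocation request arrives.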
Summary
A pool allocator is a useful and practical tool that can speed up your app when it allocates many objects of a predefined size. As an exercise, experiment with it and extend the allocator with other convenient methods: for example, allow deallocating a whole block at once on request and returning it to the global OS allocator.
You can find the full source code for this article in this gist.
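As one possible starting point for the exercise from the summary, the sketch below remembers every block it obtains so that all of them can be handed back to the system allocator at once. TrackingPool, releaseAll and blockCount are hypothetical names, and the class simplifies the exercise by releasing all blocks together rather than a single one; it also fixes the chunk size at construction, which differs from the PoolAllocator above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

struct Chunk {
  Chunk *next;
};

// Hypothetical pool variant that records each block it allocates.
class TrackingPool {
 public:
  TrackingPool(std::size_t chunksPerBlock, std::size_t chunkSize)
      : mChunksPerBlock(chunksPerBlock), mChunkSize(chunkSize) {
    assert(chunksPerBlock > 0);
    assert(chunkSize >= sizeof(Chunk) && chunkSize % alignof(Chunk) == 0);
  }

  TrackingPool(const TrackingPool &) = delete;
  TrackingPool &operator=(const TrackingPool &) = delete;

  ~TrackingPool() { releaseAll(); }

  void *allocate() {
    if (mAlloc == nullptr) {
      mAlloc = allocateBlock();
    }
    Chunk *freeChunk = mAlloc;
    mAlloc = mAlloc->next;
    return freeChunk;
  }

  void deallocate(void *ptr) {
    Chunk *chunk = static_cast<Chunk *>(ptr);
    chunk->next = mAlloc;
    mAlloc = chunk;
  }

  // Returns every block to the system allocator. All chunks handed
  // out earlier become invalid after this call.
  void releaseAll() {
    for (void *block : mBlocks) {
      std::free(block);
    }
    mBlocks.clear();
    mAlloc = nullptr;
  }

  std::size_t blockCount() const { return mBlocks.size(); }

 private:
  Chunk *allocateBlock() {
    // Reserve the bookkeeping slot first, so that a failing
    // `push_back` cannot leak a freshly allocated block.
    mBlocks.push_back(nullptr);
    void *block = std::malloc(mChunksPerBlock * mChunkSize);
    if (block == nullptr) {
      mBlocks.pop_back();
      throw std::bad_alloc();
    }
    mBlocks.back() = block;

    // Chain the chunks exactly as in the article's `allocateBlock`.
    Chunk *first = static_cast<Chunk *>(block);
    Chunk *chunk = first;
    for (std::size_t i = 0; i + 1 < mChunksPerBlock; ++i) {
      chunk->next = reinterpret_cast<Chunk *>(
          reinterpret_cast<char *>(chunk) + mChunkSize);
      chunk = chunk->next;
    }
    chunk->next = nullptr;
    return first;
  }

  std::size_t mChunksPerBlock;
  std::size_t mChunkSize;
  Chunk *mAlloc = nullptr;
  std::vector<void *> mBlocks;
};
```

With 8 chunks per block, ten allocations should leave blockCount() at 2, and releaseAll() brings it back to 0. Releasing a single block independently would additionally require knowing that none of its chunks are still in use, which this sketch does not track.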