LevelDB code walkthrough: skiplist

Latest recommended article published 2019-10-15 20:44:22

cmbj68996

Tags: data structures and algorithms

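A skiplist builds on a linked list: each node carries several extra pointers that skip ahead to nodes further down the list, which gives the list a tree-like structure.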

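How it differs from an ordinary linked list:
1 Because a skiplist node also records the addresses of nodes further ahead, it has branches, and so, like a tree, it has a notion of height.
2 A skiplist is a sorted list, and the node's data type is a template parameter, so it can be any type. To sort, the user must therefore supply an implementation of a Comparator class that tells the skiplist how to compare the data of two nodes.
3 Precisely because a skiplist is sorted, lookups are very efficient: each node records both its immediate successor and successors further ahead, so a search can skip over many nodes and gets efficiency similar to binary search (expected O(log n)).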
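
To make point 2 concrete, here is a minimal sketch of what a user-supplied comparator could look like. It follows the convention the code later in this article relies on (`compare_(a, b) < 0` means a sorts before b). The name LengthThenAlphaComparator and the ordering rule are made up for illustration and are not part of leveldb.

```cpp
#include <cassert>
#include <string>

// Hypothetical comparator: shorter strings sort first, ties are broken
// alphabetically. Returns <0 if a sorts before b, 0 if they are equal,
// and >0 if a sorts after b.
struct LengthThenAlphaComparator {
  int operator()(const std::string& a, const std::string& b) const {
    if (a.size() != b.size()) {
      return a.size() < b.size() ? -1 : 1;
    }
    return a.compare(b);  // negative, zero, or positive
  }
};
```

A skiplist instantiated with this comparator would keep "b" before "aa", because length is compared first.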

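As the figure shows, the more pointers a node holds, the "taller" it is, and the pointers form several levels.

The bottom level, which links every node, is level 0; the top one is level 3.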

In leveldb the skiplist is declared and implemented entirely in one file, skiplist.h.
SkipList is a class template. It contains a Node type and a custom Iterator class for traversal.

template<typename Key, class Comparator>
class SkipList {
 private:
  struct Node;
  ...
  class Iterator {
    ...
  };
  ...
};

First, let's look at what is inside Node.
Node is defined as a struct, so its members and functions are public by default.

template<typename Key, class Comparator>
struct SkipList<Key,Comparator>::Node {
  ...
};

The public member is the node's data (key). The constructor simply initializes key.

explicit Node(const Key& k) : key(k) { }
Key const key;

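The only private member is an array of pointers to later nodes.

leveldb's port module wraps the basic operations of each platform. For example, port::AtomicPointer is essentially a wrapper around a void*. So port::AtomicPointer next_[1] is really just an array of pointers to later nodes.

The more pointers a node has, the higher its level.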

port::AtomicPointer next_[1];

Next come the operations on Node.

They are all plain pointer operations, so the functions are simple.

Get the next node on level n

Node* Next(int n) {
  assert(n >= 0);
  // Use an 'acquire load' so that we observe a fully initialized
  // version of the returned Node.
  return reinterpret_cast<Node*>(next_[n].Acquire_Load());
}

Set the next pointer on level n

void SetNext(int n, Node* x) {
  assert(n >= 0);
  // Use a 'release store' so that anybody who reads through this
  // pointer observes a fully initialized version of the inserted node.
  next_[n].Release_Store(x);
}

The next two functions also get and set the next pointer. The difference is that they use the "NoBarrier" operations.

// No-barrier variants that can be safely used in a few locations.
Node* NoBarrier_Next(int n) {
  assert(n >= 0);
  return reinterpret_cast<Node*>(next_[n].NoBarrier_Load());
}
void NoBarrier_SetNext(int n, Node* x) {
  assert(n >= 0);
  next_[n].NoBarrier_Store(x);
}

The difference between Next and NoBarrier_Next comes down to this call:

next_[n].Acquire_Load()
next_[n].NoBarrier_Load()

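Here are the two functions themselves.
NoBarrier_Load returns the pointer directly,
while Acquire_Load performs a "memory barrier" after reading the pointer and before returning it.

inline void* NoBarrier_Load() const { return rep_; }
inline void* Acquire_Load() const {
  void* result = rep_;
  MemoryBarrier();
  return result;
}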

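About memory barriers: roughly speaking, to get the most out of the hardware and avoid waiting on memory, a CPU (and the compiler) may reorder memory operations as long as the result seen by a single thread does not change.
But when several cores work on a shared memory region and each core reorders on its own, another core may observe those operations in an unexpected order, so the cores no longer have a consistent view of the variables.
A memory barrier is a CPU instruction with two main effects:
1 Memory operations before the barrier must complete before the ones after it start. In other words, operations cannot be reordered across the barrier.
2 Writes made before the barrier become visible to the other CPUs, so every CPU sees the latest values.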
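
For comparison, the same four operations can be sketched with C++11 std::atomic, where the barrier is expressed as a memory order on the load or store instead of an explicit MemoryBarrier() call. This is only an illustration under that assumption, not the port::AtomicPointer code from leveldb. AtomicPointerSketch, Payload, Publish and ReadOrDefault are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for port::AtomicPointer, built on std::atomic.
class AtomicPointerSketch {
 public:
  AtomicPointerSketch() : rep_(nullptr) {}
  explicit AtomicPointerSketch(void* p) : rep_(p) {}

  // No ordering guarantees beyond atomicity of the pointer itself.
  void* NoBarrier_Load() const { return rep_.load(std::memory_order_relaxed); }
  void NoBarrier_Store(void* v) { rep_.store(v, std::memory_order_relaxed); }

  // Acquire: reads after this load cannot be moved before it.
  void* Acquire_Load() const { return rep_.load(std::memory_order_acquire); }
  // Release: writes before this store cannot be moved after it.
  void Release_Store(void* v) { rep_.store(v, std::memory_order_release); }

 private:
  std::atomic<void*> rep_;
};

struct Payload {
  int value;
};

// Writer side: fully initialize the object first, then publish the pointer.
inline void Publish(AtomicPointerSketch* slot, Payload* p, int v) {
  p->value = v;
  slot->Release_Store(p);
}

// Reader side: a non-null pointer obtained through the acquire load is
// guaranteed to point at a fully initialized Payload.
inline int ReadOrDefault(const AtomicPointerSketch& slot, int fallback) {
  Payload* p = static_cast<Payload*>(slot.Acquire_Load());
  return p == nullptr ? fallback : p->value;
}
```

This is the same pairing as SetNext (release store) and Next (acquire load): the writer finishes building the node before publishing it, and a reader that sees the pointer also sees the finished node.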

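Now for the body of the skiplist itself.
The key member variables (all private):

1 The maximum height a node may have; used during initialization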

enum { kMaxHeight = 12 };

2 Compares the data of two nodes; used for sorting and searching

Comparator const compare_;

3 leveldb's memory pool

For how the memory pool is implemented, see my blog post http://blog.itpub.net/26239116/viewspace-1832774/

Arena* const arena_;

4 The head node

Node* const head_;

5 The current maximum height of the skiplist; updated on insert

port::AtomicPointer max_height_;

6 Used to pick the height of a new node at random

Random rnd_;
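
RandomHeight() is called by Insert but is not shown in this article. A minimal sketch of how such a height could be drawn is below. It assumes a branching factor of 4 and uses a standard library generator in place of leveldb's Random class. kDemoMaxHeight, kDemoBranching and DemoRandomHeight are illustrative names.

```cpp
#include <cassert>
#include <random>

constexpr int kDemoMaxHeight = 12;       // same bound as kMaxHeight above
constexpr unsigned kDemoBranching = 4;   // assumed: grow with probability 1/4

// Starts at height 1 and keeps growing with probability 1/kDemoBranching,
// so tall nodes become geometrically rarer. Works with any generator whose
// operator() returns an unsigned integer.
template <typename Rng>
int DemoRandomHeight(Rng& rng) {
  int height = 1;
  while (height < kDemoMaxHeight && (rng() % kDemoBranching) == 0) {
    height++;
  }
  assert(height >= 1 && height <= kDemoMaxHeight);
  return height;
}
```

With these assumptions about three quarters of the nodes have height 1, and each additional level is four times rarer than the one below it.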

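Constructor

The template arguments give the node's data type and the user's own Comparator class, which compares the data stored in nodes.

The caller also passes in a memory pool, arena. All of the skiplist's memory is allocated from it.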

template<typename Key, class Comparator>
SkipList<Key,Comparator>::SkipList(Comparator cmp, Arena* arena)
    // Initialize the member variables
    : compare_(cmp),
      arena_(arena),
      // Create the head node
      head_(NewNode(0 /* any key will do */, kMaxHeight)),
      // The height of an empty list starts at 1
      max_height_(reinterpret_cast<void*>(1)),
      rnd_(0xdeadbeef) {
  // Starting from the head node, initialize an empty skiplist
  for (int i = 0; i < kMaxHeight; i++) {
    head_->SetNext(i, NULL);
  }
}

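Let's look at the main private functions first, since the public functions below use them.

1 Create a new node with the given key and height

template<typename Key, class Comparator>
typename SkipList<Key,Comparator>::Node*
SkipList<Key,Comparator>::NewNode(const Key& key, int height) {
  // Allocate one block from the arena: the size of Node itself plus the extra
  // pointers. It is height - 1 because Node's own next_[1] already provides
  // the slot for level 0.
  char* mem = arena_->AllocateAligned(
      sizeof(Node) + sizeof(port::AtomicPointer) * (height - 1));
  // Placement new: construct the Node at the given address. No memory is
  // allocated by this expression.
  return new (mem) Node(key);
}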
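
The over-allocation trick in NewNode can be tried outside leveldb. Below is a minimal sketch that uses std::malloc in place of the arena and plain pointers in place of port::AtomicPointer. TowerNode, MakeTowerNode, TowerNext, TowerSetNext and FreeTowerNode are illustrative names. Like the original, it indexes past the declared one-element array, which relies on the extra space allocated right behind the node and on compilers tolerating this idiom.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

struct TowerNode {
  explicit TowerNode(int k) : key(k) {}
  int const key;
  TowerNode* next_[1];  // really `height` slots, see MakeTowerNode
};

// Allocates one block large enough for the node plus (height - 1) extra
// pointer slots, then constructs the node in place with placement new.
inline TowerNode* MakeTowerNode(int key, int height) {
  assert(height >= 1);
  void* mem =
      std::malloc(sizeof(TowerNode) + sizeof(TowerNode*) * (height - 1));
  if (mem == nullptr) return nullptr;
  TowerNode* n = new (mem) TowerNode(key);  // constructs, does not allocate
  for (int i = 0; i < height; i++) n->next_[i] = nullptr;
  return n;
}

inline TowerNode* TowerNext(TowerNode* n, int level) { return n->next_[level]; }
inline void TowerSetNext(TowerNode* n, int level, TowerNode* x) {
  n->next_[level] = x;
}

// Memory obtained with malloc must be released with free, after running the
// destructor by hand.
inline void FreeTowerNode(TowerNode* n) {
  n->~TowerNode();
  std::free(n);
}
```

leveldb never needs the equivalent of FreeTowerNode, because the arena releases all of its memory in one go.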

2 Return whether the given key belongs after the given Node, that is, whether it is greater than the Node's key.

template<typename Key, class Comparator>
bool SkipList<Key,Comparator>::KeyIsAfterNode(const Key& key, Node* n) const {
  // NULL n is considered infinite
  // This calls the user-supplied compare_ to compare two keys.
  return (n != NULL) && (compare_(n->key, key) < 0);
}

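3 Find the smallest node whose key is equal to or greater than the given key

template<typename Key, class Comparator>
typename SkipList<Key,Comparator>::Node*
SkipList<Key,Comparator>::FindGreaterOrEqual(const Key& key, Node** prev)
    const {
  // x is the current node. Start from the head node.
  Node* x = head_;
  // Start on the highest level currently in use
  int level = GetMaxHeight() - 1;
  while (true) {
    // Look on level `level`
    Node* next = x->Next(level);
    // Compare
    if (KeyIsAfterNode(key, next)) {
      // Keep searching in this list
      // key is greater than next's key, so move x to next and keep going.
      // This is a step to the right on the current level.
      x = next;
    } else {
      // prev is an array supplied by the caller, with room for at least as
      // many levels as this list currently has.
      // It records, for every level, the largest node smaller than key,
      // i.e. the path walked on each level during the search.
      if (prev != NULL) prev[level] = x;
      if (level == 0) {
        return next;
      } else {
        // Switch to next list
        level--;
      }
    }
  }
}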

4 Find the largest node whose key is smaller than the given key

template<typename Key, class Comparator>
typename SkipList<Key,Comparator>::Node*
SkipList<Key,Comparator>::FindLessThan(const Key& key) const {
  Node* x = head_;
  int level = GetMaxHeight() - 1;
  while (true) {
    // The current node must be smaller than key.
    assert(x == head_ || compare_(x->key, key) < 0);
    Node* next = x->Next(level);  // look on this level
    if (next == NULL || compare_(next->key, key) >= 0) {
      // The next node is not smaller than key (or there is none), so stop on
      // this level and go down one level, unless we are already at level 0.
      if (level == 0) {
        return x;
      } else {
        // Switch to next list
        level--;
      }
    } else {
      x = next;  // keep moving forward on this level
    }
  }
}

5 Find the last node (head_ is returned if the list is empty).

template<typename Key, class Comparator>
typename SkipList<Key,Comparator>::Node* SkipList<Key,Comparator>::FindLast()
    const {
  Node* x = head_;
  int level = GetMaxHeight() - 1;
  while (true) {
    Node* next = x->Next(level);
    if (next == NULL) {
      // The next node is NULL, so check the level: above 0 we continue one
      // level down, at 0 x is the last node.
      if (level == 0) {
        return x;
      } else {
        // Switch to next list
        level--;
      }
    } else {
      x = next;
    }
  }
}

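The private functions above are all searches, but they matter for insertion as well: because a skiplist is sorted, every insert first has to search for the right position.

The code may not be intuitive, so here is an example: searching for 12.

1 Start at the top level of the head node and check whether 12 is greater than the head node's next node on level 3 (6).

2 12 is greater than 6, so move to node 6, then check whether 12 is greater than node 6's next node on level 3 (NULL).

3 The comparison calls KeyIsAfterNode. Since keys grow to the right, it treats NULL as infinity, so this comparison returns false.

4 Since node 6's next on level 3 counts as greater than 12, drop down one level from node 6 and look at level 2.

5 Repeat the same steps on level 2 of node 6. Its next on level 2, which is 25, is greater than 12, so go down one more level.

6 Repeat the same steps on level 1 of node 6. Its next is node 9, which is smaller than 12, so move to node 9.

7 Node 9's next on level 1 (17) is greater than 12, so drop down to level 0.

8 On level 0, node 9's next is equal to 12, so the search is done.

The final search path is head --> 6 --> 9 --> 12, which is quite efficient.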
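
The walkthrough can be replayed in a small standalone program. The sketch below hard-codes a list containing only the nodes named in the steps (6, 9, 12, 17, 25). Their heights are assumptions chosen to match the walkthrough, and the original figure may contain more nodes. DemoNode, DemoFindGE and SearchDemo are illustrative names.

```cpp
#include <cassert>
#include <vector>

struct DemoNode {
  int key;
  std::vector<DemoNode*> next;  // next[i] is the successor on level i
  DemoNode(int k, int height) : key(k), next(height, nullptr) {}
};

// Same loop as FindGreaterOrEqual: returns the first node whose key is
// >= target (or nullptr) and records the key of every node it moves to.
inline DemoNode* DemoFindGE(DemoNode* head, int target, std::vector<int>* path) {
  DemoNode* x = head;
  int level = static_cast<int>(head->next.size()) - 1;
  while (true) {
    DemoNode* next = x->next[level];
    if (next != nullptr && next->key < target) {
      x = next;  // move right on this level
      path->push_back(x->key);
    } else if (level == 0) {
      return next;
    } else {
      level--;  // drop down one level
    }
  }
}

// Builds the example list, searches for target, and returns the visited keys.
inline std::vector<int> SearchDemo(int target, bool* found) {
  DemoNode head(0, 4), n6(6, 4), n9(9, 2), n12(12, 1), n17(17, 2), n25(25, 3);
  // Level 0 links every node.
  head.next[0] = &n6; n6.next[0] = &n9; n9.next[0] = &n12;
  n12.next[0] = &n17; n17.next[0] = &n25;
  // Level 1: 6 -> 9 -> 17 -> 25
  head.next[1] = &n6; n6.next[1] = &n9; n9.next[1] = &n17; n17.next[1] = &n25;
  // Level 2: 6 -> 25
  head.next[2] = &n6; n6.next[2] = &n25;
  // Level 3: 6 only
  head.next[3] = &n6;

  std::vector<int> path;
  DemoNode* hit = DemoFindGE(&head, target, &path);
  *found = (hit != nullptr && hit->key == target);
  return path;
}
```

Calling SearchDemo(12, &found) visits nodes 6 and 9 before landing on 12 at level 0, the same path as in the steps above.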

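The functions the skiplist exposes publicly are insert and lookup.

After the private functions above, the two functions below are straightforward.

1 Insert key into the skiplist.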

template<typename Key, class Comparator>
void SkipList<Key,Comparator>::Insert(const Key& key) {
  // TODO(opt): We can use a barrier-free variant of FindGreaterOrEqual()
  // here since Insert() is externally synchronized.
  Node* prev[kMaxHeight];
  Node* x = FindGreaterOrEqual(key, prev);

  // Our data structure does not allow duplicate insertion
  assert(x == NULL || !Equal(key, x->key)); // duplicate keys are not allowed

  // Pick the height of the new node at random
  int height = RandomHeight();
  if (height > GetMaxHeight()) {
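    // For the levels above the current max height, the predecessor is head_.
    // prev holds predecessors, not next pointers: head_'s next on those levels
    // is still the NULL set by the constructor, and the new node inherits it.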
    for (int i = GetMaxHeight(); i < height; i++) {
      prev[i] = head_;
    }
    //fprintf(stderr, "Change height from %d to %d\n", max_height_, height);

    // It is ok to mutate max_height_ without any synchronization
    // with concurrent readers. A concurrent reader that observes
    // the new value of max_height_ will see either the old value of
    // new level pointers from head_ (NULL), or a new value set in
    // the loop below. In the former case the reader will
    // immediately drop to the next level since NULL sorts after all
    // keys. In the latter case the reader will use the new node.
    // Update the max height of the skiplist
    max_height_.NoBarrier_Store(reinterpret_cast<void*>(height));
  }

  // Create the new node
  x = NewNode(key, height);
  for (int i = 0; i < height; i++) {
    // NoBarrier_SetNext() suffices since we will add a barrier when
    // we publish a pointer to "x" in prev[i].
    // prev holds the search path for key, so on every level the new node x
    // is spliced in between prev[i] and prev[i]'s next
    x->NoBarrier_SetNext(i, prev[i]->NoBarrier_Next(i));
    prev[i]->SetNext(i, x);
  }
}
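
To see the splice in action without the rest of leveldb, here is a compact single-threaded sketch of the same insert logic for int keys. It drops the atomics and the arena (nodes are owned by a std::vector of unique_ptr), and its RandomHeight assumes a branching factor of 4. IntSkipList and its members are illustrative names, not leveldb code.

```cpp
#include <cassert>
#include <memory>
#include <random>
#include <vector>

class IntSkipList {
 public:
  IntSkipList() : head_(NewNode(0, kMaxH)), max_height_(1), rng_(0xdeadbeef) {}

  void Insert(int key) {
    Node* prev[kMaxH];
    Node* x = FindGreaterOrEqual(key, prev);
    assert(x == nullptr || x->key != key);  // duplicates are not allowed
    int height = RandomHeight();
    if (height > max_height_) {
      // New levels have no nodes yet, so their predecessor is head_.
      for (int i = max_height_; i < height; i++) prev[i] = head_;
      max_height_ = height;
    }
    x = NewNode(key, height);
    for (int i = 0; i < height; i++) {
      x->next[i] = prev[i]->next[i];  // first point the new node forward
      prev[i]->next[i] = x;           // then link it in behind prev[i]
    }
  }

  bool Contains(int key) const {
    Node* x = FindGreaterOrEqual(key, nullptr);
    return x != nullptr && x->key == key;
  }

 private:
  enum { kMaxH = 12 };
  struct Node {
    int key;
    std::vector<Node*> next;
    Node(int k, int h) : key(k), next(h, nullptr) {}
  };

  Node* NewNode(int key, int height) {
    nodes_.push_back(std::unique_ptr<Node>(new Node(key, height)));
    return nodes_.back().get();
  }

  int RandomHeight() {
    int h = 1;
    while (h < kMaxH && rng_() % 4 == 0) h++;  // assumed branching factor 4
    return h;
  }

  Node* FindGreaterOrEqual(int key, Node** prev) const {
    Node* x = head_;
    int level = max_height_ - 1;
    while (true) {
      Node* next = x->next[level];
      if (next != nullptr && next->key < key) {
        x = next;
      } else {
        if (prev != nullptr) prev[level] = x;
        if (level == 0) return next;
        level--;
      }
    }
  }

  std::vector<std::unique_ptr<Node>> nodes_;  // declared first: NewNode uses it
  Node* head_;
  int max_height_;
  std::mt19937 rng_;
};
```

The order of the two assignments in the splice loop mirrors the original: the new node is fully linked forward before any predecessor points at it, which is what lets leveldb's readers run without locks once the second store is a release store.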