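HAProxy's task scheduler is built on a data structure called the elastic binary tree (EBtree). Judging from the description in the source files, the author appears to have designed and named it himself.
This section introduces the basic concepts of the EBtree and some of the generic operations the author implemented.
Concepts
The source describes the EBtree as follows: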
[ebtree/ebtree.h]
/*
General idea:
-------------
In a radix binary tree, we may have up to 2N-1 nodes for N keys if all of
them are leaves. If we find a way to differentiate intermediate nodes (later
called "nodes") and final nodes (later called "leaves"), and we associate
them by two, it is possible to build sort of a self-contained radix tree with
intermediate nodes always present. It will not be as cheap as the ultree for
optimal cases as shown below, but the optimal case almost never happens :
Eg, to store 8, 10, 12, 13, 14 :
          ultree          this theorical tree

            8                   8
           / \                 / \
          10 12               10 12
            /  \                /  \
           13  14              12  14
                              / \
                             12 13
Note that on real-world tests (with a scheduler), is was verified that the
case with data on an intermediate node never happens. This is because the
data spectrum is too large for such coincidences to happen. It would require
for instance that a task has its expiration time at an exact second, with
other tasks sharing that second. This is too rare to try to optimize for it.
What is interesting is that the node will only be added above the leaf when
necessary, which implies that it will always remain somewhere above it. So
both the leaf and the node can share the exact value of the leaf, because
when going down the node, the bit mask will be applied to comparisons. So we
are tempted to have one single key shared between the node and the leaf.
The bit only serves the nodes, and the dups only serve the leaves. So we can
put a lot of information in common. This results in one single entity with
two branch pointers and two parent pointers, one for the node part, and one
for the leaf part :
              node's         leaf's
              parent         parent
                |              |
              [node]         [leaf]
               / \
           left   right
         branch   branch
The node may very well refer to its leaf counterpart in one of its branches,
indicating that its own leaf is just below it :
              node's
              parent
                |
              [node]
               / \
           left  [leaf]
         branch
Adding keys in such a tree simply consists in inserting nodes between
other nodes and/or leaves :
            [root]
              |
           [node2]
            /   \
      [leaf1]   [node3]
                  / \
           [leaf2]   [leaf3]
On this diagram, we notice that [node2] and [leaf2] have been pulled away
from each other due to the insertion of [node3], just as if there would be
an elastic between both parts. This elastic-like behaviour gave its name to
the tree : "Elastic Binary Tree", or "EBtree". The entity which associates a
node part and a leaf part will be called an "EB node".
We also notice on the diagram that there is a root entity required to attach
the tree. It only contains two branches and there is nothing above it. This
is an "EB root". Some will note that [leaf1] has no [node1]. One property of
the EBtree is that all nodes have their branches filled, and that if a node
has only one branch, it does not need to exist. Here, [leaf1] was added
below [root] and did not need any node.
An EB node contains :
- a pointer to the node's parent (node_p)
- a pointer to the leaf's parent (leaf_p)
- two branches pointing to lower nodes or leaves (branches)
- a bit position (bit)
- an optional key.
The key here is optional because it's used only during insertion, in order
to classify the nodes. Nothing else in the tree structure requires knowledge
of the key. This makes it possible to write type-agnostic primitives for
everything, and type-specific insertion primitives. This has led to consider
two types of EB nodes. The type-agnostic ones will serve as a header for the
other ones, and will simply be called "struct eb_node". The other ones will
have their type indicated in the structure name. Eg: "struct eb32_node" for
nodes carrying 32 bit keys.
We will also node that the two branches in a node serve exactly the same
purpose as an EB root. For this reason, a "struct eb_root" will be used as
well inside the struct eb_node. In order to ease pointer manipulation and
ROOT detection when walking upwards, all the pointers inside an eb_node will
point to the eb_root part of the referenced EB nodes, relying on the same
principle as the linked lists in Linux.
Another important point to note, is that when walking inside a tree, it is
very convenient to know where a node is attached in its parent, and what
type of branch it has below it (leaf or node). In order to simplify the
operations and to speed up the processing, it was decided in this specific
implementation to use the lowest bit from the pointer to designate the side
of the upper pointers (left/right) and the type of a branch (leaf/node).
This practise is not mandatory by design, but an implementation-specific
optimisation permitted on all platforms on which data must be aligned. All
known 32 bit platforms align their integers and pointers to 32 bits, leaving
the two lower bits unused. So, we say that the pointers are "tagged". And
since they designate pointers to root parts, we simply call them
"tagged root pointers", or "eb_troot" in the code.
Duplicate keys are stored in a special manner. When inserting a key, if
the same one is found, then an incremental binary tree is built at this
place from these keys. This ensures that no special case has to be written
to handle duplicates when walking through the tree or when deleting entries.
It also guarantees that duplicates will be walked in the exact same order
they were inserted. This is very important when trying to achieve fair
processing distribution for instance.
Algorithmic complexity can be derived from 3 variables :
- the number of possible different keys in the tree : P
- the number of entries in the tree : N
- the number of duplicates for one key : D
Note that this tree is deliberately NOT balanced. For this reason, the worst
case may happen with a small tree (eg: 32 distinct keys of one bit). BUT,
the operations required to manage such data are so much cheap that they make
it worth using it even under such conditions. For instance, a balanced tree
may require only 6 levels to store those 32 keys when this tree will
require 32. But if per-level operations are 5 times cheaper, it wins.
Minimal, Maximal and Average times are specified in number of operations.
Minimal is given for best condition, Maximal for worst condition, and the
average is reported for a tree containing random keys. An operation
generally consists in jumping from one node to the other.
Complexity :
- lookup : min=1, max=log(P), avg=log(N)
- insertion from root : min=1, max=log(P), avg=log(N)
- insertion of dups : min=1, max=log(D), avg=log(D)/2 after lookup
- deletion : min=1, max=1, avg=1
- prev/next : min=1, max=log(P), avg=2 :
N/2 nodes need 1 hop => 1*N/2
N/4 nodes need 2 hops => 2*N/4
N/8 nodes need 3 hops => 3*N/8
...
N/x nodes need log(x) hops => log2(x)*N/x
Total cost for all N nodes : sum[i=1..N](log2(i)*N/i) = N*sum[i=1..N](log2(i)/i)
Average cost across N nodes = total / N = sum[i=1..N](log2(i)/i) = 2
This design is currently limited to only two branches per node. Most of the
tree descent algorithm would be compatible with more branches (eg: 4, to cut
the height in half), but this would probably require more complex operations
and the deletion algorithm would be problematic.
Useful properties :
- a node is always added above the leaf it is tied to, and never can get
below nor in another branch. This implies that leaves directly attached
to the root do not use their node part, which is indicated by a NULL
value in node_p. This also enhances the cache efficiency when walking
down the tree, because when the leaf is reached, its node part will
already have been visited (unless it's the first leaf in the tree).
- pointers to lower nodes or leaves are stored in "branch" pointers. Only
the root node may have a NULL in either branch, it is not possible for
other branches. Since the nodes are attached to the left branch of the
root, it is not possible to see a NULL left branch when walking up a
tree. Thus, an empty tree is immediately identified by a NULL left
branch at the root. Conversely, the one and only way to identify the
root node is to check that it right branch is NULL. Note that the
NULL pointer may have a few low-order bits set.
- a node connected to its own leaf will have branch[0|1] pointing to
itself, and leaf_p pointing to itself.
- a node can never have node_p pointing to itself.
- a node is linked in a tree if and only if it has a non-null leaf_p.
- a node can never have both branches equal, except for the root which can
have them both NULL.
- deletion only applies to leaves. When a leaf is deleted, its parent must
be released too (unless it's the root), and its sibling must attach to
the grand-parent, replacing the parent. Also, when a leaf is deleted,
the node tied to this leaf will be removed and must be released too. If
this node is different from the leaf's parent, the freshly released
leaf's parent will be used to replace the node which must go. A released
node will never be used anymore, so there's no point in tracking it.
- the bit index in a node indicates the bit position in the key which is
represented by the branches. That means that a node with (bit == 0) is
just above two leaves. Negative bit values are used to build a duplicate
tree. The first node above two identical leaves gets (bit == -1). This
value logarithmically decreases as the duplicate tree grows. During
duplicate insertion, a node is inserted above the highest bit value (the
lowest absolute value) in the tree during the right-sided walk. If bit
-1 is not encountered (highest < -1), we insert above last leaf.
Otherwise, we insert above the node with the highest value which was not
equal to the one of its parent + 1.
- the "eb_next" primitive walks from left to right, which means from lower
to higher keys. It returns duplicates in the order they were inserted.
The "eb_first" primitive returns the left-most entry.
- the "eb_prev" primitive walks from right to left, which means from
higher to lower keys. It returns duplicates in the opposite order they
were inserted. The "eb_last" primitive returns the right-most entry.
- a tree which has 1 in the lower bit of its root's right branch is a
tree with unique nodes. This means that when a node is inserted with
a key which already exists will not be inserted, and the previous
entry will be returned.
*/
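The comment above is the author's detailed description of the EBtree. The key points of this implementation are summarized below.
1. An EBtree may consist of nothing but a root.
2. In a non-empty EBtree, the root's left branch is never NULL.
3. The root's right branch is always NULL, although that NULL pointer may carry status bits in its low-order bits.
4. Everything that is not a leaf is a link node (a "node"). Apart from the root, every node must be full (both its left and right branches are present); otherwise it has no reason to exist.
5. Apart from the root, the left and right branch pointers of a node are never equal.
6. Each EB node holds node_p (pointer to the parent of its node part), leaf_p (pointer to the parent of its leaf part), branches (a struct eb_root containing the two branch pointers b[2]), bit (which describes the range of keys below the node), and an optional key. On platforms that align pointers to at least 4 bytes, the two lowest bits of a pointer are always 0 and can be reused; this implementation uses the lowest one. In node_p and leaf_p the tag says whether the entity is the left or the right child of its parent; in the branches it says whether the child is a leaf or a node.
7. When an EB node is reached as a leaf, its parent pointer is leaf_p; when it is reached as a node, its parent pointer is node_p.
8. When a node is directly attached to its own leaf, one of its branches and its leaf_p both point to itself.
9. node_p never points to the EB node itself.
10. How does bit describe the range of keys below a node? It is the position of the key bit that the node's two branches discriminate on, that is, the highest bit at which the keys in its left and right subtrees differ. A node with bit == 0 sits just above two leaves, and the closer a node is to the root, the larger its bit.
11. A negative bit marks a node that belongs to a duplicate subtree. Whether duplicates are allowed at all is recorded as a tag in the root's right branch pointer.
12. A node directly above two leaves with identical keys has bit == -1. The value decreases (logarithmically) as the duplicate subtree grows.
13. When a duplicate is inserted, the code walks down the right side of the duplicate subtree. If no node with bit == -1 is met, meaning that every node on that path has bit < -1, the new node is inserted just above the right-most leaf. Otherwise it is inserted above the node with the largest bit that has a hole between itself and its parent (its bit is not equal to its parent's bit + 1). For example, if node a has bit -2 and its parent has bit -3, node b has bit -5 and its parent has bit -7, and every node with a bit larger than b's differs from its parent by exactly 1 (as a does), then the new node is inserted above b.
14. Deletion always targets a leaf. When a leaf is deleted, its parent node is released as well (unless it is the root) and the leaf's sibling takes the parent's place under the grandparent. The node part tied to the deleted leaf must also leave the tree; if it is not the parent itself, the freshly released parent replaces it. These rules follow from point 4.
The descent algorithm would in principle work with more than two branches per node, but the design, and therefore this code, is limited to exactly two.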
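Point 10 can be made concrete with a short sketch. The helper below, diverging_bit(), is a hypothetical name and not part of ebtree. It returns the index of the highest bit at which two 32-bit keys differ, which is the value a node placed above those two keys would carry in bit, assuming one bit is consumed per level (EB_NODE_BITS == 1).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of ebtree: index (0 = least significant)
 * of the highest bit at which keys a and b differ. Requires a != b.
 */
static int diverging_bit(uint32_t a, uint32_t b)
{
    uint32_t x = a ^ b;   /* set bits mark the positions where the keys differ */
    int bit = -1;

    assert(x != 0);       /* equal keys go to a duplicate subtree instead */
    while (x) {
        x >>= 1;
        bit++;
    }
    return bit;
}
```

With the keys used in the quoted comment, 12 (1100) and 13 (1101) differ only at bit 0, so the node above them has bit == 0, while 8 (1000) and 12 (1100) diverge at bit 2, so the node separating them sits higher in the tree.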
Data structures
Next, let's look at the corresponding data structures.
[ebtree/ebtree.h]
/* Number of bits per node, and number of leaves per node */
#define EB_NODE_BITS          1
#define EB_NODE_BRANCHES      (1 << EB_NODE_BITS)
#define EB_NODE_BRANCH_MASK   (EB_NODE_BRANCHES - 1)

/* Be careful not to tweak those values. The walking code is optimized for NULL
 * detection on the assumption that the following values are intact.
 */
#define EB_LEFT     0
#define EB_RGHT     1
#define EB_LEAF     0
#define EB_NODE     1

/* Tags to set in root->b[EB_RGHT] :
 * - EB_NORMAL is a normal tree which stores duplicate keys.
 * - EB_UNIQUE is a tree which stores unique keys.
 */
#define EB_NORMAL   0
#define EB_UNIQUE   1

/* This is the same as an eb_node pointer, except that the lower bit embeds
 * a tag. See eb_dotag()/eb_untag()/eb_gettag(). This tag has two meanings :
 *  - 0=left, 1=right to designate the parent's branch for leaf_p/node_p
 *  - 0=link, 1=leaf  to designate the branch's type for branch[]
 */
typedef void eb_troot_t;

/* The eb_root connects the node which contains it, to two nodes below it, one
 * of which may be the same node. At the top of the tree, we use an eb_root
 * too, which always has its right branch NULL (+/1 low-order bits).
 */
struct eb_root {
    eb_troot_t *b[EB_NODE_BRANCHES]; /* left and right branches */
};

/* The eb_node contains the two parts, one for the leaf, which always exists,
 * and one for the node, which remains unused in the very first node inserted
 * into the tree. This structure is 20 bytes per node on 32-bit machines. Do
 * not change the order, benchmarks have shown that it's optimal this way.
 */
struct eb_node {
    struct eb_root branches; /* branches, must be at the beginning */
    eb_troot_t *node_p;      /* link node's parent */
    eb_troot_t *leaf_p;      /* leaf node's parent */
    short int bit;           /* link's bit position. */
    short unsigned int pfx;  /* data prefix length, always related to leaf */
};
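eb_troot_t is an alias for void, and struct eb_root is a structure holding an array of two eb_troot_t * pointers. struct eb_node consists of branches, node_p, leaf_p, bit and pfx.
EB_LEAF and EB_NODE apply to the branches: the tag stored in a branch pointer tells the parent whether a leaf or a node sits on that side. The code relies on the #define values (EB_LEAF = 0, EB_NODE = 1); the "0=link, 1=leaf" wording in the comment above eb_troot_t states the opposite and appears to be a slip in the source comment.
EB_LEFT and EB_RGHT apply to node_p and leaf_p: they tell a child whether it hangs off the left or the right branch of its parent.
EB_NORMAL and EB_UNIQUE are stored in the root's right branch and indicate whether the tree is a normal one (duplicates allowed) or holds unique keys only.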
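The tagging scheme is easier to follow with a small experiment. The sketch below is not the ebtree code: tag_ptr(), tag_of(), strip_tag() and init_unique_root() are hypothetical names, and the helpers use uintptr_t with a bitwise OR instead of the pointer addition found in the source. The two are equivalent as long as the pointer is aligned so that its lowest bit is 0, which is the assumption the scheme rests on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EB_LEFT   0
#define EB_RGHT   1
#define EB_NORMAL 0
#define EB_UNIQUE 1

typedef void eb_troot_t;
struct eb_root { eb_troot_t *b[2]; };

/* Store a one-bit tag in the lowest bit of an aligned root pointer. */
static eb_troot_t *tag_ptr(struct eb_root *root, int tag)
{
    assert(((uintptr_t)root & 1) == 0);   /* relies on pointer alignment */
    return (eb_troot_t *)((uintptr_t)root | (uintptr_t)tag);
}

/* Read the tag back. */
static int tag_of(const eb_troot_t *t)
{
    return (int)((uintptr_t)t & 1);
}

/* Recover the plain pointer, whatever the tag was. */
static struct eb_root *strip_tag(const eb_troot_t *t)
{
    return (struct eb_root *)((uintptr_t)t & ~(uintptr_t)1);
}

/* Mark an empty root as holding unique keys: the right branch is still
 * a NULL pointer once the tag has been stripped.
 */
static void init_unique_root(struct eb_root *root)
{
    root->b[EB_LEFT] = NULL;
    root->b[EB_RGHT] = (eb_troot_t *)(uintptr_t)EB_UNIQUE;
}
```

This is why the source repeatedly compares eb_clrtag(...) with NULL rather than the raw pointer: the root's right branch may be "NULL plus a tag".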
Operations
[ebtree/ebtree.h]
static inline eb_troot_t *eb_dotag(const struct eb_root *root, const int tag)
{
    return (eb_troot_t *)((void *)root + tag);
}

/* Converts an eb_troot_t pointer pointer to its equivalent eb_root pointer,
 * for use with pointers from ->branch[], leaf_p or node_p. NULL is conserved
 * as long as the tree is not corrupted. To be used with EB_LEAF, EB_NODE,
 * EB_LEFT or EB_RGHT in <tag>.
 */
static inline struct eb_root *eb_untag(const eb_troot_t *troot, const int tag)
{
    return (struct eb_root *)((void *)troot - tag);
}

/* returns the tag associated with an eb_troot_t pointer */
static inline int eb_gettag(eb_troot_t *troot)
{
    return (unsigned long)troot & 1;
}

/* Converts a root pointer to its equivalent eb_troot_t pointer and clears the
 * tag, no matter what its value was.
 */
static inline struct eb_root *eb_clrtag(const eb_troot_t *troot)
{
    return (struct eb_root *)((unsigned long)troot & ~1UL);
}

/* Returns a pointer to the eb_node holding <root> */
static inline struct eb_node *eb_root_to_node(struct eb_root *root)
{
    return container_of(root, struct eb_node, branches);
}

/* Walks down starting at root pointer <start>, and always walking on side
 * <side>. It either returns the node hosting the first leaf on that side,
 * or NULL if no leaf is found. <start> may either be NULL or a branch pointer.
 * The pointer to the leaf (or NULL) is returned.
 */
static inline struct eb_node *eb_walk_down(eb_troot_t *start, unsigned int side)
{
    /* A NULL pointer on an empty tree root will be returned as-is */
    while (eb_gettag(start) == EB_NODE)
        start = (eb_untag(start, EB_NODE))->b[side];
    /* NULL is left untouched (root==eb_node, EB_LEAF==0) */
    return eb_root_to_node(eb_untag(start, EB_LEAF));
}
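eb_root_to_node relies on the container_of macro, which readers familiar with the Linux kernel will recognize: given a pointer to a member, it returns a pointer to the structure that contains it.
eb_walk_down starts from the branch pointer it is given and keeps descending on the requested side until it reaches a leaf, which it returns.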
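For readers who have not met container_of before, here is a minimal sketch of the idea. The macro below is a portable formulation based on offsetof (the kernel's version adds type checking), and struct task and task_of() are hypothetical: they stand for whatever caller-side structure embeds an EB node.

```c
#include <assert.h>
#include <stddef.h>

typedef void eb_troot_t;
struct eb_root { eb_troot_t *b[2]; };
struct eb_node {
    struct eb_root branches;
    eb_troot_t *node_p;
    eb_troot_t *leaf_p;
    short int bit;
    short unsigned int pfx;
};

/* Step back from a member to the structure that contains it. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical user structure embedding an EB node. */
struct task {
    int id;
    struct eb_node node;
};

/* Same job as eb_root_to_node(): from the branches to the eb_node. */
static struct eb_node *root_to_node(struct eb_root *root)
{
    return container_of(root, struct eb_node, branches);
}

/* From the eb_node to the caller's own structure. */
static struct task *task_of(struct eb_node *node)
{
    assert(node != NULL);
    return container_of(node, struct task, node);
}
```

Because branches is the first member of struct eb_node, root_to_node() amounts to a cast; that is also why the walking code can treat a NULL root pointer as a NULL node pointer.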
[ebtree/ebtree.h]
/* Return non-zero if the tree is empty, otherwise zero */
static inline int eb_is_empty(struct eb_root *root)
{
    return !root->b[EB_LEFT];
}
The tree is empty exactly when the root's left branch is NULL, so that is all this function checks.
[ebtree/ebtree.h]
/* Return the first leaf in the tree starting at <root>, or NULL if none */
static inline struct eb_node *eb_first(struct eb_root *root)
{
    return eb_walk_down(root->b[0], EB_LEFT);
}

/* Return the last leaf in the tree starting at <root>, or NULL if none */
static inline struct eb_node *eb_last(struct eb_root *root)
{
    return eb_walk_down(root->b[0], EB_RGHT);
}
These return the first (left-most) and the last (right-most) leaf of the tree.
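The following sketch exercises these three primitives on a tree wired by hand. It is a simplified restatement and not the ebtree code: the helper names (dotag, gettag, clrtag, walk_down, build_two, is_empty, first, last) are local to the example, keys are left out, and the tree is the one that would result from inserting a and then a larger b, assuming b's node part ends up above both leaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { EB_LEFT = 0, EB_RGHT = 1, EB_LEAF = 0, EB_NODE = 1 };

typedef void eb_troot_t;
struct eb_root { eb_troot_t *b[2]; };
struct eb_node {
    struct eb_root branches;    /* must stay first: a root pointer is a node pointer */
    eb_troot_t *node_p;
    eb_troot_t *leaf_p;
    short bit;
};

static eb_troot_t *dotag(struct eb_root *r, int tag)
{
    assert(((uintptr_t)r & 1) == 0);    /* tagging needs an aligned pointer */
    return (eb_troot_t *)((uintptr_t)r | (uintptr_t)tag);
}

static int gettag(const eb_troot_t *t)
{
    return (int)((uintptr_t)t & 1);
}

static struct eb_root *clrtag(const eb_troot_t *t)
{
    return (struct eb_root *)((uintptr_t)t & ~(uintptr_t)1);
}

/* Descend on <side> until a leaf (or NULL) is reached. */
static struct eb_node *walk_down(eb_troot_t *start, int side)
{
    while (gettag(start) == EB_NODE)
        start = clrtag(start)->b[side];
    return (struct eb_node *)clrtag(start);   /* NULL stays NULL */
}

/* Wire [root] -> [node b] -> { [leaf a], [leaf b] } by hand. */
static void build_two(struct eb_root *root, struct eb_node *a, struct eb_node *b)
{
    root->b[EB_LEFT] = dotag(&b->branches, EB_NODE);
    root->b[EB_RGHT] = NULL;
    b->node_p = dotag(root, EB_LEFT);
    b->branches.b[EB_LEFT] = dotag(&a->branches, EB_LEAF);
    b->branches.b[EB_RGHT] = dotag(&b->branches, EB_LEAF);
    a->leaf_p = dotag(&b->branches, EB_LEFT);
    a->node_p = NULL;                         /* a's node part is unused */
    b->leaf_p = dotag(&b->branches, EB_RGHT);
}

static int is_empty(const struct eb_root *root)
{
    return !root->b[EB_LEFT];
}

static struct eb_node *first(struct eb_root *root)
{
    return walk_down(root->b[EB_LEFT], EB_LEFT);
}

static struct eb_node *last(struct eb_root *root)
{
    return walk_down(root->b[EB_LEFT], EB_RGHT);
}
```

On an empty root both first() and last() return NULL without any special case, because a NULL branch carries tag 0 (EB_LEAF) and stops the loop immediately.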
[ebtree/ebtree.h]
/* Return previous leaf node before an existing leaf node, or NULL if none. */
static inline struct eb_node *eb_prev(struct eb_node *node)
{
    eb_troot_t *t = node->leaf_p;

    while (eb_gettag(t) == EB_LEFT) {
        /* Walking up from left branch. We must ensure that we never
         * walk beyond root.
         */
        if (unlikely(eb_clrtag((eb_untag(t, EB_LEFT))->b[EB_RGHT]) == NULL))
            return NULL;
        t = (eb_root_to_node(eb_untag(t, EB_LEFT)))->node_p;
    }
    /* Note that <t> cannot be NULL at this stage */
    t = (eb_untag(t, EB_RGHT))->b[EB_LEFT];
    return eb_walk_down(t, EB_RGHT);
}
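The function first locates the common ancestor of the current leaf and the previous leaf. The current leaf necessarily lies in that ancestor's right subtree and the previous leaf in its left subtree. Once the ancestor is found, the code takes its left branch and walks down to the right until it reaches a leaf, which is the answer. The while loop handles the case where the current position is a left child and climbs one level per iteration. When the current position is a right child, its parent is the common ancestor.
Because the starting point is a leaf, its link to the parent is leaf_p; further up, node_p is used. While climbing, the code has to check that it does not walk past the root.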
[ebtree/ebtree.h]
/* Return next leaf node after an existing leaf node, or NULL if none. */
static inline struct eb_node *eb_next(struct eb_node *node)
{
    eb_troot_t *t = node->leaf_p;

    while (eb_gettag(t) != EB_LEFT)
        /* Walking up from right branch, so we cannot be below root */
        t = (eb_root_to_node(eb_untag(t, EB_RGHT)))->node_p;

    /* Note that <t> cannot be NULL at this stage */
    t = (eb_untag(t, EB_LEFT))->b[EB_RGHT];
    if (eb_clrtag(t) == NULL)
        return NULL;
    return eb_walk_down(t, EB_LEFT);
}
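Finding the next leaf is the mirror image: the current leaf and the next leaf lie in the left and right subtrees of their common ancestor, respectively.
Unlike eb_prev, the climbing loop needs no root check. It only continues while the current position is a right child, and nothing is ever attached to the root's right branch, so the loop stops at the root at the latest, having arrived from its left branch. The test on the right branch that follows then finds NULL at the root and returns NULL.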
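To see both walks at work, the sketch below wires the three-leaf tree from the quoted diagram by hand ([node2] above [leaf1] and [node3], [node3] above [leaf2] and [leaf3]) and restates eb_next/eb_prev with local helpers. The names dotag, gettag, clrtag, walk_down, next, prev and build_three belong to this example only; e2 carries node2 and leaf2, e3 carries node3 and leaf3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { EB_LEFT = 0, EB_RGHT = 1, EB_LEAF = 0, EB_NODE = 1 };

typedef void eb_troot_t;
struct eb_root { eb_troot_t *b[2]; };
struct eb_node {
    struct eb_root branches;    /* must stay first: a root pointer is a node pointer */
    eb_troot_t *node_p;
    eb_troot_t *leaf_p;
    short bit;
};

static eb_troot_t *dotag(struct eb_root *r, int tag)
{
    assert(((uintptr_t)r & 1) == 0);    /* tagging needs an aligned pointer */
    return (eb_troot_t *)((uintptr_t)r | (uintptr_t)tag);
}

static int gettag(const eb_troot_t *t)
{
    return (int)((uintptr_t)t & 1);
}

static struct eb_root *clrtag(const eb_troot_t *t)
{
    return (struct eb_root *)((uintptr_t)t & ~(uintptr_t)1);
}

static struct eb_node *walk_down(eb_troot_t *start, int side)
{
    while (gettag(start) == EB_NODE)
        start = clrtag(start)->b[side];
    return (struct eb_node *)clrtag(start);
}

static struct eb_node *next(struct eb_node *node)
{
    eb_troot_t *t = node->leaf_p;

    while (gettag(t) != EB_LEFT)              /* climb while coming from the right */
        t = ((struct eb_node *)clrtag(t))->node_p;
    t = clrtag(t)->b[EB_RGHT];                /* right branch of the common ancestor */
    if (clrtag(t) == NULL)                    /* only the root has a NULL right branch */
        return NULL;
    return walk_down(t, EB_LEFT);
}

static struct eb_node *prev(struct eb_node *node)
{
    eb_troot_t *t = node->leaf_p;

    while (gettag(t) == EB_LEFT) {            /* climb while coming from the left */
        if (clrtag(clrtag(t)->b[EB_RGHT]) == NULL)
            return NULL;                      /* we are at the root */
        t = ((struct eb_node *)clrtag(t))->node_p;
    }
    t = clrtag(t)->b[EB_LEFT];                /* left branch of the common ancestor */
    return walk_down(t, EB_RGHT);
}

static void build_three(struct eb_root *root, struct eb_node *e1,
                        struct eb_node *e2, struct eb_node *e3)
{
    root->b[EB_LEFT] = dotag(&e2->branches, EB_NODE);
    root->b[EB_RGHT] = NULL;
    e2->node_p = dotag(root, EB_LEFT);
    e2->branches.b[EB_LEFT] = dotag(&e1->branches, EB_LEAF);
    e2->branches.b[EB_RGHT] = dotag(&e3->branches, EB_NODE);
    e1->leaf_p = dotag(&e2->branches, EB_LEFT);
    e1->node_p = NULL;                        /* leaf1 has no node part in use */
    e3->node_p = dotag(&e2->branches, EB_RGHT);
    e3->branches.b[EB_LEFT] = dotag(&e2->branches, EB_LEAF);
    e3->branches.b[EB_RGHT] = dotag(&e3->branches, EB_LEAF);
    e2->leaf_p = dotag(&e3->branches, EB_LEFT);
    e3->leaf_p = dotag(&e3->branches, EB_RGHT);
}
```

Starting from leaf3, next() climbs through node3 and node2 (both reached from the right), stops at the root, and returns NULL because the root's right branch is NULL. This is the case in which the loop needs no explicit root check.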
[ebtree/ebtree.h]
/* Return previous leaf node before an existing leaf node, skipping duplicates,
 * or NULL if none. */
static inline struct eb_node *eb_prev_unique(struct eb_node *node)
{
    eb_troot_t *t = node->leaf_p;

    while (1) {
        if (eb_gettag(t) != EB_LEFT) {
            node = eb_root_to_node(eb_untag(t, EB_RGHT));
            /* if we're right and not in duplicates, stop here */
            if (node->bit >= 0)
                break;
            t = node->node_p;
        }
        else {
            /* Walking up from left branch. We must ensure that we never
             * walk beyond root.
             */
            if (unlikely(eb_clrtag((eb_untag(t, EB_LEFT))->b[EB_RGHT]) == NULL))
                return NULL;
            t = (eb_root_to_node(eb_untag(t, EB_LEFT)))->node_p;
        }
    }
    /* Note that <t> cannot be NULL at this stage */
    t = (eb_untag(t, EB_RGHT))->b[EB_LEFT];
    return eb_walk_down(t, EB_RGHT);
}

/* Return next leaf node after an existing leaf node, skipping duplicates, or
 * NULL if none.
 */
static inline struct eb_node *eb_next_unique(struct eb_node *node)
{
    eb_troot_t *t = node->leaf_p;

    while (1) {
        if (eb_gettag(t) == EB_LEFT) {
            if (unlikely(eb_clrtag((eb_untag(t, EB_LEFT))->b[EB_RGHT]) == NULL))
                return NULL; /* we reached root */
            node = eb_root_to_node(eb_untag(t, EB_LEFT));
            /* if we're left and not in duplicates, stop here */
            if (node->bit >= 0)
                break;
            t = node->node_p;
        }
        else {
            /* Walking up from right branch, so we cannot be below root */
            t = (eb_root_to_node(eb_untag(t, EB_RGHT)))->node_p;
        }
    }
    /* Note that <t> cannot be NULL at this stage */
    t = (eb_untag(t, EB_LEFT))->b[EB_RGHT];
    if (eb_clrtag(t) == NULL)
        return NULL;
    return eb_walk_down(t, EB_LEFT);
}
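The unique variants differ from the plain ones in how far they climb. eb_prev and eb_next return the adjacent leaf, even if it holds the same key. The _unique versions keep climbing while they are inside a duplicate subtree (nodes with bit < 0) and only accept a common ancestor with bit >= 0, so they skip the remaining duplicates of the current key and return a leaf holding a different key, or NULL if there is none.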
[ebtree/ebtree.h]__eb_insert_dup()
/* This function is used to build a tree of duplicates by adding a new node to
 * a subtree of at least 2 entries. It will probably never be needed inlined,
 * and it is not for end-user.
 */
static forceinline struct eb_node *
__eb_insert_dup(struct eb_node *sub, struct eb_node *new)
{
    struct eb_node *head = sub;

    eb_troot_t *new_left = eb_dotag(&new->branches, EB_LEFT);
    eb_troot_t *new_rght = eb_dotag(&new->branches, EB_RGHT);
    eb_troot_t *new_leaf = eb_dotag(&new->branches, EB_LEAF);

    /* first, identify the deepest hole on the right branch */
    while (eb_gettag(head->branches.b[EB_RGHT]) != EB_LEAF) {
        struct eb_node *last = head;
        head = container_of(eb_untag(head->branches.b[EB_RGHT], EB_NODE),
                            struct eb_node, branches);
        if (head->bit > last->bit + 1)
            sub = head;     /* there's a hole here */
    }
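The function first prepares three tagged pointers to the new EB node, one for each role it may play (parent of a left child, parent of a right child, and leaf). It then follows the right branches down to a leaf, recording along the way the deepest node that has a hole between itself and its parent. In a duplicate subtree, the more duplicates a node has beneath it, the smaller its bit, so bit grows (towards -1) as the walk approaches the leaves.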
[ebtree/ebtree.h]__eb_insert_dup()
    /* Here we have a leaf attached to (head)->b[EB_RGHT] */
    if (head->bit < -1) {
        /* A hole exists just before the leaf, we insert there */
        new->bit = -1;
        sub = container_of(eb_untag(head->branches.b[EB_RGHT], EB_LEAF),
                           struct eb_node, branches);
        head->branches.b[EB_RGHT] = eb_dotag(&new->branches, EB_NODE);

        new->node_p = sub->leaf_p;
        new->leaf_p = new_rght;
        sub->leaf_p = new_left;
        new->branches.b[EB_LEFT] = eb_dotag(&sub->branches, EB_LEAF);
        new->branches.b[EB_RGHT] = new_leaf;
        return new;
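After the walk, the right branch of head is known to be a leaf. The code then tests head's bit: if it is -1, the new node goes into the hole found earlier; otherwise it goes just above that leaf. The fragment above is the latter case, where head's bit is not -1.
The new node's bit is set to -1 because it has exactly two children, both leaves with the same key. It follows that if head's bit is smaller than -2, the next duplicate of this key will be inserted above this new node. Note that, apart from node_p and the left branch, the new node's pointers (leaf_p and the right branch) refer to the new EB node itself.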
[ebtree/ebtree.h]__eb_insert_dup()
    } else {
        int side;
        /* No hole was found before a leaf. We have to insert above
         * <sub>. Note that we cannot be certain that <sub> is attached
         * to the right of its parent, as this is only true if <sub>
         * is inside the dup tree, not at the head.
         */
        new->bit = sub->bit - 1; /* install at the lowest level */
        side = eb_gettag(sub->node_p);
        head = container_of(eb_untag(sub->node_p, side), struct eb_node, branches);
        head->branches.b[side] = eb_dotag(&new->branches, EB_NODE);

        new->node_p = sub->node_p;
        new->leaf_p = new_rght;
        sub->node_p = new_left;
        new->branches.b[EB_LEFT] = eb_dotag(&sub->branches, EB_NODE);
        new->branches.b[EB_RGHT] = new_leaf;
        return new;
    }
}
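When head's bit is -1, the new node is inserted above <sub>, which is either the hole recorded during the walk or, if none was found, the top of the duplicate subtree. The steps otherwise mirror the previous case, except that <sub> is a node rather than a leaf (so its node_p is updated and the branch is tagged EB_NODE) and the new bit is sub->bit - 1.
One thing looks odd about duplicate insertion. It was said earlier that the more duplicates a node has beneath it, the smaller its bit. Yet after a duplicate is inserted, the nodes above the new one have gained a duplicate in their subtree while their bit values stay the same. Why? A plausible reading, consistent with the comment's remark that the value decreases logarithmically, is that bit in a duplicate subtree marks a level rather than a count: a new node either fills a free level below existing nodes (a hole, or the -1 slot above the last leaf) or is stacked on top with a smaller bit, so existing nodes never need renumbering.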
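The bit values assigned during duplicate insertion can be traced with a small experiment. The sketch below restates the logic of __eb_insert_dup() with local helpers (dotag, gettag, clrtag, insert_dup and build_dups are names of this example, not of ebtree). It wires the first two duplicates by hand, assuming the first pair sits under a node with bit -1 as the quoted comment describes, and then adds the remaining entries. Tracking the top of the duplicate subtree in build_dups() stands in for what the caller of the real function does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { EB_LEFT = 0, EB_RGHT = 1, EB_LEAF = 0, EB_NODE = 1 };

typedef void eb_troot_t;
struct eb_root { eb_troot_t *b[2]; };
struct eb_node {
    struct eb_root branches;    /* must stay first: a root pointer is a node pointer */
    eb_troot_t *node_p;
    eb_troot_t *leaf_p;
    short bit;
};

static eb_troot_t *dotag(struct eb_root *r, int tag)
{
    assert(((uintptr_t)r & 1) == 0);    /* tagging needs an aligned pointer */
    return (eb_troot_t *)((uintptr_t)r | (uintptr_t)tag);
}

static int gettag(const eb_troot_t *t)
{
    return (int)((uintptr_t)t & 1);
}

static struct eb_root *clrtag(const eb_troot_t *t)
{
    return (struct eb_root *)((uintptr_t)t & ~(uintptr_t)1);
}

/* Add <new> to the duplicate subtree whose top node is <sub>. */
static struct eb_node *insert_dup(struct eb_node *sub, struct eb_node *new)
{
    struct eb_node *head = sub;

    /* follow the right-most path, remembering the deepest hole */
    while (gettag(head->branches.b[EB_RGHT]) != EB_LEAF) {
        struct eb_node *last = head;
        head = (struct eb_node *)clrtag(head->branches.b[EB_RGHT]);
        if (head->bit > last->bit + 1)
            sub = head;
    }
    /* in both cases the new node's own leaf is its right child */
    new->leaf_p = dotag(&new->branches, EB_RGHT);
    new->branches.b[EB_RGHT] = dotag(&new->branches, EB_LEAF);

    if (head->bit < -1) {
        /* free level just above the last leaf */
        sub = (struct eb_node *)clrtag(head->branches.b[EB_RGHT]);
        new->bit = -1;
        head->branches.b[EB_RGHT] = dotag(&new->branches, EB_NODE);
        new->node_p = sub->leaf_p;
        sub->leaf_p = dotag(&new->branches, EB_LEFT);
        new->branches.b[EB_LEFT] = dotag(&sub->branches, EB_LEAF);
    } else {
        /* insert above <sub>: a hole, or the top of the subtree */
        int side = gettag(sub->node_p);
        new->bit = sub->bit - 1;
        clrtag(sub->node_p)->b[side] = dotag(&new->branches, EB_NODE);
        new->node_p = sub->node_p;
        sub->node_p = dotag(&new->branches, EB_LEFT);
        new->branches.b[EB_LEFT] = dotag(&sub->branches, EB_NODE);
    }
    return new;
}

/* Wire e[0] and e[1] as the first two duplicates, then add e[2..n-1].
 * Returns the top node of the duplicate subtree.
 */
static struct eb_node *build_dups(struct eb_root *root, struct eb_node *e, int n)
{
    struct eb_node *top = &e[1];
    int i;

    root->b[EB_LEFT] = dotag(&e[1].branches, EB_NODE);
    root->b[EB_RGHT] = NULL;
    e[1].bit = -1;
    e[1].node_p = dotag(root, EB_LEFT);
    e[1].branches.b[EB_LEFT] = dotag(&e[0].branches, EB_LEAF);
    e[1].branches.b[EB_RGHT] = dotag(&e[1].branches, EB_LEAF);
    e[0].leaf_p = dotag(&e[1].branches, EB_LEFT);
    e[0].node_p = NULL;
    e[1].leaf_p = dotag(&e[1].branches, EB_RGHT);

    for (i = 2; i < n; i++) {
        struct eb_node *added = insert_dup(top, &e[i]);
        if (added->bit < top->bit)
            top = added;        /* the new node was stacked above the old top */
    }
    return top;
}
```

With seven duplicates, the node parts of e[1] to e[6] receive the bit values -1, -2, -1, -3, -1, -2 in insertion order, and none of the earlier values is rewritten by a later insertion. This supports reading a negative bit as a level in the duplicate subtree, not as a count of the duplicates below it.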
[ebtree/ebtree.h]__eb_delete()
/* Removes a leaf node from the tree if it was still in it. Marks the node
 * as unlinked.
 */
static forceinline void __eb_delete(struct eb_node *node)
{
    __label__ delete_unlink;
    unsigned int pside, gpside, sibtype;
    struct eb_node *parent;
    struct eb_root *gparent;

    if (!node->leaf_p)
        return;

    /* we need the parent, our side, and the grand parent */
    pside = eb_gettag(node->leaf_p);
    parent = eb_root_to_node(eb_untag(node->leaf_p, pside));

    /* We likely have to release the parent link, unless it's the root,
     * in which case we only set our branch to NULL. Note that we can
     * only be attached to the root by its left branch.
     */
    if (eb_clrtag(parent->branches.b[EB_RGHT]) == NULL) {
        /* we're just below the root, it's trivial. */
        parent->branches.b[EB_LEFT] = NULL;
        goto delete_unlink;
    }
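As noted earlier, deletion only applies to leaves. The function first finds the leaf's parent. If the parent is the root, all it has to do is set the root's left branch to NULL. If the parent is not the root, the parent node is released together with the leaf.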
[ebtree/ebtree.h]__eb_delete()
    /* To release our parent, we have to identify our sibling, and reparent
     * it directly to/from the grand parent. Note that the sibling can
     * either be a link or a leaf.
     */
    gpside = eb_gettag(parent->node_p);
    gparent = eb_untag(parent->node_p, gpside);

    gparent->b[gpside] = parent->branches.b[!pside];
    sibtype = eb_gettag(gparent->b[gpside]);

    if (sibtype == EB_LEAF) {
        eb_root_to_node(eb_untag(gparent->b[gpside], EB_LEAF))->leaf_p =
            eb_dotag(gparent, gpside);
    } else {
        eb_root_to_node(eb_untag(gparent->b[gpside], EB_NODE))->node_p =
            eb_dotag(gparent, gpside);
    }
    /* Mark the parent unused. Note that we do not check if the parent is
     * our own node, but that's not a problem because if it is, it will be
     * marked unused at the same time, which we'll use below to know we can
     * safely remove it.
     */
    parent->node_p = NULL;
This part removes the parent from the tree and attaches the deleted leaf's sibling to the grandparent in the parent's place.
[ebtree/ebtree.h]__eb_delete()
    /* The parent node has been detached, and is currently unused. It may
     * belong to another node, so we cannot remove it that way. Also, our
     * own node part might still be used. so we can use this spare node
     * to replace ours if needed.
     */

    /* If our link part is unused, we can safely exit now */
    if (!node->node_p)
        goto delete_unlink;
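As described earlier, a leaf reaches its parent through leaf_p. The node part of the same EB node may nevertheless still be in use elsewhere in the tree. If node->node_p is NULL at this point (because the node part was never used, or because it was the parent that has just been detached), the deletion is complete.
When a new EB node is inserted above an existing entity a, the new leaf's leaf_p points to its own node part and its node_p takes over a's former leaf_p or node_p. This is how the node part of a leaf comes to be in use.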
[ebtree/ebtree.h]__eb_delete()
    /* From now on, <node> and <parent> are necessarily different, and the
     * <node>'s node part is in use. By definition, <parent> is at least
     * below <node>, so keeping its key for the bit string is OK.
     */
    parent->node_p = node->node_p;
    parent->branches = node->branches;
    parent->bit = node->bit;

    /* We must now update the new node's parent... */
    gpside = eb_gettag(parent->node_p);
    gparent = eb_untag(parent->node_p, gpside);
    gparent->b[gpside] = eb_dotag(&parent->branches, EB_NODE);
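If the node part of the deleted EB node is still in use, it cannot simply vanish from the tree. The parent released in the previous step is spare, so it takes over: node_p, branches and bit are copied into it, and the branch of the node above is redirected to it.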
[ebtree/ebtree.h]__eb_delete()
    /* ... and its branches */
    for (pside = 0; pside <= 1; pside++) {
        if (eb_gettag(parent->branches.b[pside]) == EB_NODE) {
            eb_root_to_node(eb_untag(parent->branches.b[pside], EB_NODE))->node_p =
                eb_dotag(&parent->branches, pside);
        } else {
            eb_root_to_node(eb_untag(parent->branches.b[pside], EB_LEAF))->leaf_p =
                eb_dotag(&parent->branches, pside);
        }
    }
 delete_unlink:
    /* Now the node has been completely unlinked */
    node->leaf_p = NULL;
    return; /* tree is not empty yet */
}
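Finally, the parent pointers of the two children inherited from the replaced node part are updated to point to the replacement (node_p for a child that is a node, leaf_p for a child that is a leaf), and the deleted leaf is marked as unlinked by clearing its leaf_p.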
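One more point. When node->node_p is NULL, the function simply sets node->leaf_p to NULL and returns. At first sight this looks like a memory leak: unless the parent is the deleted EB node itself (which does happen), the released parent seems to be dropped without being tracked. Is anything actually leaked?
To answer that, consider when node->node_p can be NULL at that point. There are two cases. In the first, the node part was never used: as the author's description says, a leaf attached directly to the root has no node above it, and it keeps node_p == NULL even after later insertions place other nodes above it ([leaf1] in the diagram). In the second, the node part was the leaf's own parent, which the preceding step has just detached and marked unused with parent->node_p = NULL. In both cases the parent has already been unhooked and the sibling reattached, so nothing remains to be done.
The remaining situation is the elastic one that gives the tree its name: [node2] and [leaf2] are the same EB node but are no longer adjacent. Deleting leaf2 proceeds as follows. Its parent is node3, so the sibling leaf3 is attached to node2's right branch (leaf3->leaf_p = node2) and node3's node_p is cleared. The node part of the deleted EB node (node2) is still in use, so the spare node3 takes its place: it copies node2's node_p, branches and bit, the root's left branch is pointed at it, and the children are re-parented (leaf1->leaf_p = node3, leaf3->leaf_p = node3). Since node3 and leaf3 are the same EB node, the result is an EB node whose node part hangs below the root and whose leaf part sits directly under its own node part.
In every case a released node part is either left unused or reused in place of the deleted one. Moreover, ebtree never allocates EB nodes itself: each eb_node is embedded in a structure owned by the caller, and __eb_delete() only unlinks it. The leak suspected above therefore cannot occur.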
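The elastic case can be checked mechanically. The sketch below restates __eb_delete() with local helpers and runs it on the three-leaf tree of the quoted diagram, wired by hand. All names (dotag, gettag, clrtag, delete_leaf, build_three) belong to this example, and the bit values are illustrative, chosen as if the keys were 8, 12 and 13.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { EB_LEFT = 0, EB_RGHT = 1, EB_LEAF = 0, EB_NODE = 1 };

typedef void eb_troot_t;
struct eb_root { eb_troot_t *b[2]; };
struct eb_node {
    struct eb_root branches;    /* must stay first: a root pointer is a node pointer */
    eb_troot_t *node_p;
    eb_troot_t *leaf_p;
    short bit;
};

static eb_troot_t *dotag(struct eb_root *r, int tag)
{
    assert(((uintptr_t)r & 1) == 0);    /* tagging needs an aligned pointer */
    return (eb_troot_t *)((uintptr_t)r | (uintptr_t)tag);
}

static int gettag(const eb_troot_t *t)
{
    return (int)((uintptr_t)t & 1);
}

static struct eb_root *clrtag(const eb_troot_t *t)
{
    return (struct eb_root *)((uintptr_t)t & ~(uintptr_t)1);
}

static void delete_leaf(struct eb_node *node)
{
    struct eb_root *up, *gparent;
    struct eb_node *parent, *child;
    eb_troot_t *sib;
    int pside, gpside;

    if (!node->leaf_p)
        return;                                  /* not linked */
    pside = gettag(node->leaf_p);
    up = clrtag(node->leaf_p);
    if (clrtag(up->b[EB_RGHT]) == NULL) {        /* parent is the root */
        up->b[EB_LEFT] = NULL;
        node->leaf_p = NULL;
        return;
    }
    parent = (struct eb_node *)up;
    gpside = gettag(parent->node_p);
    gparent = clrtag(parent->node_p);
    sib = gparent->b[gpside] = up->b[!pside];    /* sibling replaces parent */
    child = (struct eb_node *)clrtag(sib);
    if (gettag(sib) == EB_LEAF)
        child->leaf_p = dotag(gparent, gpside);
    else
        child->node_p = dotag(gparent, gpside);
    parent->node_p = NULL;                       /* parent's node part is spare */

    if (node->node_p) {                          /* node part still in the tree */
        parent->node_p = node->node_p;
        parent->branches = node->branches;
        parent->bit = node->bit;
        gpside = gettag(parent->node_p);
        clrtag(parent->node_p)->b[gpside] = dotag(&parent->branches, EB_NODE);
        for (pside = 0; pside <= 1; pside++) {
            sib = parent->branches.b[pside];
            child = (struct eb_node *)clrtag(sib);
            if (gettag(sib) == EB_NODE)
                child->node_p = dotag(&parent->branches, pside);
            else
                child->leaf_p = dotag(&parent->branches, pside);
        }
    }
    node->leaf_p = NULL;                         /* fully unlinked */
}

/* e2 carries node2 + leaf2, e3 carries node3 + leaf3, e1 is leaf1 only. */
static void build_three(struct eb_root *root, struct eb_node *e1,
                        struct eb_node *e2, struct eb_node *e3)
{
    root->b[EB_LEFT] = dotag(&e2->branches, EB_NODE);
    root->b[EB_RGHT] = NULL;
    e2->node_p = dotag(root, EB_LEFT);
    e2->bit = 2;
    e2->branches.b[EB_LEFT] = dotag(&e1->branches, EB_LEAF);
    e2->branches.b[EB_RGHT] = dotag(&e3->branches, EB_NODE);
    e1->leaf_p = dotag(&e2->branches, EB_LEFT);
    e1->node_p = NULL;
    e1->bit = 0;
    e3->node_p = dotag(&e2->branches, EB_RGHT);
    e3->bit = 0;
    e3->branches.b[EB_LEFT] = dotag(&e2->branches, EB_LEAF);
    e3->branches.b[EB_RGHT] = dotag(&e3->branches, EB_LEAF);
    e2->leaf_p = dotag(&e3->branches, EB_LEFT);
    e3->leaf_p = dotag(&e3->branches, EB_RGHT);
}
```

After delete_leaf(&e2), the node part of e3 sits directly under the root with leaf1 on its left and its own leaf on its right, and e2 is unlinked. No EB node is lost: the only storage involved is the three structures supplied by the caller.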