Roaring Bitmap 介绍
存储整数集(类似于存储整数的 Set),并支持整数集的集合运算(交并差等)。之所以在一些场景下使用 Roaring Bitmap,而不是 Set,是因为其节省存储空间以及集合运算效率更优。
使用示例
<dependencies>
<dependency>
<groupId>org.roaringbitmap</groupId>
<artifactId>RoaringBitmap</artifactId>
<version>0.9.0</version>
</dependency>
</dependencies>
package roaringbitmap;
import org.roaringbitmap.RoaringBitmap;
import org.roaringbitmap.longlong.Roaring64Bitmap;
import java.util.Iterator;
public class Test {
public static void main(String[] args) {
RoaringBitmap r1 = RoaringBitmap.bitmapOf(15, 0, Integer.MIN_VALUE, Integer.MAX_VALUE, -1, -8);
print(r1, "r1");
RoaringBitmap r2 = new RoaringBitmap();
r2.add(-1);
r2.add(0);
r2.add(1);
print(r2, "r2");
RoaringBitmap intersect = RoaringBitmap.and(r1, r2);
print(intersect, "r1-and-r2");
System.out.println("=================");
Roaring64Bitmap r64 = Roaring64Bitmap.bitmapOf(1, 2, Long.MAX_VALUE, Long.MIN_VALUE, -1, -3);
System.out.println("RBM[r64] cardinality: " + r64.getLongCardinality());
Iterator<Long> iterator = r64.iterator();
while (iterator.hasNext()) {
System.out.print(iterator.next() + "\t");
}
System.out.println();
}
public static void print(RoaringBitmap r, String name) {
System.out.println("RBM[" + name + "] cardinality: " + r.getLongCardinality());
for(int i: r) {
System.out.print(i + "\t");
}
System.out.println();
}
}
/*
RBM[r1] cardinality: 6
0 15 2147483647 -2147483648 -8 -1
RBM[r2] cardinality: 3
0 1 -1
RBM[r1-and-r2] cardinality: 2
0 -1
=================
RBM[r64] cardinality: 6
1 2 9223372036854775807 -9223372036854775808 -3 -1
*/
Bitmap/Bitset
问:1 亿个设备 id(imeimd5, 长度 32 字符串, 512 bit)需要多少存储空间?
答:1 亿 * 64 Byte = 64 亿 Byte = 6.4 G
但是,如果一个设备 id 使用 1 bit 存储,需要存储空间是 6.4G / 512 = 12.5 M
set = {1, 2, 15}
package roaringbitmap;
public class Test {
public static void main(String[] args) {
byte[] bits = new byte[2];
add(bits, 1);
add(bits, 2);
System.out.println(Integer.toBinaryString(bits[0])); // 110
}
public static void add(byte[] bits, int num){
byte pre = bits[num / 8];
byte newVal = (byte) (pre | (1 << (num % 8)));
bits[num / 8] = newVal;
}
}
缺点:稀疏数据存储依旧浪费空间,举一个极端例子,用位图只存储数字 0 和 1599(8*200 - 1),至少需要开辟长度为 200 的 byte 数组,而普通 set 存储 0 和 1599 两个 Integer 只需要 8 字节!
Roaring Bitmap 理论
We partition the range of 32-bit indexes ( [0,n) ) into chunks of 2^16 integers sharing the same 16 most significant digits. We use specialized containers to store their 16 least significant bits.
When a chunk contains no more than 4096 integers, we use a sorted array of packed 16-bit integers. When there are more than 4096 integers, we use a 2^16 -bit bitmap. Thus, we have two types of containers: an array container for sparse chunks and a bitmap container for dense chunks.
- 根据实际数据状况,桶动态生成(而不是一开始就建立所有桶,避免不必要的空间浪费)
- 根据实际数据状况,若桶中数据少(稀疏数据),使用 array 存;若桶中数据多,使用 bitmap 存
- 为了设计及实现简单,bitmap 大小固定为 2^16 位(8KB)。4096 个 Integer(低 16 位)整好占 8KB 存储,为了节省空间,桶中数据少于 4096 个数时,使用 array,否则使用 bitmap。
Our new addition, the run container, is made of a packed array of pairs of 16-bit integers. The first value of each pair represents a starting value, whereas the second value is the length of a run. For example, we would store the values 11,12,13,14,15 as the pair 11,4 where 4 means that beyond 11 itself,there are 4 contiguous values that follow.
Roaring Bitmap 源码
/**
* RoaringBitmap, a compressed alternative to the BitSet.
*
* <pre>
* {@code
* import org.roaringbitmap.*;
*
* //...
*
* RoaringBitmap rr = RoaringBitmap.bitmapOf(1,2,3,1000);
* RoaringBitmap rr2 = new RoaringBitmap();
* for(int k = 4000; k<4255;++k) rr2.add(k);
* RoaringBitmap rror = RoaringBitmap.or(rr, rr2);
*
* //...
* DataOutputStream wheretoserialize = ...
* rr.runOptimize(); // can help compression
* rr.serialize(wheretoserialize);
* }
* </pre>
*
* Integers are added in unsigned sorted order. That is, they are treated as unsigned integers (see
* Java 8's Integer.toUnsignedLong function).
* Up to 4294967296 integers
* can be stored.
*
*
*
*/
/**
* Roaring64Bitmap is a compressed 64 bit bitmap. It can contain all the numbers of long
* rang[Long.MAX_VALUE, Long.MIN_VALUE]. Since java has no unsigned long,we treat the negative value
* as a successor of the positive value. So the ascending ordering of all the long value is:
* 0,1,2...Long.MAX_VALUE,Long.MIN_VALUE,Long.MIN_VALUE+1.......-1. See Long.toUnsignedString()
*/
关键类图
部分源码
/**
RoaringBitmap 初始化
*/
RoaringBitmap r2 = new RoaringBitmap();
// step 1
public RoaringBitmap() {
highLowContainer = new RoaringArray();
}
// step 2
protected RoaringArray() {
this(INITIAL_CAPACITY); // INITIAL_CAPACITY = 4
}
// step 3
RoaringArray(int initialCapacity) {
this(new char[initialCapacity], new Container[initialCapacity], 0);
}
// step 4
RoaringArray(char[] keys, Container[] values, int size) {
this.keys = keys;
this.values = values;
this.size = size;
}
/**
org.roaringbitmap.RoaringBitmap#add(int)
*/
public void add(final int x) {
final char hb = Util.highbits(x);
final int i = highLowContainer.getIndex(hb);
if (i >= 0) {
highLowContainer.setContainerAtIndex(i,
highLowContainer.getContainerAtIndex(i).add(Util.lowbits(x)));
} else {
final ArrayContainer newac = new ArrayContainer();
highLowContainer.insertNewKeyValueAt(-i - 1, hb, newac.add(Util.lowbits(x)));
}
}
// (tips)获取 Integer 高 16 位
protected static char highbits(int x) {
return (char) (x >>> 16);
}
// (tips)获取 Integer 低 16 位
protected static char lowbits(int x) {
return (char) x;
}
// (tips)查找等于 k 的位置,或第一个大于 k 的位置
// starts with binary search and finishes with a sequential search
protected static int hybridUnsignedBinarySearch(final char[] array, final int begin,
final int end, final char k) {
// next line accelerates the possibly common case where the value would
// be inserted at the end
if ((end > 0) && ((array[end - 1]) < (int) k)) {
return -end - 1;
}
int low = begin;
int high = end - 1;
// 32 in the next line matches the size of a cache line
while (low + 32 <= high) {
final int middleIndex = (low + high) >>> 1;
final int middleValue = (array[middleIndex]);
if (middleValue < (int) k) {
low = middleIndex + 1;
} else if (middleValue > (int) k) {
high = middleIndex - 1;
} else {
return middleIndex;
}
}
// we finish the job with a sequential search
int x = low;
for (; x <= high; ++x) {
final int val = (array[x]);
if (val >= (int) k) {
if (val == (int) k) {
return x;
}
break;
}
}
return -(x + 1);
}
// 查看缓存行大小
// cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
/**
org.roaringbitmap.BitmapContainer
*/
@Override
public Container add(final char i) {
final long previous = bitmap[i >>> 6];
long newval = previous | (1L << i); // 1 << (i % 64)
bitmap[i >>> 6] = newval;
if (USE_BRANCHLESS) { // optimization flag: whether the cardinality of the bitmaps
// is maintained through branchless operations
cardinality += (int)((previous ^ newval) >>> i); // 无需判断 previous 是否等于 newval
// 避免 CPU 分支预测错误带来的性能开销
} else if (previous != newval) {
++cardinality;
}
return this;
}
@Override
public boolean contains(final char i) {
return (bitmap[i >>> 6] & (1L << i)) != 0;
}
@Override
public int getCardinality() {
return cardinality;
}
@Override
public Container remove(final char i) {
int index = i >>> 6;
long bef = bitmap[index];
long mask = 1L << i;
if (cardinality == ArrayContainer.DEFAULT_MAX_SIZE + 1) { // DEFAULT_MAX_SIZE = 4096
// this is
// the
// uncommon
// path
if ((bef & mask) != 0) {
--cardinality;
bitmap[i >>> 6] = bef & ~mask; // 存储数 i 对应位置成 0
return this.toArrayContainer();
}
}
long aft = bef & ~mask; // 存储数 i 的位由 1 置成 0
cardinality -= (aft - bef) >>> 63; // 存储数 i 的位若由 1 置成 0,aft - bef < 0,其最高位必为 1
bitmap[index] = aft;
return this;
}
/**
* Fill the array with set bits
*
* @param array container (should be sufficiently large)
*/
void fillArray(final char[] array) {
int pos = 0;
int base = 0;
for (int k = 0; k < bitmap.length; ++k) {
long bitset = bitmap[k];
while (bitset != 0) {
// Long.numberOfTrailingZeros
// Returns the number of zero bits following the lowest-order ("rightmost") one-bit
// in the two's complement binary representation of the specified long value.
// Returns 64 if the specified value has no one-bits in its two's complement representation,
// in other words if it is equal to zero.
array[pos++] = (char) (base + numberOfTrailingZeros(bitset));
bitset &= (bitset - 1); // 把最低 1 位置成 0
}
base += 64;
}
}
@Override
public Container and(final BitmapContainer value2) {
int newCardinality = 0;
for (int k = 0; k < this.bitmap.length; ++k) {
// [2] CPU 实现了快速计算一个处理字中位 1 个数的指令,Long.bitCount 底层基于此指令实现
newCardinality += Long.bitCount(this.bitmap[k] & value2.bitmap[k]);
}
if (newCardinality > ArrayContainer.DEFAULT_MAX_SIZE) {
final BitmapContainer answer = new BitmapContainer();
for (int k = 0; k < answer.bitmap.length; ++k) {
answer.bitmap[k] = this.bitmap[k] & value2.bitmap[k];
}
answer.cardinality = newCardinality;
return answer;
}
ArrayContainer ac = new ArrayContainer(newCardinality);
Util.fillArrayAND(ac.content, this.bitmap, value2.bitmap);
ac.cardinality = newCardinality;
return ac;
}
@Override
public ArrayContainer and(final ArrayContainer value2) {
final ArrayContainer answer = new ArrayContainer(value2.content.length);
int c = value2.cardinality;
for (int k = 0; k < c; ++k) {
char v = value2.content[k];
answer.content[answer.cardinality] = v;
answer.cardinality += (int)this.bitValue(v);
}
return answer;
}
int bitValue(final char i) {
return (int)(bitmap[i >>> 6] >>> i ) & 1;
}
/**
空间压缩(优化),比较当前容器占据空间和使用 RunContainer 占据空间的大小关系
*/
@Override
public Container runOptimize() {
// 计算一下,如果用 RunContainer 存储,需要多少轮(每轮需要 2 个 char,4 Bytes)
int numRuns = numberOfRunsLowerBound(MAXRUNS); // decent choice
int sizeAsRunContainerLowerBound = RunContainer.serializedSizeInBytes(numRuns);
if (sizeAsRunContainerLowerBound >= getArraySizeInBytes()) {
return this;
}
// else numRuns is a relatively tight bound that needs to be exact
// in some cases (or if we need to make the runContainer the right
// size)
numRuns += numberOfRunsAdjustment();
int sizeAsRunContainer = RunContainer.serializedSizeInBytes(numRuns);
if (getArraySizeInBytes() > sizeAsRunContainer) {
return new RunContainer(this, numRuns);
} else {
return this;
}
}
/**
org.roaringbitmap.ArrayContainer
*/
// ArrayContainer 动态扩容设计,DEFAULT_INIT_SIZE = 4
private void increaseCapacity(boolean allowIllegalSize) {
int newCapacity = (this.content.length == 0) ? DEFAULT_INIT_SIZE
: this.content.length < 64 ? this.content.length * 2
: this.content.length < 1067 ? this.content.length * 3 / 2
: this.content.length * 5 / 4;
// never allocate more than we will ever need
if (newCapacity > ArrayContainer.DEFAULT_MAX_SIZE && !allowIllegalSize) {
newCapacity = ArrayContainer.DEFAULT_MAX_SIZE;
}
// if we are within 1/16th of the max, go to max
if (newCapacity > ArrayContainer.DEFAULT_MAX_SIZE - ArrayContainer.DEFAULT_MAX_SIZE / 16
&& !allowIllegalSize) {
newCapacity = ArrayContainer.DEFAULT_MAX_SIZE;
}
this.content = Arrays.copyOf(this.content, newCapacity);
}
- 大量位运算技巧,加速计算
- 充分利用 CPU 硬件特性(高速缓存/避免分支预测/特殊硬件指令bitCount),加速计算
- ArrayContainer 合理扩容策略,充分节省存储
Roaring Bitmap 实践
在大数据标签平台中,用于标签圈选计算符合圈选条件的人群,一来加速计算,二来压缩存储,效果甚好
参考资料
[1] Roaring Bitmap 官网
[2]《Better bitmap performance with Roaring bitmaps》
[3]《Consistently faster and smaller compressed bitmaps with Roaring》
[4] RoaringBitmap 工程 Github 源码