Snowflake Algorithm
Basic Implementation of the Snowflake Algorithm
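Snowflake is an algorithm for generating unique IDs in a distributed system. Its core idea is to pack several fields into a single long value: a long occupies 8 bytes of 8 bits each, 64 bits in total, and those bits are divided as follows.
- Sign bit (1 bit): 0 means positive, 1 means negative. Generated IDs are normally positive, so this bit is 0 by default.
- Timestamp (41 bits): millisecond precision, enough for about 69 years, since (1L << 41) / (1000L * 60 * 60 * 24 * 365) ≈ 69. To make full use of those 69 years, do not store the current timestamp directly; store the difference (current timestamp - a fixed starting timestamp) instead, so that IDs start from a relatively small value.
- Worker machine ID (10 bits): also called the workId. It is flexible to configure and can be composed from something like a machine number plus a data center number. 10 bits allow the generator to be deployed on up to 1024 machines.
- Sequence number (12 bits): an auto-incrementing counter that supports 4096 values per millisecond on each machine.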
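The capacity figures in the list above can be checked with a few lines of arithmetic. The following is a minimal sketch; the class name `SnowflakeCapacity` and its helper methods are hypothetical names introduced only for this illustration.

```java
public class SnowflakeCapacity {
    static final int TIMESTAMP_BITS = 41;
    static final int NODE_BITS = 10;
    static final int SEQUENCE_BITS = 12;

    // Largest value that fits in the given number of bits.
    static long maxValue(int bits) {
        return ~(-1L << bits);
    }

    // Whole years covered by a 41-bit millisecond counter, using 365-day years.
    static long timestampYears() {
        long millisPerYear = 1000L * 60 * 60 * 24 * 365;
        return (1L << TIMESTAMP_BITS) / millisPerYear;
    }

    public static void main(String[] args) {
        System.out.println("years covered: " + timestampYears());                    // 69
        System.out.println("machines: " + (maxValue(NODE_BITS) + 1));                // 1024
        System.out.println("ids per ms per machine: " + (maxValue(SEQUENCE_BITS) + 1)); // 4096
        // 1 sign bit plus the three fields fills the 64-bit long exactly.
        System.out.println("total bits: " + (1 + TIMESTAMP_BITS + NODE_BITS + SEQUENCE_BITS)); // 64
    }
}
```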
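The layout is shown below (bit positions counted from the most significant bit):
 0  1           41 42      51 52          63
+--+--------------+----------+--------------+
|0 |  timestamp   |   node   |   sequence   |
+--+--------------+----------+--------------+
 1        41           10           12         (bits)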
Sample Java code:
public class SnowflakeIdGenerator {
    // Fixed starting timestamp (custom epoch), in milliseconds
    private final long twepoch = 1288834974657L;

    // Bit widths: the 10-bit machine ID is split into 5 + 5 bits
    private final long workerIdBits = 5L;
    private final long datacenterIdBits = 5L;
    private final long sequenceBits = 12L;

    // Maximum value of each part
    private final long maxWorkerId = -1L ^ (-1L << workerIdBits);
    private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);
    private final long sequenceMask = -1L ^ (-1L << sequenceBits);

    // Left-shift amounts; the parts are assembled with bitwise OR
    private final long workerIdShift = sequenceBits;
    private final long datacenterIdShift = sequenceBits + workerIdBits;
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;

    private final long workerId;
    private final long datacenterId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public SnowflakeIdGenerator(long workerId, long datacenterId) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException("worker Id can't be greater than " + maxWorkerId + " or less than 0");
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException("datacenter Id can't be greater than " + maxDatacenterId + " or less than 0");
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }

    public synchronized long nextId() {
        long timestamp = timeGen();
        // A clock that moved backwards could reproduce earlier IDs, so refuse to continue
        if (timestamp < lastTimestamp) {
            throw new RuntimeException("Clock moved backwards. Refusing to generate id for " + (lastTimestamp - timestamp) + " milliseconds");
        }
        if (lastTimestamp == timestamp) {
            // sequenceMask keeps only the supported bits; anything beyond them would wrap
            // around and could repeat. It matches sequenceBits, so only the low 12 bits
            // (4096 values) are kept.
            sequence = (sequence + 1) & sequenceMask;
            // The sequence for this millisecond is exhausted; wait for the next millisecond
            if (sequence == 0) {
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            // First ID in a new millisecond: restart the sequence
            sequence = 0L;
        }
        lastTimestamp = timestamp;
        return ((timestamp - twepoch) << timestampLeftShift) |
                (datacenterId << datacenterIdShift) |
                (workerId << workerIdShift) |
                sequence;
    }

    // Busy-waits until the clock advances past lastTimestamp
    protected long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    protected long timeGen() {
        return System.currentTimeMillis();
    }
}
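Because the ID is assembled purely with shifts and bitwise OR, the same shift amounts can be used to take it apart again. The sketch below assumes the constants of the listing above (custom epoch 1288834974657, 5 + 5 + 12 bits); the class `SnowflakeIdDecoder` and its method names are hypothetical and exist only for this illustration.

```java
public class SnowflakeIdDecoder {
    // Assumed to match the generator listing above.
    static final long TWEPOCH = 1288834974657L;
    static final long SEQUENCE_BITS = 12L;
    static final long WORKER_ID_BITS = 5L;
    static final long DATACENTER_ID_BITS = 5L;

    static final long WORKER_ID_SHIFT = SEQUENCE_BITS;                            // 12
    static final long DATACENTER_ID_SHIFT = SEQUENCE_BITS + WORKER_ID_BITS;       // 17
    static final long TIMESTAMP_SHIFT = DATACENTER_ID_SHIFT + DATACENTER_ID_BITS; // 22

    // Packs the four fields into one long, the same way nextId() does.
    static long compose(long timestamp, long datacenterId, long workerId, long sequence) {
        return ((timestamp - TWEPOCH) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_ID_SHIFT)
                | (workerId << WORKER_ID_SHIFT)
                | sequence;
    }

    // Each accessor shifts its field down to bit 0 and masks off the higher fields.
    static long timestampOf(long id) {
        return (id >> TIMESTAMP_SHIFT) + TWEPOCH;
    }

    static long datacenterIdOf(long id) {
        return (id >> DATACENTER_ID_SHIFT) & ~(-1L << DATACENTER_ID_BITS);
    }

    static long workerIdOf(long id) {
        return (id >> WORKER_ID_SHIFT) & ~(-1L << WORKER_ID_BITS);
    }

    static long sequenceOf(long id) {
        return id & ~(-1L << SEQUENCE_BITS);
    }

    public static void main(String[] args) {
        // Illustrative field values: 1000 ms after the epoch, data center 3, worker 7, sequence 42.
        long id = compose(TWEPOCH + 1000, 3, 7, 42);
        System.out.println("id = " + id); // 4194725930
        System.out.println("timestamp = " + timestampOf(id));
        System.out.println("datacenterId = " + datacenterIdOf(id)); // 3
        System.out.println("workerId = " + workerIdOf(id));         // 7
        System.out.println("sequence = " + sequenceOf(id));         // 42
    }
}
```

Decoding like this is mainly useful for debugging, for example to see which machine produced an ID and when.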
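Sharding Databases and Tables with the Snowflake Algorithm
To shard databases and tables with Snowflake IDs, the generated ID has to be parsed to decide which table in which database a record belongs to. A common approach splits the ID into three segments: the first serves as the database partition key, the second as the table partition key, and the third as the record's primary key.
For example, taking the low 32 bits of the ID, the upper 10 bits can serve as the database partition key, the middle 10 bits as the table partition key, and the low 12 bits as the record key. In the 64-bit layout described earlier, these segments coincide with the low 10 bits of the timestamp difference, the 10-bit machine ID, and the 12-bit sequence number. Splitting the ID this way spreads a large number of records across many tables, which is the purpose of sharding.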
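The layout is shown below (the second row holds example values; a 12-bit record_id can be at most 4095):
 0            9 10         19 20            31
+--------------+-------------+----------------+
| database_id  |  table_id   |   record_id    |
+--------------+-------------+----------------+
|      10      |     50      |      1234      |
+--------------+-------------+----------------+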
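Once the three segments are extracted, they still have to be mapped to a physical database and table. The sketch below assumes the segments occupy the low 32 bits of the ID as in the diagram, and maps each key to a shard with a simple modulo. The shard counts, the `db_`/`t_` naming, and the class `ShardRouter` are all hypothetical choices for illustration, not part of the scheme described here.

```java
public class ShardRouter {
    // Hypothetical cluster size, chosen only for this example.
    static final int DATABASE_COUNT = 4;
    static final int TABLES_PER_DATABASE = 8;

    // 10-bit segment above bit 22.
    static long databaseKey(long id) {
        return (id >> 22) & 1023;
    }

    // 10-bit segment above bit 12.
    static long tableKey(long id) {
        return (id >> 12) & 1023;
    }

    // Low 12 bits.
    static long recordKey(long id) {
        return id & 4095;
    }

    // Maps the keys onto the available shards; the naming scheme is an assumption.
    static String route(long id) {
        long db = databaseKey(id) % DATABASE_COUNT;
        long table = tableKey(id) % TABLES_PER_DATABASE;
        return "db_" + db + ".t_" + table;
    }

    public static void main(String[] args) {
        // Build an ID whose segments are 10, 50 and 1234 (illustrative values).
        long id = (10L << 22) | (50L << 12) | 1234L;
        System.out.println("database key: " + databaseKey(id)); // 10
        System.out.println("table key: " + tableKey(id));       // 50
        System.out.println("record key: " + recordKey(id));     // 1234
        System.out.println("route: " + route(id));              // db_2.t_2
    }
}
```

A modulo mapping is the simplest option; changing the shard counts later moves existing keys to different shards, so a real deployment would need to plan for that.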
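Take particular care when using this scheme: each segment can hold only as many values as its bit width allows (12 bits, for example, give 4096 sequence numbers per millisecond per machine). If more IDs are requested than a segment supports, IDs will collide, so allocate enough bits to each segment for the expected load. The implementation above guards the sequence against this by waiting for the next millisecond once all 4096 values have been used.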
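The collision risk comes from the mask: once a counter passes its maximum it wraps back to 0 and starts repeating values. The sketch below shows this for a 12-bit sequence that is advanced without waiting for the next millisecond; the class `SequenceOverflowDemo` and its methods are hypothetical names for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class SequenceOverflowDemo {
    static final long SEQUENCE_BITS = 12L;
    static final long SEQUENCE_MASK = ~(-1L << SEQUENCE_BITS); // 4095

    // Advances the sequence the same way the generator does.
    static long nextSequence(long current) {
        return (current + 1) & SEQUENCE_MASK;
    }

    // Counts the distinct sequence values produced if `requests` IDs were
    // issued within one millisecond without ever waiting for the next one.
    static int distinctSequences(int requests) {
        Set<Long> seen = new HashSet<>();
        long sequence = 0L;
        for (int i = 0; i < requests; i++) {
            seen.add(sequence);
            sequence = nextSequence(sequence);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(nextSequence(4095));      // 0: the counter wraps around
        System.out.println(distinctSequences(4096)); // 4096: every value is unique
        System.out.println(distinctSequences(5000)); // 4096: the other 904 are repeats
    }
}
```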
A basic Java implementation follows:
public class SnowflakeIdGenerator {
    private final long twepoch = 1288834974657L;
    private final long workerIdBits = 5L;
    private final long datacenterIdBits = 5L;
    private final long sequenceBits = 12L;
    private final long maxWorkerId = -1L ^ (-1L << workerIdBits);
    private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);
    private final long workerIdShift = sequenceBits;
    private final long datacenterIdShift = sequenceBits + workerIdBits;
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;
    private final long sequenceMask = -1L ^ (-1L << sequenceBits);
    private long workerId;
    private long datacenterId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public SnowflakeIdGenerator(long workerId, long datacenterId) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException("worker Id can't be greater than " + maxWorkerId + " or less than 0");
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException("datacenter Id can't be greater than " + maxDatacenterId + " or less than 0");
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }

    public synchronized long nextId() {
        long timestamp = timeGen();
        if (timestamp < lastTimestamp) {
            throw new RuntimeException("Clock moved backwards. Refusing to generate id for " + (lastTimestamp - timestamp) + " milliseconds");
        }
        if (lastTimestamp == timestamp) {
            sequence = (sequence + 1) & sequenceMask;
            if (sequence == 0) {
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = timestamp;
        return ((timestamp - twepoch) << timestampLeftShift) |
                (datacenterId << datacenterIdShift) |
                (workerId << workerIdShift) |
                sequence;
    }

    protected long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    protected long timeGen() {
        return System.currentTimeMillis();
    }

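    public static void main(String[] args) {
        SnowflakeIdGenerator idGenerator = new SnowflakeIdGenerator(1, 1);
        long id = idGenerator.nextId();
        // Database key: the 10 bits above bit 22 (low bits of the timestamp difference).
        // Without the mask, id >> 22 would yield the whole timestamp difference.
        long databaseId = (id >> 22) & 1023;
        // Table key: the 10-bit machine ID (datacenterId and workerId combined).
        long tableId = (id >> 12) & 1023;
        // Record key: the 12-bit sequence number.
        long recordId = id & 4095;
        System.out.println("ID: " + id);
        System.out.println("Database ID: " + databaseId);
        System.out.println("Table ID: " + tableId);
        System.out.println("Record ID: " + recordId);
    }
}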