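1 Overview
In today's internet industry, data volumes keep growing and concurrency requirements keep rising. When a single table can no longer carry the load, the database has to be sharded across multiple databases and tables, and that in turn requires globally unique IDs. There are many ways to generate them; well-known options include UUID, the Redis INCR command, and Twitter's snowflake algorithm.
This article focuses on the snowflake algorithm, includes a source implementation, and adds an improvement for the clock rollback problem.
2 Algorithm
2.1 Principle
Snowflake is a distributed ID generation algorithm open-sourced by Twitter. The name is a playful one: it is said that no two snowflakes in nature are exactly alike, and the ids this algorithm generates are meant to be just as unique.
An ID is a single long value made up of a timestamp, a machine id, a data center id, and an auto-incrementing sequence. In the implementation below the layout is: 1 bit (0) + 41 bits (timestamp) + 5 bits (workerId) + 5 bits (dataCenterId) + 12 bits (sequence) = 64 bits.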

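Notes:
(1) 1 bit fixed at 0: the sign bit, which keeps the id a positive number.
(2) 41-bit timestamp in milliseconds: the range is 2^41 ms, which works out to roughly 69 years, counted from the chosen start time.
(3) Machine id and data center id, 10 bits in total: in theory up to 1024 service nodes can be deployed; these bits keep different machines from generating the same id.
(4) 12-bit sequence number: one service node can generate 2^12 = 4096 distinct ids within the same millisecond.
The length of each segment can be adjusted to suit the actual business scenario.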
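The figures in the notes above follow directly from the bit widths. The following is a minimal sketch of that arithmetic, assuming a 365-day year; the class and method names are illustrative and do not appear in the article's source.

```java
// Illustrative sketch: class and method names are not from the article's source.
final class SnowflakeCapacity {
    // Assumption: a year is counted as 365 days.
    private static final long MILLIS_PER_YEAR = 365L * 24 * 60 * 60 * 1000;

    // Whole years covered by a millisecond counter with the given number of bits.
    static long yearsCovered(int timestampBits) {
        return (1L << timestampBits) / MILLIS_PER_YEAR;
    }

    // Number of distinct values a field with the given number of bits can hold.
    static long capacity(int bits) {
        return 1L << bits;
    }

    public static void main(String[] args) {
        System.out.println("41-bit timestamp, years: " + yearsCovered(41));
        System.out.println("10-bit node id, nodes:   " + capacity(10));
        System.out.println("12-bit sequence, ids/ms: " + capacity(12));
        // Example of tuning the split: move one bit from the node id to the timestamp.
        System.out.println("42-bit timestamp, years: " + yearsCovered(42));
        System.out.println("9-bit node id, nodes:    " + capacity(9));
    }
}
```

With the default widths this reports 69 years, 1024 nodes, and 4096 ids per millisecond. Moving one bit from the node id to the timestamp gives 139 years at the cost of halving the node count to 512.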
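2.2 Characteristics
Advantages:
(1) The algorithm is simple. Creating an id needs no third-party resources; it is generated entirely in memory.
(2) High performance: a single service node can generate several million ids per second (at most 4096 per millisecond with a 12-bit sequence).
(3) Ids increase over time on a given node, and this ordering lets the database maintain its indexes efficiently.
Disadvantages:
(1) If the system clock is set back, id collisions may occur.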
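Both the ordering property and the rollback risk come from the timestamp occupying the highest bits. Here is a small sketch, assuming the 41/5/5/12 layout; compose() is a hypothetical helper written for this illustration, not part of the article's class.

```java
// Illustrative sketch: compose() mirrors the 41/5/5/12 bit layout; the names are hypothetical.
final class SnowflakeOrderingDemo {
    // timestamp | 5-bit workerId | 5-bit dataCenterId | 12-bit sequence
    static long compose(long timestamp, long workerId, long dataCenterId, long sequence) {
        return timestamp << 22 | workerId << 17 | dataCenterId << 12 | sequence;
    }

    public static void main(String[] args) {
        // On one node, a later millisecond always yields a larger id,
        // even when the earlier millisecond used its highest sequence number.
        long lastOfMs1000 = compose(1000, 3, 1, 4095);
        long firstOfMs1001 = compose(1001, 3, 1, 0);
        System.out.println(lastOfMs1000 < firstOfMs1001);

        // If the clock is set back from 1001 to 1000 and the generator does not notice,
        // the same (timestamp, node, sequence) triple is used again: a duplicate id.
        long issuedBeforeRollback = compose(1000, 3, 1, 0);
        long issuedAfterRollback = compose(1000, 3, 1, 0);
        System.out.println(issuedBeforeRollback == issuedAfterRollback);
    }
}
```

Both lines print true. Note that within the same millisecond, ids from different nodes are ordered by the node bits rather than by generation time, so the ordering is only approximate across a cluster.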
2.3 Source code
/*
 * Distributed id: snowflake algorithm
 *
 * Original format: 1 bit (0) + 41 bits (timestamp) + 5 bits (workerId) + 5 bits (dataCenterId) + 12 bits (sequence) = 64 bits (long)
 *
 * coded by mutouren on 2023-03-21
 */
import java.text.SimpleDateFormat;

public class SnowflakeIdWorker {
    private static final int TIMESTAMP_BITS = 41;
    private static final int WORKER_ID_BITS = 5;
    private static final int DATA_CENTER_ID_BITS = 5;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER_ID = (1L << WORKER_ID_BITS) - 1;
    private static final long MAX_DATA_CENTER_ID = (1L << DATA_CENTER_ID_BITS) - 1;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;
    // start at 2023-01-01 00:00:00 (UTC+8)
    private static final long START_TIMESTAMP = 1672502400000L;

    private final long workerId;
    private final long dataCenterId;
    private long lastTimestamp = 0L;
    private long lastSequence = 0L;

    public SnowflakeIdWorker(long workerId, long dataCenterId) {
        // check parameters
        checkParameters(workerId, dataCenterId);
        this.workerId = workerId;
        this.dataCenterId = dataCenterId;
    }

    public synchronized long newId() {
        // timestamp: milliseconds elapsed since START_TIMESTAMP
        long curTimestamp = this.getCurrentTimestamp();
        // sequence: restart at 0 on a new millisecond, otherwise increment
        long curSequence = lastSequence + 1;
        if (curTimestamp > this.lastTimestamp) {
            curSequence = 0;
        }
        // sequence exhausted for this millisecond: wait until the clock has advanced,
        // then continue with the next millisecond
        if (curSequence > MAX_SEQUENCE) {
            waitForNextTimestamp();
            curTimestamp++;
            curSequence = 0;
        }
        // remember the state for the next call
        this.lastTimestamp = curTimestamp;
        this.lastSequence = curSequence;
        // build id
        return curTimestamp << (WORKER_ID_BITS + DATA_CENTER_ID_BITS + SEQUENCE_BITS)
                | workerId << (DATA_CENTER_ID_BITS + SEQUENCE_BITS)
                | dataCenterId << SEQUENCE_BITS
                | curSequence;
    }

    // Returns the current offset from START_TIMESTAMP; throws if the clock moved backwards.
    private long getCurrentTimestamp() {
        long curTimestamp = System.currentTimeMillis() - START_TIMESTAMP;
        if (curTimestamp < this.lastTimestamp) {
            throw new RuntimeException(String.format("Clock moved backwards. Refusing to generate id, curTimestamp:%s, lastTimestamp:%s",
                    curTimestamp, this.lastTimestamp));
        }
        return curTimestamp;
    }

    // Spins until the clock has left the millisecond recorded in lastTimestamp.
    private void waitForNextTimestamp() {
        long timestamp = this.getCurrentTimestamp();
        while (timestamp == this.lastTimestamp) {
            timestamp = this.getCurrentTimestamp();
        }
    }

    private void checkParameters(long workerId, long dataCenterId) {
        int totalBits = 1 + TIMESTAMP_BITS + WORKER_ID_BITS + DATA_CENTER_ID_BITS + SEQUENCE_BITS;
        if (totalBits != 64) {
            throw new IllegalArgumentException("static parameters error: SnowflakeId must be 64 bits");
        }
        if (workerId > MAX_WORKER_ID || workerId < 0) {
            throw new IllegalArgumentException("workerId can't be greater than MAX_WORKER_ID or less than 0");
        }
        if (dataCenterId > MAX_DATA_CENTER_ID || dataCenterId < 0) {
            throw new IllegalArgumentException("dataCenterId can't be greater than MAX_DATA_CENTER_ID or less than 0");
        }
    }

    public static void main(String[] args) throws Exception {
        // Shows how START_TIMESTAMP was derived. The result depends on the JVM's default
        // time zone; the constant above corresponds to UTC+8.
        long startTimestamp = new SimpleDateFormat("yyyy-MM-dd").parse("2023-01-01").getTime();
        System.out.println("START_TIMESTAMP: " + startTimestamp);
        SnowflakeIdWorker idWorker = new SnowflakeIdWorker(31, 1);
        for (int i = 0; i < 10; i++) {
            long id = idWorker.newId();
            // index, wall-clock millis, id, id in binary, length of the binary string
            System.out.println(String.format("%05d, %s, %s, %s, %s", i, System.currentTimeMillis(), id, Long.toString(id, 2), Long.toString(id, 2).length()));
        }
    }
}
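When debugging it helps to split an id back into its fields. The following sketch reverses the shifts used in newId(), assuming the same 41/5/5/12 layout and START_TIMESTAMP as the listing above; SnowflakeIdDecoder and its methods are hypothetical names introduced for this example.

```java
import java.time.Instant;

// Illustrative sketch: a decoder for the 41/5/5/12 layout used by SnowflakeIdWorker.
// The class and method names are hypothetical; the constants repeat those of the listing.
final class SnowflakeIdDecoder {
    private static final int WORKER_ID_BITS = 5;
    private static final int DATA_CENTER_ID_BITS = 5;
    private static final int SEQUENCE_BITS = 12;
    private static final long START_TIMESTAMP = 1672502400000L;

    // Milliseconds since the Unix epoch at which the id was generated.
    static long epochMillisOf(long id) {
        return (id >>> (WORKER_ID_BITS + DATA_CENTER_ID_BITS + SEQUENCE_BITS)) + START_TIMESTAMP;
    }

    static long workerIdOf(long id) {
        return (id >>> (DATA_CENTER_ID_BITS + SEQUENCE_BITS)) & ((1L << WORKER_ID_BITS) - 1);
    }

    static long dataCenterIdOf(long id) {
        return (id >>> SEQUENCE_BITS) & ((1L << DATA_CENTER_ID_BITS) - 1);
    }

    static long sequenceOf(long id) {
        return id & ((1L << SEQUENCE_BITS) - 1);
    }

    public static void main(String[] args) {
        // Hand-built id: offset 123456 ms, workerId 31, dataCenterId 1, sequence 7.
        long id = 123456L << 22 | 31L << 17 | 1L << 12 | 7L;
        System.out.println("generated at: " + Instant.ofEpochMilli(epochMillisOf(id)));
        System.out.println("workerId:     " + workerIdOf(id));
        System.out.println("dataCenterId: " + dataCenterIdOf(id));
        System.out.println("sequence:     " + sequenceOf(id));
    }
}
```

If the bit widths are tuned as suggested in section 2.1, the decoder constants have to change with them, so in practice both would share one set of constants.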
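3 Improvement (clock rollback)
The source above simply throws an exception when the clock moves backwards. An alternative is to handle it with an extra flag bit, as follows:
Format v2: 1 bit (0) + 41 bits (timestamp) + 5 bits (workerId) + 4 bits (dataCenterId) + 1 bit (backwardFlag) + 12 bits (sequence) = 64 bits.

Notes:
One bit is split off from the 5-bit dataCenterId to serve as backwardFlag (the rollback flag). It is 0 in normal operation. When the clock moves backwards, the flag is set to 1 and the timestamp from before the rollback is backed up, so that the saved value can be used to tell whether the clock has caught up to where it was. Once the clock has passed that point, backwardFlag is reset to 0. Ids generated during the replayed interval differ from the earlier ones in the flag bit, which avoids duplicate ids during a rollback.
Source code: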
/*
 * Distributed id: snowflake algorithm
 *
 * Format v2: 1 bit (0) + 41 bits (timestamp) + 5 bits (workerId) + 4 bits (dataCenterId) + 1 bit (backwardFlag) + 12 bits (sequence) = 64 bits (long)
 *
 * Problem addressed by v2:
 * The added backwardFlag bit handles short, occasional clock rollbacks.
 *
 * Scenarios not handled:
 * a. Clock rollback while the service is stopped.
 * b. Repeated rollbacks within a short period.
 * c. Large rollbacks. (A maximum was added; a large rollback sometimes needs to be surfaced as an error.)
 *
 * coded by mutouren on 2023-03-21
 */
import java.text.SimpleDateFormat;

public class SnowflakeIdWorkerV2 {
    private static final int TIMESTAMP_BITS = 41;
    private static final int WORKER_ID_BITS = 5;
    private static final int DATA_CENTER_ID_BITS = 4;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER_ID = (1L << WORKER_ID_BITS) - 1;
    private static final long MAX_DATA_CENTER_ID = (1L << DATA_CENTER_ID_BITS) - 1;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;
    // largest rollback that is tolerated: one hour, in milliseconds
    private static final long MAX_BACKWARD_TIMESTAMP = 3600 * 1000L;
    // start at 2023-01-01 00:00:00 (UTC+8)
    private static final long START_TIMESTAMP = 1672502400000L;

    private final long workerId;
    private final long dataCenterId;
    private long lastTimestamp = 0L;
    private long lastSequence = 0L;
    // 0: normal 1: backward
    private long backwardFlag = 0L;
    // value of lastTimestamp at the moment the rollback was detected
    private long backupLastTimestampForBackward = 0L;

    public SnowflakeIdWorkerV2(long workerId, long dataCenterId) {
        // check parameters
        checkParameters(workerId, dataCenterId);
        this.workerId = workerId;
        this.dataCenterId = dataCenterId;
    }

    public synchronized long newId() {
        // timestamp: also updates backwardFlag when a rollback starts or ends
        long curTimestamp = this.getCurrentTimestamp(false);
        // sequence: restart at 0 on a new millisecond, otherwise increment
        long curSequence = lastSequence + 1;
        if (curTimestamp > this.lastTimestamp) {
            curSequence = 0;
        }
        // sequence exhausted for this millisecond: wait, then use the next millisecond
        if (curSequence > MAX_SEQUENCE) {
            this.waitForNextTimestamp();
            curTimestamp++;
            curSequence = 0;
        }
        // remember the state for the next call
        this.lastTimestamp = curTimestamp;
        this.lastSequence = curSequence;
        // build id
        return curTimestamp << (WORKER_ID_BITS + DATA_CENTER_ID_BITS + 1 + SEQUENCE_BITS)
                | workerId << (DATA_CENTER_ID_BITS + 1 + SEQUENCE_BITS)
                | dataCenterId << (1 + SEQUENCE_BITS)
                | backwardFlag << SEQUENCE_BITS
                | curSequence;
    }

    private long getCurrentTimestamp(boolean isWaitForNextTimestamp) {
        long curTimestamp = System.currentTimeMillis() - START_TIMESTAMP;
        // while spinning in waitForNextTimestamp, return the raw clock value without rollback handling
        if (isWaitForNextTimestamp) {
            return curTimestamp;
        }
        // the clock has passed the pre-rollback point: back to normal
        if ((this.isBackward()) && (curTimestamp > this.backupLastTimestampForBackward)) {
            this.backwardFlag = 0L;
            return curTimestamp;
        }
        if (curTimestamp < this.lastTimestamp) {
            // a second rollback while the flag is already set, or a rollback that is too large, is refused
            if (this.isBackward() || (this.lastTimestamp - curTimestamp) > MAX_BACKWARD_TIMESTAMP) {
                throw new RuntimeException(String.format("Clock moved backwards. Refusing to generate id, curTimestamp:%s, lastTimestamp:%s, backwardFlag:%s",
                        curTimestamp, this.lastTimestamp, this.backwardFlag));
            }
            // first rollback: set the flag, back up the old timestamp, restart from the current clock
            this.backwardFlag = 1L;
            this.backupLastTimestampForBackward = this.lastTimestamp;
            this.lastSequence = 0L;
            this.lastTimestamp = curTimestamp;
        }
        return curTimestamp;
    }

    private boolean isBackward() {
        return this.backwardFlag > 0;
    }

    private void waitForNextTimestamp() {
        long timestamp = this.getCurrentTimestamp(true);
        // == rather than <=: if the clock moves backwards while waiting, the loop exits
        // and the next newId() call handles the rollback
        while (timestamp == this.lastTimestamp) {
            timestamp = this.getCurrentTimestamp(true);
        }
    }

    private void checkParameters(long workerId, long dataCenterId) {
        int totalBits = 1 + TIMESTAMP_BITS + WORKER_ID_BITS + DATA_CENTER_ID_BITS + 1 + SEQUENCE_BITS;
        if (totalBits != 64) {
            throw new IllegalArgumentException("static parameters error: SnowflakeId must be 64 bits");
        }
        if (workerId > MAX_WORKER_ID || workerId < 0) {
            throw new IllegalArgumentException("workerId can't be greater than MAX_WORKER_ID or less than 0");
        }
        if (dataCenterId > MAX_DATA_CENTER_ID || dataCenterId < 0) {
            throw new IllegalArgumentException("dataCenterId can't be greater than MAX_DATA_CENTER_ID or less than 0");
        }
    }

    public static void main(String[] args) throws Exception {
        // Shows how START_TIMESTAMP was derived. The result depends on the JVM's default
        // time zone; the constant above corresponds to UTC+8.
        long startTimestamp = new SimpleDateFormat("yyyy-MM-dd").parse("2023-01-01").getTime();
        System.out.println("START_TIMESTAMP: " + startTimestamp);
        SnowflakeIdWorkerV2 idWorker = new SnowflakeIdWorkerV2(1, 1);
        for (int i = 0; i < 10; i++) {
            long id = idWorker.newId();
            // index, wall-clock millis, id, id in binary, length of the binary string
            System.out.println(String.format("%05d, %s, %s, %s, %s", i, System.currentTimeMillis(), id, Long.toString(id, 2), Long.toString(id, 2).length()));
        }
    }
}
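Because SnowflakeIdWorkerV2 reads System.currentTimeMillis() directly, a rollback is hard to reproduce on demand. The following is a simplified sketch of the same backwardFlag idea with an injectable clock, so the behaviour can be observed step by step. It is not the class above: the workerId and dataCenterId bits, the sequence-overflow wait, and the MAX_BACKWARD_TIMESTAMP limit are left out, and all names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.LongSupplier;

// Simplified sketch of the backwardFlag idea with an injectable clock.
// Assumptions: node bits, sequence overflow handling, and the rollback size limit
// are omitted; all names here are hypothetical.
final class BackwardFlagDemo {
    private static final int SEQUENCE_BITS = 12;

    private final LongSupplier clock;
    private long lastTimestamp = 0L;
    private long lastSequence = 0L;
    private long backwardFlag = 0L;
    private long backupLastTimestamp = 0L;

    BackwardFlagDemo(LongSupplier clock) {
        this.clock = clock;
    }

    synchronized long newId() {
        long now = clock.getAsLong();
        if (backwardFlag == 1L && now > backupLastTimestamp) {
            backwardFlag = 0L;                  // clock has passed the pre-rollback point
        } else if (now < lastTimestamp) {
            if (backwardFlag == 1L) {
                throw new IllegalStateException("Clock moved backwards again");
            }
            backwardFlag = 1L;                  // first rollback: mark ids, remember the old timestamp
            backupLastTimestamp = lastTimestamp;
            lastTimestamp = now;
            lastSequence = 0L;
        }
        long sequence = now > lastTimestamp ? 0L : lastSequence + 1;
        lastTimestamp = now;
        lastSequence = sequence;
        return now << (1 + SEQUENCE_BITS) | backwardFlag << SEQUENCE_BITS | sequence;
    }

    // Generates one id per clock reading, in order.
    static long[] run(long... clockReadings) {
        long[] time = new long[1];
        BackwardFlagDemo generator = new BackwardFlagDemo(() -> time[0]);
        long[] ids = new long[clockReadings.length];
        for (int i = 0; i < clockReadings.length; i++) {
            time[0] = clockReadings[i];
            ids[i] = generator.newId();
        }
        return ids;
    }

    public static void main(String[] args) {
        // The clock reads 100, 100, 101, is then set back to 100, and advances again.
        long[] ids = run(100, 100, 101, 100, 101, 102);
        Set<Long> distinct = new HashSet<>();
        for (long id : ids) {
            distinct.add(id);
            System.out.println(id + " flag=" + ((id >> SEQUENCE_BITS) & 1));
        }
        System.out.println("distinct ids: " + distinct.size() + " of " + ids.length);
    }
}
```

In this run the fourth and fifth ids carry flag 1, because they reuse milliseconds 100 and 101, and the sixth returns to flag 0 once the clock passes the backed-up timestamp. All six ids are distinct.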
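4 Machine id
To keep generated ids unique in a distributed environment, every machine must be assigned a unique id. Common approaches:
4.1 Configuration file
For example, add the property machine.id=1 to application.properties.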
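How the property is read depends on the framework in use, which the article does not specify. As one possibility, here is a minimal sketch with plain java.util.Properties; the class and method names are hypothetical, and an in-memory string stands in for the real file.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative sketch: reads machine.id with plain java.util.Properties.
// The class and method names are hypothetical; a framework may read the property differently.
final class MachineIdFromProperties {
    static long readMachineId(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = props.getProperty("machine.id");
        if (raw == null) {
            throw new IllegalStateException("machine.id is not configured");
        }
        return Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        // In an application the Reader would wrap the real application.properties file;
        // an in-memory string keeps this example self-contained.
        long machineId = readMachineId(new StringReader("machine.id=1\n"));
        System.out.println("machineId = " + machineId);
    }
}
```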
4.2 Environment variable
String machineId = System.getenv("machine_id");
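System.getenv returns null when the variable is not defined, and the value is a string that still has to be parsed and range-checked before it reaches the id generator. Below is a hedged sketch of that validation; parseMachineId and the 0..31 range (matching a 5-bit workerId) are assumptions made for this example.

```java
// Illustrative sketch: validates the raw value returned by System.getenv("machine_id").
// parseMachineId and the 0..31 range (a 5-bit workerId) are assumptions for this example.
final class MachineIdFromEnv {
    static long parseMachineId(String raw, long maxId) {
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalStateException("environment variable machine_id is not set");
        }
        long id;
        try {
            id = Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalStateException("machine_id is not a number: " + raw, e);
        }
        if (id < 0 || id > maxId) {
            throw new IllegalStateException("machine_id must be between 0 and " + maxId + ": " + id);
        }
        return id;
    }

    public static void main(String[] args) {
        String raw = System.getenv("machine_id");   // null when the variable is not defined
        try {
            System.out.println("machineId = " + parseMachineId(raw, 31));
        } catch (IllegalStateException e) {
            System.out.println("cannot start id generator: " + e.getMessage());
        }
    }
}
```

Failing at startup when the variable is missing or out of range is usually preferable to falling back to a default, since two machines sharing a default id would generate colliding ids.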
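4.3 Third-party resource
Use MySQL, Redis, ZooKeeper, or a similar service for coordination, so that at startup each machine is assigned an id that no other machine holds.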
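The core of such coordination is an atomic counter that every machine shares. The sketch below only illustrates that contract with an in-memory AtomicLong standing in for the external service; in a real deployment the counter would have to live in MySQL, Redis, or ZooKeeper so that all machines see the same state. WorkerIdAllocator is a hypothetical name, not an API of any of those systems.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch: an in-memory stand-in for a shared coordinator.
// Assumption: a real deployment replaces the AtomicLong with an atomic counter
// held by an external service; WorkerIdAllocator is a hypothetical name.
final class WorkerIdAllocator {
    private final AtomicLong next = new AtomicLong(0);
    private final long maxWorkerId;

    WorkerIdAllocator(long maxWorkerId) {
        this.maxWorkerId = maxWorkerId;
    }

    // Hands out 0, 1, 2, ... and refuses once the id space is exhausted.
    long allocate() {
        long id = next.getAndIncrement();
        if (id > maxWorkerId) {
            throw new IllegalStateException("no free worker id left (max " + maxWorkerId + ")");
        }
        return id;
    }

    public static void main(String[] args) {
        WorkerIdAllocator allocator = new WorkerIdAllocator(31);
        // Each starting machine asks the coordinator once and keeps the result.
        System.out.println("machine A -> " + allocator.allocate());
        System.out.println("machine B -> " + allocator.allocate());
        System.out.println("machine C -> " + allocator.allocate());
    }
}
```

This sketch never reclaims ids from machines that have stopped, so a long-lived cluster with frequent restarts would also need some form of lease or release, which is outside the scope of this example.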