Distributed Locks with Redis

This article describes in detail how to implement distributed locks with Redis. It focuses on the shortcomings of the single-instance implementation and presents the Redlock algorithm, which improves safety and liveness by acquiring the lock on multiple independent Redis instances. Redlock ensures that locks can be acquired and released safely as long as a majority of the nodes are healthy, avoiding race conditions and deadlocks. The article also discusses the algorithm's performance, crash recovery, and persistence strategies.

Distributed locks are a very useful primitive in many environments where different processes must operate with shared resources in a mutually exclusive way.

There are a number of libraries and blog posts describing how to implement a DLM (Distributed Lock Manager) with Redis, but every library uses a different approach, and many use a simple approach with lower guarantees compared to what can be achieved with slightly more complex designs.

This page describes a more canonical algorithm to implement distributed locks with Redis. We propose an algorithm, called Redlock, which implements a DLM which we believe to be safer than the vanilla single instance approach. We hope that the community will analyze it, provide feedback, and use it as a starting point for the implementations or more complex or alternative designs.

Safety and Liveness Guarantees

We are going to model our design with just three properties that, from our point of view, are the minimum guarantees needed to use distributed locks in an effective way.

  • Safety property: Mutual exclusion. At any given moment, only one client can hold a lock.
  • Liveness property A: Deadlock free. Eventually it is always possible to acquire a lock, even if the client that locked a resource crashes or gets partitioned.
  • Liveness property B: Fault tolerance. As long as the majority of Redis nodes are up, clients are able to acquire and release locks.

Why Failover-based Implementations Are Not Enough

To understand what we want to improve, let's analyze the current state of affairs with most Redis-based distributed lock libraries.

The simplest way to use Redis to lock a resource is to create a key in an instance. The key is usually created with a limited time to live, using the Redis expires feature, so that eventually it will get released (liveness property A in our list). When the client needs to release the resource, it deletes the key.

Superficially this works well, but there is a problem: this is a single point of failure in our architecture. What happens if the Redis master goes down? Well, let's add a replica! And use it if the master is unavailable. This is unfortunately not viable. By doing so we can't implement our safety property of mutual exclusion, because Redis replication is asynchronous.

There is a race condition with this model:

  1. Client A acquires the lock in the master.
  2. The master crashes before the write to the key is transmitted to the replica.
  3. The replica gets promoted to master.
  4. Client B acquires the lock to the same resource A already holds a lock for. SAFETY VIOLATION!

Sometimes it is perfectly fine that, under special circumstances, for example during a failure, multiple clients can hold the lock at the same time. If this is the case, you can use your replication-based solution. Otherwise we suggest implementing the solution described in this document.

Correct Implementation with a Single Instance

Before trying to overcome the limitation of the single instance setup described above, let's check how to do it correctly in this simple case, since this is actually a viable solution in applications where a race condition from time to time is acceptable, and because locking in a single instance is the foundation we'll use for the distributed algorithm described here.

To acquire the lock, the way to go is the following:

    SET resource_name my_random_value NX PX 30000

The command will set the key only if it does not already exist (NX option), with an expire of 30000 milliseconds (PX option). The key is set to a value "my_random_value". This value must be unique across all clients and all lock requests.
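
To make the acquisition step concrete, here is a minimal sketch using Python and the redis-py client; the connection parameters, resource name, and TTL are assumptions made for illustration, not part of the original text.

    import secrets
    import redis

    client = redis.Redis(host="localhost", port=6379)

    def acquire_lock(resource: str, ttl_ms: int = 30000):
        """Try to take the lock once; return the token on success, None otherwise."""
        token = secrets.token_hex(20)  # plays the role of my_random_value
        # SET resource token NX PX ttl_ms: succeeds only if the key does not already exist.
        if client.set(resource, token, nx=True, px=ttl_ms):
            return token
        return None

    token = acquire_lock("resource_name")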

Basically the random value is used in order to release the lock in a safe way, with a script that tells Redis: remove the key only if it exists and the value stored at the key is exactly the one I expect to be. This is accomplished by the following Lua script:

    if redis.call("get",KEYS[1]) == ARGV[1] then
        return redis.call("del",KEYS[1])
    else
        return 0
    end

This is important in order to avoid removing a lock that was created by another client. For example a client may acquire the lock, get blocked performing some operation for longer than the lock validity time (the time at which the key will expire), and later remove the lock, that was already acquired by some other client. Using just DEL is not safe as a client may remove another client's lock. With the above script instead every lock is "signed" with a random string, so the lock will be removed only if it is still the one that was set by the client trying to remove it.
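
A hedged companion sketch of the release step, again using redis-py: the script is sent with EVAL so the GET and DEL run atomically on the server and the key is only deleted while it still holds the caller's token.

    import redis

    client = redis.Redis(host="localhost", port=6379)

    RELEASE_SCRIPT = """
    if redis.call("get", KEYS[1]) == ARGV[1] then
        return redis.call("del", KEYS[1])
    else
        return 0
    end
    """

    def release_lock(resource: str, token: str) -> bool:
        # Returns True only if the key existed and still carried our token.
        return client.eval(RELEASE_SCRIPT, 1, resource, token) == 1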

What should this random string be? We assume it's 20 bytes from /dev/urandom, but you can find cheaper ways to make it unique enough for your tasks. For example a safe pick is to seed RC4 with /dev/urandom, and generate a pseudo random stream from that. A simpler solution is to use a UNIX timestamp with microsecond precision, concatenating the timestamp with a client ID. It is not as safe, but probably sufficient for most environments.
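
Both options are easy to express in code; the snippet below is purely illustrative and the client identifier is a hypothetical value.

    import os
    import time

    # Stronger: 20 random bytes, hex-encoded for convenience.
    token_strong = os.urandom(20).hex()

    # Cheaper but weaker: microsecond UNIX timestamp plus a per-client ID.
    CLIENT_ID = "client-42"  # hypothetical unique identifier for this client
    token_cheap = "%d:%s" % (int(time.time() * 1_000_000), CLIENT_ID)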

The "lock validity time" is the time we use as the key's time to live. It is both the auto release time, and the time the client has in order to perform the operation required before another client may be able to acquire the lock again, without technically violating the mutual exclusion guarantee, which is only limited to a given window of time from the moment the lock is acquired.

So now we have a good way to acquire and release the lock. With this system, reasoning about a non-distributed system composed of a single, always available, instance, is safe. Let's extend the concept to a distributed system where we don't have such guarantees.

The Redlock Algorithm

In the distributed version of the algorithm we assume we have N Redis masters. Those nodes are totally independent, so we don't use replication or any other implicit coordination system. We already described how to acquire and release the lock safely in a single instance. We take for granted that the algorithm will use this method to acquire and release the lock in a single instance. In our examples we set N=5, which is a reasonable value, so we need to run 5 Redis masters on different computers or virtual machines in order to ensure that they'll fail in a mostly independent way.

In order to acquire the lock, the client performs the following operations:

  1. It gets the current time in milliseconds.
  2. It tries to acquire the lock in all the N instances sequentially, using the same key name and random value in all the instances. During step 2, when setting the lock in each instance, the client uses a timeout which is small compared to the total lock auto-release time in order to acquire it. For example if the auto-release time is 10 seconds, the timeout could be in the ~ 5-50 milliseconds range. This prevents the client from remaining blocked for a long time trying to talk with a Redis node which is down: if an instance is not available, we should try to talk with the next instance ASAP.
  3. The client computes how much time elapsed in order to acquire the lock, by subtracting from the current time the timestamp obtained in step 1. If and only if the client was able to acquire the lock in the majority of the instances (at least 3), and the total time elapsed to acquire the lock is less than lock validity time, the lock is considered to be acquired.
  4. If the lock was acquired, its validity time is considered to be the initial validity time minus the time elapsed, as computed in step 3.
  5. If the client failed to acquire the lock for some reason (either it was not able to lock N/2+1 instances or the validity time is negative), it will try to unlock all the instances (even the instances it believed it was not able to lock).
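
The five steps can be sketched roughly as follows in Python with redis-py; the instance addresses, timeouts, and helper names are assumptions made for the example, and a production deployment should prefer a tested client library.

    import secrets
    import time
    import redis

    RELEASE_SCRIPT = """
    if redis.call("get", KEYS[1]) == ARGV[1] then
        return redis.call("del", KEYS[1])
    else
        return 0
    end
    """

    # N = 5 independent masters; hosts and timeouts are placeholders.
    INSTANCES = [
        redis.Redis(host=h, port=6379, socket_timeout=0.05, socket_connect_timeout=0.05)
        for h in ("redis1", "redis2", "redis3", "redis4", "redis5")
    ]
    QUORUM = len(INSTANCES) // 2 + 1  # N/2+1

    def redlock_acquire(resource: str, ttl_ms: int):
        token = secrets.token_hex(20)
        start = time.monotonic()                        # step 1: note the current time
        acquired = 0
        for inst in INSTANCES:                          # step 2: same key and value everywhere
            try:
                if inst.set(resource, token, nx=True, px=ttl_ms):
                    acquired += 1
            except redis.exceptions.RedisError:
                pass                                    # an unreachable instance counts as a miss
        elapsed_ms = (time.monotonic() - start) * 1000  # step 3: time spent acquiring
        validity_ms = ttl_ms - elapsed_ms               # step 4: remaining validity
        if acquired >= QUORUM and validity_ms > 0:
            return token, validity_ms                   # lock considered acquired
        for inst in INSTANCES:                          # step 5: failure, unlock everywhere
            try:
                inst.eval(RELEASE_SCRIPT, 1, resource, token)
            except redis.exceptions.RedisError:
                pass
        return None, 0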

Is the Algorithm Asynchronous?

The algorithm relies on the assumption that while there is no synchronized clock across the processes, the local time in every process updates at approximately the same rate, with a small margin of error compared to the auto-release time of the lock. This assumption closely resembles a real-world computer: every computer has a local clock and we can usually rely on different computers to have a clock drift which is small.

At this point we need to better specify our mutual exclusion rule: it is guaranteed only as long as the client holding the lock terminates its work within the lock validity time (as obtained in step 3), minus some time (just a few milliseconds in order to compensate for clock drift between processes).
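
Client-side sketches usually fold this rule into the validity computation by subtracting a small drift allowance; the factor below is a common heuristic and an assumption of this example, not something mandated by the algorithm.

    def effective_validity(ttl_ms: float, elapsed_ms: float) -> float:
        # Assume roughly 1% of the TTL plus a couple of milliseconds of clock drift.
        drift_ms = ttl_ms * 0.01 + 2
        return ttl_ms - elapsed_ms - drift_ms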

Retry on Failure

When a client is unable to acquire the lock, it should try again after a random delay in order to try to desynchronize multiple clients trying to acquire the lock for the same resource at the same time (this may result in a split brain condition where nobody wins). Also the faster a client tries to acquire the lock in the majority of Redis instances, the smaller the window for a split brain condition (and the need for a retry), so ideally the client should try to send the SET commands to the N instances at the same time using multiplexing.

It is worth stressing how important it is for clients that fail to acquire the majority of locks, to release the (partially) acquired locks ASAP, so that there is no need to wait for key expiry in order for the lock to be acquired again (however if a network partition happens and the client is no longer able to communicate with the Redis instances, there is an availability penalty to pay as it waits for key expiration).
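
A hedged sketch of such a retry loop; it takes the acquisition function (for example the redlock_acquire helper assumed earlier) as a parameter, and the delay bounds are arbitrary illustrative values.

    import random
    import time

    def acquire_with_retry(acquire_once, max_attempts: int = 5):
        for _ in range(max_attempts):
            token, validity_ms = acquire_once()
            if token is not None:
                return token, validity_ms
            # Random delay to desynchronize clients contending for the same resource.
            time.sleep(random.uniform(0.05, 0.2))
        return None, 0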

Releasing the Lock

Releasing the lock is simple, and can be performed whether or not the client believes it was able to successfully lock a given instance.
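
In code this is simply the safe-release script sent to every instance, ignoring individual failures; the names below follow the earlier sketches and are assumptions of this example.

    import redis

    RELEASE_SCRIPT = """
    if redis.call("get", KEYS[1]) == ARGV[1] then
        return redis.call("del", KEYS[1])
    else
        return 0
    end
    """

    def redlock_release(instances, resource: str, token: str) -> None:
        for inst in instances:
            try:
                # An instance we never managed to lock simply returns 0.
                inst.eval(RELEASE_SCRIPT, 1, resource, token)
            except redis.exceptions.RedisError:
                pass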

Safety Arguments

Is the algorithm safe? Let's examine what happens in different scenarios.

To start let's assume that a client is able to acquire the lock in the majority of instances. All the instances will contain a key with the same time to live. However, the key was set at different times, so the keys will also expire at different times. But if the first key was set at worst at time T1 (the time we sample before contacting the first server) and the last key was set at worst at time T2 (the time we obtained the reply from the last server), we are sure that the first key to expire in the set will exist for at least MIN_VALIDITY=TTL-(T2-T1)-CLOCK_DRIFT. All the other keys will expire later, so we are sure that the keys will be simultaneously set for at least this time.
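
To make the bound concrete with illustrative numbers (these values are assumptions, not taken from the original text): with TTL = 10000 ms, an acquisition window of T2-T1 = 300 ms, and a clock drift allowance of 102 ms, the first key to expire is still guaranteed to live for

    MIN_VALIDITY = TTL - (T2 - T1) - CLOCK_DRIFT
                 = 10000 ms - 300 ms - 102 ms
                 = 9598 ms

so all the keys in the majority are simultaneously set for at least roughly 9.6 seconds.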

During the time that the majority of keys are set, another client will not be able to acquire the lock, since N/2+1 SET NX operations can't succeed if N/2+1 keys already exist. So if a lock was acquired, it is not possible to re-acquire it at the same time (violating the mutual exclusion property).

However we want to also make sure that multiple clients trying to acquire the lock at the same time can't simultaneously succeed.

If a client locked the majority of instances using a time near, or greater, than the lock maximum validity time (the TTL we use for SET basically), it will consider the lock invalid and will unlock the instances, so we only need to consider the case where a client was able to lock the majority of instances in a time which is less than the validity time. In this case for the argument already expressed above, for MIN_VALIDITY no client should be able to re-acquire the lock. So multiple clients will be able to lock N/2+1 instances at the same time (with "time" being the end of Step 2) only when the time to lock the majority was greater than the TTL time, making the lock invalid.

Liveness Arguments

The system liveness is based on three main features:

  1. The auto release of the lock (since keys expire): eventually keys are available again to be locked.
  2. The fact that clients, usually, will cooperate removing the locks when the lock was not acquired, or when the lock was acquired and the work terminated, making it likely that we don't have to wait for keys to expire to re-acquire the lock.
  3. The fact that when a client needs to retry a lock, it waits a time which is comparably greater than the time needed to acquire the majority of locks, in order to probabilistically make split brain conditions during resource contention unlikely.

However, we pay an availability penalty equal to TTL time on network partitions, so if there are continuous partitions, we can pay this penalty indefinitely. This happens every time a client acquires a lock and gets partitioned away before being able to remove the lock.

Basically if there are infinite continuous network partitions, the system may become not available for an infinite amount of time.

Performance, Crash Recovery and fsync

Many users using Redis as a lock server need high performance in terms of both latency to acquire and release a lock, and number of acquire / release operations that it is possible to perform per second. In order to meet this requirement, the strategy to talk with the N Redis servers to reduce latency is definitely multiplexing (putting the socket in non-blocking mode, sending all the commands, and reading the replies later, assuming that the RTT between the client and each instance is similar).
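
True multiplexing means non-blocking sockets on a single thread; as a rough approximation of the same goal, the hedged sketch below simply issues the SET commands concurrently from a thread pool so that the total latency approaches a single RTT.

    from concurrent.futures import ThreadPoolExecutor

    import redis

    def set_on_all(instances, resource: str, token: str, ttl_ms: int) -> int:
        """Send SET ... NX PX to every instance roughly in parallel; return the success count."""
        def try_one(inst):
            try:
                return bool(inst.set(resource, token, nx=True, px=ttl_ms))
            except redis.exceptions.RedisError:
                return False
        with ThreadPoolExecutor(max_workers=len(instances)) as pool:
            return sum(pool.map(try_one, instances))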

However there is another consideration around persistence if we want to target a crash-recovery system model.

Basically to see the problem here, let's assume we configure Redis without persistence at all. A client acquires the lock in 3 of 5 instances. One of the instances where the client was able to acquire the lock is restarted. At this point there are again 3 instances that we can lock for the same resource, and another client can lock it again, violating the safety property of exclusivity of the lock.

If we enable AOF persistence, things will improve quite a bit. For example we can upgrade a server by sending it a SHUTDOWN command and restarting it. Because Redis expires are semantically implemented so that time still elapses when the server is off, all our requirements are fine. However everything is fine as long as it is a clean shutdown. What about a power outage? If Redis is configured, as by default, to fsync on disk every second, it is possible that after a restart our key is missing. In theory, if we want to guarantee the lock safety in the face of any kind of instance restart, we need to enable fsync=always in the persistence settings. This will affect performance due to the additional sync overhead.
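
In redis.conf terms the trade-off is between the default everysec policy and the stricter always policy; the directives below show the relevant settings, and which one to pick is a deployment decision rather than part of the algorithm.

    # Enable the append-only file.
    appendonly yes

    # Default policy: fsync about once per second; a lock key written just before
    # a power loss may disappear after restart.
    # appendfsync everysec

    # Stricter policy: fsync on every write, safer for locks at a performance cost.
    appendfsync always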

However things are better than they look at first glance. Basically, the algorithm safety is retained as long as when an instance restarts after a crash, it no longer participates in any currently active lock. This means that the set of currently active locks when the instance restarts were all obtained by locking instances other than the one which is rejoining the system.

To guarantee this we just need to make an instance, after a crash, unavailable for at least a bit more than the max TTL we use. This is the time needed for all the keys about the locks that existed when the instance crashed to become invalid and be automatically released.

Using delayed restarts it is basically possible to achieve safety even without any kind of Redis persistence available, however note that this may translate into an availability penalty. For example if a majority of instances crash, the system will become globally unavailable for TTL (here globally means that no resource at all will be lockable during this time).

Making the Algorithm More Reliable: Extending the Lock

If the work performed by clients consists of small steps, it is possible to use smaller lock validity times by default, and extend the algorithm implementing a lock extension mechanism. Basically the client, if in the middle of the computation while the lock validity is approaching a low value, may extend the lock by sending a Lua script to all the instances that extends the TTL of the key if the key exists and its value is still the random value the client assigned when the lock was acquired.

The client should only consider the lock re-acquired if it was able to extend the lock into the majority of instances, and within the validity time (basically the algorithm to use is very similar to the one used when acquiring the lock).
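
A hedged sketch of such an extension, in the same style as the earlier examples: the Lua script bumps the TTL only while the key still holds the client's token, and the client applies the same majority and validity checks as during acquisition (PEXPIRE and the helper names are assumptions of this sketch, not code from the original article).

    import time
    import redis

    EXTEND_SCRIPT = """
    if redis.call("get", KEYS[1]) == ARGV[1] then
        return redis.call("pexpire", KEYS[1], ARGV[2])
    else
        return 0
    end
    """

    def redlock_extend(instances, resource: str, token: str, ttl_ms: int):
        start = time.monotonic()
        extended = 0
        for inst in instances:
            try:
                if inst.eval(EXTEND_SCRIPT, 1, resource, token, ttl_ms) == 1:
                    extended += 1
            except redis.exceptions.RedisError:
                pass
        validity_ms = ttl_ms - (time.monotonic() - start) * 1000
        # Same rule as acquisition: a majority must succeed within the validity time.
        if extended >= len(instances) // 2 + 1 and validity_ms > 0:
            return validity_ms
        return None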

However this does not technically change the algorithm, so the maximum number of lock reacquisition attempts should be limited, otherwise one of the liveness properties is violated.
