Why Vector Clocks Are Hard

本文通过实例探讨了向量时钟在分布式系统中面临的两大难题:确定行为者身份及限制向量时钟的增长。并讨论了解决方案,包括将服务器设为行为者以及修剪向量时钟。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

转载地址http://basho.com/why-vector-clocks-are-hard/

April 5, 2010

A couple of months ago, Bryan wrote about vector clocks on this blog. The title of the post was “Why Vector Clocks are Easy”; anyone who read the post would realize that he meant that they’re easy for a client to use when talking to a system that implements them. For that reason, there is no reason to fear or avoid using a service that exposes the existence of vector clocks in its API.

Of course, actually implementing such a system is not easy. Two of the hardest things are deciding what  an actor is (i.e. where the incrementing and resolution is, and what parties get their own field in the vector) and how to keep vclocks from growing without bound over time.

In Bryan’s example the parties that actually proposed changes (“clients”) were the actors in the vector clocks. This is the model that vector clocks are designed for and work well with, but it has a drawback. The width of the vectors will grow proportionally with the number of clients. In a group of friends deciding when to have dinner this isn’t a problem, but in a distributed storage system the number of clients over time can be large. Once the vector clocks get that
large, they not only take up more space in disk and RAM but also take longer to compute comparisons over.

Let’s run through that same example again, but this time visualize the vector clocks throughout the scenario. If you don’t recall the whole story in the example, you should read Bryan’s post again as I am just going to show the data flow aspect of it here.

Vector Clocks by Example, in detail

Start with Alice’s initial message where she suggests Wednesday. (In the diagrams I abbreviate names, so that “Alice” will be “A” in the vclocks and so on for Ben, Cathy, and Dave.)

date = Wednesday
vclock = Alice:1

Ben suggests Tuesday:

date = Tuesday
vclock = Alice:1, Ben:1

Dave replies, confirming Tuesday:

date = Tuesday
vclock = Alice:1, Ben:1, Dave:1

Now Cathy gets into the act, suggesting Thursday:

date = Thursday
vclock = Alice:1, Cathy:1

Dave has two conflicting objects:

date = Tuesday
vclock = Alice:1, Ben:1, Dave:1

and

date = Thursday
vclock = Alice:1, Cathy:1

Dave can tell that these versions are in conflict, because neither vclock “descends” from the other. Luckily, Dave’s a reasonable guy, and chooses Thursday. Dave also created a vector clock that is a successor to all previously-seen vector clocks. He emails this value back to Cathy.

date = Thursday
vclock = Alice:1, Ben:1, Cathy:1, Dave:2

So now when Alice asks Ben and Cathy for the latest decision, the replies she receives are, from Ben:

date = Tuesday
vclock = Alice:1, Ben:1, Dave:1

and from Cathy:

date = Thursday
vclock = Alice:1, Ben:1, Cathy:1, Dave:2

From this, she can tell that Dave intended his correspondence with Cathy to override the decision he made with Ben. All Alice has to do is show Ben the vector clock from Cathy’s message, and Ben will know that he has been overruled.

That worked out pretty well.

Making it Easier Makes it Harder

Notice that even in this short and simple example the vector clock grew from nothing up to a 4-pairs mapping? In a real world scenario with long-lived data, each data element would end up with a vector clock with a length proportional to the number of clients that had ever modified it. That’s a (potentially unbounded) large growth in storage volume and computation, so it’s a good idea to think about how to prevent it.

One straightforward idea is to make the servers handling client requests be the “actors”, instead of representing the clients directly. Since any given system usually has a known bounded number of servers over time and also usually has less servers than clients, this serves to reduce and cap the size of the vclocks. I know of at least two real systems that have tried this. In addition to keeping growth under control, this approach attracts people because it means you don’t expose “hard” things like vector clocks to clients at all.

Let’s think through the same example, but with that difference, to see how it goes. We’ll assume that a 2-server distributed system is coordinating the communication, with clients evenly distributed among them. We’ll be easy on ourselves and allow for client affinity, so for the duration of the session each client will use only one server. Alice and Dave happen to get server X, and Ben and Cathy get server Y. To avoid getting too complicated here I am not going to draw the server communication; instead I’ll just abstract over it by changing the vector clocks accordingly.

We’re fine through the first few steps:

The only real difference so far is that each update increments a vector clock field named for the client’s chosen server instead of the client itself. This will mean that the number of fields needed won’t grow without bound; it will be the same as the number of servers. This is the desired effect of the change.

We run into trouble, though, when Cathy sends her update:

In the original example, this is where a conflict was created. Dave sorted out the conflict, and everything was fine. With our new strategy, though, something else happened. Ben and Cathy were both modifying from the same original object. Since we used their server id instead of their own name to identify the change, Cathy’s message has the same vector clock as Ben’s! This means that Dave’s message (responding to Ben) appears to be a simple successor to Cathy’s… and we lose her data silently!

Clearly, this approach won’t work. Remember the two systems I mentioned that tried this approach. Neither of them stuck with it once they discovered that it can be expected to silently lose updates.

For vector clocks to have their desired effect without causing accidents such as this, the elements represented by the fields in the vclock must be the real units of concurrency. In a case like this little example or a distributed storage system, that means client identifiers, not server-based ones.

Just Lose a Little Information and Everything Will Be Fine

If we use client identifiers, we’re back in the situation where vector clocks will grow and grow as more clients use a system over time. The solution most people end up with is to “prune” their vector clocks as they grow.

This is done by adding a timestamp to each field, and updating it to the current local time whenever that field is incremented. This timestamp is never used for vclock comparison — that is purely a matter of logical time — but is only for pruning purposes.

This way, when a given vclock gets too big, you can remove fields, starting at the one that was updated longest ago, until you hit a size/age threshold that makes sense for your application.

But, you ask, doesn’t this lose information?

Yes, it does — but it won’t make you lose your data. The only case where this kind of pruning will matter at all is when a client holds a very old copy of the unpruned vclock and submits data descended from that. This will create a sibling (conflict) even though you might have been able to resolve it automatically if you had the complete unpruned vclock at the server. That is the tradeoff with pruning: in exchange for keeping growth under control, you run the chance of occasionally having to do a “false merge”… but you never lose data quietly, which makes this approach unequivocally better than moving the field identifiers off of the real client and onto the server.

Review

So, vclocks are hard: even with perfect implementation you can’t have perfect information about causality in an open system without unbounded information growth. Realize this and design accordingly.

Of course, that is just advice for people building brand new distributed systems or trying to improve existing ones. Usinga system that exposes vclocks is still easy.

一、综合实战—使用极轴追踪方式绘制信号灯 实战目标:利用对象捕捉追踪和极轴追踪功能创建信号灯图形 技术要点:结合两种追踪方式实现精确绘图,适用于工程制图中需要精确定位的场景 1. 切换至AutoCAD 操作步骤: 启动AutoCAD 2016软件 打开随书光盘中的素材文件 确认工作空间为"草图与注释"模式 2. 绘图设置 1)草图设置对话框 打开方式:通过"工具→绘图设置"菜单命令 功能定位:该对话框包含捕捉、追踪等核心绘图辅助功能设置 2)对象捕捉设置 关键配置: 启用对象捕捉(F3快捷键) 启用对象捕捉追踪(F11快捷键) 勾选端点、中心、圆心、象限点等常用捕捉模式 追踪原理:命令执行时悬停光标可显示追踪矢量,再次悬停可停止追踪 3)极轴追踪设置 参数设置: 启用极轴追踪功能 设置角度增量为45度 确认后退出对话框 3. 绘制信号灯 1)绘制圆形 执行命令:"绘图→圆→圆心、半径"命令 绘制过程: 使用对象捕捉追踪定位矩形中心作为圆心 输入半径值30并按Enter确认 通过象限点捕捉确保圆形位置准确 2)绘制直线 操作要点: 选择"绘图→直线"命令 捕捉矩形上边中点作为起点 捕捉圆的上象限点作为终点 按Enter结束当前直线命令 重复技巧: 按Enter可重复最近使用的直线命令 通过圆心捕捉和极轴追踪绘制放射状直线 最终形成完整的信号灯指示图案 3)完成绘制 验证要点: 检查所有直线是否准确连接圆心和象限点 确认极轴追踪的45度增量是否体现 保存绘图文件(快捷键Ctrl+S)
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值