MySQL Limitations Part 1: Single-Threaded Replication [MySQL 短板 1]

探讨了MySQL中单线程复制的问题及其对高负载应用的影响,并提出了几种可能的解决方案,包括使用DRBD复制、分片及应用程序内复制等。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

I recently mentioned a few of the big “non-starter” limitations Postgres has overcome for specific use cases. I decided to write a series of blog posts on MySQL’s unsolved severe limitations. I mean limitations that really hobble it for major, important needs — not in areas where it isn’t used, but in areas where it is used and thus is used very wastefully compared to its potential.

The first glaring problem is single-threaded replication. It is severe and getting much worse as servers get more and more CPUs and CPU cores. The replication replay process executes in a single thread on the replicas, and thus has no hope of keeping up with even a moderately busy write load on the primary, where many updates are occurring concurrently.

In a lot of Web applications, this isn’t really seen as a huge limitation. That’s because these apps are mostly read traffic, so a single primary can delegate the read workload across several replicas, and the write workload is still a relatively small fraction of the server’s total capacity. But eventually, it does become a problem if the app gets large enough, no matter how large the read-to-write ratio is.

What are some workarounds? Here are a few I’ve seen:

  • Use DRBD replication instead of MySQL replication. Problem: you end up with an idle standby server, which can’t be used for serving reads, only for disaster recovery. That’s costly.
  • Shard. Write workload is the single most legitimate reason for sharding. It’s too bad that “the replica can’t keep up with the write workload” becomes the reason to shard. I wish we could avoid that.
  • Do the replication in the application. I know of applications where they just don’t use built-in replication. When they modify some data, they do it in both places. That’s a headache.
  • Try obscure techniques such as external processes to prefetch the data the replica is trying to modify, so it can do it faster. This rarely works.

I’m not criticizing anyone who does these things — there really isn’t much of a good choice. It’s all a matter of picking the solution that’s least evil.

Why isn’t there multi-threaded replication? I think that a) it’s not as easy as it seems, and b) there are a hundred edge cases that make it nearly impossible to choose a solution that works for all situations. For example, mixtures of transactional and non-transactional tables are a nightmare.

Here are a few ideas I’ve either heard, or discussed, or thought of myself:

  1. One thread per database. If the application is built such that each database is fully independent, then on the replica, we could start up a thread for each database we see in the binary log, and simply pass the replication events to the appropriate thread.
  2. A pool of threads on the replica, and a coordinating thread that hands work to each of them. The coordinator would read from the relay log until it has a complete transaction’s worth of events (not a single statement), and hand the whole transaction to the worker thread. The worker thread would run up until the COMMIT, but not actually commit, and then report back to the coordinator thread. The coordinator thread would ensure that all of the transactions begin in the same order as defined in the relay log, and commit in the same order. If any error occurred, such as a deadlock or lock wait timeout, then the coordinator would instruct the workers to roll back, and either retry or make them execute the problematic transactions in serial instead of concurrently.
  3. Multiple binary logs on the primary, one per database; one replication process per binary log on the replica. This would have the advantage that it would allow a replica to subscribe to multiple masters, which is currently impossible.

These solutions represent different types of trade-offs. For example, solution 1) only works for specific uses, and I don’t think it’s general-purpose enough. Solution 2) has potentially complex behavior that might not work well in some cases, such as when deadlocks are common; but it is overall the least disruptive or different from the user’s point of view. Solution 3) requires modifying the binary logging code, which is risky. It also requires maintaining many master.info files on the replica, and new SQL syntax for administering replicas, and is generally not something I personally want to administer (replication is fragile enough… imagine recovering after a crash when you have to fix multiple threads that have forgotten where they should be reading in multiple binlogs?).

Regardless of the solution, it is certain that nothing is going to work in all cases; the most common cases will require use of InnoDB with the proper transaction isolation level, at a minimum. This behavior is going to have to default to single-threaded as replication currently does, and only enable the multi-threaded behavior if the user configures it to do so.

I would be in favor of solution 2) with an option to configure it to behave as solution 1).

The other big question is who wants to sponsor its development? You know who to contact if you do!

from:https://www.percona.com/blog/2010/10/20/mysql-limitations-part-1-single-threaded-replication/

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值