int count = updateSql();
if(count==0){
insertSql();
}
这段逻辑常常出现在大批量同步数据的应用中。因为常常需要反复多次同步数据。所以有的人会写这段逻辑:先尝试修改,修改没有命中。则尝试插入。
但是这段逻辑在并发执行的过程中很容易产生死锁。
准备表:
create table t
(
id int not null primary key,
name varchar(10) null
);
INSERT INTO test.t (id, name) VALUES (0, '张三');
INSERT INTO test.t (id, name) VALUES (10, '李四');
INSERT INTO test.t (id, name) VALUES (15, '王五');
开启2个事务,依次执行如下SQL。模拟并发的逻辑:
事务1 | 事务2 |
update t set name='赵四' where id=3; | |
update t set name='钱五' where id=4; | |
insert into t( id, name) VALUES (3,'赵四'); | |
insert into t( id, name) VALUES (4,'钱五'); |
最后会产生死锁。执行show engine innodb status; 观察死锁部分的信息。
LATEST DETECTED DEADLOCK
------------------------
2024-11-21 10:24:01 0x700006473000
*** (1) TRANSACTION:
TRANSACTION 88873, ACTIVE 6 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1128, 2 row lock(s)
MySQL thread id 41, OS thread handle 123145426866176, query id 3536 localhost 127.0.0.1 root update
/* ApplicationName=DataGrip 2023.3.4 */ insert into t( id, name) VALUES (4,'钱五')
*** (1) HOLDS THE LOCK(S):
RECORD LOCKS space id 26 page no 4 n bits 80 index PRIMARY of table `test`.`t` trx id 88873 lock_mode X locks gap before rec
Record lock, heap no 9 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 4; hex 8000000a; asc ;;
1: len 6; hex 000000015b0d; asc [ ;;
2: len 7; hex 02000001160c5f; asc _;;
3: len 6; hex e69d8ee59b9b; asc ;;
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 26 page no 4 n bits 80 index PRIMARY of table `test`.`t` trx id 88873 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 9 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 4; hex 8000000a; asc ;;
1: len 6; hex 000000015b0d; asc [ ;;
2: len 7; hex 02000001160c5f; asc _;;
3: len 6; hex e69d8ee59b9b; asc ;;
*** (2) TRANSACTION:
TRANSACTION 88872, ACTIVE 11 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1128, 2 row lock(s)
MySQL thread id 40, OS thread handle 123145428996096, query id 3545 localhost 127.0.0.1 root update
/* ApplicationName=DataGrip 2023.3.4 */ insert into t( id, name) VALUES (3,'赵四')
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 26 page no 4 n bits 80 index PRIMARY of table `test`.`t` trx id 88872 lock_mode X locks gap before rec
Record lock, heap no 9 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 4; hex 8000000a; asc ;;
1: len 6; hex 000000015b0d; asc [ ;;
2: len 7; hex 02000001160c5f; asc _;;
3: len 6; hex e69d8ee59b9b; asc ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 26 page no 4 n bits 80 index PRIMARY of table `test`.`t` trx id 88872 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 9 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 4; hex 8000000a; asc ;;
1: len 6; hex 000000015b0d; asc [ ;;
2: len 7; hex 02000001160c5f; asc _;;
3: len 6; hex e69d8ee59b9b; asc ;;
*** WE ROLL BACK TRANSACTION (2)
2个事务的锁情况一模一样。
获取了space id 26 page no 4 n bits 80位置处的间隙锁“lock_mode X locks gap before rec”
等待获取space id 26 page no 4 n bits 80位置处的插入意向锁“lock_mode X locks gap before rec insert intention waiting”。
因为各自手里都持有阻塞对面的间隙锁。又都不放手,所以死锁产生了。
这里比较难以理解的一个点就是:任何线程都可以持有同样的一段间隙锁,互不排斥。
所以2个事务手里都有间隙锁,但是当他们都想插入的时候,插入意向锁则被对方手里的间隙锁排斥了。
代码写出这个情况,容易误判处在于“update语句表面上走的主键,看似只会锁对应的一条数据。但是在没有命中的情况下,其实去获取了那个对应区间段的整个间隙锁。”
针对这种情况,我们可以将这段逻辑的会话事务等级改为READ COMMITED。就没有间隙锁的事儿了。因为同步数据的时候,其实我们并不在乎会不会有幻读。间隙锁在这段逻辑中只有反作用。