A First Look at the PostgreSQL MVCC Source Code
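MVCC (Multi-Version Concurrency Control) is a standard feature of almost every database. So how does PostgreSQL implement it? Using a few simple statements as the entry point, this article takes a first look at the relevant code.
We use Read Committed (RC), PostgreSQL's default and most commonly used isolation level, as the example. Session A and Session B are two connected sessions, and the step number gives the order in which the statements are executed. A table t1 with a single column col already exists and contains one row with the value 1. The statements are then executed in the order shown in the following table: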
| Step | Session A | Session B |
|------|-----------|-----------|
| 1 | begin; insert into t1 values(2); | |
| 2 | | select xmax,xmin,* from t1; |
| 3 | select xmax,xmin,* from t1; commit; | |
| 4 | | select xmax,xmin,* from t1; |
After step 1, a second row with the value 2 has been inserted but not yet committed. When step 2 then runs in Session B, it can find only one row:
postgres=#select xmax,xmin,*from t1;
xmax | xmin | col
------+------+-----
0 | 553 | 1
(1 row)
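However, at this point the page of t1 in the buffer actually holds two tuples; the second one is simply invisible to Session B. Because row 1 was inserted first, its line pointer (the in-page pointer to the actual tuple) comes before that of row 2, so the scan reaches row 1 first. During step 2, a visibility check is performed on every tuple.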
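The scan described above can be pictured with a small toy model. This is only a sketch under simplifying assumptions: `ToyTuple`, `ToyPage`, `toy_xid_in_progress`, and `toy_count_visible` are hypothetical names rather than PostgreSQL structures, and the visibility rule is reduced to "the inserting transaction is not in the reader's list of in-progress transactions".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy types; not PostgreSQL's real page or tuple layout. */
typedef uint32_t ToyXid;

typedef struct ToyTuple {
    ToyXid xmin;   /* transaction that inserted the tuple */
    int    col;    /* the single user column */
} ToyTuple;

typedef struct ToyPage {
    const ToyTuple *items;   /* tuples in line-pointer order */
    size_t          nitems;
} ToyPage;

/* Returns 1 if xid appears in the reader's list of in-progress transactions. */
static int toy_xid_in_progress(ToyXid xid, const ToyXid *running, size_t nrunning)
{
    for (size_t i = 0; i < nrunning; i++)
        if (running[i] == xid)
            return 1;
    return 0;
}

/* Walks every tuple on the page in order and counts those the reader may see.
 * Every tuple is examined; invisible ones are skipped, not absent. */
static size_t toy_count_visible(const ToyPage *page,
                                const ToyXid *running, size_t nrunning)
{
    assert(page != NULL);
    size_t visible = 0;
    for (size_t i = 0; i < page->nitems; i++)
        if (!toy_xid_in_progress(page->items[i].xmin, running, nrunning))
            visible++;
    return visible;
}
```

For instance, if the page holds two tuples with xmin 553 and an assumed xmin of 554 for the second row, and the reader's running list contains 554, `toy_count_visible` returns 1. With an empty running list it returns 2.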
Visibility is decided in HeapTupleSatisfiesMVCC. For the first tuple, the check proceeds as follows:
bool
HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
                       Buffer buffer)
{
    HeapTupleHeader tuple = htup->t_data;
    ....
    // Is the inserting transaction (xmin) already marked committed on the tuple?
    // For the first tuple it is, so this branch is not entered.
    if (!HeapTupleHeaderXminCommitted(tuple))
    {
        ....
    }
    else
    {
        /* xmin is committed, but maybe not according to our snapshot */
        // The condition below is false here, so execution continues.
        if (!HeapTupleHeaderXminFrozen(tuple) &&
            XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot))
            return false;       /* treat as still in progress */
    }
    // True, so return true: the insert has committed and the tuple has
    // never been deleted, so it is visible.
    if (tuple->t_infomask & HEAP_XMAX_INVALID)
        return true;
    .....
}
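To make the XidInMVCCSnapshot test used above more concrete, here is a simplified sketch of the idea behind it: a transaction ID below the snapshot's xmin had already finished, one at or above the snapshot's xmax had not, and anything in between is looked up in the list of transactions that were running when the snapshot was taken. `ToySnapshot` and `toy_xid_in_snapshot` are hypothetical names; the sketch assumes no XID wraparound and ignores subtransactions, both of which the real function has to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyXid;

/* Hypothetical, simplified snapshot: real snapshots also track
 * subtransactions and cope with XID wraparound, both ignored here. */
typedef struct ToySnapshot {
    ToyXid        xmin;  /* every xid below this had already finished */
    ToyXid        xmax;  /* every xid at or above this had not yet finished */
    const ToyXid *xip;   /* xids in [xmin, xmax) that were still running */
    size_t        xcnt;  /* number of entries in xip */
} ToySnapshot;

/* Returns true if xid must be treated as still in progress for this snapshot. */
static bool toy_xid_in_snapshot(ToyXid xid, const ToySnapshot *snap)
{
    assert(snap != NULL);
    if (xid < snap->xmin)
        return false;            /* finished before the snapshot was taken */
    if (xid >= snap->xmax)
        return true;             /* not finished when the snapshot was taken */
    for (size_t i = 0; i < snap->xcnt; i++)
        if (snap->xip[i] == xid)
            return true;         /* was running when the snapshot was taken */
    return false;
}
```

With a snapshot whose xmin is 554, whose xmax is 555, and whose running list holds 554 (illustrative values), xid 553 is reported as not in the snapshot, which corresponds to the first tuple passing the check above.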
Now the check for the second tuple:
bool
HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
                       Buffer buffer)
{
    HeapTupleHeader tuple = htup->t_data;
    ...
    // Entered: the inserting transaction has not committed.
    if (!HeapTupleHeaderXminCommitted(tuple))
    {
        // xmin is not invalid, skip.
        if (HeapTupleHeaderXminInvalid(tuple))
            return false;
        /* Used by pre-9.0 binary upgrades */
        // HEAP_MOVED_OFF is not set, skip.
        if (tuple->t_infomask & HEAP_MOVED_OFF)
        {
            ...
        }
        /* Used by pre-9.0 binary upgrades */
        // Skip.
        else if (tuple->t_infomask & HEAP_MOVED_IN)
        {
            ...
        }
        // Was the tuple inserted by the current transaction? No, skip.
        else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmin(tuple)))
        {
            ...
        }
        // Is the inserting transaction in the snapshot, that is, still in
        // progress when the snapshot was taken? Yes, so return false.
        else if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot))
            return false;
        ....
}
Next we run step 3, the select in Session A. The check for the first tuple is the same as before and is not repeated here; the check for the second tuple is as follows:
bool
HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
                       Buffer buffer)
{
    HeapTupleHeader tuple = htup->t_data;
    ...
    // Entered: the inserting transaction has not committed.
    if (!HeapTupleHeaderXminCommitted(tuple))
    {
        // xmin is not invalid, skip.
        if (HeapTupleHeaderXminInvalid(tuple))
            return false;
        /* Used by pre-9.0 binary upgrades */
        // HEAP_MOVED_OFF is not set, skip.
        if (tuple->t_infomask & HEAP_MOVED_OFF)
        {
            ...
        }
        /* Used by pre-9.0 binary upgrades */
        // Skip.
        else if (tuple->t_infomask & HEAP_MOVED_IN)
        {
            ...
        }
        // Was the tuple inserted by the current transaction? Yes, enter.
        else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmin(tuple)))
        {
            if (HeapTupleHeaderGetCmin(tuple) >= snapshot->curcid)
                return false;   /* inserted after scan started */
            // The insert and the select are in the same transaction and the
            // tuple has never been deleted, so it is visible: return true.
            // Session A therefore sees both rows.
            if (tuple->t_infomask & HEAP_XMAX_INVALID)
                return true;
        }
        .....
}
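The command-ID comparison in this branch is easy to overlook, so here is a minimal sketch of it. It models only the `cmin >= curcid` test and assumes the tuple has never been deleted or locked (the HEAP_XMAX_INVALID case above). `ToyCommandId` and `toy_own_insert_visible` are hypothetical names, and the command numbers in the note below are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t ToyCommandId;

/* Decides whether a tuple inserted by the reader's own transaction is visible.
 * cmin   : number of the command that inserted the tuple
 * curcid : command number recorded in the reader's snapshot
 * Assumption: the tuple has no valid xmax (never deleted or locked). */
static bool toy_own_insert_visible(ToyCommandId cmin, ToyCommandId curcid)
{
    if (cmin >= curcid)
        return false;   /* inserted by this command or a later one */
    return true;        /* inserted by an earlier command of this transaction */
}
```

If the insert in step 1 ran as command 0 and the select in step 3 runs with a snapshot whose curcid is 1, the function returns true, matching the result that Session A sees its own uncommitted row. A scan whose curcid equals the inserting command's number would not see the row.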
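Next, Session A commits and step 4 runs the select from Session B, which now obviously returns two rows. The first tuple is handled as before and is not analyzed again, so what path does the second tuple take this time?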
bool
HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
                       Buffer buffer)
{
    HeapTupleHeader tuple = htup->t_data;
    // Entered: although the inserting transaction has committed, the tuple
    // is not yet marked HEAP_XMIN_COMMITTED (see below).
    if (!HeapTupleHeaderXminCommitted(tuple))
    {
        if (HeapTupleHeaderXminInvalid(tuple))
            return false;
        /* Used by pre-9.0 binary upgrades */
        // False.
        if (tuple->t_infomask & HEAP_MOVED_OFF)
        {
            ...
        }
        /* Used by pre-9.0 binary upgrades */
        // False.
        else if (tuple->t_infomask & HEAP_MOVED_IN)
        {
            ...
        }
        // False.
        else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetRawXmin(tuple)))
        {
            ...
        }
        // False.
        else if (XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot))
            return false;
        // True: check whether the inserting transaction has committed. If it
        // has, the tuple is marked HEAP_XMIN_COMMITTED. In other words, the
        // status on the tuple is not updated at commit time but lazily, the
        // next time the tuple is examined. This normally needs to happen only
        // once: later selects find the bit already set and skip this branch.
        else if (TransactionIdDidCommit(HeapTupleHeaderGetRawXmin(tuple)))
            SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
                        HeapTupleHeaderGetRawXmin(tuple));
        else
        {
            /* it must have aborted or crashed */
            SetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
                        InvalidTransactionId);
            return false;
        }
    }
    else
    {
        /* xmin is committed, but maybe not according to our snapshot */
        if (!HeapTupleHeaderXminFrozen(tuple) &&
            XidInMVCCSnapshot(HeapTupleHeaderGetRawXmin(tuple), snapshot))
            return false;       /* treat as still in progress */
    }
    /* by here, the inserting transaction has committed */
    // As before, true is returned here.
    if (tuple->t_infomask & HEAP_XMAX_INVALID)
        return true;
    ...
}

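Through a concrete example, this article has walked through how PostgreSQL's MVCC (multi-version concurrency control) mechanism decides whether a tuple is visible and how a tuple's status is updated after its inserting transaction commits.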
