Set the RegionServer interface to TransactionalRegionInterface and the RegionServer implementation class to TransactionalRegionServer:
conf.set(HConstants.REGION_SERVER_CLASS,
    TransactionalRegionInterface.class.getName());
conf.set(HConstants.REGION_SERVER_IMPL,
    TransactionalRegionServer.class.getName());
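TransactionalRegionInterface extends HRegionInterface and adds overloads of its put, get, and delete methods. At a glance, each overload takes one extra parameter of type long, transactionId, which is the transaction id. It also adds the transaction-related commit and abort methods.
TransactionalRegionServer implements TransactionalRegionInterface; it is described below.
TransactionalRegionServer takes over the work of HRegionServer. When HBase starts, TransactionalRegionServer starts two additional threads to handle transactions:
1) CleanOldTransactionsChore: cleans up committed transactions;
2) Leases: manages transaction leases.
TransactionalRegionServer overrides the instantiateRegion and instantiateHLog methods so that TransactionalRegion and THLog take over the work of HRegion and HLog, respectively.
Create a TransactionManager instance: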
transactionManager = new TransactionManager(conf);
Internally, this does the following:
1) Obtain the singleton transaction logger:
public TransactionManager(final HBaseConfiguration conf) {
  this(LocalTransactionLogger.getInstance(), conf);
}
public synchronized static LocalTransactionLogger getInstance() {
  if (instance == null) {
    instance = new LocalTransactionLogger();
  }
  return instance;
}
2) Obtain the connection:
public TransactionManager(final TransactionLogger transactionLogger,
    final HBaseConfiguration conf) {
  this.transactionLogger = transactionLogger;
  connection = HConnectionManager.getConnection(conf);
}
Begin a transaction:
TransactionState transactionState = transactionManager.beginTransaction();
Internally, this is implemented as follows:
public TransactionState beginTransaction() {
  long transactionId = transactionLogger.createNewTransactionLog();
  LOG.debug("Begining transaction " + transactionId);
  return new TransactionState(transactionId);
}
createNewTransactionLog is implemented as follows:
public long createNewTransactionLog() {
  long id;
  do {
    id = random.nextLong();
  } while (transactionIdToStatusMap.containsKey(id));
  transactionIdToStatusMap.put(id, TransactionStatus.PENDING);
  return id;
}
The method draws random transaction ids until it finds one that is not already in use, sets that id's status to TransactionStatus.PENDING, and stores it in transactionIdToStatusMap.
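The id allocation described above can be sketched in isolation. This is a minimal illustration, not the HBase class: TxIdLoggerSketch, Status, and getStatus are hypothetical names. It also swaps the containsKey/put pair for a single putIfAbsent call, so that claiming an id is one atomic step; that is a design choice of this sketch, not something the original code does.

```java
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the transaction logger described above; not the HBase class.
final class TxIdLoggerSketch {
  enum Status { PENDING, COMMITTED, ABORTED }

  private final Random random = new Random();
  private final Map<Long, Status> statusById = new ConcurrentHashMap<>();

  // Draw random ids until an unused one is claimed.
  // putIfAbsent returns null only when this call inserted the PENDING entry.
  long createNewTransactionLog() {
    long id;
    do {
      id = random.nextLong();
    } while (statusById.putIfAbsent(id, Status.PENDING) != null);
    return id;
  }

  Status getStatus(long id) {
    return statusById.get(id);
  }

  public static void main(String[] args) {
    TxIdLoggerSketch logger = new TxIdLoggerSketch();
    long first = logger.createNewTransactionLog();
    long second = logger.createNewTransactionLog();
    System.out.println("distinct ids: " + (first != second));
    System.out.println("status of first: " + logger.getStatus(first));
  }
}
```

Because ids are random rather than sequential, they carry no ordering information; any ordering between transactions has to come from elsewhere, such as sequence numbers kept per region.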
The client creates a TransactionalTable:
table = new TransactionalTable(conf, desc.getName());
TransactionalTable extends HTable and contains a static initializer block that initializes TransactionalRPC:
static {
  TransactionalRPC.initialize();
}
The initialization is implemented as follows:
public synchronized static void initialize() {
  if (initialized) {
    return;
  }
  HBaseRPC.addToMap(TransactionalRegionInterface.class, RPC_CODE);
  initialized = true;
}
This adds the TransactionalRegionInterface interface to the RPC call map.
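The pattern above, a static initializer that calls a synchronized, guarded initialize method, makes the registration happen exactly once no matter how many tables are created. A minimal sketch of the same idea follows; Registry, Table, TransactionalOps, and the code value 100 are all hypothetical and only stand in for TransactionalRPC, TransactionalTable, the RPC interface, and RPC_CODE.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of one-time registration; none of these are HBase types.
final class RpcInitSketch {
  interface TransactionalOps {}  // stands in for the RPC protocol interface

  static final class Registry {
    private static final Map<Class<?>, Byte> CODES = new HashMap<>();
    private static boolean initialized = false;
    private static int registrations = 0;

    // The flag makes repeated calls no-ops; synchronized guards concurrent callers.
    static synchronized void initialize() {
      if (initialized) {
        return;
      }
      CODES.put(TransactionalOps.class, (byte) 100);  // 100 is an arbitrary example code
      registrations++;
      initialized = true;
    }

    static synchronized int registrations() {
      return registrations;
    }
  }

  static final class Table {
    // Runs once, when the Table class is first initialized.
    static {
      Registry.initialize();
    }
  }

  public static void main(String[] args) {
    new Table();
    new Table();
    Registry.initialize();
    System.out.println("registrations: " + Registry.registrations());
  }
}
```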
The client writes data:
table.put(transactionState,
    new Put(ROW2).add(FAMILY, QUAL_A, row1_A.getValue(COL_A)));
The put method is implemented as follows:
public synchronized void put(TransactionState transactionState, final Put put) throws IOException {
  //super.validatePut(put);
  super.getConnection().getRegionServerWithRetries(
      new TransactionalServerCallable<Object>(super.getConnection(),
          super.getTableName(), put.getRow(), transactionState) {
        public Object call() throws IOException {
          recordServer();
          getTransactionServer().put(
              transactionState.getTransactionId(),
              location.getRegionInfo().getRegionName(), put);
          return null;
        }
      });
}
In effect, put wraps the work in a TransactionalServerCallable; once the RegionServer has been located, the callback (the call method) of the TransactionalServerCallable is invoked.
The callback proceeds as follows:
a) Record the RegionServer by running the recordServer method:
protected void recordServer() throws IOException {
  if (transactionState.addRegion(location)) {
    getTransactionServer().beginTransaction(
        transactionState.getTransactionId(),
        location.getRegionInfo().getRegionName());
  }
}
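The recordServer logic above sends beginTransaction to a region only when addRegion reports that the region is new to this transaction. A minimal sketch of that first-contact check follows, assuming addRegion behaves like Set.add (the original text does not show its body). ClientTxState and FakeServer are hypothetical stand-ins for the client-side TransactionState and the remote region server.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of "begin once per region"; not HBase code.
final class RecordServerSketch {
  static final class ClientTxState {
    final long transactionId;
    private final Set<String> regions = new HashSet<>();

    ClientTxState(long transactionId) {
      this.transactionId = transactionId;
    }

    // Assumption: returns true only the first time a region joins the transaction.
    boolean addRegion(String regionName) {
      return regions.add(regionName);
    }
  }

  // Stand-in for the remote server; it only records the begin calls it receives.
  static final class FakeServer {
    final List<String> beginCalls = new ArrayList<>();

    void beginTransaction(long transactionId, String regionName) {
      beginCalls.add(transactionId + "@" + regionName);
    }
  }

  static void recordServer(ClientTxState state, String regionName, FakeServer server) {
    if (state.addRegion(regionName)) {
      server.beginTransaction(state.transactionId, regionName);
    }
  }

  public static void main(String[] args) {
    ClientTxState state = new ClientTxState(7L);
    FakeServer server = new FakeServer();
    recordServer(state, "r1", server);
    recordServer(state, "r1", server);  // second write to the same region: no new begin
    recordServer(state, "r2", server);
    System.out.println(server.beginCalls);
  }
}
```

The set of regions collected this way is also the natural participant list for the commit step, since it names every region the transaction has touched.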
The call ends up in TransactionalRegion's beginTransaction method:
public void beginTransaction(final long transactionId) throws IOException {
  checkClosing();
  String key = String.valueOf(transactionId);
  if (transactionsById.get(key) != null) {
    TransactionState alias = getTransactionState(transactionId);
    if (alias != null) {
      alias.setStatus(Status.ABORTED);
      retireTransaction(alias);
    }
    LOG.error("Existing trasaction with id [" + key + "] in region ["
        + super.getRegionInfo().getRegionNameAsString() + "]");
    throw new IOException("Already exiting transaction id: " + key);
  }
  TransactionState state = new TransactionState(transactionId,
      super.getLog().getSequenceNumber(), super.getRegionInfo());
  state.setStartSequenceNumber(nextSequenceId.get());
  List<TransactionState> commitPendingCopy = new ArrayList<TransactionState>(
      commitPendingTransactions);
  for (TransactionState commitPending : commitPendingCopy) {
    state.addTransactionToCheck(commitPending);
  }
  synchronized (transactionsById) {
    transactionsById.put(key, state);
  }
  try {
    transactionLeases.createLease(getLeaseId(transactionId),
        new TransactionLeaseListener(key));
  } catch (LeaseStillHeldException e) {
    LOG.error("Lease still held for [" + key + "] in region ["
        + super.getRegionInfo().getRegionNameAsString() + "]");
    throw new RuntimeException(e);
  }
  LOG.debug("Begining transaction " + key + " in region "
      + super.getRegionInfo().getRegionNameAsString());
  maybeTriggerOldTransactionFlush();
}
b) Call TransactionalRegion's put method to write the data. Unlike HRegion's put method, this one only records the write in the TransactionState and writes it to the log; nothing is written to the memstore yet:
public void put(final long transactionId, final Put put) throws IOException {
  checkClosing();
  TransactionState state = getTransactionState(transactionId);
  state.addWrite(put);
  this.hlog.writeUpdateToLog(super.getRegionInfo(), transactionId, put);
}
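The effect of this put can be illustrated with a toy region that keeps three separate structures: a log, a per-transaction write buffer, and a memstore. This is a simplified sketch under assumed names (BufferedPutSketch, pendingWrites, and the string log format are all hypothetical); it only shows that a transactional put touches the first two and leaves the third alone.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy region; it mirrors the shape of the put above, not its real types.
final class BufferedPutSketch {
  final List<String> wal = new ArrayList<>();              // stands in for the THLog
  final Map<String, String> memstore = new HashMap<>();    // visible data
  final Map<Long, Map<String, String>> pendingWrites = new HashMap<>();

  void begin(long transactionId) {
    pendingWrites.put(transactionId, new LinkedHashMap<>());
  }

  // Buffer the write and log it; the memstore is deliberately not touched here.
  void put(long transactionId, String row, String value) {
    Map<String, String> writes = pendingWrites.get(transactionId);
    if (writes == null) {
      throw new IllegalStateException("Unknown transaction id: " + transactionId);
    }
    writes.put(row, value);
    wal.add("OP tx=" + transactionId + " " + row + "=" + value);
  }

  public static void main(String[] args) {
    BufferedPutSketch region = new BufferedPutSketch();
    region.begin(1L);
    region.put(1L, "row2", "a");
    System.out.println("wal entries: " + region.wal.size());
    System.out.println("memstore empty: " + region.memstore.isEmpty());
  }
}
```

Keeping uncommitted writes out of the memstore means other readers cannot observe them, and an abort only has to discard the buffer.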
Writing to the THLog:
public void append(HRegionInfo regionInfo, Put update, long transactionId)
    throws IOException {
  long commitTime = System.currentTimeMillis();
  THLogKey key = new THLogKey(regionInfo.getRegionName(),
      regionInfo.getTableDesc().getName(), -1, commitTime,
      THLogKey.TrxOp.OP, transactionId);
  for (KeyValue value : convertToKeyValues(update)) {
    super.append(regionInfo, key, value);
  }
}
THLogKey extends HLogKey, so each HLog entry written here carries two fields beyond those of an ordinary HLog entry: transactionOp and transactionId:
public THLogKey(byte[] regionName, byte[] tablename, long logSeqNum, long now, TrxOp op, long transactionId) {
  super(regionName, tablename, logSeqNum, now);
  this.transactionOp = op.opCode;
  this.transactionId = transactionId;
}
public void write(DataOutput out) throws IOException {
  super.write(out);
  out.writeByte(transactionOp);
  out.writeLong(transactionId);
}
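The write method above appends one byte and one long after whatever the parent key serializes. A small round-trip sketch makes the resulting layout concrete. BaseKey and TxKey are hypothetical and much simpler than HLogKey and THLogKey (the base key here holds only two longs, and the region and table names are left out); the matching readFields method is an assumption, since the text shows only the write side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical keys showing "parent fields first, then op byte and transaction id".
final class TxLogKeySketch {
  static class BaseKey {
    long logSeqNum;
    long writeTime;

    void write(DataOutput out) throws IOException {
      out.writeLong(logSeqNum);
      out.writeLong(writeTime);
    }

    void readFields(DataInput in) throws IOException {
      logSeqNum = in.readLong();
      writeTime = in.readLong();
    }
  }

  static final class TxKey extends BaseKey {
    byte transactionOp;
    long transactionId;

    @Override
    void write(DataOutput out) throws IOException {
      super.write(out);
      out.writeByte(transactionOp);
      out.writeLong(transactionId);
    }

    // Fields must be read back in exactly the order they were written.
    @Override
    void readFields(DataInput in) throws IOException {
      super.readFields(in);
      transactionOp = in.readByte();
      transactionId = in.readLong();
    }
  }

  static byte[] toBytes(BaseKey key) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      key.write(out);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  static TxKey fromBytes(byte[] bytes) {
    TxKey key = new TxKey();
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
      key.readFields(in);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return key;
  }
}
```

In this sketch a TxKey serializes to 25 bytes: 16 for the two base longs, 1 for the op code, and 8 for the transaction id.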
Commit the transaction:
transactionManager.tryCommit(transactionState);
Internally, this is implemented as follows:
public void tryCommit(final TransactionState transactionState)
    throws CommitUnsuccessfulException, IOException {
  long startTime = System.currentTimeMillis();
  LOG.debug("atempting to commit trasaction: " + transactionState.toString());
  int status = prepareCommit(transactionState);
  if (status == TransactionalRegionInterface.COMMIT_OK) {
    doCommit(transactionState);
  } else if (status == TransactionalRegionInterface.COMMIT_OK_READ_ONLY) {
    transactionLogger.forgetTransaction(transactionState.getTransactionId());
  }
  LOG.debug("Committed transaction [" + transactionState.getTransactionId()
      + "] in [" + (System.currentTimeMillis() - startTime) + "]ms");
}
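tryCommit has the shape of a two-phase commit: a prepare step that yields a status, followed by a commit step only when that status allows it. The body of prepareCommit is not shown here, so the following is a generic sketch of the technique under stated assumptions rather than a description of the HBase code. Participant, RecordingParticipant, and the COMMIT_UNSUCCESSFUL vote are hypothetical; only COMMIT_OK and COMMIT_OK_READ_ONLY appear in the code above.

```java
import java.util.ArrayList;
import java.util.List;

// A minimal, hypothetical two-phase commit coordinator; none of these types are HBase classes.
final class TwoPhaseCommitSketch {
  enum Vote { COMMIT_OK, COMMIT_OK_READ_ONLY, COMMIT_UNSUCCESSFUL }

  interface Participant {
    Vote prepare(long transactionId);
    void commit(long transactionId);
    void abort(long transactionId);
  }

  // Test double that answers with a fixed vote and records what it was asked to do.
  static final class RecordingParticipant implements Participant {
    final Vote vote;
    final List<String> calls = new ArrayList<>();

    RecordingParticipant(Vote vote) {
      this.vote = vote;
    }

    @Override public Vote prepare(long transactionId) { calls.add("prepare"); return vote; }
    @Override public void commit(long transactionId) { calls.add("commit"); }
    @Override public void abort(long transactionId) { calls.add("abort"); }
  }

  // Phase 1 collects votes; phase 2 commits only on participants that have writes.
  // Assumption of this sketch: one negative vote aborts every participant.
  static boolean tryCommit(long transactionId, List<? extends Participant> participants) {
    List<Participant> writers = new ArrayList<>();
    for (Participant p : participants) {
      Vote vote = p.prepare(transactionId);
      if (vote == Vote.COMMIT_UNSUCCESSFUL) {
        for (Participant q : participants) {
          q.abort(transactionId);
        }
        return false;
      }
      if (vote == Vote.COMMIT_OK) {
        writers.add(p);
      }
    }
    for (Participant p : writers) {
      p.commit(transactionId);
    }
    return true;
  }

  public static void main(String[] args) {
    RecordingParticipant writer = new RecordingParticipant(Vote.COMMIT_OK);
    RecordingParticipant reader = new RecordingParticipant(Vote.COMMIT_OK_READ_ONLY);
    List<RecordingParticipant> regions = new ArrayList<>();
    regions.add(writer);
    regions.add(reader);
    System.out.println("committed: " + tryCommit(1L, regions));
    System.out.println("writer calls: " + writer.calls + ", reader calls: " + reader.calls);
  }
}
```

A read-only participant needs no second phase because it has nothing to apply, which matches the COMMIT_OK_READ_ONLY branch above skipping doCommit.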
doCommit calls TransactionalRegion's commit method, which takes the buffered writes from the TransactionState and passes them to the native HRegion put method. This is where the data actually reaches the memstore, and it is written to the HLog a second time; note that this second HLog write carries neither the transaction operation nor the transaction id.
private void commit(final TransactionState state) throws IOException {
  LOG.debug("Commiting transaction: " + state.toString() + " to "
      + super.getRegionInfo().getRegionNameAsString());
  // FIXME potential mix up here if some deletes should come before the puts.
  for (Put update : state.getPuts()) {
    this.put(update, true);
  }
  for (Delete delete : state.getDeleteSet()) {
    this.delete(delete, null, true);
  }
  // Now the transaction lives in the WAL, we can write a commit to the log
  // so we don't have to recover it.
  if (state.hasWrite()) {
    this.hlog.writeCommitToLog(super.getRegionInfo(), state.getTransactionId());
  }
  state.setStatus(Status.COMMITED);
  if (state.hasWrite()
      && !commitPendingTransactions.remove(state)) {
    LOG.fatal("Commiting a non-query transaction that is not in commitPendingTransactions");
    // Something has gone really wrong.
    throw new IOException("commit failure");
  }
  retireTransaction(state);
}
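Finally, writeCommitToLog writes the transaction's commit status to the HLog as well. This record has only a key and no value, and it marks the transaction as fully completed.
If the transaction fails, its abort status is written to the HLog as well, and the data is not written to the HLog a second time, so those records are not restored during data recovery.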
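The commit and abort markers give a log reader a simple rule: a transaction whose entries are followed by either marker needs no further attention, and only a transaction with entries but no marker is still in doubt. The recovery code itself is not shown here, so the sketch below is an assumption that follows from the "so we don't have to recover it" comment in commit. Entry, TrxOp, and inDoubt are hypothetical names, and what should happen to in-doubt transactions is left open.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical log scan: find transactions that have edits but no commit/abort marker.
final class InDoubtScanSketch {
  enum TrxOp { OP, COMMIT, ABORT }

  static final class Entry {
    final TrxOp op;
    final long transactionId;
    final String edit;  // null for marker entries, which carry a key but no value

    Entry(TrxOp op, long transactionId, String edit) {
      this.op = op;
      this.transactionId = transactionId;
      this.edit = edit;
    }
  }

  // Collect edits per transaction; a COMMIT or ABORT marker resolves the transaction.
  static Map<Long, List<String>> inDoubt(List<Entry> log) {
    Map<Long, List<String>> pending = new LinkedHashMap<>();
    for (Entry e : log) {
      if (e.op == TrxOp.OP) {
        pending.computeIfAbsent(e.transactionId, id -> new ArrayList<>()).add(e.edit);
      } else {
        pending.remove(e.transactionId);
      }
    }
    return pending;
  }

  public static void main(String[] args) {
    List<Entry> log = new ArrayList<>();
    log.add(new Entry(TrxOp.OP, 1L, "row1=a"));
    log.add(new Entry(TrxOp.OP, 2L, "row2=b"));
    log.add(new Entry(TrxOp.COMMIT, 1L, null));
    log.add(new Entry(TrxOp.OP, 3L, "row3=c"));
    log.add(new Entry(TrxOp.ABORT, 3L, null));
    System.out.println("in doubt: " + inDoubt(log));
  }
}
```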