I. What is TiDB
TiDB is an open-source distributed relational database developed by PingCAP. It combines traditional OLTP (online transaction processing) and OLAP (online analytical processing) capabilities, making it an HTAP (hybrid transactional/analytical processing) database. It is designed to overcome the scalability and high-availability bottlenecks that traditional databases such as MySQL hit under large data volumes and high concurrency.
II. Core Architecture
1. Layered components
- TiDB Server: the stateless SQL compute layer. It handles SQL parsing, optimization, and execution, is compatible with the MySQL 5.7 protocol and syntax, and thus supports seamless migration.
- TiKV: the distributed key-value storage engine. It uses the Raft protocol to keep multiple replicas strongly consistent, supports ACID transactions, and stores three replicas by default for fault tolerance.
- TiFlash (optional): the columnar storage engine. It stays in sync with TiKV through asynchronous replication and supports real-time HTAP analytics.
- PD (Placement Driver): the cluster management module, responsible for metadata storage, load-balancing scheduling, and global transaction ID allocation.
2. Scalability and high availability
- The compute layer (TiDB Server) and the storage layer (TiKV/TiFlash) scale out independently, supporting PB-scale data
- Automatic failover is built on the Raft protocol (RTO ≤ 30 s, RPO = 0), suitable for finance-grade disaster recovery
III. Core Features
1. Compatibility and easy migration
- Compatible with the MySQL protocol; works with mainstream ORM frameworks and tools (such as Navicat and DBeaver)
- TiDB DM (Data Migration) enables low-cost migration from MySQL to TiDB
2. Distributed transactions and consistency
- Supports cross-row ACID transactions, with strong consistency guaranteed by two-phase commit (2PC) and optimistic locking
3. Real-time HTAP
- TiKV (row store) and TiFlash (column store) work together, so a single cluster serves both transactional and analytical queries without ETL pipelines; a sketch of enabling a TiFlash replica follows this list.
4. Cloud-native support
- TiDB Operator automates deployment on Kubernetes, fitting public, private, and hybrid clouds
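To make the HTAP point in item 3 concrete, here is a minimal sketch of enabling a TiFlash replica for a table with the ALTER TABLE ... SET TIFLASH REPLICA statement. It assumes the single-host cluster deployed later in this article (TiDB on 127.0.0.1:4000, empty root password) and the br_test.br_table table created in the BR walkthrough; treat it as an illustration, not part of the deployment steps.

import pymysql

# Assumptions: TiDB reachable on 127.0.0.1:4000, user root with empty password,
# and the table br_test.br_table already exists (created later in this article).
conn = pymysql.connect(host="127.0.0.1", port=4000, user="root", password="", database="br_test")
with conn.cursor() as cur:
    # Ask TiDB to maintain one columnar (TiFlash) replica of the table.
    cur.execute("ALTER TABLE br_table SET TIFLASH REPLICA 1")
    # Replication progress: PROGRESS reaches 1 and AVAILABLE becomes 1 when ready.
    cur.execute(
        "SELECT TABLE_NAME, PROGRESS, AVAILABLE "
        "FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'br_test'"
    )
    for row in cur.fetchall():
        print(row)
conn.close()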
IV. Hands-On Practice
1. Installation prerequisites
1.1 Create a regular user
# Create the user
adduser tidb
# Set a password for it
passwd tidb
1.2 Configure passwordless sudo
Edit the /etc/sudoers file:
vim /etc/sudoers
Add the following line at the end of the file:
tidb ALL=(ALL) NOPASSWD:ALL
1.3 Create an SSH key for the tidb user
Switch to the user:
su - tidb
Run the following command, pressing Enter at every prompt:
ssh-keygen -t rsa
2. Installation and Deployment
2.1 Install TiUP on the control machine
Log in to the control machine as a regular user (the tidb user here); TiUP installation and all subsequent cluster management are done as this user.
Run the following command to install the TiUP tool:
curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
After installation, TiUP has been added to PATH via ~/.bashrc; open a new terminal or run source ~/.bashrc to pick it up.
Install the TiUP Cluster component:
tiup install cluster
Then, per the first item of the post-install prompt, run the command it shows, which here is:
source /home/tidb/.bash_profile
With that, the TiUP components needed to install TiDB are in place; next comes the deployment itself.
2.2 Write the initialization topology file
The cluster initialization configuration file is written by hand. Create a YAML file in the tidb user's home directory; the topology.yaml used here is shown below.
Note: replace the IP addresses with the IP of your own server.
global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/home/tidb/tidb-deploy"
  data_dir: "/home/tidb/tidb-data"

monitored:
  node_exporter_port: 9100
  blackbox_exporter_port: 9115

server_configs:
  tidb:
    log.slow-threshold: 300
  tikv:
    readpool.storage.use-unified-pool: false
    readpool.coprocessor.use-unified-pool: true
  pd:
    replication.enable-placement-rules: true
    replication.location-labels: ["host"]
  tiflash:
    logger.level: "info"

pd_servers:
  - host: 192.168.16.56

tidb_servers:
  - host: 192.168.16.56

tikv_servers:
  - host: 192.168.16.56
    port: 20160
    status_port: 20180
    config:
      server.labels: { host: "logic-host-1" }
  - host: 192.168.16.56
    port: 20161
    status_port: 20181
    config:
      server.labels: { host: "logic-host-2" }
  - host: 192.168.16.56
    port: 20162
    status_port: 20182
    config:
      server.labels: { host: "logic-host-3" }

tiflash_servers:
  - host: 192.168.16.56

monitoring_servers:
  - host: 192.168.16.56

grafana_servers:
  - host: 192.168.16.56
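Before deploying, a quick sanity check of the topology file can catch port clashes in a single-host layout like this one. A minimal sketch, assuming PyYAML is installed and topology.yaml is in the current directory:

import yaml  # pip install pyyaml

# Print which role listens where, so duplicate ports on one host stand out.
with open("topology.yaml") as f:
    topo = yaml.safe_load(f)

for section, entries in topo.items():
    if not section.endswith("_servers"):
        continue  # skip global, monitored, server_configs
    for node in entries:
        print(section[:-len("_servers")], node["host"], node.get("port", "(default port)"))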
2.3 Deploy the cluster
Deployment command:
tiup cluster deploy tidb-test v5.1.1 ./topology.yaml --user tidb -p
Parameter notes:
- The cluster deployed via TiUP cluster is named tidb-test
- The deployed version is v5.1.1; run tiup list tidb for other available versions
- The initialization configuration file is topology.yaml
- --user tidb: log in to the target hosts as the tidb user to complete the deployment. This user needs SSH access to the target machines and sudo privileges there; any other user with SSH and sudo privileges can be used instead.
- [-i] and [-p]: optional. If passwordless login to the target machines is already configured, neither is needed; otherwise pick one. [-i] takes the private key for the root user (or the user given by --user) on the target machines; [-p] prompts interactively for that user's password.
If all goes well, a "successfully" message appears.
2.4 Start the cluster
Start command:
tiup cluster start tidb-test
If all goes well, a "successfully" message appears.
Check the cluster status with tiup cluster display tidb-test:
Starting component `cluster`: /home/ubuntu/.tiup/components/cluster/v1.3.2/tiup-cluster display tidb-test
Cluster type: tidb
Cluster name: tidb-test
Cluster version: v5.1.1
SSH type: builtin
Dashboard URL: http://192.168.16.56:2379/dashboard
ID Role Host Ports OS/Arch Status Data Dir Deploy Dir
-- ---- ---- ----- ------- ------ -------- ----------
192.168.16.56:3000 grafana 192.168.16.56 3000 linux/x86_64 Up - /home/ubuntu/tidb/tidb-deploy/grafana-3000
192.168.16.56:2379 pd 192.168.16.56 2379/2380 linux/x86_64 Up|L|UI /home/ubuntu/tidb/tidb-data/pd-2379 /home/ubuntu/tidb/tidb-deploy/pd-2379
192.168.16.56:9090 prometheus 192.168.16.56 9090 linux/x86_64 Up /home/ubuntu/tidb/tidb-data/prometheus-9090 /home/ubuntu/tidb/tidb-deploy/prometheus-9090
192.168.16.56:4000 tidb 192.168.16.56 4000/10080 linux/x86_64 Up - /home/ubuntu/tidb/tidb-deploy/tidb-4000
192.168.16.56:9000 tiflash 192.168.16.56 9000/8123/3930/20170/20292/8234 linux/x86_64 Up /home/ubuntu/tidb/tidb-data/tiflash-9000 /home/ubuntu/tidb/tidb-deploy/tiflash-9000
192.168.16.56:20160 tikv 192.168.16.56 20160/20180 linux/x86_64 Up /home/ubuntu/tidb/tidb-data/tikv-20160 /home/ubuntu/tidb/tidb-deploy/tikv-20160
192.168.16.56:20161 tikv 192.168.16.56 20161/20181 linux/x86_64 Up /home/ubuntu/tidb/tidb-data/tikv-20161 /home/ubuntu/tidb/tidb-deploy/tikv-20161
192.168.16.56:20162 tikv 192.168.16.56 20162/20182 linux/x86_64 Up /home/ubuntu/tidb/tidb-data/tikv-20162 /home/ubuntu/tidb/tidb-deploy/tikv-20162
Total nodes: 8
In the output above, TiDB's port is 4000 and PD's management port is 2379. Tools such as Navicat connect to the database through port 4000; the default root password is empty.
2.5 Key components
The key components are:
- PD: metadata and scheduling/control component
- TiKV: storage component
- TiDB: database instance (SQL) component
- TiFlash: columnar storage component for analytics
Although TiDB resembles MySQL, its strength is being distributed. With MySQL, as data grows you have to plan sharding and routing middleware such as MyCat; TiDB is distributed from the ground up, with a storage architecture reminiscent of HDFS, so distribution is native rather than bolted on.
2.6 tiup cluster command reference
- Deploy a cluster
The deployment command is tiup cluster deploy, generally used as:
tiup cluster deploy <cluster-name> <version> <topology.yaml> [flags]
- List clusters
After a cluster is deployed successfully, it can be found with the tiup cluster list command
- Start a cluster
After deployment, the cluster can be started with the command below; if you forget its name, look it up with tiup cluster list
tiup cluster start tidb-test
- Check cluster status
Logging in to each machine to inspect every component is clearly inefficient, so TiUP provides tiup cluster display, used as follows:
tiup cluster display tidb-test
Other commands and maintenance operations are covered in the official documentation
3. Connection Test
Since TiDB speaks the MySQL protocol, you can connect with Navicat by simply choosing a MySQL connection:
Username: root
Password: (empty)
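The same check can be done programmatically. A minimal sketch with pymysql (the driver used for data generation below), assuming the local endpoint from the deployment above:

import pymysql

# Assumption: TiDB from the cluster above answers on 127.0.0.1:4000,
# user root, empty password (the post-install default).
conn = pymysql.connect(host="127.0.0.1", port=4000, user="root", password="")
with conn.cursor() as cur:
    cur.execute("SELECT tidb_version()")  # TiDB-specific; returns version/build info
    print(cur.fetchone()[0])
conn.close()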
4. Install the BR toolkit
Note: download the BR toolkit matching your TiDB version.
wget https://download.pingcap.org/tidb-toolkit-v5.1.1-linux-amd64.tar.gz
After downloading, unpack the archive and create a symlink:
tar -zxvf tidb-toolkit-v5.1.1-linux-amd64.tar.gz
ln -s tidb-toolkit-v5.1.1-linux-amd64 .tidb-toolkit
Add the toolkit to PATH in .bashrc:
export PATH=/home/tidb/.tidb-toolkit/bin:$PATH
Then source .bashrc again.
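A one-liner sketch to confirm the br binary is now resolvable through PATH (standard library only):

import shutil

# shutil.which mirrors the shell's PATH lookup: prints the resolved path, or None.
print(shutil.which("br"))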
Create a database
Connect to the database:
mysql -h 127.0.0.1 -P 4000 -uroot
mysql> create database br_test;
Query OK, 0 rows affected (0.14 sec)
mysql> use br_test;
Database changed
mysql> create table br_table(id int primary key,c varchar(128),ctime timestamp);
Query OK, 0 rows affected (0.12 sec)
Generate test data
The following Python script is used:
import pymysql

# host is the author's own machine IP; change it to match your environment
mydb = pymysql.connect(host="192.168.56.101", user='root', port=4000, database='br_test')
mycursor = mydb.cursor()
for i in range(10000):
    mycursor.execute('insert into br_table values(%s,%s,now())', (i, str(i) + 'xxxx'))
    if i % 1000 == 0:
        mycursor.execute('commit')  # commit every 1000 rows
mycursor.execute('commit')  # commit the final batch
mycursor.close()
Check the generated data:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 10000 |
+----------+
1 row in set (0.05 sec)
Preparing for backup
Adjust the GC parameter:
mysql> select VARIABLE_VALUE from mysql.tidb where VARIABLE_NAME="tikv_gc_life_time";
+----------------+
| VARIABLE_VALUE |
+----------------+
| 10m0s |
+----------------+
1 row in set (0.00 sec)
mysql> UPDATE mysql.tidb SET VARIABLE_VALUE = '720h' WHERE VARIABLE_NAME = 'tikv_gc_life_time';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> select VARIABLE_VALUE from mysql.tidb where VARIABLE_NAME="tikv_gc_life_time";
+----------------+
| VARIABLE_VALUE |
+----------------+
| 720h |
+----------------+
1 row in set (0.00 sec)
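When backups are scripted, it is easy to forget to restore this variable afterwards. A hedged sketch that wraps the two UPDATE statements above in a context manager, so the old value always comes back (connection details match this article's setup):

import pymysql
from contextlib import contextmanager

@contextmanager
def gc_life_time(conn, value="720h"):
    """Temporarily raise tikv_gc_life_time, restoring the previous value on exit."""
    with conn.cursor() as cur:
        cur.execute("SELECT VARIABLE_VALUE FROM mysql.tidb "
                    "WHERE VARIABLE_NAME = 'tikv_gc_life_time'")
        old = cur.fetchone()[0]
        cur.execute("UPDATE mysql.tidb SET VARIABLE_VALUE = %s "
                    "WHERE VARIABLE_NAME = 'tikv_gc_life_time'", (value,))
    conn.commit()
    try:
        yield
    finally:
        with conn.cursor() as cur:
            cur.execute("UPDATE mysql.tidb SET VARIABLE_VALUE = %s "
                        "WHERE VARIABLE_NAME = 'tikv_gc_life_time'", (old,))
        conn.commit()

conn = pymysql.connect(host="127.0.0.1", port=4000, user="root", password="")
with gc_life_time(conn):
    pass  # run the br backup here (e.g. via subprocess)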
Running the backup
BR offers two operations, backup and restore, and each can target the full cluster, a single database, or a single table, so they are discussed separately.
Before getting to those, the parameters they share are worth covering.
Also note that backup requests are sent to TiKV over gRPC, so it is best to run BR on a PD node to avoid extra trouble.
Common parameters:
--ca, --cert, --key: if TLS connection security is configured, these point to the relevant certificates.
--concurrency: task parallelism per node, default 4.
--log-file, --log-level: log output location and level.
-u, --pd: PD address to connect to, default 127.0.0.1:2379.
--ratelimit: per-node speed limit, in MB/s.
-s, --storage: storage location, for example "local:///data_nfs1".
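To tie these flags together, a hedged sketch of a thin wrapper that assembles a br backup full command from them (br on PATH; defaults mirror this article's local setup):

import subprocess

def br_backup_full(pd="127.0.0.1:2379", storage="local:///tmp/backupdata",
                   concurrency=4, ratelimit=None, log_file=None):
    """Build and run `br backup full` from the common flags listed above."""
    cmd = ["br", "backup", "full", "--pd", pd, "-s", storage,
           "--concurrency", str(concurrency)]
    if ratelimit is not None:
        cmd += ["--ratelimit", str(ratelimit)]  # MB/s per node
    if log_file is not None:
        cmd += ["--log-file", log_file]
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure

br_backup_full()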
Start the first full backup
Note: this installation is on a local server, so the IP is 127.0.0.1 and PD listens on its default port 2379. Since this is a test, the backup data is staged under /tmp; anything under /tmp is cleared when the server reboots, so use a different path for longer-lived tests!
br backup full --pd "127.0.0.1:2379" -s local:///tmp/backupdata
Output:
Detail BR log in /tmp/br.log.2021-08-10T18.04.03+0800
Full backup <------------------------------------------------------------------------------------> 100.00%
Checksum <---------------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:04:04.462 +08:00] [INFO] [collector.go:66] ["Full backup success summary"] [total-ranges=23] [ranges-succeed=23] [ranges-failed=0] [backup-checksum=189.837122ms] [backup-fast-checksum=37.230162ms] [backup-total-ranges=51] [backup-total-regions=51] [total-take=1.315786033s] [total-kv=15863] [total-kv-size=807.2kB] [average-speed=1.014MB/s] ["backup data size(after compressed)"=194.1kB] [BackupTS=426925055851888642]
A quick look at the backup flow from the log:
* Connect to PD and fetch all TiKV nodes.
* Query infoSchema for the metadata.
* Send the backup request: {"cluster_id":6801677637806839235,"start_key":"dIAAAAAAAAAvX3IAAAAAAAAAAA==","end_key":"dIAAAAAAAAAvX3L//wA=","end_version":426921533553770497,"concurrency":4,"storage_backend":{"Backend":{"Local":{"path":"/tmp/userbackupdata"}}}}
* Each TiKV node performs its part of the backup and, once done, reports back to BR for aggregation.
* Compute each table's checksum: [table=br_test.br_table] [Crc64Xor=12896770389982935753] [TotalKvs=100000] [TotalBytes=4788890]
* Save the backup metadata.
* Finish the backup.
The metadata mentioned above is saved into the backup directory. It mainly contains the checksums and a description of the backup set: the logical layout of every database, table, and column in the set, character-set information, and so on (a protobuf-encoded description file).
When the backup completes, the backup directory contains backup sets named like 5_2_23_80992061af3e5194c3f28a5b79d486c5e9db2feda1afb3f84b4ca229ddce9932_write.sst; these are the final backup files.
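The naming scheme is internal to BR, but the visible pattern in these listings is underscore-separated fields ending in a column-family name (write or default), usually preceded by a millisecond timestamp. A hedged sketch for eyeballing a backup directory under that assumption:

from pathlib import Path

# Informal reading of the names: <ids...>_<hash>[_<ms-timestamp>]_<cf>.sst
# This is an observation from the listings here, not a documented format.
for p in sorted(Path("/tmp/backupdata").glob("*.sst")):
    parts = p.stem.split("_")
    cf = parts[-1]  # "write" or "default" column family
    ts_ms = parts[-2] if parts[-2].isdigit() else None
    print(p.name, "->", cf, ts_ms)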
View the backup data:
ls /tmp/backupdata/
Output:
1_2_132_9198af8c6c286b502c94a3d4f0b6ac443cb6c343c39043b53a3ed54815b9a5fc_1628589844017_write.sst
5_109_67_2d2f68c250f77d93ffb8ea543a425eebeea7c92d1f033e9d32cdb914ca4341a9_1628589843794_write.sst
5_109_67_302753bc98ec93e98edea12aeac15074da779bb66c735c8ce184eacf43f1e7c2_1628589843957_write.sst
5_109_67_355f0b5a4a8313c61daccaf1f055ac6ce3e0abaeb3d9fda7eb60557014f38dac_1628589843573_write.sst
5_109_67_36adb8cedcd7af34708edff520499e712e2cfdcb202f5707dc9305a031d55a98_1628589843589_write.sst
5_109_67_495c5d2e3a1901df43133318e03236d08d1c8882cdd7b452579a4fcd82ccb718_1628589843551_default.sst
5_109_67_495c5d2e3a1901df43133318e03236d08d1c8882cdd7b452579a4fcd82ccb718_1628589843551_write.sst
5_109_67_54423f4310d10c8b44947056f08b99e8496fef03e9ab04503162fadfa7e38c44_1628589843719_write.sst
5_109_67_5efda45e83e08a349095f080e17b995c4e6185537f363448275487b8e043b86d_1628589843751_write.sst
5_109_67_755836e42bfb568bf6e89d931d86a7d31453a1bd6f7ffb924e19eb28be74aa00_1628589843636_write.sst
5_109_67_759f1781e3eb9e6aae280a9aa33f2cb97d799e11265ff74363f00f942787a311_1628589843655_write.sst
5_109_67_7c29d2d214a7a70e5c20091535dc0515fe9e2d8c4dffbfad1e3ca1fa7decdc14_1628589843645_write.sst
5_109_67_8dd75043152f9a7a07ef9bc0b24b629d464621806e3c93219cac918ec3d5b346_1628589843785_write.sst
5_109_67_994b3ad98a3599f600a3860d1ac7e0516f41765bc5cc4bd119d91a0bcca207c4_1628589843759_write.sst
5_109_67_a24c493bf7e801abe1238806dd31796f911e7dd0970118b9315f0a7366532873_1628589843803_write.sst
5_109_67_afb978bcbf32cf0ba18923daa88bdab19c00ad22f4fe49863f102827eb831e21_1628589843668_write.sst
5_109_67_c11c7f8053ba1d8a3efe53decec79ee83321252578aaa0e09024538869d60d0d_1628589843947_write.sst
5_109_67_c2baa1d0dee98e26766aa462a873c405eaad66653a52db6bc9e1d584f6aebcc7_1628589843601_write.sst
5_109_67_d03f5545842231b64ac143f64290e1efab5eb039773fde78f51cc6d98d01e020_1628589843859_write.sst
5_109_67_d0a5865efa1d90636cf0c26e991e1f13ad4ef35c631530b967aa1e342d73208b_1628589843663_write.sst
5_109_67_de8304485bb5b1def0f07f35f376594d100067cec03a21a00f157e91b6148940_1628589843809_write.sst
5_109_67_fd5ee7729668dbddb4eefa496199dcc1b713d020fc184b9383d249a0f7e83028_1628589843677_write.sst
5_109_67_ff781abda573afc7d7878f20390a290dcdb471bf12a8cbabf359ccf9e8d37084_1628589843910_write.sst
backup.lock
backupmeta
Restoring data
To keep things simple, the restore here is done on the original cluster; in practice you would usually restore into a brand-new cluster. First drop the data:
mysql> drop table br_table;
Query OK, 0 rows affected (0.26 sec)
Recreate the table:
mysql> create table br_table(id int primary key,c varchar(128),ctime timestamp);
Query OK, 0 rows affected (0.13 sec)
Note: during a restore, every TiKV node must be able to access all the backup files. Without shared storage, you need to copy all backup files to every node by hand.
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata"
Output:
Detail BR log in /tmp/br.log.2021-08-10T18.05.01+0800
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:05:02.617 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=166.833906ms] [restore-checksum=56.439317ms] [restore-ranges=1] [total-take=1.151067383s] ["restore data size(after decompressed)"=1.014MB] [total-kv=100000] [total-kv-size=4.789MB] [average-speed=16.33MB/s]
Check the data in the database:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 10000 |
+----------+
1 row in set (0.04 sec)
The row count matches.
The restore flow, as seen in the log:
- Locate and confirm the PD node.
- Execute the DDL: CREATE DATABASE /*!32312 IF NOT EXISTS*/ and CREATE TABLE IF NOT EXISTS.
- Run the necessary ALTER TABLE ... AUTO_INCREMENT statements, so IDs are not handed out from the old counter after the restore, which could overwrite data.
- Split the SSTs into the small data sets owned by each Region and write them in separately.
- When everything is done, print the summary statistics (the "Full restore success summary" line shown above).
Once this finishes, the data has been restored.
Incremental backup
To take an incremental backup, just pass the timestamp of the previous backup with --lastbackupts.
You can get the previous backup's timestamp with the validate command, for example:
LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///home/tidb/backupdata | tail -n1`
Demo:
LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///tmp/backupdata | tail -n1`
echo $LAST_BACKUP_TS
Output:
426925055851888642
Note the following restrictions on incremental backups (a scripted sketch of the whole chain follows this list):
- An incremental backup must be stored under a different path than the previous full backup
- The GC safepoint must be earlier than the lastbackupts
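A hedged sketch chaining the two commands, so each increment reads its --lastbackupts from the previous backup's end-version (paths match the /tmp/backupdata layout used below; br must be on PATH):

import subprocess

def last_backup_ts(storage):
    """Decode end-version from an existing backup (the shell one-liner above)."""
    out = subprocess.run(
        ["br", "validate", "decode", "--field=end-version", "-s", storage],
        check=True, capture_output=True, text=True).stdout
    return out.strip().splitlines()[-1]  # same effect as `tail -n1`

def incremental_backup(prev_storage, new_storage, pd="127.0.0.1:2379"):
    ts = last_backup_ts(prev_storage)
    subprocess.run(["br", "backup", "full", "--pd", pd, "-s", new_storage,
                    "--lastbackupts", ts], check=True)

# First increment after the full backup:
incremental_backup("local:///tmp/backupdata", "local:///tmp/backupdata/incr1")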
First incremental:
Compute the timestamp:
LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///tmp/backupdata | tail -n1`
echo $LAST_BACKUP_TS
426925055851888642
Run the backup:
br backup full --pd "127.0.0.1:2379" -s local:///tmp/backupdata/incr1 --lastbackupts ${LAST_BACKUP_TS}
Output:
Detail BR log in /tmp/br.log.2021-08-10T18.05.49+0800
Full backup <------------------------------------------------------------------------------------> 100.00%
Checksum <---------------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:05:50.483 +08:00] [INFO] [collector.go:66] ["Full backup success summary"] [total-ranges=11] [ranges-succeed=11] [ranges-failed=0] [backup-checksum=1.413079ms] [backup-total-ranges=51] [backup-total-regions=51] [total-take=723.576856ms] [BackupTS=426925083796439045] [total-kv=15470] [total-kv-size=743.4kB] [average-speed=1.985MB/s] ["backup data size(after compressed)"=138.5kB]
List the corresponding backup files:
ls /tmp/backupdata/incr1
Output:
1_2_132_9198af8c6c286b502c94a3d4f0b6ac443cb6c343c39043b53a3ed54815b9a5fc_1628589950313_write.sst
5_109_67_36adb8cedcd7af34708edff520499e712e2cfdcb202f5707dc9305a031d55a98_1628589950146_write.sst
5_109_67_54423f4310d10c8b44947056f08b99e8496fef03e9ab04503162fadfa7e38c44_1628589950169_write.sst
5_109_67_755836e42bfb568bf6e89d931d86a7d31453a1bd6f7ffb924e19eb28be74aa00_1628589950154_write.sst
5_109_67_759f1781e3eb9e6aae280a9aa33f2cb97d799e11265ff74363f00f942787a311_1628589950159_write.sst
5_109_67_7c29d2d214a7a70e5c20091535dc0515fe9e2d8c4dffbfad1e3ca1fa7decdc14_1628589950157_write.sst
5_109_67_afb978bcbf32cf0ba18923daa88bdab19c00ad22f4fe49863f102827eb831e21_1628589950164_write.sst
5_109_67_d03f5545842231b64ac143f64290e1efab5eb039773fde78f51cc6d98d01e020_1628589950222_write.sst
5_109_67_d0a5865efa1d90636cf0c26e991e1f13ad4ef35c631530b967aa1e342d73208b_1628589950162_write.sst
5_109_67_de8304485bb5b1def0f07f35f376594d100067cec03a21a00f157e91b6148940_1628589950205_write.sst
5_109_67_fd5ee7729668dbddb4eefa496199dcc1b713d020fc184b9383d249a0f7e83028_1628589950166_write.sst
backup.lock
backupmeta
Note: when computing the timestamp, point -s at the previous backup's path, otherwise you will not get the correct timestamp.
Second incremental:
LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///tmp/backupdata/incr1 | tail -n1`
echo $LAST_BACKUP_TS
426925083796439045
Run the backup:
br backup full --pd "127.0.0.1:2379" -s local:///tmp/backupdata/incr2 --lastbackupts ${LAST_BACKUP_TS}
Output:
[2021/08/10 18:06:39.478 +08:00] [INFO] [collector.go:66] ["Full backup success summary"] [total-ranges=6] [ranges-succeed=6] [ranges-failed=0] [backup-checksum=914.068µs] [backup-total-ranges=51] [backup-total-regions=51] [total-take=485.563975ms] [BackupTS=426925096693923847] [total-kv=4067] [total-kv-size=195.4kB] [average-speed=679.5kB/s] ["backup data size(after compressed)"=52.5kB]
List the corresponding backup files:
ls /tmp/backupdata/incr2
Output:
1_2_132_9198af8c6c286b502c94a3d4f0b6ac443cb6c343c39043b53a3ed54815b9a5fc_1628589999418_write.sst
5_109_67_36adb8cedcd7af34708edff520499e712e2cfdcb202f5707dc9305a031d55a98_1628589999214_write.sst
5_109_67_755836e42bfb568bf6e89d931d86a7d31453a1bd6f7ffb924e19eb28be74aa00_1628589999227_write.sst
5_109_67_759f1781e3eb9e6aae280a9aa33f2cb97d799e11265ff74363f00f942787a311_1628589999236_write.sst
5_109_67_7c29d2d214a7a70e5c20091535dc0515fe9e2d8c4dffbfad1e3ca1fa7decdc14_1628589999230_write.sst
5_109_67_d0a5865efa1d90636cf0c26e991e1f13ad4ef35c631530b967aa1e342d73208b_1628589999243_write.sst
backup.lock
backupmeta
Third incremental:
LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///tmp/backupdata/incr2 | tail -n1`
echo $LAST_BACKUP_TS
426925096693923847
Run the backup:
br backup full --pd "127.0.0.1:2379" -s local:///tmp/backupdata/incr3 --lastbackupts ${LAST_BACKUP_TS}
Output:
Detail BR log in /tmp/br.log.2021-08-10T18.07.16+0800
Full backup <------------------------------------------------------------------------------------> 100.00%
Checksum <---------------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:07:17.347 +08:00] [INFO] [collector.go:66] ["Full backup success summary"] [total-ranges=1] [ranges-succeed=1] [ranges-failed=0] [backup-checksum=746.756µs] [backup-total-ranges=51] [backup-total-regions=51] [total-take=567.9266ms] [total-kv=2895] [total-kv-size=139kB] [average-speed=408.7kB/s] ["backup data size(after compressed)"=33kB] [BackupTS=426925106616074243]
List the corresponding backup files:
ls /tmp/backupdata/incr3
Output:
1_2_132_9198af8c6c286b502c94a3d4f0b6ac443cb6c343c39043b53a3ed54815b9a5fc_1628590037236_write.sst
backup.lock
backupmeta
Fourth incremental:
LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///tmp/backupdata/incr3 | tail -n1`
echo $LAST_BACKUP_TS
426925106616074243
Run the backup:
br backup full --pd "127.0.0.1:2379" -s local:///tmp/backupdata/incr4 --lastbackupts ${LAST_BACKUP_TS}
Output:
Full backup <------------------------------------------------------------------------------------> 100.00%
Checksum <---------------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:08:53.055 +08:00] [INFO] [collector.go:66] ["Full backup success summary"] [total-ranges=11] [ranges-succeed=11] [ranges-failed=0] [backup-checksum=730.937µs] [backup-total-ranges=51] [backup-total-regions=51] [total-take=865.382191ms] [BackupTS=426925131624611844] [total-kv=13502] [total-kv-size=652.7kB] [average-speed=1.275MB/s] ["backup data size(after compressed)"=124.6kB]
List the corresponding backup files:
ls /tmp/backupdata/incr4
Output:
1_2_132_9198af8c6c286b502c94a3d4f0b6ac443cb6c343c39043b53a3ed54815b9a5fc_1628590132837_write.sst
5_109_67_36adb8cedcd7af34708edff520499e712e2cfdcb202f5707dc9305a031d55a98_1628590132559_write.sst
5_109_67_54423f4310d10c8b44947056f08b99e8496fef03e9ab04503162fadfa7e38c44_1628590132624_write.sst
5_109_67_755836e42bfb568bf6e89d931d86a7d31453a1bd6f7ffb924e19eb28be74aa00_1628590132567_write.sst
5_109_67_759f1781e3eb9e6aae280a9aa33f2cb97d799e11265ff74363f00f942787a311_1628590132577_write.sst
5_109_67_7c29d2d214a7a70e5c20091535dc0515fe9e2d8c4dffbfad1e3ca1fa7decdc14_1628590132571_write.sst
5_109_67_afb978bcbf32cf0ba18923daa88bdab19c00ad22f4fe49863f102827eb831e21_1628590132592_write.sst
5_109_67_d03f5545842231b64ac143f64290e1efab5eb039773fde78f51cc6d98d01e020_1628590132732_write.sst
5_109_67_d0a5865efa1d90636cf0c26e991e1f13ad4ef35c631530b967aa1e342d73208b_1628590132582_write.sst
5_109_67_de8304485bb5b1def0f07f35f376594d100067cec03a21a00f157e91b6148940_1628590132691_write.sst
5_109_67_fd5ee7729668dbddb4eefa496199dcc1b713d020fc184b9383d249a0f7e83028_1628590132606_write.sst
backup.lock
backupmeta
Fifth incremental (the timestamp is decoded from incr4, the previous backup):
LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///tmp/backupdata/incr4 | tail -n1`
echo $LAST_BACKUP_TS
426925131624611844
Run the backup:
br backup full --pd "127.0.0.1:2379" -s local:///tmp/backupdata/incr5 --lastbackupts ${LAST_BACKUP_TS}
Output:
Detail BR log in /tmp/br.log.2021-08-10T18.10.09+0800
Full backup <------------------------------------------------------------------------------------> 100.00%
Checksum <---------------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:10:10.122 +08:00] [INFO] [collector.go:66] ["Full backup success summary"] [total-ranges=6] [ranges-succeed=6] [ranges-failed=0] [backup-checksum=735.929µs] [backup-total-ranges=51] [backup-total-regions=51] [total-take=161.498529ms] ["backup data size(after compressed)"=37.21kB] [BackupTS=426925151993200642] [total-kv=2595] [total-kv-size=124.7kB] [average-speed=1.256MB/s]
List the corresponding backup files:
ls /tmp/backupdata/incr5
Output:
1_2_132_9198af8c6c286b502c94a3d4f0b6ac443cb6c343c39043b53a3ed54815b9a5fc_1628590210101_write.sst
5_109_67_36adb8cedcd7af34708edff520499e712e2cfdcb202f5707dc9305a031d55a98_1628590210028_write.sst
5_109_67_755836e42bfb568bf6e89d931d86a7d31453a1bd6f7ffb924e19eb28be74aa00_1628590210034_write.sst
5_109_67_759f1781e3eb9e6aae280a9aa33f2cb97d799e11265ff74363f00f942787a311_1628590210039_write.sst
5_109_67_7c29d2d214a7a70e5c20091535dc0515fe9e2d8c4dffbfad1e3ca1fa7decdc14_1628590210036_write.sst
5_109_67_d0a5865efa1d90636cf0c26e991e1f13ad4ef35c631530b967aa1e342d73208b_1628590210044_write.sst
backup.lock
backupmeta
Check the database:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 36036 |
+----------+
1 row in set (0.03 sec)
Incremental restore
Restoring increments with BR works exactly the same as a full restore. Just note that before restoring an increment, all data backed up before that increment's last backup ts must already have been restored into the target cluster.
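A minimal sketch of scripting that ordered restore, assuming br is on PATH and the full backup plus increments sit under /tmp/backupdata as in this article:

import subprocess

# Restoring strictly in backup order satisfies the last-backup-ts requirement.
ORDERED_BACKUPS = [
    "local:///tmp/backupdata",        # the full backup comes first
    "local:///tmp/backupdata/incr1",
    "local:///tmp/backupdata/incr2",
    "local:///tmp/backupdata/incr3",
    "local:///tmp/backupdata/incr4",
    "local:///tmp/backupdata/incr5",
]
for storage in ORDERED_BACKUPS:
    subprocess.run(["br", "restore", "full", "--pd", "127.0.0.1:2379",
                    "--storage", storage], check=True)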
First drop the data:
mysql> drop table br_table;
Query OK, 0 rows affected (0.22 sec)
Recreate the table:
mysql> create table br_table(id int primary key,c varchar(128),ctime timestamp);
Query OK, 0 rows affected (0.13 sec)
Restore the incremental data
Restore in the order the increments were taken: incr1, incr2, incr3, incr4, incr5.
Command:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr1"
Result:
Detail BR log in /tmp/br.log.2021-08-10T18.13.01+0800
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:13:02.615 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=165.914059ms] [restore-checksum=13.408µs] [restore-ranges=1] [total-take=1.084097852s] [total-kv-size=410kB] [average-speed=4.322MB/s] ["restore data size(after decompressed)"=92.78kB] [total-kv=8541]
Check the database:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 8541 |
+----------+
1 row in set (0.02 sec)
Command:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr2"
Result:
Detail BR log in /tmp/br.log.2021-08-10T18.14.19+0800
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:14:19.846 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=1.90882ms] [restore-checksum=15.614µs] [restore-ranges=1] [total-take=228.696715ms] ["restore data size(after decompressed)"=44.73kB] [total-kv=4059] [total-kv-size=194.8kB] [average-speed=2.883MB/s]
Check the database:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 12600 |
+----------+
1 row in set (0.01 sec)
Command:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr3"
Result:
Detail BR log in /tmp/br.log.2021-08-10T18.15.09+0800
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:15:09.809 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=2.769111ms] [restore-checksum=1.122126ms] [restore-ranges=1] [total-take=159.917469ms] ["restore data size(after decompressed)"=33kB] [total-kv=2895] [total-kv-size=139kB] [average-speed=3.812MB/s]
Check the database:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 15495 |
+----------+
1 row in set (0.01 sec)
Command:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr4"
Result:
Detail BR log in /tmp/br.log.2021-08-10T18.15.34+0800
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:15:35.175 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [restore-checksum=14.727µs] [split-region=3.20808ms] [restore-ranges=1] [total-take=185.808088ms] ["restore data size(after decompressed)"=78.05kB] [total-kv=7075] [total-kv-size=339.6kB] [average-speed=7.533MB/s]]
Check the database:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 22570 |
+----------+
1 row in set (0.00 sec)
Command:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr5"
Result:
Detail BR log in /tmp/br.log.2021-08-10T18.16.26+0800
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:16:26.324 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [restore-checksum=13.012µs] [split-region=7.31328ms] [restore-ranges=1] [total-take=184.869222ms] ["restore data size(after decompressed)"=29.43kB] [total-kv=2587] [total-kv-size=124.2kB] [average-speed=3.314MB/s]
Check the database:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 25157 |
+----------+
1 row in set (0.01 sec)
Full restore command:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata"
Result:
Detail BR log in /tmp/br.log.2021-08-10T18.17.42+0800
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:17:42.609 +08:00] [INFO] [collector.go:66] ["Full restore failed summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [restore-checksum=16.228805ms] [split-region=3.13305ms] [restore-ranges=1]
Error: failed to validate checksum: [BR:Restore:ErrRestoreChecksumMismatch]restore checksum mismatch
Check the database:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 36036 |
+----------+
1 row in set (0.02 sec)
The data was restored, but br also reported an error.
Delete all the data again and restore strictly in the order the backups were taken.
mysql> drop table br_table;
Query OK, 0 rows affected (0.32 sec)
Recreate the table:
mysql> create table br_table(id int primary key,c varchar(128),ctime timestamp);
Query OK, 0 rows affected (0.13 sec)
Restore order: full -> incr1 -> incr2 -> incr3 -> incr4 -> incr5
full:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata"
Output:
Detail BR log in /tmp/br.log.2021-08-10T18.18.28+0800
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:18:29.163 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=129.336319ms] [restore-checksum=24.402545ms] [restore-ranges=1] [total-take=841.561304ms] ["restore data size(after decompressed)"=112.2kB] [total-kv=10879] [total-kv-size=511.1kB] [average-speed=8.273MB/s]
Check the database:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 10879 |
+----------+
1 row in set (0.02 sec)
Incr1:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr1"
Output:
Detail BR log in /tmp/br.log.2021-08-10T18.19.22+0800
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:19:22.831 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=740.957µs] [restore-checksum=14.175µs] [restore-ranges=1] [total-take=153.847127ms] [total-kv=8541] [total-kv-size=410kB] [average-speed=7.223MB/s] ["restore data size(after decompressed)"=92.78kB]
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 19420 |
+----------+
1 row in set (0.00 sec)
Incr2:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr2"
Output:
Detail BR log in /tmp/br.log.2021-08-10T18.20.05+0800
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:20:06.098 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=1.181962ms] [restore-checksum=28.557µs] [restore-ranges=1] [total-take=152.04905ms] ["restore data size(after decompressed)"=44.73kB] [total-kv=4059] [total-kv-size=194.8kB] [average-speed=5.707MB/s]
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 23479 |
+----------+
1 row in set (0.01 sec)
Incr3:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr3"
Output:
Detail BR log in /tmp/br.log.2021-08-10T18.20.31+0800
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:20:31.562 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=1.830579ms] [restore-checksum=16.857µs] [restore-ranges=1] [total-take=181.453478ms] ["restore data size(after decompressed)"=33kB] [total-kv=2895] [total-kv-size=139kB] [average-speed=4.243MB/s]
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 26374 |
+----------+
1 row in set (0.00 sec)
Incr4:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr4"
Output:
Detail BR log in /tmp/br.log.2021-08-10T18.21.05+0800
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:21:05.900 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=788.923µs] [restore-checksum=13.061µs] [restore-ranges=1] [total-take=149.491308ms] ["restore data size(after decompressed)"=78.05kB] [total-kv=7075] [total-kv-size=339.6kB] [average-speed=7.096MB/s]
Check the database:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 33449 |
+----------+
1 row in set (0.04 sec)
Incr5:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr5"
Output:
Detail BR log in /tmp/br.log.2021-08-10T18.21.42+0800
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:21:42.984 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=973.306µs] [restore-checksum=13.075µs] [restore-ranges=1] [total-take=154.392711ms] [total-kv=2587] [total-kv-size=124.2kB] [average-speed=4.41MB/s] ["restore data size(after decompressed)"=29.43kB]
Check the database:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 36036 |
+----------+
1 row in set (0.01 sec)
Data restore complete.
Now let's shuffle the order:
full -> 5 -> 4 -> 3 -> 1 -> 2
First drop the table:
mysql> drop table br_table;
Query OK, 0 rows affected (0.24 sec)
Recreate the table:
mysql> create table br_table(id int primary key,c varchar(128),ctime timestamp);
Query OK, 0 rows affected (0.13 sec)
Run the restores:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr5"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr4"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr3"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr1"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr2"
Now check the database again:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 36036 |
+----------+
1 row in set (0.00 sec)
The restore succeeded.
Inspect the row details:
mysql> select * from br_table;
No out-of-order data was found.
Next, try 5 -> 4 -> 3 -> 1 -> 2 -> full
First drop the table:
mysql> drop table br_table;
Query OK, 0 rows affected (0.24 sec)
Recreate the table:
mysql> create table br_table(id int primary key,c varchar(128),ctime timestamp);
Query OK, 0 rows affected (0.13 sec)
Run the restores:
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr5"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr4"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr3"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr1"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr2"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata"
When execution reaches br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata", the following error appears:
Detail BR log in /tmp/br.log.2021-08-11T10.32.26+0800
Full restore <----------------------------------------------------------------------------------------> 100.00%
[2021/08/11 10:32:26.592 +08:00] [INFO] [collector.go:66] ["Full restore failed summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=1.305227ms] [restore-checksum=21.161653ms] [restore-ranges=1]
Error: failed to validate checksum: [BR:Restore:ErrRestoreChecksumMismatch]restore checksum mismatch
Now check the database again:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
| 36036 |
+----------+
1 row in set (0.00 sec)
The restore succeeded.
Inspect the row details:
mysql> select * from br_table;
No out-of-order data was found.
Conclusion: in my admittedly unrigorous testing, restoring without applying the full backup first produces the checksum error above, although the data itself was still restored successfully.
Warning:
Although system tables (for example mysql.tidb) can be backed up and restored with BR, some system tables may end up in an unexpected state after a restore. The known issues are:
- Statistics tables (mysql.stat_*) cannot be restored.
- System variable tables (mysql.tidb, mysql.global_variables) cannot be restored.
- User information tables (mysql.user, mysql.columns_priv, and so on) cannot be restored.
- GC data cannot be restored.
Restoring system tables may hit further compatibility problems. To avoid surprises, do not restore system tables in production.
Open question:
1. When the GC interval is shorter than the interval between incremental backups, is the incremental backup still meaningful? Presumably not: as noted above, the GC safepoint must stay earlier than lastbackupts, so once GC has reclaimed the MVCC versions written since the previous backup, the increment can no longer be taken; that is exactly why tikv_gc_life_time was raised to 720h before backing up.