TiDB Master-Slave Recovery

I. What is TiDB

TiDB is an open-source distributed relational database developed by PingCAP. It combines traditional OLTP (online transaction processing) and OLAP (online analytical processing) capabilities, making it an HTAP (hybrid transactional/analytical processing) database. It is designed to overcome the scalability and high-availability bottlenecks that traditional databases such as MySQL hit under large data volumes and high concurrency.

II. Core Architecture

1. Layered Components

  • TiDB Server
    The stateless SQL compute layer. It handles SQL parsing, optimization, and execution, is compatible with the MySQL 5.7 protocol and syntax, and supports seamless migration.
  • TiKV
    The distributed key-value storage engine. It uses the Raft protocol to keep multiple replicas strongly consistent, supports ACID transactions, and runs three replicas by default for fault tolerance.
  • TiFlash (optional)
    A columnar storage engine that stays in sync with TiKV via asynchronous replication, enabling real-time HTAP analytics.
  • PD (Placement Driver)
    The cluster management module, responsible for metadata storage, load-balancing scheduling, and global transaction ID allocation.

2. Scalability and High Availability

  • The compute layer (TiDB Server) and storage layer (TiKV/TiFlash) scale out independently, supporting PB-scale data.
  • Automatic failover is built on the Raft protocol (RTO ≤ 30 s, RPO = 0), suitable for finance-grade disaster recovery.

III. Core Features

1. Compatibility and Easy Migration

  • Highly compatible with the MySQL protocol, so mainstream ORM frameworks and tools (e.g. Navicat, DBeaver) work out of the box.
  • TiDB DM (the Data Migration tool) enables low-cost migration from MySQL to TiDB.

2. Distributed Transactions and Consistency

  • Supports cross-row ACID transactions, using two-phase commit (2PC) plus an optimistic locking model to guarantee strong consistency.
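To make the idea concrete, here is a deliberately simplified two-phase-commit sketch in Python. It is an illustrative model only, not TiDB's actual Percolator-based implementation: a coordinator prewrites on every participant and commits only if all prewrites succeed, otherwise everything is rolled back.

```python
# Minimal two-phase-commit model (illustrative only; not TiDB's
# actual Percolator-based implementation).

class Participant:
    def __init__(self, name, fail_prewrite=False):
        self.name = name
        self.fail_prewrite = fail_prewrite
        self.staged = {}   # prewritten but uncommitted data
        self.data = {}     # committed data

    def prewrite(self, kv):
        if self.fail_prewrite:
            return False   # e.g. a lock conflict
        self.staged.update(kv)
        return True

    def commit(self):
        self.data.update(self.staged)
        self.staged.clear()

    def rollback(self):
        self.staged.clear()


def two_phase_commit(participants, kv):
    # Phase 1: prewrite on every participant.
    if all(p.prewrite(kv) for p in participants):
        # Phase 2: all prepared -> commit everywhere.
        for p in participants:
            p.commit()
        return True
    # Any prewrite failure -> roll back everywhere.
    for p in participants:
        p.rollback()
    return False
```

A transaction only becomes visible once every participant has prewritten successfully, which is what gives the all-or-nothing guarantee.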

3. Real-Time HTAP

  • TiKV (row store) and TiFlash (column store) work together, so a single cluster can serve transactional and analytical queries at the same time without ETL pipelines.

4. Cloud-Native Support

  • TiDB Operator automates deployment on Kubernetes, covering public, private, and hybrid cloud architectures.

IV. Practice

1. Installation Prerequisites

1.1 Create a regular user

Run as root:

# create the user
adduser tidb

# set a password
passwd tidb
1.2 Configure passwordless sudo

Edit the /etc/sudoers file:

vim /etc/sudoers

Append at the end of the file:

tidb ALL=(ALL)  NOPASSWD:ALL
1.3 Create an SSH key for the tidb user

Switch to the user:

su - tidb

Run the following command, pressing Enter through all the prompts:

ssh-keygen -t rsa

2. Installation and Deployment

2.1 Install TiUP on the control machine

Log in to the control machine as a regular user (tidb in this example); all subsequent TiUP installation and cluster management steps are performed as this user.

Install the TiUP tool:

curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh

After installation, ~/.bashrc already has TiUP on the PATH; open a new terminal, or run source ~/.bashrc, to pick it up.

Install the TiUP Cluster component:

tiup install cluster

Then, following the first item in the post-install hints, run the command the installer printed, for example:

source /home/tidb/.bash_profile

The TiUP components needed to install TiDB are now in place; next, the installation itself.

2.2 Write the initialization topology file

The cluster initialization configuration file must be written by hand. Create a YAML file in the tidb user's home directory; my topology.yaml is shown below.

Note: replace the IP addresses with the IP of the server you are installing on.

global:
 user: "tidb"
 ssh_port: 22
 deploy_dir: "/home/tidb/tidb-deploy"
 data_dir: "/home/tidb/tidb-data"

monitored:
 node_exporter_port: 9100
 blackbox_exporter_port: 9115

server_configs:
 tidb:
   log.slow-threshold: 300
 tikv:
   readpool.storage.use-unified-pool: false
   readpool.coprocessor.use-unified-pool: true
 pd:
   replication.enable-placement-rules: true
   replication.location-labels: ["host"]
 tiflash:
   logger.level: "info"

pd_servers:
 - host: 192.168.16.56

tidb_servers:
 - host: 192.168.16.56

tikv_servers:
 - host: 192.168.16.56
   port: 20160
   status_port: 20180
   config:
     server.labels: { host: "logic-host-1" }

 - host: 192.168.16.56
   port: 20161
   status_port: 20181
   config:
     server.labels: { host: "logic-host-2" }

 - host: 192.168.16.56
   port: 20162
   status_port: 20182
   config:
     server.labels: { host: "logic-host-3" }

tiflash_servers:
 - host: 192.168.16.56

monitoring_servers:
 - host: 192.168.16.56

grafana_servers:
 - host: 192.168.16.56
2.3 Deploy the cluster

Deploy command:

tiup cluster deploy tidb-test v5.1.1 ./topology.yaml --user tidb -p
Parameter notes:
  • The cluster deployed via TiUP cluster is named tidb-test.
  • The version deployed is v5.1.1; run tiup list tidb for other available versions.
  • The initialization configuration file is topology.yaml.
  • --user tidb: log in to the target hosts as the tidb user to carry out the deployment. This user needs SSH access to the target machines and sudo rights there; any other user with SSH and sudo privileges can be used instead.
  • [-i] and [-p] are optional. If passwordless SSH to the target machines is already set up, neither is needed. Otherwise pick one: [-i] takes the private key for the root user (or the user given by --user) on the targets, while [-p] prompts interactively for that user's password.

If all goes well, a success message appears.

2.4 Start the cluster

Start command:

tiup cluster start tidb-test

If all goes well, a success message appears.

Check the cluster status with: tiup cluster display tidb-test

Starting component `cluster`: /home/ubuntu/.tiup/components/cluster/v1.3.2/tiup-cluster display tidb-test
Cluster type:       tidb
Cluster name:       tidb-test
Cluster version:    v5.1.1
SSH type:           builtin
Dashboard URL:      http://192.168.16.56:2379/dashboard
ID                   Role        Host           Ports                            OS/Arch       Status   Data Dir                                     Deploy Dir
--                   ----        ----           -----                            -------       ------   --------                                     ----------
192.168.16.56:3000   grafana     192.168.16.56  3000                             linux/x86_64  Up       -                                            /home/ubuntu/tidb/tidb-deploy/grafana-3000
192.168.16.56:2379   pd          192.168.16.56  2379/2380                        linux/x86_64  Up|L|UI  /home/ubuntu/tidb/tidb-data/pd-2379          /home/ubuntu/tidb/tidb-deploy/pd-2379
192.168.16.56:9090   prometheus  192.168.16.56  9090                             linux/x86_64  Up       /home/ubuntu/tidb/tidb-data/prometheus-9090  /home/ubuntu/tidb/tidb-deploy/prometheus-9090
192.168.16.56:4000   tidb        192.168.16.56  4000/10080                       linux/x86_64  Up       -                                            /home/ubuntu/tidb/tidb-deploy/tidb-4000
192.168.16.56:9000   tiflash     192.168.16.56  9000/8123/3930/20170/20292/8234  linux/x86_64  Up       /home/ubuntu/tidb/tidb-data/tiflash-9000     /home/ubuntu/tidb/tidb-deploy/tiflash-9000
192.168.16.56:20160  tikv        192.168.16.56  20160/20180                      linux/x86_64  Up       /home/ubuntu/tidb/tidb-data/tikv-20160       /home/ubuntu/tidb/tidb-deploy/tikv-20160
192.168.16.56:20161  tikv        192.168.16.56  20161/20181                      linux/x86_64  Up       /home/ubuntu/tidb/tidb-data/tikv-20161       /home/ubuntu/tidb/tidb-deploy/tikv-20161
192.168.16.56:20162  tikv        192.168.16.56  20162/20182                      linux/x86_64  Up       /home/ubuntu/tidb/tidb-data/tikv-20162       /home/ubuntu/tidb/tidb-deploy/tikv-20162
Total nodes: 8

In the output above you can see that TiDB listens on port 4000 and PD's operations port is 2379. Tools such as Navicat connect to the database on port 4000, and the default root password is empty.

2.5 Key components

A few key components:

  • PD: metadata and scheduling/control component
  • TiKV: storage component
  • TiDB: database instance (SQL) component
  • TiFlash: columnar storage component for analytical queries

TiDB resembles MySQL, but its strength is being distributed. With MySQL, once the data grows you have to think about sharding and routing middleware such as MyCat; TiDB was designed as a distributed system from the ground up, with a storage architecture reminiscent of HDFS, making distribution a native part of the architecture.
2.6 tiup cluster command reference

1. Deploy a cluster

The deploy command is tiup cluster deploy; general usage:

tiup cluster deploy <cluster-name> <version> <topology.yaml> [flags]

2. List clusters

After a cluster is deployed, tiup cluster list shows it in the cluster list.

3. Start a cluster

After deployment, start the cluster with the command below. If you have forgotten the cluster's name, look it up with tiup cluster list.

tiup cluster start tidb-test

4. Check cluster status

Logging in to each machine one by one to check every component's state is clearly inefficient, so TiUP provides tiup cluster display, used like this:

tiup cluster display tidb-test

Other commands and maintenance operations are documented on the official website.

3. Connection Test

Because TiDB speaks the MySQL protocol, you can connect with a tool like Navicat simply by choosing a MySQL connection:

Username: root

Password: (empty)

Install the BR toolkit

Note: download the BR toolkit that matches your TiDB version.

wget https://download.pingcap.org/tidb-toolkit-v5.1.1-linux-amd64.tar.gz

After downloading, extract the archive and create a symlink:

tar -zxvf tidb-toolkit-v5.1.1-linux-amd64.tar.gz
ln -s tidb-toolkit-v5.1.1-linux-amd64 .tidb-toolkit

Add to .bashrc:

export PATH=/home/tidb/.tidb-toolkit/bin:$PATH

Then source .bashrc again.

Create a database

Connect to the database:

mysql -h 127.0.0.1 -P 4000 -uroot 
mysql> create database br_test;
Query OK, 0 rows affected (0.14 sec)
mysql> use br_test;
Database changed
mysql> create table br_table(id int primary key,c varchar(128),ctime timestamp); 
Query OK, 0 rows affected (0.12 sec)
Generate the data:

Using a Python script:

import pymysql

# host is my local machine's IP
mydb = pymysql.connect(host="192.168.56.101", user='root', port=4000, database='br_test')
mycursor = mydb.cursor()
for i in range(10000):
    mycursor.execute('insert into br_table values(%s,%s,now())', (i, str(i) + 'xxxx'))
    if i % 1000 == 0:
        mydb.commit()   # commit in batches of 1000
mydb.commit()           # commit the remaining rows
mycursor.close()
mydb.close()
View the generated data:
mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    10000 |
+----------+
1 row in set (0.05 sec)


Preparation before backup

Adjust the GC parameter:
mysql> select VARIABLE_VALUE from mysql.tidb where VARIABLE_NAME="tikv_gc_life_time";
+----------------+
| VARIABLE_VALUE |
+----------------+
| 10m0s          |
+----------------+
1 row in set (0.00 sec)

mysql> UPDATE mysql.tidb SET VARIABLE_VALUE = '720h' WHERE VARIABLE_NAME = 'tikv_gc_life_time';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select VARIABLE_VALUE from mysql.tidb where VARIABLE_NAME="tikv_gc_life_time";
+----------------+
| VARIABLE_VALUE |
+----------------+
| 720h           |
+----------------+
1 row in set (0.00 sec)
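The tikv_gc_life_time values above ("10m0s", "720h") are Go-style duration strings. A small hedged helper for converting them to seconds, assuming only h/m/s units occur, makes it easier to reason about how far back GC keeps MVCC versions:

```python
import re

# Unit multipliers for the h/m/s components of a Go-style duration.
_UNITS = {"h": 3600, "m": 60, "s": 1}

def parse_go_duration(s):
    """Parse a duration like '10m0s' or '720h' into seconds.

    Illustrative sketch: assumes only h/m/s units appear, which covers
    the tikv_gc_life_time values shown above.
    """
    parts = re.findall(r"(\d+(?:\.\d+)?)([hms])", s)
    if not parts:
        raise ValueError(f"unrecognized duration: {s!r}")
    return sum(float(n) * _UNITS[u] for n, u in parts)
```

For example, "10m0s" parses to 600 seconds and "720h" to 2,592,000 seconds (30 days).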
Run the backup

BR provides two operations, backup and restore, and each can separately target the full cluster, a single database, or a single table, so they are discussed individually.

Before getting to those, the parameters they share are worth covering on their own.

One more thing to note: backup requests are sent to TiKV over gRPC, so it is best to run BR on a PD node to avoid extra trouble.

Common parameters

--ca, --cert, --key: if TLS connection security is enabled, these specify the certificates and key.

--concurrency: per-node task parallelism, default 4.

--log-file, --log-level: log output location and level.

-u, --pd: PD address to connect to, default 127.0.0.1:2379.

--ratelimit: per-node rate limit, in MB/s.

-s, --storage: storage location, e.g. "local:///data_nfs1".

Run the first full backup

Note: I installed everything on the local server, so the IP is 127.0.0.1 and the PD port is the default 2379. Because this is a test, the backup data is written to /tmp; data under /tmp is wiped when the server reboots, so use a different path for anything you want to keep!

br backup full --pd "127.0.0.1:2379" -s local:///tmp/backupdata

Output:

Detail BR log in /tmp/br.log.2021-08-10T18.04.03+0800 
Full backup <------------------------------------------------------------------------------------> 100.00%
Checksum <---------------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:04:04.462 +08:00] [INFO] [collector.go:66] ["Full backup success summary"] [total-ranges=23] [ranges-succeed=23] [ranges-failed=0] [backup-checksum=189.837122ms] [backup-fast-checksum=37.230162ms] [backup-total-ranges=51] [backup-total-regions=51] [total-take=1.315786033s] [total-kv=15863] [total-kv-size=807.2kB] [average-speed=1.014MB/s] ["backup data size(after compressed)"=194.1kB] [BackupTS=426925055851888642]
A quick walk through the backup flow, as seen in the log:

* Connect to PD and obtain the list of all TiKV nodes.

* Query infoSchema for the metadata.

* Send the backup request: {"cluster_id":6801677637806839235,"start_key":"dIAAAAAAAAAvX3IAAAAAAAAAAA==","end_key":"dIAAAAAAAAAvX3L//wA=","end_version":426921533553770497,"concurrency":4,"storage_backend":{"Backend":{"Local":{"path":"/tmp/userbackupdata"}}}}

* Each TiKV node performs its part of the backup and, when done, reports back to BR for aggregation.

* Compute the table checksum: [table=br_test.br_table] [Crc64Xor=12896770389982935753] [TotalKvs=100000] [TotalBytes=4788890]

* Save the backup metadata.

* Backup complete.

The metadata mentioned during the backup ends up in the backup directory. It mainly holds checksums plus a description of the backup set: the logical layout of every database, table, and column in the set, character set information, and so on (stored as a protobuf-format description file).

When the backup finishes, the target directory contains backup files with names like 5_2_23_80992061af3e5194c3f28a5b79d486c5e9db2feda1afb3f84b4ca229ddce9932_write.sst.

View the backed-up data:

ls /tmp/backupdata/

Output:

1_2_132_9198af8c6c286b502c94a3d4f0b6ac443cb6c343c39043b53a3ed54815b9a5fc_1628589844017_write.sst
5_109_67_2d2f68c250f77d93ffb8ea543a425eebeea7c92d1f033e9d32cdb914ca4341a9_1628589843794_write.sst
5_109_67_302753bc98ec93e98edea12aeac15074da779bb66c735c8ce184eacf43f1e7c2_1628589843957_write.sst
5_109_67_355f0b5a4a8313c61daccaf1f055ac6ce3e0abaeb3d9fda7eb60557014f38dac_1628589843573_write.sst
5_109_67_36adb8cedcd7af34708edff520499e712e2cfdcb202f5707dc9305a031d55a98_1628589843589_write.sst
5_109_67_495c5d2e3a1901df43133318e03236d08d1c8882cdd7b452579a4fcd82ccb718_1628589843551_default.sst
5_109_67_495c5d2e3a1901df43133318e03236d08d1c8882cdd7b452579a4fcd82ccb718_1628589843551_write.sst
5_109_67_54423f4310d10c8b44947056f08b99e8496fef03e9ab04503162fadfa7e38c44_1628589843719_write.sst
5_109_67_5efda45e83e08a349095f080e17b995c4e6185537f363448275487b8e043b86d_1628589843751_write.sst
5_109_67_755836e42bfb568bf6e89d931d86a7d31453a1bd6f7ffb924e19eb28be74aa00_1628589843636_write.sst
5_109_67_759f1781e3eb9e6aae280a9aa33f2cb97d799e11265ff74363f00f942787a311_1628589843655_write.sst
5_109_67_7c29d2d214a7a70e5c20091535dc0515fe9e2d8c4dffbfad1e3ca1fa7decdc14_1628589843645_write.sst
5_109_67_8dd75043152f9a7a07ef9bc0b24b629d464621806e3c93219cac918ec3d5b346_1628589843785_write.sst
5_109_67_994b3ad98a3599f600a3860d1ac7e0516f41765bc5cc4bd119d91a0bcca207c4_1628589843759_write.sst
5_109_67_a24c493bf7e801abe1238806dd31796f911e7dd0970118b9315f0a7366532873_1628589843803_write.sst
5_109_67_afb978bcbf32cf0ba18923daa88bdab19c00ad22f4fe49863f102827eb831e21_1628589843668_write.sst
5_109_67_c11c7f8053ba1d8a3efe53decec79ee83321252578aaa0e09024538869d60d0d_1628589843947_write.sst
5_109_67_c2baa1d0dee98e26766aa462a873c405eaad66653a52db6bc9e1d584f6aebcc7_1628589843601_write.sst
5_109_67_d03f5545842231b64ac143f64290e1efab5eb039773fde78f51cc6d98d01e020_1628589843859_write.sst
5_109_67_d0a5865efa1d90636cf0c26e991e1f13ad4ef35c631530b967aa1e342d73208b_1628589843663_write.sst
5_109_67_de8304485bb5b1def0f07f35f376594d100067cec03a21a00f157e91b6148940_1628589843809_write.sst
5_109_67_fd5ee7729668dbddb4eefa496199dcc1b713d020fc184b9383d249a0f7e83028_1628589843677_write.sst
5_109_67_ff781abda573afc7d7878f20390a290dcdb471bf12a8cbabf359ccf9e8d37084_1628589843910_write.sst
backup.lock
backupmeta
Restore the data

To keep things simple, we restore into the original cluster here; in practice you would usually restore into a brand-new cluster. First delete the data:

mysql> drop table br_table;
Query OK, 0 rows affected (0.26 sec)

Create the table:

mysql> create table br_table(id int primary key,c varchar(128),ctime timestamp); 
Query OK, 0 rows affected (0.13 sec)

Note: during a restore, every TiKV node must be able to read all the backup files. If the storage is not shared, copy all backup files to all nodes by hand.

br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata"

Output:

Detail BR log in /tmp/br.log.2021-08-10T18.05.01+0800 
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:05:02.617 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=166.833906ms] [restore-checksum=56.439317ms] [restore-ranges=1] [total-take=1.151067383s] ["restore data size(after decompressed)"=1.014MB] [total-kv=100000] [total-kv-size=4.789MB] [average-speed=16.33MB/s]

Check the data in the database:

mysql> select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    10000 |
+----------+
1 row in set (0.04 sec)

The row counts match.

The restore flow, from the log:

  1. Find and confirm the PD node.

  2. Run the DDL statements: CREATE DATABASE /*!32312 IF NOT EXISTS*/ and CREATE TABLE IF NOT EXISTS.

  3. Run the necessary ALTER ... AUTO_INCREMENT statements, so IDs are not allocated from the pre-restore point and data is not overwritten.

  4. Split the SSTs into the small per-Region data sets and write each set in.

  5. When the work is done, print the summary:

Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:05:02.617 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=166.833906ms] [restore-checksum=56.439317ms] [restore-ranges=1] [total-take=1.151067383s] ["restore data size(after decompressed)"=1.014MB] [total-kv=100000] [total-kv-size=4.789MB] [average-speed=16.33MB/s]

After it finishes, you can see the data has been fully restored.

Incremental backup

To take an incremental backup, just pass the timestamp of the previous backup with --lastbackupts.

The previous backup's timestamp can be obtained with the validate subcommand, for example:

LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///home/tidb/backupdata | tail -n1`

Demo:

 LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///tmp/backupdata | tail -n1` 
 echo $LAST_BACKUP_TS

Output:

426925055851888642
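BackupTS values like the one above are TiDB TSOs: the physical time in milliseconds lives in the high bits, and an 18-bit logical counter in the low bits. A small sketch to decode one into wall-clock time:

```python
from datetime import datetime, timezone

def decode_tso(tso):
    """Decode a TiDB TSO into (UTC datetime, logical counter).

    High bits of the TSO hold the physical time in milliseconds;
    the low 18 bits are a logical counter.
    """
    physical_ms = tso >> 18
    logical = tso & ((1 << 18) - 1)
    return datetime.fromtimestamp(physical_ms / 1000, tz=timezone.utc), logical

ts, logical = decode_tso(426925055851888642)
print(ts)  # 2021-08-10 in UTC, matching the 18:04 +08:00 time in the backup log
```

This is handy for sanity-checking which point in time a backup actually captured before restoring it.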

Note the following restrictions on incremental backup:

  • The incremental backup must go to a different path from the previous full backup.
  • The GC safepoint must be earlier than the lastbackupts.
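The GC-safepoint restriction can be sketched as a check (illustrative only, names are my own): an incremental backup starting from last_backup_ts only works while the TSO's physical time is still newer than now minus tikv_gc_life_time, i.e. GC has not yet reclaimed the MVCC versions the increment needs.

```python
import time

def tso_physical_ms(tso):
    # High bits of a TiDB TSO hold the physical time in milliseconds.
    return tso >> 18

def can_do_incremental(last_backup_ts, gc_life_time_s, now_ms=None):
    """Rough feasibility check for an incremental backup.

    Approximates the GC safepoint as now - gc_life_time and requires
    the previous backup's TSO to be newer than it.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    safepoint_ms = now_ms - gc_life_time_s * 1000
    return tso_physical_ms(last_backup_ts) > safepoint_ms
```

With gc_life_time raised to 720h as above, a backup from earlier the same day easily passes; with the default 10m, anything older than ten minutes fails.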

First incremental:

Compute the timestamp:

LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///tmp/backupdata | tail -n1`
echo $LAST_BACKUP_TS
426925055851888642

Run the backup:

br backup full --pd "127.0.0.1:2379" -s local:///tmp/backupdata/incr1 --lastbackupts ${LAST_BACKUP_TS}

Output:

Detail BR log in /tmp/br.log.2021-08-10T18.05.49+0800 
Full backup <------------------------------------------------------------------------------------> 100.00%
Checksum <---------------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:05:50.483 +08:00] [INFO] [collector.go:66] ["Full backup success summary"] [total-ranges=11] [ranges-succeed=11] [ranges-failed=0] [backup-checksum=1.413079ms] [backup-total-ranges=51] [backup-total-regions=51] [total-take=723.576856ms] [BackupTS=426925083796439045] [total-kv=15470] [total-kv-size=743.4kB] [average-speed=1.985MB/s] ["backup data size(after compressed)"=138.5kB]

List the backup files:

ls /tmp/backupdata/incr1

Output:

1_2_132_9198af8c6c286b502c94a3d4f0b6ac443cb6c343c39043b53a3ed54815b9a5fc_1628589950313_write.sst
5_109_67_36adb8cedcd7af34708edff520499e712e2cfdcb202f5707dc9305a031d55a98_1628589950146_write.sst
5_109_67_54423f4310d10c8b44947056f08b99e8496fef03e9ab04503162fadfa7e38c44_1628589950169_write.sst
5_109_67_755836e42bfb568bf6e89d931d86a7d31453a1bd6f7ffb924e19eb28be74aa00_1628589950154_write.sst
5_109_67_759f1781e3eb9e6aae280a9aa33f2cb97d799e11265ff74363f00f942787a311_1628589950159_write.sst
5_109_67_7c29d2d214a7a70e5c20091535dc0515fe9e2d8c4dffbfad1e3ca1fa7decdc14_1628589950157_write.sst
5_109_67_afb978bcbf32cf0ba18923daa88bdab19c00ad22f4fe49863f102827eb831e21_1628589950164_write.sst
5_109_67_d03f5545842231b64ac143f64290e1efab5eb039773fde78f51cc6d98d01e020_1628589950222_write.sst
5_109_67_d0a5865efa1d90636cf0c26e991e1f13ad4ef35c631530b967aa1e342d73208b_1628589950162_write.sst
5_109_67_de8304485bb5b1def0f07f35f376594d100067cec03a21a00f157e91b6148940_1628589950205_write.sst
5_109_67_fd5ee7729668dbddb4eefa496199dcc1b713d020fc184b9383d249a0f7e83028_1628589950166_write.sst
backup.lock
backupmeta

Note: when computing the timestamp, point the command at the previous backup's path, or you will not get the correct timestamp.

Second incremental:

LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///tmp/backupdata/incr1 | tail -n1` 
echo $LAST_BACKUP_TS
426925083796439045

Run the backup:

br backup full --pd "127.0.0.1:2379" -s local:///tmp/backupdata/incr2 --lastbackupts ${LAST_BACKUP_TS}

Output:

[2021/08/10 18:06:39.478 +08:00] [INFO] [collector.go:66] ["Full backup success summary"] [total-ranges=6] [ranges-succeed=6] [ranges-failed=0] [backup-checksum=914.068µs] [backup-total-ranges=51] [backup-total-regions=51] [total-take=485.563975ms] [BackupTS=426925096693923847] [total-kv=4067] [total-kv-size=195.4kB] [average-speed=679.5kB/s] ["backup data size(after compressed)"=52.5kB]

List the backup files:

ls /tmp/backupdata/incr2

Output:

1_2_132_9198af8c6c286b502c94a3d4f0b6ac443cb6c343c39043b53a3ed54815b9a5fc_1628589999418_write.sst
5_109_67_36adb8cedcd7af34708edff520499e712e2cfdcb202f5707dc9305a031d55a98_1628589999214_write.sst
5_109_67_755836e42bfb568bf6e89d931d86a7d31453a1bd6f7ffb924e19eb28be74aa00_1628589999227_write.sst
5_109_67_759f1781e3eb9e6aae280a9aa33f2cb97d799e11265ff74363f00f942787a311_1628589999236_write.sst
5_109_67_7c29d2d214a7a70e5c20091535dc0515fe9e2d8c4dffbfad1e3ca1fa7decdc14_1628589999230_write.sst
5_109_67_d0a5865efa1d90636cf0c26e991e1f13ad4ef35c631530b967aa1e342d73208b_1628589999243_write.sst
backup.lock
backupmeta

Third incremental:

 LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///tmp/backupdata/incr2 | tail -n1` 
 echo $LAST_BACKUP_TS
426925096693923847

Run the backup:

br backup full --pd "127.0.0.1:2379" -s local:///tmp/backupdata/incr3 --lastbackupts ${LAST_BACKUP_TS}

Output:

Detail BR log in /tmp/br.log.2021-08-10T18.07.16+0800 
Full backup <------------------------------------------------------------------------------------> 100.00%
Checksum <---------------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:07:17.347 +08:00] [INFO] [collector.go:66] ["Full backup success summary"] [total-ranges=1] [ranges-succeed=1] [ranges-failed=0] [backup-checksum=746.756µs] [backup-total-ranges=51] [backup-total-regions=51] [total-take=567.9266ms] [total-kv=2895] [total-kv-size=139kB] [average-speed=408.7kB/s] ["backup data size(after compressed)"=33kB] [BackupTS=426925106616074243]

List the backup files:

ls /tmp/backupdata/incr3

Output:

1_2_132_9198af8c6c286b502c94a3d4f0b6ac443cb6c343c39043b53a3ed54815b9a5fc_1628590037236_write.sst
backup.lock
backupmeta

Fourth incremental:

 LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///tmp/backupdata/incr3 | tail -n1` 
 echo $LAST_BACKUP_TS
426925106616074243

Run the backup:

br backup full --pd "127.0.0.1:2379" -s local:///tmp/backupdata/incr4 --lastbackupts ${LAST_BACKUP_TS}

Output:

Full backup <------------------------------------------------------------------------------------> 100.00%
Checksum <---------------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:08:53.055 +08:00] [INFO] [collector.go:66] ["Full backup success summary"] [total-ranges=11] [ranges-succeed=11] [ranges-failed=0] [backup-checksum=730.937µs] [backup-total-ranges=51] [backup-total-regions=51] [total-take=865.382191ms] [BackupTS=426925131624611844] [total-kv=13502] [total-kv-size=652.7kB] [average-speed=1.275MB/s] ["backup data size(after compressed)"=124.6kB]

List the backup files:

ls /tmp/backupdata/incr4

Output:

1_2_132_9198af8c6c286b502c94a3d4f0b6ac443cb6c343c39043b53a3ed54815b9a5fc_1628590132837_write.sst
5_109_67_36adb8cedcd7af34708edff520499e712e2cfdcb202f5707dc9305a031d55a98_1628590132559_write.sst
5_109_67_54423f4310d10c8b44947056f08b99e8496fef03e9ab04503162fadfa7e38c44_1628590132624_write.sst
5_109_67_755836e42bfb568bf6e89d931d86a7d31453a1bd6f7ffb924e19eb28be74aa00_1628590132567_write.sst
5_109_67_759f1781e3eb9e6aae280a9aa33f2cb97d799e11265ff74363f00f942787a311_1628590132577_write.sst
5_109_67_7c29d2d214a7a70e5c20091535dc0515fe9e2d8c4dffbfad1e3ca1fa7decdc14_1628590132571_write.sst
5_109_67_afb978bcbf32cf0ba18923daa88bdab19c00ad22f4fe49863f102827eb831e21_1628590132592_write.sst
5_109_67_d03f5545842231b64ac143f64290e1efab5eb039773fde78f51cc6d98d01e020_1628590132732_write.sst
5_109_67_d0a5865efa1d90636cf0c26e991e1f13ad4ef35c631530b967aa1e342d73208b_1628590132582_write.sst
5_109_67_de8304485bb5b1def0f07f35f376594d100067cec03a21a00f157e91b6148940_1628590132691_write.sst
5_109_67_fd5ee7729668dbddb4eefa496199dcc1b713d020fc184b9383d249a0f7e83028_1628590132606_write.sst
backup.lock
backupmeta

Fifth incremental (the timestamp must come from incr4, the previous backup):

 LAST_BACKUP_TS=`br validate decode --field="end-version" -s local:///tmp/backupdata/incr4 | tail -n1` 
 echo $LAST_BACKUP_TS
426925131624611844

Run the backup:

br backup full --pd "127.0.0.1:2379" -s local:///tmp/backupdata/incr5 --lastbackupts ${LAST_BACKUP_TS}

Output:

Detail BR log in /tmp/br.log.2021-08-10T18.10.09+0800 
Full backup <------------------------------------------------------------------------------------> 100.00%
Checksum <---------------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:10:10.122 +08:00] [INFO] [collector.go:66] ["Full backup success summary"] [total-ranges=6] [ranges-succeed=6] [ranges-failed=0] [backup-checksum=735.929µs] [backup-total-ranges=51] [backup-total-regions=51] [total-take=161.498529ms] ["backup data size(after compressed)"=37.21kB] [BackupTS=426925151993200642] [total-kv=2595] [total-kv-size=124.7kB] [average-speed=1.256MB/s]

List the backup files:

ls /tmp/backupdata/incr5

Output:

1_2_132_9198af8c6c286b502c94a3d4f0b6ac443cb6c343c39043b53a3ed54815b9a5fc_1628590210101_write.sst
5_109_67_36adb8cedcd7af34708edff520499e712e2cfdcb202f5707dc9305a031d55a98_1628590210028_write.sst
5_109_67_755836e42bfb568bf6e89d931d86a7d31453a1bd6f7ffb924e19eb28be74aa00_1628590210034_write.sst
5_109_67_759f1781e3eb9e6aae280a9aa33f2cb97d799e11265ff74363f00f942787a311_1628590210039_write.sst
5_109_67_7c29d2d214a7a70e5c20091535dc0515fe9e2d8c4dffbfad1e3ca1fa7decdc14_1628590210036_write.sst
5_109_67_d0a5865efa1d90636cf0c26e991e1f13ad4ef35c631530b967aa1e342d73208b_1628590210044_write.sst
backup.lock
backupmeta

Check the database:

mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    36036 |
+----------+
1 row in set (0.03 sec)

Incremental restore

Restoring incremental data with BR works exactly like a full restore. Note, however, that before restoring an increment you must ensure that all data backed up before its last backup ts has already been restored into the target cluster.
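That ordering rule can be applied mechanically: restore the full backup first, then the increments in ascending BackupTS order. A hedged helper (illustrative; it assumes you recorded each backup's BackupTS from the br logs rather than parsing backupmeta):

```python
def restore_order(backups):
    """backups: iterable of (path, backup_ts) tuples.

    Returns the paths in the order they should be restored:
    ascending BackupTS, which puts the full backup first.
    """
    return [path for path, ts in sorted(backups, key=lambda b: b[1])]

# Using the BackupTS values recorded above:
order = restore_order([
    ("/tmp/backupdata/incr2", 426925096693923847),
    ("/tmp/backupdata",       426925055851888642),  # the full backup
    ("/tmp/backupdata/incr1", 426925083796439045),
])
print(order)  # full backup first, then incr1, then incr2
```

Each returned path is then fed to br restore full --storage "local://<path>" in turn.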

First delete the data:

mysql>  drop table br_table;
Query OK, 0 rows affected (0.22 sec)

Create the table:

mysql> create table br_table(id int primary key,c varchar(128),ctime timestamp); 
Query OK, 0 rows affected (0.13 sec)
Restore the incremental data

Restore the increments in backup order: incr1 through incr5.

Command:

br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr1"

Result:

Detail BR log in /tmp/br.log.2021-08-10T18.13.01+0800 
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:13:02.615 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=165.914059ms] [restore-checksum=13.408µs] [restore-ranges=1] [total-take=1.084097852s] [total-kv-size=410kB] [average-speed=4.322MB/s] ["restore data size(after decompressed)"=92.78kB] [total-kv=8541]

Check the database:

mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|     8541 |
+----------+
1 row in set (0.02 sec)

Command:

br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr2"

Result:

Detail BR log in /tmp/br.log.2021-08-10T18.14.19+0800 
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:14:19.846 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=1.90882ms] [restore-checksum=15.614µs] [restore-ranges=1] [total-take=228.696715ms] ["restore data size(after decompressed)"=44.73kB] [total-kv=4059] [total-kv-size=194.8kB] [average-speed=2.883MB/s]

Check the database:

mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    12600 |
+----------+
1 row in set (0.01 sec)

Command:

br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr3"

Result:

Detail BR log in /tmp/br.log.2021-08-10T18.15.09+0800 
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:15:09.809 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=2.769111ms] [restore-checksum=1.122126ms] [restore-ranges=1] [total-take=159.917469ms] ["restore data size(after decompressed)"=33kB] [total-kv=2895] [total-kv-size=139kB] [average-speed=3.812MB/s]

Check the database:

mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    15495 |
+----------+
1 row in set (0.01 sec)

Command:

br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr4"

Result:

Detail BR log in /tmp/br.log.2021-08-10T18.15.34+0800 
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:15:35.175 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [restore-checksum=14.727µs] [split-region=3.20808ms] [restore-ranges=1] [total-take=185.808088ms] ["restore data size(after decompressed)"=78.05kB] [total-kv=7075] [total-kv-size=339.6kB] [average-speed=7.533MB/s]

Check the database:

mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    22570 |
+----------+
1 row in set (0.00 sec)

Command:

 br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr5"

Result:

Detail BR log in /tmp/br.log.2021-08-10T18.16.26+0800 
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:16:26.324 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [restore-checksum=13.012µs] [split-region=7.31328ms] [restore-ranges=1] [total-take=184.869222ms] ["restore data size(after decompressed)"=29.43kB] [total-kv=2587] [total-kv-size=124.2kB] [average-speed=3.314MB/s]

Check the database:

mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    25157 |
+----------+
1 row in set (0.01 sec)

Full restore command:

br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata"

Result:

Detail BR log in /tmp/br.log.2021-08-10T18.17.42+0800 
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:17:42.609 +08:00] [INFO] [collector.go:66] ["Full restore failed summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [restore-checksum=16.228805ms] [split-region=3.13305ms] [restore-ranges=1]
Error: failed to validate checksum: [BR:Restore:ErrRestoreChecksumMismatch]restore checksum mismatch

Check the database:

mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    36036 |
+----------+
1 row in set (0.02 sec)

The data was restored, but br also reported an error.

Delete all the data again and restore strictly in backup order:

mysql> drop table br_table;
Query OK, 0 rows affected (0.32 sec)

Create the table:

mysql> create table br_table(id int primary key,c varchar(128),ctime timestamp); 
Query OK, 0 rows affected (0.13 sec)

Restore order: full -> incr1 -> incr2 -> incr3 -> incr4 -> incr5

full:

br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata"

Output:

Detail BR log in /tmp/br.log.2021-08-10T18.18.28+0800 
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:18:29.163 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=129.336319ms] [restore-checksum=24.402545ms] [restore-ranges=1] [total-take=841.561304ms] ["restore data size(after decompressed)"=112.2kB] [total-kv=10879] [total-kv-size=511.1kB] [average-speed=8.273MB/s]

Check the database:

mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    10879 |
+----------+
1 row in set (0.02 sec)

Incr1:

br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr1"

Output:

Detail BR log in /tmp/br.log.2021-08-10T18.19.22+0800 
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:19:22.831 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=740.957µs] [restore-checksum=14.175µs] [restore-ranges=1] [total-take=153.847127ms] [total-kv=8541] [total-kv-size=410kB] [average-speed=7.223MB/s] ["restore data size(after decompressed)"=92.78kB]
mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    19420 |
+----------+
1 row in set (0.00 sec)


Incr2:

 br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr2"

Output:

Detail BR log in /tmp/br.log.2021-08-10T18.20.05+0800 
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:20:06.098 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=1.181962ms] [restore-checksum=28.557µs] [restore-ranges=1] [total-take=152.04905ms] ["restore data size(after decompressed)"=44.73kB] [total-kv=4059] [total-kv-size=194.8kB] [average-speed=5.707MB/s]
mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    23479 |
+----------+
1 row in set (0.01 sec)

Incr3:

br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr3"

Output:

Detail BR log in /tmp/br.log.2021-08-10T18.20.31+0800 
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:20:31.562 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=1.830579ms] [restore-checksum=16.857µs] [restore-ranges=1] [total-take=181.453478ms] ["restore data size(after decompressed)"=33kB] [total-kv=2895] [total-kv-size=139kB] [average-speed=4.243MB/s]
mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    26374 |
+----------+
1 row in set (0.00 sec)


Incr4:

br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr4"

Output:

Detail BR log in /tmp/br.log.2021-08-10T18.21.05+0800 
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:21:05.900 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=788.923µs] [restore-checksum=13.061µs] [restore-ranges=1] [total-take=149.491308ms] ["restore data size(after decompressed)"=78.05kB] [total-kv=7075] [total-kv-size=339.6kB] [average-speed=7.096MB/s]

Check the database:

mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    33449 |
+----------+
1 row in set (0.04 sec)

Incr5:

br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr5"

Output:

Detail BR log in /tmp/br.log.2021-08-10T18.21.42+0800 
Full restore <-----------------------------------------------------------------------------------> 100.00%
[2021/08/10 18:21:42.984 +08:00] [INFO] [collector.go:66] ["Full restore success summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=973.306µs] [restore-checksum=13.075µs] [restore-ranges=1] [total-take=154.392711ms] [total-kv=2587] [total-kv-size=124.2kB] [average-speed=4.41MB/s] ["restore data size(after decompressed)"=29.43kB]

Check the database:

mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    36036 |
+----------+
1 row in set (0.01 sec)

The restore is complete.

Now let's shuffle the order:

full -> 5 -> 4 -> 3 -> 1 -> 2

First drop the table:

mysql> drop table br_table;
Query OK, 0 rows affected (0.24 sec)

Create the table:

mysql> create table br_table(id int primary key,c varchar(128),ctime timestamp); 
Query OK, 0 rows affected (0.13 sec)

Run the restores:

br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr5"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr4"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr3"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr1"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr2"

Check the database again:

mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    36036 |
+----------+
1 row in set (0.00 sec)

The restore succeeded.

Look at the row data:

mysql> select * from br_table;

No out-of-order data was observed.

Now try 5 -> 4 -> 3 -> 1 -> 2 -> full:

First drop the table:

mysql> drop table br_table;
Query OK, 0 rows affected (0.24 sec)

Create the table:

mysql> create table br_table(id int primary key,c varchar(128),ctime timestamp); 
Query OK, 0 rows affected (0.13 sec)

Run the restores:

br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr5"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr4"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr3"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr1"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata/incr2"
br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata"

When it reached br restore full --pd "127.0.0.1:2379" --storage "local:///tmp/backupdata", the following error appeared:

Detail BR log in /tmp/br.log.2021-08-11T10.32.26+0800 
Full restore <----------------------------------------------------------------------------------------> 100.00%
[2021/08/11 10:32:26.592 +08:00] [INFO] [collector.go:66] ["Full restore failed summary"] [total-ranges=2] [ranges-succeed=2] [ranges-failed=0] [split-region=1.305227ms] [restore-checksum=21.161653ms] [restore-ranges=1]
Error: failed to validate checksum: [BR:Restore:ErrRestoreChecksumMismatch]restore checksum mismatch

Check the database again:

mysql>  select count(*) from br_table;
+----------+
| count(*) |
+----------+
|    36036 |
+----------+
1 row in set (0.00 sec)

The restore succeeded.

Look at the row data:

mysql> select * from br_table;

No out-of-order data was observed.

Conclusion: by my own (admittedly non-rigorous) testing, restoring without applying the full backup first triggers the checksum error above, although the data itself is still restored successfully.

Warning:

Although system tables (e.g. mysql.tidb) can be backed up and restored with BR, some of them may end up in an unexpected state after a restore. Known issues:

  • Statistics tables (mysql.stat_*) cannot be restored.
  • System variable tables (mysql.tidb, mysql.global_variables) cannot be restored.
  • User information tables (mysql.user, mysql.columns_priv, etc.) cannot be restored.
  • GC data cannot be restored.

Restoring system tables may hit further compatibility problems. To avoid surprises, do not restore system tables in production.

Open question:

1. When the GC interval is shorter than the interval between incremental backups, is the incremental backup data still meaningful?
