openGauss数据库备份与恢复实践

准备

数据库版本:

openGauss-All-6.0.1-openEuler22.03-x86_64

数据库:

testdb1

参数设置:

gs_guc set -I all -N all -c "enable_cbm_tracking=on"

gs_guc set -I all -N all -c "synchronous_commit=off"

gs_guc set -N all -I all -c "archive_mode=on"

gs_guc set -N all -I all -c "archive_dest='/opt/software/openGauss/install/archive'"

gs_guc set -N all -I all -c "wal_sender_timeout =10s"

gs_guc set -N all -I all -c "archive_timeout =300s"

表:

表初始数据:

开始备份

初始化备份目录:

如果是首次备份,必须初始化一次备份目录,以后备份,就不再需要初始化目录了

执行:

#gs_probackup init -B /opt/software/openGauss/install/backup/bak1

添加一个新的备份实例:

如果是首次备份,必须先增加一个备份实例,以后备份可以在这个实例上进行备份

执行:

#gs_probackup add-instance -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1

进行全量备份:

执行:

#gs_probackup backup -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -b full

[omm@hostName1 data]$ gs_probackup backup -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -b full
INFO: Backup start, gs_probackup version: 2.4.2, instance: bak20250605_1, backup ID: SXDEOH, backup mode: FULL, wal mode: STREAM, remote: false, compress-algorithm: none,compress-level: 1
LOG: Backup destination is initialized
LOG: This openGauss instance was initialized with data block checksums. Data block corruption will be detected
LOG: Database backup start
LOG: started streaming WAL at 0/8000000 (timeline 1)
[2025-06-05 14:43:30]: check identify system success
[2025-06-05 14:43:30]: send START_REPLICATION 0/8000000 success
[2025-06-05 14:43:30]: keepalive message is received
INFO: PGDATA size: 639MB
[2025-06-05 14:43:30]: keepalive message is received
INFO: Start backing up files
LOG: Creating page header map "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXDEOH/page_header_map"
[2025-06-05 14:43:35]: keepalive message is received
Progress: [==================================================] 100% (3210/3210, done_files/total_files). backup file
[2025-06-05 14:43:40]: keepalive message is received
[2025-06-05 14:43:45]: keepalive message is received
INFO: Finish backuping file
INFO: Data files are transferred, time elapsed: 15s
INFO: wait for pg_stop_backup()
INFO: pg_stop backup() successfully executed
LOG: stop_lsn: 0/80001E8
LOG: Looking for LSN 0/80001E8 in segment: 000000010000000000000008
LOG: Found WAL segment: /opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXDEOH/database/pg_xlog/000000010000000000000008
LOG: Thread [0]: Opening WAL segment "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXDEOH/database/pg_xlog/000000010000000000000008"
LOG: Found LSN: 0/80001E8
LOG: finished streaming WAL at 0/9000000 (timeline 1)
LOG: Getting the Recovery Time from WAL
LOG: Thread [0]: Opening WAL segment "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXDEOH/database/pg_xlog/000000010000000000000008"
INFO: Syncing backup files to disk
Progress: [==================================================] 100% (3210/3210, done_files/total_files). Sync backup file
INFO: Finish Syncing backup files.
INFO: Backup files are synced, time elapsed: 0
INFO: Validating backup SXDEOH
INFO: Begin validate file
Progress: [==================================================] 100% (3212/3212, done_files/total_files). validate file
INFO: Finish validate file.
INFO: Backup SXDEOH data files are valid
INFO: Backup SXDEOH resident size: 655MB
INFO: Backup SXDEOH completed
[omm@hostName1 data]$

备份成功后返回备份ID:SXDEOH

查看备份信息:

执行:

#gs_probackup show -B /opt/software/openGauss/install/backup/bak1

检查备份:

执行:

#gs_probackup validate -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -i SXDEOH 

查看备份目录:

备份目录中已经有备份的文件信息了

模拟业务删除数据

delete表:t_1的数据

truncate表: t_2

drop表:t_3

切换wal日志:

执行:

#select pg_xlogfile_name(pg_switch_xlog());

作用:

WAL位置转换:将LSN(Log Sequence Number)转换为实际的WAL文件名
例如将0/156DF40转换为000000010000000000000015

WAL文件识别:帮助管理员快速定位特定事务或操作对应的物理WAL文件

#checkpoint;

作用:

数据持久化保证:将内存中的脏页(Dirty Page)写入磁盘,确保在检查点之前的所有事务修改都已持久化到存储设备;

崩溃恢复基础:作为数据库恢复的起始点,崩溃后只需重放检查点之后的WAL日志;

WAL日志管理:标记哪些WAL日志可以被回收或删除,避免WAL日志无限增长

记录日志号:

#select pg_switch_xlog();

作用:

强制切换WAL日志文件:立即结束当前WAL日志文件,创建一个新的WAL日志文件继续写入
触发检查点(Checkpoint):间接触发检查点操作,确保所有已提交事务的数据页被刷新到磁盘

在备份前执行以确保WAL文件完整性

日志号(LSN)为:0/A000288

记录事务ID(XID):

执行:

#select * from pg_get_variable_info();

其中:latest_completed_xid字段值为事务ID:92739

模拟故障

停止数据库:

执行:

#gs_ctl stop

删除数据目录:

为保险起见,使用mv命令:

进入数据库安装目录:

#cd /opt/software/openGauss/install

#mv data data_bak20250605_1

启动数据库:

执行:

#gs_ctl start

发现启动失败

 

开始恢复

全量恢复

使用restore子命令前,应先停止gaussdb进程

执行:

#gs_probackup restore -B  /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -i SXDEOH

[omm@hostName1 install]$ gs_probackup restore -B  /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -i SXDEOH
LOG: Restore begin.
LOG: there is no file tablespace_map
LOG: check tablespace directories of backup SXDEOH
LOG: check external directories of backup SXDEOH
INFO: Validating backup SXDEOH
INFO: Begin validate file
Progress: [==================================================] 100% (3212/3212, done_files/total_files). validate file
INFO: Finish validate file.
INFO: Backup SXDEOH data files are valid
LOG: Thread [1]: Opening WAL segment "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXDEOH/database/pg_xlog/000000010000000000000008"
INFO: Backup SXDEOH WAL segments are valid
INFO: Backup SXDEOH is valid.
INFO: Restoring the database from backup at 2025-06-05 14:43:29+08
LOG: there is no file tablespace_map
LOG: Restore directories and symlinks... in /opt/software/openGauss/install/data/dn1
INFO: Start restoring backup files. DATA size: 655MB
INFO: Begin restore file
Progress: [==================================================] 100% (3212/3212, done_files/total_files). Restore file
INFO: Finish restore file
INFO: Backup files are restored. Transfered bytes: 655MB, time elapsed: 3s
INFO: Restore incremental ratio (less is better): 100% (655MB/655MB)
INFO: Start Syncing restored files to disk
Progress: [==================================================] 100% (3212/3212, done_files/total_files). Sync restore file
INFO: Finish Syncing restored files.
INFO: Restored backup files are synced, time elapsed: 6s
INFO: Restore of backup SXDEOH completed.
[omm@hostName1 install]$

查看数据目录是否恢复

#ll

 数据目录已经恢复了

配置恢复文件:

在数据目录:/opt/software/openGauss/install/data/dn1

增加recovery.conf文件:

如果想基于时间点恢复,可配置为:

recovery_target_time = '2025-06-05 14:12:06'
recovery_target_inclusive = false
restore_command = 'cp /opt/software/openGauss/install/archive/%f %p'

如果想基于事务ID恢复,可配置为:

recovery_target_xid = '92739'
recovery_target_inclusive = true
restore_command = 'cp /opt/software/openGauss/install/archive/%f %p'

如果想基于LSN恢复,可配置为:

recovery_target_lsn = '4FD/13021790'
recovery_target_inclusive = true
restore_command = 'cp /opt/software/openGauss/install/archive/%f %p'

如果想基于还原点恢复,可配置为:

recovery_target_name = 'restore_point_1'
restore_command = 'cp /opt/software/openGauss/install/archive/%f %p'

注意:

1)如果不配置任何恢复目标 或 配置目标不存在,则默认恢复到最新的WAL日志点。

2)recovery_target_inclusive:声明是否在指定恢复目标之后停止(true) 或 之前停止(false),不支持recovery_target_name 配置 

3)还原点可以通过pg_create_restore_point函数创建,例如:

#select pg_create_restore_point('restore_point_1');

注:启动数据库后recovery.conf文件名称变为recovery.done

只有配置了恢复文件recovery.conf变成recovery.done,本次恢复才算结束

启动数据库:

执行:

#gs_ctl start

数据库可以启动了!

启动数据库后recovery.conf文件名称变为recovery.done

只有配置了恢复文件recovery.conf变成recovery.done,本次恢复才算结束

进入数据库查看数据是否恢复:

delete表:t_1表数据全部恢复 

truncate表:t_2表数据全部恢复

drop的表:t_3居然没有恢复过来!好奇怪!

检测恢复状态:

执行:

#SELECT pg_is_in_recovery();

返回结果:t表示仍在恢复模式(只读),f表示已完成

如果返回的是t,即数据库在恢复状态中,则数据库则是只读状态

结束恢复:

如果确认恢复没有问题,即达到了恢复目标,可以执行下面命令,使机器对外提供读写服务:

#select pg_xlog_replay_resume();

再次查看恢复状态:

#SELECT pg_is_in_recovery();

向表插入一条数据,验证一下:

至此,恢复完成!

删除备份:

查看备份信息:

#gs_probackup show -B /opt/software/openGauss/install/backup/bak1

删除备份:

把备份失败的备份删除,执行:

#gs_probackup delete -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -i SXEY2P

再次查看:

#gs_probackup show -B /opt/software/openGauss/install/backup/bak1

注意

在删除全量备份时,如果基于该全量备份有增量备份,那么对应的增量备份也会一起删除

定时任务后台执行备份与恢复:

在实际生产应用中,需要定期对数据库做备份,需要使用到crontab定时任务

数据库数据量比较大,备份、恢复时间都比较长,直接执行备份命令会超时,也可以采用定时任务方式执行备份、恢复,

定时任务执行备份:

例如:

执行命令

#crontab -e

进入定时任务编辑,增加备份任务:

30 10 * * * source /home/omm/.bashrc;/opt/software/openGauss/install/app/bin/gs_probackup backup -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -b full -t 3000

说明:

50 10 * * *  表示每天10:50开始执行备份

source /home/omm/.bashrc 导入omm用户环境变量

定时任务执行日志,可以在/var/log目录下的cron文件中查看或cron-年月日(例如:cron-20250701)

到达定时任务指定的时间后,查看备份信息:

#gs_probackup show -B /opt/software/openGauss/install/backup/bak1

定时任务执行完以后,查看定时任务日志:

#vi /var/log/cron

也可以使用nohup命令后台执行备份避免超时;

nohup gs_probackup backup -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -b full > /opt/software/openGauss/install/backup/backup_20250206_3.log 2>&1 &

定时任务执行恢复:

恢复数据库时,一般不会用crontab定时任务,但数据库数据非常大时,恢复实践比较长,才用用到crontab定时任务来做恢复,注意配置crontab定时任务时,一定要配置一个一次性的,千万别搞成了定期执行了

配置crontab任务:

执行:

#crontab -e

增加:

15 11 * * * source /home/omm/.bashrc;/opt/software/openGauss/install/app/bin/gs_probackup restore -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -i SXEYJE

等待定时任务指定的时间后,启动数据库,查看恢复情况

增量备份:

在进行增量备份之前,必须至少创建一次全备

执行增量备份:

#gs_probackup backup -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -b PTRACK

[omm@hostName1 ~]$ gs_probackup backup -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -b PTRACK
INFO: Backup start, gs_probackup version: 2.4.2, instance: bak20250605_1, backup ID: SXF9E5, backup mode: PTRACK, wal mode: STREAM, remote: false, compress-algorithm: none, compress-level: 1
LOG: Backup destination is initialized
LOG: This openGauss instance was initialized with data block checksums. Data block corruption will be detected
LOG: Database backup start
LOG: Latest valid FULL backup: SXEYJE
INFO: Parent backup: SXEYJE
LOG: started streaming WAL at 0/22000000 (timeline 1)
[2025-06-06 14:44:29]: check identify system success
[2025-06-06 14:44:29]: send START_REPLICATION 0/22000000 success
[2025-06-06 14:44:29]: keepalive message is received
INFO: PGDATA size: 639MB
LOG: Current tli: 1
LOG: Parent start_lsn: 0/1A000028
LOG: start_lsn: 0/22000028
INFO: Extracting pagemap of changed blocks
INFO: change bitmap start lsn location is 0/1A000028
INFO: change bitmap end lsn location is 00000000/22000028
INFO: Pagemap successfully extracted, time elapsed: 0 sec
INFO: Start backing up files
LOG: Creating page header map "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXF9E5/page_header_map"
[2025-06-06 14:44:30]: keepalive message is received
[2025-06-06 14:44:35]: keepalive message is received
Progress: [==================================================] 100% (3217/3217, done_files/total_files). backup file
INFO: Finish backuping file
INFO: Data files are transferred, time elapsed: 10s
[2025-06-06 14:44:40]: keepalive message is received
INFO: wait for pg_stop_backup()
INFO: pg_stop backup() successfully executed
LOG: stop_lsn: 0/220001E8
LOG: Looking for LSN 0/220001E8 in segment: 000000010000000000000022
LOG: Found WAL segment: /opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXF9E5/database/pg_xlog/000000010000000000000022
LOG: Thread [0]: Opening WAL segment "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXF9E5/database/pg_xlog/000000010000000000000022"
LOG: Found LSN: 0/220001E8
LOG: finished streaming WAL at 0/23000000 (timeline 1)
LOG: Getting the Recovery Time from WAL
LOG: Thread [0]: Opening WAL segment "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXF9E5/database/pg_xlog/000000010000000000000022"
INFO: Syncing backup files to disk
Progress: [==================================================] 100% (3217/3217, done_files/total_files). Sync backup file
INFO: Finish Syncing backup files.
INFO: Backup files are synced, time elapsed: 0
INFO: Validating backup SXF9E5
INFO: Begin validate file
Progress: [==================================================] 100% (3219/3219, done_files/total_files). validate file
INFO: Finish validate file.
INFO: Backup SXF9E5 data files are valid
INFO: Backup SXF9E5 resident size: 589MB
INFO: Backup SXF9E5 completed
[omm@hostName1 ~]$
[omm@hostName1 ~]$

查看备份信息:

#gs_probackup show -B /opt/software/openGauss/install/backup/bak1

查看当前LSN:

#select pg_current_xlog_location();

模拟数据丢失:

删除数据目录文件:

执行:

#mv data data_2

从增量备份中恢复:

执行:

#gs_probackup restore -B  /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -i SXF9E5 

增加恢复配置文件:recovery.conf

配置内容:

recovery_target_lsn = '0/240003A8'
recovery_target_inclusive = true
restore_command = 'cp /opt/software/openGauss/install/archive/%f %p'

启动数据库:

#gs_om -t start

进入数据库验证数据恢复情况:

 

 数据恢复OK!

检测恢复状态:

执行:

#SELECT pg_is_in_recovery();

 结束恢复:

如果确认恢复没有问题,即达到了恢复目标,可以执行下面命令,使机器对外提供读写服务:

#select pg_xlog_replay_resume();

再次查看恢复状态:

#SELECT pg_is_in_recovery();

向表写数据验证数据库是否可读写:

集群恢复:

本测试是单节点数据库备份与恢复测试 ,如果是多节点,比如一主两备,则在主节点做备份,备份方法和单节点一致;但在恢复时,需要先恢复主节点,再恢复备节点,大致方法如下:

停掉主节点:

在主节点执行:

#gs_ctl stop -D /opt/software/openGauss/install/data -M primary

删除备节点数据目录:

在每个备节点执行:

#mv data data_bak

然后创建一个空的data目录

#mkdir data

将主节点数据目录copy到备节点:

在每个备节点执行:

#scp -r 192.xx.xx.xx:/opt/software/openGauss/install/data/* /opt/software/openGauss/install/data/

192.xx.xx.xx为主节点IP

如果有多个备节点,上述步骤要在每个备节点都要执行

将原备节点data目录下的配置文件copy过来:

在每个备节点执行:

#cp /opt/software/openGauss/install/data_bak/dn1/*.conf /opt/software/openGauss/install/data/dn1/

重启数据库集群:

执行:

#cm_ctl start

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值