准备
数据库版本:
openGauss-All-6.0.1-openEuler22.03-x86_64
数据库:
testdb1
参数设置:
gs_guc set -I all -N all -c "enable_cbm_tracking=on"
gs_guc set -I all -N all -c "synchronous_commit=off"
gs_guc set -N all -I all -c "archive_mode=on"
gs_guc set -N all -I all -c "archive_dest='/opt/software/openGauss/install/archive'"
gs_guc set -N all -I all -c "wal_sender_timeout =10s"
gs_guc set -N all -I all -c "archive_timeout =300s"
表:

表初始数据:



开始备份
初始化备份目录:
如果是首次备份,必须初始化一次备份目录,以后备份,就不再需要初始化目录了
执行:
#gs_probackup init -B /opt/software/openGauss/install/backup/bak1

添加一个新的备份实例:
如果是首次备份,必须先增加一个备份实例,以后备份可以在这个实例上进行备份
执行:
#gs_probackup add-instance -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1

进行全量备份:
执行:
#gs_probackup backup -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -b full
[omm@hostName1 data]$ gs_probackup backup -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -b full
INFO: Backup start, gs_probackup version: 2.4.2, instance: bak20250605_1, backup ID: SXDEOH, backup mode: FULL, wal mode: STREAM, remote: false, compress-algorithm: none,compress-level: 1
LOG: Backup destination is initialized
LOG: This openGauss instance was initialized with data block checksums. Data block corruption will be detected
LOG: Database backup start
LOG: started streaming WAL at 0/8000000 (timeline 1)
[2025-06-05 14:43:30]: check identify system success
[2025-06-05 14:43:30]: send START_REPLICATION 0/8000000 success
[2025-06-05 14:43:30]: keepalive message is received
INFO: PGDATA size: 639MB
[2025-06-05 14:43:30]: keepalive message is received
INFO: Start backing up files
LOG: Creating page header map "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXDEOH/page_header_map"
[2025-06-05 14:43:35]: keepalive message is received
Progress: [==================================================] 100% (3210/3210, done_files/total_files). backup file
[2025-06-05 14:43:40]: keepalive message is received
[2025-06-05 14:43:45]: keepalive message is received
INFO: Finish backuping file
INFO: Data files are transferred, time elapsed: 15s
INFO: wait for pg_stop_backup()
INFO: pg_stop backup() successfully executed
LOG: stop_lsn: 0/80001E8
LOG: Looking for LSN 0/80001E8 in segment: 000000010000000000000008
LOG: Found WAL segment: /opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXDEOH/database/pg_xlog/000000010000000000000008
LOG: Thread [0]: Opening WAL segment "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXDEOH/database/pg_xlog/000000010000000000000008"
LOG: Found LSN: 0/80001E8
LOG: finished streaming WAL at 0/9000000 (timeline 1)
LOG: Getting the Recovery Time from WAL
LOG: Thread [0]: Opening WAL segment "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXDEOH/database/pg_xlog/000000010000000000000008"
INFO: Syncing backup files to disk
Progress: [==================================================] 100% (3210/3210, done_files/total_files). Sync backup file
INFO: Finish Syncing backup files.
INFO: Backup files are synced, time elapsed: 0
INFO: Validating backup SXDEOH
INFO: Begin validate file
Progress: [==================================================] 100% (3212/3212, done_files/total_files). validate file
INFO: Finish validate file.
INFO: Backup SXDEOH data files are valid
INFO: Backup SXDEOH resident size: 655MB
INFO: Backup SXDEOH completed
[omm@hostName1 data]$
备份成功后返回备份ID:SXDEOH
查看备份信息:
执行:
#gs_probackup show -B /opt/software/openGauss/install/backup/bak1

检查备份:
执行:
#gs_probackup validate -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -i SXDEOH

查看备份目录:

备份目录中已经有备份的文件信息了
模拟业务删除数据
delete表:t_1的数据

truncate表: t_2

drop表:t_3

切换wal日志:
执行:
#select pg_xlogfile_name(pg_switch_xlog());
作用:
WAL位置转换:将LSN(Log Sequence Number)转换为实际的WAL文件名
例如将0/156DF40转换为000000010000000000000015
WAL文件识别:帮助管理员快速定位特定事务或操作对应的物理WAL文件
#checkpoint;
作用:
数据持久化保证:将内存中的脏页(Dirty Page)写入磁盘,确保在检查点之前的所有事务修改都已持久化到存储设备;
崩溃恢复基础:作为数据库恢复的起始点,崩溃后只需重放检查点之后的WAL日志;
WAL日志管理:标记哪些WAL日志可以被回收或删除,避免WAL日志无限增长

记录日志号:
#select pg_switch_xlog();
作用:
强制切换WAL日志文件:立即结束当前WAL日志文件,创建一个新的WAL日志文件继续写入
触发检查点(Checkpoint):间接触发检查点操作,确保所有已提交事务的数据页被刷新到磁盘
在备份前执行以确保WAL文件完整性

日志号(LSN)为:0/A000288
记录事务ID(XID):
执行:
#select * from pg_get_variable_info();


其中:latest_completed_xid字段值为事务ID:92739
模拟故障
停止数据库:
执行:
#gs_ctl stop

删除数据目录:
为保险起见,使用mv命令:
进入数据库安装目录:
#cd /opt/software/openGauss/install
#mv data data_bak20250605_1

启动数据库:
执行:
#gs_ctl start
发现启动失败

开始恢复
全量恢复
使用restore子命令前,应先停止gaussdb进程
执行:
#gs_probackup restore -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -i SXDEOH
[omm@hostName1 install]$ gs_probackup restore -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -i SXDEOH
LOG: Restore begin.
LOG: there is no file tablespace_map
LOG: check tablespace directories of backup SXDEOH
LOG: check external directories of backup SXDEOH
INFO: Validating backup SXDEOH
INFO: Begin validate file
Progress: [==================================================] 100% (3212/3212, done_files/total_files). validate file
INFO: Finish validate file.
INFO: Backup SXDEOH data files are valid
LOG: Thread [1]: Opening WAL segment "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXDEOH/database/pg_xlog/000000010000000000000008"
INFO: Backup SXDEOH WAL segments are valid
INFO: Backup SXDEOH is valid.
INFO: Restoring the database from backup at 2025-06-05 14:43:29+08
LOG: there is no file tablespace_map
LOG: Restore directories and symlinks... in /opt/software/openGauss/install/data/dn1
INFO: Start restoring backup files. DATA size: 655MB
INFO: Begin restore file
Progress: [==================================================] 100% (3212/3212, done_files/total_files). Restore file
INFO: Finish restore file
INFO: Backup files are restored. Transfered bytes: 655MB, time elapsed: 3s
INFO: Restore incremental ratio (less is better): 100% (655MB/655MB)
INFO: Start Syncing restored files to disk
Progress: [==================================================] 100% (3212/3212, done_files/total_files). Sync restore file
INFO: Finish Syncing restored files.
INFO: Restored backup files are synced, time elapsed: 6s
INFO: Restore of backup SXDEOH completed.
[omm@hostName1 install]$
查看数据目录是否恢复
#ll

数据目录已经恢复了
配置恢复文件:
在数据目录:/opt/software/openGauss/install/data/dn1
增加recovery.conf文件:
如果想基于时间点恢复,可配置为:
recovery_target_time = '2025-06-05 14:12:06'
recovery_target_inclusive = false
restore_command = 'cp /opt/software/openGauss/install/archive/%f %p'
如果想基于事务ID恢复,可配置为:
recovery_target_xid = '92739'
recovery_target_inclusive = true
restore_command = 'cp /opt/software/openGauss/install/archive/%f %p'
如果想基于LSN恢复,可配置为:
recovery_target_lsn = '4FD/13021790'
recovery_target_inclusive = true
restore_command = 'cp /opt/software/openGauss/install/archive/%f %p'
如果想基于还原点恢复,可配置为:
recovery_target_name = 'restore_point_1'
restore_command = 'cp /opt/software/openGauss/install/archive/%f %p'
注意:
1)如果不配置任何恢复目标 或 配置目标不存在,则默认恢复到最新的WAL日志点。
2)recovery_target_inclusive:声明是否在指定恢复目标之后停止(true) 或 之前停止(false),不支持recovery_target_name 配置
3)还原点可以通过pg_create_restore_point函数创建,例如:
#select pg_create_restore_point('restore_point_1');

注:启动数据库后recovery.conf文件名称变为recovery.done
只有配置了恢复文件recovery.conf变成recovery.done,本次恢复才算结束
启动数据库:
执行:
#gs_ctl start

数据库可以启动了!

启动数据库后recovery.conf文件名称变为recovery.done
只有配置了恢复文件recovery.conf变成recovery.done,本次恢复才算结束
进入数据库查看数据是否恢复:

delete表:t_1表数据全部恢复
truncate表:t_2表数据全部恢复
drop的表:t_3居然没有恢复过来!好奇怪!
检测恢复状态:
执行:
#SELECT pg_is_in_recovery();

返回结果:t表示仍在恢复模式(只读),f表示已完成
如果返回的是t,即数据库在恢复状态中,则数据库则是只读状态

结束恢复:
如果确认恢复没有问题,即达到了恢复目标,可以执行下面命令,使机器对外提供读写服务:
#select pg_xlog_replay_resume();

再次查看恢复状态:
#SELECT pg_is_in_recovery();

向表插入一条数据,验证一下:

至此,恢复完成!
删除备份:
查看备份信息:
#gs_probackup show -B /opt/software/openGauss/install/backup/bak1

删除备份:
把备份失败的备份删除,执行:
#gs_probackup delete -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -i SXEY2P

再次查看:
#gs_probackup show -B /opt/software/openGauss/install/backup/bak1

注意
在删除全量备份时,如果基于该全量备份有增量备份,那么对应的增量备份也会一起删除

定时任务后台执行备份与恢复:
在实际生产应用中,需要定期对数据库做备份,需要使用到crontab定时任务
数据库数据量比较大,备份、恢复时间都比较长,直接执行备份命令会超时,也可以采用定时任务方式执行备份、恢复,
定时任务执行备份:
例如:
执行命令
#crontab -e
进入定时任务编辑,增加备份任务:
30 10 * * * source /home/omm/.bashrc;/opt/software/openGauss/install/app/bin/gs_probackup backup -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -b full -t 3000

说明:
50 10 * * * 表示每天10:50开始执行备份
source /home/omm/.bashrc 导入omm用户环境变量
定时任务执行日志,可以在/var/log目录下的cron文件中查看或cron-年月日(例如:cron-20250701)
到达定时任务指定的时间后,查看备份信息:
#gs_probackup show -B /opt/software/openGauss/install/backup/bak1

定时任务执行完以后,查看定时任务日志:
#vi /var/log/cron

也可以使用nohup命令后台执行备份避免超时;
nohup gs_probackup backup -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -b full > /opt/software/openGauss/install/backup/backup_20250206_3.log 2>&1 &
定时任务执行恢复:
恢复数据库时,一般不会用crontab定时任务,但数据库数据非常大时,恢复实践比较长,才用用到crontab定时任务来做恢复,注意配置crontab定时任务时,一定要配置一个一次性的,千万别搞成了定期执行了
配置crontab任务:
执行:
#crontab -e
增加:
15 11 * * * source /home/omm/.bashrc;/opt/software/openGauss/install/app/bin/gs_probackup restore -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -i SXEYJE

等待定时任务指定的时间后,启动数据库,查看恢复情况
增量备份:
在进行增量备份之前,必须至少创建一次全备
执行增量备份:
#gs_probackup backup -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -b PTRACK
[omm@hostName1 ~]$ gs_probackup backup -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -b PTRACK
INFO: Backup start, gs_probackup version: 2.4.2, instance: bak20250605_1, backup ID: SXF9E5, backup mode: PTRACK, wal mode: STREAM, remote: false, compress-algorithm: none, compress-level: 1
LOG: Backup destination is initialized
LOG: This openGauss instance was initialized with data block checksums. Data block corruption will be detected
LOG: Database backup start
LOG: Latest valid FULL backup: SXEYJE
INFO: Parent backup: SXEYJE
LOG: started streaming WAL at 0/22000000 (timeline 1)
[2025-06-06 14:44:29]: check identify system success
[2025-06-06 14:44:29]: send START_REPLICATION 0/22000000 success
[2025-06-06 14:44:29]: keepalive message is received
INFO: PGDATA size: 639MB
LOG: Current tli: 1
LOG: Parent start_lsn: 0/1A000028
LOG: start_lsn: 0/22000028
INFO: Extracting pagemap of changed blocks
INFO: change bitmap start lsn location is 0/1A000028
INFO: change bitmap end lsn location is 00000000/22000028
INFO: Pagemap successfully extracted, time elapsed: 0 sec
INFO: Start backing up files
LOG: Creating page header map "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXF9E5/page_header_map"
[2025-06-06 14:44:30]: keepalive message is received
[2025-06-06 14:44:35]: keepalive message is received
Progress: [==================================================] 100% (3217/3217, done_files/total_files). backup file
INFO: Finish backuping file
INFO: Data files are transferred, time elapsed: 10s
[2025-06-06 14:44:40]: keepalive message is received
INFO: wait for pg_stop_backup()
INFO: pg_stop backup() successfully executed
LOG: stop_lsn: 0/220001E8
LOG: Looking for LSN 0/220001E8 in segment: 000000010000000000000022
LOG: Found WAL segment: /opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXF9E5/database/pg_xlog/000000010000000000000022
LOG: Thread [0]: Opening WAL segment "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXF9E5/database/pg_xlog/000000010000000000000022"
LOG: Found LSN: 0/220001E8
LOG: finished streaming WAL at 0/23000000 (timeline 1)
LOG: Getting the Recovery Time from WAL
LOG: Thread [0]: Opening WAL segment "/opt/software/openGauss/install/backup/bak1/backups/bak20250605_1/SXF9E5/database/pg_xlog/000000010000000000000022"
INFO: Syncing backup files to disk
Progress: [==================================================] 100% (3217/3217, done_files/total_files). Sync backup file
INFO: Finish Syncing backup files.
INFO: Backup files are synced, time elapsed: 0
INFO: Validating backup SXF9E5
INFO: Begin validate file
Progress: [==================================================] 100% (3219/3219, done_files/total_files). validate file
INFO: Finish validate file.
INFO: Backup SXF9E5 data files are valid
INFO: Backup SXF9E5 resident size: 589MB
INFO: Backup SXF9E5 completed
[omm@hostName1 ~]$
[omm@hostName1 ~]$
查看备份信息:
#gs_probackup show -B /opt/software/openGauss/install/backup/bak1

查看当前LSN:
#select pg_current_xlog_location();

模拟数据丢失:

删除数据目录文件:
执行:
#mv data data_2

从增量备份中恢复:
执行:
#gs_probackup restore -B /opt/software/openGauss/install/backup/bak1 --instance bak20250605_1 -i SXF9E5

增加恢复配置文件:recovery.conf
配置内容:
recovery_target_lsn = '0/240003A8'
recovery_target_inclusive = true
restore_command = 'cp /opt/software/openGauss/install/archive/%f %p'

启动数据库:
#gs_om -t start

进入数据库验证数据恢复情况:

数据恢复OK!
检测恢复状态:
执行:
#SELECT pg_is_in_recovery();

结束恢复:
如果确认恢复没有问题,即达到了恢复目标,可以执行下面命令,使机器对外提供读写服务:
#select pg_xlog_replay_resume();

再次查看恢复状态:
#SELECT pg_is_in_recovery();

向表写数据验证数据库是否可读写:

集群恢复:
本测试是单节点数据库备份与恢复测试 ,如果是多节点,比如一主两备,则在主节点做备份,备份方法和单节点一致;但在恢复时,需要先恢复主节点,再恢复备节点,大致方法如下:
停掉主节点:
在主节点执行:
#gs_ctl stop -D /opt/software/openGauss/install/data -M primary
删除备节点数据目录:
在每个备节点执行:
#mv data data_bak
然后创建一个空的data目录
#mkdir data
将主节点数据目录copy到备节点:
在每个备节点执行:
#scp -r 192.xx.xx.xx:/opt/software/openGauss/install/data/* /opt/software/openGauss/install/data/
192.xx.xx.xx为主节点IP
如果有多个备节点,上述步骤要在每个备节点都要执行
将原备节点data目录下的配置文件copy过来:
在每个备节点执行:
#cp /opt/software/openGauss/install/data_bak/dn1/*.conf /opt/software/openGauss/install/data/dn1/
重启数据库集群:
执行:
#cm_ctl start

1282

被折叠的 条评论
为什么被折叠?



