一、PITR步骤
(1)初始化数据库
[postgres@90220 pg12]$ initdb data
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "zh_CN.UTF-8".
The default database encoding has accordingly been set to "UTF8".
initdb: could not find suitable text search configuration for locale "zh_CN.UTF-8"
The default text search configuration will be set to "simple".
Data page checksums are disabled.
creating directory data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... PRC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
pg_ctl -D data -l logfile start
(2)创建归档目录
[postgres@90220 ~]$ mkdir archive
[postgres@90220 ~]$ cd archive/
[postgres@90220 archive]$ ls
[postgres@90220 archive]$ pwd
/home/postgres/archive
(3)修改归档参数
vi postgresql.conf
archive_mode = on
archive_command = 'cp %p /home/postgres/archive/%f'
(4)启动数据库服务
[postgres@90220 data]$ pg_ctl start
waiting for server to start....2019-10-25 16:58:00.137 CST [34745] LOG: starting PostgreSQL 12.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2019-10-25 16:58:00.140 CST [34745] LOG: listening on IPv6 address "::1", port 5432
2019-10-25 16:58:00.140 CST [34745] LOG: listening on IPv4 address "127.0.0.1", port 5432
2019-10-25 16:58:00.142 CST [34745] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-10-25 16:58:00.159 CST [34746] LOG: database system was shut down at 2019-10-25 16:53:45 CST
2019-10-25 16:58:00.162 CST [34745] LOG: database system is ready to accept connections
done
server started
(5)基础备份
[postgres@90220 pg12]$ pg_basebackup -Pv -Ft -Xf -D basebackup
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000060 on timeline 1
40974/40974 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2000138
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed
[postgres@90220 pg12]$
注:-Ft为输入格式为tar文件,当前命令即在当前目录创建basebackup目录,然后在该目下输入base.tar文件
(6)制造数据
[postgres@90220 pg12]$ psql
psql (12.0)
Type "help" for help.
postgres=# create table tb1 (c1 int,c2 char);
CREATE TABLE
postgres=# insert into tb1 values(2,'c');
INSERT 0 1
postgres=# select now(); --记录,数据库恢复到这个时间点time1
now
-------------------------------
2019-10-25 17:05:35.023024+08
(1 row)
postgres=# drop table tb1 ;
DROP TABLE
(7)关闭数据库服务
[postgres@90220 pg12]$ pg_ctl stop
waiting for server to shut down....2019-10-25 17:06:51.193 CST [34745] LOG: received fast shutdown request
2019-10-25 17:06:51.194 CST [34745] LOG: aborting any active transactions
2019-10-25 17:06:51.195 CST [34745] LOG: background worker "logical replication launcher" (PID 34753) exited with exit code 1
2019-10-25 17:06:51.196 CST [34747] LOG: shutting down
2019-10-25 17:06:51.398 CST [34745] LOG: database system is shut down
done
server stopped
(8)替换data
[postgres@90220 pg12]$ cd data/
[postgres@90220 data]$ rm -rf *
[postgres@90220 data]$ cp ../basebackup/base.tar .
[postgres@90220 data]$ tar xf base.tar
[postgres@90220 data]$ ls
backup_label global pg_hba.conf pg_multixact pg_serial pg_stat_tmp pg_twophase pg_xact tablespace_map
base pg_commit_ts pg_ident.conf pg_notify pg_snapshots pg_subtrans PG_VERSION postgresql.auto.conf
base.tar pg_dynshmem pg_logical pg_replslot pg_stat pg_tblspc pg_wal postgresql.conf
(9)创建标志文件
[postgres@90220 data]$ touch recovery.signal
(10)修改恢复参数
vi postgresql.conf
restore_command = 'cp /home/postgres/archive/%f %p'
recovery_target_time = '2019-10-25 17:05:35.023024+08'
recovery_target_action = 'promote'
注:recovery_targer_action控制恢复到指定目标时数据库的动作,默认是pause,表示将恢复暂停,只允许读操作,需要手动执行pg_wal_replay_resume() 恢复,设置为promote后,恢复过程完成,服务器接受连接
(11)启动数据库服务,完成恢复
[postgres@90220 data]$ pg_ctl start
waiting for server to start....2019-10-25 17:17:00.376 CST [34988] LOG: starting PostgreSQL 12.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2019-10-25 17:17:00.378 CST [34988] LOG: listening on IPv6 address "::1", port 5432
2019-10-25 17:17:00.378 CST [34988] LOG: listening on IPv4 address "127.0.0.1", port 5432
2019-10-25 17:17:00.380 CST [34988] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-10-25 17:17:00.392 CST [34989] LOG: database system was interrupted; last known up at 2019-10-25 16:59:34 CST
cp: 无法获取"/home/postgres/archive/00000002.history" 的文件状态(stat): 没有那个文件或目录
2019-10-25 17:17:00.403 CST [34989] LOG: starting point-in-time recovery to 2019-10-25 17:05:35.023024+08
2019-10-25 17:17:00.418 CST [34989] LOG: restored log file "000000010000000000000002" from archive
2019-10-25 17:17:01.036 CST [34989] LOG: redo starts at 0/2000060
2019-10-25 17:17:01.037 CST [34989] LOG: consistent recovery state reached at 0/2000138
2019-10-25 17:17:01.037 CST [34988] LOG: database system is ready to accept read only connections
done
server started
[postgres@90220 data]$ 2019-10-25 17:17:01.049 CST [34989] LOG: restored log file "000000010000000000000003" from archive
2019-10-25 17:17:01.512 CST [34989] LOG: recovery stopping before commit of transaction 488, time 2019-10-25 17:05:45.546195+08
2019-10-25 17:17:01.512 CST [34989] LOG: redo done at 0/3013528
2019-10-25 17:17:01.512 CST [34989] LOG: last completed transaction was at log time 2019-10-25 17:05:33.544826+08
cp: 无法获取"/home/postgres/archive/00000002.history" 的文件状态(stat): 没有那个文件或目录
2019-10-25 17:17:01.522 CST [34989] LOG: selected new timeline ID: 2
2019-10-25 17:17:01.944 CST [34989] LOG: archive recovery complete
cp: 无法获取"/home/postgres/archive/00000001.history" 的文件状态(stat): 没有那个文件或目录
2019-10-25 17:17:01.972 CST [34988] LOG: database system is ready to accept connections
(12)验证表已恢复且可写
[postgres@90220 data]$ psql
psql (12.0)
Type "help" for help.
postgres=# select * from tb1 ;
c1 | c2
----+----
2 | c
(1 row)
postgres=# insert into tb1 values (3,'c');
INSERT 0 1
二、主备流复制
主 | 备 |
---|---|
192.168.90.220 | 192.168.90.222 |
1、搭建主备流复制
(1)主端初始化数据库
[postgres@90220 pg12]$ initdb data
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "zh_CN.UTF-8".
The default database encoding has accordingly been set to "UTF8".
initdb: could not find suitable text search configuration for locale "zh_CN.UTF-8"
The default text search configuration will be set to "simple".
Data page checksums are disabled.
creating directory data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... PRC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
pg_ctl -D data -l logfile start
(2)主端修改参数
vi data/postgresql.conf 修改
listen_addresses = '*'
vi data/pg_hba.conf 添加
host replication all 192.168.90.222/32 trust
(4)主端启动数据库服务
[postgres@90220 pg12]$ pg_ctl start
waiting for server to start....2019-10-25 17:44:16.661 CST [35347] LOG: starting PostgreSQL 12.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2019-10-25 17:44:16.663 CST [35347] LOG: listening on IPv4 address "0.0.0.0", port 5432
2019-10-25 17:44:16.663 CST [35347] LOG: listening on IPv6 address "::", port 5432
2019-10-25 17:44:16.667 CST [35347] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-10-25 17:44:16.689 CST [35348] LOG: database system was shut down at 2019-10-25 17:39:52 CST
2019-10-25 17:44:16.692 CST [35347] LOG: database system is ready to accept connections
done
server started
(5)备端删除data
[postgres@222 pg12]$ ls
bin data include lib share
[postgres@222 pg12]$ rm -rf data
[postgres@222 pg12]$ ls
bin include lib share
(6)备端基础备份
[postgres@222 pg12]$ pg_basebackup -h 192.168.90.220 -U postgres -Xs -Fp -R -Pv -D data
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_35429"
24588/24588 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2000138
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed
注:-R参数与以往版本不同,不再生成recovery.conf文件,而是生成standby.signal标志文件,且在postgresql.auto.conf文件中添加primary_conninfo
查看生成标志文件
[postgres@222 data]$ ls
backup_label.old pg_commit_ts pg_ident.conf pg_notify pg_snapshots pg_subtrans PG_VERSION postgresql.auto.conf postmaster.pid
base pg_dynshmem pg_logical pg_replslot pg_stat pg_tblspc pg_wal postgresql.conf standby.signal
global pg_hba.conf pg_multixact pg_serial pg_stat_tmp pg_twophase pg_xact postmaster.opts
(7)备端启动数据库服务
[postgres@222 data]$ pg_ctl start
waiting for server to start....2019-10-25 17:55:14.525 CST [33828] LOG: starting PostgreSQL 12.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2019-10-25 17:55:14.529 CST [33828] LOG: listening on IPv4 address "0.0.0.0", port 5432
2019-10-25 17:55:14.529 CST [33828] LOG: listening on IPv6 address "::", port 5432
2019-10-25 17:55:14.570 CST [33828] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-10-25 17:55:14.636 CST [33829] LOG: database system was interrupted; last known up at 2019-10-25 17:52:10 CST
2019-10-25 17:55:14.652 CST [33829] LOG: entering standby mode
2019-10-25 17:55:14.655 CST [33829] LOG: redo starts at 0/2000028
2019-10-25 17:55:14.657 CST [33829] LOG: consistent recovery state reached at 0/2000138
2019-10-25 17:55:14.658 CST [33828] LOG: database system is ready to accept read only connections
2019-10-25 17:55:14.665 CST [33833] LOG: started streaming WAL from primary at 0/3000000 on timeline 1
done
server started
(8)验证流复制搭建成功
查询主备端进程
主端:
postgres 35347 1 0 17:44 ? 00:00:00 /home/postgres/pg12/bin/postgres
postgres 35349 35347 0 17:44 ? 00:00:00 postgres: checkpointer
postgres 35350 35347 0 17:44 ? 00:00:00 postgres: background writer
postgres 35351 35347 0 17:44 ? 00:00:00 postgres: walwriter
postgres 35352 35347 0 17:44 ? 00:00:00 postgres: autovacuum launcher
postgres 35353 35347 0 17:44 ? 00:00:00 postgres: stats collector
postgres 35354 35347 0 17:44 ? 00:00:00 postgres: logical replication launcher
postgres 35454 35347 0 17:55 ? 00:00:00 postgres: walsender postgres 192.168.90.222(65054) streaming 0/3000060
备端:
postgres 33828 1 0 17:55 ? 00:00:00 /home/postgres/pg12/bin/postgres
postgres 33829 33828 0 17:55 ? 00:00:00 postgres: startup recovering 000000010000000000000003
postgres 33830 33828 0 17:55 ? 00:00:00 postgres: checkpointer
postgres 33831 33828 0 17:55 ? 00:00:00 postgres: background writer
postgres 33832 33828 0 17:55 ? 00:00:00 postgres: stats collector
postgres 33833 33828 0 17:55 ? 00:00:00 postgres: walreceiver streaming 0/3000060
主端写入数据
[postgres@90220 pg12]$ psql
psql (12.0)
Type "help" for help.
postgres=# create table tb1 (i int,t text);
CREATE TABLE
postgres=# insert into tb1 values (1,'lili');
INSERT 0 1
postgres=#
备端查询数据且写入数据
[postgres@222 data]$ psql
psql (12.0)
Type "help" for help.
postgres=# select * from tb1 ; --查询成功
i | t
---+------
1 | lili
(1 row)
postgres=# insert into tb1 values (2,'sam'); --写入失败
2019-10-25 17:58:48.251 CST [33870] ERROR: cannot execute INSERT in a read-only transaction
2019-10-25 17:58:48.251 CST [33870] STATEMENT: insert into tb1 values (2,'sam');
ERROR: cannot execute INSERT in a read-only transaction
2、备端提升为主端
(1)主端关闭数据库服务
[postgres@90220 pg12]$ pg_ctl stop
waiting for server to shut down....2019-10-25 18:02:26.956 CST [35347] LOG: received fast shutdown request
2019-10-25 18:02:26.958 CST [35347] LOG: aborting any active transactions
2019-10-25 18:02:26.959 CST [35347] LOG: background worker "logical replication launcher" (PID 35354) exited with exit code 1
2019-10-25 18:02:26.959 CST [35349] LOG: shutting down
2019-10-25 18:02:26.978 CST [35347] LOG: database system is shut down
done
server stopped
此时备端报错
[postgres@222 data]$ 2019-10-25 18:02:26.960 CST [33833] LOG: replication terminated by primary server
2019-10-25 18:02:26.960 CST [33833] DETAIL: End of WAL reached on timeline 1 at 0/301A240.
2019-10-25 18:02:26.960 CST [33833] FATAL: could not send end-of-streaming message to primary: no COPY in progress
2019-10-25 18:02:26.960 CST [33829] LOG: invalid record length at 0/301A240: wanted 24, got 0
2019-10-25 18:02:26.978 CST [33925] FATAL: could not connect to the primary server: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2019-10-25 18:02:31.968 CST [33926] FATAL: could not connect to the primary server: could not connect to server: 拒绝连接
Is the server running on host "192.168.90.220" and accepting
TCP/IP connections on port 5432?
...
(2)备端提升
[postgres@222 pg12]$ pg_ctl promote
waiting for server to promote.... done
server promoted
此时备端消息
2019-10-25 18:03:46.404 CST [33829] LOG: received promote request
2019-10-25 18:03:46.404 CST [33829] LOG: redo done at 0/301A1C8
2019-10-25 18:03:46.404 CST [33829] LOG: last completed transaction was at log time 2019-10-25 17:58:00.474187+08
2019-10-25 18:03:46.406 CST [33829] LOG: selected new timeline ID: 2
2019-10-25 18:03:46.992 CST [33829] LOG: archive recovery complete
2019-10-25 18:03:46.995 CST [33828] LOG: database system is ready to accept connections
(3)验证备端已提升
查询新主端进程
postgres 33828 1 0 17:55 ? 00:00:00 /home/postgres/pg12/bin/postgres
postgres 33830 33828 0 17:55 ? 00:00:00 postgres: checkpointer
postgres 33831 33828 0 17:55 ? 00:00:00 postgres: background writer
postgres 33832 33828 0 17:55 ? 00:00:00 postgres: stats collector
postgres 33950 33828 0 18:03 ? 00:00:00 postgres: walwriter
postgres 33951 33828 0 18:03 ? 00:00:00 postgres: autovacuum launcher
postgres 33952 33828 0 18:03 ? 00:00:00 postgres: logical replication launcher
新主端写入数据
[postgres@222 data]$ psql
psql (12.0)
Type "help" for help.
postgres=# insert into tb1 values (5,'dam');
INSERT 0 1
3、原主端以备端同步
(1)新主端修改参数
[postgres@222 data]$ vi pg_hba.conf
host all all 192.168.90.220/32 trust
host replication all 192.168.90.220/32 trust
重载生效
[postgres@222 data]$ pg_ctl reload
server signaled
2019-10-25 18:11:23.416 CST [33828] LOG: received SIGHUP, reloading configuration files
(2)原主端修改参数
[postgres@90220 data]$ vi postgresql.conf
wal_log_hints = on
启动数据库服务生效,然后关闭数据库服务
[postgres@90220 data]$ pg_ctl start
waiting for server to start....2019-10-25 18:14:30.947 CST [35680] LOG: starting PostgreSQL 12.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2019-10-25 18:14:30.950 CST [35680] LOG: listening on IPv4 address "0.0.0.0", port 5432
2019-10-25 18:14:30.950 CST [35680] LOG: listening on IPv6 address "::", port 5432
2019-10-25 18:14:30.951 CST [35680] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-10-25 18:14:30.962 CST [35681] LOG: database system was shut down at 2019-10-25 18:02:26 CST
2019-10-25 18:14:30.964 CST [35680] LOG: database system is ready to accept connections
done
server started
[postgres@90220 data]$ pg_ctl stop
waiting for server to shut down....2019-10-25 18:14:33.638 CST [35680] LOG: received fast shutdown request
2019-10-25 18:14:33.640 CST [35680] LOG: aborting any active transactions
2019-10-25 18:14:33.642 CST [35680] LOG: background worker "logical replication launcher" (PID 35687) exited with exit code 1
2019-10-25 18:14:33.643 CST [35682] LOG: shutting down
2019-10-25 18:14:33.663 CST [35680] LOG: database system is shut down
done
server stopped
(3)原主端执行pg_rewind
[postgres@90220 ~]$ pg_rewind --target-pgdata pg12/data --source-server='host=192.168.90.222 port=5432 user=postgres dbname=postgres' -P
pg_rewind: connected to server
pg_rewind: servers diverged at WAL location 0/301A240 on timeline 1
pg_rewind: rewinding from last common checkpoint at 0/301A1C8 on timeline 1
pg_rewind: reading source file list
pg_rewind: reading target file list
pg_rewind: reading WAL in target
pg_rewind: need to copy 51 MB (total source directory size is 71 MB)
53090/53090 kB (100%) copied
pg_rewind: creating backup label and updating control file
pg_rewind: syncing target data directory
pg_rewind: Done!
(4)原主端创建标志文件
[postgres@90220 data]$ touch standby.signal
(5)原主端修改参数
[postgres@90220 data]$ vi postgresql.auto.conf
将primary_conninfo中的host修改为新主端ip即192.168.90.222
(6)原主端启动数据库服务
[postgres@90220 data]$ pg_ctl start
waiting for server to start....2019-10-25 18:20:03.702 CST [35804] LOG: starting PostgreSQL 12.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2019-10-25 18:20:03.706 CST [35804] LOG: listening on IPv4 address "0.0.0.0", port 5432
2019-10-25 18:20:03.706 CST [35804] LOG: listening on IPv6 address "::", port 5432
2019-10-25 18:20:03.710 CST [35804] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-10-25 18:20:03.735 CST [35805] LOG: database system was interrupted while in recovery at log time 2019-10-25 18:08:47 CST
2019-10-25 18:20:03.735 CST [35805] HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2019-10-25 18:20:03.749 CST [35805] LOG: entering standby mode
2019-10-25 18:20:03.751 CST [35805] LOG: redo starts at 0/301A240
2019-10-25 18:20:03.753 CST [35805] LOG: invalid record length at 0/3033BF8: wanted 24, got 0
2019-10-25 18:20:03.769 CST [35808] LOG: started streaming WAL from primary at 0/3000000 on timeline 2
2019-10-25 18:20:03.773 CST [35805] LOG: consistent recovery state reached at 0/3033C30
2019-10-25 18:20:03.773 CST [35804] LOG: database system is ready to accept read only connections
done
server started
(7)验证原主端已为备端
查询新主端进程
postgres 33828 1 0 17:55 ? 00:00:00 /home/postgres/pg12/bin/postgres
postgres 33830 33828 0 17:55 ? 00:00:00 postgres: checkpointer
postgres 33831 33828 0 17:55 ? 00:00:00 postgres: background writer
postgres 33832 33828 0 17:55 ? 00:00:00 postgres: stats collector
postgres 33950 33828 0 18:03 ? 00:00:00 postgres: walwriter
postgres 33951 33828 0 18:03 ? 00:00:00 postgres: autovacuum launcher
postgres 33952 33828 0 18:03 ? 00:00:00 postgres: logical replication launcher
postgres 34131 33828 0 18:20 ? 00:00:00 postgres: walsender postgres 192.168.90.220(62626) streaming 0/3034990
查询新备端(原主端)进程
postgres 35804 1 0 18:20 ? 00:00:00 /home/postgres/pg12/bin/postgres
postgres 35805 35804 0 18:20 ? 00:00:00 postgres: startup recovering 000000020000000000000003
postgres 35806 35804 0 18:20 ? 00:00:00 postgres: checkpointer
postgres 35807 35804 0 18:20 ? 00:00:00 postgres: background writer
postgres 35808 35804 0 18:20 ? 00:00:00 postgres: walreceiver streaming 0/3034990
postgres 35809 35804 0 18:20 ? 00:00:00 postgres: stats collector
新主端写入数据
[postgres@222 data]$ psql
psql (12.0)
Type "help" for help.
postgres=# select * from tb1 ;
i | t
---+------
1 | lili
5 | dam
(2 rows)
postgres=# insert into tb1 values (10,'ling');
INSERT 0 1
新备端(原主端)查询写入数据
[postgres@90220 data]$ psql
psql (12.0)
Type "help" for help.
postgres=# select * from tb1 ; --查询成功
i | t
----+------
1 | lili
5 | dam
10 | ling
(3 rows)
postgres=# insert into tb1 values (20,'min'); --写入失败
2019-10-25 18:23:50.705 CST [35839] ERROR: cannot execute INSERT in a read-only transaction
2019-10-25 18:23:50.705 CST [35839] STATEMENT: insert into tb1 values (20,'min');
ERROR: cannot execute INSERT in a read-only transaction