本实验相关配置文件地址:https://github.com/lxgithub/pgHA
1、关于pgHA
This proven solution leverages PostgreSQL's native streaming replication technology and pgBouncer. It efficiently supports multiple read-only slaves, seamless DDL replication, and various failover and/or disaster recovery modes of operation. All of this without any application level program changes!!
Binary distributions are available pre-integrated into the PostgresHA and BigSQL projects.
Failoverd is an automatic failover monitoring system for Postgres database systems. It is designed to work with pgBouncer installed on the application servers for simplicity.
2、安装配置步骤
2.1设置hosts
(配置所有服务器的hosts)
192.168.100.120 app
192.168.100.131 masterdb
192.168.100.132 slavedb
2.2设置流复制环境
(在masterdb与slavedb上配置流复制环境)
配置过程略。
注:recovery.conf内容
standby_mode = on
primary_conninfo = 'host=masterdb user=postgres port=5432 password=password'
trigger_file = '/tmp/trigger.slave.5432'
2.3创建状态表failoverstatus
在masterdb节点上直接执行以下sql或者执行cfg/status_table.sql脚本:
CREATE TABLE public.failoverstatus (
id SERIAL8,
dttm TIMESTAMP WITH TIME ZONE,
event TEXT,
properties TEXT ) ;
或
[postgres@masterdb ~]$ psql -f/opt/pgHA/cfg/status_table.sql
CREATE TABLE
查看:
postgres=# \d
List of relations
Schema | Name | Type | Owner
--------+-----------------------+----------+----------
public | failoverstatus | table | postgres
public | failoverstatus_id_seq | sequence | postgres
(2 rows)
2.4安装pgBouncer
(在app节点上安装pgBouncer)
安装过程略,可参考http://my.oschina.net/lianshunke/blog/223896#OSC_h2_36 第8.1小节。
配置
[postgres@app ~]$ vi /opt/pgbouncer/pgbouncer.ini.orig
[databases]
masterdb = host=masterdb port=5432 dbname=postgres user=postgres
[pgbouncer]
logfile = /opt/pgbouncer/pgbouncer.log
pidfile = /opt/pgbouncer/pgbouncer.pid
listen_addr = *
listen_port = 6432
auth_type = trust
auth_file = /opt/pgbouncer/userlist.txt
admin_users = pgbouncer
pool_mode = session
server_reset_query = DISCARD ALL
ignore_startup_parameters = extra_float_digits
max_client_conn = 100
default_pool_size = 20
[postgres@app ~]$ vi /opt/pgbouncer/pgbouncer.ini.failover
[databases]
masterdb = host=slavedb port=5432 dbname=postgres user=postgres
[pgbouncer]
logfile = /opt/pgbouncer/pgbouncer.log
pidfile = /opt/pgbouncer/pgbouncer.pid
listen_addr = *
listen_port = 6432
auth_type = trust
auth_file = /opt/pgbouncer/userlist.txt
admin_users = pgbouncer
pool_mode = session
server_reset_query = DISCARD ALL
ignore_startup_parameters = extra_float_digits
max_client_conn = 100
default_pool_size = 20
[postgres@app ~]$ vi /opt/pgbouncer/userlist.txt
"postgres" "123456"
"pgbouncer" "123456"
[postgres@app ~]$ cp /opt/pgbouncer/pgbouncer.ini.orig /opt/pgbouncer/pgbouncer.ini
启动pgbouncer
[postgres@app ~]$ pgbouncer -d /opt/pgbouncer/pgbouncer.ini
2014-05-15 08:51:28.410 14172 LOG File descriptor limit: 1024 (H:4096), max_client_conn: 100, max fds possible: 130
2.5设置ssh互信
(保证masterdb和slavedb两个节点可以免密码连接app节点)
Masterdb节点执行:
[postgres@masterdb ~]$ cd .ssh/
[postgres@masterdb .ssh]$ ssh-keygen -t rsa
[postgres@masterdb .ssh]$ ssh-copy-id -i id_rsa.pub app
[postgres@masterdb .ssh]$ ssh app date
Wed May 14 06:41:07 CST 2014
Slavedb节点执行:
[postgres@slavedb ~]$ cd .ssh/
[postgres@slavedb .ssh]$ ssh-keygen -t rsa
[postgres@slavedb .ssh]$ ssh-copy-id -i id_rsa.pub app
[postgres@slavedb .ssh]$ ssh app date
Wed May 14 06:43:25 CST 2014
2.6安装pgHA依赖的perl模块
yum install perl-devel perl-CPAN
perl -MCPAN -e shell
>
install Module::Build::Compat
install Config::IniFiles
install Date::Format
install DBI
install DBD::Pg
force install File::Basename
install Getopt::Long
install IO::Socket::INET
install Log::Log4perl
install Proc::Daemon
install Net::Ping
2.7解压配置pgHA
在masterdb与slavedb节点上解压安装:
# tar -xvf openscg-pgha-739ecae07a44.tar.bz2
# mv openscg-pgha-739ecae07a44 /opt/pgHA
# chown postgres:postgres /opt/pgHA/ -R
在slavedb节点上配置:
[postgres@slavedb ~]$ cd /opt/pgHA/cfg/
[postgres@slavedb cfg]$ cp example.conf pgha.conf
[postgres@slavedb cfg]$ vi pgha.conf
[global]
pidFile=/opt/pgHA/bin/pgHA.pid
logConfig=/opt/pgHA/cfg/log4perl.conf
instanceName=pgHADefaultInstance
triggerFile=/tmp/trigger.slave.5432
defcon3_probe_frequency=2
defcon2_retries=10
defcon2_probe_frequency=1
[master]
dbhost=masterdb
dbport=5432
dbname=postgres
dbuser=postgres
dbpass=password
status_table=public.failoverstatus
dbcheck_query="select 1"
[slave]
dbhost=127.0.0.1
dbport=5432
dbname=postgres
dbuser=postgres
dbpass=password
[app01]
dbhost=app
dbport=6432
dbname=masterdb
dbuser=postgres
dbpass=password
dbhost_user=postgres
pgbouncer_db=pgbouncer
pgbouncer_db_user=pgbouncer
pgbouncer_db_pass=password
pgbouncer_config=/opt/pgbouncer/pgbouncer.ini
pgbouncer_standard_config=/opt/pgbouncer/pgbouncer.ini.orig
pgbouncer_failover_config=/opt/pgbouncer/pgbouncer.ini.failover
fence_lock_dbs=masterdb
fence_lock_command=kill
fence_move_command=reload
fence_unlock_command=resume
dbcheck="select 1"
2.8启动pgHA
[postgres@slavedb ~]$ cd /opt/pgHA/bin/
[postgres@slavedb bin]$ ./failoverd.pl -c ../cfg/pgha.conf --auto
Welcome to failoverd, an auto-failover system for PostgreSQL streaming replication
Instance Name: pgHADefaultInstance
Config: ../cfg/pgha.conf
PidFile: /opt/pgHA/bin/pgHA.pid
Failover Command: touch /tmp/trigger.slave.5432
The Failover command must be able to run successfully.
If it cannot, the slave database will NOT be promoted.
Initializing...
Now Checking: master
master : OK
Now Checking: slave
slave : OK
Now Checking: app01
app01 : OK
=== Verifying application host config ===
For host: app01
=== Application host config verified===
=== Arming Failover mechanism ===
Your cluster is now ready to failover.
restarting pgBouncer without taking this into
consideration could cause unexpected downtime
=== Failover mechanism armed ===
SUCCESS!
Worker active on PID: 31735
查看进程:
[postgres@slavedb bin]$ ps -ef | grep failoverd
postgres 31735 1 0 08:59 ? 00:00:00 failoverd-Defcon3-OK-pgHADefaultInstance
postgres 31741 22175 0 09:00 pts/2 00:00:00 grep failoverd
3、工作原理
将pgHA脚本部署在数据库(备库)节点上,通过监控进程failoverd实时检测主库与app的状态,检测方式为“select 1”及Net::Ping。一旦发现主库无法正常连接,那么将会在备机上touch一个与trigger_file相符的触发文件,进而使备库提升为主库,然后通过ssh连接到app服务器,将之前定义好的适合切换后的新主节点的pgbouncer配置文件拷贝一份为pgbouncer.ini,接着重启pgbouncer使更改生效。
问题:
通过实验,验证了在自动或手动时,备库均能成功提升状态,但是关于重置pgbouncer的相关操作未生效,虽脚本中有相关内容,但却未执行。如下:
my $statusCode=0;
foreach my $key ( keys %configHash )
{
if ( $key =~ m/^app/ )
{
my $setSpringCmd="ssh " . $configHash{$key}{$dbhostUser} . "@" . $configHash{$key}{$dbhost}
. " \"cp $configHash{$key}{$pgbouncerStandardConfig} $configHash{$key}{$pgbouncerConfig}\"";
print "\tUn-Setting failover config for pgBouncer on: $key... ";
my $thisStatus=system ($setSpringCmd);
print "DONE!\n" if ( $thisStatus == 0);
print "FAILED\n" if ( $thisStatus == 1);
$statusCode=$statusCode || $thisStatus
}
}
return !$statusCode;
继续分析原因……
据openscg介绍已经将pgHA组件集成到PostgresHA和BigSQL中,待之后验证。