Centos7部署pg-xl教程

学习目标:

完成分布式pg的部署


1.下载pgxl安装包,由于官方网站无法连接本文所使用的安装包来源于如下连接:

https://master.dl.sourceforge.net/project/postgres-xl/Releases/Version_9.5r1/postgres-xl-9.5r1.4.tar.gz?viasf=1


2.简介一下pgxl和pg及pgxc的区别:

PostgreSQL与Postgres-XL
  • 1994年,Postgre95发布,开源。
  • 1996年,PostgreSQL继承了Postgre95,发布。
  • 2010年,Postgres-XC发布。
  • 2012年,前PGXC核心开发者创建StormDB公司,进行了一些改进,包括对MPP并行化的性能改进和多租户安全。
  • 2013年,TransLattice收购了StormDB。
  • 2014年,将项目开源,命名为Postgres-XL。

Postgres-XC与Postgres-XL

PGXL的架构师和开发者 很多都是以前做PGXC的,PGXL的部分代码是从PGXC移植过来的。

比起功能性,PGXL更强调稳定性, 正确性和性能.

PGXL增加了一些重要的性能提升,比如MPP和replan avoidance on the data nodes,这些都是PGXC没有的。

PGXC目前集中在OLTP的业务上面,PGXL则更加灵活,可以应用于很多不同种类的业务上,比如可以用在大数据处理领域,除此,在多租户的环境中,PGXL也更加安全。

PGXL的社区非常开放。

 3.架构

PGXL有三个主要组件,分别是GTM,Coordinator(CN)和Datanode(DN)。

GTM(Gloable Transaction Manager)负责提供事务的ACID属性;Datanode负责存储表的数据和本地执行由Coordinator派发的SQL任务;Coordinator负责处理每个来自Application的SQL任务,并且决定由哪个Datanode执行,然后将任务计划派发给相应的Datanode,根据需要收集结果返还给Application;

4.部署环境

本次的部署的环境使用的是VMware中的centos7

hostnameIPfunction
gtm192.168.235.128gtm
node1192.168.235.129coordinator & datanode
node2192.168.235.130coordinator & datanode

5.开始部署

1.配置hosts

所有节点添加以下解析: 

192.168.235.128 gtm
192.168.235.129 node1
192.168.235.130 node2

操作多台机器的方法:本文使用的是xshell8,选项卡-排列-瓷砖

工具-发送键输入到-所有回话 

 2.新建账号

 useradd postgres 
 passwd postgres
 su - postgres
 mkdir ~/.ssh
 chmod 700 ~/.ssh

3.关闭防火墙和selinux  

 sudo systemctl stop firewalld
 sudo systemctl disable firewalld   #重启不会启动
 sudo systemctl status firewalld    #查看一下状态,确定一下
 #关闭selinux
 sudo vi /etc/selinux/config文件
 将SELINUX的值设置为disabled

4.配置gtm到其他节点免密ssh登陆,在gtm节点操作  

 ssh-keygen -t rsa    #这里一定要一直按回车,不要输入任何字符
 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys 
 chmod 600 ~/.ssh/authorized_keys 
 scp ~/.ssh/authorized_keys postgres@node1:~/.ssh/authorized_keys
 scp ~/.ssh/authorized_keys postgres@node2:~/.ssh/authorized_keys

测试免密: 

ssh postgres@node1    

5. 安装基础环境

sudo yum install -y make gcc tar readline readline-devel zlib zlib-devel flex bison openjade docbook-style-dsssl

如果遇到yum源不可用,采用如下办法:

备份现有的仓库文件:
sudo mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
下载阿里云的 CentOS 7 镜像源配置文件:
sudo curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
清理缓存并重新生成缓存:
sudo yum clean all
sudo yum makecache

 6.上传安装包

 本文使用的是xshell8,需要下载xftp

7.解压

请注意./configure --prefix=/app/pgxl,目录为部署的pgxl的家目录,需要自己的位置

tar -zxvf postgres-xl-9.5r1.6.tar.gz
cd postgres-xl
./configure --prefix=/app/pgxl
make
make install
cd contrib/
make
make install

 8.配置环境变量

 请注意代码中的PGHOME变量要填自己的目录

vi ~/.bashrc
export PGHOME=/app/pgxl
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH
export PATH=$PGHOME/bin:$PATH
source ~/.bashrc

注:可以用echo $PGHOME查看变量是否生效,如果未生效,which查找一下目录

如果与自己设置的目录不同,重新编辑vi ~/.bashrc,文件然后source一下

9.配置集群配置文件,仅在gtm上配置,先生成一份模板,在修改其中内容

 生产配置文件

pgxc_ctl prepare

找到配置文件路径,修改内容 

10.配置文件示例,本文协调节点和数据节点公用,一共三个机器

pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf start all 

#!/usr/bin/env bash
#
# Postgres-XC Configuration file for pgxc_ctl utility. 
#
# Configuration file can be specified as -c option from pgxc_ctl command.   Default is
# $PGXC_CTL_HOME/pgxc_ctl.org.
#
# This is bash script so you can make any addition for your convenience to configure
# your Postgres-XC cluster.
#
# Please understand that pgxc_ctl provides only a subset of configuration which pgxc_ctl
# provide.  Here's several several assumptions/restrictions pgxc_ctl depends on.
#
# 1) All the resources of pgxc nodes has to be owned by the same user.   Same user means
#    user with the same user name.  User ID may be different from server to server.
#    This must be specified as a variable $pgxcOwner.
#
# 2) All the servers must be reacheable via ssh without password.   It is highly recommended
#    to setup key-based authentication among all the servers.
#
# 3) All the databases in coordinator/datanode has at least one same superuser.  Pgxc_ctl
#    uses this user to connect to coordinators and datanodes.   Again, no password should
#    be used to connect.  You have many options to do this, pg_hba.conf, pg_ident.conf and
#    others.  Pgxc_ctl provides a way to configure pg_hba.conf but not pg_ident.conf.   This
#    will be implemented in the later releases.
#
# 4) Gtm master and slave can have different port to listen, while coordinator and datanode
#    slave should be assigned the same port number as master.
#
# 5) Port nuber of a coordinator slave must be the same as its master.
#
# 6) Master and slave are connected using synchronous replication.  Asynchronous replication
#    have slight (almost none) chance to bring total cluster into inconsistent state.
#    This chance is very low and may be negligible.  Support of asynchronous replication
#    may be supported in the later release.
#
# 7) Each coordinator and datanode can have only one slave each.  Cascaded replication and
#    multiple slave are not supported in the current pgxc_ctl.
#
# 8) Killing nodes may end up with IPC resource leak, such as semafor and shared memory.
#    Only listening port (socket) will be cleaned with clean command.
#
# 9) Backup and restore are not supported in pgxc_ctl at present.   This is a big task and
#    may need considerable resource.
#
#========================================================================================
#
#
# pgxcInstallDir variable is needed if you invoke "deploy" command from pgxc_ctl utility.
# If don't you don't need this variable.
pgxcInstallDir=/app/pgxl
#---- OVERALL -----------------------------------------------------------------------------
#
pgxcOwner=postgres		# owner of the Postgres-XC databaseo cluster.  Here, we use this
						# both as linus user and database user.  This must be
						# the super user of each coordinator and datanode.
pgxcUser=$pgxcOwner		# OS user of Postgres-XC owner

tmpDir=/tmp					# temporary dir used in XC servers
localTmpDir=$tmpDir			# temporary dir used here locally

configBackup=n					# If you want config file backup, specify y to this value.
configBackupHost=pgxc-linker	# host to backup config file
configBackupDir=$HOME/pgxc		# Backup directory
configBackupFile=pgxc_ctl.bak	# Backup file name --> Need to synchronize when original changed.

dataDirRoot=$HOME/DATA/pgxl/nodes

#---- GTM ------------------------------------------------------------------------------------

# GTM is mandatory.  You must have at least (and only) one GTM master in your Postgres-XC cluster.
# If GTM crashes and you need to reconfigure it, you can do it by pgxc_update_gtm command to update
# GTM master with others.   Of course, we provide pgxc_remove_gtm command to remove it.  This command
# will not stop the current GTM.  It is up to the operator.

#---- Overall -------
gtmName=gtm

#---- GTM Master -----------------------------------------------

#---- Overall ----
gtmMasterServer=gtm
gtmMasterPort=20001
gtmMasterDir=$dataDirRoot/gtm

#---- Configuration ---
gtmExtraConfig=() # Will be added gtm.conf for both Master and Slave (done at initilization only)
gtmMasterSpecificExtraConfig=()	# Will be added to Master's gtm.conf (done at initialization only)

#---- GTM Slave -----------------------------------------------

# Because GTM is a key component to maintain database consistency, you may want to configure GTM slave
# for backup.

#---- Overall ------
gtmSlave=n					# Specify y if you configure GTM Slave.   Otherwise, GTM slave will not be configured and
							# all the following variables will be reset.
gtmSlaveName=()
gtmSlaveServer=()		# value none means GTM slave is not available.  Give none if you don't configure GTM Slave.
gtmSlavePort=()			# Not used if you don't configure GTM slave.
gtmSlaveDir=()	# Not used if you don't configure GTM slave.
# Please note that when you have GTM failover, then there will be no slave available until you configure the slave
# again. (pgxc_add_gtm_slave function will handle it)

#---- Configuration ----
gtmSlaveSpecificExtraConfig=() # Will be added to Slave's gtm.conf (done at initialization only)

#---- GTM Proxy -------------------------------------------------------------------------------------------------------
# GTM proxy will be selected based upon which server each component runs on.
# When fails over to the slave, the slave inherits its master's gtm proxy.  It should be
# reconfigured based upon the new location.
#
# To do so, slave should be restarted.   So pg_ctl promote -> (edit postgresql.conf and recovery.conf) -> pg_ctl restart
#
# You don't have to configure GTM Proxy if you dont' configure GTM slave or you are happy if every component connects
# to GTM Master directly.  If you configure GTL slave, you must configure GTM proxy too.

#---- Shortcuts ------
gtmProxyDir=()

#---- Overall -------
gtmProxy=()				# Specify y if you conifugre at least one GTM proxy.   You may not configure gtm proxies
						# only when you dont' configure GTM slaves.
						# If you specify this value not to y, the following parameters will be set to default empty values.
						# If we find there're no valid Proxy server names (means, every servers are specified
						# as none), then gtmProxy value will be set to "n" and all the entries will be set to
						# empty values.
gtmProxyNames=()	# No used if it is not configured
gtmProxyServers=()			# Specify none if you dont' configure it.
gtmProxyPorts=()				# Not used if it is not configured.
gtmProxyDirs=()	# Not used if it is not configured.

#---- Configuration ----
gtmPxyExtraConfig=n		# Extra configuration parameter for gtm_proxy.  Coordinator section has an example.

#---- Coordinators ----------------------------------------------------------------------------------------------------

#---- shortcuts ----------
coordMasterDir=$dataDirRoot/coord_master
coordSlaveDir=$HOME/coord_slave
coordArchLogDir=$HOME/coord_archlog

#---- Overall ------------
coordNames=(coord1 coord2)		# Master and slave use the same name
coordPorts=(node1 node2)			# Master server listening ports
poolerPorts=(30001 30001)			# Master pooler ports
coordPgHbaEntries=(0.0.0.0/0)	# Assumes that all the coordinator (master/slave) accepts
												# the same connection
												# This entry allows only $pgxcOwner to connect.
												# If you'd like to setup another connection, you should
												# supply these entries through files specified below.
#coordPgHbaEntries=(127.0.0.1/32)	# Same as above but for IPv4 connections

#---- Master -------------
coordMasterServers=()		# none means this master is not available
coordMasterDirs=()
coordMaxWALsender=5	# max_wal_senders: needed to configure slave. If zero value is specified,
						# it is expected to supply this parameter explicitly by external files
						# specified in the following.	If you don't configure slaves, leave this value to zero.
coordMaxWALSenders=($dataDirRoot/coord_master.1 $dataDirRoot/coord_master.2)
						# max_wal_senders configuration for each coordinator.

#---- Slave -------------
coordSlave=n			# Specify y if you configure at least one coordiantor slave.  Otherwise, the following
						# configuration parameters will be set to empty values.
						# If no effective server names are found (that is, every servers are specified as none),
						# then coordSlave value will be set to n and all the following values will be set to
						# empty values.
coordSlaveSync=n		# Specify to connect with synchronized mode.
coordSlaveServers=(none none)			# none means this slave is not available
coordSlavePorts=none none()			# coordinator slave listening ports
coordSlavePoolerPorts=(none none)			# coordinator slave pooler ports
coordSlaveDirs=(none none)
coordArchLogDirs=(none none)

#---- Configuration files---
# Need these when you'd like setup specific non-default configuration 
# These files will go to corresponding files for the master.
# You may supply your bash script to setup extra config lines and extra pg_hba.conf entries 
# Or you may supply these files manually.
coordExtraConfig=coordExtraConfig	# Extra configuration file for coordinators.  
						# This file will be added to all the coordinators'
						# postgresql.conf
# Pleae note that the following sets up minimum parameters which you may want to change.
# You can put your postgresql.conf lines here.
cat > $coordExtraConfig <<EOF
#================================================
# Added to all the coordinator postgresql.conf
# Original: $coordExtraConfig
log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
listen_addresses = '*'
max_connections = 100
hot_standby = off
EOF

# Additional Configuration file for specific coordinator master.
# You can define each setting by similar means as above.
coordSpecificExtraConfig=()
coordSpecificExtraPgHba=()

#---- Datanodes -------------------------------------------------------------------------------------------------------

#---- Shortcuts --------------
datanodeMasterDir=$dataDirRoot/dn_master
datanodeSlaveDir=$dataDirRoot/dn_slave
datanodeArchLogDir=$dataDirRoot/datanode_archlog

#---- Overall ---------------
primaryDatanode= # Primary Node.
datanodeNames=(dn1 dn2)
datanodePorts=(node1 node2)	# Master and slave use the same port!
datanodePoolerPorts=(40001 40001)	# Master and slave use the same port!
datanodePgHbaEntries=(40011 40011)	# Assumes that all the coordinator (master/slave) accepts
										# the same connection
										# This list sets up pg_hba.conf for $pgxcOwner user.
										# If you'd like to setup other entries, supply them
										# through extra configuration files specified below.
#datanodePgHbaEntries=(127.0.0.1/32)	# Same as above but for IPv4 connections

#---- Master ----------------
datanodeMasterServers=()	# none means this master is not available.
													# This means that there should be the master but is down.
													# The cluster is not operational until the master is
													# recovered and ready to run.	
datanodeMasterDirs=($dataDirRoot/dn_master.1 $dataDirRoot/dn_master.2)
datanodeMaxWalSender=5								# max_wal_senders: needed to configure slave. If zero value is 
													# specified, it is expected this parameter is explicitly supplied
													# by external configuration files.
													# If you don't configure slaves, leave this value zero.
datanodeMaxWALSenders=(5 5)
						# max_wal_senders configuration for each datanode

#---- Slave -----------------
datanodeSlave=n			# Specify y if you configure at least one coordiantor slave.  Otherwise, the following
						# configuration parameters will be set to empty values.
						# If no effective server names are found (that is, every servers are specified as none),
						# then datanodeSlave value will be set to n and all the following values will be set to
						# empty values.
datanodeSlaveServers=()	# value none means this slave is not available
datanodeSlavePorts=()	# Master and slave use the same port!
datanodeSlavePoolerPorts=()	# Master and slave use the same port!
#datanodeSlaveSync=y		# If datanode slave is connected in synchronized mode
datanodeSlaveDirs=()
datanodeArchLogDirs=()

# ---- Configuration files ---
# You may supply your bash script to setup extra config lines and extra pg_hba.conf entries here.
# These files will go to corresponding files for the master.
# Or you may supply these files manually.
datanodeExtraConfig=datanodeExtraConfig	
cat > $datanodeExtraConfig <<EOF
#================================================
# Added to all the datanode postgresql.conf
# Original: $datanodeExtraConfig
log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
listen_addresses = '*'
max_connections = 100
hot_standby = off
EOF
# Additional Configuration file for specific datanode master.
# You can define each setting by similar means as above.
datanodeSpecificExtraConfig=(none none)
datanodeSpecificExtraPgHba=(none none)

11.初始化集群

 对应的是配置文件的路径,初始化集群

pgxc_ctl -c /home/postgres/pgxc_ctl init all

 如果初始化过程中报错,请自行检测节点名与配置文件中对应的节点数量,然后启动集群

pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf start all 

 12.测试集群

在配置好环境的机器上执行psql连接cn节点的端口 

[postgres@node1 pgxc_ctl]$ psql -p 30001
psql (PGXL 9.5r1.4, based on PG 9.5.5 (Postgres-XL 9.5r1.4))
Type "help" for help.

postgres=# select * from pgxc_node;
 node_name | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id   
-----------+-----------+-----------+-----------+----------------+------------------+-------------
 coord1    | C         |     30001 | node1     | f              | f                |  1885696643
 coord2    | C         |     30001 | node2     | f              | f                | -1197102633
 dn1       | D         |     40001 | node1     | f              | t                |  -560021589
 dn2       | D         |     40001 | node2     | f              | f                |   352366662
(4 rows)

完成部署与测试 

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值