Overview
GBase 8a MPP Cluster supports cluster expansion, cluster shrinking, and node replacement. These features cover scenarios such as running out of storage space as the data volume grows, single-node hardware failures after long-term operation, and cluster-wide hardware upgrades.
V9.5.3 currently does not support expansion or shrinking of gcware nodes; gcware node replacement is supported.
Cluster Expansion Procedure (Multi-VC Mode)
Expanding pure data nodes:
Cluster environment:
Coordinator 节点: 172.168.83.11, 172.168.83.12, 172.168.83.13
Data 节点:
vc1: 172.168.83.11, 172.168.83.12
vc2: 172.168.83.13, 172.168.83.14
IP of the data node to be added to vc1: 172.168.83.15
Installing the node (optional)
If a free node (freenode) already exists in the cluster, skip this step.
Step 1: Edit the demo.options file:
1) Set the dataHost parameter to the IP of the node to be installed;
2) Set the existCoordinateHost parameter to the IPs of the existing Coordinator nodes;
3) Set the existDataHost parameter to the IPs of all existing data nodes.
A reference demo.options after editing:
$ cat demo.options
installPrefix= /opt
#coordinateHost =
#coordinateHostNodeID =1,2,3
dataHost = 172.168.83.15
existCoordinateHost =172.168.83.11,172.168.83.12,172.168.83.13
existDataHost =172.168.83.11,172.168.83.12,172.168.83.13,172.168.83.14
existGcwareHost=172.168.83.11,172.168.83.12,172.168.83.13
#gcwareHost =
#gcwareHostNodeID =
dbaUser = gbase
dbaGroup = gbase
dbaPwd = 'gbasedba'
rootPwd = '111111'
#rootPwdFile = rootPwd.json
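demo.options is a flat key = value file, so it can be sanity-checked before running the installer. The sketch below is purely illustrative (the parsing helper is ours, and gcinstall.py performs its own validation): it parses the file and verifies the node being added is not already listed in existDataHost.

```python
# Sketch: parse demo.options (key = value lines, "#" marks comments) and
# check the new node is not already part of the cluster. Illustrative only;
# gcinstall.py does its own validation.
def parse_options(text):
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        opts[key.strip()] = value.strip().strip("'")
    return opts

sample = """installPrefix= /opt
dataHost = 172.168.83.15
existDataHost =172.168.83.11,172.168.83.12,172.168.83.13,172.168.83.14
"""
opts = parse_options(sample)
new = set(opts["dataHost"].split(","))
existing = set(opts["existDataHost"].split(","))
assert not (new & existing), "node already part of the cluster"
```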
Step 2: Run the installation
$ ./gcinstall.py --silent=demo.options
******************************************************************
*************** Thank you for choosing GBase product!
………………
******************************************************************
Do you accept the above licence agreement ([Y,y]/[N,n])? y
******************************************************************
Welcome to install GBase products
******************************************************************
Environmental Checking on gcluster nodes.
CoordinateHost:
DataHost: 172.168.83.15
Are you sure to install GCluster on these nodes ([Y,y]/[N,n])? y
………………
172.168.83.15 install cluster on host 172.168.83.15 successfully.
update and sync configuration file...
Starting all gcluster nodes...
adding new datanodes to gcware...
$
## The messages above indicate that the installation succeeded.
Cluster status after installation:
$ gcadmin
CLUSTER STATE: ACTIVE
================================================================
| GBASE COORDINATOR CLUSTER INFORMATION |
================================================================
| NodeName | IpAddress | gcware | gcluster | DataState |
----------------------------------------------------------------
| coordinator1 | 172.168.83.11 | OPEN | OPEN | 0 |
----------------------------------------------------------------
| coordinator2 | 172.168.83.12 | OPEN | OPEN | 0 |
----------------------------------------------------------------
| coordinator3 | 172.168.83.13 | OPEN | OPEN | 0 |
----------------------------------------------------------------
===============================================
| GBASE VIRTUAL CLUSTER INFORMATION |
===============================================
| VcName | DistributionId | comment |
-----------------------------------------------
| vc1 | 1 | vc1comments |
-----------------------------------------------
| vc2 | 2 | vc2comments |
-----------------------------------------------
==============================================================
| GBASE CLUSTER FREE DATA NODE INFORMATION |
==============================================================
| NodeName | IpAddress | gnode | syncserver | DataState |
--------------------------------------------------------------
| FreeNode1 | 172.168.83.15 | OPEN | OPEN | 0 |
--------------------------------------------------------------
2 virtual cluster: vc1, vc2
3 coordinator node
1 free data node
Adding the node to the VC being expanded
Use the gcadmin addnodes command to add the free node to the VC being expanded before proceeding to the next step.
Step 1: Edit the gcChangeInfo.xml file:
$ cat gcChangeInfo.xml
<?xml version="1.0" encoding="utf-8"?>
<servers>
<rack>
<node ip="172.168.83.15"/>
</rack>
</servers>
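For repeated expansions, a gcChangeInfo.xml like the one above can be generated instead of hand-edited. A minimal Python sketch (the helper name is ours; the manual only shows the hand-written file):

```python
# Sketch: build a gcChangeInfo.xml fragment from a list of node IPs.
# Assumes the single-rack layout shown in the manual.
import xml.etree.ElementTree as ET

def make_gc_change_info(ips):
    servers = ET.Element("servers")
    rack = ET.SubElement(servers, "rack")
    for ip in ips:
        ET.SubElement(rack, "node", ip=ip)
    return ('<?xml version="1.0" encoding="utf-8"?>\n'
            + ET.tostring(servers, encoding="unicode"))

print(make_gc_change_info(["172.168.83.15"]))
```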
Step 2: Add the free node to vc1:
$ gcadmin addnodes gcChangeInfo.xml vc1
gcadmin add nodes ...
flush statemachine success
gcadmin addnodes to vc [vc1] success
Cluster status after the node is added:
$ gcadmin
CLUSTER STATE: ACTIVE
================================================================
| GBASE COORDINATOR CLUSTER INFORMATION |
================================================================
| NodeName | IpAddress | gcware | gcluster | DataState |
----------------------------------------------------------------
| coordinator1 | 172.168.83.11 | OPEN | OPEN | 0 |
----------------------------------------------------------------
| coordinator2 | 172.168.83.12 | OPEN | OPEN | 0 |
----------------------------------------------------------------
| coordinator3 | 172.168.83.13 | OPEN | OPEN | 0 |
----------------------------------------------------------------
===============================================
| GBASE VIRTUAL CLUSTER INFORMATION |
===============================================
| VcName | DistributionId | comment |
-----------------------------------------------
| vc1 | 1 | vc1comments |
-----------------------------------------------
| vc2 | 2 | vc2comments |
-----------------------------------------------
2 virtual cluster: vc1, vc2
3 coordinator node
0 free data node
$ gcadmin showcluster vc vc1
CLUSTER STATE: ACTIVE
VIRTUAL CLUSTER MODE: NORMAL
===============================================
| GBASE VIRTUAL CLUSTER INFORMATION |
===============================================
| VcName | DistributionId | comment |
-----------------------------------------------
| vc1 | 1 | vc1comments |
-----------------------------------------------
================================================================================
| VIRTUAL CLUSTER DATA NODE INFORMATION |
================================================================================
| NodeName | IpAddress | DistributionId | gnode | syncserver | DataState |
--------------------------------------------------------------------------------
| node1 |172.168.83.11 | 1 | OPEN | OPEN | 0 |
--------------------------------------------------------------------------------
| node2 |172.168.83.12 | 1 | OPEN | OPEN | 0 |
--------------------------------------------------------------------------------
| node3 |172.168.83.15 | | OPEN | OPEN | 0 |
--------------------------------------------------------------------------------
3 data node
Creating a new distribution
Step 1: Edit the gcChangeInfo.xml file in the installation directory so that it lists the IPs of all nodes after expansion, i.e. both the existing nodes and the newly added one.
A reference gcChangeInfo.xml after editing:
$ cat gcChangeInfo.xml
<?xml version="1.0" encoding="utf-8"?>
<servers>
<rack>
<node ip="172.168.83.15"/>
<node ip="172.168.83.11"/>
<node ip="172.168.83.12"/>
</rack>
</servers>
Step 2: Run the command that creates the distribution.
$ gcadmin distribution gcChangeInfo.xml p 1 d 1 vc vc1
gcadmin generate distribution ...
copy system table to 172.168.83.15
gcadmin generate distribution successful
Cluster information after completion:
$ gcadmin showdistribution vc vc1
Distribution ID: 3 | State: new | Total segment num: 3
Primary Segment Node IP Segment ID Duplicate Segment node IP
=========================================================================
| 172.168.83.15 | 1 | 172.168.83.11 |
-------------------------------------------------------------------------
| 172.168.83.11 | 2 | 172.168.83.12 |
-------------------------------------------------------------------------
| 172.168.83.12 | 3 | 172.168.83.15 |
=========================================================================
Distribution ID: 1 | State: old | Total segment num: 2
Primary Segment Node IP Segment ID Duplicate Segment node IP
=========================================================================
| 172.168.83.11 | 1 | 172.168.83.12 |
-------------------------------------------------------------------------
| 172.168.83.12 | 2 | 172.168.83.11 |
=========================================================================
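The new segment table above follows a simple pattern for "p 1 d 1": each node, in gcChangeInfo.xml order, is primary for one segment, and that segment's single duplicate lands on the next node, wrapping around at the end. The sketch below reproduces the printed layout; this is an observation from this output, not a documented placement policy, and other p/d values may be placed differently.

```python
# Sketch: reproduce the "p 1 d 1" segment layout shown above.
# Each node holds one primary segment; its duplicate goes to the
# next node in list order, cyclically.
def layout_p1_d1(nodes):
    n = len(nodes)
    return [
        {"segment": i + 1,
         "primary": nodes[i],
         "duplicate": nodes[(i + 1) % n]}
        for i in range(n)
    ]

for seg in layout_p1_d1(["172.168.83.15", "172.168.83.11", "172.168.83.12"]):
    print(seg)
```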
Initializing the hashmap and redistributing data
In this step you can set the priority of the rebalance tasks. First set gcluster_rebalancing_concurrent_count=0 to prevent rebalance tasks from running. Then use rebalance instance to add all tables in the current cluster to gclusterdb.rebalancing_status. After adjusting the priority of each table's rebalance task, set gcluster_rebalancing_concurrent_count to the desired concurrency to start the data redistribution. For detailed steps, see the section "Adjusting rebalance task priority".
Step 1: Initialize the hashmap:
$ gccli -uroot
GBase client 9.5.3.17.117651. Copyright (c) 2004-2020, GBase. All
Rights Reserved.
gbase> use vc vc1;
Query OK, 0 rows affected (Elapsed: 00:00:00.00)
gbase> initnodedatamap;
Query OK, 0 rows affected, 5 warnings (Elapsed: 00:00:01.45)
Step 2: Run the data redistribution. In this example no priority adjustment is performed:
gbase> show variables like '%rebalanc%';
+-------------------------------------------------------+-----------+
| Variable_name | Value |
+-------------------------------------------------------+-----------+
| _t_gcluster_rebalance_mirror_node | 0 |
| gcluster_load_rebalance_seed | 5 |
| gcluster_rebalancing_concurrent_count | 5 |
| gcluster_rebalancing_ignore_mirror | OFF |
| gcluster_rebalancing_immediate_recover_internal_table | OFF |
| gcluster_rebalancing_parallel_degree | 4 |
| gcluster_rebalancing_random_table_quick_mode | 1 |
| gcluster_rebalancing_step | 100000000 |
| gcluster_rebalancing_update_status_on_drop_table | ON |
+-------------------------------------------------------+-----------+
9 rows in set (Elapsed: 00:00:00.24)
gbase> rebalance instance;
Query OK, 2 rows affected (Elapsed: 00:00:01.45)
Check the rebalance status:
gbase> select index_name, status, percentage from
gclusterdb.rebalancing_status;
+------------+-----------+------------+
| index_name | status | percentage |
+------------+-----------+------------+
| demo.t | COMPLETED | 100 |
| demo.tt | COMPLETED | 100 |
+------------+-----------+------------+
2 rows in set (Elapsed: 00:00:00.04)
gbase> quit
Bye
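Before removing the old distribution, every row in gclusterdb.rebalancing_status should read COMPLETED. The check can be expressed as a pure function over the fetched rows; how the rows are fetched (interactively via gccli as above, or through a driver) is left out of this sketch.

```python
# Sketch: given rows of (index_name, status, percentage) from
# gclusterdb.rebalancing_status, decide whether redistribution is done.
def rebalance_done(rows):
    return all(status == "COMPLETED" and pct == 100
               for _name, status, pct in rows)

rows = [("demo.t", "COMPLETED", 100), ("demo.tt", "COMPLETED", 100)]
assert rebalance_done(rows)
assert not rebalance_done([("demo.t", "RUNNING", 40)])
```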
Removing the old distribution
Step 1: Confirm the current distribution IDs. In this example the new Distribution ID is 3 and the old Distribution ID is 1:
$ gcadmin showdistribution vc vc1
Distribution ID: 3 | State: new | Total segment num: 3
Primary Segment Node IP Segment ID Duplicate Segment node IP
=========================================================================
| 172.168.83.15 | 1 | 172.168.83.11 |
-------------------------------------------------------------------------
| 172.168.83.11 | 2 | 172.168.83.12 |
-------------------------------------------------------------------------
| 172.168.83.12 | 3 | 172.168.83.15 |
=========================================================================
Distribution ID: 1 | State: old | Total segment num: 2
Primary Segment Node IP Segment ID Duplicate Segment node IP
=========================================================================
| 172.168.83.11 | 1 | 172.168.83.12 |
-------------------------------------------------------------------------
| 172.168.83.12 | 2 | 172.168.83.11 |
=========================================================================
Step 2: Confirm that no table in the cluster still uses the old Distribution ID:
gbase> select index_name,tbname,data_distribution_id,vc_id from
gbase.table_distribution;
+------------------------------+------------------+---------------------+---------+
| index_name |tbname |data_distribution_id | vc_id |
+------------------------------+------------------+---------------------+---------+
| gclusterdb.rebalancing_status|rebalancing_status| 3 | vc00001 |
| gclusterdb.dual |dual | 3 | vc00001 |
| demo.t |t | 3 | vc00001 |
| demo.tt |tt | 3 | vc00001 |
+------------------------------+------------------+---------------------+---------+
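The confirmation in step 2 can also be scripted: given rows of (index_name, data_distribution_id) from gbase.table_distribution, it is only safe to remove the old distribution when no table still references it. A sketch of the check (row fetching omitted):

```python
# Sketch: list tables still bound to an old distribution ID.
# rmdistribution should only be run when this list is empty.
def tables_on_distribution(rows, old_id):
    return [name for name, dist_id in rows if dist_id == old_id]

rows = [("gclusterdb.rebalancing_status", 3), ("gclusterdb.dual", 3),
        ("demo.t", 3), ("demo.tt", 3)]
assert tables_on_distribution(rows, 1) == []   # old ID 1 no longer in use
```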
Step 3: Remove the old distribution:
$ gcadmin rmdistribution 1 vc vc1
cluster distribution ID [1]
it will be removed now
please ensure this is ok, input [Y,y] or [N,n]: y
select count(*) from gbase.nodedatamap where data_distribution_id=1 result is not 0
refreshnodedatamap drop 1 success
gcadmin remove distribution [1] success
$ gcadmin showdistribution vc vc1
Distribution ID: 3 | State: new | Total segment num: 3
Primary Segment Node IP Segment ID Duplicate Segment node IP
===================================================================================
| 172.168.83.15 | 1 | 172.168.83.11 |
-----------------------------------------------------------------------------------
| 172.168.83.11 | 2 | 172.168.83.12 |
-----------------------------------------------------------------------------------
| 172.168.83.12 | 3 | 172.168.83.15 |
===================================================================================
Expanding pure coordinator nodes
Cluster environment:
Coordinator 节点: 172.168.83.11, 172.168.83.12, 172.168.83.13
Data 节点:
vc1: 172.168.83.11, 172.168.83.12
vc2: 172.168.83.13, 172.168.83.14
IP of the new coordinator node: 172.168.83.15
Preparing the configuration file
Step 1: Edit the demo.options file:
1) Set the coordinateHost parameter to the IP of the node to be installed;
2) Set the coordinateHostNodeID parameter to an integer ID for the node, in one-to-one correspondence with the entries in coordinateHost and not duplicating any existing node ID;
3) Set the existCoordinateHost parameter to the IPs of the existing Coordinator nodes;
4) Set the existDataHost parameter to the IPs of all existing data nodes.
A reference demo.options after editing:
$ cat demo.options
installPrefix= /opt
coordinateHost = 172.168.83.15
coordinateHostNodeID =15
#dataHost = 172.168.83.15
existCoordinateHost =172.168.83.11,172.168.83.12,172.168.83.13
existDataHost =172.168.83.11,172.168.83.12,172.168.83.13,172.168.83.14
existGcwareHost=172.168.83.11,172.168.83.12,172.168.83.13
#gcwareHost =
#gcwareHostNodeID =
dbaUser = gbase
dbaGroup = gbase
dbaPwd = 'gbasedba'
rootPwd = '111111'
#rootPwdFile = rootPwd.json
Stopping the cluster services on all nodes
Step 1: Run the service stop commands on every node:
$ gcluster_services all stop
Stopping gcrecover : [ OK ]
Stopping gcluster : [ OK ]
Stopping gbase : [ OK ]
Stopping syncserver : [ OK ]
$ gcware_services all stop
Stopping GCWareMonit success!
Stopping gcware : [ OK ]
Installing the node
Step 1: Run the installation
$ ./gcinstall.py --silent=demo.options --timeout=120
*********************************************************************************
Thank you for choosing GBase product!
………………
******************************************************************
Do you accept the above licence agreement ([Y,y]/[N,n])? y
******************************************************************
Welcome to install GBase products
******************************************************************
Environmental Checking on gcluster nodes.
CoordinateHost:
DataHost: 172.168.83.15
Are you sure to install GCluster on these nodes ([Y,y]/[N,n])? y
………………
172.168.83.11 install cluster on host 172.168.83.11 successfully.
172.168.83.12 install cluster on host 172.168.83.12 successfully.
172.168.83.13 install cluster on host 172.168.83.13 successfully.
172.168.83.15 install cluster on host 172.168.83.15 successfully.
update and sync configuration file...
Starting all gcluster nodes...
sync coordinator system tables...
$
## The messages above indicate that the installation succeeded.
Cluster status after installation:
$ gcadmin
CLUSTER STATE: ACTIVE
================================================================
| GBASE COORDINATOR CLUSTER INFORMATION |
================================================================
| NodeName | IpAddress | gcware | gcluster | DataState |
----------------------------------------------------------------
| coordinator1 | 172.168.83.11 | OPEN | OPEN | 0 |
----------------------------------------------------------------
| coordinator2 | 172.168.83.12 | OPEN | OPEN | 0 |
----------------------------------------------------------------
| coordinator3 | 172.168.83.13 | OPEN | OPEN | 0 |
----------------------------------------------------------------
| coordinator4 | 172.168.83.15 | OPEN | OPEN | 0 |
----------------------------------------------------------------
=============================================
| GBASE VIRTUAL CLUSTER INFORMATION |
=============================================
| VcName | DistributionId | comment |
---------------------------------------------
| vc1 | 1 | |
---------------------------------------------
| vc2 | 2 | |
---------------------------------------------
2 virtual cluster: vc1, vc2
4 coordinator node
0 free data node