DB2的HADR的搭建

本文档详细记录了在搭建DB2 High Availability Disaster Recovery (HADR)时遇到的问题,包括SQL0970N错误、HADR无法启动、数据同步问题等,并给出了相应的解决步骤,如修改监听端口、设置数据库为可读、授权操作等。

整理这篇文档主要记录复制搭建过程中遇到的问题,被其中一个搞了好久才突然一下子明白过来。。。。

1.备份恢复的时候报错:SQL0970N The system attempted to write to a read-only file. SQLSTATE=55009
2.SQL1768N Unable to start HADR. Reason code = “5”.
3.没有复制。。。

辅助工具: db2diag
db2diag -level Error
在一台机器上安装了两个实例db2inst1,db2inst2,然后将HADR_LOCAL_SVC 和HADR_REMOTE_SVC设置的与实例的运行端口一致,所以,一直在报“SQL1768N Unable to start HADR. Reason code = “5””的错误,实际上在这里有个误解,我们配置MySQL的复制时,要使用端口号3306,但是这里的复制监听端口跟MySQL的端口不一样的,因为DB2的进程实际上就是Linux的进程,所以在这里需要另行开端口,也就是我们随意指定一个监听端口,当启动HADR后,这个端口就会打开,负责复制的通信。

如图:
这里写图片描述

2017-09-01-16.28.54.340565+480 I908211E796           LEVEL: Error
PID     : 36456                TID : 139973198079744 PROC : db2sysc
INSTANCE: db2inst2             NODE : 000            DB   : KILO
APPHDL  : 0-17                 APPID: *LOCAL.db2inst2.170901082853
AUTHID  : DB2INST2             HOSTNAME: db22.com
EDUID   : 22                   EDUNAME: db2agent (KILO)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrValidateLocalServiceDbCfg, probe:21210
MESSAGE : ZRC=0x87800142=-2021654206=HDR_ZRC_CONFIG_LOCAL_SVC
          "HADR_LOCAL_SVC configuration parameter is an invalid service name"
DATA #1 : <preformatted>
HADR_LOCAL_SVC cannot be same port as SVCENAME or SVCENAME +1  
HADR_LOCAL_SVC 54001,  Port # 54001  
SVCENAME       54001,  Port # 54001  
SVCENAME + 1   54002

诊断日志中说HADR_LOCAL_SVC cannot be same port as SVCENAME or SVCENAME +1 ,所以突然就想到应该是要另开端口的。

2.db2inst1上的备份,db2inst2上恢复时报SQL0970N :read-only的错误

[db2inst2@a18 ~]$ db2 restore db kilo taken at 20170904100841
SQL0970N  The system attempted to write to a read-only file.  SQLSTATE=55009
2017-09-04-10.12.50.228929+480 I169707E3155          LEVEL: Error (OS)
PID     : 4096                 TID : 140640650258176 PROC : db2sysc
INSTANCE: db2inst2             NODE : 000            DB   :         
APPHDL  : 0-13                 APPID: *LOCAL.db2inst2.170904020959
AUTHID  : DB2INST2             HOSTNAME: a18.mysql.iso0.lfk.com
EDUID   : 21                   EDUNAME: db2agent (instance)
FUNCTION: DB2 Common, OSSe, ossErrorIOAnalysis, probe:100
CALLED  : OS, -, open                             OSERR: EACCES (13)
DATA #1 : String, 157 bytes
A total of 2 analysis will be performed :
 - User info
 - Path access permission
[db2inst2@a18 ~]$ db2 restore db kilo taken at 20170904100841 on /home/db2inst2
DB20000I  The RESTORE DATABASE command completed successfully.

3.在primary上建表并插入数据,standby上没有收到。

db2diag -level Error
... ... ...
2017-09-01-17.08.55.963065+480 E3711665E645          LEVEL: Error
PID     : 39151                TID : 140330070435584 PROC : db2sysc
INSTANCE: db2inst1             NODE : 000            DB   : KILO    
HOSTNAME: db22.com
EDUID   : 41                   EDUNAME: db2hadrp.0.1 (KILO)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduAcceptEvent, probe:20200
MESSAGE : Did not receive anything through HADR connection for the duration of 
          HADR_TIMEOUT. Closing connection.
DATA #1 : String, 30 bytes
hdrCurrentTime/hdrLastRecvTime
DATA #2 : unsigned integer, 4 bytes
1504256935
DATA #3 : unsigned integer, 4 bytes
1504256417

2017-09-01-18.18.58.511765+480 E3720982E514          LEVEL: Error (OS)
PID     : 39151                TID : 140330070435584 PROC : db2sysc
INSTANCE: db2inst1             NODE : 000            DB   : KILO    
HOSTNAME: db22.com
EDUID   : 41                   EDUNAME: db2hadrp.0.1 (KILO)
FUNCTION: DB2 UDB, oper system services, sqloPdbQuerySocketErrorStatus, probe:15
MESSAGE : ZRC=0x810F001C=-2129723364=SQLO_NO_CONNECTION "No connection"
CALLED  : OS, -, getsockopt                       OSERR: ECONNRESET (104)

2017-09-01-18.19.19.039294+480 I3721497E557          LEVEL: Error
PID     : 39151                TID : 140330070435584 PROC : db2sysc
INSTANCE: db2inst1             NODE : 000            DB   : KILO    
HOSTNAME: db22.com
EDUID   : 41                   EDUNAME: db2hadrp.0.1 (KILO)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSendRedirectMsgToOneAddress, probe:31070
MESSAGE : ZRC=0xFFFFFFFF=-1
DATA #1 : <preformatted>
The HADR primary was not able to form a TCP connection with the standby: [0000:0000:0000:0000:0000:0000:0000:0001]:54001.


... ... ...
2017-09-04-11.13.22.594229+480 I445495E514           LEVEL: Error
PID     : 11131                TID : 140677660796672 PROC : db2sysc
INSTANCE: db2inst2             NODE : 000            DB   : KILO    
HOSTNAME: a18.mysql.iso0.lfk.com
EDUID   : 90                   EDUNAME: db2hadrs.0.0 (KILO)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduAcceptEvent, probe:20280
MESSAGE : ZRC=0x810F0019=-2129723367=SQLO_CONN_REFUSED "Connection refused"
DATA #1 : <preformatted>
Failed to connect to primary.


The HADR primary was not able to form a TCP connection with the standby: [0000:0000:0000:0000:0000:0000:0000:0001]:54001.

4.备库设置可读

[db2inst2@a18 ~]$ db2 connect to kilo
SQL1776N  The command cannot be issued on an HADR database. Reason code = "1".

启用“在备用数据库上读取”
将 DB2_HADR_ROS 注册表变量设置为 ON。

[db2inst2@a18 ~]$ db2set DB2_HADR_ROS=ON
#重启一下db2实例,重启一下复制
[db2inst2@a18 ~]$ db2 connect to kilo

   Database Connection Information

 Database server        = DB2/LINUXX8664 11.1.2.2
 SQL authorization ID   = DB2INST2
 Local database alias   = KILO

5.设置了可读,如果用户不一致,还是会因为没有权限报SQL0551N的错误呢

SQL0551N
authorization-ID does not have the required authorization or privilege to perform operation operation on object object-name.
SELECT AUTHORITY, D_USER, D_GROUP, D_PUBLIC, ROLE_USER, ROLE_GROUP, ROLE_PUBLIC, D_ROLE  FROM TABLE (SYSPROC.AUTH_LIST_AUTHORITIES_FOR_AUTHID ('ALICE', 'U') ) AS T   ORDER BY AUTHORITY


在备份之前进行授权。

授权后备份恢复后又遇到一个问题:

执行select时报错:SQL1773N,
查找参考官文:

SQL1773N
该语句或命令失败,因为其需要使用允读的 HADR 备用数据库上不支持的功能。原因码:原因码。
说明

与原因码对应的解释是:

1

    使用了除 UR 之外的隔离级别,并且已关闭 DB2_STANDBY_ISO 注册表变量。
db2set  DB2_STANDBY_ISO=UR


终于正确读出来了~

https://www.ibm.com/support/knowledgecenter/zh/SSEPGG_10.1.0/com.ibm.db2.luw.admin.ha.doc/doc/t0011725.html

操作步骤

本次搭建采用一机两个实例搭建:

HADR角色hostsvcnameinstname数据库版本数据库名称
主库localhostba5oltp110.5ST
备库localhostdb2inst110.5

1).主机上设置循环日志

2).主机上连接到授权

db2 connect to st
db2 "grant DATAACCESS ON DATABASE TO DB2INST1"

3).主机上备份数据库

db2 backup database to /tmp

4).root更改备份文件所有者

chown db2inst1 /tmp/ST*

5).备库上恢复

db2 restore db st from /tmp taken at 20170920112008 on /home/db2inst1

6).编辑/etc/services文件,添加与SVCENAME不同的服务名和端口号

#ba5oltp1
$db2 get dbm cfg | grep SVC
 TCP/TP Service name              (SVCENAME)=DB2cbaoltp1
#主库所在的实例服务
DB2cba5oltp1    50110/tcp
#添加主库HADR的端口号
DB2_ba5oltp1_1  60071/tcp

#备库所在的实例服务
DB2_db2inst1    60011/tcp
#备库HADR的端口号
DB2_db2inst1_1  60013/tcp

7).根据配置执行更新

#主库上:
db2 update db cfg for ST using HADR_LOCAL_HOST localhost HADR_LOCAL_SVC DB2_ba5oltp1_1  HADR_REMOTE_HOST localhost HADR_REMOTE_SVC DB2_db2inst1_1 HADR_REMOTE_INST db2inst1



#备库上:
db2 update db cfg for ST using HADR_LOCAL_HOST localhost HADR_LOCAL_SVC DB2_db2inst1_1 HADR_REMOTE_HOST localhost HADR_REMOTE_SVC DB2_ba5oltp1_1    HADR_REMOTE_INST ba5oltp1

启动HADR

备库上:
评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值