APPLIES TO:
Oracle Database - Enterprise Edition - Version 10.2.0.1 and later
Information in this document applies to any platform.
GOAL
How do you configure server-side Transparent Application Failover (TAF)?
Server-side TAF settings override any client-side counterparts configured in TNS connect descriptors. If TAF is not configured on the client side, then at a minimum the failover type must be set to enable TAF. If the failover type is set on the server side, the failover method defaults to BASIC. Delay and retries are optional and can be specified independently.
SOLUTION
1. Create a service on the RAC cluster to set up TAF. This example creates a service called server_taf for the database called rac, with instance names rac1 and rac2.
Note: choose a service name that is unique and different from the default service name. A special Oracle database service is created by default for your Oracle RAC database. This default service is always available on all instances in an Oracle RAC environment, unless an instance is in restricted mode. You cannot alter this service or its properties. See http://docs.oracle.com/cd/E11882_01/rac.112/e41960/hafeats.htm#CHDDBHHB
2. Start the service server_taf
3. Check that the service is running
ractest PREF: rac1 rac2 AVAIL:
server_taf PREF: rac1 rac2 AVAIL:
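The commands for steps 1 to 3 are not shown above. A minimal sketch, assuming the 10.2-style srvctl options (-d database, -s service, -r preferred instances, -P TAF policy) and the names used in this example, could look like the following. The RUN switch and the variable names are illustrative assumptions, not part of the original note; by default the script only prints the commands so they can be reviewed first.

```shell
# Names taken from step 1; adjust them for your cluster.
DB=rac
SERVICE=server_taf
INSTANCES=rac1,rac2

add_cmd="srvctl add service -d $DB -s $SERVICE -r $INSTANCES -P BASIC"
start_cmd="srvctl start service -d $DB -s $SERVICE"
config_cmd="srvctl config service -d $DB"

# Dry run by default: print each command.
# Set RUN=1 on a cluster node (hypothetical switch) to execute them instead.
for cmd in "$add_cmd" "$start_cmd" "$config_cmd"; do
  if [ "${RUN:-0}" = "1" ]; then
    $cmd || exit 1
  else
    echo "$cmd"
  fi
done
```

The last command lists every service of the database with its preferred and available instances, which is the form of the output shown in step 3.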
4. Find the service_id value for the service just created
Connect / as sysdba
SQL> select name,service_id from dba_services where name = 'server_taf';
NAME                                                             SERVICE_ID
---------------------------------------------------------------- ----------
server_taf                                                                6
5. Review the standard setup for the services
col failover_method format a11 heading 'METHOD'
col failover_type format a10 heading 'TYPE'
col failover_retries format 9999999 heading 'RETRIES'
col goal format a10
col clb_goal format a8
col AQ_HA_NOTIFICATIONS format a5 heading 'AQNOT'
SQL> select name, failover_method, failover_type, failover_retries, goal, clb_goal, aq_ha_notifications
from dba_services where service_id = 6;
NAME            METHOD      TYPE        RETRIES GOAL       CLB_GOAL AQNOT
--------------- ----------- ---------- -------- ---------- -------- -----
server_taf                                                 LONG     NO
Note that there are no values for method, type, or retries. These are required for server-side TAF.
The cause of this problem has been identified and verified in unpublished Bug 6886239, "DBMS_SERVICE parameters are not added using srvctl add service". This is fixed in release 11.2 onwards.
6. Add the server-side failover parameters to the service (pre-11.2).
SQL> execute dbms_service.modify_service (service_name => 'server_taf' -
, aq_ha_notifications => true -
, failover_method => dbms_service.failover_method_basic -
, failover_type => dbms_service.failover_type_select -
, failover_retries => 180 -
, failover_delay => 5 -
, clb_goal => dbms_service.clb_goal_long);
PL/SQL procedure successfully completed.
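The retry and delay values in step 6 together bound how long a client keeps trying to reconnect. As a rough worked example, assuming failover_delay is the wait in seconds between attempts and ignoring the time each connection attempt itself takes:

```shell
retries=180   # failover_retries from step 6
delay=5       # failover_delay (seconds) from step 6

# Approximate upper bound on the time spent waiting between reconnect attempts.
window=$((retries * delay))
echo "Reconnect attempts span roughly ${window}s ($((window / 60)) min)"
# prints: Reconnect attempts span roughly 900s (15 min)
```

If the surviving instance is expected to offer the service much sooner than that, smaller values may be appropriate.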
Additional failover parameter values can be found in the Oracle® Database PL/SQL Packages and Types Reference 11g Release 1 (11.1), under section 116, DBMS_SERVICE.
For version 11.2, use SRVCTL to modify the service.
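The exact command is not shown here. A sketch, assuming the 11.2 srvctl modify service options -m (failover method), -e (failover type), -q (AQ HA notifications) and -j (connection load balancing goal), chosen to match the configuration output listed in this step:

```shell
DB=RAC
SERVICE=server_taf

modify_cmd="srvctl modify service -d $DB -s $SERVICE -m BASIC -e SELECT -q TRUE -j LONG"

# Dry run by default: print the command.
# Set RUN=1 on a cluster node (hypothetical switch) to execute it instead.
if [ "${RUN:-0}" = "1" ]; then
  $modify_cmd || exit 1
else
  echo "$modify_cmd"
fi
```

In 11.2 the retries and delay can be set in the same command with -z and -w (for example -z 180 -w 5); they are left out here, so both stay at 0.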
The service can be checked with the command:
srvctl config service -d RAC
Service name: server_taf
Service is enabled
Server pool: RAC_server_taf
Cardinality: 2
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: true
Failover type: SELECT
Failover method: BASIC
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition:
Preferred instances: RAC1,RAC2
Available instances:
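To pick the TAF fields out of a long listing like the one above, a small filter can help. The function name taf_summary is an assumption made for this sketch, and it expects the 11.2 "label: value" layout shown above.

```shell
# Reads `srvctl config service` output on stdin and prints the TAF fields on one line.
taf_summary() {
  awk -F': *' '
    $1 == "Failover type"        { type = $2 }
    $1 == "Failover method"      { method = $2 }
    $1 == "TAF failover retries" { retries = $2 }
    $1 == "TAF failover delay"   { delay = $2 }
    END { printf "type=%s method=%s retries=%s delay=%s\n", type, method, retries, delay }'
}
```

On a cluster node it would be used as `srvctl config service -d RAC | taf_summary`. An empty type or method in the result means the server-side TAF settings are still missing.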
7. Check the service again. Method, Type, and Retries now have values.
SQL> select name, failover_method, failover_type, failover_retries, goal, clb_goal, aq_ha_notifications
from dba_services where service_id = 6;
NAME            METHOD      TYPE        RETRIES GOAL       CLB_GOAL AQNOT
--------------- ----------- ---------- -------- ---------- -------- -----
server_taf      BASIC       SELECT          180 NONE       LONG     YES
8. Check that the listener has the service registered. (The output will look similar to the following, depending on the version used.)
Service "server_taf.za.oracle.com" has 2 instance(s).
Instance "rac1", status READY, has 2 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=dell01)(PORT=1521))
"DEDICATED" established:0 refused:0 state:ready
LOCAL SERVER
Instance "rac2", status READY, has 1 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=dell02)(PORT=1521))
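The listing above is presumably the output of `lsnrctl services`; the command itself is not shown. As a sketch, the check can be scripted by counting the READY instances registered for the service. The function name ready_instances is an assumption, and the match is a prefix match on the service name, so server_taf also matches server_taf.za.oracle.com.

```shell
# Reads `lsnrctl services` output on stdin and prints the number of
# READY instances listed under the service whose name starts with $1.
ready_instances() {
  awk -v svc="$1" '
    /^Service "/ { in_svc = (index($0, "Service \"" svc) == 1) }
    in_svc && /Instance "/ && /status READY/ { n++ }
    END { print n + 0 }'
}
```

On a cluster node it would be used as `lsnrctl services | ready_instances server_taf`, which for the output above should report 2.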
9. Create a net service name. Here we have client load balancing between the two nodes.
(DESCRIPTION =
(LOAD_BALANCE = yes)
(ADDRESS = (PROTOCOL = TCP)(HOST = dell01)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = dell02)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = server_taf.za.oracle.com)
)
)
10. Testing...
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
SQL> select host_name,instance_name from v$instance;
HOST_NAME
----------------------------------------------------------------
INSTANCE_NAME
----------------
dell02
rac2
11. Shut down the database instance on the node the connection was routed to
INSTANCE_NAME
----------------
rac2
SQL> shutdown abort;
ORACLE instance shut down.
12. TAF will now kick in
HOST_NAME
----------------------------------------------------------------
INSTANCE_NAME
----------------
dell01
rac1
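Besides checking which instance answers, the session's own TAF attributes can be inspected. The sketch below only prints a query against V$SESSION, which exposes the FAILOVER_TYPE, FAILOVER_METHOD and FAILED_OVER columns. The function name and the file name in the usage note are assumptions for illustration, and the connected user needs select access to V$SESSION.

```shell
# Prints a query that shows the TAF attributes of the current session.
taf_session_sql() {
  cat <<'EOF'
select failover_type, failover_method, failed_over
from v$session
where sid = sys_context('userenv', 'sid');
EOF
}
```

For example, `taf_session_sql > taf_check.sql` writes the query to a file that can be run with `@taf_check.sql` in the SQL*Plus session from step 10, once before step 11 and once after step 12. With server-side TAF in effect, the type and method should show SELECT and BASIC, and FAILED_OVER should change from NO to YES after the failover.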
The Oracle Net client trace of the sqlplus connection during failover shows:
[02-OCT-2007 12:15:44:618] niotns: Calling address: (DESCRIPTION=(LOAD_BALANCE=yes)(ADDRESS=(PROTOCOL=TCP)(HOST=dell01)(PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=dell02)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=server_taf.za.oracle.com)
(CID=(PROGRAM=d:\oracle\102ee\bin\sqlplus.exe)(HOST=sflood-uk2)(USER=sflood))))
Selected node dell02
[02-OCT-2007 12:15:44:648] nttbnd2addr: looking up IP addr for host: dell02
Connection handshake
[02-OCT-2007 12:15:44:878] nscon: sending NSPTCN packet
[02-OCT-2007 12:15:45:229] nscon: got NSPTRS packet
[02-OCT-2007 12:15:45:229] nscon: sending NSPTCN packet
[02-OCT-2007 12:15:45:429] nscon: got NSPTAC packet
The select running
[02-OCT-2007 12:16:04:046] nspsend: 00 00 00 00 00 E8 64 0B |......d.|
[02-OCT-2007 12:16:04:046] nspsend: 01 2E 73 65 6C 65 63 74 |..select|
[02-OCT-2007 12:16:04:046] nspsend: 20 68 6F 73 74 5F 6E 61 |.host_na|
[02-OCT-2007 12:16:04:046] nspsend: 6D 65 2C 69 6E 73 74 61 |me,insta|
[02-OCT-2007 12:16:04:046] nspsend: 6E 63 65 5F 6E 61 6D 65 |nce_name|
[02-OCT-2007 12:16:04:046] nspsend: 20 66 72 6F 6D 20 76 24 |.from.v$|
[02-OCT-2007 12:16:04:046] nspsend: 69 6E 73 74 61 6E 63 65 |instance|
Here is the time the instance was shut down
[02-OCT-2007 12:16:05:077] nioqrc: exit
[02-OCT-2007 12:18:20:642] nioqsn: entry
Select attempts to run again
[02-OCT-2007 12:18:20:652] nspsend: 00 00 00 00 00 E8 64 0B |......d.|
[02-OCT-2007 12:18:20:652] nspsend: 01 2E 73 65 6C 65 63 74 |..select|
[02-OCT-2007 12:18:20:652] nspsend: 20 68 6F 73 74 5F 6E 61 |.host_na|
[02-OCT-2007 12:18:20:652] nspsend: 6D 65 2C 69 6E 73 74 61 |me,insta|
[02-OCT-2007 12:18:20:652] nspsend: 6E 63 65 5F 6E 61 6D 65 |nce_name|
[02-OCT-2007 12:18:20:652] nspsend: 20 66 72 6F 6D 20 76 24 |.from.v$|
[02-OCT-2007 12:18:20:652] nspsend: 69 6E 73 74 61 6E 63 65 |instance|
Fails because the instance was shut down
[02-OCT-2007 12:18:20:652] nserror: nsres: id=0, op=68, ns=12537, ns2=12560; nt[0]=507, nt[1]=0, nt[2]=0; ora[0]=0, ora[1]=0, ora[2]=0
[02-OCT-2007 12:18:20:652] nsrdr: error exit
[02-OCT-2007 12:18:20:652] nsdo: nsctxrnk=0
[02-OCT-2007 12:18:20:652] nsdo: error exit
[02-OCT-2007 12:18:20:652] nioqer: entry
[02-OCT-2007 12:18:20:652] nioqer: incoming err = 12151
[02-OCT-2007 12:18:20:652] nioqce: entry
[02-OCT-2007 12:18:20:652] nioqce: exit
[02-OCT-2007 12:18:20:652] nioqer: returning err = 3113
TAF kicks in
[02-OCT-2007 12:18:20:652] nsc2addr: (DESCRIPTION=(LOAD_BALANCE=yes)(ADDRESS=(PROTOCOL=TCP)(HOST=dell01)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=server_taf.za.oracle.com)(CID=(PROGRAM=d:\oracle\102ee\bin\sqlplus.exe)(HOST=sflood-uk2)(USER=sflood))))
Connection fails over to node dell01
[02-OCT-2007 12:18:20:652] nttbnd2addr: looking up IP addr for host: dell01
Connection handshake is completed
[02-OCT-2007 12:18:20:863] nscon: sending NSPTCN packet
[02-OCT-2007 12:18:23:547] nscon: got NSPTRS packet
[02-OCT-2007 12:18:23:547] nscon: sending NSPTCN packet
[02-OCT-2007 12:18:23:747] nscon: got NSPTAC packet
Select is run
[02-OCT-2007 12:18:47:861] nspsend: 00 00 00 00 00 E8 64 0B |......d.|
[02-OCT-2007 12:18:47:861] nspsend: 01 2E 73 65 6C 65 63 74 |..select|
[02-OCT-2007 12:18:47:861] nspsend: 20 68 6F 73 74 5F 6E 61 |.host_na|
[02-OCT-2007 12:18:47:861] nspsend: 6D 65 2C 69 6E 73 74 61 |me,insta|
[02-OCT-2007 12:18:47:861] nspsend: 6E 63 65 5F 6E 61 6D 65 |nce_name|
[02-OCT-2007 12:18:47:861] nspsend: 20 66 72 6F 6D 20 76 24 |.from.v$|
[02-OCT-2007 12:18:47:861] nspsend: 69 6E 73 74 61 6E 63 65 |instance|