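Personal Summary
A good document I came across while reading Metalink. The title just doesn't say much about its subject, so even though I am copying it here, I'm not sure I will remember the name when I need the document later.
11gR2 Clusterware Key Facts
The GRID Home and the RAC/DB Home must be installed in different locations. 11gR2 Clusterware requires shared OCR files and voting files. These can be stored in ASM or on a cluster filesystem; storing them on raw devices is no longer supported.
The OCR is backed up automatically every 4 hours under /cdata// and can be restored with ocrconfig.
The voting file is backed up into the OCR automatically at every configuration change and can be restored with crsctl.
11gR2 Clusterware requires at least one private network for the interconnect and at least one public network for external connections. Several virtual IPs can be registered in DNS, including one VIP per node and three SCAN VIPs (personal note: it can be three or some other number; I used three, following the installation document).
Only one set of clusterware daemons can run on each node.
Killing clusterware daemons is not supported. (Personal note: not sure what this means?)
(For the other key facts, see the English text below; I won't write them out here.)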
Important Log Locations
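Clusterware logs are under /log/, and the document lists the structure of that directory.
alert.log - the first log to check when something goes wrong ^_^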
./admin:
./agent:
./agent/crsd:
./agent/crsd/oraagent_oracle:
./agent/crsd/ora_oc4j_type_oracle:
./agent/crsd/orarootagent_root:
./agent/ohasd:
./agent/ohasd/oraagent_oracle:
./agent/ohasd/oracssdagent_root:
./agent/ohasd/oracssdmonitor_root:
./agent/ohasd/orarootagent_root:
./client:
./crsd:
./cssd:
./ctssd:
./diskmon:
./evmd:
./gipcd:
./gnsd:
./gpnpd:
./mdnsd:
./ohasd:
./racg:
./racg/racgeut:
./racg/racgevtf:
./racg/racgmain:
./srvm:
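The cfgtoollogs dir under and $ORACLE_BASE contains other important logs, in particular the run records of rootcrs.pl and of configuration assistants (such as ASMCA).
ASM logs are under $ORACLE_BASE/diag/asm/+asm//trace.
The diagcollection.pl script under /bin can be used to automatically collect important information for later support use; it must be run as the root user.
srvctl and crsctl are the two commands used to manage cluster resources. As a rule, use srvctl for any cluster resource, and use crsctl only for what srvctl cannot do (for example, starting the cluster). Both commands have a help option; I won't paste the output again here since it appears below. The document also lists the help output of three other commands: ocrconfig, olsnodes, and cluvfy.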
ocrconfig:
http://download.oracle.com/docs/cd/E11882_01/rac.112/e16794/ocrsyntax.htm#CWADD92028
olsnodes :
http://download.oracle.com/docs/cd/E11882_01/rac.112/e16794/olsnodes.htm#CWADD91126
cluvfy:
http://download.oracle.com/docs/cd/E11882_01/rac.112/e16794/cvu.htm#BEHIJAJC
https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=BULLETIN&id=1053147.1
11gR2 Clusterware and Grid Home - What You Need to Know [ID 1053147.1]
Modified 27-SEP-2011 Type BULLETIN Status PUBLISHED
In this Document
Purpose
Scope and Application
11gR2 Clusterware and Grid Home - What You Need to Know
11gR2 Clusterware Key Facts
Clusterware Startup Sequence
Important Log Locations
Clusterware Resource Status Check
Clusterware Resource Administration
OCRCONFIG Options:
OLSNODES Options
Cluster Verification Options
Scalability RAC Community
References
Applies to:
Oracle Server - Enterprise Edition - Version: 11.2.0.1 to 11.2.0.1 - Release: 11.2 to 11.2
Information in this document applies to any platform.
Purpose
The 11gR2 Clusterware has undergone numerous changes since the previous release. For information on the previous release(s), see Note: 259301.1 "CRS and 10g Real Application Clusters". This document is intended to go over the 11.2 Clusterware which has some similarities and some differences from the previous version(s).
Scope and Application
This document is intended for RAC Database Administrators and Oracle support engineers.
11gR2 Clusterware and Grid Home - What You Need to Know
11gR2 Clusterware Key Facts
- 11gR2 Clusterware is required to be up and running prior to installing a 11gR2 Real Application Clusters database.
- The GRID home consists of the Oracle Clusterware and ASM. ASM should not be in a separate home.
- The 11gR2 Clusterware can be installed in "Standalone" mode for ASM and/or "Oracle Restart" single node support. This clusterware is a subset of the full clusterware described in this document.
- The 11gR2 Clusterware can be run by itself or on top of vendor clusterware. See the certification matrix for certified combinations. Ref: Note: 184875.1 "How To Check The Certification Matrix for Real Application Clusters"
- The GRID Home and the RAC/DB Home must be installed in different locations.
- The 11gR2 Clusterware requires shared OCR files and voting files. These can be stored on ASM or a cluster filesystem.
- The OCR is backed up automatically every 4 hours to /cdata// and can be restored via ocrconfig.
- The voting file is backed up into the OCR at every configuration change and can be restored via crsctl.
- The 11gR2 Clusterware requires at least one private network for inter-node communication and at least one public network for external communication. Several virtual IPs need to be registered with DNS. This includes the node VIPs (one per node), SCAN VIPs (three). This can be done manually via your network administrator or optionally you could configure the "GNS" (Grid Naming Service) in the Oracle clusterware to handle this for you (note that GNS requires its own VIP).
- A SCAN (Single Client Access Name) is provided to clients to connect to. For more info on SCAN see Note: 887522.1
- The root.sh script at the end of the clusterware installation starts the clusterware stack. For information on troubleshooting root.sh issues see Note: 1053970.1
- Only one set of clusterware daemons can be running per node.
- On Unix, the clusterware stack is started via the init.ohasd script referenced in /etc/inittab with "respawn".
- A node can be evicted (rebooted) if a node is deemed to be unhealthy. This is done so that the health of the entire cluster can be maintained. For more information on this see: Note: 1050693.1 "Troubleshooting 11.2 Clusterware Node Evictions (Reboots)"
- Either have vendor time synchronization software (like NTP) fully configured and running or have it not configured at all and let CTSS handle time synchronization. See Note: 1054006.1 for more information.
- If installing DB homes for a lower version, you will need to pin the nodes in the clusterware or you will see ORA-29702 errors. See Note 946332.1 and Note:948456.1 for more info.
- The clusterware stack can be started by either booting the machine, running "crsctl start crs" to start the clusterware stack, or by running "crsctl start cluster" to start the clusterware on all nodes. Note that crsctl is in the /bin directory. Note that "crsctl start cluster" will only work if ohasd is running.
- The clusterware stack can be stopped by either shutting down the machine, running "crsctl stop crs" to stop the clusterware stack, or by running "crsctl stop cluster" to stop the clusterware on all nodes. Note that crsctl is in the /bin directory.
- Killing clusterware daemons is not supported.
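The start and stop rules in the last few key facts can be sketched as a small helper. This is only an illustration: the function name `crsctl_command` and its parameters are hypothetical, and the helper only builds a command string, it never runs crsctl. It assumes, as the note states, that the "cluster" form requires ohasd to be running before a start.

```python
def crsctl_command(action, whole_cluster=False, ohasd_running=True):
    """Return the crsctl command line for starting or stopping the stack.

    Hypothetical helper for illustration: it only builds a string.
    """
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    if not whole_cluster:
        # "crsctl start crs" / "crsctl stop crs" act on the clusterware stack.
        return f"crsctl {action} crs"
    if action == "start" and not ohasd_running:
        # Per the note, "crsctl start cluster" only works if ohasd is running.
        raise RuntimeError("ohasd is not running; run 'crsctl start crs' first")
    # The note describes the "cluster" form as acting on all nodes.
    return f"crsctl {action} cluster"


if __name__ == "__main__":
    print(crsctl_command("start"))
    print(crsctl_command("stop", whole_cluster=True))
```

Raising an error for the "ohasd not running" case keeps the precondition from the note visible instead of letting the command fail later.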
Clusterware Startup Sequence
The following is the Clusterware startup sequence (image from the "Oracle Clusterware Administration and Deployment Guide"):
Don't let this picture scare you too much. You aren't responsible for managing all of these processes; that is the Clusterware's job!
Short summary of the startup sequence: INIT spawns init.ohasd (with respawn) which in turn starts the OHASD process (Oracle High Availability Services Daemon). This daemon spawns 4 processes.
Level 1: OHASD Spawns:
- cssdagent - Agent responsible for spawning CSSD.
- orarootagent - Agent responsible for managing all root owned ohasd resources.
- oraagent - Agent responsible for managing all oracle owned ohasd resources.
- cssdmonitor - Monitors CSSD and node health (along with the cssdagent).
Level 2: OHASD rootagent spawns:
- CRSD - Primary daemon responsible for managing cluster resources.
- CTSSD - Cluster Time Synchronization Services Daemon
- Diskmon
- ACFS (ASM Cluster File System) Drivers
Level 2: OHASD oraagent spawns:
- MDNSD - Used for DNS lookup
- GIPCD - Used for inter-process and inter-node communication
- GPNPD - Grid Plug & Play Profile Daemon
- EVMD - Event Monitor Daemon
- ASM - Resource for monitoring ASM instances
Level 3: CRSD spawns:
- orarootagent - Agent responsible for managing all root owned crsd resources.
- oraagent - Agent responsible for managing all oracle owned crsd resources.
Level 4: CRSD rootagent spawns:
- Network resource - To monitor the public network
- SCAN VIP(s) - Single Client Access Name Virtual IPs
- Node VIPs - One per node
- ACFS Registry - For mounting ASM Cluster File System
- GNS VIP (optional) - VIP for GNS
Level 4: CRSD oraagent spawns:
- ASM Resource - ASM Instance(s) resource
- Diskgroup - Used for managing/monitoring ASM diskgroups.
- DB Resource - Used for monitoring and managing the DB and instances
- SCAN Listener - Listener for single client access name, listening on SCAN VIP
- Listener - Node listener listening on the Node VIP
- Services - Used for monitoring and managing services
- ONS - Oracle Notification Service
- eONS - Enhanced Oracle Notification Service
- GSD - For 9i backward compatibility
- GNS (optional) - Grid Naming Service - Performs name resolution
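The levels above form a tree, and it can help to write that tree down as data. The sketch below transcribes the parent/child relationships from the list into a Python dictionary and walks it to answer "what has to be up before X can start?". The dictionary keys, the `spawn_chain` function, and the "OHASD ..." / "CRSD ..." prefixes used to tell the two sets of agents apart are naming choices made for this example, not Oracle identifiers.

```python
# Parent -> children, transcribed from the levels listed above.
SPAWNS = {
    "init": ["init.ohasd"],
    "init.ohasd": ["OHASD"],
    "OHASD": ["cssdagent", "OHASD orarootagent", "OHASD oraagent", "cssdmonitor"],
    "cssdagent": ["CSSD"],
    "OHASD orarootagent": ["CRSD", "CTSSD", "Diskmon", "ACFS Drivers"],
    "OHASD oraagent": ["MDNSD", "GIPCD", "GPNPD", "EVMD", "ASM"],
    "CRSD": ["CRSD orarootagent", "CRSD oraagent"],
    "CRSD orarootagent": [
        "Network resource", "SCAN VIP", "Node VIP", "ACFS Registry", "GNS VIP",
    ],
    "CRSD oraagent": [
        "ASM Resource", "Diskgroup", "DB Resource", "SCAN Listener",
        "Listener", "Services", "ONS", "eONS", "GSD", "GNS",
    ],
}


def spawn_chain(name):
    """Return the list of spawners from init down to `name` (inclusive)."""
    parent_of = {
        child: parent for parent, children in SPAWNS.items() for child in children
    }
    if name != "init" and name not in parent_of:
        raise KeyError(name)
    chain = [name]
    while chain[-1] in parent_of:
        chain.append(parent_of[chain[-1]])
    return list(reversed(chain))


if __name__ == "__main__":
    print(" -> ".join(spawn_chain("SCAN Listener")))
```

Reading the chain for a resource that is down gives a rough order in which to check the daemon logs.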
This image shows the various levels more clearly:
Important Log Locations
Clusterware daemon logs are all under /log/. Structure under /log/:
alert.log - look here first for most clusterware issues
./admin:
./agent:
./agent/crsd:
./agent/crsd/oraagent_oracle:
./agent/crsd/ora_oc4j_type_oracle:
./agent/crsd/orarootagent_root:
./agent/ohasd:
./agent/ohasd/oraagent_oracle:
./agent/ohasd/oracssdagent_root:
./agent/ohasd/oracssdmonitor_root:
./agent/ohasd/orarootagent_root:
./client:
./crsd:
./cssd:
./ctssd:
./diskmon:
./evmd:
./gipcd:
./gnsd:
./gpnpd:
./mdnsd:
./ohasd:
./racg:
./racg/racgeut:
./racg/racgevtf:
./racg/racgmain:
./srvm:
The cfgtoollogs dir under and $ORACLE_BASE contains other important logfiles. Specifically for rootcrs.pl and configuration assistants like ASMCA, etc...
ASM logs live under $ORACLE_BASE/diag/asm/+asm//trace
The diagcollection.pl script under /bin can be used to automatically collect important files for support. Run this as the root user.
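When jumping between daemon logs it can be handy to compute the directory from the daemon name. The following is a minimal sketch that encodes the directory listing above. The function names are hypothetical, and the log root is passed in by the caller because the full path is not spelled out here; the `/example/log/node1` value in the demo is a placeholder, not a real location. The agent directory names end in the owning OS user as shown in the listing, so they may differ on other systems.

```python
import posixpath

# Daemon directories directly under the log root, from the listing above.
DAEMON_DIRS = {
    "crsd", "cssd", "ctssd", "diskmon", "evmd",
    "gipcd", "gnsd", "gpnpd", "mdnsd", "ohasd",
}

# Agent directories under ./agent/<owner>/, from the listing above.
AGENT_DIRS = {
    "crsd": {"oraagent_oracle", "ora_oc4j_type_oracle", "orarootagent_root"},
    "ohasd": {
        "oraagent_oracle", "oracssdagent_root",
        "oracssdmonitor_root", "orarootagent_root",
    },
}


def daemon_log_dir(log_root, daemon):
    """Return the log directory of a clusterware daemon under log_root."""
    if daemon not in DAEMON_DIRS:
        raise ValueError(f"unknown daemon directory: {daemon}")
    return posixpath.join(log_root, daemon)


def agent_log_dir(log_root, owner, agent):
    """Return the log directory of an agent started by ohasd or crsd."""
    if agent not in AGENT_DIRS.get(owner, set()):
        raise ValueError(f"unknown agent directory: {owner}/{agent}")
    return posixpath.join(log_root, "agent", owner, agent)


if __name__ == "__main__":
    root = "/example/log/node1"  # placeholder log root for the demo
    print(daemon_log_dir(root, "cssd"))
    print(agent_log_dir(root, "ohasd", "oracssdagent_root"))
```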
Clusterware Resource Status Check
The following command will display the status of all cluster resources:
$ ./crsctl status resource -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
ora.LISTENER.lsnr
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
ora.SYSTEMDG.dg
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
ora.asm
ONLINE ONLINE racbde1 Started
ONLINE ONLINE racbde2 Started
ora.eons
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
ora.gsd
OFFLINE OFFLINE racbde1
OFFLINE OFFLINE racbde2
ora.net1.network
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
ora.ons
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
ora.registry.acfs
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE racbde1
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE racbde2
ora.LISTENER_SCAN3.lsnr
1 ONLINE ONLINE racbde2
ora.oc4j
1 OFFLINE OFFLINE
ora.rac.db
1 ONLINE ONLINE racbde1 Open
2 ONLINE ONLINE racbde2 Open
ora.racbde1.vip
1 ONLINE ONLINE racbde1
ora.racbde2.vip
1 ONLINE ONLINE racbde2
ora.scan1.vip
1 ONLINE ONLINE racbde1
ora.scan2.vip
1 ONLINE ONLINE racbde2
ora.scan3.vip
1 ONLINE ONLINE racbde2
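On a large cluster this listing gets long, and usually only the resources that are not ONLINE matter. The sketch below parses text in the layout shown above and filters it. It is an illustration only: `parse_status` and `not_online` are hypothetical names, the parser assumes the resource name sits on its own line followed by one line per server or instance, and it has only been checked against the sample in this post, not against every crsctl version.

```python
STATES = {"ONLINE", "OFFLINE", "INTERMEDIATE", "UNKNOWN"}
SECTION_TITLES = {"Local Resources", "Cluster Resources"}


def parse_status(text):
    """Parse `crsctl status resource -t` style text into a list of dicts."""
    rows = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, dashed rulers, the column header, section titles.
        if not line or line.startswith("-") or line.startswith("NAME "):
            continue
        if line in SECTION_TITLES:
            continue
        fields = line.split()
        # Cluster resources carry a leading instance number; drop it.
        if fields[0].isdigit():
            fields = fields[1:]
        if len(fields) >= 2 and fields[0] in STATES and fields[1] in STATES:
            rows.append({
                "name": current,
                "target": fields[0],
                "state": fields[1],
                "server": fields[2] if len(fields) > 2 else None,
                "details": " ".join(fields[3:]),
            })
        else:
            # Anything else is treated as the name of the next resource.
            current = line
    return rows


def not_online(rows):
    """Return only the rows whose STATE is not ONLINE."""
    return [row for row in rows if row["state"] != "ONLINE"]
```

Run against the output above, `not_online` would return the ora.gsd and ora.oc4j rows. Both also have TARGET set to OFFLINE, so a stricter check could compare `target` with `state` instead.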
Clusterware Resource Administration
Srvctl and crsctl are used to manage clusterware resources. The general rule is to use srvctl for whatever resource management you can. Crsctl should only be used for things that you cannot do with srvctl (like start the cluster). Both have a help feature to see the available syntax.
Srvctl syntax:
Usage: srvctl [-V]
Usage: srvctl add database -d -o [-m ] [-p ] [-r {PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY | SNAPSHOT_STANDBY}] [-s ] [-t ] [-n ] [-y {AUTOMATIC | MANUAL}] [-g ""] [-x ] [-a ""]
Usage: srvctl config database [-d [-a] ]
Usage: srvctl start database -d [-o ]
Usage: srvctl stop database -d [-o ] [-f]
Usage: srvctl status database -d [-f] [-v]
Usage: srvctl enable database -d [-n ]
Usage: srvctl disable database -d [-n ]
Usage: srvctl modify database -d [-n ] [-o ] [-u ] [-m ] [-p ] [-r {PRIMARY | PHYSICAL_STANDBY | LOGICAL_STANDBY | SNAPSHOT_STANDBY}] [-s ] [-t ] [-y {AUTOMATIC | MANUAL}] [-g "" [-x ]] [-a ""|-z]
Usage: srvctl remove database -d [-f] [-y]
Usage: srvctl getenv database -d [-t ""]
Usage: srvctl setenv database -d {-t =[,=,...] | -T =}
Usage: srvctl unsetenv database -d -t ""
Usage: srvctl add instance -d -i -n [-f]
Usage: srvctl start instance -d {-n [-i ] | -i } [-o ]
Usage: srvctl stop instance -d {-n | -i } [-o ] [-f]
Usage: srvctl status instance -d {-n | -i } [-f] [-v]
Usage: srvctl enable instance -d -i ""
Usage: srvctl disable instance -d -i ""
Usage: srvctl modify instance -d -i { -n | -z }
Usage: srvctl remove instance -d [-i ] [-f] [-y]
Usage: srvctl add service -d -s {-r "" [-a ""] [-P {BASIC | NONE | PRECONNECT}] | -g [-c {UNIFORM | SINGLETON}] } [-k ] [-l [PRIMARY][,PHYSICAL_STANDBY][,LOGICAL_STANDBY][,SNAPSHOT_STANDBY]] [-y {AUTOMATIC | MANUAL}] [-q {TRUE|FALSE}] [-x {TRUE|FALSE}] [-j {SHORT|LONG}] [-B {NONE|SERVICE_TIME|THROUGHPUT}] [-e {NONE|SESSION|SELECT}] [-m {NONE|BASIC}] [-z ] [-w ]
Usage: srvctl add service -d -s -u {-r "" | -a ""}
Usage: srvctl config service -d [-s ] [-a]
Usage: srvctl enable service -d -s "" [-i | -n ]
Usage: srvctl disable service -d -s "" [-i | -n ]
Usage: srvctl status service -d [-s ""] [-f] [-v]
Usage: srvctl modify service -d -s -i -t [-f]
Usage: srvctl modify service -d -s -i -r [-f]
Usage: srvctl modify service -d -s -n -i "" [-a ""] [-f]
Usage: srvctl modify service -d -s [-c {UNIFORM | SINGLETON}] [-P {BASIC|PRECONNECT|NONE}] [-l [PRIMARY][,PHYSICAL_STANDBY][,LOGICAL_STANDBY][,SNAPSHOT_STANDBY]] [-y {AUTOMATIC | MANUAL}][-q {true|false}] [-x {true|false}] [-j {SHORT|LONG}] [-B {NONE|SERVICE_TIME|THROUGHPUT}] [-e {NONE|SESSION|SELECT}] [-m {NONE|BASIC}] [-z ] [-w ]
Usage: srvctl relocate service -d -s {-i -t | -c -n } [-f]
Specify instances for an administrator-managed database, or nodes for a policy managed database
Usage: srvctl remove service -d -s [-i ] [-f]
Usage: srvctl start service -d [-s "" [-n | -i ] ] [-o ]
Usage: srvctl stop service -d [-s "" [-n | -i ] ] [-f]
Usage: srvctl add nodeapps { { -n -A //[if1[|if2...]] } | { -S //[if1[|if2...]] } } [-p ] [-m ] [-e ] [-l ] [-r ] [-t [:][,[:]...]] [-v]
Usage: srvctl config nodeapps [-a] [-g] [-s] [-e]
Usage: srvctl modify nodeapps {[-n -A /[/if1[|if2|...]]] | [-S /[/if1[|if2|...]]]} [-m ] [-p ] [-e ] [ -l ] [-r ] [-t [:][,[:]...]] [-v]
Usage: srvctl start nodeapps [-n ] [-v]
Usage: srvctl stop nodeapps [-n ] [-f] [-r] [-v]
Usage: srvctl status nodeapps
Usage: srvctl enable nodeapps [-v]
Usage: srvctl disable nodeapps [-v]
Usage: srvctl remove nodeapps [-f] [-y] [-v]
Usage: srvctl getenv nodeapps [-a] [-g] [-s] [-e] [-t ""]
Usage: srvctl setenv nodeapps {-t "=[,=,...]" | -T "="}
Usage: srvctl unsetenv nodeapps -t "" [-v]
Usage: srvctl add vip -n -k -A //[if1[|if2...]] [-v]
Usage: srvctl config vip { -n | -i }
Usage: srvctl disable vip -i [-v]
Usage: srvctl enable vip -i [-v]
Usage: srvctl remove vip -i "" [-f] [-y] [-v]
Usage: srvctl getenv vip -i [-t ""]
Usage: srvctl start vip { -n | -i } [-v]
Usage: srvctl stop vip { -n | -i } [-f] [-r] [-v]
Usage: srvctl status vip { -n | -i }
Usage: srvctl setenv vip -i {-t "=[,=,...]" | -T "="}
Usage: srvctl unsetenv vip -i -t "" [-v]
Usage: srvctl add asm [-l ]
Usage: srvctl start asm [-n ] [-o ]
Usage: srvctl stop asm [-n ] [-o ] [-f]
Usage: srvctl config asm [-a]
Usage: srvctl status asm [-n ] [-a]
Usage: srvctl enable asm [-n ]
Usage: srvctl disable asm [-n ]
Usage: srvctl modify asm [-l ]
Usage: srvctl remove asm [-f]
Usage: srvctl getenv asm [-t [, ...]]
Usage: srvctl setenv asm -t "= [,...]" | -T "="
Usage: srvctl unsetenv asm -t "[, ...]"
Usage: srvctl start diskgroup -g [-n ""]
Usage: srvctl stop diskgroup -g [-n ""] [-f]
Usage: srvctl status diskgroup -g [-n ""] [-a]
Usage: srvctl enable diskgroup -g [-n ""]
Usage: srvctl disable diskgroup -g [-n ""]
Usage: srvctl remove diskgroup -g [-f]
Usage: srvctl add listener [-l ] [-s] [-p "[TCP:][, ...][/IPC:][/NMP:][/TCPS:] [/SDP:]"] [-o ] [-k ]
Usage: srvctl config listener [-l ] [-a]
Usage: srvctl start listener [-l ] [-n ]
Usage: srvctl stop listener [-l ] [-n ] [-f]
Usage: srvctl status listener [-l ] [-n ]
Usage: srvctl enable listener [-l ] [-n ]
Usage: srvctl disable listener [-l ] [-n ]
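The usage lines above lost their argument placeholders when the page was copied, so here is a hedged sketch of how the three most common database commands are typically assembled. It assumes that `-d` takes the database unique name and `-o` takes the start or stop options (for example `immediate`), which matches the flags listed for `srvctl start|stop|status database` above. The function name `srvctl_database` is hypothetical, and it only builds an argument list without executing anything.

```python
def srvctl_database(action, db_unique_name, options=None, force=False):
    """Build an argv list for `srvctl <action> database`.

    Flags follow the usage lines above:
      start  database -d [-o]
      stop   database -d [-o] [-f]
      status database -d [-f] [-v]
    Assumption: -d takes the database unique name, -o the start/stop options.
    """
    if action not in ("start", "stop", "status"):
        raise ValueError("action must be start, stop or status")
    argv = ["srvctl", action, "database", "-d", db_unique_name]
    if options is not None:
        if action == "status":
            raise ValueError("-o is not listed for 'status database'")
        argv += ["-o", options]
    if force:
        if action == "start":
            raise ValueError("-f is not listed for 'start database'")
        argv.append("-f")
    return argv


if __name__ == "__main__":
    # "rac" is the database behind ora.rac.db in the status output above.
    print(" ".join(srvctl_database("stop", "rac", options="immediate")))
```

Returning a list rather than a string means the result can be handed to `subprocess.run` without shell quoting issues, should the command ever need to be executed.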
Reposted from: http://blog.itpub.net/11780477/viewspace-708496/