RAC waits events ad interconnect statistics
The RAC events are listed next in the report if you are running RAC (multiple instances). As stated earlier, you need to run STATSPACK or AWR Report for each instance that you have.For statspack, you run the statspck.snap procedure and the spreport.sql script. on each node you want to monitor to compare to instances.One of the best methods to see if a node is operating efficiently is to compare the report from that node to one from another node that accesses the same database. It’s very important to remember that single-instance tuning should be performed before attempting to tune the processes that communicate via the cluster interconnect. In other words,tune the system in single instance before you move it to RAC.
Some of the top waits events that you may encounter are listed briefly next.
The top global cache(gc) waits to look out for include :
Gc current block busy: happens when an instance requests a CURR data block(wants to do some DML) and the block to be transferred is in use.
Gc buffer busy: A wait event that occurs whenever a session has to waiwt for an ongoing operation on the resource to complete because the block is in use. The process has to wait for a block to become available because another process is obtaining a resource for this block.
Gc cr request: This happens when one instance is waiting for blocks from another instance’s cache(sent via the interconnect). This wait says that the current instance can’t find a consistent read(CR) version of a block in the local cache. If the block is not in the remote cache, then a db file sequential read wait will also follow this one. Tune the SQL that is causing large amounts of reads that get moved from node to node. Try to put users that are using the same blocks on the same instance so that blocks are not moved from instance to instance. Some non-Oracle application servers will move the same process from node to node looking for the fastest node(unaware that they are moving the same blocks from node to node). Pin these long processes to the same node. Potentially increase the size of the local cache if slow I/O combined with a small cache is the problem. Monitor v$cr_block_server to see if there is an issue like reading UNDO segments. Correlated to the waits the values for p1, p2, p3=file,block,lenum(look in v$lock_element for row where lock_element_addr has the same value as lenum). Happens when an instance requests a CR data block and the block to be transferred hasn’t arrived at the requesting instance. This is one I see the most, and it’s usually because the SQL is poorly tuned and many index blocks are being moved back and forth between instance.
Figure shows the AWR Report RAC section. You can see that there are two instances in this cluster. You can also see things like the number of blocks send and received as well as how many of the blocks are being accessed in the local cache(99.1 percent) versus the disk or another instance. As you would guess , it is faster to access block in the local cache, but accessing one of the remote caches on one of the other nodes is almost always faster(given a fast enouth interconnect and no saturation of the interconnect) than going to disk.
The following is another valuable query to derive session wait information. The instance_id lists the instance where the waiting session resides. The sid is te unique identifier for the waiting session(gv$session). The p1,p2 ,and p3 columns list event-specific information that may be useful for debugging. LAST_SQL lists
The last SQL executed by the waiting session.
Set numwidth 10
Col state format a7 tru
Col event format a25 tru
Col last_sql format a40 tru
Select sw.inst_id instance_id,
sw.sid sid,
sw.state state,
sw.event event,
Sw.seconds_in_wait seconds_wating,
sw.p1,
sw.p2,
sw.p3,
Sa.sql_text last_sql
From gv$session_wait sw, gv$session s, gv$sqlarea sa
Where sw.event not in
('rdbms ipc message', 'smon timer', 'pmon timer',
'SQL*Net message from client',
'lock manager wait for remote message', 'ges remote message',
'gcs remote message', 'gcs for action', 'client message', 'pipe get',
'PX Idle Wait', 'single-task message', 'listen endpoint status',
'slave wait', 'wakeup time manager')
And sw.SECONDS_IN_WAIT > 0
And (sw.INST_ID = s.inst_id and sw.sid = s.sid)
And (s.inst_id = sa.inst_id and s.sql_address = sa.address)
Order by SECONDS_IN_WAIT desc;
Seconds_in_wait: if wait_time=0, then seconds_in_wait is the seconds spend in the current wait condition. If wait_time>0, then seconds_in_wait is the seconds since the start of the last wait, and seconds_in_wait-wait_time/100 is the active seconds since the last wait ended.
RAC Statistics
| Begin | End |
Number of Instances: | 2 | 2 |
Global Cache Load Profile
| Per Second | Per Transaction |
Global Cache blocks received: | 73.42 | 10.16 |
Global Cache blocks served: | 81.75 | 11.31 |
GCS/GES messages received: | 110.26 | 15.26 |
GCS/GES messages sent: | 112.45 | 15.56 |
DBWR Fusion writes: | 0.07 | 0.01 |
Estd Interconnect traffic (KB) | 1,284.90 |
|
Global Cache Efficiency Percentages (Target local+remote 100%)
Buffer access - local cache %: | 99.31 |
Buffer access - remote cache %: | 0.60 |
Buffer access - disk %: | 0.08 |
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/104152/viewspace-168427/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/104152/viewspace-168427/