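Every time the database starts, it takes several minutes. The server hardware is fairly high-end, so this was baffling.
Environment: 128c + 256g RAC + 500g disk
Oracle Linux 9.4 + Oracle RAC 19.23
The tests went as follows:
srvctl start database: waits several minutes.
The SQL*Plus command startup: also several minutes.
Then I ran startup nomount directly. It also waited several minutes, while the subsequent mount and open each finished within seconds.
So this looks like a configuration problem.
While one node was in startup nomount, a large number of wait events showed up on the other node: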
select sid,event,blocking_session ,sql_id from v$session where wait_class<>'Idle' order by 2,1
SID EVENT BLOCKING_SESSION SQL_ID
---------- ------------------------------ ---------------- -------------
1841 DFS lock handle
4327 DFS lock handle
4972 DFS lock handle
484 DLM cross inst call completion
603 DLM cross inst call completion
964 DLM cross inst call completion
1655 DLM cross inst call completion fhf8upax5cxsz
2133 DLM cross inst call completion
2205 DLM cross inst call completion
2376 DLM cross inst call completion
7027 DLM cross inst call completion
2957 PGA memory operation 491v8qqgy6jjv
3206 PGA memory operation 491v8qqgy6jjv
3664 PGA memory operation 59hmswh5r1aw1
6381 PGA memory operation 59hmswh5r1aw1
7157 PGA memory operation 491v8qqgy6jjv
3735 SQL*Net message from dblink 1hrsg820r04ff
489 SQL*Net message to client 7svn189480mw7
2129 SQL*Net message to client 44xb4qsf1aj66
5978 SQL*Net message to client 7svn189480mw7
6444 SQL*Net message to client 6qgyvzscbqvcp
1306 buffer busy waits gy5hxnhv18ds9
3857 buffer busy waits 9zg9qd9bm4spu
5432 buffer busy waits 9zg9qd9bm4spu
6456 buffer busy waits f1juf2qu4r7wb
2303 enq: CT - reading
2311 enq: DW - contention
5962 enq: DW - contention
6847 enq: JS - job run lock - synchronize
3502 enq: TM - contention bywmv13mpmg4w
5863 enq: TM - contention fh2hvv131zqrf
6612 enq: TM - contention gnjcp50sgps6c
7215 enq: TM - contention ar90y9vnc24qb
1123 enq: TT - contention
5972 enq: TT - contention c4s5nvfbanqzg
6376 enq: TT - contention
129 gcs enter server mode
131 gcs enter server mode gy5hxnhv18ds9
597 gcs enter server mode cg16yna4kpx67
672 gcs enter server mode 1ksn4ucn9ugk9
774 gcs enter server mode d7xawz2pjav0p
1131 gcs enter server mode fq88kg3bv7aun
1241 gcs enter server mode 79ah5gpgjjjm3
1314 gcs enter server mode
1483 gcs enter server mode 8ph3fap8bsnf9
1496 gcs enter server mode 6q6sqbv4m9x6n
1679 gcs enter server mode 9qj06jgzf442s
1851 gcs enter server mode f58tzbmhw0hd2
1913 gcs enter server mode 60x5b9qkzf0n2
2093 gcs enter server mode
2199 gcs enter server mode 8bvckf4a3k05q
2677 gcs enter server mode 01vugk4pw16v6
2914 gcs enter server mode
2975 gcs enter server mode f3yfg50ga0r8n
3367 gcs enter server mode 35qvy98dgv86k
3374 gcs enter server mode 93qw8913ra2xn
3559 gcs enter server mode 94ttg537kvcwd
3663 gcs enter server mode
3897 gcs enter server mode
4075 gcs enter server mode 01vugk4pw16v6
4333 gcs enter server mode 3catmz1kpgrff
4679 gcs enter server mode 3aydtv81pd402
5439 gcs enter server mode
5555 gcs enter server mode 3muqaf0csh9y7
5667 gcs enter server mode
5749 gcs enter server mode f4x9xmqj0zbdq
5967 gcs enter server mode b8sw22p0sguw9
6263 gcs enter server mode aa3tgy246y88v
6334 gcs enter server mode 9zg9qd9bm4spu
6386 gcs enter server mode 7ckcskdrfyu8d
6446 gcs enter server mode
6502 gcs enter server mode 98qdbzc1xp1d7
6568 gcs enter server mode 0u82rxrxxzw42
6864 gcs enter server mode 3catmz1kpgrff
6914 gcs enter server mode
7145 gcs enter server mode
7343 gcs enter server mode bzmpdrqr4hzg6
836 library cache lock 8v1jwdhcr3cdq
897 library cache lock djdvv4jpc56yv
1019 library cache lock 50955s7ycfg46
1723 library cache lock
1903 library cache lock 7xzc8hmbq4x73
2501 library cache lock 0yp01pgmdx9ps
2894 library cache lock an00537t39mfn
3024 library cache lock
3130 library cache lock 8kzhy6p0gtkyz
3734 library cache lock b8swjyzkt8hvj
3791 library cache lock 1fkbfyt0j3bur
3912 library cache lock
4503 library cache lock gu1082z9tx5ud
5146 library cache lock
6731 library cache lock dt9tm5pzfrh34
6807 library cache lock
6907 library cache lock dxgp7asdv2k9u
6926 library cache lock g71nwhssyywys
7025 library cache lock
3973 library cache pin 9kdd035uukj7y
5845 resmgr:internal state change 37d8m7py91vfw
5495 row cache lock 76jnftqnh82n5
99 rows selected.
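With 99 rows it is hard to see at a glance which event dominates. The following is a minimal Python sketch, not part of the original session, that tallies sessions per event from pasted output like the above. It assumes each data row is a SID, an event name, and an optional 13-character SQL_ID, and that BLOCKING_SESSION is blank (as it is in every row here). The function name `count_events` and the shortened `sample` are illustrative only.

```python
import re
from collections import Counter

# Data row: SID, event name, then an optional 13-character SQL_ID.
# BLOCKING_SESSION is assumed blank, as in the output shown in the post.
ROW = re.compile(r"^\s*(\d+)\s+(.+?)(?:\s+([0-9a-z]{13}))?\s*$")


def count_events(output: str) -> Counter:
    """Count sessions per wait event in pasted SQL*Plus output."""
    counts = Counter()
    for line in output.splitlines():
        # Skip the SQL*Plus footer, which also starts with a number.
        if line.rstrip().endswith("rows selected."):
            continue
        m = ROW.match(line)
        if m:  # header and separator lines do not start with a SID
            counts[m.group(2)] += 1
    return counts


# A shortened sample of the rows above (hypothetical subset for illustration).
sample = """\
SID EVENT BLOCKING_SESSION SQL_ID
---------- ------------------------------ ---------------- -------------
1841 DFS lock handle
1655 DLM cross inst call completion fhf8upax5cxsz
129 gcs enter server mode
131 gcs enter server mode gy5hxnhv18ds9
836 library cache lock 8v1jwdhcr3cdq
99 rows selected.
"""

for event, n in count_events(sample).most_common():
    print(f"{n:3d}  {event}")
```

Run against the full listing, a tally like this should make it obvious that gcs enter server mode and library cache lock account for most of the waiting sessions.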
The alert log at that moment:
Attached to domain 0 (addr: 0x706fe45c8)
Reconfiguration started (old inc 0, new inc 64)
Dynamic remastering is disabled
List of instances (total 2) :
1 2
My inst 2 (I'm a new instance)
Global Resource Directory frozen
Enabling Dynamic Remastering: NONE->NORM switch
Communication channels reestablished
* domain 0 valid = 1 (flags x8820, pdb flags x8000) according to instance 1
2025-11-01 01:43:41.933000 +08:00
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 8: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 6: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 13: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 17: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 7: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 9: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 3: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 14: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 16: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 5: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 11: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 4: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 12: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 15: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 10: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 20: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 18: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 19: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
Set master node info
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
2025-11-01 01:46:03.564000 +08:00
Reconfiguration complete (total time 141.7 secs)
Decreasing priority of 21 RS
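The two figures that matter in this excerpt are the number of LMS lines and the total reconfiguration time. A small Python sketch, assuming the log format shown above (the function name `summarize_reconfig` and the trimmed `excerpt` are hypothetical), can pull both out of an alert-log fragment:

```python
import re

# Patterns match the alert-log lines exactly as they appear in the excerpt above.
LMS_LINE = re.compile(r"^\s*LMS (\d+):")
DONE_LINE = re.compile(r"Reconfiguration complete \(total time ([\d.]+) secs\)")


def summarize_reconfig(log_text: str):
    """Return (number of distinct LMS ids, total reconfiguration seconds or None)."""
    lms_ids = set()
    total_secs = None
    for line in log_text.splitlines():
        m = LMS_LINE.match(line)
        if m:
            lms_ids.add(int(m.group(1)))
            continue
        m = DONE_LINE.search(line)
        if m:
            total_secs = float(m.group(1))
    return len(lms_ids), total_secs


# Trimmed fragment for illustration; the real excerpt lists LMS 0 through LMS 20.
excerpt = """\
Reconfiguration started (old inc 0, new inc 64)
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
Reconfiguration complete (total time 141.7 secs)
"""

print(summarize_reconfig(excerpt))
```

Applied to the full excerpt, the LMS count would correspond to the 21 processes mentioned in the "Decreasing priority of 21 RS" line.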
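It appears the time goes into bringing up the LMS processes, of which there are 21.
So which parameter controls the number of LMS processes?
After some checking: it is controlled by the gcs_server_processes parameter.
The default for gcs_server_processes depends on the CPU count, which in this environment is 128: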
SQL> show parameter cpu_count
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cpu_count integer 128
SQL> show parameter gcs
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
gcs_server_processes integer 21
A rough calculation: 128/6 ≈ 21
So this value appears to be computed by the system itself.
But according to the official 19c documentation:

This appears to match rule 5:
cpu=128, sga=128g
SQL> show parameter sga_target
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
sga_target big integer 120G
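So the value was automatically calculated as 21.
However, NUMA is enabled here, so by rights the value should be twice the number of NUMA groups, capped at cpu/4, yet it was not calculated that way.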

Based on this, I adjusted the value to 2 + cpu/32 = 6 as a test.
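The two numbers used so far can be reproduced with integer arithmetic. This is only a sketch of the arithmetic in this post: cpu/6 is a back-of-the-envelope fit to the observed default, not Oracle's documented formula, and the variable names are illustrative.

```python
cpu_count = 128  # from "show parameter cpu_count" in the post

# Rough fit to the observed default: 128 / 6 with the fraction dropped.
# This is an estimate from the post, not a documented Oracle rule.
observed_estimate = cpu_count // 6

# The value chosen for the test: 2 + cpu/32.
trial_value = 2 + cpu_count // 32

print(observed_estimate, trial_value)
```

The first result matches the 21 shown by show parameter gcs, and the second is the 6 used for the next startup.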
Checking the alert log after the change:
2025-11-01 02:02:07.245000 +08:00
Increasing priority of 6 RS
* Setting GES domain 0
Attached to domain 0 (addr: 0x6f8444118)
Reconfiguration started (old inc 0, new inc 68)
Dynamic remastering is disabled
List of instances (total 2) :
1 2
My inst 2 (I'm a new instance)
Global Resource Directory frozen
Enabling Dynamic Remastering: NONE->NORM switch
Communication channels reestablished
* domain 0 valid = 1 (flags x8820, pdb flags x8000) according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 3: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 4: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
LMS 5: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
Set master node info
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
2025-11-01 02:02:18.760000 +08:00
Reconfiguration complete (total time 11.5 secs)
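There were far fewer wait events, and the reconfiguration during startup took only about 11.5 seconds, compared with about 141.7 seconds at the default value. That is much faster.
Operating system load also dropped by about half.
I will keep observing for now.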
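As a sanity check, the reported reconfiguration times can be compared with the timestamps in the two alert-log excerpts. The sketch below is an illustration only: the `elapsed` helper is hypothetical, the time zone suffix is dropped, and the timestamps only roughly bracket each reconfiguration, so small differences from the reported totals are expected.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed(start: str, end: str) -> float:
    """Seconds between two alert-log timestamps (time zone suffix removed)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Timestamps copied from the two alert-log excerpts in the post.
before = elapsed("2025-11-01 01:43:41.933000", "2025-11-01 01:46:03.564000")
after = elapsed("2025-11-01 02:02:07.245000", "2025-11-01 02:02:18.760000")

print(f"21 LMS: {before:.1f} s between timestamps (log reports 141.7 s)")
print(f" 6 LMS: {after:.1f} s between timestamps (log reports 11.5 s)")
print(f"ratio of reported totals: {141.7 / 11.5:.1f}x")
```

Both intervals agree closely with the totals printed by the instance, which supports reading the improvement as roughly an order of magnitude rather than a measurement artifact.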
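Summary:
The slow database startup was caused by gcs_server_processes being set too high (the automatic default of 21 here). After lowering it to 6, the problem is resolved for now.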