本周处理的第二个关于UNDO的案例
第一个:undo案列一
今天一来再次被告知另外一套数据库也出现undo表空间使用率过高的情况(这次学乖了,先问客户有没有程序运行报ORA-30036错误,O(∩_∩)O哈!)
OS:AIX
DB:10.2.0.4.0
现象:undo表空间使用率过高,未发现报ORA-30036的情况。
能想到Undo表空间出现使用率过高的情况,常见的原因有:
1) 有大事务的SQL使用了大量的undo空间
2) 分配出来的undo段中expired和unexpired状态的extent过多,虽然这部分extent是可以重用的,但是在查看undo使用率的时候常会把这两种状态下的extent占用的空间计算在undo表空间的使用空间内,造成undo表空间虚高
3) 设置的undo_retention时间过长,造成undo段中的extent长期保持在unexpired状态,无法正常回收
Oracle10gundo_retention自动优化功能的影响,该功能会在UNDO表空间非自动增长的情况下,动态调整undo_retention的值忽略系统自身设置的undo_retention的值,以最大限制的利用当前UNDO表空间的可用空间,尽可能的保留最多的UNDO数据,以减少ORA-1555错误的发生,但是也会造成undo_retention值过大,引起大量extent长时间置于unexpired状态无法回收,参考原因2)和3)也会引起undo表空间虚高
查看undo段实际占用情况的信息:
回滚段信息:
SYSDATE Tablespace Name TS Size(MB) UNDO Status Used Extents Used Size(MB) Used Rate(%)
-------- ----------------- ----------- ----------- ------------ ------------- ------------
09:09:49 UNDOTBS1 16336 ACTIVE 2 16 0.1
09:09:49 UNDOTBS1 16336 EXPIRED 1075 2703 16.55
09:09:49 UNDOTBS1 16336 UNEXPIRED 1675 11889.44 72.78
09:09:49 UNDOTBS2 16336 ACTIVE 5 40 0.24
09:09:49 UNDOTBS2 16336 EXPIRED 1493 6198.5 37.94
09:09:49 UNDOTBS2 16336 UNEXPIRED 1820 9643.75 59.03
表空间信息:
NOW_DATE TABLESPACE_NAME FREE_SPACE USED_SPACE TOTAL_SPACE USED_PERCENT
---------- ----------------- ---------- ---------- ----------- ------------
2013-01-09 UNDOTBS2 455.8 15880.2 16336 97.21
2013-01-09 UNDOTBS1 1731.2 14604.8 16336 89.4
Undo_autotune的设置情况
NAME VALUE
-------------------------- --------------------------
_undo_autotune TRUE
动态undo_retention调整情况:
T.BEGIN_TIME MAXQUERYLEN Active(MB) Unexpired(MB) Expire(MB) TUNED_UNDORETENTION
--------------------- ----------- ---------- ------------- ---------- -------------------
2013-01-09 14:19:49 71 22.125 5874.625 1872.875 43774
2013-01-09 14:09:49 71 22.125 5874.625 1872.875 43788
2013-01-09 13:59:49 71 19.125 5908.625 1793.875 44763
2013-01-09 13:49:49 21 22.125 5888.625 1866.875 46116
2013-01-09 13:39:49 112 14.125 5946.625 1929.875 46061
2013-01-09 13:29:49 5 10.125 5954.125 1892.375 47111
2013-01-09 13:19:49 454 12.125 5928.125 1932.375 47401
2013-01-09 13:09:49 25 26.125 5889.125 1986.875 47590
2013-01-09 12:59:49 598 18.125 5899.625 2023.375 47947
2013-01-09 12:49:49 117 22.125 5922.125 1980.875 48432
2013-01-09 12:39:49 106 14.125 5937.625 2005.875 48255
2013-01-09 12:29:49 27 18.125 5895.125 2032.375 48641
2013-01-09 12:19:49 137 14.125 5920.125 1988.875 49042
2013-01-09 12:09:49 65 36.625 5853.625 2029.875 49209
2013-01-09 11:59:49 49 35.125 5859.125 1974.375 49204
2013-01-09 11:49:49 148 75.625 5815.125 1933.375 50051
2013-01-09 11:39:49 22 47.625 5896.625 1872.875 50325
2013-01-09 11:29:49 93 25.125 5901.53125 1871.96875 50735
2013-01-09 11:19:49 63 63.625 5889.53125 1816.96875 52384
2013-01-09 11:09:49 211 91.625 5833.375 1853.625 52993
T.BEGIN_TIME MAXQUERYLEN Active(MB) Unexpired(MB) Expire(MB) TUNED_UNDORETENTION
--------------------- ----------- ---------- ------------- ---------- -------------------
2013-01-09 10:59:49 1731 66.125 5886.875 1773.125 52416
2013-01-09 10:49:49 1130 58.625 5787.375 1819.125 53834
2013-01-09 10:39:49 530 51.625 5924.875 1705.125 53484
2013-01-09 10:29:49 566 33.125 5930.375 1674.625 54255
2013-01-09 10:19:49 34 19.625 5923.875 1658.125 55553
2013-01-09 10:09:49 152 37.625 5922.71875 1622.78125 55421
2013-01-09 09:59:49 113 44.125 5905.375 1548.125 56355
2013-01-09 09:49:49 138 34.125 5847.375 1595.125 57472
2013-01-09 09:39:49 646 49.15625 5837.25 1570.21875 56650
2013-01-09 09:29:49 2 30.625 5902.84375 1447.15625 56971
2013-01-09 09:19:49 44 30.625 5674.6875 1629.8125 56064
2013-01-09 09:09:49 134 23.125 5721.78125 1606.21875 56209
2013-01-09 08:59:49 82 23.625 5911 1404.5 56282
2013-01-09 08:49:49 121 14.125 5882 1371 56859
2013-01-09 08:39:49 47 10.125 5891 1362 56743
2013-01-09 08:29:49 54 10.125 5762 1447 56219
2013-01-09 08:19:49 16 10.125 5835.625 1946.71875 55923
2013-01-09 08:09:49 144 10.125 5874.125 1900.71875 55557
2013-01-09 07:59:49 0 21.125 5538.46875 2209.375 55116
2013-01-09 07:49:49 0 10.125 5575.96875 2208.875 53615
2013-01-09 07:39:49 0 23.625 5584.5 2179.34375 53193
从上面UNDO表空间使用率的情况来看,在undo表空间使用率过高的时候,undo表空间中分配的extent只有少量处于active状态,其余大部分extent处于unexpired和expired状态。由于启用了undo_retention自动优化功能(隐藏参数_undo_autotune为true),查看自动调整后undo_retention的值长期处于在43774秒 – 56971秒之间,使60%-- 70%的undo extent的状态长期保持为unexpired状态,不能正常的回收。这次确实是由于undo_retention自动优化功能引起的问题了,O(∩_∩)O哈!
处理方法:
ALTER SYSTEM SET "_undo_autotune"=FALSE;
可以的参考资料:
Automatic Tuning of Undo_retention Causes Space Problems [ID 420525.1]
In this Document
Symptoms
Cause
Solution
References
Applies to:
Oracle Server - Enterprise Edition - Version 10.2.0.1 to 10.2.0.3 [Release 10.2]
Information in this document applies to any platform.
Oracle Server Enterprise Edition - Version: 10.2.0.1 to 10.2.0.3 -- fixed by patchset 10.2.0.4 and no issues on this at 11g
*** Checked for currency: 13-SEP-2012 ***
Symptoms
You have verified that Document 413732.1 is not applicable and the problem is not a misunderstanding in the way EXPIRED/UNEXPIRED are used and reused over time.
Look for:
Whether the undo is automatically managed by the database by checking the following instance parameter:
UNDO_MANAGEMENT=AUTO
Whether the undo tablespace is fixed in size:
SQL> SELECT autoextensible
FROM dba_data_files
WHERE tablespace_name='<UNDO_TABLESPACE_NAME>'
This returns "NO" for all the undo tablespace datafiles.
The undo tablespace is already sized such that it always has more than enough space to store all the undo generated within the undo_retention time, and the in-use undo space never exceeds the undo tablespace warning alert threshold (see below for the query to show the thresholds).
The tablespace threshold alerts recommend that the DBA add more space to the undo tablespace:
SQL> SELECT creation_time, metric_value, message_type, reason, suggested_action
FROM dba_outstanding_alerts
WHERE object_name='<UNDO_TABLESPACE_NAME>';
This returns a suggested action of: "Add space to the tablespace".
Or,
This recommendation has been reported in the past but the condition has now cleared:
SQL> SELECT creation_time, metric_value, message_type, reason, suggested_action, resolution
FROM dba_alert_history
WHERE object_name='<UNDO_TABLESPACE_NAME>';
The undo tablespace in-use space exceeded the warning alert threshold at some point in time. To see the warning alert percentage threshold, issue:
SQL> SELECT object_type, object_name, warning_value, critical_value
FROM dba_thresholds
WHERE object_type='TABLESPACE';
To see the (current) undo tablespace percent of space in use:
SQL> SELECT
((SELECT (NVL(SUM(bytes),0))
FROM dba_undo_extents
WHERE tablespace_name='<UNDO_TABLESPACE_NAME>'
AND status IN ('ACTIVE','UNEXPIRED')) * 100)
/
(SELECT SUM(bytes)
FROM dba_data_files
WHERE tablespace_name='<UNDO_TABLESPACE_NAME>')
"PCT_INUSE"
FROM dual;
Cause
The cause of this problem has been identified in:
Bug:5387030 - AUTOMATIC TUNING OF UNDO_RETENTION CAUSING SPACE PROBLEMS
It is caused by a wrong calculation of the tuned undo retention value.
Bug:5387030 is fixed in RDBMS 11.1.
Solution
To implement a solution for Bug:5387030, please execute any of the below alternative solutions:
Upgrade to 11.1 in which Bug:5387030 is fixed
OR
Apply patchset release 10.2.0.4 or higher in which Bug:5387030 is fixed.
OR
Download and apply interim Patch:5387030, if available for your platform and RDBMS release. To check for conflicting patches, please use the MOS Patch Planner Tool. If no patch is available, file a Service Request through My Oracle Support for your specific Oracle version and platform.
OR
Use any of the following workarounds:
Set the AUTOEXTEND and MAXSIZE attributes of each datafile of the undo tablespace in such a way that they are autoextensible and the MAXSIZE is equal to the current size (so the undo tablespace now has the AUTOEXTEND attribute but does not autoextend):
SQL> ALTER DATABASE DATAFILE '<datafile_flename>' AUTOEXTEND ON MAXSIZE <current_size>
With this setting, V$UNDOSTAT.TUNED_UNDORETENTION is not calculated based on a percentage of the undo tablespace size. Instead it is set to the maximum of (MAXQUERYLEN secs + 300) and UNDO_RETENTION.
2)Set the following instance parameter:
_smu_debug_mode=33554432
With this setting, V$UNDOSTAT.TUNED_UNDORETENTION is not calculated based on a percentage of the fixed size undo tablespace. Instead it is set to the maximum of (MAXQUERYLEN secs + 300) and UNDO_RETENTION.
3)Set the following instance parameter:
_undo_autotune = false
With this setting, V$UNDOSTAT (and therefore V$UNDOSTAT.TUNED_UNDORETENTION) is not maintained and the undo retention used is based on the UNDO_RETENTION instance parameter.
NOTE: This means you loose all advantages in having automatic undo management and is not an ideal long term fix.
NOTE: Even with the patch fix installed, the autotuned retention can still grow under certain circumstances. The fix attempts to throttle back how aggressive that autotuning will be. Options 2 and 3 may be needed to get around this aggressive growth in some environments.
References
BUG:5387030 - AUTOMATIC TUNING OF UNDO_RETENTION CAUSING SPACE PROBLEMS
NOTE:413732.1 - Full UNDO Tablespace In 10gR2
@NOTE:563556.1 - INIT.ORA Parameter: "_smu_debug_mode" [Hidden]