SMON清理IND$字典基-ORA-08104
背景
平常在运维工作当中,创建索引的时候,可能会遇到报错:ORA-08104。
或者表上索引状态valid,统计信息都正常,但是就是不走索引。
模拟故障
- 创建test表,插入一条数据,但是不提交。
SQL> conn scott/tiger
Connected.
SQL> create table test as select * from dba_objects;
Table created.
SQL> select * from v$mystat where rownum=1;
SID STATISTIC# VALUE
---------- ---------- ----------
22 0 0
SQL> insert into test(object_id) values (0);
1 row created.
SQL>
- 查看锁的一些视图,熟悉视图
session 2:
SQL> select XIDUSN,XIDSLOT,XIDSQN,STATUS from v$transaction;
XIDUSN XIDSLOT XIDSQN STATUS
---------- ---------- ---------- ----------------
4 27 673 ACTIVE
SQL> select * from v$lock;
ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK
---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ----------
000000007D15F5A8 000000007D15F600 7 KT 12876 0 4 0 161 0
000000007D15F3E8 000000007D15F440 7 KD 0 0 6 0 161 0
000000007D15F4C8 000000007D15F520 10 MR 3 0 4 0 161 0
000000007D15F228 000000007D15F280 10 MR 4 0 4 0 161 0
000000007D15F928 000000007D15F980 10 MR 5 0 4 0 161 0
000000007D15F768 000000007D15F7C0 10 MR 1 0 4 0 161 0
000000007D15F848 000000007D15F8A0 10 MR 2 0 4 0 161 0
000000007D15FA08 000000007D15FA60 10 MR 201 0 4 0 161 0
000000007D15FE80 000000007D15FED8 10 PW 1 0 3 0 159 0
000000007D15F068 000000007D15F0C0 11 RT 1 0 6 0 161 0
000000007D15EEA8 000000007D15EF00 12 RS 25 1 2 0 161 0
000000007D15ECE8 000000007D15ED40 12 XR 4 0 1 0 165 0
000000007D15EDC8 000000007D15EE20 12 RD 1 0 1 0 165 0
000000007D15F148 000000007D15F1A0 12 CF 0 0 2 0 165 0
000000007D15F688 000000007D15F6E0 13 TS 3 1 3 0 161 0
000000007D15FB00 000000007D15FB58 15 AE 100 0 4 0 161 0
000000007D15EF88 000000007D15EFE0 22 AE 100 0 4 0 127 0
00007FFFF7E53898 00007FFFF7E538F8 22 TM 80628 0 3 0 34 0
000000007B224450 000000007B2244C8 22 TX 393249 669 6 0 33 0
000000007D15FCC0 000000007D15FD18 26 AE 100 0 4 0 89 0
000000007D160200 000000007D160258 33 AE 100 0 4 0 20 0
21 rows selected.
--关注sid=22的锁信息。
--可以使用此语句查看锁的详细信息,看description:
SQL> col TYPE for a10
SQL> col NAME for a15
SQL> col ID1_TAG for a20
SQL> col ID2_TAG for a20
SQL> col DESCRIPTION for a50
SQL> select * from v$lock_type where type='TX';
TYPE NAME ID1_TAG ID2_TAG IS_ DESCRIPTION
---------- --------------- -------------------- -------------------- --- --------------------------------------------------
TX Transaction usn<<16 | slot sequence YES Lock held by a transaction to allow other transact
ions to wait for it
SQL> select * from v$lock_type where type='AE';
TYPE NAME ID1_TAG ID2_TAG IS_ DESCRIPTION
---------- --------------- -------------------- -------------------- --- --------------------------------------------------
AE Edition Lock edition obj# 0 NO Prevent Dropping an edition in use
SQL> select * from v$lock_type where type='TM';
TYPE NAME ID1_TAG ID2_TAG IS_ DESCRIPTION
---------- --------------- -------------------- -------------------- --- --------------------------------------------------
TM DML object # table/partition YES Synchronizes accesses to an object
SQL> select * from v$locked_object;
XIDUSN XIDSLOT XIDSQN OBJECT_ID SESSION_ID ORACLE_USERNAME OS_USER_NAME PROCESS LOCKED_MODE
---------- ---------- ---------- ---------- ---------- ------------------------------ ------------------------------ ------------------------ -----------
6 33 669 80628 22 SCOTT oracle 1519 3
SQL> select object_type,object_name from dba_objects where object_id=80628;
OBJECT_TYPE OBJECT_NAM
------------------- ----------
TABLE TEST
- 创建索引,报错
session 3:
SQL> conn scott/tiger;
Connected.
SQL> create index idx_id on test(object_id);
create index idx_id on test(object_id)
*
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
##一般,在出现ora-00054报错时,主要原因就可能是该表存在尚未commit或rollback的DML命令。
SQL> select * from v$mystat where rownum=1;
SID STATISTIC# VALUE
---------- ---------- ----------
26 0 0
SQL> create index idx_id on test(object_id) online;
^C
^C
^C
hang了
##然后在创建索引语句后加上了online关键词,而在线创建索引没有返回的原因也应该是被别的会话进程阻塞了。
- 查询阻塞信息
SQL> @block
SQL_ID
USERNAME sid:serial:os session STATUS SQL_CHILD_NUMBER ID1:ID2 LMODE:QUEST TYPE BLOCK CTIME EVENT
---------- ----------------------------------- ---------- -------------------- -------------------- -------------------- ------------------------- ---------- ---------- --------------------
SCOTT Holder: 1:22:21:1521 INACTIVE 15dsc87s1q0m9: 393249:669 6||Exclusive:NULL TX|Transaction enqueue 1 237 SQL*Net message from
SCOTT Waiter: 1:26:15:1566 ACTIVE 60nx7jga3cm6w:0 393249:669 NULL:4||Share TX|Transaction enqueue 0 23 enq: TX - row lock c
查看 holder 的 sql_id 文本:
SQL> select sql_text from v$sql where sql_id='15dsc87s1q0m9';
SQL_TEXT
----------------------------------------------------------------------------------------------------
insert into test(object_id) values (0)
而 waiter 的确实是创建索引的进程:
SQL> select sql_text from v$sql where sql_id='60nx7jga3cm6w';
SQL_TEXT
----------------------------------------------------------------------------------------------------
create index idx_id on test(object_id) online
create index idx_id on test(object_id) online
现在active会话sid=26被sid=22的阻塞,在等待TX锁。而sid=22的会话持有6级锁,排他锁。
- 杀会话
由于创建索引hang,长时间没有返回,所以一般人为了不影响业务,会杀掉创建会话,如下:
在数据库如果要彻底杀掉一个会话,尤其是大事务会话,最好是使用ALTER SYSTEM DISCONNECT SESSION IMMEDIATE
session 2:
SQL> ALTER SYSTEM DISCONNECT SESSION '26,15' IMMEDIATE;
System altered.
SQL> @block
no rows selected
阻塞消息
session 3的会话前台返回:
SQL> create index idx_id on test(object_id) online;
^C
^C
^C
create index idx_id on test(object_id) online
*
ERROR at line 1:
ORA-12152: TNS:unable to send break message
SQL> SQL>
- 重新创建索引.业务闲了,现在准备再次创建
SQL> conn scott/tiger
Connected.
SQL> create index idx_id on test(object_id) online;
create index idx_id on test(object_id) online
*
ERROR at line 1:
ORA-00955: name is already used by an existing object
SQL> @index
Enter value for indexname: idx_id
tablespace PCT Dinsinct COLUMN
OWNER:INDEX_NAME name STATUS INDEX_TYPE UNIQUENES FREE LOG LEVEL Keys LEAF_BLOCKS DEGREE NUM_ROWS PARTI LOCALI POST:NAME
----------------------------------- -------------------- ---------- ---------- --------- ----- --- ------ ----------- ----------- ------- -------------- ----- ------ ---------------
SCOTT:IDX_ID USERS VALID NORMAL NONUNIQUE 10 YES 1 NO 1:OBJECT_ID
##怪不得报错已存在,通过查询,此索引确实存在,而且状态还是valid。即然这样,查看此索引是否可用。
收集统计信息:
SQL> begin
2 dbms_stats.gather_table_stats(ownname=>upper('SCOTT'),
3 tabname=>upper('TEST'),
4 estimate_percent=>30,
5 method_opt=>'for all columns size repeat',
6 no_invalidate=>false,
7 degree=>8,
8 cascade=>true);
9 end;
10 /
begin
*
ERROR at line 1:
ORA-20000: this index object "SCOTT"."IDX_ID" is being online built or rebuilt
ORA-06512: at "SYS.DBMS_STATS", line 24281
ORA-06512: at "SYS.DBMS_STATS", line 24332
ORA-06512: at line 2
收集统计信息的时候,收集索引的都会报错,说明索引有问题。不收集索引统计信息:
SQL> begin
2 dbms_stats.gather_table_stats(ownname=>upper('SCOTT'),
3 tabname=>upper('TEST'),
4 estimate_percent=>100,
5 method_opt=>'for all columns size repeat',
6 no_invalidate=>false,
7 degree=>8,cascade=>false);
8 end;
9 /
PL/SQL procedure successfully completed.
查看执行计划
SQL> alter session set statistics_level=all;
Session altered.
SQL> select * from test where object_id=2;
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID ba7nqkg6k6waq, child number 0
-------------------------------------
select * from test where object_id=2
Plan hash value: 1357081020
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.02 | 1170 | 1162 |
|* 1 | TABLE ACCESS FULL| TEST | 1 | 1 | 1 |00:00:00.02 | 1170 | 1162 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"=2)
18 rows selected.
SQL> select count(*) from test;
COUNT(*)
----------
80072
#按道理,从80072中返回1条数据,而且object-id上面有索引,选择度也很高,应该走索引的,但是此处走的是全表扫面,侧面印证索引有问题。下面强制加hint。
SQL> select /* +INDEX(t idx_id)*/ * from test t where object_id=2;
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 55cxmu0bq4gs2, child number 0
-------------------------------------
select /* +INDEX(t idx_id)*/ * from test t where object_id=2
Plan hash value: 1357081020
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 1170 | 1162 |
|* 1 | TABLE ACCESS FULL| TEST | 1 | 1 | 1 |00:00:00.01 | 1170 | 1162 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"=2)
18 rows selected.
#还是走的全表,也就是说虽然这个索引的状态是valid,但是索引有问题,用不上。
以上说明,该索引有问题,索引现在要删除这个索引,重新创建,或者rebuild。
SQL> drop index idx_id;
drop index idx_id
*
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
SQL> drop index idx_id online;
drop index idx_id online
*
ERROR at line 1:
ORA-00933: SQL command not properly ended
SQL> alter index idx_id rebuild online;
alter index idx_id rebuild online
*
ERROR at line 1:
ORA-08104: this index object 80629 is being online built or rebuilt
重新创建报错索引已经存在。
直接drop是不可行的,并且该数据库版本为11.2.0.4,drop index并不支持online操作,故当时尝试了下是否可以在线重建索引。现在所有的online操作,都会报错ORA-8104。
问题现象
create index online 报错:ORA-00955。 rebuild index online操作报错ORA-08104,并且此时的索引状态位valid,有效的,但是,这个索引是有问题的,不能用。
问题原因
[oracle@oracle11g ~]$ oerr ora 8104
08104, 00000, “this index object %s is being online built or rebuilt”
// *Cause: the index is being created or rebuild or waited for recovering
// from the online (re)build
// *Action: wait the online index build or recovery to complete
服务器进程在执行create index online时意外终止,导致生成的索引并不可用,而由于数据表上的insert进程一直未提交,导致索引进程的等待。
当我们在线创建或重建索引时(create or rebuild index online),服务进程会到INDKaTeX parse error: Expected 'EOF', got '#' at position 220: …nn(nnnn为该索引的obj#̲)。数据字典的维护工作就包含对…基表中相应索引记录的FLAGS标志位的恢复,但是如果服务进程在语句执行过程中意外终止的话,那么短时间内FLAGS标志位字段就无法得到恢复,这将导致对该索引的后续操作因ORA-8104错误而无法继续。
即会话异常终止,SMON无法恢复FLAG和清理 SYS_JOURNAL_obj#日志中间表。
故障处理
由前面可知,由于数据字典的原因,现在要清除这个无效的索引信息。有以下方法:
dbms_repair.online_index_clean
方法一使用存储过程dbms_repair.online_index_clean进行清理,这种方式不仅可以清理所有处于online built的索引,还可以指定具体的object_id。
清理所有在线索引的语法:
declare
isClean boolean;
begin
isClean := FALSE;
while isClean=FALSE loop
isClean := dbms_repair.online_index_clean(dbms_repair.all_index_id,
dbms_repair.lock_wait);
dbms_lock.sleep(2);
end loop;
exception
when others then
RAISE;
end;
/
hang着
指定具体object_id:
查询object_id
SQL> select i.obj#, i.flags, u.name, o.name, o.type#
2 from sys.obj$ o, sys.user$ u, sys.ind_online$ i
3 where (bitand(i.flags, 256) = 256 or bitand(i.flags, 512) = 512)
4 and (not ((i.type# = 9) and bitand(i.flags, 8) = 8))
5 and o.obj# = i.obj#
6 and o.owner# = u.user#;
OBJ# FLAGS NAME NAME TYPE#
---------- ---------- ------------------------------ ------------------------------ ----------
80629 256 SCOTT IDX_ID 1
SQL> select object_id from dba_objects where object_name='IDX_ID';
OBJECT_ID
----------
80629
declare
a boolean;
begin
a := dbms_repair.online_index_clean(1760,dbms_repair.lock_wait);
end;
/
hang着
但该方法具有版本限制,根据官方mos资料,在10.2及以上版本中均可以使用该存储过程,而针对10.2之前的版本,9.2.0.7和10.1.0.4可通过打上bug3805539的patch后方可使用dbms_repair包。(具体参见mos文档:Bug 3805539 - Add DBMS_REPAIR.ONLINE_INDEX_CLEAN to manually clean up failed ONLINE builds (Doc ID 3805539.8))。
但是在此处,测试都没有成功,2个操作都会hang,没有达到清理的目标。
alert日志会有如下打印:
Wed Feb 03 17:28:50 2021
online index (re)build cleanup: objn=80633 maxretry=2000 forever=0
使用ORADEBUG唤醒SMON进程进行清理
SQL> select pid,spid
2 from v$process p,v$bgprocess b
3 where b.paddr=p.addr and name='SMON';
PID SPID
---------- ------------------------
13 1469
SQL> ORADEBUG WAKEUP 13
Statement processed.
SQL>
由于打开了smon 的trc,所以在smon的trc中会有如下显示:
。。。。。。
*** 2021-02-03 14:35:47.588
SMON: system monitor process posted msgflag:0x0000 (-/-/-/-/-/-/-)
*** 2021-02-03 14:35:47.592
SMON: process sort segment requests begin
*** 2021-02-03 14:35:47.592
SMON: process sort segment requests end
*** 2021-02-03 14:35:47.592
SMON: parallel transaction recovery begin
*** 2021-02-03 14:35:47.595
SMON: parallel transaction recovery end
SMON: IMU monitor begin
SMON: IMU monitor end
SMON: cleanup index online rebuild begin
但是仅显示了begin,也没有进行清理,hang着,所以在此处也没成功。
这个语句会一直试图占用资源,但是如果一直无法占用资源,这个sql就会被挡。
SMON也会自动清理,但是触发条件比较苛刻:
Note: SMON will perform the cleanup and does this once every 60 minutes. SMON cleanup is only successful if there are no transactions against the base table [or [sub]partition] at the time of the attempted cleanup. In an environment where there are likely to be uncommitted transactions, this makes cleanup a bit ‘hit and miss’. To speed up the process, you can stop your application which uses the table and wait until the cleanup is done.
1、SMON每60分钟清理一次。2、索引所在基表上不能有事务未提交,不然也会清理失败。
因为上述事务一直未提交,所以即时到达60min,SMON也不能自动清理ind$基表。
提交原来创建索引列对应的DML事务
SQL> rollback;
Rollback complete.
SQL> select object_id from dba_objects where object_name='IDX_ID';
no rows selected
SQL> drop index idx_id;
drop index idx_id
*
ERROR at line 1:
ORA-01418: specified index does not exist
smon trc部分信息:
*** 2021-02-03 14:51:45.055
SMON: cleanup IOT-PMO failures begin
SMON: cleanup IOT-PMO failures end
SMON: offline rollback segment begin
SMON: offline rollback segment end
SMON: offline rollback segment begin
SMON: offline rollback segment end
SMON: flush modification to disk begin
SMON: flush modification to disk end
SMON: launch SMCO begin
SMON: launch SMCO end
SMON: poll shared mount broadcast channel begin
SMON: poll shared mount broadcast channel end
SMON: poll flashback broadcast channel begin
SMON: poll flashback broadcast channel end
SMON: poll segment info broadcast channel begin
SMON: poll segment info broadcast channel end
SMON: calculate time per CPU cycle begin
SMON: calculate time per CPU cycle end
–可以看到SMON是去清理IOT-PMO表,
重启数据库实例
重启数据库实例,由SMON进程清理,需要安排停机时间,停应用然后重启数据库,此时相关表上没有了DML操作,SMON进程会完成自动清理。
手动修改IND$字典基表
因需要手工修改数据库的内部数据字典,风险较大,不推荐在生产系统中使用,容易造成其他不可预见的问题。
总结
SMON功能—清理IND$字典基表
功能概述
smon对于字典的清理有2点:
1、清理IND$
基表:SMON负责在启动后(startup)的每小时执行一次对IND$
基表中因在线创建/重建索引失败所留下记录的清理,这种清理工作由kdicclean函数驱动(kdicclean is run by smon every 1 hour,called from SMON to find if there is any online builder death and cleanup our ind$ and obj$ and drop the journal table, stop journaling)。 这种清理工作典型的调用堆栈stack call如下:
ksbrdp -> ktmSmonMain ktmmon -> kdicclean -> kdic_cleanup -> ktssdrp_segment
注意因为SMON进程的清理工作每小时才执行一次,而且在工作负载很高的情况下可能实际很久都不会得到清理。
2、清理OBJ$
基表:OBJ$
基表是一张低级数据字典表,该表几乎对库中的每个对象(表、索引、包、视图等)都包含有一行记录。很多情况下,这些条目所代表的对象是不存在的对象(non-existent),引起这种现象的一种可能的原因是对象本身已经被从数据库中删除了,但是对象条目仍被保留下来以满足消极依赖机制(negative dependency)。因为这些条目的存在会导致OBJ$表不断膨胀,这时就需要由SMON进程来删除这些不再需要的行。SMON会在实例启动(after startup of DB is started cleanup function again)时以及启动后的每12个小时执行一次清理任务(the cleanup is scheduled to run after startup and then every 12 hours)。
触发场景
当我们在线创建或重建索引时(create or rebuild index online),服务进程会到IND$字典基表中将该索引对应的记录的FLAGS字段修改为十进制的256或者512(0x100=256,0×200=512),如:
SQL> create index idx_id on test(object_id) online
2 ;
^C
##online创建索引,让hang着,在另一个窗口查询。
SQL> select obj# from obj$ where name='IDX_ID';
OBJ#
----------
80633
SQL> select FLAGS from ind$ where obj#=80633;
FLAGS
----------
256
#ind_online$字典基表记录了索引在线创建/重建的历史
SQL> select * from ind_online$;
OBJ# TYPE# FLAGS
---------- ---------- ----------
80633 1 256
##kill掉online的创建索引会话。flag会变化256--》402653185。
SQL> select * from ind_online$;
OBJ# TYPE# FLAGS
---------- ---------- ----------
80633 1 256
SQL> select obj# from obj$ where name='IDX_ID';
OBJ#
----------
80633
SQL> select FLAGS from ind$ where obj#=80633;
FLAGS
----------
402653185
create table ind_online$
( obj# number not null,
type# number not null, /* what kind of index is this? */
/* normal : 1 */
/* bitmap : 2 */
/* cluster : 3 */
/* iot - top : 4 */
/* iot - nested : 5 */
/* secondary : 6 */
/* ansi : 7 */
/* lob : 8 */
/* cooperative index method : 9 */
flags number not null
/* index is being online built : 0x100 */
/* index is being online rebuilt : 0x200 */
)
清理原理
原则上online create/rebuild index的的清理工作由实际操作的服务进程负责完成,这种清理在DDL语句成功的情况下包括一系列数据字典的维护,在该DDL语句失败的情形中包括对临时段的清理和数据字典的维护,无论如何都需要drop在线日志中间表 SYS_JOURNAL_nnnnn(nnnn为该索引的obj#)。数据字典的维护工作就包含对IND$基表中相应索引记录的FLAGS标志位的恢复,但是如果服务进程在语句执行过程中意外终止的话,那么短时间内FLAGS标志位字段就无法得到恢复,这将导致对该索引的后续操作因ORA-8104错误而无法继续:
SQL> alter index idx_id rebuild online;
alter index idx_id rebuild online
*
ERROR at line 1:
ORA-08104: this index object 80629 is being online built or rebuilt
08104, 00000, "this index object %s is being online built or rebuilt"
// *Cause: the index is being created or rebuild or waited for recovering
// from the online (re)build
// *Action: wait the online index build or recovery to complete
SMON负责在启动后(startup)的每小时执行一次对IND$
基表中因在线创建/重建索引失败所留下记录的清理,这种清理工作由kdicclean函数驱动(kdicclean is run by smon every 1 hour,called from SMON to find if there is any online builder death and cleanup our ind$ and obj$ and drop the journal table, stop journaling)。
这种清理工作典型的调用堆栈stack call如下:
ksbrdp -> ktmSmonMain ktmmon -> kdicclean -> kdic_cleanup -> ktssdrp_segment
注意因为SMON进程的清理工作每小时才执行一次,而且在工作负载很高的情况下可能实际很久都不会得到清理,在这种情景中我们总是希望能尽快完成对索引的在线创建或重建,在10gr2以后的版本中我们可以直接使用dbms_repair.online_index_clean来手动清理online index rebuild的遗留问题。
但是如果在9i中的话就比较麻烦,可以尝试用以下方法(不是很推荐,除非你已经等了很久):
1.首先手工删除在线日志表,通过以下手段找出这个中间表的名字
SQL> select object_name
2 from dba_objects
3 where object_name like
4 (select '%' || object_id || '%'
5 from dba_objects
6 where object_name = '&INDEX_NAME')
7 /
Enter value for index_name: IDX_ID
OBJECT_NAME
--------------------------------------------------------------------------------------------------------------------------------
SYS_JOURNAL_80646
SQL> DROP TABLE SYS_JOURNAL_80646;
Table dropped.
2.第二步要手动修改IND$字典基表
!!! 注意!手动修改数据字典要足够小心!!
SQL> select obj# from obj$ where name='IDX_ID';
OBJ#
----------
80646
SQL> select flags from ind$ where obj#=&INDEX_OBJECT_ID;
Enter value for index_object_id: 80646
FLAGS
----------
256
a) 针对online create index,手动删除对应的记录
delete from IND$ where obj#=&INDEX_OBJECT_ID
b) 针对online rebuild index,手动恢复对应记录的FLAGS标志位
update IND$ set FLAGS=FLAGS-512 where obj#=&INDEX_OBJECT_ID
SQL> delete from IND$ where obj#=&INDEX_OBJECT_ID
2 ;
Enter value for index_object_id: 80646
1 row deleted.
接下来我们实际观察一下清理工作的细节:
SQL> select obj# from obj$ where name=‘IDX_ID’;
OBJ#
80646
SQL> select flags from ind$ where obj#=&INDEX_OBJECT_ID;
Enter value for index_object_id: 80646
FLAGS
256
SQL> commit;
Commit complete.
SQL> oradebug setmypid;
Statement processed.
SQL> oradebug event 10046 trace name context forever,level 8;
Statement processed.
SQL> DECLARE
2 isClean BOOLEAN;
3 BEGIN
4 isClean := FALSE;
5 WHILE isClean=FALSE
6 LOOP
7 isClean := dbms_repair.online_index_clean(
8 dbms_repair.all_index_id, dbms_repair.lock_wait);
9
10 dbms_lock.sleep(10);
11 END LOOP;
12 END;
13 /
PL/SQL procedure successfully completed.
SQL> oradebug tracefile_name;
/u01/app/oracle/diag/rdbms/zhuo/zhuo/trace/zhuo_ora_2982.trc
===============================10046 trace=============================
select i.obj#, i.flags, u.name, o.name, o.type#
from sys.obj$ o, sys.user$ u, sys.ind_online$ i
where (bitand(i.flags, 256) = 256 or bitand(i.flags, 512) = 512)
and (not ((i.type# = 9) and bitand(i.flags, 8) = 8))
and o.obj# = i.obj#
and o.owner# = u.user#
select u.name,
o.name,
o.namespace,
o.type#,
decode(bitand(i.property, 1024), 0, 0, 1)
from ind$ i, obj$ o, user$ u
where i.obj# = :1
and o.obj# = i.bo#
and o.owner# = u.user#
delete from object_usage
where obj# in (select a.obj#
from object_usage a, ind$ b
where a.obj# = b.obj#
and b.bo# = :1)
drop table "SYS"."SYS_JOURNAL_1343854" purge
delete from icoldep$ where obj# in (select obj# from ind$ where bo#=:1)
delete from ind$ where bo#=:1
delete from ind$ where obj#=:1
我们可以利用以下语句找出系统中可能需要恢复的IND$记录,注意不要看到查询有结果就认为这是操作失败的征兆,很可能是有人在线创建或重建索引:
select i.obj#, i.flags, u.name, o.name, o.type#
from sys.obj$ o, sys.user$ u, sys.ind_online$ i
where (bitand(i.flags, 256) = 256 or bitand(i.flags, 512) = 512)
and (not ((i.type# = 9) and bitand(i.flags, 8) = 8))
and o.obj# = i.obj#
and o.owner# = u.user#
/
禁止清理
相关诊断事件可以通过设置诊断事件event=’8105 trace name context forever’
来禁止SMON清理IND$(Oracle event to turn off smon cleanup for online index build)
alter system set events '8105 trace name context forever';
锁相关视图
v$transaction
XIDUSN表示当前事务使用的回滚段的编号
XIDSLOT说明该事务在回滚段头部的事务表中对应的记录编号(也可以叫做槽号)
XIDSQN说明序列号
STATUS说明该事务是否为活动的
v$lock
记录了session已经获得的锁定以及正在请求的锁定的信息
SID说明session的ID号
TYPE说明锁定锁定级别,主要关注TX和TM
LMODE说明已经获得的锁定的模式,以数字编码表示
REQUEST说明正在请求的锁定的模式,以数字编码表示
BLOCK说明是否阻止了其他用户获得锁定,大于0说明是,等于0说明否
锁定模式 锁定简称 编码数值
Row Exclusive RX 3
Row Shared RS 2
Share S 4
Exclusive X 6
Share Row Exclusive SRX 5
NULL N/A 0或者1
v$enqueue_lock
该视图中包含的字段以及字段含义与v$lock中的字段一模一样。
只不过该视图中只显示那些申请锁定,但是无法获得锁定的session信息。
其中的记录按照申请锁定的时间先后顺序排列,先申请锁定的session排在前面,排在前面的session将会先获得锁定。
v$locked_object
记录了当前已经被锁定的对象的信息
XIDUSN表示当前事务使用的回滚段的编号
XIDSLOT说明该事务在回滚段头部的事务表中对应的记录编号
XIDSQN说明序列号
OBJECT_ID说明当前被锁定的对象的ID号,可以根据该ID号到dba_objects里查找被锁定的对象名称
LOCKED_MODE说明锁定模式的数字编码
v$session
记录了当前session的相关信息
SID表示session的编号
SERIAL#表示序列号
SID和SERIAL#可以认为是v$session的主键,它们共同唯一标识一个session
当我们在线创建或重建索引时(create or rebuild index online),服务进程会到IND$字典基表中将该索引对应的记录的FLAGS字段修改为十进制的256或者512(见上图0x100=256,0×200=512)
参考:
https://blog.youkuaiyun.com/haibusuanyun/article/details/50236057
https://www.askmaclean.com/archives/smon-clean-ind-while-rebuild-create-index-online-fail.html