初次听说聚簇因子,我相信大部分人都是丈二和尚摸不着头脑,不知所云。然而当我们了解其真正含义后,也许会觉得这个高大上的名词也没有那么神秘。废话少说,上官方文档:
Index Clustering Factor
The index clustering factor measures row order in relation to an indexed value such as employee last name. The more order that exists in row storage for this value, the lower the clustering factor.
The clustering factor is useful as a rough measure of the number of I/Os required to read an entire table by means of an index
- If the clustering factor is high, then Oracle Database performs a relatively high number of I/Os during a large index range scan. The index entries point to random table blocks, so the database may have to read and reread the same blocks over and over again to retrieve the data pointed to by the index.
- If the clustering factor is low, then Oracle Database performs a relatively low number of I/Os during a large index range scan. The index keys in a range tend to point to the same data block, so the database does not have to read and reread the same blocks over and over.
下面通过实验作一个直观的展现:
点击(此处)折叠或打开
- drop table colocated;
- create table colocated ( x int, y varchar2(2000));
- begin
- for i in 1..100000 loop
- insert into colocated values(i,dbms_random.string('a',5));
- end loop;
- commit;
- end;
- /
- alter table colocated add constraint colocated_pk primary key(x);
-
- drop table disorganized;
- create table disorganized as select x,y from colocated order by y;
- alter table disorganized add constraint disorganized_pk primary key(x);
- exec dbms_stats.gather_table_stats(USER,'COLOCATED',CASCADE=>TRUE);
- exec dbms_stats.gather_table_stats(USER,'DISORGANIZED',CASCADE=>TRUE);
- SELECT a.index_name,b.num_rows,b.blocks,a.clustering_factor FROM user_indexes a,user_tables b WHERE a.table_name=b.table_name AND a.table_name='COLOCATED';
- INDEX_NAME NUM_ROWS BLOCKS CLUSTERING_FACTOR
- ------------------------------ ---------- ---------- -----------------
- COLOCATED_PK 100000 244 219
-
- SELECT a.index_name,b.num_rows,b.blocks,a.clustering_factor FROM user_indexes a,user_tables b WHERE a.table_name=b.table_name AND a.table_name='DISORGANIZED';
-
- INDEX_NAME NUM_ROWS BLOCKS CLUSTERING_FACTOR
- ------------------------------ ---------- ---------- -----------------
- DISORGANIZED_PK 100000 232 99572
了解了聚簇因子的概念,具体聚簇因子有什么用途呢?下面进一步说明。
点击(此处)折叠或打开
- set autotrace traceonly;
- SQL> select * from COLOCATED where x between 20000 and 20050;
- 51 rows selected.
-
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 1550765370
-
- --------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- --------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 52 | 572 | 3 (0)| 00:00:01 |
- | 1 | TABLE ACCESS BY INDEX ROWID| COLOCATED | 52 | 572 | 3 (0)| 00:00:01 |
- |* 2 | INDEX RANGE SCAN | COLOCATED_PK | 52 | | 2 (0)| 00:00:01 |
- --------------------------------------------------------------------------------------------
-
- Predicate Information (identified by operation id):
- ---------------------------------------------------
-
- 2 - access("X">=20000 AND "X"<=20050)
-
-
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 11 consistent gets
- 0 physical reads
- 0 redo size
- 2000 bytes sent via SQL*Net to client
- 556 bytes received via SQL*Net from client
- 5 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 51 rows processed
-
- SQL> select * from DISORGANIZED where x between 20000 and 20050;
- 51 rows selected.
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 2594580634
-
- -----------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- -----------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 52 | 572 | 54 (0)| 00:00:01 |
- | 1 | TABLE ACCESS BY INDEX ROWID| DISORGANIZED | 52 | 572 | 54 (0)| 00:00:01 |
- |* 2 | INDEX RANGE SCAN | DISORGANIZED_PK | 52 | | 2 (0)| 00:00:01 |
- -----------------------------------------------------------------------------------------------
-
- Predicate Information (identified by operation id):
- ---------------------------------------------------
-
- 2 - access("X">=20000 AND "X"<=20050)
-
-
- Statistics
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 56 consistent gets
- 0 physical reads
- 0 redo size
- 2000 bytes sent via SQL*Net to client
- 556 bytes received via SQL*Net from client
- 5 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 51 rows processed
COLOCATED和 DISORGANIZED两张表存储了同样的语句,相同的 SQL语句执行,聚簇因子小的表逻辑读为 11,聚簇因子大的逻辑读却为 56,大家都知道逻辑读和锁机制息息相关,锁会影响到了数据库的并发性,也会影响性能。
小结:
1.聚簇因子的高低会影响执行计划的选择。
2.聚簇因子与表的存储有关,无法通过重建索引来改变聚簇因子的大小。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/29827284/viewspace-2134262/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/29827284/viewspace-2134262/