create table t as select level as id ,level||'a' as a,level||level||'b' as b from dual connect by level<100;
这里A列的值能够确定B列的值,
insert into t select * from t;
.............................. 一直重复插入数据
SQL> select count(*) from t;
COUNT(*)
----------
3244032
create index idx1 on t(a);
create index idx2 on t(a,b);
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SCOTT',
tabname => 'T',
estimate_percent => 100,
method_opt => 'for all columns size skewonly',
no_invalidate => FALSE,
degree => 8,
cascade => TRUE);
END;
/
- SQL> select * from t where a='1a' and b='11b';
- 已选择32768行。
- 已用时间: 00: 00: 03.98
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 2303463401
- ------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 331 | 3972 | 84 (0)| 00:00:02 |
- | 1 | TABLE ACCESS BY INDEX ROWID| T | 331 | 3972 | 84 (0)| 00:00:02 |
- |* 2 | INDEX RANGE SCAN | IDX2 | 331 | | 3 (0)| 00:00:01 |
- ------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 2 - access("A"='1a' AND "B"='11b')
- 统计信息
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 11838 consistent gets
- 7943 physical reads
- 0 redo size
- 441749 bytes sent via SQL*Net to client
- 24424 bytes received via SQL*Net from client
- 2186 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 32768 rows processed
SQL> select * from t where a='1a' and b='11b';
已选择32768行。
已用时间: 00: 00: 03.98
执行计划
----------------------------------------------------------
Plan hash value: 2303463401
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 331 | 3972 | 84 (0)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 331 | 3972 | 84 (0)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | IDX2 | 331 | | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"='1a' AND "B"='11b')
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
11838 consistent gets
7943 physical reads
0 redo size
441749 bytes sent via SQL*Net to client
24424 bytes received via SQL*Net from client
2186 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
32768 rows processed
因为CBO不知道A与B关系,所以计算基数等于331,
SQL> select 1/99/99*3244032 from dual; ----这个其实就是 a选择性*b选择性 =(1/99)*(1/99)
1/99/99*3244032
---------------
330.989899
但是实际上它要返回32768条记录
- SQL> select * from t where a='1a';
- 已选择32768行。
- 已用时间: 00: 00: 01.38
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 1601196873
- --------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- --------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 32768 | 384K| 1874 (8)| 00:00:23 |
- |* 1 | TABLE ACCESS FULL| T | 32768 | 384K| 1874 (8)| 00:00:23 |
- --------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 1 - filter("A"='1a')
- 统计信息
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 10120 consistent gets
- 6312 physical reads
- 0 redo size
- 441749 bytes sent via SQL*Net to client
- 24424 bytes received via SQL*Net from client
- 2186 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 32768 rows processed
SQL> select * from t where a='1a';
已选择32768行。
已用时间: 00: 00: 01.38
执行计划
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 32768 | 384K| 1874 (8)| 00:00:23 |
|* 1 | TABLE ACCESS FULL| T | 32768 | 384K| 1874 (8)| 00:00:23 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("A"='1a')
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
10120 consistent gets
6312 physical reads
0 redo size
441749 bytes sent via SQL*Net to client
24424 bytes received via SQL*Net from client
2186 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
32768 rows processed
如果where条件单独是 where a='1a' CBO 就能够算对基数,它的基数是这样计算的
SQL> select 3244032/99 from dual;
3244032/99
----------
32768
很显然,这个SQL select * from t where a='1a' and b='11b' 的执行计划走错了,它应该走全表扫描,但是因为计算基数错误,导致它走 IDX2这个索引
相关列的解决办法在Oracle中有2个,一个是动态采样,另外一个就是Oracle11g,对相关列收集扩展统计
- SQL> ALTER SESSION SET optimizer_dynamic_sampling=6;
- 会话已更改。
- SQL> set lines 200
- SQL> set pages 200
- SQL> set timi on
- SQL> explain plan for select * from t where a='1a' and b='11b';
- 已解释。
- 已用时间: 00: 00: 00.86
- SQL> select * from table(dbms_xplan.display);
- PLAN_TABLE_OUTPUT
- ---------------------------------------------------------------------------------------
- Plan hash value: 1601196873
- --------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- --------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 32776 | 384K| 1885 (8)| 00:00:23 |
- |* 1 | TABLE ACCESS FULL| T | 32776 | 384K| 1885 (8)| 00:00:23 |
- --------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 1 - filter("A"='1a' AND "B"='11b')
- Note
- -----
- - dynamic sampling used for this statement
- 已选择17行。
SQL> ALTER SESSION SET optimizer_dynamic_sampling=6;
会话已更改。
SQL> set lines 200
SQL> set pages 200
SQL> set timi on
SQL> explain plan for select * from t where a='1a' and b='11b';
已解释。
已用时间: 00: 00: 00.86
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 32776 | 384K| 1885 (8)| 00:00:23 |
|* 1 | TABLE ACCESS FULL| T | 32776 | 384K| 1885 (8)| 00:00:23 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("A"='1a' AND "B"='11b')
Note
-----
- dynamic sampling used for this statement
已选择17行。
设置动态采样之后 Oracle评估基数就基本正确了,关于11g扩展统计这里就不做了,有兴趣的请自己做一下。
我对相关列的建议就是,能否在程序里拼接?如果A能确定B,那么做DB 设计的时候就不要创建B列了 直接在程序里根据A列的值生成B的值 这样减少DB的存储空间。
如果非要在DB里设置B列,写SQL的时候就不要把2个列都写进去,也就是说不要写成
select * from t where a='1a' and b='11b';
直接写成
select * from t where a='1a' 或者 select * from t where b='11b'
这样能尽量避免CBO计算基数出错,如果这个表要与多表关联,基数一旦算错,必然导致整个SQL的执行计划全部出错,从而导致SQL性能下降。
动态采样和扩展统计虽然是解决办法,但是如果产品要考虑兼容性呢?我的产品要同时支持ORACLE,DB2,SQLSERVER,甚至以后的国产数据库达梦,他们没有动态采样怎么办。
转载自:http://blog.youkuaiyun.com/robinson1988/article/details/7311896