相关列问题!

先来创建一个表T

create table t as select level as id ,level||'a' as a,level||level||'b' as b from dual connect by level<100;

这里A列的值能够确定B列的值,

insert into t select * from t;
.............................. 一直重复插入数据

SQL> select count(*) from t;

COUNT(*)
----------
3244032

create index idx1 on t(a);
create index idx2 on t(a,b);

BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SCOTT',
tabname => 'T',
estimate_percent => 100,
method_opt => 'for all columns size skewonly',
no_invalidate => FALSE,
degree => 8,
cascade => TRUE);
END;
/

  1. SQL> select * from t where a='1a' and b='11b';
  2. 已选择32768行。
  3. 已用时间: 00: 00: 03.98
  4. 执行计划
  5. ----------------------------------------------------------
  6. Plan hash value: 2303463401
  7. ------------------------------------------------------------------------------------
  8. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  9. ------------------------------------------------------------------------------------
  10. | 0 | SELECT STATEMENT | | 331 | 3972 | 84 (0)| 00:00:02 |
  11. | 1 | TABLE ACCESS BY INDEX ROWID| T | 331 | 3972 | 84 (0)| 00:00:02 |
  12. |* 2 | INDEX RANGE SCAN | IDX2 | 331 | | 3 (0)| 00:00:01 |
  13. ------------------------------------------------------------------------------------
  14. Predicate Information (identified by operation id):
  15. ---------------------------------------------------
  16. 2 - access("A"='1a' AND "B"='11b')
  17. 统计信息
  18. ----------------------------------------------------------
  19. 1 recursive calls
  20. 0 db block gets
  21. 11838 consistent gets
  22. 7943 physical reads
  23. 0 redo size
  24. 441749 bytes sent via SQL*Net to client
  25. 24424 bytes received via SQL*Net from client
  26. 2186 SQL*Net roundtrips to/from client
  27. 0 sorts (memory)
  28. 0 sorts (disk)
  29. 32768 rows processed
SQL> select * from t where a='1a' and b='11b';

已选择32768行。

已用时间:  00: 00: 03.98

执行计划
----------------------------------------------------------
Plan hash value: 2303463401

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |   331 |  3972 |    84   (0)| 00:00:02 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T    |   331 |  3972 |    84   (0)| 00:00:02 |
|*  2 |   INDEX RANGE SCAN          | IDX2 |   331 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"='1a' AND "B"='11b')


统计信息
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      11838  consistent gets
       7943  physical reads
          0  redo size
     441749  bytes sent via SQL*Net to client
      24424  bytes received via SQL*Net from client
       2186  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      32768  rows processed

因为CBO不知道A与B关系,所以计算基数等于331,

SQL> select 1/99/99*3244032 from dual; ----这个其实就是 a选择性*b选择性 =(1/99)*(1/99)

1/99/99*3244032
---------------
330.989899

但是实际上它要返回32768条记录

  1. SQL> select * from t where a='1a';
  2. 已选择32768行。
  3. 已用时间: 00: 00: 01.38
  4. 执行计划
  5. ----------------------------------------------------------
  6. Plan hash value: 1601196873
  7. --------------------------------------------------------------------------
  8. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  9. --------------------------------------------------------------------------
  10. | 0 | SELECT STATEMENT | | 32768 | 384K| 1874 (8)| 00:00:23 |
  11. |* 1 | TABLE ACCESS FULL| T | 32768 | 384K| 1874 (8)| 00:00:23 |
  12. --------------------------------------------------------------------------
  13. Predicate Information (identified by operation id):
  14. ---------------------------------------------------
  15. 1 - filter("A"='1a')
  16. 统计信息
  17. ----------------------------------------------------------
  18. 0 recursive calls
  19. 0 db block gets
  20. 10120 consistent gets
  21. 6312 physical reads
  22. 0 redo size
  23. 441749 bytes sent via SQL*Net to client
  24. 24424 bytes received via SQL*Net from client
  25. 2186 SQL*Net roundtrips to/from client
  26. 0 sorts (memory)
  27. 0 sorts (disk)
  28. 32768 rows processed
SQL> select * from t where a='1a';

已选择32768行。

已用时间:  00: 00: 01.38

执行计划
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 32768 |   384K|  1874   (8)| 00:00:23 |
|*  1 |  TABLE ACCESS FULL| T    | 32768 |   384K|  1874   (8)| 00:00:23 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"='1a')


统计信息
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      10120  consistent gets
       6312  physical reads
          0  redo size
     441749  bytes sent via SQL*Net to client
      24424  bytes received via SQL*Net from client
       2186  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      32768  rows processed

如果where条件单独是 where a='1a' CBO 就能够算对基数,它的基数是这样计算的

SQL> select 3244032/99 from dual;

3244032/99
----------
32768

很显然,这个SQL select * from t where a='1a' and b='11b' 的执行计划走错了,它应该走全表扫描,但是因为计算基数错误,导致它走 IDX2这个索引

相关列的解决办法在Oracle中有2个,一个是动态采样,另外一个就是Oracle11g,对相关列收集扩展统计

  1. SQL> ALTER SESSION SET optimizer_dynamic_sampling=6;
  2. 会话已更改。
  3. SQL> set lines 200
  4. SQL> set pages 200
  5. SQL> set timi on
  6. SQL> explain plan for select * from t where a='1a' and b='11b';
  7. 已解释。
  8. 已用时间: 00: 00: 00.86
  9. SQL> select * from table(dbms_xplan.display);
  10. PLAN_TABLE_OUTPUT
  11. ---------------------------------------------------------------------------------------
  12. Plan hash value: 1601196873
  13. --------------------------------------------------------------------------
  14. | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
  15. --------------------------------------------------------------------------
  16. | 0 | SELECT STATEMENT | | 32776 | 384K| 1885 (8)| 00:00:23 |
  17. |* 1 | TABLE ACCESS FULL| T | 32776 | 384K| 1885 (8)| 00:00:23 |
  18. --------------------------------------------------------------------------
  19. Predicate Information (identified by operation id):
  20. ---------------------------------------------------
  21. 1 - filter("A"='1a' AND "B"='11b')
  22. Note
  23. -----
  24. - dynamic sampling used for this statement
  25. 已选择17行。
SQL> ALTER SESSION SET optimizer_dynamic_sampling=6;

会话已更改。

SQL> set lines 200
SQL> set pages 200
SQL> set timi on
SQL> explain plan for select * from t where a='1a' and b='11b';

已解释。

已用时间:  00: 00: 00.86
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 32776 |   384K|  1885   (8)| 00:00:23 |
|*  1 |  TABLE ACCESS FULL| T    | 32776 |   384K|  1885   (8)| 00:00:23 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("A"='1a' AND "B"='11b')

Note
-----
   - dynamic sampling used for this statement

已选择17行。

设置动态采样之后 Oracle评估基数就基本正确了,关于11g扩展统计这里就不做了,有兴趣的请自己做一下。

我对相关列的建议就是,能否在程序里拼接?如果A能确定B,那么做DB 设计的时候就不要创建B列了 直接在程序里根据A列的值生成B的值 这样减少DB的存储空间。

如果非要在DB里设置B列,写SQL的时候就不要把2个列都写进去,也就是说不要写成
select * from t where a='1a' and b='11b';

直接写成
select * from t where a='1a' 或者 select * from t where b='11b'

这样能尽量避免CBO计算基数出错,如果这个表要与多表关联,基数一旦算错,必然导致整个SQL的执行计划全部出错,从而导致SQL性能下降。

动态采样和扩展统计虽然是解决办法,但是如果产品要考虑兼容性呢?我的产品要同时支持ORACLE,DB2,SQLSERVER,甚至以后的国产数据库达梦,他们没有动态采样怎么办。

转载自:http://blog.youkuaiyun.com/robinson1988/article/details/7311896

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值