SQL优化技巧-优快云博客

SQL优化经验：
1. 避免SQL中的函数调用，否则会产生很多的递归调用。
2. 如果使用函数调用，尽量在最外层使用函数调用，否则会产生很多次的递归调用。
3. 插入多行数据的时候，可以考虑集合写法。
eg: insert into t SELECT LEVEL FROM dual CONNECT BY LEVEL < 10 ;
这样的写法比写for循环会快很多。
4. 只取你所需要的列，访问视图会更快，因为如果你访问的列越少，访问的表也越少，那么在执行计划中，可能涉及的表越少，访问效率越高。
5. 只取你所需要的列，索引读无需回表了。

针对select count(*) from t的优化方案：
1. 索引什么都不建立。（可能最慢）
2. 建立普通索引。
3. 建立位图索引。
4. 建立物化视图。
5. 缓存结果集（select /*+ result_cache */ count(*) from t ;）
如果结果集改变了，这个缓存中的数据会失效。而不会查错。如果经常改变，就会多做事。这种优化就没有用了。

下面举个例子。

将1到1000000数字插入到T表：

速度由慢到快的优化方式：

1. 使用存储过程循环，并且使用动态sql ，并未绑定变量。

2. 使用存储过程循环，并且使用动态sql ，使用绑定变量。

3. 使用存储过程循环，使用静态sql. 每次插入数据都commit .

4. 使用存储过程循环，使用静态sql. 批量提交数据。

5. 使用集合的写法。 insert into t SELECT rownum FROM dual CONNECT BY LEVEL < 1000000 ; commit ;

6. 使用直接路径写法：create table t as select rownum X from dual connect by level<1000000 ;

7. 使用并行。 create table t nologging parallel 64 as select rownum x from dual connect by level < 1000000 ;

count(*)将返回表格中所有存在的行的总数包括值为null的行，然而count(列名)将返回表格中除去null以外的所有行的总数(有默认值的列也会被计入）.
distinct 列名,得到的结果将是除去值为null和重复数据后的结果.
可以自己去试验一下下面三种情况的执行结果：
select count(distinct sal) from test;
select count(sal) from test;
select count(*) from test;

not in 与 not exist的测试：
test2中object_id含有空的数据。下面看一下执行计划：
18:15:47 SQL> select * from test t where not exists(
18:16:09 2 select 1 from test2 i where i.object_id=t.object_id
18:16:09 3 ) ;

已选择71594行。

执行计划
----------------------------------------------------------
Plan hash value: 2923433643

------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 72537 | 7792K| 565 (1)| 00:00:07 |
|* 1 | HASH JOIN RIGHT ANTI| | 72537 | 7792K| 565 (1)| 00:00:07 |
| 2 | TABLE ACCESS FULL | TEST2 | 6323 | 82199 | 281 (0)| 00:00:04 |
| 3 | TABLE ACCESS FULL | TEST | 72537 | 6871K| 283 (1)| 00:00:04 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("I"."OBJECT_ID"="T"."OBJECT_ID")

Note
-----
- dynamic sampling used for this statement (level=2)

统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
6775 consistent gets
0 physical reads
0 redo size
3443327 bytes sent via SQL*Net to client
52908 bytes received via SQL*Net from client
4774 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
71594 rows processed

看not in的执行计划：
18:16:18 SQL> select * from test t where t.object_id not in(
18:16:34 2 select i.object_id from test2 i
18:16:34 3 ) ;

未选定行

执行计划
----------------------------------------------------------
Plan hash value: 1660021635

--------------------------------------------------------------------------------
-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
|
--------------------------------------------------------------------------------
-
| 0 | SELECT STATEMENT | | 72537 | 7792K| 565 (1)| 00:00:07
|
|* 1 | HASH JOIN RIGHT ANTI NA| | 72537 | 7792K| 565 (1)| 00:00:07
|
| 2 | TABLE ACCESS FULL | TEST2 | 6323 | 82199 | 281 (0)| 00:00:04
|
| 3 | TABLE ACCESS FULL | TEST | 72537 | 6871K| 283 (1)| 00:00:04
|
--------------------------------------------------------------------------------
-

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("T"."OBJECT_ID"="I"."OBJECT_ID")

Note
-----
- dynamic sampling used for this statement (level=2)

统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
15 consistent gets
0 physical reads
0 redo size
1184 bytes sent via SQL*Net to client
405 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed

可以看出来，使用not exists的一致性读远远大于not in。因为我把test2这张表的数据有部分object_id这一列的数据部分设置为空了。在关联的字段如果含有为空的数据比较比，尽量用not in. 一般情况下，not exists和not in在11g以后，如果关联字段不为空，代价基本上是一致的。都使用了anti的半连接算法（排除法）。具体情况需要看执行计划。我做过试验，是几乎一样的。有好奇的朋友可以自己做实验验证。

原来表连接顺序的说法早就过时了，那是基于规则的时代，现在我们是基于代价的。与顺序无关。
下面可以看一下如下的基于规则的例子来实验：
select /*+rule*/ count(*) from tab_big,tab_small ;
select /*+rule*/ count(*) from tab_small,tab_big ;

查看视图不需要的字段尽量不要显示出来。可以节省效率：
下面看个例子：
drop table t1 cascade constraints purge;
drop table t2 cascade constraints purge;
create table t1 as select * from dba_objects;
create table t2 as select * from dba_objects where rownum<=10000;
update t1 set object_id=rownum ;
update t2 set object_id=rownum ;
commit;

create or replace view v_t1_join_t2
as select t2.object_id,t2.object_name,t1.object_type,t1.owner from t1,t2
where t1.object_id=t2.object_id;

set autotrace traceonly
set linesize 1000
select * from v_t1_join_t2;
select object_id,object_name from v_t1_join_t2;

--做个试验
alter table T1 add constraint pk_object_id primary key (OBJECT_ID);
alter table T2 add constraint fk_objecdt_id foreign key (OBJECT_ID) references t1 (OBJECT_ID);

select * from v_t1_join_t2;
select object_id,object_name from v_t1_join_t2;
看执行计划，可以看出来，第二个语句只扫描了一个表的数据，而第一个语句，扫描了两个表的数据。

大表建议使用分区：
drop table part_tab purge;
create table part_tab (id int,col2 int,col3 int)
partition by range (id)
(
partition p1 values less than (10000),
partition p2 values less than (20000),
partition p3 values less than (30000),
partition p4 values less than (40000),
partition p5 values less than (50000),
partition p6 values less than (60000),
partition p7 values less than (70000),
partition p8 values less than (80000),
partition p9 values less than (90000),
partition p10 values less than (100000),
partition p11 values less than (maxvalue)
)
;
insert into part_tab select rownum,rownum+1,rownum+2 from dual connect by rownum <=110000;
commit;
create index idx_par_tab_col2 on part_tab(col2) local;
create index idx_par_tab_col3 on part_tab(col3) ;

drop table norm_tab purge;
create table norm_tab (id int,col2 int,col3 int);
insert into norm_tab select rownum,rownum+1,rownum+2 from dual connect by rownum <=110000;
commit;
create index idx_nor_tab_col2 on norm_tab(col2) ;
create index idx_nor_tab_col3 on norm_tab(col3) ;

set autotrace traceonly statistics
set linesize 1000
set timing on
select * from part_tab where col2=8 ;
select * from norm_tab where col2=8 ;

select * from part_tab where col2=8 and id=2;
select * from norm_tab where col2=8 and id=2;

来自 “ ITPUB博客 ” ，链接：http://blog.itpub.net/29196873/viewspace-1142130/，如需转载，请注明出处，否则将追究法律责任。

转载于:http://blog.itpub.net/29196873/viewspace-1142130/