数据库优化

本文介绍了SQL优化的方法,包括查看SQL执行效率的不同方式、COUNT函数的使用技巧、IN与EXISTS的区别、表连接顺序的影响、索引对效率的影响以及避免在SQL中调用函数等。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

前几天做了数据库优化的培训,现将培训的内容做整理,知识得自己加工一遍才会记忆深刻,有点疑惑的是,SQL优化属于DBA的工作,却是开发来培训的。
1、首先来看看如何查看sql执行效率

--环境构造
set autotrace off
DROP TABLE t1 CASCADE CONSTRAINTS PURGE; 
DROP TABLE t2 CASCADE CONSTRAINTS PURGE; 
CREATE TABLE t1 (id NUMBER NOT NULL,n NUMBER,contents VARCHAR2(4000)); 
CREATE TABLE t2 (id NUMBER NOT NULL,t1_id NUMBER NOT NULL,n NUMBER,contents VARCHAR2(4000)); 
execute dbms_random.seed(0); 
INSERT INTO t1
     SELECT  rownum,  rownum, dbms_random.string('a', 50)
       FROM dual
     CONNECT BY level <= 1000
      ORDER BY dbms_random.random; 
INSERT INTO t2 SELECT rownum, rownum, rownum, dbms_random.string('b', 50) FROM dual CONNECT BY level <= 100000
    ORDER BY dbms_random.random; 
COMMIT; 
CREATE INDEX t1_n ON t1 (n);
CREATE INDEX t2_t1_id ON t2(t1_id);

t1有1000条数据,在n上建立索引,t2有100000条数据,在t1_id上建立索引。
/*
下面我们将会用多种方法来查看如下语句的执行计划
SELECT *
FROM t1, t2
WHERE t1.id = t2.t1_id
AND t1.n in(18,19);

*/
----方法1(explain plan for 的方式。类似PLSQL DEVELOPE里的F5)

/*
  步骤1:explain plan for "你的SQL"
  步骤2:select * from table(dbms_xplan.display()); 
*/
set autotrace off
set linesize 1000
set pagesize 2000
explain plan for
SELECT  *
FROM t1, t2
WHERE t1.id = t2.t1_id
AND t1.n in(18,19);
select * from table(dbms_xplan.display());

执行结果:

SQL> select * from table(dbms_xplan.display());
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 3373872153
--------------------------------------------------------------------------------
| Id  | Operation                              | Name     | Rows  | Bytes | Cost
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |          |     2 |  8138 |
|   1 |  NESTED LOOPS                          |          |     2 |  8138 |
|   2 |   NESTED LOOPS                         |          |     2 |  8138 |
|   3 |    INLIST ITERATOR                     |          |       |       |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1       |     2 |  4056 |
|*  5 |      INDEX RANGE SCAN                  | T1_N     |     4 |       |
|*  6 |    INDEX RANGE SCAN                    | T2_T1_ID |     1 |       |
|   7 |   TABLE ACCESS BY INDEX ROWID          | T2       |     1 |  2041 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("T1"."N"=18 OR "T1"."N"=19)
   6 - access("T1"."ID"="T2"."T1_ID")
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
 
24 rows selected

/*
优点: 无需真正执行,快捷方便

缺陷: 1.没有输出运行时的相关统计信息(产生多少逻辑读,多少次递归调用,多少次物理读的情况);
2.可能不是真实的执行计划
/
----方法2(set autotrace on 方式)
/

步骤1:set autotrace on
步骤2:在此处执行你的SQL,后续会有结果输出

另,有如下几种方式:
set autotrace on (得到执行计划,输出运行结果)
set autotrace traceonly (得到执行计划,不输出运行结果)
set autotrace traceonly explain (得到执行计划,不输出运行结果和统计信息部分,仅展现执行计划部分)
set autotrace traceonl statistics(不输出运行结果和执行计划部分,仅展现统计信息部分)
*/

set autotrace on 
SELECT  *
FROM t1, t2
WHERE t1.id = t2.t1_id
AND t1.n in(18,19);

/*
–优点:可以输出运行时的相关统计信息(产生多少逻辑读,多少次递归调用,多少次物理读的情况);

–缺陷:必须要等到语句真正执行完毕后,才可以出结果,可以用traceonly开关来控制返回结果不打屏输出;

*/
----方法3(知道sql_id后,直接带入的方式,简单,就步骤1)

/*
步骤1:得到sql_id (session,awr)

步骤2: select * from table(dbms_xplan.display_cursor(‘&sq_id’)); (该方法是从共享池里得到)

*/

set autotrace off
SELECT sq.sql_id
  FROM v$sql                    sq,
       v$active_session_history s
 WHERE s.sql_id = sq.sql_id
       AND sq.sql_fulltext LIKE '%AND t1.n in(18,19)%'
 ORDER BY s.sample_time DESC;
  --显示执行效率
 select * from table(dbms_xplan.display_cursor('ds48yh2v179kx'));

/*
–优点:1.知道sql_id立即可得到执行计划;
2.可以得到真实的执行计划。

–缺陷 没有输出运行时的相关统计信息(产生多少逻辑读,多少次递归调用,多少次物理读的情况);
*/

2、count的说明
drop table t purge;
create table t as select * from dba_objects;
update t set object_id =rownum ;
set timing on
set linesize 1000
set autotrace on

–Oracle有数据缓冲池和共享池,在比较性能的时候多执行几次,让执行过程稳定了再作比较。
select count(*) from t;
/
select count(object_id) from t;
/

–看来count(列)比count()更快是谣传,明明是一样快,真相是这样吗?
–建个索引看看
create index idx_object_id on t(object_id);
select count(
) from t;
/
select count(object_id) from t;
/

–看来真的是用COUNT(列)比COUNT()要快啊,因为COUNT()不能用到索引,而COUNT(列)可以?

alter table T modify object_id not null;
select count() from t;
/
select count(object_id) from t;
/
–看来count(列)和count(
)其实一样快,如果索引列是非空的,count(*)可用到索引,此时一样快!

–其实两者根本没有可比性,性能比较首先要考虑写法等价,这两个语句根本就不等价!
alter table T modify object_id null;
update t set object_id =null where rownum<=2;
set autotrace off

select count(*) from t;
/
select count(object_id) from t;
/
3、in和exist之争
drop table emp_test purge;
drop table dept_test purge;
create table emp_test as select * from emp;
create table dept_test as select * from dept;
set timing on
set linesize 1000
set autotrace traceonly explain

–NOT IN
select * from dept_test where deptno NOT IN ( select deptno from emp_test );
select * from dept_test where not exists ( select deptno from emp_test where emp_test.deptno=dept_test.deptno);
select * from dept_test where deptno NOT IN ( select deptno from emp_test where deptno is not null) and deptno is not null;
/一般来说,anti的反连接算法比filter更高效,但是在10g时,Oracle的这个算法不完善,必须要制定非空,才可以让not in 用anti算法。在11g的时候,这个情况已经改变了,无论not in 还是not exists,无论是否列为空,都可以走到Oracle比较先进高效的anti反连接算法/

–IN
select * from dept_test where deptno IN ( select deptno from emp_test );
select * from dept_test where exists ( select deptno from emp_test where emp_test.deptno=dept_test.deptno);
4、表的链接顺序
drop table tab_big;
drop table tab_small;
create table tab_big as select * from dba_objects where rownum<=30000;
create table tab_small as select * from dba_objects where rownum<=10;
set autotrace traceonly
set linesize 1000
set timing on
select count() from tab_big,tab_small;
select count(
) from tab_small,tab_big;

—其实任何谣言,都是有一定的影子的。看看下面的语句,比较一下性能。
select /+rule/ count() from tab_big,tab_small ;
select /
+rule*/ count(*) from tab_small,tab_big ;

select /+cost/ count() from tab_big,tab_small ;
select /
+cost*/ count(*) from tab_small,tab_big ;
/结论:原来表连接顺序的说法早就过时了,那是基于规则的时代,现在我们是基于代价的。
RBO: Rule-Based Optimization 基于规则的优化器
CBO: Cost-Based Optimization 基于代价的优化器
在RBO中,SQL的写法往往会影响执行计划,CBO则不会。
/
5、索引影响效率
建立索引查询的效率提高了,但是插入的效率降低了
–构造表
drop table test1 purge;
drop table test2 purge;
drop table test3 purge;
drop table t purge;
create table t as select * from dba_objects;
create table test1 as select * from t;
create table test2 as select * from t;
create table test3 as select * from t;
–构造索引
create index test2_n1 on test2(status);
create index test2_n2 on test2(object_id);
create index test3_n1 on test3(owner);
create index test3_n2 on test3(object_name);
create index test3_n3 on test3(data_object_id);
create index test3_n4 on test3(created);
create index test3_n5 on test3(last_ddl_time);
create index test3_n6 on test3(status);
–开启时间
set timing on
set autotrace off;

select count(1) from dba_objects;

–语句1(test1表无索引)
insert into test1 select * from t;
commit;
–语句2(test2表有2个索引)
insert into test2 select * from t;
commit;
–语句3(test3表有6个索引)
insert into test3 select * from t;
commit;
6、避免SQL中调用函数
drop table people purge;
drop table sex purge;
create table people (first_name varchar2(200),last_name varchar2(200),sex_id number);
create table sex (name varchar2(20), sex_id number);
insert into people (first_name,last_name,sex_id) select object_name,object_type,1 from dba_objects;
insert into sex (name,sex_id) values (‘男’,1);
insert into sex (name,sex_id) values (‘女’,2);
insert into sex (name,sex_id) values (‘不详’,3);
commit;

create or replace function get_sex_name(p_id sex.sex_id%type) return sex.name%type is
v_name sex.name%type;
begin
select name
into v_name
from sex
where sex_id=p_id;
return v_name;
end;
/

–以下两种写法是等价的,都是为了查询people 表信息,同时通过sex 表,获取人员的性别信息。
–但是通过autotrace 比较观察发现两种写法性能上存在巨大差异
set autotrace traceonly statistics
select sex_id,
first_name||’ '||last_name full_name,
get_sex_name(sex_id) gender
from people;

select p.sex_id,
p.first_name||’ '||p.last_name full_name,
s.name
from people p, sex s
where s.sex_id=p.sex_id;

select p.sex_id,
p.first_name||’ '||p.last_name full_name,
(select s.name from sex s where s.sex_id = p.sex_id) as gender
from people p;

–标量子查询缓存
select sex_id,
first_name||’ '||last_name full_name,
(select get_sex_name(sex_id) from dual) gender
from people where sex_id<50;
6、警惕字段类型转化
drop table t_col_type purge;
create table t_col_type(id varchar2(30),col2 varchar2(30),col3 varchar2(30));
insert into t_col_type select rownum,‘abc’,‘efg’ from dual connect by level<=10000;
commit;
create index idx_id on t_col_type(id);
set linesize 1000
set autotrace traceonly

–错误的类型
select * from t_col_type where id = 6;
select * from t_col_type where to_number(id) = 6;
–实际上只有如下写法才可以用到索引,是什么类型的取值就设置什么样的字段。
select * from t_col_type where id = ‘6’;

–做查询时候不要对列做函数计算,如果有必要,对查询条件做转换
select COUNT() from pur_headers_all where to_char(creation_date, ‘yyyymmdd’) >= ‘20161219’;
select COUNT(
) from pur_headers_all where creation_date >= to_date(‘20161219’, ‘yyyymmdd’);


–关于nvl
update t_col_type t set t.id = null where rownum <= 3;
commit;
exec dbms_stats.gather_table_stats(‘MYDEV’,‘T_COL_TYPE’,cascade=>true);
select * from t_col_type where nvl(id, ‘-1’) = ‘-1’;

–查询条件is null / is not null
select * from t_col_type where id is null;
select * from t_col_type where id is not null;

–is not null在选择率小的情况下,可能走索引。
update t_col_type t set t.id = null where rownum <=9900;
commit;
exec dbms_stats.gather_table_stats(‘MYDEV’,‘T_COL_TYPE’,cascade=>true);
select * from t_col_type where id is not null;

–结论:在设计表字段的时候,尽量把字段设置为非空,可以有默认值的就设置默认值,以免在查询时用nvl或者is null做过滤,效率太低。

—以上很多实践在12c上是差别不大的,这块还有待深究,培训的内容不一定是对的。

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值