Oracle IN vs. EXISTS and NOT IN vs. NOT EXISTS: How They Work and How They Perform

深圳gg

License: CC 4.0 BY-SA

Category: SQL Statement Study (SQL语句学习)

Article link: https://blog.youkuaiyun.com/stevendbaguo/article/details/12854887

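Many people are still confused about IN versus EXISTS and NOT IN versus NOT EXISTS. Some teams go as far as banning NOT IN and requiring NOT EXISTS everywhere. Is NOT EXISTS really always more efficient? The experiments below put that to the test.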

First, create some test data:

SQL> drop table test1 purge;
SQL> drop table test2 purge;
SQL> create table test1 as select * from dba_objects where rownum <=1000;
SQL> create table test2 as select * from dba_objects;
SQL> exec dbms_stats.gather_table_stats(user,'test1');
SQL> exec dbms_stats.gather_table_stats(user,'test2');
SQL> set autotrace traceonly

IN and EXISTS: how they work and how they perform
SQL> select * from test1 t1 where t1.object_id in (select t2.object_id from test2 t2);
1000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3819917785
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 997 | 84745 | 168 (3)| 00:00:03 |
|* 1 | HASH JOIN SEMI | | 997 | 84745 | 168 (3)| 00:00:03 |
| 2 | TABLE ACCESS FULL| TEST1 | 1000 | 80000 | 5 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| TEST2 | 50687 | 247K| 162 (2)| 00:00:02 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
95 consistent gets
0 physical reads
0 redo size
45820 bytes sent via SQL*Net to client
1111 bytes received via SQL*Net from client
68 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1000 rows processed

SQL> select * from test1 t1
2 where exists (select 1 from test2 t2 where t1.object_id = t2.object_id);
1000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3819917785
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 997 | 84745 | 168 (3)| 00:00:03 |
|* 1 | HASH JOIN SEMI | | 997 | 84745 | 168 (3)| 00:00:03 |
| 2 | TABLE ACCESS FULL| TEST1 | 1000 | 80000 | 5 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| TEST2 | 50687 | 247K| 162 (2)| 00:00:02 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
95 consistent gets
0 physical reads
0 redo size
45820 bytes sent via SQL*Net to client
1111 bytes received via SQL*Net from client
68 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1000 rows processed
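Conclusion: in Oracle 10g, IN and EXISTS are effectively the same. Both are executed as a HASH JOIN SEMI between the two tables, with the same plan hash value (3819917785) and the same 95 consistent gets. A 10053 trace also shows that the optimizer transforms both statements into the same final SQL.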
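
To make the semi-join idea concrete, here is a minimal Python sketch of it. It is not Oracle's implementation: the function name `hash_semi_join` and the tiny sample rows are made up for illustration, and the sketch hashes the subquery side for brevity, whereas the real optimizer chooses which input to hash. The point is only that an outer row is returned at most once as soon as one match is known to exist, which is what both the IN form and the EXISTS form ask for.

```python
def hash_semi_join(outer_rows, inner_rows, key):
    """Return each outer row at most once if its key appears in inner_rows."""
    # Build phase: collect the distinct, non-NULL join keys of the inner input.
    inner_keys = {row[key] for row in inner_rows if row[key] is not None}
    # Probe phase: keep an outer row when at least one match exists.
    # A None (NULL) key never matches, mirroring SQL equality semantics.
    return [row for row in outer_rows if row[key] in inner_keys]


# Hypothetical sample data, not taken from the article's tables.
test1 = [{"object_id": v} for v in (1, 2, 3, None)]
test2 = [{"object_id": v} for v in (2, 2, 3, 5)]

matched_ids = [row["object_id"] for row in hash_semi_join(test1, test2, "object_id")]
print(matched_ids)  # [2, 3]
```

Note that the value 2 occurs twice in the inner input but the matching outer row is returned only once, which is the difference between a semi join and an ordinary join.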

NOT IN and NOT EXISTS: how they work and how they perform
An example where NOT EXISTS is more efficient than NOT IN:
SQL> select count(*) from test1 where object_id not in(select object_id from test2);
Execution Plan
----------------------------------------------------------
Plan hash value: 3641219899
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 81076 (2)| 00:16:13 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
|* 2 | FILTER | | | | | |
| 3 | TABLE ACCESS FULL| TEST1 | 1000 | 4000 | 5 (0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL| TEST2 | 1 | 5 | 162 (2)| 00:00:02 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "TEST2" "TEST2" WHERE
LNNVL("OBJECT_ID"<>:B1)))
4 - filter(LNNVL("OBJECT_ID"<>:B1))
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
9410 consistent gets
0 physical reads
0 redo size
407 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> select count(*) from test1 t1 where not exists
(select 1 from test2 t2 where t1.object_id=t2.object_id);
Execution Plan
----------------------------------------------------------
Plan hash value: 240185659
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 168 (3)| 00:00:03 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
|* 2 | HASH JOIN ANTI | | 3 | 27 | 168 (3)| 00:00:03 |
| 3 | TABLE ACCESS FULL| TEST1 | 1000 | 4000 | 5 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| TEST2 | 50687 | 247K| 162 (2)| 00:00:02 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
717 consistent gets
0 physical reads
0 redo size
407 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

An example where NOT IN is more efficient than NOT EXISTS:
SQL> Set autotrace off
SQL> drop table test1 purge;
Table dropped.
SQL> drop table test2 purge;
Table dropped.
SQL> create table test1 as select * from dba_objects where rownum <=5;
Table created.
SQL> create table test2 as select * from dba_objects;
Table created.
SQL> Insert into test2 select * from dba_objects;
50687 rows created.
SQL> Insert into test2 select * from test2;
101374 rows created.
SQL> Insert into test2 select * from test2;
202748 rows created.
SQL> Commit;
Commit complete.
SQL> exec dbms_stats.gather_table_stats(user,'test1');
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats(user,'test2');
PL/SQL procedure successfully completed.
SQL> Set autotrace traceonly
SQL> select count(*) from test1 where object_id not in(select object_id from test2);
Execution Plan
----------------------------------------------------------
Plan hash value: 3641219899
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 3143 (2)| 00:00:38 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | FILTER | | | | | |
| 3 | TABLE ACCESS FULL| TEST1 | 5 | 15 | 3 (0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL| TEST2 | 8 | 40 | 1256 (2)| 00:00:16 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "TEST2" "TEST2" WHERE
LNNVL("OBJECT_ID"<>:B1)))
4 - filter(LNNVL("OBJECT_ID"<>:B1))
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
23 consistent gets
0 physical reads
0 redo size
407 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

SQL> select count(*) from test1 t1 where not exists
(select 1 from test2 t2 where t1.object_id=t2.object_id);
Execution Plan
----------------------------------------------------------
Plan hash value: 240185659
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 8 | 1263 (3)| 00:00:16 |
| 1 | SORT AGGREGATE | | 1 | 8 | | |
|* 2 | HASH JOIN ANTI | | 1 | 8 | 1263 (3)| 00:00:16 |
| 3 | TABLE ACCESS FULL| TEST1 | 5 | 15 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| TEST2 | 405K| 1981K| 1253 (2)| 00:00:16 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."OBJECT_ID"="T2"."OBJECT_ID")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
5609 consistent gets
0 physical reads
0 redo size
407 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
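Conclusion: in these plans, the difference between NOT IN and NOT EXISTS comes down to nested loops versus HASH JOIN. The FILTER operation used for NOT IN works much like nested loops: for each row of test1 it probes test2 and can stop at the first match. NOT EXISTS is executed as a HASH JOIN ANTI, which reads test2 in full once. Comparing the two is therefore comparing the performance of nested loops with that of a hash join. In these examples:
NOT IN outperforms NOT EXISTS when test1 has 5 rows and test2 has a little over 400,000 rows (23 versus 5609 consistent gets).
NOT EXISTS outperforms NOT IN when test1 has 1000 rows and test2 has 50687 rows (717 versus 9410 consistent gets).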
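
The trade-off can be sketched with a rough cost model in Python. This is a simplification under stated assumptions, not a model of Oracle's actual costing: the two functions below are hypothetical, they count rows visited rather than consistent gets, they assume no NULL keys, and the data sizes are synthetic.

```python
def filter_not_in(outer_keys, inner_keys):
    """FILTER-style: rescan the inner input per outer row, stop at first match."""
    visited = 0
    kept = 0
    for o in outer_keys:
        matched = False
        for i in inner_keys:
            visited += 1
            if i == o:
                matched = True
                break  # early exit: one match is enough to reject the row
        if not matched:
            kept += 1  # an unmatched row costs a full scan of the inner input
    return kept, visited


def hash_anti_join(outer_keys, inner_keys):
    """Hash-anti-join-style: hash the small input, read the big input once."""
    unmatched = set(outer_keys)      # build phase over the outer keys
    visited = len(outer_keys)
    for i in inner_keys:             # probe phase: every inner row is read
        visited += 1
        unmatched.discard(i)
    kept = sum(1 for o in outer_keys if o in unmatched)
    return kept, visited


# Few outer rows, large inner input, matches found early.
small_outer = list(range(5))
big_inner = list(range(1000)) * 4
print(filter_not_in(small_outer, big_inner))   # (0, 15)
print(hash_anti_join(small_outer, big_inner))  # (0, 4005)

# Many outer rows: the repeated rescans dominate.
many_outer = list(range(1000))
inner = list(range(50000))
print(filter_not_in(many_outer, inner))   # (0, 500500)
print(hash_anti_join(many_outer, inner))  # (0, 51000)
```

With only a handful of outer rows whose matches sit near the start of the inner input, the early exit makes the filter strategy very cheap. As the number of outer rows grows, the rescans add up and a single full pass with a hash table wins.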

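There is one more important difference between NOT IN and NOT EXISTS: when the join column returned by the subquery contains NULL values, NOT IN returns no rows at all, which is usually not the result the query author expects. See:
http://blog.youkuaiyun.com/stevendbaguo/article/details/8270572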
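
The NULL behavior is easy to reproduce on a small scale. The sketch below uses Python's built-in `sqlite3` module rather than Oracle, on the assumption that SQLite applies the same standard three-valued logic to NOT IN and NOT EXISTS in this case. The table contents are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test1 (object_id INTEGER);
    CREATE TABLE test2 (object_id INTEGER);
    INSERT INTO test1 VALUES (1), (2), (3);
    INSERT INTO test2 VALUES (1), (NULL);  -- one NULL in the subquery column
""")

# NOT IN: "2 <> NULL" evaluates to unknown, so no row can pass the test.
not_in = conn.execute("""
    SELECT object_id FROM test1
    WHERE object_id NOT IN (SELECT object_id FROM test2)
    ORDER BY object_id
""").fetchall()

# NOT EXISTS: the correlated equality simply never matches the NULL row.
not_exists = conn.execute("""
    SELECT object_id FROM test1 t1
    WHERE NOT EXISTS (SELECT 1 FROM test2 t2 WHERE t1.object_id = t2.object_id)
    ORDER BY object_id
""").fetchall()

print(not_in)      # []
print(not_exists)  # [(2,), (3,)]
conn.close()
```

If the NOT IN form is required, adding `WHERE object_id IS NOT NULL` inside the subquery makes both queries return the same rows.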