sql 调优之-在子查询中使用exist还是 in

本文链接：https://blog.youkuaiyun.com/csucxcc/article/details/5575678

本文探讨了在SQL子查询中使用IN与EXISTS的区别及其对性能的影响。通过具体案例对比了两种方法在不同情况下的表现，指出当选择性过滤在子查询中时使用IN更优，而在父查询中则使用EXISTS更佳。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

sql 调优之---在子查询中使用 exist还是 in

Use of EXISTS versus IN for Subqueries

In certain circumstances, it is better to use IN rather than EXISTS. In general, if the

selective predicate is in the subquery, then use IN. If the selective predicate is in the

parent query, then use EXISTS

在某些情况下，使用IN比使用exists要好。

一般说来，如果选择性的语法是在子查询上，那么建议使用in.

如果选择性查询语句在父查询上，那么就使用exist.

Sometimes, Oracle can rewrite a subquery when used with an IN clause to take

advantage of selectivity specified in the subquery. This is most beneficial when the

most selective filter appears in the subquery and there are indexes on the join columns.

Conversely, using EXISTS is beneficial when the most selective filter is in the parent

query. This allows the selective predicates in the parent query to be applied before

filtering the rows against the EXISTS criteria.

某些情况下，oracle 可以重写一个子查询，当使用IN 语法来提高子查询的选择性的时候。

当绝大多数选择性的过滤问题在子查询中的时候，或者在join 列上有索引的时候，这种IN 语法最合适。反过来，当最有选择性的语句在父查询中的时候，我们可以使用exists 语法结构。

使用exist 语法结构将允许这样的场景，父查询中有选择性的查询将被应用，再利用exist 语法中的值进行行过滤。

"Example 1: Using IN - Selective Filters in the Subquery" and "Example 2: Using

EXISTS - Selective Predicate in the Parent" are two examples that demonstrate the

benefits of IN and EXISTS. Both examples use the same schema with the following

characteristics:

There is a unique index on the employees.employee_id field.

■ There is an index on the orders.customer_id field.

■ There is an index on the employees.department_id field.

■ The employees table has 27,000 rows.

■ The orders table has 10,000 rows.

■ The OE and HR schemas, which own these segments, were both analyzed with

COMPUTE.

Example 1: Using IN - Selective Filters in the Subquery

This example demonstrates how

rewriting a query to use IN can improve performance. This query identifies all

employees who have placed orders on behalf of customer 144.

The following SQL statement uses EXISTS:

SELECT /* EXISTS example */

e.employee_id, e.first_name, e.last_name, e.salary

FROM employees e

WHERE EXISTS (SELECT 1 FROM orders o /* Note 1 */

WHERE e.employee_id = o.sales_rep_id /* Note 2 */

AND o.customer_id = 144); /* Note 3 */

我们举下面的例子来说明in/exist的作用：

EXAMPLE 1：使用 IN-----子查询方式的选择性的 过滤

EXAMPLE 2: 使用exist ------在父查询中选择性的语句

两个语法都使用了如下的格式：

在 employee.employee_id 列上有一个唯一性 索引

在employees.department_id 列上有一个索引

employees 表有27000 行

oraders 表有10000行

在OE和HR用户下，含有这些段，都已经被分析过了

Note 1: This shows the line containing EXISTS.（这里列出了包含exist的行）

Note 2: This shows the line that makes the subquery a correlated subquery.（这里列出了行，此行让子查询变成了相互关联的查询）

Note 3: This shows the line where the correlated subqueries include the highly selective predicate customer_id = number.

（这里列出了一行，这一行让相关的子查询包含了高度选择性的语法）

The following plan output is the execution plan (from V$SQL_PLAN) for the preceding

statement. The plan requires a full table scan of the employees table, returning many

rows. Each of these rows is then filtered against the orders table (through an index).

ID OPERATION OPTIONS OBJECT_NAME OPT COST

---- -------------------- --------------- ---------------------- --- ----------

0 SELECT STATEMENT CHO

1 FILTER

2 TABLE ACCESS FULL EMPLOYEES ANA 155

3 TABLE ACCESS BY INDEX ROWID ORDERS ANA 3

4 INDEX RANGE SCAN ORD_CUSTOMER_IX ANA 1

Rewriting the statement using IN results in significantly fewer resources used.

The SQL statement using IN:

SELECT /* IN example */

e.employee_id, e.first_name, e.last_name, e.salary

FROM employees e

WHERE e.employee_id IN (SELECT o.sales_rep_id /* Note 4 */

FROM orders o

WHERE o.customer_id = 144); /* Note 3 */

The following plan output is the execution plan (from V$SQL_PLAN) for the preceding

statement. The plan requires a full table scan of the employees table, returning many

rows. Each of these rows is then filtered against the orders table (through an index).

ID OPERATION OPTIONS OBJECT_NAME OPT COST

---- -------------------- --------------- ---------------------- --- ----------

0 SELECT STATEMENT CHO

1 FILTER

2 TABLE ACCESS FULL EMPLOYEES ANA 155

3 TABLE ACCESS BY INDEX ROWID ORDERS ANA 3

4 INDEX RANGE SCAN ORD_CUSTOMER_IX ANA 1

Rewriting the statement using IN results in significantly fewer resources used.

The SQL statement using IN:

SELECT /* IN example */

e.employee_id, e.first_name, e.last_name, e.salary

FROM employees e

WHERE e.employee_id IN (SELECT o.sales_rep_id /* Note 4 */

FROM orders o

WHERE o.customer_id = 144); /* Note 3 */

The following plan output is the execution plan (from V$SQL_PLAN) for the preceding

statement. The optimizer rewrites the subquery into a view, which is then joined

through a unique index to the employees table. This results in a significantly better

plan, because the view (that is, subquery) has a selective predicate, thus returning only

a few employee_ids. These employee_ids are then used to access the employees

table through the unique index.

ID OPERATION OPTIONS OBJECT_NAME OPT COST

---- -------------------- --------------- ---------------------- --- ----------

0 SELECT STATEMENT CHO

1 NESTED LOOPS 5

2 VIEW 3

3 SORT UNIQUE 3

4 TABLE ACCESS FULL ORDERS ANA 1

5 TABLE ACCESS BY INDEX ROWID EMPLOYEES ANA 1

6 INDEX UNIQUE SCAN EMP_EMP_ID_PK ANA

The following plan output is the execution plan (from V$SQL_PLAN) for the preceding

statement. The optimizer rewrites the subquery into a view, which is then joined

through a unique index to the employees table. This results in a significantly better

plan, because the view (that is, subquery) has a selective predicate, thus returning only

a few employee_ids. These employee_ids are then used to access the employees

table through the unique index.

如下的内容是相关语句的执行计划。这个计划中包含了对employees 表的全表扫描，返回了很多行。

然后这些返回行将和 order 表（通过index)过滤。

重写语句，我们可以得到下面的语句，这个语句使用了In语法：

SELECT /* IN example */

e.employee_id, e.first_name, e.last_name, e.salary

FROM employees e

WHERE e.employee_id IN (SELECT o.sales_rep_id /* Note 4 */

FROM orders o

WHERE o.customer_id = 144); /* Note 3 */

查看它的执行计划如下：

ID OPERATION OPTIONS OBJECT_NAME OPT COST

---- -------------------- --------------- ---------------------- --- ----------

0 SELECT STATEMENT CHO

1 NESTED LOOPS 5

2 VIEW 3

3 SORT UNIQUE 3

4 TABLE ACCESS FULL ORDERS ANA 1

5 TABLE ACCESS BY INDEX ROWID EMPLOYEES ANA 1

6 INDEX UNIQUE SCAN EMP_EMP_ID_PK ANA

执行计划如下。优化器将重新写子查询，并把它放入一个视图，这个视图将通过唯一性索引和 employees表

进行join 操作。这个结果将是一个 好的多的执行计划，因为视图（子查询）拥有一个很好的区分度，将只会

返回很少的employee_ids.这些employee_id 将被使用来通过唯一性索引来访问employees表。

第2个例子如下：

Using EXISTS - Selective Predicate in the Parent This example demonstrates how

rewriting a query to use EXISTS can improve performance. This query identifies all

employees from department 80 who are sales reps who have placed orders.

The following SQL statement uses IN:

SELECT /* IN example */

e.employee_id, e.first_name, e.last_name, e.department_id, e.salary

FROM employees e

WHERE e.department_id = 80 /* Note 5 */

AND e.job_id = 'SA_REP' /* Note 6 */

AND e.employee_id IN (SELECT o.sales_rep_id FROM orders o); /* Note 4 */

The following plan output is the execution plan (from V$SQL_PLAN) for the preceding

statement. The SQL statement was rewritten by the optimizer to use a view on the

orders table, which requires sorting the data to return all unique employee_ids

existing in the orders table. Because there is no predicate, many employee_ids are

returned. The large list of resulting employee_ids are then used to access the

employees table through the unique index.

ID OPERATION OPTIONS OBJECT_NAME OPT COST

---- -------------------- --------------- ---------------------- --- ----------

0 SELECT STATEMENT CHO

1 NESTED LOOPS 125

2 VIEW 116

3 SORT UNIQUE 116

4 TABLE ACCESS FULL ORDERS ANA 40

5 TABLE ACCESS BY INDEX ROWID EMPLOYEES ANA 1

6 INDEX UNIQUE SCAN EMP_EMP_ID_PK ANA

这个例子讲说明如何使用exist来提高语句的perfomance.

这个查询的目的是获得部门号为80的所有的员工信息，这些员工是销售代表，负责提供订单。

相关的查询语句如下：

SELECT /* IN example */

e.employee_id, e.first_name, e.last_name, e.department_id, e.salary

FROM employees e

WHERE e.department_id = 80 /* Note 5 */

AND e.job_id = 'SA_REP' /* Note 6 */

AND e.employee_id IN (SELECT o.sales_rep_id FROM orders o); /* Note 4 */

如下的语句时相关的执行计划：

ID OPERATION OPTIONS OBJECT_NAME OPT COST

---- -------------------- --------------- ---------------------- --- ----------

0 SELECT STATEMENT CHO

1 NESTED LOOPS 125

2 VIEW 116

3 SORT UNIQUE 116

4 TABLE ACCESS FULL ORDERS ANA 40

5 TABLE ACCESS BY INDEX ROWID EMPLOYEES ANA 1

6 INDEX UNIQUE SCAN EMP_EMP_ID_PK ANA

优化器被重写来使用oraders表的视图，这个SQL语句将使用基于orders 表的视图，

这个语句将需要对数据进行排序来返回orders 表中的所有的唯一 employee_ids。

因为没有对order 表 通过条件进行压减，非常多的 employee_ids 都被返回了。

大量的 employee_ids被用来通过唯一性索引访问employees 表。

The following SQL statement uses EXISTS:

SELECT /* EXISTS example */

e.employee_id, e.first_name, e.last_name, e.salary

FROM employees e

WHERE e.department_id = 80 /* Note 5 */

AND e.job_id = 'SA_REP' /* Note 6 */

AND EXISTS (SELECT 1 /* Note 1 */

FROM orders o

WHERE e.employee_id = o.sales_rep_id); /* Note 2 */

Note:

■ Note 1: This shows the line containing EXISTS.

■ Note 2: This shows the line that makes the subquery a

correlated subquery.

■ Note 5 & 6:These are the selective predicates in the parent SQL.

The following plan output is the execution plan (from V$SQL_PLAN) for the preceding

statement. The cost of the plan is reduced by rewriting the SQL statement to use an

EXISTS. This plan is more effective, because two indexes are used to satisfy the

predicates in the parent query, thus returning only a few employee_ids. The

employee_ids are then used to access the orders table through an index.

ID OPERATION OPTIONS OBJECT_NAME OPT COST

---- -------------------- --------------- ---------------------- --- ----------

0 SELECT STATEMENT CHO

1 FILTER

2 TABLE ACCESS BY INDEX ROWID EMPLOYEES ANA 98

3 AND-EQUAL

4 INDEX RANGE SCAN EMP_JOB_IX ANA

5 INDEX RANGE SCAN EMP_DEPARTMENT_IX ANA

6 INDEX RANGE SCAN ORD_SALES_REP_IX ANA

An even more efficient approach is to have a concatenated index on department_id and job_id.

This eliminates the need to access two indexes and reduces the resources used

我们可以使用exists，改写成如下的语法结构：

SELECT /* EXISTS example */

e.employee_id, e.first_name, e.last_name, e.salary

FROM employees e

WHERE e.department_id = 80 /* Note 5 */

AND e.job_id = 'SA_REP' /* Note 6 */

AND EXISTS (SELECT 1 /* Note 1 */

FROM orders o

WHERE e.employee_id = o.sales_rep_id); /* Note 2 */

Note:

■ Note 1: This shows the line containing EXISTS.

■ Note 2: This shows the line that makes the subquery a

correlated subquery.

■ Note 5 & 6:These are the selective predicates in the parent SQL.

执行计划如下

ID OPERATION OPTIONS OBJECT_NAME OPT COST

---- -------------------- --------------- ---------------------- --- ----------

0 SELECT STATEMENT CHO

1 FILTER

2 TABLE ACCESS BY INDEX ROWID EMPLOYEES ANA 98

3 AND-EQUAL

4 INDEX RANGE SCAN EMP_JOB_IX ANA

5 INDEX RANGE SCAN EMP_DEPARTMENT_IX ANA

6 INDEX RANGE SCAN ORD_SALES_REP_IX ANA

我们可以看到

通过使用exist，上面的执行计划的cost 被减少了。

因为indexes 被用来满足 满足父查询的结构，因此父查询将只返回很少的employee_ids。

然后，employee_ids将被用来通过index 来访问 orders表。