sql 调优之---在子查询中使用 exist还是 in Use of EXISTS versus IN for Subqueries In certain circumstances, it is better to use IN rather than EXISTS. In general, if the selective predicate is in the subquery, then use IN. If the selective predicate is in the parent query, then use EXISTS 在某些情况下,使用IN比使用exists要好。 一般说来,如果选择性的语法是在子查询上,那么建议使用in. 如果选择性查询语句在父查询上,那么就使用exist. Sometimes, Oracle can rewrite a subquery when used with an IN clause to take advantage of selectivity specified in the subquery. This is most beneficial when the most selective filter appears in the subquery and there are indexes on the join columns. Conversely, using EXISTS is beneficial when the most selective filter is in the parent query. This allows the selective predicates in the parent query to be applied before filtering the rows against the EXISTS criteria. 某些情况下,oracle 可以重写一个子查询,当使用IN 语法来提高子查询的选择性的时候。 当绝大多数选择性的过滤问题在子查询中的时候,或者在join 列上有索引的时候,这种IN 语法最合适。反过来,当最有选择性的语句在父查询中的时候,我们可以使用exists 语法结构。 使用exist 语法结构将允许这样的场景,父查询中有选择性的查询将被应用,再利用exist 语法中的值进行行过滤。 "Example 1: Using IN - Selective Filters in the Subquery" and "Example 2: Using EXISTS - Selective Predicate in the Parent" are two examples that demonstrate the benefits of IN and EXISTS. Both examples use the same schema with the following characteristics: There is a unique index on the employees.employee_id field. ■ There is an index on the orders.customer_id field. ■ There is an index on the employees.department_id field. ■ The employees table has 27,000 rows. ■ The orders table has 10,000 rows. ■ The OE and HR schemas, which own these segments, were both analyzed with COMPUTE. Example 1: Using IN - Selective Filters in the Subquery This example demonstrates how rewriting a query to use IN can improve performance. This query identifies all employees who have placed orders on behalf of customer 144. The following SQL statement uses EXISTS: SELECT /* EXISTS example */ e.employee_id, e.first_name, e.last_name, e.salary FROM employees e WHERE EXISTS (SELECT 1 FROM orders o /* Note 1 */ WHERE e.employee_id = o.sales_rep_id /* Note 2 */ AND o.customer_id = 144); /* Note 3 */ 我们举下面的例子来说明in/exist的作用: EXAMPLE 1:使用 IN-----子查询方式的选择性的 过滤 EXAMPLE 2: 使用exist ------在父查询中选择性的语句 两个语法都使用了如下的格式: 在 employee.employee_id 列上有一个唯一性 索引 在employees.department_id 列上有一个索引 employees 表有27000 行 oraders 表有10000行 在OE和HR用户下,含有这些段,都已经被分析过了 Note 1: This shows the line containing EXISTS.(这里列出了包含exist的行) Note 2: This shows the line that makes the subquery a correlated subquery.(这里列出了行,此行让子查询变成了相互关联的查询) Note 3: This shows the line where the correlated subqueries include the highly selective predicate customer_id = number. (这里列出了一行,这一行让相关的子查询包含了高度选择性的语法) The following plan output is the execution plan (from V$SQL_PLAN) for the preceding statement. The plan requires a full table scan of the employees table, returning many rows. Each of these rows is then filtered against the orders table (through an index). ID OPERATION OPTIONS OBJECT_NAME OPT COST ---- -------------------- --------------- ---------------------- --- ---------- 0 SELECT STATEMENT CHO 1 FILTER 2 TABLE ACCESS FULL EMPLOYEES ANA 155 3 TABLE ACCESS BY INDEX ROWID ORDERS ANA 3 4 INDEX RANGE SCAN ORD_CUSTOMER_IX ANA 1 Rewriting the statement using IN results in significantly fewer resources used. The SQL statement using IN: SELECT /* IN example */ e.employee_id, e.first_name, e.last_name, e.salary FROM employees e WHERE e.employee_id IN (SELECT o.sales_rep_id /* Note 4 */ FROM orders o WHERE o.customer_id = 144); /* Note 3 */ The following plan output is the execution plan (from V$SQL_PLAN) for the preceding statement. The plan requires a full table scan of the employees table, returning many rows. Each of these rows is then filtered against the orders table (through an index). ID OPERATION OPTIONS OBJECT_NAME OPT COST ---- -------------------- --------------- ---------------------- --- ---------- 0 SELECT STATEMENT CHO 1 FILTER 2 TABLE ACCESS FULL EMPLOYEES ANA 155 3 TABLE ACCESS BY INDEX ROWID ORDERS ANA 3 4 INDEX RANGE SCAN ORD_CUSTOMER_IX ANA 1 Rewriting the statement using IN results in significantly fewer resources used. The SQL statement using IN: SELECT /* IN example */ e.employee_id, e.first_name, e.last_name, e.salary FROM employees e WHERE e.employee_id IN (SELECT o.sales_rep_id /* Note 4 */ FROM orders o WHERE o.customer_id = 144); /* Note 3 */ The following plan output is the execution plan (from V$SQL_PLAN) for the preceding statement. The optimizer rewrites the subquery into a view, which is then joined through a unique index to the employees table. This results in a significantly better plan, because the view (that is, subquery) has a selective predicate, thus returning only a few employee_ids. These employee_ids are then used to access the employees table through the unique index. ID OPERATION OPTIONS OBJECT_NAME OPT COST ---- -------------------- --------------- ---------------------- --- ---------- 0 SELECT STATEMENT CHO 1 NESTED LOOPS 5 2 VIEW 3 3 SORT UNIQUE 3 4 TABLE ACCESS FULL ORDERS ANA 1 5 TABLE ACCESS BY INDEX ROWID EMPLOYEES ANA 1 6 INDEX UNIQUE SCAN EMP_EMP_ID_PK ANA The following plan output is the execution plan (from V$SQL_PLAN) for the preceding statement. The optimizer rewrites the subquery into a view, which is then joined through a unique index to the employees table. This results in a significantly better plan, because the view (that is, subquery) has a selective predicate, thus returning only a few employee_ids. These employee_ids are then used to access the employees table through the unique index. 如下的内容是相关语句的执行计划。这个计划中包含了对employees 表的全表扫描,返回了很多行。 然后这些返回行将和 order 表(通过index)过滤。 重写语句,我们可以得到下面的语句,这个语句使用了In语法: SELECT /* IN example */ e.employee_id, e.first_name, e.last_name, e.salary FROM employees e WHERE e.employee_id IN (SELECT o.sales_rep_id /* Note 4 */ FROM orders o WHERE o.customer_id = 144); /* Note 3 */ 查看它的执行计划如下: ID OPERATION OPTIONS OBJECT_NAME OPT COST ---- -------------------- --------------- ---------------------- --- ---------- 0 SELECT STATEMENT CHO 1 NESTED LOOPS 5 2 VIEW 3 3 SORT UNIQUE 3 4 TABLE ACCESS FULL ORDERS ANA 1 5 TABLE ACCESS BY INDEX ROWID EMPLOYEES ANA 1 6 INDEX UNIQUE SCAN EMP_EMP_ID_PK ANA 执行计划如下。优化器将重新写子查询,并把它放入一个视图,这个视图将通过唯一性索引和 employees表 进行join 操作。这个结果将是一个 好的多的执行计划,因为视图(子查询)拥有一个很好的区分度,将只会 返回很少的employee_ids.这些employee_id 将被使用来通过唯一性索引来访问employees表。 第2个例子如下: Using EXISTS - Selective Predicate in the Parent This example demonstrates how rewriting a query to use EXISTS can improve performance. This query identifies all employees from department 80 who are sales reps who have placed orders. The following SQL statement uses IN: SELECT /* IN example */ e.employee_id, e.first_name, e.last_name, e.department_id, e.salary FROM employees e WHERE e.department_id = 80 /* Note 5 */ AND e.job_id = 'SA_REP' /* Note 6 */ AND e.employee_id IN (SELECT o.sales_rep_id FROM orders o); /* Note 4 */ The following plan output is the execution plan (from V$SQL_PLAN) for the preceding statement. The SQL statement was rewritten by the optimizer to use a view on the orders table, which requires sorting the data to return all unique employee_ids existing in the orders table. Because there is no predicate, many employee_ids are returned. The large list of resulting employee_ids are then used to access the employees table through the unique index. ID OPERATION OPTIONS OBJECT_NAME OPT COST ---- -------------------- --------------- ---------------------- --- ---------- 0 SELECT STATEMENT CHO 1 NESTED LOOPS 125 2 VIEW 116 3 SORT UNIQUE 116 4 TABLE ACCESS FULL ORDERS ANA 40 5 TABLE ACCESS BY INDEX ROWID EMPLOYEES ANA 1 6 INDEX UNIQUE SCAN EMP_EMP_ID_PK ANA 这个例子讲说明如何使用exist来提高语句的perfomance. 这个查询的目的是获得部门号为80的所有的员工信息,这些员工是销售代表,负责提供订单。 相关的查询语句如下: SELECT /* IN example */ e.employee_id, e.first_name, e.last_name, e.department_id, e.salary FROM employees e WHERE e.department_id = 80 /* Note 5 */ AND e.job_id = 'SA_REP' /* Note 6 */ AND e.employee_id IN (SELECT o.sales_rep_id FROM orders o); /* Note 4 */ 如下的语句时相关的执行计划: ID OPERATION OPTIONS OBJECT_NAME OPT COST ---- -------------------- --------------- ---------------------- --- ---------- 0 SELECT STATEMENT CHO 1 NESTED LOOPS 125 2 VIEW 116 3 SORT UNIQUE 116 4 TABLE ACCESS FULL ORDERS ANA 40 5 TABLE ACCESS BY INDEX ROWID EMPLOYEES ANA 1 6 INDEX UNIQUE SCAN EMP_EMP_ID_PK ANA 优化器被重写来使用oraders表的视图,这个SQL语句将使用基于orders 表的视图, 这个语句将需要对数据进行排序来返回orders 表中的所有的唯一 employee_ids。 因为没有对order 表 通过条件进行压减,非常多的 employee_ids 都被返回了。 大量的 employee_ids被用来通过唯一性索引访问employees 表。 The following SQL statement uses EXISTS: SELECT /* EXISTS example */ e.employee_id, e.first_name, e.last_name, e.salary FROM employees e WHERE e.department_id = 80 /* Note 5 */ AND e.job_id = 'SA_REP' /* Note 6 */ AND EXISTS (SELECT 1 /* Note 1 */ FROM orders o WHERE e.employee_id = o.sales_rep_id); /* Note 2 */ Note: ■ Note 1: This shows the line containing EXISTS. ■ Note 2: This shows the line that makes the subquery a correlated subquery. ■ Note 5 & 6:These are the selective predicates in the parent SQL. The following plan output is the execution plan (from V$SQL_PLAN) for the preceding statement. The cost of the plan is reduced by rewriting the SQL statement to use an EXISTS. This plan is more effective, because two indexes are used to satisfy the predicates in the parent query, thus returning only a few employee_ids. The employee_ids are then used to access the orders table through an index. ID OPERATION OPTIONS OBJECT_NAME OPT COST ---- -------------------- --------------- ---------------------- --- ---------- 0 SELECT STATEMENT CHO 1 FILTER 2 TABLE ACCESS BY INDEX ROWID EMPLOYEES ANA 98 3 AND-EQUAL 4 INDEX RANGE SCAN EMP_JOB_IX ANA 5 INDEX RANGE SCAN EMP_DEPARTMENT_IX ANA 6 INDEX RANGE SCAN ORD_SALES_REP_IX ANA An even more efficient approach is to have a concatenated index on department_id and job_id. This eliminates the need to access two indexes and reduces the resources used 我们可以使用exists,改写成如下的语法结构: SELECT /* EXISTS example */ e.employee_id, e.first_name, e.last_name, e.salary FROM employees e WHERE e.department_id = 80 /* Note 5 */ AND e.job_id = 'SA_REP' /* Note 6 */ AND EXISTS (SELECT 1 /* Note 1 */ FROM orders o WHERE e.employee_id = o.sales_rep_id); /* Note 2 */ Note: ■ Note 1: This shows the line containing EXISTS. ■ Note 2: This shows the line that makes the subquery a correlated subquery. ■ Note 5 & 6:These are the selective predicates in the parent SQL. 执行计划如下 ID OPERATION OPTIONS OBJECT_NAME OPT COST ---- -------------------- --------------- ---------------------- --- ---------- 0 SELECT STATEMENT CHO 1 FILTER 2 TABLE ACCESS BY INDEX ROWID EMPLOYEES ANA 98 3 AND-EQUAL 4 INDEX RANGE SCAN EMP_JOB_IX ANA 5 INDEX RANGE SCAN EMP_DEPARTMENT_IX ANA 6 INDEX RANGE SCAN ORD_SALES_REP_IX ANA 我们可以看到 通过使用exist,上面的执行计划的cost 被减少了。 因为indexes 被用来满足 满足父查询的结构,因此父查询将只返回很少的employee_ids。 然后,employee_ids将被用来通过index 来访问 orders表。