SQL Optimization Tips

josonchen

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/josonchen/article/details/45698275

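This article examines SQL optimization strategies, including avoiding full table scans, using selective indexes, optimizing multi-table joins, and managing queries on views, with the aim of improving database query performance.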

SQL Optimization Tips, excerpted from: Irisbay--虹湾软件

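1 Avoid unplanned full table scans
  A full table scan is performed in the following cases:
- The table has no index
- There is no restriction at all on the rows returned (no WHERE clause)
- There is no restriction on the leading column of the index (the first column of the index)
- The condition on the leading index column is wrapped in an expression
- The condition on the leading index column is IS (NOT) NULL or !=
- The condition on the leading index column is a LIKE whose value is a bind variable or begins with %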
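
The "condition wrapped in an expression" case can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in because an Oracle instance cannot be assumed here, and SQLite's planner differs from Oracle's in detail. The table `emp`, the index `emp_sal_idx`, and the helper `plan` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, sal INTEGER)")
conn.execute("CREATE INDEX emp_sal_idx ON emp (sal)")
conn.executemany(
    "INSERT INTO emp (name, sal) VALUES (?, ?)",
    [("emp%d" % i, 1000 + i) for i in range(500)],
)

def plan(sql):
    """Return the detail text of each step in SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Condition on the bare indexed column: the index can be searched.
indexed_plan = plan("SELECT * FROM emp WHERE sal = 1200")

# Same column wrapped in an expression: the index is bypassed and
# every row of the table is visited.
expression_plan = plan("SELECT * FROM emp WHERE sal + 0 = 1200")

print(indexed_plan)
print(expression_plan)
```

The first plan reports an index search and the second a table scan; moving the arithmetic to the other side of the comparison (sal = 1200 - 0) keeps the column bare.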

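2 Use only selective indexes
   The selectivity of an index is the ratio of the number of distinct values in the indexed column to the number of rows in the table. The best possible selectivity, 1.0, belongs to a unique index on a non-null column.
Column order in a composite index:
  1 The column used most often in restricting conditions should be the leading column
  2 The most selective column (that is, the most distinct one) should be the leading column
  If 1 and 2 disagree, consider creating more than one index.
Choosing between a composite index and several single-column indexes:
  consider selectivity, the number of index reads, and the AND-EQUAL operation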
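
The selectivity ratio defined above can be computed directly. A minimal sketch, again with sqlite3 standing in for Oracle; the table `orders`, its columns, and the helper `selectivity` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, status TEXT)")
# 1000 rows: order_id is unique, status has only 4 distinct values.
statuses = ["NEW", "PAID", "SHIPPED", "CLOSED"]
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, statuses[i % 4]) for i in range(1000)],
)

def selectivity(table, column):
    """Distinct values in the column divided by the rows in the table."""
    # Identifiers are interpolated directly; acceptable only because the
    # names here are fixed by this example, not supplied by a user.
    distinct, total = conn.execute(
        "SELECT COUNT(DISTINCT %s), COUNT(*) FROM %s" % (column, table)
    ).fetchone()
    return distinct / total

id_selectivity = selectivity("orders", "order_id")
status_selectivity = selectivity("orders", "status")
print(id_selectivity, status_selectivity)
```

Here order_id reaches the ideal 1.0, while status comes out at 0.004 and would make a poor leading column by rule 2.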

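3 Manage multi-table joins (Nested Loops, Merge Joins, and Hash Joins) and optimize join operations

  Merge Joins are set operations; Nested Loops and Hash Joins are row operations that return the first rows quickly.

  Merge Joins are suited to batch processing, very large tables, and remote queries.

  1 full table scan --> 2 sort --> 3 compare and merge. The performance cost lies mainly in the first two steps.
  Wherever a full table scan is appropriate, a Merge Join is appropriate too (more efficient than Nested Loops).
  To improve step 1: tune I/O, make better use of ORACLE's multiblock reads, use the parallel query option
  To improve step 2: increase Sort_Area_Size, use Sort Direct Writes, give temporary segments a dedicated tablespace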
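
The scan, sort, compare-and-merge pipeline can be sketched in plain Python. This is a teaching sketch of the general sort-merge technique, not Oracle's implementation; the function `sort_merge_join` and the sample rows are hypothetical.

```python
def sort_merge_join(left, right, key):
    """Join two lists of dicts on `key` using sort, then merge."""
    # Step 2 of the pipeline: sort both inputs on the join key.
    left_sorted = sorted(left, key=lambda r: r[key])
    right_sorted = sorted(right, key=lambda r: r[key])

    # Step 3: walk both sorted inputs in lockstep, comparing keys.
    result = []
    i = j = 0
    while i < len(left_sorted) and j < len(right_sorted):
        lk, rk = left_sorted[i][key], right_sorted[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Emit every pairing of the matching run on the right side.
            j_run = j
            while j_run < len(right_sorted) and right_sorted[j_run][key] == lk:
                result.append({**left_sorted[i], **right_sorted[j_run]})
                j_run += 1
            i += 1  # the next left row may match the same right run
    return result

depts = [{"deptno": 20, "dname": "RESEARCH"}, {"deptno": 10, "dname": "ACCOUNTING"}]
emps = [
    {"deptno": 20, "ename": "SMITH"},
    {"deptno": 10, "ename": "CLARK"},
    {"deptno": 20, "ename": "JONES"},
    {"deptno": 30, "ename": "ALLEN"},
]
joined = sort_merge_join(depts, emps, "deptno")
print(joined)
```

No row can be emitted until both sorts have finished, which is why the cost sits in the first two steps and why the first rows arrive late compared with a row-at-a-time join.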

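4 Manage SQL statements that contain views

  The optimizer has two ways to execute a SQL statement that contains a view:

- Execute the view first to produce its complete result set, then apply the remaining query conditions as a filter
- Integrate the view text into the query

A view that contains a GROUP BY clause cannot be integrated into a larger query.

Using UNION in a view does not prevent the view's SQL from being integrated into the query's syntax.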

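5 Optimize subqueries

6 Use composite keys/star queries

7 Index CONNECT BY operations properly

8 Limit access to remote tables

9 Manage access to very large tables

- Manage data proximity: store the rows of a table ordered by the column most often used in range scans of that table. Storing data in order helps range scans, especially on large tables.

- Avoid unhelpful index scans: when the returned data set is large, an index scan occupies a large share of the SGA data block buffer cache and affects other users; a full table scan also benefits from ORACLE's multiblock read mechanism and its "consistent gets per block" behavior.

- Create fully indexed tables, so that index access alone can read most of the data; create multiple indexes that differ only in their leading column.

- Create hash clusters

- Create partitioned tables and views

- Use the parallel options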

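10 Use UNION ALL rather than UNION

   UNION ALL involves no Sort Unique operation, so the first row comes back quickly and in most cases the operation completes without a temporary segment.
   A view built with UNION ALL can be integrated into the syntax of a query that uses it, which improves efficiency.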
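
The difference in results is easy to see on a small data set. A minimal sketch with sqlite3 as a stand-in for Oracle; the tables `orders_2023` and `orders_2024` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_2023 (customer TEXT);
    CREATE TABLE orders_2024 (customer TEXT);
    INSERT INTO orders_2023 VALUES ('acme'), ('globex');
    INSERT INTO orders_2024 VALUES ('acme'), ('initech');
""")

# UNION removes duplicates, which requires comparing all rows first.
union_rows = conn.execute(
    "SELECT customer FROM orders_2023 UNION SELECT customer FROM orders_2024"
).fetchall()

# UNION ALL simply concatenates the two result sets.
union_all_rows = conn.execute(
    "SELECT customer FROM orders_2023 UNION ALL SELECT customer FROM orders_2024"
).fetchall()

print(len(union_rows), len(union_all_rows))
```

'acme' appears once under UNION and twice under UNION ALL, so the substitution is safe only when duplicates cannot occur or do not matter.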

11 Avoid PL/SQL function calls in SQL

12 Manage the use of bind variables

   Use bind variables with the EXECUTE ... USING form.

   Rewrite like :name || '%' as between :name and :name || CHR(225), so that the index is used rather than a full table scan.
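
The prefix-to-range rewrite can be checked on sample data. A sketch under assumptions: sqlite3 stands in for Oracle, the table `customers` is hypothetical, and the upper bound uses the character U+00FF as the high-valued sentinel, which is a choice made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?)",
    [("Small",), ("Smith",), ("Smythe",), ("Snow",), ("Jones",)],
)

prefix = "Sm"
# Hypothetical upper bound: the prefix followed by a high-valued character.
upper = prefix + "\u00ff"

like_rows = conn.execute(
    "SELECT name FROM customers WHERE name LIKE ? || '%' ORDER BY name",
    (prefix,),
).fetchall()
between_rows = conn.execute(
    "SELECT name FROM customers WHERE name BETWEEN ? AND ? ORDER BY name",
    (prefix, upper),
).fetchall()

print(like_rows)
print(between_rows)
```

Both queries return the same three names here. The two forms stay equivalent only while no stored value continues the prefix with a character that sorts above the sentinel, and while the comparison rules (case sensitivity, character set) match those of LIKE.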

13 Revisit the tuning process

   After the data changes, re-examine how well the tuning still holds.

SQL Tuning

TROUBLESHOOTING GUIDE: SQL Tuning
=================================
This document contains a number of potentially useful pointers for use when
attempting to tune an individual SQL statement. This is a vast topic and this
is just a drop in the ocean.

Contents: Possible Causes of Poor SQL Performance
=================================================
1. Poorly tuned SQL
2. Poor disk performance/disk contention
3. Unnecessary sorting
4. Late row elimination
5. Over parsing
6. Missing indexes/use of 'wrong' indexes
7. Wrong plan or join order selected
8. Import estimating statistics on tables
9. Insufficiently high sample rate for CBO
10. Skewed data
11. New features forcing use of CBO
12. ITL contention

Diagnostics/Remedies
====================

1. Poorly tuned SQL

  Often, part of the problem is finding the SQL that is causing the problems.
  If you are seeing problems on a system, it is usually a good idea to start
  by eliminating database setup issues by using the UTLBSTAT & UTLESTAT
  reports. See:

   Introduction to Tuning
   Tuning using BSTAT/ESTAT

  for much more on this.

  Once the database has been tuned to a reasonable level then the most
  resource hungry selects can be determined as follows
  (a very similar report can be found in the Enterprise Manager Tuning Pack):

   SELECT address, SUBSTR(sql_text,1,20) Text, buffer_gets, executions,
          buffer_gets/executions AVG
   FROM v$sqlarea
   WHERE executions > 0
   AND buffer_gets > 100000
   ORDER BY 5;

  Remember that the 'buffer_gets' value of > 100000 needs to be varied for the
  individual system being tuned. On some systems no queries will read more than
  100000 buffers, while on others most of them will. This value allows you to
  control how many rows you see returned from the select.

  The ADDRESS value retrieved above can then be used to look up the whole
  statement in the v$sqltext view:

   SELECT sql_text FROM v$sqltext WHERE address = '...' ORDER BY piece;

  Once the whole statement has been identified it can be tuned to reduce
  resource usage.

  If the problem relates to CPU bound applications then CPU information
  for each session can be examined to determine the culprits. The v$sesstat
  view can be queried to find the sessions using the most CPU, and their SQL
  can then be listed as before. Steps:

  1. Verify the reference number for the 'CPU used by this session'
   statistic:

   SELECT name, statistic#
   FROM v$statname
   WHERE name LIKE '%CPU%session';

   NAME                                STATISTIC#
   ----------------------------------- ----------
   CPU used by this session                    12

  2. Then determine which session is using most of the CPU:

   SELECT * FROM v$sesstat WHERE statistic# = 12;

          SID STATISTIC#      VALUE
   ---------- ---------- ----------
            1         12          0
            2         12          0
            3         12          0
            4         12          0
            5         12          0
            6         12          0
            7         12          0
            8         12          0
            9         12          0
           10         12          0
           11         12          0
           12         12          0
           16         12       1930

  3. Look up details for this session:

   SELECT address, SUBSTR(sql_text,1,20) Text, buffer_gets, executions,
          buffer_gets/executions AVG
   FROM v$sqlarea a, v$session s
   WHERE sid = 16
   AND s.sql_address = a.address
   AND executions > 0
   ORDER BY 5;

  4. Use v$sqltext to extract the whole SQL text.

  5. Explain the queries and examine their access paths. Autotrace is
   a useful tool for examining access paths.

2. Poor disk performance/disk contention

  Use of BSTAT/ESTAT and/or operating system I/O reports can help in this
  area. Remember that you may be able to capture the activity of a single
  statement by running the report around the run of your statement with
  no other activity.

  Another good way of monitoring I/O is to run a 10046 Level 8 trace to
  capture all the waits for a particular session. 10046 can be turned on at
  the session level using:

   alter session set events '10046 trace name context forever, level 8';

  Excessive I/O can be found by examining the resultant trace file and
  looking for I/O related waits such as:

  'db file sequential read' (Single-Block I/O - Index, Rollback Segment or Sort)
  'db file scattered read' (Multi-Block I/O - Full Table Scan).

  Remember to set TIMED_STATISTICS = TRUE to capture timing information,
  otherwise comparisons will be meaningless.

  If you are also interested in viewing bind variable values then a level 12
  trace can be used.

3. Unnecessary sorting

  The first question to ask is 'Does the data REALLY need to be sorted?'
  If sorting does need to be done then try to allocate enough memory to
  prevent the sorts from spilling to disk and causing I/O problems.

  Sorting is a very expensive operation:
   - High CPU usage
   - Potentially large disk usage

  Try to make the query sort the data as late in the access path as possible.
  The idea behind this is to make sure that the smallest number of rows
  possible are sorted.

  Remember that:
   - Indexes may be used to provide presorted data.
   - Sort merge joins inherently need to do a sort.
   - Some sorts don't actually need a sort to be performed. In this case the
   explain plan should show NOSORT for this operation.

In summary:
  - Increase sort area size to promote in-memory sorts.
  - Modify the query to process fewer rows -> less to sort.
  - Use an index to retrieve the rows in order and avoid the sort.
  - Use sort_direct_writes to avoid flooding the buffer cache with sort
   blocks.
  - If using Pro*C, use release_cursor=yes as this will free up any temporary
   segments held open.
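
The point about using an index to retrieve rows in order can be illustrated with a query plan. This sketch uses Python's sqlite3 module rather than Oracle, so the plan wording is SQLite's (Oracle would show NOSORT or omit the sort step instead); the table `orders`, the index `orders_amount_idx`, and the helper `order_by_plan` are hypothetical.

```python
import sqlite3

def order_by_plan(with_index):
    """Return plan steps for an ORDER BY, with or without a supporting index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
    if with_index:
        conn.execute("CREATE INDEX orders_amount_idx ON orders (amount)")
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id, amount FROM orders ORDER BY amount"
    )
    return [row[3] for row in rows]

# Without an index the engine must build a temporary sort structure.
plan_without_index = order_by_plan(with_index=False)

# With an index on the ORDER BY column the rows are read already sorted.
plan_with_index = order_by_plan(with_index=True)

print(plan_without_index)
print(plan_with_index)
```

The first plan contains an explicit temporary B-tree step for the ORDER BY; the second reads the index in order and has no sort step at all.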

4. Late row elimination

  Queries are more likely to be performant if the bulk of the rows can be
  eliminated early in the plan. If this does not happen then unnecessary
  comparisons may be made on rows that are simply eliminated later.
  This tends to increase CPU usage with no performance benefits.
  If these rows can be eliminated early in the access path using a selective
  predicate then this may significantly enhance the query performance.

5. Over parsing

  Over parsing implies that cursors are not being shared.
  If statements are referenced multiple times then it makes sense to share
  them rather than fill up the shared pool with multiple copies of
  essentially the same statement. See:

  Main issues affecting the Shared Pool on Oracle 7 and 8
  Use of bind variables with CBO
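
  A rough sketch of why literals defeat sharing, assuming (as Oracle does)
  that a cursor can be shared only when the statement text matches. It uses
  Python's sqlite3 module and simply counts distinct statement texts; the
  table `emp` and the variable names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)", [(i, "emp%d" % i) for i in range(100)]
)

literal_texts = set()
bound_texts = set()
for empno in range(100):
    # Literal value embedded in the text: every lookup is a new statement.
    literal_sql = "SELECT ename FROM emp WHERE empno = %d" % empno
    literal_texts.add(literal_sql)
    conn.execute(literal_sql).fetchone()

    # Bind variable: the text never changes, only the bound value does.
    bound_sql = "SELECT ename FROM emp WHERE empno = ?"
    bound_texts.add(bound_sql)
    conn.execute(bound_sql, (empno,)).fetchone()

print(len(literal_texts), len(bound_texts))
```

  One hundred lookups produce one hundred distinct texts in the literal form
  and a single text in the bound form, which is the copy the shared pool can
  reuse.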



6. Missing indexes/use of 'wrong' indexes

  If indexes are missing on key columns then queries will have to use Full
  Table Scans to retrieve data. Usually indexes for performance should be
  added to support selective predicates included in queries.

  If an unselective index is chosen in preference to a selective one then
  potential solutions are:

  RBO
  - indexes have an equal ranking so row cache order is used.

  CBO
  - reanalyze with a higher sample size
  - add histograms if column data has an uneven distribution of values
  - add hints to force use of the index you require

  Remember that index usage on a join can be compromised by the join type and
  join order chosen.

7. Wrong plan or join order selected

  If the wrong plan has been selected then you may want to force the correct
  one.

  If the problem relates to an incorrect join order, then it often helps to
  draw out the tables linking them together to show how they join e.g.:

  A-B-C-D

   |

   E-F

  This can help with visualisation of the join order and identification of
  missing joins. When tuning a plan, try different join orders and examine
  the number of rows returned to get an idea of how good they may be.

8. Import estimating statistics on tables

  Pre 8i, import performs an analyze estimate statistics on all tables
  that were analyzed when the tables were exported. This can result in
  different performance after an export/import.
  8i introduced more sampling functionality, including the facility to
  extract statistics on export.

9. Insufficiently high sample rate for CBO

  If the CBO does not have the correct statistical information then it
  cannot be expected to produce accurate results based on them. Usually a
  sample size of 5% will be sufficient.

10. Skewed data

  If column data distribution is non-uniform, then the use of column statistics
  in the form of histograms should be considered. Histogram statistics do not
  help with uniformly distributed data or where no information about the
  column predicate is available such as with bind variables.
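
  The effect of skew on a row estimate can be shown with simple arithmetic.
  This is an illustrative sketch only, not the CBO's actual costing: it
  compares a uniform assumption (rows divided by distinct values) with the
  per-value counts that a frequency histogram would record. The column
  contents are made up.

```python
from collections import Counter

# Hypothetical column: 1000 rows, 3 distinct values, heavily skewed.
status_column = ["CLOSED"] * 990 + ["OPEN"] * 9 + ["ERROR"] * 1

total_rows = len(status_column)
frequencies = Counter(status_column)

# Without a histogram, an optimizer that only knows the number of distinct
# values would assume each value matches rows / distinct_values rows.
uniform_estimate = total_rows / len(frequencies)

# A frequency histogram records the actual count per value.
for value, actual in frequencies.most_common():
    print(value, "uniform estimate:", round(uniform_estimate), "actual:", actual)
```

  The uniform estimate is about 333 rows for every value, while the actual
  counts are 990, 9 and 1, so an index that is ideal for ERROR is a poor
  choice for CLOSED.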

11. New features forcing use of CBO

  A number of new features are not implemented in the RBO and their presence
  in queries will force the use of the CBO. These include:

  - Degree of parallelism set on any table in the query
  - Index-only tables
  - Partitioned tables
  - Materialised views

12. ITL contention

  ITL contention can occur when there are not enough Interested Transaction
  List (ITL) entries in each block to support the update volume required.
  This can often occur after an export and import, especially when no update
  space has been left in the blocks and the ITLs have not been increased.