SQL Optimization Tips (excerpted from: Irisbay, 虹湾软件)
1. Avoid unplanned full table scans

   A full table scan is performed when:
   - the table has no index
   - no limiting condition is placed on the rows returned (no WHERE clause)
   - no limiting condition is placed on the leading column of the index
     (the first column of the index)
   - the condition on the leading column is wrapped in an expression
   - the condition on the leading column is IS [NOT] NULL or !=
   - the condition on the leading column is a LIKE whose value is a bind
     variable or a value that begins with %

2. Use only selective indexes

   The selectivity of an index is the ratio of the number of distinct values
   in the indexed column to the number of rows in the table. The best
   possible selectivity, 1.0, belongs to a unique index on a NOT NULL column.

   Column order in a composite index:
   (1) The column used most often in limiting conditions should be the
       leading column.
   (2) The most selective column (the one with the most distinct values)
       should be the leading column.
   If (1) and (2) conflict, consider creating more than one index.

   When choosing between one composite index and several single-column
   indexes:
   - consider selectivity
   - consider the number of index reads
   - consider the AND-EQUAL operation

3. Manage multi-table joins (Nested Loops, Merge Joins and Hash Joins)

   Tune the join operation:
   - Merge Joins are set operations.
   - Nested Loops and Hash Joins are row operations and return the first
     rows quickly.
   - Merge Joins suit batch processing, very large tables and remote queries.

   A Merge Join runs in three steps:
       1. full table scan -> 2. sort -> 3. compare and merge
   Most of the cost lies in the first two steps. Wherever a full table scan
   is appropriate, a Merge Join is appropriate too (and is more efficient
   than Nested Loops).

   To improve step 1: tune I/O, make better use of ORACLE multiblock reads,
   and use the parallel query option.
   To improve step 2: raise Sort_Area_Size, use Sort Direct Writes, and give
   temporary segments a dedicated tablespace.

4. Manage SQL statements that contain views

   The optimizer can execute a statement that contains a view in two ways:
   - execute the view first, build its complete result set, then apply the
     remaining query conditions as a filter
   - merge the view text into the query

   A view that contains a GROUP BY clause cannot be merged into a larger
   query. Using UNION in a view does not prevent the view's SQL from being
   merged into the query.

5. Tune subqueries

6. Use composite keys/star queries

7. Index CONNECT BY operations properly

8. Limit access to remote tables

9. Manage access to very large tables

   - Manage data proximity
     Store rows in the order of the column most often used in range scans
     of the table. Data stored in order helps range scans, especially on
     large tables.
   - Avoid unhelpful index scans
     When the result set is large, reading through an index takes up a
     large share of the data block buffer cache in the SGA and affects
     other users. A full table scan also benefits from ORACLE's multiblock
     read mechanism and from its "consistent gets per block" behavior.
   - Create fully indexed tables
     Let index access alone read most of the data needed. Create several
     indexes that differ only in their leading column.
   - Create hash clusters
   - Create partitioned tables and views
   - Use the parallel options

10. Use UNION ALL rather than UNION

    UNION ALL does not include a SORT UNIQUE operation, so the first row is
    returned quickly and in most cases the operation completes without
    temporary segments. A view built with UNION ALL can be merged into the
    query that uses it, which improves efficiency.

11. Avoid PL/SQL function calls in SQL

12. Manage the use of bind variables

    Use bind variables with EXECUTE ... USING.
    Rewrite
        like :name || '%'
    as
        between :name and :name || CHR(225)
    so that the index is used instead of a full table scan.

13. Revisit the tuning process

    After the data changes, re-examine how well the statements are tuned.


TROUBLESHOOTING GUIDE: SQL Tuning
=================================

This document contains a number of potentially useful pointers for use when
attempting to tune an individual SQL statement. This is a vast topic and
this is just a drop in the ocean.

Possible Causes of Poor SQL Performance
=======================================

1. Poorly tuned SQL
2. Poor disk performance/disk contention
3. Unnecessary sorting
4. Late row elimination
5. Over parsing
6. Missing indexes/use of 'wrong' indexes
7. Wrong plan or join order selected
8. Import estimating statistics on tables
9. Insufficiently high sample rate for CBO
10. Skewed data
11. New features forcing use of CBO
12. ITL contention

Diagnostics/Remedies
====================

1. Poorly tuned SQL

Often, part of the problem is finding the SQL that is causing the problems.
If you are seeing problems on a system, it is usually a good idea to start
by eliminating database setup issues using the UTLBSTAT and UTLESTAT
reports. See 'Introduction to Tuning' and 'Tuning using BSTAT/ESTAT' for
much more on this.

Once the database has been tuned to a reasonable level, the most resource
hungry selects can be determined as follows (a very similar report can be
found in the Enterprise Manager Tuning Pack):

    SELECT address, SUBSTR(sql_text,1,20) Text, buffer_gets, executions,
           buffer_gets/executions AVG
    FROM   v$sqlarea
    WHERE  executions > 0
    AND    buffer_gets > 100000
    ORDER BY 5;

Remember that the 'buffer_gets' threshold of 100000 needs to be varied for
the individual system being tuned. On some systems no queries will read
more than 100000 buffers, while on others most of them will. The threshold
controls how many rows the select returns.

The ADDRESS value retrieved above can then be used to look up the whole
statement in the v$sqltext view:

    SELECT sql_text
    FROM   v$sqltext
    WHERE  address = '...'
    ORDER BY piece;

Once the whole statement has been identified it can be tuned to reduce
resource usage.

If the problem relates to CPU bound applications, then the CPU information
for each session can be examined to determine the culprits. The v$sesstat
view can be queried to find the sessions using the most CPU, and their SQL
can then be listed as before. Steps:

1. Verify the reference number for the 'CPU used by this session'
   statistic:

    SELECT name, statistic#
    FROM   v$statname
    WHERE  name LIKE '%CPU%session';

    NAME                                STATISTIC#
    ----------------------------------- ----------
    CPU used by this session                    12

2. Then determine which session is using most of the CPU:

    SELECT * FROM v$sesstat WHERE statistic# = 12;

           SID STATISTIC#      VALUE
    ---------- ---------- ----------
             1         12          0
             2         12          0
             3         12          0
             4         12          0
             5         12          0
             6         12          0
             7         12          0
             8         12          0
             9         12          0
            10         12          0
            11         12          0
            12         12          0
            16         12       1930

3. Look up details for this session:

    SELECT a.address, SUBSTR(a.sql_text,1,20) Text, a.buffer_gets,
           a.executions, a.buffer_gets/a.executions AVG
    FROM   v$sqlarea a, v$session s
    WHERE  s.sid = 16
    AND    s.sql_address = a.address
    AND    a.executions > 0
    ORDER BY 5;

4. Use v$sqltext to extract the whole SQL text.

5. Explain the queries and examine their access paths. Autotrace is a
   useful tool for examining access paths.

2. Poor disk performance/disk contention

BSTAT/ESTAT and/or operating system I/O reports can help in this area.
Remember that you may be able to capture the activity of a single statement
by running the report around the run of your statement with no other
activity.

Another good way of monitoring I/O is to run a 10046 level 8 trace to
capture all the waits for a particular session. 10046 can be turned on at
the session level using:

    alter session set events '10046 trace name context forever, level 8';

Excessive I/O can be found by examining the resulting trace file and
looking for I/O related waits such as:

    'db file sequential read'  (single-block I/O: index, rollback segment
                                or sort)
    'db file scattered read'   (multi-block I/O: full table scan)

Remember to set TIMED_STATISTICS = TRUE to capture timing information,
otherwise comparisons will be meaningless. If you are also interested in
viewing bind variable values, a level 12 trace can be used.

3. Unnecessary sorting

The first question to ask is 'Does the data REALLY need to be sorted?' If
sorting does need to be done, try to allocate enough memory to prevent the
sorts from spilling to disk and causing I/O problems.
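To see whether sorts are actually spilling to disk, one option is to look at the instance-wide sort counters and then raise the sort area for a single session. The sketch below assumes read access to the v$sysstat view; the 1 MB figure is an illustrative value only, not a recommendation.

```sql
-- Sorts completed in memory versus sorts that spilled to disk,
-- counted since instance startup.
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('sorts (memory)', 'sorts (disk)', 'sorts (rows)');

-- Raise the sort area for the current session only.
-- 1048576 bytes (1 MB) is an assumed example value; size it for your
-- own workload and available memory.
ALTER SESSION SET sort_area_size = 1048576;
```

A 'sorts (disk)' count that keeps growing relative to 'sorts (memory)' suggests that the sort area is too small for the sorts being run.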
Sorting is a very expensive operation:
- High CPU usage
- Potentially large disk usage

Try to make the query sort the data as late in the access path as possible.
The idea is to make sure that the smallest possible number of rows is
sorted.

Remember that:
- Indexes may be used to provide presorted data.
- Sort merge joins inherently need to do a sort.
- Some sort operations don't actually need a sort to be performed. In this
  case the explain plan should show NOSORT for the operation.

In summary:
- Increase the sort area size to promote in-memory sorts.
- Modify the query to process fewer rows, so there is less to sort.
- Use an index to retrieve the rows in order and avoid the sort.
- Use sort_direct_writes to avoid flooding the buffer cache with sort
  blocks.
- With Pro*C, use release_cursor=yes, as this frees any temporary segments
  held open.

4. Late row elimination

Queries are more likely to perform well if the bulk of the rows can be
eliminated early in the plan. If this does not happen, unnecessary
comparisons may be made on rows that are simply eliminated later. This
tends to increase CPU usage with no performance benefit. If these rows can
be eliminated early in the access path using a selective predicate, query
performance may improve significantly.

5. Over parsing

Over parsing implies that cursors are not being shared. If statements are
referenced multiple times, it makes sense to share them rather than fill up
the shared pool with multiple copies of essentially the same statement.
See 'Main issues affecting the Shared Pool on Oracle 7 and 8' and 'Use of
bind variables with CBO'.

6. Missing indexes/use of 'wrong' indexes

If indexes are missing on key columns then queries will have to use Full
Table Scans to retrieve data. Usually, indexes for performance should be
added to support the selective predicates included in queries.
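A rough way to check whether a predicate column is selective enough to deserve an index is to compute the ratio of distinct values to rows before creating it. The following is a minimal sketch; the orders table, its columns and the index name are hypothetical and exist only for illustration.

```sql
-- Hypothetical table and sample rows, used only for illustration.
CREATE TABLE orders (
  order_id     NUMBER       NOT NULL,
  customer_id  NUMBER       NOT NULL,
  status       VARCHAR2(10) NOT NULL
);

INSERT INTO orders VALUES (1, 101, 'OPEN');
INSERT INTO orders VALUES (2, 102, 'OPEN');
INSERT INTO orders VALUES (3, 103, 'SHIPPED');
INSERT INTO orders VALUES (4, 101, 'OPEN');

-- Selectivity = distinct values / total rows (1.0 is the best case).
-- With the four rows above: customer_id gives 3/4, status gives 2/4.
-- The division assumes the table is not empty.
SELECT COUNT(DISTINCT customer_id) / COUNT(*) AS customer_id_selectivity,
       COUNT(DISTINCT status)      / COUNT(*) AS status_selectivity
FROM   orders;

-- customer_id is the more selective column, so it is the better
-- candidate if queries filter on it.
CREATE INDEX orders_customer_idx ON orders (customer_id);
```

On a real table the same ratio can be read from optimizer statistics instead of being counted directly, which avoids scanning the table.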
If an unselective index is chosen in preference to a selective one then
potential solutions are:

RBO
- Indexes have an equal ranking, so row cache order is used.

CBO
- Reanalyze with a higher sample size.
- Add histograms if the column data has an uneven distribution of values.
- Add hints to force use of the index you require.

Remember that index usage on a join can be compromised by the join type and
join order chosen.

7. Wrong plan or join order selected

If the wrong plan has been selected then you may want to force the correct
one. If the problem relates to an incorrect join order, it often helps to
draw out the tables, linking them together to show how they join, e.g.:

    A-B-C-D | E-F

This can help with visualisation of the join order and identification of
missing joins. When tuning a plan, try different join orders and examine
the number of rows returned to get an idea of how good they may be.

8. Import estimating statistics on tables

Before 8i, import performs an analyze with estimated statistics on every
table that had been analyzed when the tables were exported. This can result
in different performance after an export/import. 8i introduced more
sampling functionality, including the facility to extract statistics on
export.

9. Insufficiently high sample rate for CBO

If the CBO does not have the correct statistical information then it cannot
be expected to produce accurate results from it. Usually a sample size of
5% will be sufficient.

10. Skewed data

If the column data distribution is non-uniform, then the use of column
statistics in the form of histograms should be considered. Histogram
statistics do not help with uniformly distributed data, or where no
information about the column predicate is available, such as with bind
variables.

11. New features forcing use of CBO

A number of new features are not implemented in the RBO and their presence
in queries will force the use of the CBO. These include:
- Degree of parallelism set on any table in the query
- Index-only tables
- Partitioned tables
- Materialised views

12. ITL contention

ITL contention can occur when there are not enough Interested Transaction
List (ITL) entries in each block to support the update volume required.
This can often occur after an export and import, especially when no update
space has been left in the blocks and the number of ITLs has not been
increased.
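
The two settings mentioned above, free space for updates and the number of ITL entries, are controlled per segment by PCTFREE and INITRANS. The sketch below shows one way they might be set; the account_balance table is hypothetical, and the values 20, 4 and 8 are assumed examples that would need sizing for the real update volume.

```sql
-- Hypothetical table; PCTFREE 20 and INITRANS 4 are assumed example values.
-- PCTFREE leaves room in each block for updates and extra ITL entries;
-- INITRANS preallocates ITL entries in each block.
CREATE TABLE account_balance (
  account_id NUMBER NOT NULL,
  balance    NUMBER NOT NULL
)
PCTFREE 20
INITRANS 4;

-- For an existing table the setting can be raised, but the new value
-- applies only to blocks formatted after the change.
ALTER TABLE account_balance INITRANS 8;
```

Because blocks that are already in use keep their old settings, an existing table usually has to be rebuilt (for example by export and import with the new settings in place) before the change takes full effect.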