SQL优化及注意事项

最新推荐文章于 2024-03-11 13:57:57 发布

原创最新推荐文章于 2024-03-11 13:57:57 发布 · 617 阅读

0 ·

CC 4.0 BY-SA版权

数据库&缓存专栏收录该内容

16 篇文章

订阅专栏

本文深入探讨了SQL查询优化的关键要点，包括避免全表扫描、利用索引、合理使用连接操作、减少OR条件的使用等。通过具体实例说明了如何优化数据库查询，以提高性能并降低资源消耗。此外，文章还提供了SQL语句简化、字段选择优化、避免NULL字段、高效使用枚举和集合类型等实用技巧，帮助开发者构建更高效、更具扩展性的数据库系统。

1.where子句中有"!="或"<>"操作符时将不使用索引而进行全表扫描。
    select * from emp where amount != 0 (不使用)
    select * from emp where amount > 0 (使用)

2.where条件中对字段增加处理函数将不使用该列的索引。
    select * from emp where to_char(date,'yyyymmdd')='20151003' (不使用)
    select * from emp where date = to_char('20151003','yyyymmdd') (使用)

3.避免在索引列上使用IS NULL和 IS NOT NULL。
    select * from emp where t_code is not null (不使用)
    select * from emp where t_code > 0 (使用)

4.通配符 % 的使用。
    select * from emp where name like '%A' (不使用索引)
    select * from emp where name like 'A%' (使用索引)

5、使用多表连接时，在from子句中，将记录数少的表放在后面，可提高执行效率。

6、多表查询时，如果其中一个表的记录数量明显大于其他表，则可以先对此表进行查询后，再与其他小表进行表连接。
    实例：
    select a.id, b.name, c.addr, sum(a.account) tot_amount
        from table_name_1 a, table_name_2 b, table_name_3 c
            where substr(a.depaddr,1,7) = substr(b.depaddr,1,7) and a.id = c.id
                group by b.name, c.addr, a.id;
    替代方案：
    select b.name, c.addr, a.id, a.tot_amount
        from (select id, depaddr, sum(account) tot_amount
                     from table_name_1 group by depaddr, id) a
                         table_name_2 b, table_name_3 c
                             where substr(a.depaddr,1,7) = substr(b.depaddr,1,7) and a.id = c.id
                                 group by b.name, c.addr, a.id;
    注意：由于table_name_1表的记录数远远大于table_name_2表和table_name_3表中的记录数，所以首先从table_name_1表中查询需要的记录，此时记录数已经被大量缩小，然后再和其他两个表连接，速度将会得到很大改善！

7、查询数量较大时，使用表连接代替IN，EXISTS，NOT IN，NOT EXISTS等。
    实例：
    a、使用IN：
        select sum(col2) from tab1 where col1 in (select col1 from tab2);
        使用EXISTS:：
        select sum(col2) from tab1 a where exists (select * from tab2 where col1=a.col1);

    替代方案：
    a、使用连接：
        select sum(a.col2) from tab1 a, tab2 b where a.col1=b.col2;
    b、使用外连接：
       select sum(a.col2) from tab1 a, tab2 b
          where a.col1=b.col2 and b.col1 is null;

8、应尽量避免在where子句中使用or来连接条件，否则将导致不使用索引而进行全表扫描。
    select id from t where num=10 or num=20 (不使用)
    select id from t where num=10 union all select id from t where num=20 (使用)

注意：
一、核心
    1.不在数据库做运算：cpu计算务必移至业务层
    2.控制单表数据量：单表记录控制在1000w
    3.控制列数量：字段数控制在20以内
    4.平衡范式与冗余：为提高效率牺牲范式设计，冗余数据
    5.拒绝：拒绝大sql，大事物，大批量

二、字段类
1.用好数值类型
    tinyint(1Byte)
    smallint(2Byte)
    mediumint(3Byte)
    int(4Byte)
    bigint(8Byte)

2.字符转化为数字
    用int而不是char(15)存储ip

3.优先使用enum或set
    例如：`sex` enum (‘F’, ‘M’)

4.避免使用NULL字段
    NULL字段很难查询优化，NULL字段的索引需要额外空间，NULL字段的复合索引无效
    bad case：
    `name` char(32) default null
        `age` int not null
    good case：
        `age` int not null default 0

5.少用text/blob
    varchar的性能会比text高很多，实在避免不了blob，请拆表

四、sql类
    1.sql语句尽可能简单
    2.不用select *
        消耗cpu，io，内存，带宽
        这种程序不具有扩展性
    3.OR改写为IN()
        or的效率是n级别
        in的消息时log(n)级别
        in的个数建议控制在200以内
        select id from t where phone=’159′ or phone=’136′; => select id from t where phone in (’159′, ’136′);

    4.性能分析工具
        show profile;
        mysqlsla;
        mysqldumpslow;
        explain;
        show slow log;
        show processlist;
        show query_response_time(percona)