MySQL Internals--How MySQL optimize single range

本文深入解析MySQL如何通过SEL_TREE和SEL_ARG结构优化复杂条件下的范围扫描,并通过具体例子展示优化过程。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

MySQL Internals-How MySQL optimize single range

Louis Hust

 

0  Perface

When MySQL calculates the cost of QEP(Query execute plan), it will do a quick test for range scan with const values. But when the condition in where is complex, it should do the right thing for optimization, otherwise the exact plan may be lost. So this article showes how MySQL handle complex condition in WHERE with a sample in MySQL manual.

 

0  SEL_TREE & SEL_ARG

SEL_TREE & SEL_ARG are two important structs which is used in range optimization. Here we don't care what the two struct constains, what we should know is what they stand for. SEL_TREE can stand for a simple condition, such as c1 < 'aaa', and also can stand for complex condition, such as c1 > 'aaa' and c1 < 'bbb', so it's a process of recursion.

 

Just like the condition c1 > 'aaa' and c1 < 'bbb', first of all, there are two SEL_TREE structs st1, st2 stand for c1>'aaa', c1 < 'bbb' respectively, and the s1 and s1 do an 'tree and' operation to generate a new SEL_TREE st which stands for (c1 > 'aaa' and c1 < 'bbb').

 

What does SEL_ARG stand for? SEL_ARG stands for a range, such as c1 > 'aaa', so there is a SEL_ARG sa1 which belongs to st1, stands for c1 in range ('aaa', +inf) and a SEL_ARG sa2 which belongs to st2 stands for c1 in range (-inf, 'bbb'). And at last, the two will combine togother to generate a new SEL_ARG sa which stards for c1 in range ('aaa', 'bbb') and sa belongs to st.

 

0  Examples

 
mysql> show create table range_opt_single\G
*************************** 1. row ***************************
       Table: range_opt_single
Create Table: CREATE TABLE `range_opt_single` (
  `c1` varchar(20) DEFAULT NULL,
  `c2` int(11) DEFAULT NULL,
  KEY `c1` (`c1`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> EXPLAIN SELECT * FROM range_opt_single
    -> WHERE (c1 < 'a' AND (c1 LIKE 'abcde%' OR c1 LIKE '%b')) 
    ->       OR (c1 < 'aaa' AND c2 = 4) 
    ->       OR (c1 < 'uux' AND c1 > 'z')\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: range_opt_single
         type: range
possible_keys: c1
          key: c1
      key_len: 23
          ref: NULL
         rows: 1
        Extra: Using index condition; Using where
1 row in set (0.00 sec)

 

The WHERE condition is some kind of complex, three conditions combined with OR, let's look every part in deep.

  1. (c1 < 'a' AND (c1 LIKE 'abcde%' OR c1 LIKE '%b'))
    (c1 LIKE 'abcde%') can generate a SEL_TREE st_12 with c1 in range ('abcde', 0x 61 62 63 64 65 ff ff);
    (c1 LIKE '%b') can not generate a SEL_TREE, cause the '%b' starts with % which can not be optimized;
    So tree_or(st_12, NULL) returns NULL;
    (c1 < 'a') generates a SEL_TREE st_11 with c1 in range (-inf, 'a');
    At last, this condition returns st_11.
     
  2. (c1 < 'aaa' AND c2 = 4)
    (c1 < 'aaa') generates a SEL_TREE st_21 with c1 in range (-inf, 'aaa');
    (c2 = 4) can not generate SEL_TREE, cause c2 is not in the index;
    So tree_and(st_21, NULL) returns st_21;
     
  3. (c1 < 'uux' AND c1 > 'z')
    (c1 < 'uux') generates a SEL_TREE st_31 with c1 in range (-inf, 'uux');
    (c1 > 'uux') generates a SEL_TREE st_32 with c1 in range ('z', +inf);
    tree_and(st_31, st_32) return a new SEL_TREE st_3 with an impossible SEL_ARG ('z', 'uux');
     
  4. tree_or(st_11, st_21)
    Cause st_11 stands for (-inf, 'a') and st_21 stands for (-inf, 'aaa'), the OR operater will get the large range, so generate a new SEL_TREE st_n12 with range (-inf, 'aaa');
     
  5. tree_or(st_n12, st_3) Cause st_3 is impossible, so just return st_n12.
     
 

At last, the WHERE condition return a SEL_TREE st_n12 with SEL_ARG stands for (-inf, 'aaa'), and the the optimizer will check the cost of a plan scan with the INDEX on c1 with the range (-inf, 'aaa').

 

0  Code Inspection

You may have figured out that the function for get SEL_TREE is recursive, actually it is. The name of the function is get_mm_tree.

 

References

[1]
The Range Access Method for Single-Part Indexes




File translated from TEX by TTH, version 4.03.
On 26 Jan 2013, 19:03.

转载于:https://www.cnblogs.com/nocode/archive/2013/01/26/2878095.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值