文章目录
not null的索引为聚集索引,null的索引为辅助索引?
1. 索引
1.1 索引I/O次数
索引是B+树结构,目的是为了减少I/O次数,叶子节点也就是最后一层存放的是数据,非叶子节点应该保存的是下一层的地址,所以每次经过索引查找所需的I/O次数其实就是该B+树的高度的,要是三层B+树,则每次查找都需要经过三次的I/O。
1.2 覆盖索引为什么就不需要回表
覆盖索引就是索引中已经有对应所需的字段,则不需要回表查询其字段值了,比如联合索引 (a,b),select a,b from table where a = ? and b = ?;这样就不需要回表查询了,因为找到的索引中有a,b值了,而select * from table where a = ? and b = ?;这样就需要再回一次表查询,虽然在索引中找到了对应a和b的值,但其他字段也要查询出来,就需要回表查询了。
2. 聚集(聚簇)索引
叶子节点存放的不止是索引还有行记录数据,即数据页。
1. 物理上不需要连续,逻辑上连续即可:通过双向链表连接实现逻辑连续
2. 范围查询:通过主键查找某一范围的数据,通过叶子节点的上层中间节点就能得到页的范围,直接读取数据页即可
3. 辅助(非聚集)索引
索引页存放的是键值以及书签(bookmark),书签能够找到索引对应的行数据。
1. 每张表可以有多个辅助索引,辅助索引可以指向聚集索引,若是在一颗高度为3的辅助索引树查找数据,则最多3次IO,然后再去聚集索引树查找页,可能总共需要6次IO
2. 辅助索引的分析:找图片
辅助索引使用不了:如下数据中,通过select * 查找的数据,使用辅助索引时,却发现是type是all,即全表扫描,
这是因为辅助索引并不包含整行记录的所有数据,则*是想找到其他列的数据,此时优化器就会使用聚集索引了,
当使用select creation_date时,就能够使用到辅助索引了,或者通过force index强制指定索引
mysql> show index from jiashi_guess;
+--------------+------------+------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| jiashi_guess | 0 | PRIMARY | 1 | user_id | A | 26010 | NULL | NULL | | BTREE | | |
| jiashi_guess | 0 | PRIMARY | 2 | guess_type_id | A | 58456 | NULL | NULL | | BTREE | | |
| jiashi_guess | 1 | idx_create | 1 | creation_date | A | 50279 | NULL | NULL | YES | BTREE | | |
| jiashi_guess | 1 | idx_modifi | 1 | modification_date | A | 2 | NULL | NULL | YES | BTREE | | |
+--------------+------------+------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
mysql> explain select * from jiashi_guess where creation_date > '2018-07-10';
+----+-------------+--------------+------------+------+---------------+------+---------+------+-------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------+------------+------+---------------+------+---------+------+-------+----------+-------------+
| 1 | SIMPLE | jiashi_guess | NULL | ALL | idx_create | NULL | NULL | NULL | 58456 | 49.61 | Using where |
+----+-------------+--------------+------------+------+---------------+------+---------+------+-------+----------+-------------+
1 row in set, 1 warning (0.11 sec)
mysql> explain select creation_date from jiashi_guess where creation_date > '2018-07-10';
+----+-------------+--------------+------------+-------+---------------+------------+---------+------+-------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------+------------+-------+---------------+------------+---------+------+-------+----------+--------------------------+
| 1 | SIMPLE | jiashi_guess | NULL | range | idx_create | idx_create | 6 | NULL | 29000 | 100.00 | Using where; Using index |
+----+-------------+--------------+------------+-------+---------------+------------+---------+------+-------+----------+--------------------------+
1 row in set, 1 warning (0.02 sec)
mysql> explain select * from jiashi_guess force index(idx_create) where creation_date > '2018-07-10';
+----+-------------+--------------+------------+-------+---------------+------------+---------+------+-------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------+------------+-------+---------------+------------+---------+------+-------+----------+-----------------------+
| 1 | SIMPLE | jiashi_guess | NULL | range | idx_create | idx_create | 6 | NULL | 29000 | 100.00 | Using index condition |
+----+-------------+--------------+------------+-------+---------------+------------+---------+------+-------+----------+-----------------------+
4. 分裂
增加值之后,若是节点满了则会分裂,重新构造B+树,InnoDB的page header中能决定分裂是向左或者向右,若是只能向左则会导致空间的浪费,因为最左的节点一直不用分裂
只索引一个列的指定长度字段:alter table t add key idx_b (b(100)),只索引b列的前100个字段
5. cardinality:基数
第一次查看
mysql> show index from jiashi_guess;
+--------------+------------+----------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+----------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| jiashi_guess | 0 | PRIMARY | 1 | user_id | A | 58624 | NULL | NULL | | BTREE | | |
| jiashi_guess | 0 | PRIMARY | 2 | guess_type_id | A | 58624 | NULL | NULL | | BTREE | | |
+--------------+------------+----------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
第二次查看
mysql> show index from jiashi_guess;
+--------------+------------+----------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+----------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| jiashi_guess | 0 | PRIMARY | 1 | user_id | A | 58994 | NULL | NULL | | BTREE | | |
| jiashi_guess | 0 | PRIMARY | 2 | guess_type_id | A | 58994 | NULL | NULL | | BTREE | | |
+--------------+------------+----------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
jiashi_guess是联合主键,上表说明其cardinality值很高,总行数数据也是,58624
则cardinality/rows_in_table = 1,越接近1则辨识度越高,效果越好。
①cardinality是采样预估的:cardinality是根据采样去计算得出的一个预估值,在update以及insert时就会发生cardinality采样的操作,不过一般计算cardinality的时间成本消耗大,所以就采用采样的方式了,InnoDB默认是8页数据作为样本,若A为所有页数,则cardinality = 8页数据的不同记录个数总和*A/8,所以有可能每次cardinality的值都不一样。
②cardinality是策略性更新:并不是每次update或insert之后,都会计算cardinality的值,而是有两个策略,
策略一:表中1/16的数据改变了,不过这种情况有可能是同一行数据频繁变更,所以有了第二个策略
策略二:stat_modified_count>2 000 000 000,stat_modified_count是InnoDB内部的一个计数器,表示发生变化的次数
6. 联合索引
6.1 键是排序的
联合索引的特质:已经对键进行排序了,包括第二个键(在第一个键值一样的前提下才是排序的),此时,若是联合索引为下表的user_id,day,则语句select * from buy where user_id=1 order by day desc;节省了一次排序,虽然用了order by,当explain结果能知道,extra中没有用到filter sort,这是联合索引的一个好处。
create table buy( user_id int , day date)engine=innodb;
alter table buy add key idx_u(user_id);
alter table buy add key idx_u2(user_id,day);
explain select * from buy where user_id=1;
-- 使用到的索引是idx_u,
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| 1 | SIMPLE | buy | NULL | ref | idx_u,idx_u2 | idx_u | 5 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
explain select * from buy where user_id=1 order by day desc;
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | buy | NULL | ref | idx_u,idx_u2 | idx_u2 | 5 | const | 3 | 100.00 | Using index |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------------+
6.2 第二个键无法用到索引的原因
联合索引:第二个键没办法用到索引的原因,如下图所示,联合索引后,第一个键是顺序排列的,所以能够用到索引:select * from table where 第一个键;而单独的第二个键,则不是顺序的,1,2,1,4,1,2,所以无法用到索引:select * from table where 第二个键=;而在第一个键确定的情况下,比如是1,则第二个键是顺序的,也就是能用到索引:1,2,顺序 select * from table where 第一个键= and 第二个键;
7. 覆盖索引
覆盖索引:即从辅助索引中得到查询的记录,而不需要查询聚集索引中的记录
好处1:覆盖索引不包含整行记录的所有信息,大小远小于聚集索引,能够减少IO操作
好处2: 对于某些统计信息而言,优化器会选择使用覆盖索引而不是聚集索引,如下数据,possible_key未null,而
key却是idx_u,Extra也表明使用到了索引,这就是优化器自动选择覆盖索引的例子。
mysql> explain select count(*) from buy;
+----+-------------+-------+------------+-------+---------------+-------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------+---------+------+------+----------+-------------+
| 1 | SIMPLE | buy | NULL | index | NULL | idx_u | 5 | NULL | 6 | 100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+-------+---------+------+------+----------+-------------+
8. MRR优化:multi-range read
MMR优化目的:减少磁盘的随机访问,并将随机访问转换成较为顺序的数据访问
select * from salary where key1>1000 and key1<2000
and key2 = 1000
-- 若是没启动MMR时,是获取key1访问在1000-2000的所有数据后,再进行key2=1000的过滤,无用数据被获取到
-- 启用MMR:优化器将查询条件进行拆分,然后再进行数据查询,拆分:(1000,1000),(1001,1000)...(1999,1000)
9. ICP优化:index condition pushdown
将where过滤的部分放在存储引擎层,原有的方式是:先根据索引获取到数据,再从数据中使用where进行过滤,
使用ICP则能够减少上层SQL对记录的索取
10. 哈希算法
①哈希表是由直接寻址表改进而来,哈希碰撞通过链表来解决
②一般将关键字转为自然数,然后再通过除法散列来得到对应的槽
③自适应哈希:只能对等值查找生效,范围查找则不行