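As database middleware, Mycat does not store any data itself. It uses its sharding rules and read/write rules to provide distributed access to many backend mysql instances. In practice, however, the actual data distribution can end up inconsistent with the sharding rules. For example:
1. mysql can be accessed through a direct connection, so any data can be written to any database instance;
2. when the Mycat sharding rules are adjusted later, data written earlier no longer matches the adjusted rules.
When the Mycat sharding rules and the actual data distribution disagree, there are several things to watch out for when executing SQL. The example below illustrates them.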
Example:
1. Create the table and configure its sharding rule
# Create the employee table; the primary key is the id column
CREATE TABLE employee (
  id int(11) NOT NULL,
  name varchar(100) DEFAULT NULL,
  sharding_id int(11) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

# Table sharding configuration, data nodes dn00 - dn03
<table name="employee" primaryKey="ID" autoIncrement="true" dataNode="dn0$0-3" rule="employee" />

# The sharding column is sharding_id
<tableRule name="employee">
  <rule>
    <columns>sharding_id</columns>
    <algorithm>myfunc1</algorithm>
  </rule>
</tableRule>

# Sharding rule: shard by the enumerated value of sharding_id
0=0
1=1
2=2
3=3
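The routing that this configuration describes can be sketched in a few lines of Python. This is only a simplified model of the enum rule above, not Mycat's own code: the names `ENUM_MAP`, `DATA_NODES`, and `route_by_sharding_id` are hypothetical, and the error raised for an unmapped value is an assumption made for the sketch.

```python
# Hypothetical model of the enum mapping from the example: 0=0 1=1 2=2 3=3.
ENUM_MAP = {0: 0, 1: 1, 2: 2, 3: 3}

# dataNode="dn0$0-3" expands to these four data nodes.
DATA_NODES = ["dn00", "dn01", "dn02", "dn03"]


def route_by_sharding_id(sharding_id: int) -> str:
    """Return the data node that the rule assigns to a sharding_id value."""
    if sharding_id not in ENUM_MAP:
        # Assumption for this sketch: an unmapped value is treated as an error.
        raise ValueError(f"no shard configured for sharding_id={sharding_id}")
    return DATA_NODES[ENUM_MAP[sharding_id]]


print(route_by_sharding_id(3))  # dn03
```

Under this rule, every row with sharding_id=3 is expected to live on dn03, which is the expectation the following steps deliberately violate.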
2. Connect to mysql directly and write one row each into dn00 and dn03
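insert into employee(id,name,sharding_id) values(101,'dn00',3); -- written to dn00
insert into employee(id,name,sharding_id) values(101,'dn03',3); -- written to dn03

Note two things here:
1. The id column is 101 in both rows, i.e. a duplicate primary key. Because the rows are stored in different mysql instances, no error is raised.
2. The sharding column sharding_id is 3 in both rows, yet the two records actually sit on different shards.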
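Both inconsistencies can be detected by scanning each backend directly and comparing against the rule. The sketch below is a minimal illustration that works on in-memory rows shaped like the two inserts above; the function names and the `rows_by_node` structure are hypothetical, and in a real check the rows would come from direct connections to each mysql instance.

```python
from collections import defaultdict

# Enum rule from the example, written as sharding_id -> data node.
ENUM_MAP = {0: "dn00", 1: "dn01", 2: "dn02", 3: "dn03"}

# Physical placement after step 2: (id, name, sharding_id) per node.
rows_by_node = {
    "dn00": [(101, "dn00", 3)],
    "dn01": [],
    "dn02": [],
    "dn03": [(101, "dn03", 3)],
}


def find_misplaced(rows_by_node):
    """List rows stored on a node other than the one the rule designates."""
    misplaced = []
    for node, rows in rows_by_node.items():
        for row_id, name, sharding_id in rows:
            expected = ENUM_MAP.get(sharding_id)
            if expected != node:
                misplaced.append((node, expected, row_id, name))
    return misplaced


def find_duplicate_ids(rows_by_node):
    """Map each primary key found on more than one node to those nodes."""
    nodes_by_id = defaultdict(list)
    for node, rows in rows_by_node.items():
        for row_id, _name, _sharding_id in rows:
            nodes_by_id[row_id].append(node)
    return {i: n for i, n in nodes_by_id.items() if len(n) > 1}


print(find_misplaced(rows_by_node))
print(find_duplicate_ids(rows_by_node))
```

For the data in this example, the first check reports the row on dn00 (expected on dn03), and the second reports id 101 as present on both dn00 and dn03.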
3. Query employee. The query is broadcast to all 4 shards without any problem, and the result is the union of the data on the 4 shards.
mysql> explain select * from employee;
+-----------+----------------------------------+
| DATA_NODE | SQL                              |
+-----------+----------------------------------+
| dn00      | SELECT * FROM employee LIMIT 100 |
| dn01      | SELECT * FROM employee LIMIT 100 |
| dn02      | SELECT * FROM employee LIMIT 100 |
| dn03      | SELECT * FROM employee LIMIT 100 |   # broadcast to every shard
+-----------+----------------------------------+
4 rows in set (0.01 sec)

mysql> select * from employee;
+-----+------+-------------+
| id  | name | sharding_id |
+-----+------+-------------+
| 101 | dn00 |           3 |
| 101 | dn03 |           3 |
+-----+------+-------------+
2 rows in set (0.00 sec)
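4. Next, query by the sharding column. The test shows that when a query filters on the sharding column, Mycat applies the sharding rule and queries only the designated shard, even if other shards also hold records that satisfy the where condition.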
mysql> explain select * from employee where sharding_id=3;
+-----------+--------------------------------------------------------+
| DATA_NODE | SQL                                                    |
+-----------+--------------------------------------------------------+
| dn03      | SELECT * FROM employee WHERE sharding_id = 3 LIMIT 100 |   # routed to the designated shard by the sharding rule
+-----------+--------------------------------------------------------+
1 row in set (0.04 sec)

mysql> select * from employee where sharding_id=3;
+-----+------+-------------+
| id  | name | sharding_id |
+-----+------+-------------+
| 101 | dn03 |           3 |
+-----+------+-------------+
1 row in set (0.00 sec)
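The difference between the routed query and a broadcast can be reproduced with a small model. This is a sketch of the observed behavior only, with hypothetical names (`select_routed`, `select_broadcast`, `ROWS_BY_NODE`); it does not reflect how Mycat implements routing internally.

```python
# Enum rule from the example, written as sharding_id -> data node.
ENUM_MAP = {0: "dn00", 1: "dn01", 2: "dn02", 3: "dn03"}

# Physical placement after step 2: (id, name, sharding_id) per node.
ROWS_BY_NODE = {
    "dn00": [(101, "dn00", 3)],
    "dn01": [],
    "dn02": [],
    "dn03": [(101, "dn03", 3)],
}


def select_broadcast(sharding_id):
    """Scan every node, as a query without a usable routing key would."""
    return [
        row
        for rows in ROWS_BY_NODE.values()
        for row in rows
        if row[2] == sharding_id
    ]


def select_routed(sharding_id):
    """Scan only the node that the rule designates for this sharding_id."""
    node = ENUM_MAP[sharding_id]
    return [row for row in ROWS_BY_NODE[node] if row[2] == sharding_id]


routed = select_routed(3)
missed = [row for row in select_broadcast(3) if row not in routed]
print(routed)  # only the row stored on dn03
print(missed)  # the matching row on dn00 that routing never sees
```

The row on dn00 satisfies the where condition but is invisible to the routed query, which matches the single-row result shown above.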
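5. Query by primary key. Start with the first query by primary key (id=101): the primary-key-to-shard cache has no entry for this key value yet, so Mycat broadcasts the query to every shard. The result is therefore the union of the results from all shards.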
mysql> explain select * from employee where id=101;
+-----------+-------------------------------------+
| DATA_NODE | SQL                                 |
+-----------+-------------------------------------+
| dn00      | select * from employee where id=101 |
| dn01      | select * from employee where id=101 |
| dn02      | select * from employee where id=101 |
| dn03      | select * from employee where id=101 |
+-----------+-------------------------------------+
4 rows in set (0.00 sec)

mysql> select * from employee where id=101;
+-----+------+-------------+
| id  | name | sharding_id |
+-----+------+-------------+
| 101 | dn00 |           3 |
| 101 | dn03 |           3 |
+-----+------+-------------+
2 rows in set (0.00 sec)
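6. Now look at the second query by primary key (id=101). By this point Mycat has cached the primary-key-to-shard mapping for this key, so it no longer needs to broadcast the query to every shard. However, the first query returned 2 records, and Mycat caches the primary key and shard of only the first record. On the repeated query, the cache hit therefore resolves to the shard of that one record.
Also, as the example shows, the record reached through the cache actually sits on a shard that does not match the sharding rule.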
mysql> show @@cache;
+---------------------------------------+-------+------+--------+------+------+---------------+---------------+
| CACHE                                 | MAX   | CUR  | ACCESS | HIT  | PUT  | LAST_ACCESS   | LAST_PUT      |
+---------------------------------------+-------+------+--------+------+------+---------------+---------------+
| ER_SQL2PARENTID                       |  1000 |    0 |      0 |    0 |    0 |             0 |             0 |
| SQLRouteCache                         | 10000 |    2 |     13 |    7 |    2 | 1513329652279 | 1513328908328 |
| TableID2DataNodeCache.TESTDB_ORDERS   | 50000 |    0 |      0 |    0 |    0 |             0 |             0 |
| TableID2DataNodeCache.TESTDB_EMPLOYEE | 10000 |    1 |      4 |    2 |    1 | 1513329652280 | 1513329310256 |
+---------------------------------------+-------+------+--------+------+------+---------------+---------------+
4 rows in set (0.01 sec)

mysql> explain select * from employee where id=101;
+-----------+-------------------------------------+
| DATA_NODE | SQL                                 |
+-----------+-------------------------------------+
| dn00      | select * from employee where id=101 |   # queried only on the one cached shard
+-----------+-------------------------------------+
1 row in set (0.00 sec)

mysql> select * from employee where id=101;
+-----+------+-------------+
| id  | name | sharding_id |
+-----+------+-------------+
| 101 | dn00 |           3 |
+-----+------+-------------+
1 row in set (0.00 sec)
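The two-row-then-one-row behavior in steps 5 and 6 can be mimicked with a toy primary-key-to-node cache. The class below is a hypothetical model built only from the behavior observed in this test (broadcast on a miss, cache the node of the first matching row, route to that node on a hit); it is not Mycat's actual cache implementation.

```python
class PkRouteModel:
    """Toy model of a primary-key-to-data-node cache in front of shards."""

    def __init__(self, rows_by_node):
        self.rows_by_node = rows_by_node  # node -> list of (id, name, sharding_id)
        self.pk_cache = {}                # id -> node of the first row seen

    def select_by_id(self, row_id):
        cached = self.pk_cache.get(row_id)
        # Cache hit: query one node. Cache miss: broadcast to all nodes.
        nodes = [cached] if cached is not None else list(self.rows_by_node)
        found = []
        for node in nodes:
            for row in self.rows_by_node[node]:
                if row[0] == row_id:
                    found.append(row)
                    # Only the first matching row's node is remembered.
                    self.pk_cache.setdefault(row_id, node)
        return found


model = PkRouteModel({
    "dn00": [(101, "dn00", 3)],
    "dn01": [],
    "dn02": [],
    "dn03": [(101, "dn03", 3)],
})
first = model.select_by_id(101)   # broadcast: both rows
second = model.select_by_id(101)  # cache hit: only the row on dn00
print(first)
print(second)
print(model.pk_cache)
```

In this model the cached node is dn00 simply because it is scanned first; the point is that whichever row is seen first decides where all later lookups for that key go.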
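7. Next, delete a record by primary key:
(1) Because the primary key cache hits for this key, only the record on the shard the cache points to is deleted (the other record still exists);
(2) After the delete, the cache entry for this key value is not removed, so a subsequent query by primary key returns no records (the shard the cache points to no longer holds a record with this key).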
mysql> delete from employee where id=101;
Query OK, 1 row affected (0.01 sec)   # only one record was deleted (via the cache hit)

mysql> select * from employee;
+-----+------+-------------+
| id  | name | sharding_id |
+-----+------+-------------+
| 101 | dn03 |           3 |   # the record on dn03 still exists
+-----+------+-------------+
1 row in set (0.00 sec)

mysql> explain select * from employee where id=101;
+-----------+-------------------------------------+
| DATA_NODE | SQL                                 |
+-----------+-------------------------------------+
| dn00      | select * from employee where id=101 |   # querying by primary key again still goes to shard dn00
+-----------+-------------------------------------+
1 row in set (0.00 sec)

mysql> select * from employee where id=101;   # the record on shard dn03 is not returned
Empty set (0.00 sec)

mysql> show @@cache;
+---------------------------------------+-------+------+--------+------+------+---------------+---------------+
| CACHE                                 | MAX   | CUR  | ACCESS | HIT  | PUT  | LAST_ACCESS   | LAST_PUT      |
+---------------------------------------+-------+------+--------+------+------+---------------+---------------+
| ER_SQL2PARENTID                       |  1000 |    0 |      0 |    0 |    0 |             0 |             0 |
| SQLRouteCache                         | 10000 |    2 |     19 |    9 |    2 | 1513330571512 | 1513328908328 |
| TableID2DataNodeCache.TESTDB_ORDERS   | 50000 |    0 |      0 |    0 |    0 |             0 |             0 |
| TableID2DataNodeCache.TESTDB_EMPLOYEE | 10000 |    1 |      9 |    7 |    1 | 1513330571512 | 1513329310256 |
+---------------------------------------+-------+------+--------+------+------+---------------+---------------+
4 rows in set (0.00 sec)   # the cache entry for this primary key value is still present
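The stale-cache effect in step 7 can be illustrated with a toy model as well. As before, this is a hypothetical sketch that encodes only what the test output shows (the delete follows the cached node and the cache entry survives the delete); the class and method names are made up for illustration and are not Mycat internals.

```python
class PkCacheModel:
    """Toy shards plus a primary-key-to-node cache that is never invalidated."""

    def __init__(self, rows_by_node):
        self.rows_by_node = rows_by_node  # node -> list of (id, name, sharding_id)
        self.pk_cache = {}                # id -> node

    def _target_nodes(self, row_id):
        cached = self.pk_cache.get(row_id)
        return [cached] if cached is not None else list(self.rows_by_node)

    def select_by_id(self, row_id):
        found = []
        for node in self._target_nodes(row_id):
            for row in self.rows_by_node[node]:
                if row[0] == row_id:
                    found.append(row)
                    self.pk_cache.setdefault(row_id, node)
        return found

    def delete_by_id(self, row_id):
        deleted = 0
        for node in self._target_nodes(row_id):
            kept = [r for r in self.rows_by_node[node] if r[0] != row_id]
            deleted += len(self.rows_by_node[node]) - len(kept)
            self.rows_by_node[node] = kept
        # Modeled on the observed behavior: the cache entry is left in place.
        return deleted


model = PkCacheModel({
    "dn00": [(101, "dn00", 3)],
    "dn01": [],
    "dn02": [],
    "dn03": [(101, "dn03", 3)],
})
model.select_by_id(101)             # warms the cache: 101 -> dn00
deleted = model.delete_by_id(101)   # removes only the row on dn00
after = model.select_by_id(101)     # still routed to dn00, finds nothing
print(deleted, after, model.rows_by_node["dn03"])
```

The row on dn03 survives the delete and stays unreachable by primary key for as long as the stale entry remains in the cache.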
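8. Going back to step 7: if the delete is by the sharding column instead of by primary key, only the record on the designated shard is deleted; records on other shards are not deleted.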
mysql> delete from employee where sharding_id=3;
Query OK, 1 row affected (0.01 sec)   # the record on the designated shard was deleted

mysql> select * from employee;
+-----+------+-------------+
| id  | name | sharding_id |
+-----+------+-------------+
| 101 | dn00 |           3 |   # the record on the other shard still exists
+-----+------+-------------+
1 row in set (0.00 sec)