OceanBase 数据库当前版本支持 Nested Loop Join、Hash Join 和 Merge Join 三种不同的联接算法。
Hash Join 和 Merge Join 只适用于等值的联接条件,Nested Loop Join 可用于任意的联接条件。
Nested Loop Join
Nested Loop Join 原理是扫描一个表(外表),每读到该表中的一条记录,就去"扫描"另一张表(内表)找到满足条件的数据。
这里的"扫描"可以是利用索引快速定位扫描,也可以是全表扫描。通常来说,全表扫描的性能很差,所以如果联接条件的列上没有索引,优化器一般就不会选择 Nested Loop Join。在 OceanBase 数据库中,执行计划中展示了是否能够利用索引快速定位扫描。
如下例所示,第一个计划对于内表的扫描是全表扫描,因为联接条件是 t1.c = t2.c
,而 t2
表没有在 c
列上的索引。第二个计划对于内表的扫描能够使用索引快速找到匹配的行,主要原因是联接条件为 t1.b = t2.b
,而且 t2
表选择了创建在 b
列上的索引 k1
作为访问路径,这样对于 t1
表中的每一行的每个 b
值,t2
表都可以根据索引快速找到满足条件的匹配行。
obclient> CREATE TABLE t1(a INT PRIMARY KEY, b INT, c INT, KEY k1(b));
Query OK, 0 rows affected
obclient> CREATE TABLE t2(a INT PRIMARY KEY, b INT, c INT, KEY k1(b));
Query OK, 0 rows affected
obclient> EXPLAIN EXTENDED_NOADDR SELECT/*+USE_NL(t1 t2)*/ * FROM t1, t2
WHERE t1.c = t2.c;
+-----------------------------------------------------------------+
| Query Plan |
+-----------------------------------------------------------------+
| ===========================================
|ID|OPERATOR |NAME|EST. ROWS|COST |
-------------------------------------------
|0 |NESTED-LOOP JOIN| |1980 |623742|
|1 | TABLE SCAN |t1 |1000 |455 |
|2 | TABLE SCAN |t2 |2 |622 |
===========================================
Outputs & filters:
-------------------------------------
0 - output([t1.a], [t1.b], [t1.c], [t2.a], [t2.b], [t2.c]), filter(nil),
conds(nil), nl_params_([t1.c])
1 - output([t1.c], [t1.a], [t1.b]), filter(nil),
access([t1.c], [t1.a], [t1.b]), partitions(p0),
is_index_back=false,
range_key([t1.a]), range(MIN ; MAX)always true
2 - output([t2.c], [t2.a], [t2.b]), filter([? = t2.c]),
access([t2.c], [t2.a], [t2.b]), partitions(p0),
is_index_back=false, filter_before_indexback[false],
range_key([t2.a]), range(MIN ; MAX)
obclient> EXPLAIN EXTENDED_NOADDR SELECT/*+USE_NL(t1 t2)*/ * FROM t1, t2
WHERE t1.b = t2.b;
+-----------------------------------------------------------------+
| Query Plan |
+-----------------------------------------------------------------+
| ============================================
|ID|OPERATOR |NAME |EST. ROWS|COST |
--------------------------------------------
|0 |NESTED-LOOP JOIN| |1980 |94876|
|1 | TABLE SCAN |t1 |1000 |455 |
|2 | TABLE SCAN |t2(k1)|2 |94 |
============================================
Outputs & filters:
-------------------------------------
0 - output([t1.a], [t1.b], [t1.c], [t2.a], [t2.b], [t2.c]), filter(nil),
conds(nil), nl_params_([t1.b])
1 - output([t1.b], [t1.a], [t1.c]), filter(nil),
access([t1.b], [t1.a], [t1.c]), partitions(p0),
is_index_back=false,
range_key([t1.a]), range(MIN ; MAX)always true
2 - output([t2.b], [t2.a], [t2.c]), filter(nil),
access([t2.b], [t2.a], [t2.c]), partitions(p0),
is_index_back=true,
range_key([t2.b], [t2.a]), range(MIN ; MAX),
range_cond([? = t2.b])
Nested Loop Join 可能会对内表进行多次全表扫描,因为每次扫描都需要从存储层重新迭代一次,这个代价相对比较高,所以 OceanBase 数据库支持对内表进行一次扫描并把结果物化在内存中,这样在下一次执行扫描时就可以直接在内存中扫描相关的数据,而不需要从存储层进行多次扫描。但是物化在内存中的方式是有代价的,所以 OceanBase 数据库优化器是基于代价去判断是否需要物化内表。
Nested Loop Join 的一个优化变种是 Blocked Nested Loop Join,它每次从外表中读取一个块大小的行,然后再去扫描内表找到满足条件的数据,这样可以减少内表的读取次数。
Nested Loop Join 通常用在外表行数比较少,而且内表在联接条件的列上有索引的场景,因为内表中的每一行都可以快速的使用索引定位到相对应的匹配的数据。
同时,OceanBase 数据库也提供了 Hint 机制 /*+ USE_NL(table_name_list) */
去控制多表联接时选择 Nested Loop Join 算法。例如下述场景联接算法选择的是 Hash Join,而用户希望使用 Nested Loop Join,就可以使用上述 Hint 进行控制。
obclient> CREATE TABLE t1(c1 INT, c2 INT);
Query OK, 0 rows affected
obclient> CREATE TABLE t2(c1 INT, c2 INT);
Query OK, 0 rows affected
obclient> EXPLAIN SELECT * FROM t1,t2 WHERE t1.c1 = t2.c1;
+-----------------------------------------------------------------+
| Query Plan |
+-----------------------------------------------------------------+
| ========================================
|ID|OPERATOR |NAME|EST. ROWS|COST |
----------------------------------------
|0 |HASH JOIN | |98010000 |66774608|
|1 | TABLE SCAN|T1 |100000 |68478 |
|2 | TABLE SCAN|T2 |100000 |68478 |
========================================
Outputs & filters:
-------------------------------------
0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2]), filter(nil),
equal_conds([T1.C1 = T2.C1]), other_conds(nil)
1 - output([T1.C1], [T1.C2]), filter(nil),
access([T1.C1], [T1.C2]), partitions(p0)
2 - output([T2.C1], [T2.C2]), filter(nil),
access([T2.C1], [T2.C2]), partitions(p0)
obclient> EXPLAIN SELECT /*+USE_NL(t1, c2)*/* FROM t1, t2 WHERE t1.c1 = t2.c1;
+-----------------------------------------------------------------+
| Query Plan |
+-----------------------------------------------------------------+
| ===============================================
|ID|OPERATOR |NAME|EST. ROWS|COST |
-----------------------------------------------
|0 |NESTED-LOOP JOIN| |98010000 |4595346207|
|1 | TABLE SCAN |T1 |100000 |68478 |
|2 | MATERIAL | |100000 |243044 |
|3 | TABLE SCAN |T2 |100000 |68478 |
===============================================
Outputs & filters:
-------------------------------------
0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2]), filter(nil),
conds([T1.C1 = T2.C1]), nl_params_(nil)
1 - output([T1.C1], [T1.C2]), filter(nil),
access([T1.C1], [T1.C2]), partitions(p0)
2 - output([T2.C1], [T2.C2]), filter(nil)
3 - output([T2.C1], [T2.C2]), filter(nil),
access([T2.C1], [T2.C2]), partitions(p0)
Nested Loop Join 还有以下两种实现的算法:
-
缓存块嵌套循环联接(Blocked Nested Loop Join)
在 OceanBase 数据库中,Blocked Nested Loop Join 的实现方式是 Batch Nested Loop Join,即通过从外表中批量读取数据行(默认是 1000 行),然后再去扫描内表找到满足条件的数据。这样将批量的数据与内层表的数据进行匹配,减少了内表的读取次数和内层循环的次数。
如下示例中,
batch_join=true
字段表示本次查询使用了 Batch Nested Loop Join。obclient>CREATE TABLE t1(c1 INT PRIMARY KEY); Query OK, 0 rows affected obclient>CREATE TABLE t2(c1 INT PRIMARY KEY); Query OK, 0 rows affected obclient>EXPLAIN EXTENDED_NOADDR SELECT /*+USE_NL(t1,t2)*/* FROM t1,t2 WHERE t1.c1=t2.c1\G *************************** 1. row *************************** Query Plan: ============================================ |ID|OPERATOR |NAME|EST. ROWS|COST | -------------------------------------------- |0 |NESTED-LOOP JOIN| |100001 |3728786| |1 | TABLE SCAN |t1 |100000 |59654 | |2 | TABLE GET |t2 |1 |36 | ============================================ Outputs & filters: ------------------------------------- 0 - output([t1.c1], [t2.c1]), filter(nil), conds(nil), nl_params_([t1.c1]), inner_get=false, self_join=false, batch_join=true 1 - output([t1.c1]), filter(nil), access([t1.c1]), partitions(p0), is_index_back=false, range_key([t1.c1]), range(MIN ; MAX)always true 2 - output([t2.c1]), filter(nil), access([t2.c1]), partitions(p0), is_index_back=false, range_key([t2.c1]), range(MIN ; MAX), range_cond([? = t2.c1])
-
索引嵌套循环联接(Index Nested Loop Join)
Index Nested Loop Join 是基于索引进行联接的算法,通过外层表匹配条件直接与内层表索引进行匹配,避免和内层表的每条记录进行比较,减少了对内层表的匹配次数。
如下示例中存在联接条件
t1.c1 = t2.c1
,则在t2
表的c1
列上有索引或t1
表的c1
列上有索引的时候,会使用 Index Nested Loop Join 算法。obclient> CREATE TABLE t1(c1 INT PRIMARY KEY); Query OK, 0 rows affected obclient> CREATE TABLE t2(c1 INT ,c2 INT); Query OK, 0 rows affected obclient> EXPLAIN SELECT /*+ORDERED USE_NL(t2,t1)*/ * FROM t2, (SELECT /*+NO_MERGE*/ * FROM t1)t1 WHERE t1.c1 = t2.c1 AND t2.c2 = 1\G *************************** 1. row *************************** Query Plan: =========================================== |ID|OPERATOR |NAME|EST. ROWS|COST | ------------------------------------------- |0 |NESTED-LOOP JOIN| |981 |117272| |1 | TABLE SCAN |t2 |990 |80811 | |2 | SUBPLAN SCAN |t1 |1 |37 | |3 | TABLE GET |t1 |1 |36 | =========================================== Outputs & filters: ------------------------------------- 0 - output([t2.c1], [t2.c2], [t1.c1]), filter(nil), conds(nil), nl_params_([t2.c1]) 1 - output([t2.c1], [t2.c2]), filter([t2.c2 = 1]), access([t2.c1], [t2.c2]), partitions(p0) 2 - output([t1.c1]), filter(nil), access([t1.c1]) 3 - output([t1.c1]), filter(nil), access([t1.c1]), partitions(p0)
在
outputs & filters
的输出结果中nl_param
出现参数[t2.c1]
,说明执行了条件下压优化。详细信息请参考 JOIN。一般地,在进行查询优化时,OceanBase 数据库优化器会优先选择 Index Nested Loop Join,然后检查是否可以使用 Batch Nested Loop Join,这两种优化方式可以一起使用,最后才会选择 Nested Loop Join。