btree和位图索引的对比

1、btree 索引


通过建表t1 object_id的值没有重复值,而t2 表的object_id的值重复率很高
通过实验在t1,t2表的object_id列建立普通索引,来证明普通索引列比较适合列的重复值比较低的列


优点:适合键值重复率较低的字段上使用
     那么有个B-tree索引我们就像翻书目录一样,直接定位rowid立刻就找到了我们想要的数据,实质减少了I/O操作就提高速度,它有一     个显著特点查询性能与表中数据量无关
缺点:不适合键值重复率较高的字段上使用,

SQL> create table t1 as select object_id,object_name from dba_objects;
Table created.
SQL> create table t2 as select mod(object_id,2) object_id,object_name from dba_objects;
Table created.
SQL> create index ind_t1 on t1(object_id);
Index created.
SQL> create index ind_t2 on t2(object_id);
Index created.


收集统计信息:
BEGIN  
         DBMS_STATS.GATHER_TABLE_STATS(ownname => 'scott',  
         tabname => 't1',  
         estimate_percent =>100,  
          method_opt => 'for all columns size 1',  
        degree => 8,  
         cascade=>TRUE  
         );  
         END;  
BEGIN  
         DBMS_STATS.GATHER_TABLE_STATS(ownname => 'scott',  
         tabname => 't2',  
         estimate_percent =>100,  
          method_opt => 'for all columns size 1',  
        degree => 8,  
         cascade=>TRUE  
         );  
         END;  


SQL> select count(*) from t1 where object_id=1;


  COUNT(*)
----------
	 0

Execution Plan
----------------------------------------------------------
Plan hash value: 2587783732


----------------------------------------------------------------------------
| Id  | Operation	  | Name   | Rows  | Bytes | Cost (%CPU)| Time	   |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	   |	 1 |	 5 |	 1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |	   |	 1 |	 5 |		|	   |
|*  2 |   INDEX RANGE SCAN| IND_T1 |	 1 |	 5 |	 1   (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------


   2 - access("OBJECT_ID"=1)
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	  2  consistent gets
	  0  physical reads
	  0  redo size
	525  bytes sent via SQL*Net to client
	523  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed


SQL> select count(*) from t2 where object_id=1;


  COUNT(*)
----------
     36200
Execution Plan
----------------------------------------------------------
Plan hash value: 2800912005


--------------------------------------------------------------------------------
| Id  | Operation	      | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        |     1 |     3 |    38	 (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |        |     1 |     3 |	    |	       |
|*  2 |   INDEX FAST FULL SCAN| IND_T2 | 36086 |   105K|    38	 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("OBJECT_ID"=1)
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	144  consistent gets
	  0  physical reads
	  0  redo size
	527  bytes sent via SQL*Net to client
	523  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed

SQL> drop index ind_t1;
Index dropped.


SQL> drop index ind_t2;
Index dropped.






2、位图索引


位图索引适合于:列的基数很少,可枚举,重复值很多,数据不会被经常更新,由于一个键值对应很多行(rowid), 更新索引键值的时候,就会锁定索引,导致其他行不可被修改,阻塞


优点:OLAP 例如报表类数据库 重复率高的数据 特定类型的查询例如count、or、and等逻辑操作因为只需要进行位运算即可得到我们需要的结果


缺点:不适合重复率低的字段,还有经常DML操作(insert,update,delete),因为位图索引的锁代价极高,修改一个位图索引段影响整个位图段,例如修改
一个键值,会影响同键值的多行,所以对于OLTP 系统位图索引基本上是不适用的


接着上面的实验,在t1 t2表上建立位图索引
SQL> create bitmap index ind_t1 on t1(object_id);
Index created.


SQL> create bitmap index ind_t2 on t2(object_id);
Index created.


SQL>  select segment_name,bytes from user_segments where segment_name like '%T1%' OR  SEGMENT_NAME LIKE '%T2%';
SEGMENT_NAME									       BYTES
--------------------------------------------------------------------------------- ----------
T1										     3145728
T2										     3145728
IND_T1										     3145728
IND_T2										       65536

我们可以看出t1表的object_id列没有重复值,而t2表的object_id列重复值很多,建立位图索引的时候,重复值越多,位图索引就越小

SQL> drop table t1;
SQL> drop table t2;




下面我们来看一下,在重复率很高的情况下,位图索引和btree的效率


 create table t1 as select object_id,object_type from dba_objects;
create table t2 as select  object_id,object_type from dba_objects;
create  index ind_t1 on t1(object_type);
create bitmap index ind_t2 on t2(object_type);

SQL> select count(*) from t1 where object_type='TABLE';

Execution Plan
----------------------------------------------------------
Plan hash value: 2587783732


----------------------------------------------------------------------------
| Id  | Operation	  | Name   | Rows  | Bytes | Cost (%CPU)| Time	   |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |	   |	 1 |	 9 |	 5   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |	   |	 1 |	 9 |		|	   |
|*  2 |   INDEX RANGE SCAN| IND_T1 |  1678 | 15102 |	 5   (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_TYPE"='TABLE')

Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	  8  consistent gets
	  0  physical reads
	  0  redo size
	527  bytes sent via SQL*Net to client
	523  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed


SQL> select count(*) from t2 where object_type='TABLE';
Execution Plan
----------------------------------------------------------
Plan hash value: 2032664525


--------------------------------------------------------------------------------------
| Id  | Operation		    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	    |	     |	   1 |	   9 |	   1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE 	    |	     |	   1 |	   9 |		  |	     |
|   2 |   BITMAP CONVERSION COUNT   |	     |	1678 | 15102 |	   1   (0)| 00:00:01 |
|*  3 |    BITMAP INDEX SINGLE VALUE| IND_T2 |	     |	     |		  |	     |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("OBJECT_TYPE"='TABLE')
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	  2  consistent gets
	  0  physical reads
	  0  redo size
	526  bytes sent via SQL*Net to client
	523  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed

SQL> select * from t1 where object_type='TABLE';
2799 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 634656657


--------------------------------------------------------------------------------------
| Id  | Operation		    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	    |	     |	1678 | 23492 |	  26   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |	1678 | 23492 |	  26   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN	    | IND_T1 |	1678 |	     |	   5   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_TYPE"='TABLE')
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	427  consistent gets
	  0  physical reads
	  0  redo size
      79004  bytes sent via SQL*Net to client
       2569  bytes received via SQL*Net from client
	188  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
       2799  rows processed


SQL> select * from t2 where object_type='TABLE';
2800 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2737179948
---------------------------------------------------------------------------------------
| Id  | Operation		     | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |	      |  1678 | 23492 |    47	(0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | T2     |  1678 | 23492 |    47	(0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|	      |       |       | 	   |	      |
|*  3 |    BITMAP INDEX SINGLE VALUE | IND_T2 |       |       | 	   |	      |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("OBJECT_TYPE"='TABLE')
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	235  consistent gets
	  0  physical reads
	  0  redo size
      79020  bytes sent via SQL*Net to client
       2569  bytes received via SQL*Net from client
	188  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
       2800  rows processed
在等值查找中我们可以看出位图索引的效率依言高于B-tree索引
上面实验参考了http://www.itpub.net/thread-1700144-1-1.html
create table t1 as select  object_id,mod(object_id,2) id, object_name, object_type From dba_objects;
create table t2 as select  object_id,mod(object_id,2) id, object_name, object_type From dba_objects;


create index ind_type_t1 on t1(object_type);
create bitmap index ind_type_t2 on t2(object_type);

create index ind_object_id_t1 on t1(object_id);
create bitmap index ind_object_id_t2 on t2(object_id);

create index ind_id_t1 on t1(id);
create bitmap index ind_id_t2 on t2(id);


SQL> select * from t1 where object_id in (1,2,10,20,30,50,60,70,40);
8 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1020377091

-------------------------------------------------------------------------------------------------
| Id  | Operation		     | Name		| Rows	| Bytes | Cost (%CPU)| Time	|
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |			|     9 |   369 |    10   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR	     |			|	|	|	     |		|
|   2 |   TABLE ACCESS BY INDEX ROWID| T1		|     9 |   369 |    10   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN	     | IND_OBJECT_ID_T1 |     9 |	|     9   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("OBJECT_ID"=1 OR "OBJECT_ID"=2 OR "OBJECT_ID"=10 OR "OBJECT_ID"=20 OR
	      "OBJECT_ID"=30 OR "OBJECT_ID"=40 OR "OBJECT_ID"=50 OR "OBJECT_ID"=60 OR "OBJECT_ID"=70)

Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	 12  consistent gets
	  1  physical reads
	  0  redo size
	980  bytes sent via SQL*Net to client
	523  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  8  rows processed


SQL> select * from t2 where object_id in (1,2,10,20,30,50,60,70,40);
8 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3310774432

--------------------------------------------------------------------------------------------------
| Id  | Operation		      | Name		 | Rows  | Bytes | Cost (%CPU)| Time	 |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	      | 		 |     9 |   369 |    11   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR	      | 		 |	 |	 |	      | 	 |
|   2 |   TABLE ACCESS BY INDEX ROWID | T2		 |     9 |   369 |    11   (0)| 00:00:01 |
|   3 |    BITMAP CONVERSION TO ROWIDS| 		 |	 |	 |	      | 	 |
|*  4 |     BITMAP INDEX SINGLE VALUE | IND_OBJECT_ID_T2 |	 |	 |	      | 	 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("OBJECT_ID"=1 OR "OBJECT_ID"=2 OR "OBJECT_ID"=10 OR "OBJECT_ID"=20 OR
	      "OBJECT_ID"=30 OR "OBJECT_ID"=40 OR "OBJECT_ID"=50 OR "OBJECT_ID"=60 OR "OBJECT_ID"=70)
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	 15  consistent gets
	  1  physical reads
	  0  redo size
	980  bytes sent via SQL*Net to client
	523  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  8  rows processed


结论:在使用or 的情况下,object_id 重复率比较低的情况下,还是btree效率高一些


SQL> select * From t2 where object_type in ('INDEX','CLUSTER');
3805 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 879057093
---------------------------------------------------------------------------------------------
| Id  | Operation		      | Name	    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	      | 	    |  3357 |	134K|	127   (0)| 00:00:02 |
|   1 |  INLIST ITERATOR	      | 	    |	    |	    |		 |	    |
|   2 |   TABLE ACCESS BY INDEX ROWID | T2	    |  3357 |	134K|	127   (0)| 00:00:02 |
|   3 |    BITMAP CONVERSION TO ROWIDS| 	    |	    |	    |		 |	    |
|*  4 |     BITMAP INDEX SINGLE VALUE | IND_TYPE_T2 |	    |	    |		 |	    |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("OBJECT_TYPE"='CLUSTER' OR "OBJECT_TYPE"='INDEX')
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	349  consistent gets
	  1  physical reads
	  0  redo size
     198248  bytes sent via SQL*Net to client
       3306  bytes received via SQL*Net from client
	255  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
       3805  rows processed

SQL> select * From t1 where object_type in ('INDEX','CLUSTER');
3805 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 918902357
--------------------------------------------------------------------------------------------
| Id  | Operation		     | Name	   | Rows  | Bytes | Cost (%CPU)| Time	   |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	     |		   |  3357 |   134K|	87   (0)| 00:00:02 |
|   1 |  INLIST ITERATOR	     |		   |	   |	   |		|	   |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1	   |  3357 |   134K|	87   (0)| 00:00:02 |
|*  3 |    INDEX RANGE SCAN	     | IND_TYPE_T1 |  3357 |	   |	11   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("OBJECT_TYPE"='CLUSTER' OR "OBJECT_TYPE"='INDEX')
Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	610  consistent gets
	  0  physical reads
	  0  redo size
     198248  bytes sent via SQL*Net to client
       3306  bytes received via SQL*Net from client
	255  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
       3805  rows processed


结论:在使用or的情况下,object_type重复率比较高的表,还是位图索引效率高一些



<think>好的,用户遇到了从达梦数据库迁移到MySQL时创建表失败的问题,需要解决方案。首先,我需要根据提供的引用内容分析可能的原因。引用中提到了字段类型不兼容、字段注释缺失、字符串长度问题等。这些都是迁移时常见的障碍。 首先考虑字段类型问题。达梦MySQL的字段类型可能不完全对应,比如达梦的VARCHAR2在MySQL是VARCHAR,但可能有不同的限制。用户需要检查表结构,逐一对比类型是否兼容。例如,达梦的CLOB可能需要转换成MySQL的LONGTEXT。此外,自增列的处理也不同,达梦用IDENTITY,而MySQL用AUTO_INCREMENT,这里可能需要修改建表语句。 然后是字符串长度问题。引用3提到达梦迁移后出现varchar字段超长,尽管长度设置一致。这可能与字符编码有关,比如UTF8在MySQL中每个字符占用3字节,而达梦可能按字符数计算。需要检查MySQL的varchar长度是否超过字节限制,必要时调整字段长度或使用更合适的类型。 约束索引的差异也是问题源。达梦可能支持不同的约束名称或索引类型,而MySQL有长度限制,例如索引名最长64字符。此外,外键约束的语法可能不同,需要调整。 字段注释缺失的问题在引用2中提到,虽然注释不影响建表,但若迁移工具处理不当,可能需要手动导出注释并添加到MySQL表里。 接下来是处理方案。用户需要获取达梦的表结构DDL,使用迁移工具或导出脚本,然后逐项检查类型转换。可能需要手动修改建表语句中的字段类型、调整字符串长度、修改自增列定义,以及处理约束索引名称。此外,测试迁移后的表结构,确保数据能正确插入,并处理字符集排序规则问题。 最后,用户可能需要分步操作,如导出结构、对比修改、执行建表,并验证结果。同时,提供示例注意事项,帮助用户避免常见错误,如忽略默认值或字符集设置。总结来说,关键在于详细分析错误日志,针对性调整表结构,确保兼容性。</think>### 达梦数据库迁移至MySQL创建表失败解决方案 #### **常见原因分析** 1. **字段类型不兼容** 达梦数据库与MySQL的字段类型存在差异,例如: - 达梦的 `VARCHAR2` 需改为MySQL的 `VARCHAR`,且需注意字符长度限制(如UTF8编码下MySQL单字段最大为$21845$字符)[^3][^4]。 - 达梦的 `CLOB` 类型需改为MySQL的 `LONGTEXT`。 - 达梦的 `NUMBER` 类型需根据实际精度调整为MySQL的 `DECIMAL` 或 `INT`。 2. **自增列语法差异** 达梦使用 `IDENTITY(1,1)` 定义自增列,而MySQL需使用 `AUTO_INCREMENT`,例如: ```sql -- 达梦语法 CREATE TABLE demo (id INT IDENTITY(1,1)); -- MySQL语法 CREATE TABLE demo (id INT AUTO_INCREMENT PRIMARY KEY); ``` 3. **字符串长度超限** 达梦的 `VARCHAR` 字段长度按字符数计算,而MySQL按字节计算。若使用UTF8编码,实际存储长度可能超过MySQL限制(如 `VARCHAR(255)` 占用$765$字节),需调整字段长度或改用 `TEXT` 类型。 4. **约束或索引冲突** - 达梦允许更长的约束名(如外键名),而MySQL约束名限制为$64$字符[^2]。 - 达梦的某些索引类型(如位图索引)在MySQL中不支持,需替换为普通索引。 --- #### **具体解决方案** 1. **获取达梦表结构并转换** 使用达梦的 `DTS工具` 导出表结构脚本,或手动执行以下SQL生成DDL: ```sql -- 达梦语法 SELECT DBMS_METADATA.GET_DDL('TABLE', '表名') FROM DUAL; ``` 2. **逐步修改字段类型** | 达梦字段类型 | MySQL对应类型 | 注意事项 | |----------------------|------------------------|-----------------------------------| | `VARCHAR2(n)` | `VARCHAR(n)` | 检查编码导致的字节超限问题 | | `NUMBER(p,s)` | `DECIMAL(p,s)` | 确保精度小数位一致 | | `TIMESTAMP` | `DATETIME` | 处理默认值格式差异 | 3. **处理自增列与主键** 修改达梦的 `IDENTITY` 为MySQL的 `AUTO_INCREMENT`,并显式定义主键: ```sql -- 修改前(达梦) CREATE TABLE test (id INT IDENTITY(1,1), name VARCHAR(50)); -- 修改后(MySQL) CREATE TABLE test ( id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(50) ); ``` 4. **调整约束与索引** - 缩短约束名称至64字符内,例如将 `fk_long_constraint_name` 改为 `fk_short_name`。 - 删除MySQL不支持的索引类型,改用 `BTREE` 索引。 5. **验证并执行建表语句** 使用分步验证法: ```sql -- 步骤1:仅创建表结构(无约束) CREATE TABLE test (...); -- 步骤2:逐个添加约束 ALTER TABLE test ADD CONSTRAINT ...; ``` --- #### **操作示例** **达梦原始DDL**: ```sql CREATE TABLE "USER" ( "ID" NUMBER(10) IDENTITY(1,1), "NAME" VARCHAR2(50), "DESC" CLOB ); ``` **转换后MySQL DDL**: ```sql CREATE TABLE user ( id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(50), `desc` LONGTEXT ) DEFAULT CHARSET=utf8mb4; ``` --- #### **注意事项** 1. **字符集与排序规则** 统一设置为 `utf8mb4` `utf8mb4_general_ci`,避免中文乱码。 2. **默认值处理** 达梦的默认值可能包含MySQL不支持函数(如 `SYSDATE`),需改为 `CURRENT_TIMESTAMP`。 3. **错误日志分析** 若仍失败,检查MySQL错误日志(默认路径:`/var/log/mysql/error.log`),定位具体报错行。 ---
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值