[20160624]慎用nvarchar2数据类型.txt

本文通过一个具体的SQL优化案例,揭示了在Oracle数据库中使用nvarchar2数据类型可能带来的问题,包括索引效率低下和直方图分析不准确等。

[20160624]慎用nvarchar2数据类型.txt

--我以前的blog都写过谨慎使用nvarchar2类型,如果没有国际化需求,建议不要使用.
--而且这种类型可能导致另外的问题.正好最近优化一条sql语句,最后才发现问题所在.

1.环境:

SYSTEM@192.168.xx.yyy:1521/orcl> @ &r/ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
IBMPC/WIN_NT64-9.1.0           11.2.0.1.0     Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

UPDATE pacs_image
   SET DELFLAGS = '1'
WHERE insuid = '1.2.840.10008.1.2.1000510567.20160624.4765453.0';

--执行计划走的是全表扫描,如果检查可以发现索引是建立的.我一看insuid的查询值就大概知道可能是直方图导致的问题.

SYSTEM@192.168.xx.yyy:1521/orcl> select * from (select substr(INSUID,1,32),count(*)  from tjpacs.PACS_IMAGE group by  substr(INSUID,1,32) order by 2 desc) where rownum<=10;
SUBSTR(INSUID,1,32)                 COUNT(*)
--------------------------------- ----------
1.3.12.2.1107.5.4.4.10319.300000       40893
1.3.12.2.1107.5.4.4.10278.300000       26867
1.3.12.2.1107.5.3.58.40106.12.20       15343
1.2.840.10008.1.2.160400058.2016          25
1.2.840.10008.1.2.1000342479.201          16
1.2.840.10008.1.2.1000471555.201          16
1.2.840.10008.1.2.1000353888.201          15
1.2.840.10008.1.2.1000325399.201          15
1.2.840.10008.1.2.1000450679.201          15
1.2.840.10008.1.2.1000407902.201          14
10 rows selected.

SYSTEM@192.168.xx.yyy:1521/orcl> select count(*)   from tjpacs.PACS_IMAGE ;
  COUNT(*)
----------
    184870

--不管怎样,查询占14/184870应该可以很好的使用索引,但是当我查询直方图信息发现:

SYSTEM@192.168.xx.yyy:1521/orcl> select * from DBA_TAB_HISTOGRAMS where table_name='PACS_IMAGE' and column_name='INSUID';
OWNER  TABLE_NAME COLUMN_NAME                             ENDPOINT_NUMBER                     ENDPOINT_VALUE ENDPOINT_A
------ ---------- -------------------- ---------------------------------- ---------------------------------- ----------
TJPACS PACS_IMAGE INSUID                                             2862  993852307125505000000000000000000
TJPACS PACS_IMAGE INSUID                                             5416  993852307130228000000000000000000

--仅仅2个backet,有点奇怪了.

SYSTEM@192.168.xx.yyy:1521/orcl> @ &r/desc tjpacs.PACS_IMAGE
Name         Null?    Type
------------ -------- ----------------------------
FILENAME     NOT NULL NVARCHAR2(128)
HOSTNAME     NOT NULL NVARCHAR2(20)
VIRTUAL_DIR           NVARCHAR2(128)
STUDYID      NOT NULL NUMBER
GROUPNO      NOT NULL NUMBER
SERIESID     NOT NULL NUMBER
IMAGEID      NOT NULL NUMBER
INSUID                NVARCHAR2(128)
IMAGENO               NVARCHAR2(10)
IMAGEDATE             DATE
IMAGEDES              NVARCHAR2(128)
BODYPART              NVARCHAR2(128)
IMAGESTATUS           NVARCHAR2(1)
IMAGETYPE             NVARCHAR2(4)
FILESIZE              NUMBER
DELFLAGS              NVARCHAR2(1)
VIEWTAGPATH           NVARCHAR2(200)

--一看表定义很容易明白问题使用了nvarchar2类型.转储看看:

SYSTEM@192.168.xx.yyy:1521/orcl> select dump(INSUID,16) c70  ,insuid c60 from tjpacs.PACS_IMAGE where rownum<=1;
C70                                                                    C60
---------------------------------------------------------------------- ------------------------------------------------------------
Typ=1 Len=110: 0,31,0,2e,0,33,0,2e,0,31,0,32,0,2e,0,32,0,2e,0,31,0,31, 1.3.12.2.1107.5.4.4.10278.30000015111823402501500000000
0,30,0,37,0,2e,0,35,0,2e,0,34,0,2e,0,34,0,2e,0,31,0,30,0,32,0,37,0,38,
0,2e,0,33,0,30,0,30,0,30,0,30,0,30,0,31,0,35,0,31,0,31,0,31,0,38,0,32,
0,33,0,34,0,30,0,32,0,35,0,30,0,31,0,35,0,30,0,30,0,30,0,30,0,30,0,30,
0,30,0,30

--可以发现nvarchar2类型保存'1'需要2个字节.这样直方图仅仅分析16个字符.

SYSTEM@192.168.xx.yyy:1521/orcl> select * from (select substr(INSUID,1,16),count(*)  from tjpacs.PACS_IMAGE group by  substr(INSUID,1,16) order by 2 desc) where rownum<=10;
SUBSTR(INSUID,1,16)  COUNT(*)
-------------------- --------
1.2.840.10008.1.       101766
1.3.12.2.1107.5.        83110
1.2.840.113619.2            4

--可以发现这样查询建立的直方图就是鸡肋.毫无用处,导致oracle认为重复值很多.取消直方图定义.

BEGIN
  SYS.DBMS_STATS.GATHER_TABLE_STATS (
      OwnName        => 'TJPACS'
     ,TabName        => 'PACS_IMAGE'
    ,Estimate_Percent  => SYS.DBMS_STATS.AUTO_SAMPLE_SIZE
    ,Method_Opt        => 'FOR ALL COLUMNS SIZE auto for columns INSUID size 1 '
    ,Degree            => 4
    ,Cascade           => TRUE
    ,No_Invalidate     => FALSE);
END;
/

SELECT * FROM tjpacs.pacs_image WHERE insuid = '1.2.840.10008.1.2.1000510567.20160624.4765453.0';

SYSTEM@192.168.xx.yyy:1521/orcl> @ &r/dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  fmy1xha29r0dg, child number 0
-------------------------------------
SELECT * FROM tjpacs.pacs_image WHERE insuid =
'1.2.840.10008.1.2.1000510567.20160624.4765453.0'
Plan hash value: 3036732012
-------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name              | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                   |      1 |        |       |     4 (100)|          |      1 |00:00:00.01 |       5 |
|   1 |  TABLE ACCESS BY INDEX ROWID| PACS_IMAGE        |      1 |      1 |   487 |     4   (0)| 00:00:01 |      1 |00:00:00.01 |       5 |
|*  2 |   INDEX RANGE SCAN          | PACS_IMAGE$INSUID |      1 |      1 |       |     3   (0)| 00:00:01 |      1 |00:00:00.01 |       4 |
-------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1 / PACS_IMAGE@SEL$1
   2 - SEL$1 / PACS_IMAGE@SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("INSUID"=U'1.2.840.10008.1.2.1000510567.20160624.4765453.0')

--OK现在可以正常使用索引了.当然如果oracle如果晚上自动分析问题就还原了.优化这个问题需要控制直方图建立.
exec dbms_stats.set_table_prefs('tjpacs','.pacs_image','method_opt','FOR ALL COLUMNS SIZE AUTO FOR columns  INSUID size 1');

SYSTEM@192.168.xx.yyy:1521/orcl> exec dbms_stats.set_table_prefs('tjpacs','pacs_image','method_opt','FOR ALL COLUMNS SIZE AUTO FOR columns  INSUID size 1');
PL/SQL procedure successfully completed.

SYSTEM@192.168.xx.yyy:1521/orcl> select * from SYS.OPTSTAT_USER_PREFS$;
OBJ# PNAME          VALNUM VALCHAR                                                      CHGTIME                           SPARE1
----- ---------- ---------- ------------------------------------------------------------ --------------------------------- ------
73788 METHOD_OPT            FOR ALL COLUMNS SIZE AUTO FOR COLUMNS  INSUID SIZE 1         2016-06-24 15:35:10.005000 +08:00

--另外这样建立的索引键值太长,索引占用空间也很大.

SYSTEM@192.168.xx.yyy:1521/orcl> validate index tjpacs.PACS_IMAGE$INSUID;
Index analyzed.

SYSTEM@192.168.xx.yyy:1521/orcl> @ &r/i
    HEIGHT     BLOCKS NAME          LF_ROWS    LF_BLKS LF_ROWS_LEN LF_BLK_LEN    BR_ROWS    BR_BLKS BR_ROWS_LEN BR_BLK_LEN DEL_LF_ROWS DEL_LF_ROWS_LEN DISTINCT_KEYS35M
---------- ---------- ---------- ---------- ---------- ----------- ---------- ---------- ---------- ----------- ---------- ----------- --------------- -------------
         3       4480 PACS_IMAGE     184892       4317    20799904       8000       4316         84      372844       8032           2             244        181921
                      $INSUID


MOST_REPEATED_KEY BTREE_SPACE USED_SPACE   PCT_USED ROWS_PER_KEY BLKS_GETS_PER_ACCESS   PRE_ROWS PRE_ROWS_LEN OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
----------------- ----------- ---------- ---------- ------------ -------------------- ---------- ------------ -------------- ----------------
                3    35210688   21172748         61   1.01633126           4.00816563          0            0              0                0

--一个块仅仅保存4个键值,索引达到了35M,.而表104M.

总之:在生产系统要慎用Nvarchar2类型.不要打着什么国际化的幌子骗人,开发应该认真了解这种数据类型,实际上这样的程序就是垃圾.
再回头非常困难.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值