继续Expert One-on-One Oracle Chapter 13 物化视图 读书笔记(3)文章讲的:
上一篇文章主要讲到下面:
2、如何确保物化视图可以使用?
这里只讲三种方法来帮助使用物化视图的查询重写功能:
1)约束;
2)维数;
3)描述复杂关系-数据层次结构;
1)约束已经讲完;
2)讲维数如何提示oracle以及帮助oracle使用物化视图功能:
使用维数是另一种可以给oracle信息。
假设有张表:给出来事物日期和顾客ID。事务日期指向另一张表,这个表给出了事务日期在哪个月,哪个季度以及 哪个财政年等全部信息。
假设创建了一个物化视图,按月统计了销售信息,那么可以通过维数这个数据库对象,让oracle知道只要知道了月
的销售信息,就可以知道季度的销售信息 以及年的销售信息;
sales表存放了销售的详细信息,
time_hierarchy表存放了事务日期到月,季度以及年的映射;
customer_hierarchy表存放将客户ID映射到邮政编码,邮政编码
映射到地区,这样可以通过time_hierarchy表和customer_hierarchy
表计算按月,季度以及年的销售额,以及按邮政编码以及地区进行累记销售额。
这样使用物化视图的好处是:只要知道了按月的销售额以及按邮政编码的销售额
就会很快的知道季度以及按地区的销售额,而不必重新去从销售表中每条记录去
读取信息再做统计,这样对很大数据量来说会很慢。
下面举例实践:
首先创建一些需要的表以及表的信息:
drop table sales;
create table sales
(trans_date date, cust_id int, sales_amount number );
insert /*+ APPEND */ into sales
select trunc(sysdate,'year')+mod(rownum,366) TRANS_DATE,
mod(rownum,100) CUST_ID,
abs(dbms_random.random)/100 SALES_AMOUNT
from all_objects
/
commit;
begin
for i in 1 .. 4
loop
insert /*+ APPEND */ into sales
select trans_date, cust_id, abs(dbms_random.random)/100 SALES_AMOUNT
from sales;
commit;
end loop;
end;
/
select count(*) from sales;
drop table time_hierarchy cascade constraints;
create table time_hierarchy
(day primary key, mmyyyy, mon_yyyy, qtr_yyyy, yyyy)
organization index
as
select distinct
trans_date DAY,
cast (to_char(trans_date,'mmyyyy') as number) MMYYYY,
to_char(trans_date,'mon-yyyy') MON_YYYY,
'Q' || ceil( to_char(trans_date,'mm')/3) || ' FY'
|| to_char(trans_date,'yyyy') QTR_YYYY,
cast( to_char( trans_date, 'yyyy' ) as number ) YYYY
from sales;
/
SQL> analyze table sales compute statistics;
Table analyzed.
SQL> analyze table time_hierarchy compute statistics;
Table analyzed.
下面创建物化视图,将原来的销售表中的记录每天的信息滚动到月,计算出每月的销售情况:
SQL> create materialized view sales_mv
2 build immediate
refresh on demand
enable query rewrite
3 4 5 as
select sales.cust_id, sum(sales.sales_amount) sales_amount,
time_hierarchy.mmyyyy
from sales, time_hierarchy
6 7 8 9 where sales.trans_date = time_hierarchy.day
group by sales.cust_id, time_hierarchy.mmyyyy 10 ;
Materialized view created.
SQL> alter session set query_rewrite_enabled=true;
Session altered.
SQL> alter session set query_rewrite_integrity=trusted;
Session altered.
SQL> set autotrace on;
SQL> set timing on;
SQL> select time_hierarchy.mmyyyy, sum(sales_amount)
from sales, time_hierarchy
where sales.trans_date = time_hierarchy.day
group by time_hierarchy.mmyyyy 2 3 4 ;
MMYYYY SUM(SALES_AMOUNT)
---------- -----------------
12008 4.4277E+11
22008 4.1381E+11
32008 4.4380E+11
42008 4.3047E+11
52008 4.4144E+11
62008 4.2721E+11
72008 4.4304E+11
82008 4.4304E+11
92008 4.2732E+11
102008 4.4012E+11
112008 4.2573E+11
MMYYYY SUM(SALES_AMOUNT)
---------- -----------------
122008 4.3868E+11
12 rows selected.
Elapsed: 00:00:00.51
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=572 Bytes=148
72)
1 0 SORT (GROUP BY) (Cost=5 Card=572 Bytes=14872)
2 1 TABLE ACCESS (FULL) OF 'SALES_MV' (Cost=2 Card=572 Bytes
=14872)
Statistics
----------------------------------------------------------
17 recursive calls
0 db block gets
25 consistent gets
4 physical reads
0 redo size
689 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
7 sorts (memory)
0 sorts (disk)
12 rows processed
分析:由于上面的查询是按照 月来统计销售额的,同时物化视图也是按照月来统计的,
这样oracle数据库自己认为物化视图中已经存在答案,所以进行了查询重写,使用了物
化视图。
下面进行更高的聚集查询,看是否还会发生查询重写:
SQL> select time_hierarchy.qtr_yyyy, sum(sales_amount)
from sales, time_hierarchy
2 3 where sales.trans_date = time_hierarchy.day
group by time_hierarchy.qtr_yyyy 4 ;
QTR_YYYY SUM(SALES_AMOUNT)
------------------------------------------------ -----------------
Q1 FY2008 1.3004E+12
Q2 FY2008 1.2991E+12
Q3 FY2008 1.3134E+12
Q4 FY2008 1.3045E+12
Elapsed: 00:00:08.46
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2028 Card=4 Bytes=11
6)
1 0 SORT (GROUP BY) (Cost=2028 Card=4 Bytes=116)
2 1 NESTED LOOPS (Cost=153 Card=485840 Bytes=14089360)
3 2 TABLE ACCESS (FULL) OF 'SALES' (Cost=153 Card=485840 B
ytes=6315920)
4 2 INDEX (UNIQUE SCAN) OF 'SYS_IOT_TOP_32043' (UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
487392 consistent gets
721 physical reads
0 redo size
563 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
4 rows processed
分析:发现更高级的聚集到季度该SQL语句并没有进行查询重写,没有使用物化视图。
此时因为oracle不知道它自己能够通过物化视图来统计销售额,所以扫描原来
的主表sales表中去获取信息;
这个时候可以采用维数来告诉oracle数据库它可以通过物化视图中的月的销售
额的信息来滚动到季度的销售额的信息以及年的销售额的信息。
下面创建维数以及之后的查询语句情况:
SQL> create dimension time_hierarchy_dim
level day is time_hierarchy.day
level mmyyyy is time_hierarchy.mmyyyy
level qtr_yyyy is time_hierarchy.qtr_yyyy
level yyyy is time_hierarchy.yyyy
2 3 4 5 6 hierarchy time_rollup
(
7 8 day child of
mmyyyy child of
qtr_yyyy child of
yyyy
)
9 10 11 12 13 attribute mmyyyy
14 determines mon_yyyy;
Dimension created.
Elapsed: 00:00:02.49
SQL> set autotrace on
SQL> select time_hierarchy.qtr_yyyy, sum(sales_amount)
from sales, time_hierarchy
2 where sales.trans_date = time_hierarchy.day
3 group by time_hierarchy.qtr_yyyy 4 ;
QTR_YYYY SUM(SALES_AMOUNT)
------------------------------------------------ -----------------
Q1 FY2008 1.3004E+12
Q2 FY2008 1.2991E+12
Q3 FY2008 1.3134E+12
Q4 FY2008 1.3045E+12
Elapsed: 00:00:01.04
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=16 Card=4 Bytes=156)
1 0 SORT (GROUP BY) (Cost=16 Card=4 Bytes=156)
2 1 HASH JOIN (Cost=7 Card=1621 Bytes=63219)
3 2 VIEW (Cost=4 Card=34 Bytes=442)
4 3 SORT (UNIQUE) (Cost=4 Card=34 Bytes=442)
5 4 INDEX (FAST FULL SCAN) OF 'SYS_IOT_TOP_32043' (UNI
QUE) (Cost=2 Card=366 Bytes=4758)
6 2 TABLE ACCESS (FULL) OF 'SALES_MV' (Cost=2 Card=572 Byt
es=14872)
Statistics
----------------------------------------------------------
193 recursive calls
0 db block gets
51 consistent gets
2 physical reads
0 redo size
563 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
12 sorts (memory)
0 sorts (disk)
4 rows processed
分析:创建了维数,则会告诉oracle天与月以及年之间的关系,这样oracle就会知道只要知道其中的
任意一个,及可以通过推算知道其他情况的销售额信息;所以在上面的按季度查询销售额的信
息oracle就会知道由于物化视图已经知道月的销售额的信息则可以通过查询重写使用物化视图
来统计季度的销售额的信息这样与原来没有使用物化视图的查询相比:原来用了 487392次逻辑I/O,
并且进行了721次物理I/O,使用了物化视图之后只使用了51次逻辑I/O,且没有进行物理I/O。
性能大大提高,充分体现了物化视图的优势。
对于主表使用维数可以使用多次,下面是就考虑为客户建立zip_code以及region属性,希望能够根据
客户的zip_code以及region属性来统计销售额。
SQL> alter session set query_rewrite_enabled=true;
Session altered.
Elapsed: 00:00:00.01
SQL> alter session set query_rewrite_integrity=trusted;
Session altered.
Elapsed: 00:00:00.01
SQL> create table customer_hierarchy
( cust_id primary key, zip_code, region )
2 3 organization index
4 as
select cust_id,
mod( rownum, 6 ) || to_char(mod( rownum, 1000 ), 'fm0000') zip_code,
5 6 7 mod( rownum, 6 ) region
8 from ( select distinct cust_id from sales);
Table created.
Elapsed: 00:00:02.69
SQL> analyze table customer_hierarchy compute statistics;
Table analyzed.
Elapsed: 00:00:00.20
SQL> drop materialized view sales_mv;
Materialized view dropped.
Elapsed: 00:00:00.89
重新创建物化视图,增加了可以根据客户ID来统计销售额。
SQL> create materialized view sales_mv
build immediate
2 3 refresh on demand
enable query rewrite
4 5 as
6 select customer_hierarchy.zip_code,
7 time_hierarchy.mmyyyy,
sum(sales.sales_amount) sales_amount
8 9 from sales, time_hierarchy, customer_hierarchy
where sales.trans_date = time_hierarchy.day
10 11 and sales.cust_id = customer_hierarchy.cust_id
12 group by customer_hierarchy.zip_code, time_hierarchy.mmyyyy;
Materialized view created.
Elapsed: 00:00:16.34
SQL> set autotrace traceonly
SQL> select customer_hierarchy.zip_code,
time_hierarchy.mmyyyy,
2 3 sum(sales.sales_amount) sales_amount
4 from sales, time_hierarchy, customer_hierarchy
5 where sales.trans_date = time_hierarchy.day
and sales.cust_id = customer_hierarchy.cust_id
6 7 group by customer_hierarchy.zip_code, time_hierarchy.mmyyyy;
1200 rows selected.
Elapsed: 00:00:00.40
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=572 Bytes=286
00)
1 0 TABLE ACCESS (FULL) OF 'SALES_MV' (Cost=2 Card=572 Bytes=2
8600)
Statistics
----------------------------------------------------------
28 recursive calls
0 db block gets
113 consistent gets
4 physical reads
0 redo size
32285 bytes sent via SQL*Net to client
1372 bytes received via SQL*Net from client
81 SQL*Net roundtrips to/from client
8 sorts (memory)
0 sorts (disk)
1200 rows processed
分析:上面采用了物化视图。
但是下面的查询则不采用查询重写,不使用物化视图:
SQL>
select customer_hierarchy.region, --改为统计region的销售额信息
SQL> 2 time_hierarchy.mmyyyy,
sum(sales.sales_amount) sales_amount
3 4 from sales, time_hierarchy, customer_hierarchy
5 where sales.trans_date = time_hierarchy.day
6 and sales.cust_id = customer_hierarchy.cust_id
7 group by customer_hierarchy.region, time_hierarchy.mmyyyy;
72 rows selected.
Elapsed: 00:00:11.80
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2072 Card=51 Bytes=1
530)
1 0 SORT (GROUP BY) (Cost=2072 Card=51 Bytes=1530)
2 1 NESTED LOOPS (Cost=153 Card=485840 Bytes=14575200)
3 2 NESTED LOOPS (Cost=153 Card=485840 Bytes=9230960)
4 3 TABLE ACCESS (FULL) OF 'SALES' (Cost=153 Card=485840
Bytes=7287600)
5 3 INDEX (UNIQUE SCAN) OF 'SYS_IOT_TOP_32048' (UNIQUE)
6 2 INDEX (UNIQUE SCAN) OF 'SYS_IOT_TOP_32043' (UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
487394 consistent gets
803 physical reads
0 redo size
2405 bytes sent via SQL*Net to client
547 bytes received via SQL*Net from client
6 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
72 rows processed
以及下面的:
SQL> select customer_hierarchy.region, --改为统计region的销售额信息
time_hierarchy.yyyy, --改为统年的销售额信息
sum(sales.sales_amount) sales_amount
2 3 4 from sales, time_hierarchy, customer_hierarchy
5 where sales.trans_date = time_hierarchy.day
6 and sales.cust_id = customer_hierarchy.cust_id
7 group by customer_hierarchy.region, time_hierarchy.yyyy;
6 rows selected.
Elapsed: 00:00:10.07
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2028 Card=5 Bytes=14
5)
1 0 SORT (GROUP BY) (Cost=2028 Card=5 Bytes=145)
2 1 NESTED LOOPS (Cost=153 Card=485840 Bytes=14089360)
3 2 NESTED LOOPS (Cost=153 Card=485840 Bytes=9230960)
4 3 TABLE ACCESS (FULL) OF 'SALES' (Cost=153 Card=485840
Bytes=7287600)
5 3 INDEX (UNIQUE SCAN) OF 'SYS_IOT_TOP_32048' (UNIQUE)
6 2 INDEX (UNIQUE SCAN) OF 'SYS_IOT_TOP_32043' (UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
487394 consistent gets
803 physical reads
0 redo size
613 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
6 rows processed
分析:上面两个例子都没有采用查询重写,没有使用物化视图,是因为oracle不知道
客户的ID与邮政编码以及地区之间的关系,我们没有告诉oralce,需要通过维数
来告诉oracle
下面来创建维数来告诉 oralce客户ID与邮政编码以及年之间的关系:
SQL> create dimension sales_dimension
2 level cust_id is customer_hierarchy.cust_id
3 level zip_code is customer_hierarchy.zip_code
4 level region is customer_hierarchy.region
5 level day is time_hierarchy.day
6 level mmyyyy is time_hierarchy.mmyyyy
level qtr_yyyy is time_hierarchy.qtr_yyyy
7 8 level yyyy is time_hierarchy.yyyy
9 hierarchy cust_rollup
10 (
11 cust_id child of
12 zip_code child of
13 region
14 )
15 hierarchy time_rollup
16 (
17 day child of
18 mmyyyy child of
19 qtr_yyyy child of
20 yyyy
21 )
22 attribute mmyyyy
23 determines mon_yyyy;
Dimension created.
Elapsed: 00:00:00.21
SQL> set autotrace on
SQL> select customer_hierarchy.region,
2 time_hierarchy.yyyy,
sum(sales.sales_amount) sales_amount
from sales, time_hierarchy, customer_hierarchy
3 4 5 where sales.trans_date = time_hierarchy.day
6 and sales.cust_id = customer_hierarchy.cust_id
group by customer_hierarchy.region, time_hierarchy.yyyy 7 ;
REGION YYYY SALES_AMOUNT
---------- ---------- ------------
0 2008 8.3478E+11
1 2008 8.8560E+11
2 2008 8.8461E+11
3 2008 8.8898E+11
4 2008 8.8777E+11
5 2008 8.3567E+11
6 rows selected.
Elapsed: 00:00:00.71
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=15 Card=5 Bytes=320)
1 0 SORT (GROUP BY) (Cost=15 Card=5 Bytes=320)
2 1 HASH JOIN (Cost=10 Card=429 Bytes=27456)
3 2 VIEW (Cost=3 Card=100 Bytes=700)
4 3 SORT (UNIQUE) (Cost=3 Card=100 Bytes=700)
5 4 INDEX (FULL SCAN) OF 'SYS_IOT_TOP_32048' (UNIQUE)
(Cost=1 Card=100 Bytes=700)
6 2 HASH JOIN (Cost=7 Card=429 Bytes=24453)
7 6 VIEW (Cost=4 Card=9 Bytes=63)
8 7 SORT (UNIQUE) (Cost=4 Card=9 Bytes=63)
9 8 INDEX (FAST FULL SCAN) OF 'SYS_IOT_TOP_32043' (U
NIQUE) (Cost=2 Card=366 Bytes=2562)
10 6 TABLE ACCESS (FULL) OF 'SALES_MV' (Cost=2 Card=572 B
ytes=28600)
Statistics
----------------------------------------------------------
403 recursive calls
0 db block gets
98 consistent gets
0 physical reads
0 redo size
613 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
25 sorts (memory)
0 sorts (disk)
6 rows processed
分析:通过创建维数告诉oracle客户ID与邮政编码以及地区之间的关系之后,则
此SQL语句进行了查询重写,使用了新创建的物化视图。
现在才理解畏数是用来干什么的,以前看PL/SQL书的时候不知道维数有什么用。