讀書筆記之MV_DM

wmlm發表於2007-07-31

<>

對維度、實體化檢視、查詢重寫的理解

以前,我對實體化檢視的理解仍停留在8I的快照概念上,雖然知道它可以做統計,但對維的理解仍不清楚,查詢重寫也是知其一不知其二,很慚愧總是在MV上做查詢。
這次正好專案中的統計用到了MV,以及日報表、月報表,於是想使用MV來解決一些問題。以下是參考AKADIA的資料所做的兩個測試。只看白皮書會知道個大概,還需要實際操練一下。

[@more@]

測試一說明了實體化檢視的查詢重寫功能,可以改善查詢效能;
測試二說明了如何讓實體化檢視與維度結合起來使用。
更深入的比如在MV上聚合、子集等內容沒有研究。話說過來,既然查詢重寫的結果是讀MV,那麼不如我直接讀MV來操作。但是,舉一返三,會用時間維,就可以延伸使用其它維,比如公安單位,也可以建立一個單位維表,在儲存日報表時,儲存派出所的單位程式碼,在維表中儲存其上級、上上級程式碼名稱。

-- 測試一

-- create user and grant it
SQL> create user dw identified by dw;
SQL> alter user dw default tablespace users;
SQL> grant connect,resource to dw;
grant create dimension to dw;
SQL> show parameter compati
SQL> show parameter query_rewrite
SQL> grant query rewrite to dw;
SQL> grant create materialized view to dw;

-- conn dw and test it.
sqlplus dw/dw
set echo on
set termout off

drop table bigtab;

create table bigtab
nologging
as
select * from all_objects
union all
select * from all_objects
union all
select * from all_objects
/

insert /*+ APPEND */ into bigtab
select * from bigtab;
commit;
insert /*+ APPEND */ into bigtab
select * from bigtab;
commit;

analyze table bigtab compute statistics;
select count(*) from bigtab;

COUNT(*)
----------
708456

Initially this quewry will require a full scan of the large table.

set autotrace on
set timing on
select owner, count(*) from bigtab group by owner;

OWNER COUNT(*)
------------------------------ ----------
CTXSYS 6264
ELAN 1272
HR 816
......

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=878 Card=9 Bytes=45)
1 0 SORT (GROUP BY) (Cost=878 Card=9 Bytes=45)
2 1 TABLE ACCESS (FULL) OF 'BIGTAB' (Cost=396 Card=327648 By
tes=1638240)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4032 consistent gets
1503 physical reads
0 redo size
592 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
9 rows processed

set autotrace off

alter session set query_rewrite_enabled=true;
alter session set query_rewrite_integrity=enforced;

create materialized view mv_bigtab
build immediate
refresh on commit
enable query rewrite
as
select owner, count(*)
from bigtab
group by owner
/

set autotrace traceonly
select owner,count(*) from bigtab group by owner;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=9 Bytes=90)
1 0 TABLE ACCESS (FULL) OF 'MV_BIGTAB' (Cost=2 Card=9 Bytes=90
)

Statistics
----------------------------------------------------------
25 recursive calls
0 db block gets
19 consistent gets
1 physical reads
0 redo size
592 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
9 rows processed

SQL> set timing off
SQL> set autotrace off
SQL> drop materialized view mv_bigtab;
SQL> drop table bigtab;
-- 結論:使用MV的查詢重寫功能,可以減少一致性讀和物理讀。
-- 測試二

Fact table: Sales
Dimension Table: Time

create user dw identified by dw;
alter user dw default tablespace users;
grant connect,resource to dw;
grant create dimension to dw;
show parameter compati
show parameter query_rewrite
grant query rewrite to dw;
grant create materialized view to dw;

-- Setup of the sales table
CREATE TABLE sales (
trans_date DATE,
cust_id INT,
sales_amount NUMBER
);

INSERT /*+ APPEND */ INTO sales
SELECT TRUNC(SYSDATE,'YYYY')+MOD(ROWNUM,366) trans_date,
MOD(ROWNUM,100) cust_id,
ABS(DBMS_RANDOM.RANDOM)/100 sales_amount
FROM all_objects
/
COMMIT;

-- 再多插入一些資料
BEGIN
FOR i IN 1 .. 10
LOOP
INSERT /*+ APPEND */ INTO sales
SELECT trans_date, cust_id,
ABS(DBMS_RANDOM.RANDOM)/100 sales_amount
FROM sales;
COMMIT;
END LOOP;
END;
/

SELECT COUNT(*) FROM sales;

COUNT(*)
----------
3494912

-- Setup of the TIME table
drop table time;
CREATE TABLE time (
day PRIMARY KEY,
mmyyyy,
mon_yyyy,
qtr_yyyy,
yyyy
)
ORGANIZATION INDEX
AS
SELECT DISTINCT
trans_date DAY,
CAST (TO_CHAR(trans_date,'MMYYYY') AS NUMBER) MMYYYY,
TO_CHAR(trans_date,'MON-YYYY') MON_YYYY,
'Q' || CEIL(TO_CHAR(trans_date,'MM')/3) || ' FY'
|| TO_CHAR(trans_date,'YYYY') QTR_YYYY,
CAST(TO_CHAR(trans_date, 'YYYY') AS NUMBER) YYYY
FROM sales
/

-- Setup of the SALES_MV materialized view
ANALYZE TABLE SALES COMPUTE STATISTICS;
ANALYZE TABLE TIME COMPUTE STATISTICS;

CREATE MATERIALIZED VIEW sales_mv
BUILD IMMEDIATE
REFRESH ON DEMAND
ENABLE QUERY REWRITE
AS
SELECT sales.cust_id,
SUM(sales.sales_amount) sales_amount,
time.mmyyyy
FROM sales, time
WHERE sales.trans_date = time.day
GROUP BY sales.cust_id, time.mmyyyy
/

ALTER SESSION SET QUERY_REWRITE_ENABLED=TRUE;
ALTER SESSION SET QUERY_REWRITE_INTEGRITY=TRUSTED;

SET AUTOTRACE ON

SELECT time.mmyyyy, SUM(sales_amount)
FROM sales, time
WHERE sales.trans_date = time.day
GROUP BY time.mmyyyy
/
SET AUTOTRACE OFF

......
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=572 Bytes=148
72)

1 0 SORT (GROUP BY) (Cost=5 Card=572 Bytes=14872)
2 1 TABLE ACCESS (FULL) OF 'SALES_MV' (Cost=2 Card=572 Bytes=14872)
......
Oracle rewrote the query to use the materialized view SALES_MV (see Execution Plan).
However, let's see what happens if we issue a query that calls for a higher level of aggregation.

SET TIMING ON
SET AUTOTRACE ON
SELECT time.qtr_yyyy, SUM(sales_amount)
FROM sales, time
WHERE sales.trans_date = time.day
GROUP BY time.qtr_yyyy
/
SET AUTOTRACE OFF

QTR_YYYY SUM(SALES_AMOUNT)
---------- -----------------
Q1 FY2002 1.2507E+12
Q1 FY2003 1.3510E+10
Q2 FY2002 1.2674E+12
Q3 FY2002 1.2762E+12
Q4 FY2002 1.2672E+12

Elapsed: 00:00:16.03
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=14571 Card=5 Bytes=1
45)

1 0 SORT (GROUP BY) (Cost=14571 Card=5 Bytes=145)
2 1 NESTED LOOPS (Cost=1084 Card=3496960 Bytes=101411840)
3 2 TABLE ACCESS (FULL) OF 'SALES' (Cost=1084 Card=3496960 Bytes=45460480)
4 2 INDEX (UNIQUE SCAN) OF 'SYS_IOT_TOP_9678' (UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3508089 consistent gets
9091 physical reads
660 redo size
586 bytes sent via SQL*Net to client
499 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
5 rows processed

We see that Oracle does not yet that it could have used the materialized view to answer this particular query, so it used the original SALES table instead, and had to do a lot of work to get the answer. The same bad thing would happen if we requested data aggregated by fiscal year.

-- Setup of the TIME_DIM dimension
CREATE DIMENSION time_dim
LEVEL DAY IS time.day
LEVEL MMYYYY IS time.mmyyyy
LEVEL QTR_YYYY IS time.qtr_yyyy
LEVEL YYYY IS time.yyyy
HIERARCHY TIME_ROLLUP
(
day CHILD OF
mmyyyy CHILD OF
qtr_yyyy CHILD OF
yyyy
)
ATTRIBUTE mmyyyy DETERMINES mon_yyyy; -- attribute 是說明mmyyyy 決定 mon_yyyy的值

/*
CREATE TABLE time (
day PRIMARY KEY,
mmyyyy,
mon_yyyy,
qtr_yyyy,
yyyy
)*/

conn dw/dw
SET TIMING ON
SET AUTOTRACE ON
ALTER SESSION SET QUERY_REWRITE_ENABLED=TRUE;
ALTER SESSION SET QUERY_REWRITE_INTEGRITY=TRUSTED;

SELECT time.qtr_yyyy, SUM(sales_amount)
FROM sales, time
WHERE sales.trans_date = time.day
GROUP BY time.qtr_yyyy
/
SET AUTOTRACE OFF

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=18 Card=5 Bytes=195)
1 0 SORT (GROUP BY) (Cost=18 Card=5 Bytes=195)
2 1 HASH JOIN (Cost=7 Card=2024 Bytes=78936)
3 2 VIEW (Cost=4 Card=46 Bytes=598)
4 3 SORT (UNIQUE) (Cost=4 Card=46 Bytes=598)
5 4 INDEX (FAST FULL SCAN) OF 'SYS_IOT_TOP_9678' (UNIQUE) (Cost=2 Card=366 Bytes=4758)

6 2 TABLE ACCESS (FULL) OF 'SALES_MV' (Cost=2 Card=572 Bytes=14872)

-- 同樣也可以使用上MV

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/271063/viewspace-929675/,如需轉載,請註明出處,否則將追究法律責任。

相關文章