你可以為列統計直方圖。這些直方圖為列資料的分佈情況提供準確的估算。當列資料傾斜時,直方圖提供更為最佳化的選擇性估算,為資料分佈不均勻的情況提供最優的執行計劃。
Oracle Database為提供2種類別的列統計資訊直方圖:
-
Height-Balanced Histograms
-
Frequency Histograms
資料庫儲存直方圖資訊*TAB_COL_STATISTICS檢視(使用者和DBA)。列值範圍:HEIGHTBALANCED, FREQUENCY, or NONE。
2、Height-Balanced Histograms
在height-balanced histogram中,列值被劃分為桶,使得每個桶包含大致相同數量的行。直方圖顯示端點在值範圍內的位置。
考慮一個my_col值為1到100之間的列,以及一個10個桶的直方圖。如果資料my_col均勻分佈,則直方圖看起來與圖13-1類似,其中數字是端點值。例如,第七個桶具有值在60到70之間的行。
圖13-1具有均勻分佈的高度平衡直方圖
每個桶中的行數為總行數的10%。在這個均勻分佈的例子中,40%的行的值在60到100之間。
如果資料不均勻分佈,則直方圖可能如圖13-2所示。在這種情況下,大多數行的列的值為5。只有10%的行的值在60到100之間。
圖13-2具有非均勻分佈的高度平衡直方圖
您可以使用USER_TAB_HISTOGRAMS表格檢視高度平衡的直方圖,如示例13-1所示。
-
BEGIN
-
DBMS_STATS
.
GATHER_table_STATS
(
-
OWNNAME
=
>
'OE'
,
-
TABNAME
=
>
'INVENTORIES'
,
-
METHOD_OPT
=
>
'FOR COLUMNS SIZE 10 quantity_on_hand'
)
;
-
END
;
-
/
-
-
SELECT
COLUMN_NAME
,
NUM_DISTINCT
,
NUM_BUCKETS
,
HISTOGRAM
-
FROM
USER_TAB_COL_STATISTICS
-
WHERE
TABLE_NAME
=
'INVENTORIES'
AND
COLUMN_NAME
=
'QUANTITY_ON_HAND'
;
-
-
COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM
-
------------------------------ ------------ ----------- ---------------
-
QUANTITY_ON_HAND 237 10 HEIGHT BALANCED
-
-
SELECT
ENDPOINT_NUMBER
,
ENDPOINT_VALUE
-
FROM
USER_TAB_HISTOGRAMS
-
WHERE
TABLE_NAME
=
'INVENTORIES'
AND
COLUMN_NAME
=
'QUANTITY_ON_HAND'
-
ORDER
BY
ENDPOINT_NUMBER
;
-
-
ENDPOINT_NUMBER ENDPOINT_VALUE
-
--------------- --------------
-
0 0
-
1 27
-
2 42
-
3 57
-
4 74
-
5 98
-
6 123
-
7 149
-
8 175
-
9 202
-
10 353
在示例13-1查詢輸出中,一行(1-10)對應於直方圖中的每個桶。Oracle資料庫向該直方圖新增了特殊的第0個資料桶,因為第1個資料桶(27)中的值不是quantity_on_hand列的最小值。第0個桶的最小值為0 quantity_on_hand。
3、
frequency histogra
m
在
frequency histogram中,列的每個值對應於直方圖的單個桶。每個桶包含此單個值的出現次數。例如,假設36行包含列的值1 warehouse_id。端點值1具有端點號36。
資料庫在以下條件下自動建立頻率直方圖,而不是高度平衡的直方圖:
-
不同值的數量小於或等於指定的直方圖桶數(最多254個)。
-
每個列值重複一次。
您可以使用USER_TAB_HISTOGRAMS檢視檢視頻率直方圖,如示例13-2所示。
-
BEGIN
-
DBMS_STATS
.
GATHER_TABLE_STATS
(
-
OWNNAME
=
>
'OE'
,
-
TABNAME
=
>
'INVENTORIES'
,
-
METHOD_OPT
=
>
'FOR COLUMNS SIZE 20 warehouse_id'
)
;
-
END
;
-
/
-
-
SELECT
COLUMN_NAME
,
NUM_DISTINCT
,
NUM_BUCKETS
,
HISTOGRAM
-
FROM
USER_TAB_COL_STATISTICS
-
WHERE
TABLE_NAME
=
'INVENTORIES'
AND
COLUMN_NAME
=
'WAREHOUSE_ID'
;
-
-
COLUMN_NAME NUM_DISTINCT NUM_BUCKETS HISTOGRAM
-
------------------------------ ------------ ----------- ---------------
-
WAREHOUSE_ID 9 9 FREQUENCY
-
-
SELECT
ENDPOINT_NUMBER
,
ENDPOINT_VALUE
-
FROM
USER_TAB_HISTOGRAMS
-
WHERE
TABLE_NAME
=
'INVENTORIES'
AND
COLUMN_NAME
=
'WAREHOUSE_ID'
-
ORDER
BY
ENDPOINT_NUMBER
;
-
-
ENDPOINT_NUMBER ENDPOINT_VALUE
-
--------------- --------------
-
36 1
-
213 2
-
261 3
-
370 4
-
484 5
-
692 6
-
798 7
-
984 8
-
1112 9
在例13-2中,第一個桶為warehouse_id1。該值在表中顯示36次,如以下查詢所證實:
oe@PROD> SELECT COUNT(*) FROM inventories WHERE warehouse_id = 1;
COUNT(*)
----------
36
5、
練習4
、直方圖最佳化練習
統計已銷戶使用者數量,請最佳化以下語句
select count(1) from ht.c_cons where status='close';
-
SQL
>
select
status
,
count
(
1
)
from
ht
.
c_cons
group
by
status
;
-
-
STATUS
COUNT
(
1
)
-
------------------------------------------------------------ ----------
-
close 19
-
open 9519
-
creating 462
-
-
SQL
>
create
index
ht
.
idx_c_cons_status
on
ht
.
c_cons
(
status
)
;
-
SQL
>
col
owner
for
a10
-
col
table_name
for
a20
-
col
column_name
for
a20
-
col
data_type
for
a30
-
col
histogram
for
a20
-
select
owner
,
table_name
,
column_name
,
data_type
,
-
column_id
,
num_distinct
,
histogram
,
NUM_NULLS
,
LAST_ANALYZED
from
-
dba_tab_columns
where
table_name
=
'C_CONS'
and
owner
=
'HT'
-
order
by
column_id
;
SQL
>
SQL
>
SQL
>
SQL
>
SQL
>
2 3 4
-
-
OWNER TABLE_NAME COLUMN_NAME DATA_TYPE COLUMN_ID NUM_DISTINCT HISTOGRAM NUM_NULLS LAST_ANALYZED
-
---------- -------------------- -------------------- ------------------------------ ---------- ------------ -------------------- ---------- ------------------------------
-
HT C_CONS CONS_NO
NUMBER
1 10000 NONE 0 20
-
AUG
-
17
-
HT C_CONS CONS_NAME
VARCHAR2
2 5057 NONE 0 20
-
AUG
-
17
-
HT C_CONS ORG_NAME
VARCHAR2
3 12 NONE 0 20
-
AUG
-
17
-
HT C_CONS BUILD_DATE
DATE
4 10000 NONE 0 20
-
AUG
-
17
-
HT C_CONS STATUS
VARCHAR2
5 3 NONE 0 20
-
AUG
-
17
-
-
SQL
>
exec DBMS_STATS
.
GATHER_TABLE_STATS
(
ownname
=
>
'HT'
,
tabname
=
>
'C_CONS'
,
estimate_percent
=
>
30
,
method_opt
=
>
'for columns size 50 status'
,
no_invalidate
=
>
FALSE
,
degree
=
>
4
,
cascade
=
>
TRUE
)
;
-
PL
/
SQL procedure successfully completed
.
-
SQL
>
col
owner
for
a10
-
SQL
>
col
table_name
for
a20
-
col
column_name
for
a20
-
col
data_type
for
a30
-
col
histogram
for
a20
-
select
owner
,
table_name
,
column_name
,
data_type
,
-
column_id
,
num_distinct
,
histogram
,
NUM_NULLS
,
LAST_ANALYZED
from
-
dba_tab_columns
where
table_name
=
'C_CONS'
and
owner
=
'HT'
-
order
by
column_id
;
SQL
>
SQL
>
SQL
>
SQL
>
2 3 4
-
-
OWNER TABLE_NAME COLUMN_NAME DATA_TYPE COLUMN_ID NUM_DISTINCT HISTOGRAM NUM_NULLS LAST_ANALYZED
-
---------- -------------------- -------------------- ------------------------------ ---------- ------------ -------------------- ---------- ------------------------------
-
HT C_CONS CONS_NO
NUMBER
1 10000 NONE 0 20
-
AUG
-
17
-
HT C_CONS CONS_NAME
VARCHAR2
2 5057 NONE 0 20
-
AUG
-
17
-
HT C_CONS ORG_NAME
VARCHAR2
3 12 NONE 0 20
-
AUG
-
17
-
HT C_CONS BUILD_DATE
DATE
4 10000 NONE 0 20
-
AUG
-
17
-
HT C_CONS STATUS
VARCHAR2
5 3 FREQUENCY 0 20
-
AUG
-
17
-
SQL
>
select
count
(
1
)
from
ht
.
c_cons
where
status
=
'open'
;
-
Execution Plan
-
----------------------------------------------------------
-
Plan hash
value
:
2016425671
-
-
-------------------------------------------------------------------------------------------
-
|
Id
|
Operation
|
Name
|
Rows
|
Bytes
|
Cost
(
%
CPU
)
|
Time
|
-
-------------------------------------------------------------------------------------------
-
|
0
|
SELECT
STATEMENT
|
|
1
|
6
|
8
(
0
)
|
00
:
00
:
01
|
-
|
1
|
SORT AGGREGATE
|
|
1
|
6
|
|
|
-
|
*
2
|
INDEX
FAST FULL SCAN
|
IDX_C_CONS_STATUS
|
9639
|
57834
|
8
(
0
)
|
00
:
00
:
01
|
-
-------------------------------------------------------------------------------------------
-
-
Predicate Information
(
identified
by
operation id
)
:
-
---------------------------------------------------
-
2
-
filter
(
"STATUS"
=
'open'
)
-
Statistics
-
----------------------------------------------------------
-
1 recursive calls
-
0 db block gets
-
28 consistent gets
-
0 physical reads
-
0 redo
size
-
527 bytes sent via SQL
*
Net
to
client
-
523 bytes received via SQL
*
Net
from
client
-
2 SQL
*
Net roundtrips
to
/
from
client
-
0 sorts
(
memory
)
-
0 sorts
(
disk
)
-
1
rows
processed
-
SQL
>
-
SQL
>
select
count
(
1
)
from
ht
.
c_cons
where
status
=
'close'
;
-
Execution Plan
-
----------------------------------------------------------
-
-
Plan hash
value
:
2292286995
-
-
---------------------------------------------------------------------------------------
-
|
Id
|
Operation
|
Name
|
Rows
|
Bytes
|
Cost
(
%
CPU
)
|
Time
|
-
---------------------------------------------------------------------------------------
-
|
0
|
SELECT
STATEMENT
|
|
1
|
6
|
1
(
0
)
|
00
:
00
:
01
|
-
|
1
|
SORT AGGREGATE
|
|
1
|
6
|
|
|
-
|
*
2
|
INDEX
RANGE SCAN
|
IDX_C_CONS_STATUS
|
24
|
144
|
1
(
0
)
|
00
:
00
:
01
|
-
---------------------------------------------------------------------------------------
-
-
Predicate Information
(
identified
by
operation id
)
:
-
---------------------------------------------------
-
-
2
-
access
(
"STATUS"
=
'close'
)
-
-
-
Statistics
-
----------------------------------------------------------
-
1 recursive calls
-
0 db block gets
-
2 consistent gets
-
0 physical reads
-
0 redo
size
-
526 bytes sent via SQL
*
Net
to
client
-
523 bytes received via SQL
*
Net
from
client
-
2 SQL
*
Net roundtrips
to
/
from
client
-
0 sorts
(
memory
)
-
0 sorts
(
disk
)
-
1
rows
processed