給兄弟專案做Statspack最佳化過程

foreverlee發表於2006-03-24

目前資料庫負載處理事務情況
需要指出的是資料庫每秒鐘需要處理事務數(業務和Oracle資料字典本身)為99.53.說明資料庫在24-Feb-05 16:00:03至24-Feb-05 22:00:05期間比較繁忙.
每秒鐘的物理讀(disk reads)為4,558.93*8k=35671k(34M左右)表示: 每秒鐘處理業務需要系統級的記憶體交換在34M左右.這一點是需要降低的,看到這裡我感覺到資料緩衝區中的命中率不會高.
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 84,012.03 844.10
Logical reads: 7,510.41 75.46
Block changes: 514.31 5.17
Physical reads: 4,558.93 45.81
Physical writes: 9.13 0.09
User calls: 215.01 2.16
Parses: 4.12 0.04
Hard parses: 0.01 0.00
Sorts: 2.13 0.02
Logons: 0.12 0.00
Executes: 152.38 1.53
Transactions: 99.53

[@more@]

STATSPACK report for

DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
GPS 4182057045 gps 1 9.2.0.4.0 NO dbsvr1

我採集了24-Feb-05 16:00:03至24-Feb-05 22:00:05的效能資料作為此篇報告依據.
Snap Id Snap Time Sessions Curs/Sess Comment
------- ------------------ -------- --------- -------------------
Begin Snap: 2 24-Feb-05 16:00:03 44 6.4
End Snap: 8 24-Feb-05 22:00:05 47 6.0
Elapsed: 360.03 (mins)


目前資料庫配置:
Cache Sizes (end)
~~~~~~~~~~~~~~~~~
Buffer Cache: 1,904M Std Block Size: 8K
Shared Pool Size: 224M Log Buffer: 1,024K


目前資料庫負載處理事務情況
需要指出的是資料庫每秒鐘需要處理事務數(業務和Oracle資料字典本身)為99.53.說明資料庫在24-Feb-05 16:00:03至24-Feb-05 22:00:05期間比較繁忙.
每秒鐘的物理讀(disk reads)為4,558.93*8k=35671k(34M左右)表示: 每秒鐘處理業務需要系統級的記憶體交換在34M左右.這一點是需要降低的,看到這裡我感覺到資料緩衝區中的命中率不會高.
Load Profile
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
Redo size: 84,012.03 844.10
Logical reads: 7,510.41 75.46
Block changes: 514.31 5.17
Physical reads: 4,558.93 45.81
Physical writes: 9.13 0.09
User calls: 215.01 2.16
Parses: 4.12 0.04
Hard parses: 0.01 0.00
Sorts: 2.13 0.02
Logons: 0.12 0.00
Executes: 152.38 1.53
Transactions: 99.53


例項效能分析:
Buffer Hit %: 資料緩衝區中的命中率為39.30% 這個指標比較低,通常應當在90%以上.提升這個指標需要做的工作很多,稍後介紹.
Buffer Nowait Ratio: 在緩衝區中獲取buffer的未等待比率為99.99%,間接反映邏輯讀的成功率。通常應當在99%以上.
Soft Parse Ratio:99.83% 資料庫軟分析解析率為99.44%. 通常高代表可能使用了繫結變數(或者share_pool有足夠空間存放sql執行計劃也可以提高軟分析率,但這是錯誤的),太低需要調整應用使用繫結變數,或者參考 cursor_sharing = similar ,9i以上版本.通常應當在99%以上.這是一個很重要的指標.
由於Soft Parse Ratio與shared pool的設定有很大關係,下面還好繼續討論.

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 100.00
Buffer Hit %: 39.30 In-memory Sort %: 100.00
Library Hit %: 99.99 Soft Parse %: 99.83
Execute to Parse %: 97.29 Latch Hit %: 99.97
Parse CPU to Parse Elapsd %: 100.28 % Non-Parse CPU: 99.79


共享池資料統計(這部分對於效能很重要).
Memory Usage %:共享池的使用率,應該穩定在75%--90%之間(穩定!!!),太小浪費記憶體,太大則顯記憶體不足.這裡為37%左右. 表示shared pool還有足夠空間生成和存放新的sql執行計劃。
Percent of SQLs with Execution:執行次數大於1的sql的比率(若太小可能是沒有使用繫結變數) 這裡為39%.
執行次數大於1的含義: 相同的sql語句如果使用繫結變數其執行計劃應當在shared pool中保留一份.
Percent of Memory for SQl with Execution: 執行次數大於1的sql消耗記憶體/(所有sql消耗記憶體) 越大越好.表示充分使用繫結變數.這裡為:35%.
Shared Pool Statistics Begin End
------ ------
Memory Usage %: 35.90 38.07
% SQL with executions>1: 39.18 39.44
% Memory for SQL w/exec>1: 35.17 34.50


前5個資料庫內部等待事件
log file sync:當客戶段commit,rollback一個事務的時候,lgwr(一個oracle內部程式,主要完成日誌寫任務)會將這段時間的重做日誌寫入到物理日誌檔案中.日誌檔案的同步必須等待這一過程的完成.這個等待事件的發生應當有這樣幾個原因:
1〉 物理硬碟本身寫的速度慢
2〉 資料庫日誌檔案組每個成員應當平均分配在不同的盤上,而不是同一個盤的不同分割槽.
SQL> select group#,MEMBER from v$logfile;

GROUP#
----------
MEMBER
--------------------------------------------------------------------------------
3
/opt/ora9/oradata/gps/redo03.log

2
/opt/ora9/oradata/gps/redo02.log

1
/opt/ora9/oradata/gps/redo01.log
我們這裡沒有做日誌檔案組成員的映象,但不能說明日誌檔案設定的沒有問題.原因:
SQL> select file_name from dba_data_files;
FILE_NAME
--------------------------------------------------------------------------------
/opt/ora9/oradata/gps/system01.dbf
/opt/ora9/oradata/gps/perfstat.dbf
/opt/ora9/oradata/gps/cwmlite01.dbf
/opt/ora9/oradata/gps/drsys01.dbf
/gps/example01.dbf
/opt/ora9/oradata/gps/indx01.dbf
/opt/ora9/oradata/gps/odm01.dbf
/opt/ora9/oradata/gps/tools01.dbf
/gps/users01.dbf
/opt/ora9/oradata/gps/xdb01.dbf
/gps/users02.dbf
FILE_NAME
--------------------------------------------------------------------------------
/gps/undotbs21.dbf
日誌檔案和資料檔案應當分配在不同的磁碟上,減少磁碟徵用(Oracle預設安裝的日誌檔案目錄設定是需要調整的).

db file scattered read:這種情況通常顯示與全表掃描相關的等待.通常對於全表掃描,出於效能考慮,資料回分散的讀入至DB_Cache中.如果這個事件過高,可能說明對於有些全表掃描的表沒有建立索引或者沒有建立正確的索引或者沒有正確使用索引.
進一步確定是哪張表的問題,發現MCC2.COUNTER_SINGLE表
SQL> select TARGET,count(*) from v$session_longops
2 group by TARGET;
TARGET COUNT(*)
---------------------------------------------------------------- ----------
MCC2.COUNTER_SINGLE 127


control file parallel write :和log file sync的問題一樣,在安裝Oracle資料庫的時候應當將control file和資料檔案分開.而現在的redo日誌檔案,所有資料檔案,control file都位於同一磁碟.由於業務資料訪問相當頻繁,資料檔案磁碟寫很密集,當Oracle寫redo的時候(每三秒)就會發生等待.

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
log file sync 2,161,184 16,122 69.26
CPU time 5,043 21.66
db file scattered read 6,213,357 1,448 6.22
control file parallel write 6,904 602 2.58
latch free 3,237 13 .06
-------------------------------------------------------------









問題級初步解決方案
1 redo日誌檔案,所有資料檔案,control file控制檔案位於同一磁碟.
解決方法: 將redo日誌檔案,control file,資料檔案(靜態資料)至於磁碟A,資料檔案(動態資料,主要是業務資料)至於磁碟B.

2 沒有充足繫結變數
解決方法: 見最佳化你的應用--請使用繫結變數.txt


3 索引使用情況. 特別是COUNTER_SINGLE表的索引是否合理,是否正確使用了索引 (這類問題的解決需要多方面配合)
見:最佳化你的系統--索引(一) 正確使用索引.txt


4 需要最佳化的sql
我覺得前2個查詢要開並行了.不是有4個cpu麼
1>
SELECT "UTC","RECORDID","ONLONGITUDE","ONLATITUDE","PRICE" FROM
"MCC2"."COUNTER_SINGLE" "A1" WHERE "RECORDID">0 AND "ONLATITUDE"
<=:1 AND "ONLATITUDE">=:2 AND "ONLONGITUDE"<=:3 AND "ONLONGITUDE
">=:4 AND "UTC"<=:5 AND "UTC">=:6
2>
SELECT "UTC","RECORDID","OFFLONGITUDE","OFFLATITUDE","PRICE" FRO
M "MCC2"."COUNTER_SINGLE" "A1" WHERE "RECORDID">0 AND "OFFLATITU
DE"<=:1 AND "OFFLATITUDE">=:2 AND "OFFLONGITUDE"<=:3 AND "OFFLON
GITUDE">=:4 AND "UTC"<=:5 AND "UTC">=:6
3> 使用表連線代替in
select DISTINCT OGM.opid from OGMEMBER OGM,OGOPT where OGM.opid=
:1 and OGOPT.moptcode=:2 and OGM.ogid in(select OGOPT.ogid from
OGOPT OGOPT, SERVICEGROUP SG, SERVICEUNIT SU where SU.oemcode=:3
and SU.commaddr=:4 and SG.sgid=SU.sgid and OGOPT.sgid=SG.sgid)
4>
select * from MCC_LASTTRACK where SUID in (select SUID from SERV
ICEVIEW where MAC=:1) 使用表連線代替in
select ml.*
from MCC_LASTTRACK ml,
SERVICEVIEW ser,
where ml.suid=ser.suid; (MCC_LASTTRACK表suid有索引)

5 關於shared pool問題.
現在的Shared Pool Size:224M,而shared pool的利用率為37%左右.而且資料庫每秒鐘需要處理事務數(業務和Oracle資料字典本身)為99.53.資料庫如此繁忙,但shared pool的利用率仍為為37%.我猜測蘇州天澤應用系統的查詢種類並不是很多,但查詢數量是非常非常大的.查詢種類如果不多,那麼應用採用繫結變數後產生的sql執行計劃就應當不會持續增長.
我認為將shared pool的大小控制在150M左右是比較恰當的.

6 關於DB Cache的問題
現在Buffer Hit %: 39.30
導致這個問題是多方面的。2,3,4
當然DB_Cache_size這個引數也可以設大點。2.5G可以考慮.


7 先簡單說一下COUNTER_SINGLE表的查詢.

1>
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
49,813,694 384 129,723.2 30.7 1800.14 1763.39 1923892574
Module: (TNS V1-V3)
SELECT "UTC","RECORDID","ONLONGITUDE","ONLATITUDE","PRICE" FROM
"MCC2"."COUNTER_SINGLE" "A1" WHERE "RECORDID">0 AND "ONLATITUDE"
<=:1 AND "ONLATITUDE">=:2 AND "ONLONGITUDE"<=:3 AND "ONLONGITUDE
">=:4 AND "UTC"<=:5 AND "UTC">=:6
分析:
SELECT "UTC","RECORDID","ONLONGITUDE","ONLATITUDE","PRICE"
FROM "MCC2"."COUNTER_SINGLE" "A1"
WHERE "RECORDID">0
AND "ONLATITUDE"<=:1
AND "ONLATITUDE">=:2
AND "ONLONGITUDE"<=:3
AND "ONLONGITUDE">=:4
AND "UTC"<=:5
AND "UTC">=:6
ONLONGITUDE,ONLATITUDE沒有索引? 這會是一個恐怖的全表掃描(FTS)啊.

COUNTER_SINGLE表索引情況
SQL> column INDEX_NAME for a20
SQL> column COLUMN_NAME for a20
SQL> column TABLE_NAME for a20
SQL> l
1 select INDEX_NAME,COLUMN_NAME,TABLE_NAME from dba_ind_columns
2 where INDEX_OWNER='MCC2'
3* and TABLE_NAME='COUNTER_SINGLE'
SQL> /

INDEX_NAME COLUMN_NAME TABLE_NAME
-------------------- -------------------- --------------------
COUNTER_SINGLE_IDX SUID COUNTER_SINGLE
COUNTER_SINGLE_IDX RECORDID COUNTER_SINGLE
COUNTER_SINGLE_PK SUID COUNTER_SINGLE
COUNTER_SINGLE_PK UTC COUNTER_SINGLE

2> 也是一樣ONLONGITUDE,ONLATITUDE沒有索引?
49,813,585 384 129,722.9 30.7 1841.68 1802.62 635376328
Module: (TNS V1-V3)
SELECT "UTC","RECORDID","OFFLONGITUDE","OFFLATITUDE","PRICE" FRO
M "MCC2"."COUNTER_SINGLE" "A1" WHERE "RECORDID">0 AND "OFFLATITU
DE"<=:1 AND "OFFLATITUDE">=:2 AND "OFFLONGITUDE"<=:3 AND "OFFLON
GITUDE">=:4 AND "UTC"<=:5 AND "UTC">=:6

3>
33,962,880 14 2,425,920.0 20.9 85.44 83.67 1822985056
Module: JDBC Thin Client
select trim(to_char(su.OEMCODE, '000X')) || ':' || su.COMMADDR a
s MAC, su.SUID,o.OEMNAME,v.VNAME,v.VCAT,sg.sgname,c.cname,c2.cn
ame as cname2,c3.cname as cname3 from SERVICEUNIT su,SERVICEGRO
UP sg, VEHICLE v, taxi_vehicle tv,OEM o,CUSTOMER c,CUSTOMER c2,C
USTOMER c3 where su.SGID in (select DISTINCT sgid from OGOPT

分析:
首先根據hash_value找到全部sql語句
SQL> select sql_text from v$sqltext where hash_value=&hash_value order by piece;
Enter value for hash_value: 1822985056
old 1: select sql_text from v$sqltext where hash_value=&hash_value order by piece
new 1: select sql_text from v$sqltext where hash_value=1822985056 order by piece

SQL_TEXT
----------------------------------------------------------------
select trim(to_char(su.OEMCODE, '000X')) || ':' || su.COMMADDR as MAC su.SUID,o.OEMNAME,v.VNAME,v.VCAT,sg.sgname,c.cname,c2.cn
ame as cname2,c3.cname as cname3
from SERVICEUNIT su,
SERVICEGROUP sg,
VEHICLE v,
taxi_vehicle tv,
OEM o,
CUSTOMER c,
CUSTOMER c2,
CUSTOMER c3
where su.SGID in
(select DISTINCT sgid
from OGOPT
where OGID in ( select OGID
from OGMEMBER
where OPID = :1))
and su.VID = v.VID
and su.vid = tv.vid(+)
and su.OEMCODE=o.OEMCODE
and su.sgid = sg.sgid and su.cid = c.cid
and tv.driver2id = c2.cid(+)
and tv.driver3id = c3.cid(+)
and su.stid = 1
order by mac
首先in要用子查詢改寫.



來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/76065/viewspace-822632/,如需轉載,請註明出處,否則將追究法律責任。

相關文章