DB Issue Trouble Shooting Guideline (資料庫問題處理指導)
DB Issue Trouble Shooting Guideline
DB Issues :
• Performance Issues
• DB Hung
• Internal Error Raised
• Lock / Dead Lock
• Other Issues
Performance Issues :
• Most of (more 90%) performance issues are caused by Application.
• Few (less than 10%) of issues are caused by resource limitation, OS/RDBMS configuration and other reasons.
Work flow of identifying/solving Performance Issues.
• Identifying and Solving : DB Server , AP Server, Middle Ware, Client Program
Latent Reasons
• Bottleneck in Client Program
• PLSQL Issue
• Unreasonable configuration in RDBMS
• Network Issue
• Memory Issue
• CPU Issue
• IO Issue
Top Processes
• TOP command in UNIX
• Glance command in HP-UX
• Task Manager in Windows
Top Sessions in DB
• Find out the PID of top process from OS
• In dedicated server, find out the mapping session info in DB by the PID
select b.spid, a.sid, a.username, s.sql_text from v$session a , v$process b, v$sqlarea s where a.PADDR = b.ADDR and a.sql_hash_value = s.hash_value and b.spid=&PID;
Find out the long operation in the transaction
select username,sid,opname,
round ( sofar *100 / totalwork,0)||'%' as progress,
time_remaining,sql_text
from v$session_longops , v$sqlarea
where time_remaining <> 0
and sql_address = address
and sql_hash_value = hash_value;
Related View:
• V$PROCESS
• V$SESSION
• V$SQLAREA
• v$session_longops
Find more from Statspack Report
• Generate the report during peak time
• Analyze the report
Important Sections in the report
• Instance Efficiency Percentages
• Top 5 Wait Events
• SQL ordered by Gets for DB
• SQL ordered by Reads for DB
• Tablespace IO Stats for DB & File IO Stats for DB
Instance Efficiency Percentages
• Buffer Nowait Ratio
• Buffer Hit Ratio
• Library Hit Ratio
• Redo no-wait Ratio
• In-memory Sort Ratio
Instance Efficiency Percentages
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 99.08 In-memory Sort %: 99.60
Library Hit %: 99.46 Soft Parse %: 98.99
Execute to Parse %: 45.61 Latch Hit %: 100.00
Parse CPU to Parse Elapsd %: 87.88 % Non-Parse CPU: 100.00
Top 5 Wait Events:
• DB File Scattered Read.
• DB File Sequential Read
• Free Buffer
• Enqueue
• Log Buffer Space
• Log File Sync
• log file parallel write
Top 5 Wait Events
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file sequential read 7,544,366 25,591 48.49
CPU time 14,126 26.76
log file sync 50,593 6,241 11.82
db file scattered read 1,473,891 2,154 4.08
log file parallel write 60,964 1,546 2.93
-------------------------------------------------------------
SQL ordered by Gets for DB
SQL ordered by Gets for DB: ICSSPRD Instance: icssprd Snaps: 179 -181
-> End Buffer Gets Threshold: 10000
-> Note that resources reported for PL/SQL includes the resources used by
all SQL statements called within the PL/SQL code. As individual SQL
statements are also reported, it is possible and valid for the summed
total % to exceed 100
CPU Elapsd
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
91,163,470 567 160,782.1 19.2 850.46 949.33 1513226087
Module: SQL*Plus
SELECT DISTINCT b.filename
|| :"SYS_B_00" FROM css_tp_transaction_event a
,css_tp_txn_event_activity b WHERE a.trdng_ptnr_id
= :"SYS_B_01" AND a.event_type_id in (:"SYS
_B_02",:"SYS_B_03") AND a.txn_event_id = b.txn_ev
SQL ordered by Reads for DB:
SQL ordered by Reads for DB: ICSSPRD Instance: icssprd Snaps: 179 -181
-> End Disk Reads Threshold: 1000
CPU Elapsd
Physical Reads Executions Reads per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
11,993,968 120 99,949.7 36.2 443.71 552.02 4292891611
Module: JDBC Connect Client
BEGIN CSSPLSQL.B2B_MSG_DETECTOR.GET_B2B_OUT_MSG_DETECTOR ( :V1,
:V2, :V3); END;
Tablespace IO Stats for DB & File IO Stats for DB
Tablespace IO Stats for DB: ICSSPRD Instance: icssprd Snaps: 179 -181
->ordered by IOs (Reads + Writes) desc
Tablespace
------------------------------
Av Av Av Av Buffer Av Buf
Reads Reads/s Rd(ms) Blks/Rd Writes Writes/s Waits Wt(ms)
-------------- ------- ------ ------- ------------ -------- ---------- ------
CSS_BKG_DATA
1,699,930 472 4.3 4.3 109,215 30 5,034 10.9
Generate SQL report for further study
SQL> @?rdbmsadminsprepsql.sql
……
Enter value for begin_snap: 886
…
Enter value for end_snap: 910
…
Enter value for hash_value: 4206207905
Generate SQL report for further study
Plans in shared pool between Begin and End Snap Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Shows the Execution Plans found in the shared pool between the begin and end
snapshots specified. The values for Rows, Bytes and Cost shown below are those
which existed at the time the first-ever snapshot captured this plan - these
values often change over time, and so may not be indicative of current values
-> Rows indicates Cardinality, PHV is Plan Hash Value
-> ordered by Plan Hash Value
--------------------------------------------------------------------------------
| Operation | PHV/Object Name | Rows | Bytes| Cost |
--------------------------------------------------------------------------------
|SELECT STATEMENT |----- 235857360 -----| | | 2021 |
|SORT UNIQUE | | 194 | 16K| 2021 |
| CONCATENATION | | | | |
SQL Trace on top SQL
• SET TIMING ON
• SET AUTOT[RACE] TRACE
• Statistics
• ----------------------------------------------------------
• 0 recursive calls
• 0 db block gets
• 159414 consistent gets
• 0 physical reads
• 0 redo size
• 146 bytes sent via SQL*Net to client
• 235 bytes received via SQL*Net from client
• 1 SQL*Net roundtrips to/from client
• 1 sorts (memory)
• 0 sorts (disk)
• 0 rows processed
DB Hung Issue
• It’s fatal fault
• Mostly, need work with OS & Oracle support team
• Run RDA in DB server host to collect information as soon
• Log SR in Oracle Metalink
Internal Error
• ORA-00600 to ORA-07445
• Mostly relate to Oracle internal bug
• Log SR in Metalink
• Send Trace file (under bdump folder) to Oracle
Lock Issue
• Top Event Enqueue
• Find out the TM, TX lock; Find out the holder
• Work with dev team to change the logic to reduce/release the locks
select * from DBA_WAITERS;
SELECT /*+ rule */ lpad(' ',decode(l.xidusn ,0,3,0))||l.oracle_username User_name,
o.owner, o.object_name, o.object_type, s.sid, s.serial#
FROM v$locked_object l,dba_objects o,v$session s
WHERE l.object_id=o.object_id
AND l.session_id=s.sid
ORDER BY o.object_id,xidusn DESC;
select sid,type,id1,id2,lmode,request,ctime,block from v$lock where type
in ('TM','TX');
• V$lock
• dba_waiters
• v$locked_objects
Dead Lock:
• It’s logical issue
• Two or more sessions locking each other
• Find out the related SQL from trace file
• Work with dev team to change the logic
DEADLOCK DETECTED
Current SQL statement for this session:
DELETE FROM CSS_SI_BL_DISTRIBUTION WHERE (SI_REQUEST_ID = :"SYS_B_0")
The following deadlock is not an ORACLE error. It is a
deadlock due to user error in the design of an application
or from issuing incorrect ad-hoc SQL. The following
information may aid in determining the deadlock:
Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TM-00005ed2-00000000 622 573 SX 156 226 SX SSX
TX-00060004-00063409 156 226 X 622 573 X
session 573: DID 0001-026E-00000012 session 226: DID 0001-009C-0000256C
session 226: DID 0001-009C-0000256C session 573: DID 0001-026E-00000012
Rows waited on:
Session 226: obj - rowid = 0000700D - AAAHANAC4AAAAAAAAA
(dictionary objn - 28685, file - 184, block - 0, slot - 0) -- CSS_SI_CARGO_FK4
Session 573: obj - rowid = 00005EB6 - AAAF62AC2AAApFQAAh
(dictionary objn - 24246, file - 182, block - 168272, slot - 33) -- CSS_SI_BL_DISTRIBUTION
Information on the OTHER waiting sessions:
Session 226:
pid=156 serial=25463 audsid=96195353 user: 33/CSSJAVA
O/S info: user: oracle, term: unknown, ospid: , machine: as04.cargosmart.com
program: JDBC Thin Client
application name: JDBC Thin Client, hash value=0
Current SQL Statement:
DELETE FROM CSS_SI_RELATED_BOOKING WHERE (SI_REQUEST_ID = :"SYS_B_0")
End of information on OTHER waiting sessions.
Other Issues
• Rollback segment can not be extend
• Snapshot of rollback segment is too old
Rollback segment can not be extend
• Long transaction involve much of undo log
• Find out the related SQL from the error information
• SQL tuning
• Split the transaction
• Transfer to batch insert / direct loading
• Extent rollback segment
Snapshot of rollback segment is too old
• Long transaction running, unable to get the consistent snapshot in rollback segment
• Find out the related SQL from the error information
• SQL Tuning
• Improve other much-undo consuming/holding transactions
• Setup special large segment for long transactions
• Extent rollback segment
Other OS commands for collect information
• sar
• iostat
• vmstat
• swapinfo
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/35489/viewspace-85032/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- trouble shooting案例分享
- 使用資料庫處理併發可能導致的問題資料庫
- 資料庫響應慢問題處理資料庫
- DNS查詢 and trouble-shooting caseDNS
- Trouble shooting latch: cache buffers chainsAI
- Trouble shooting for Pin S wait on XAI
- Trouble shooting(問題解決):centos 7 gnome show someting has gone wrong.CentOSGo
- 資料庫升級問題處理一則資料庫
- Oracle資料庫無效物件問題處理Oracle資料庫物件
- 資料處理--pandas問題
- DB2資料庫配置問題??DB2資料庫
- Oracle資料庫中的逐行處理問題NEOracle資料庫
- 近期處理的Oracle資料庫問題總結Oracle資料庫
- 一次OWB資料庫效能問題處理資料庫
- aix login shell permission error trouble shooting.AIError
- Oracle日常問題處理-資料庫無法啟動Oracle資料庫
- 資料庫主機重啟卡住問題處理分享資料庫
- Windows 下處理資料庫無法啟動問題Windows資料庫
- 資料庫如何處理大資料訪問資料庫大資料
- 記一次:歸檔檔案系統問題導致資料庫hang處理資料庫
- 【問題處理】因ASM磁碟組空間不足導致資料庫例項無法啟動的故障處理ASM資料庫
- 如何處理Oracle資料庫中的壞塊問題(轉)Oracle資料庫
- openGauss資料庫xlog目錄滿問題處理資料庫
- 有關分散式資料庫事務處理的問題分散式資料庫
- 一次資料庫不能歸檔問題的處理資料庫
- DB2資料庫連線失敗問題DB2資料庫
- X7一體機資料庫遷移問題處理資料庫
- 達夢資料庫日常管理之問題處理筆記1資料庫筆記
- python中多程式處理資料庫連線的問題Python資料庫
- 銀河麒麟系統安裝ORACLE資料庫問題處理Oracle資料庫
- 資料庫無響應問題的緊急處理和分析資料庫
- 一個關於資料庫閃回區問題的處理資料庫
- SQL Server資料庫中處理空值時常見問題SQLServer資料庫
- 【轉】 一次資料庫不能歸檔問題的處理資料庫
- 資料庫出現兩個Listener問題的官方處理方法資料庫
- 大資料處理需留意哪些問題大資料
- 應用使用JNDI,資料庫無法連線,導致的程序無法啟動問題處理資料庫
- ES同步Mysql資料庫(包括出現問題怎麼處理哦)MySql資料庫