DB Issue Trouble Shooting Guideline

DB Issues :

• Performance Issues

• DB Hung

• Internal Error Raised

• Lock / Dead Lock

• Other Issues

Performance Issues :

• Most of (more 90%) performance issues are caused by Application.

• Few (less than 10%) of issues are caused by resource limitation, OS/RDBMS configuration and other reasons.

Work flow of identifying/solving Performance Issues.

• Identifying and Solving : DB Server , AP Server, Middle Ware, Client Program

Latent Reasons

• Bottleneck in Client Program

• PLSQL Issue

• Unreasonable configuration in RDBMS

• Network Issue

• Memory Issue

• CPU Issue

• IO Issue

Top Processes

• TOP command in UNIX

• Glance command in HP-UX

• Task Manager in Windows

Top Sessions in DB

• Find out the PID of top process from OS

• In dedicated server, find out the mapping session info in DB by the PID

select b.spid, a.sid, a.username, s.sql_text from v$session a , v$process b, v$sqlarea s where a.PADDR = b.ADDR and a.sql_hash_value = s.hash_value and b.spid=&PID;

Find out the long operation in the transaction

select username,sid,opname,

round ( sofar *100 / totalwork,0)||'%' as progress,

time_remaining,sql_text

from v$session_longops , v$sqlarea

where time_remaining <> 0

and sql_address = address

and sql_hash_value = hash_value;

Related View:

• V$PROCESS

• V$SESSION

• V$SQLAREA

• v$session_longops

Find more from Statspack Report

• Generate the report during peak time

• Analyze the report

Important Sections in the report

• Instance Efficiency Percentages

• Top 5 Wait Events

• SQL ordered by Gets for DB

• SQL ordered by Reads for DB

• Tablespace IO Stats for DB & File IO Stats for DB

Instance Efficiency Percentages

• Buffer Nowait Ratio

• Buffer Hit Ratio

• Library Hit Ratio

• Redo no-wait Ratio

• In-memory Sort Ratio

Instance Efficiency Percentages

Instance Efficiency Percentages (Target 100%)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Buffer Nowait %: 100.00 Redo NoWait %: 100.00

Buffer Hit %: 99.08 In-memory Sort %: 99.60

Library Hit %: 99.46 Soft Parse %: 98.99

Execute to Parse %: 45.61 Latch Hit %: 100.00

Parse CPU to Parse Elapsd %: 87.88 % Non-Parse CPU: 100.00

Top 5 Wait Events:

• DB File Scattered Read.

• DB File Sequential Read

• Free Buffer

• Enqueue

• Log Buffer Space

• Log File Sync

• log file parallel write

Top 5 Wait Events

Top 5 Timed Events

~~~~~~~~~~~~~~~~~~ % Total

Event Waits Time (s) Ela Time

-------------------------------------------- ------------ ----------- --------

db file sequential read 7,544,366 25,591 48.49

CPU time 14,126 26.76

log file sync 50,593 6,241 11.82

db file scattered read 1,473,891 2,154 4.08

log file parallel write 60,964 1,546 2.93

-------------------------------------------------------------

SQL ordered by Gets for DB

SQL ordered by Gets for DB: ICSSPRD Instance: icssprd Snaps: 179 -181

-> End Buffer Gets Threshold: 10000

-> Note that resources reported for PL/SQL includes the resources used by

all SQL statements called within the PL/SQL code. As individual SQL

statements are also reported, it is possible and valid for the summed

total % to exceed 100

CPU Elapsd

Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value

--------------- ------------ -------------- ------ -------- --------- ----------

91,163,470 567 160,782.1 19.2 850.46 949.33 1513226087

Module: SQL*Plus

SELECT DISTINCT b.filename

|| :"SYS_B_00" FROM css_tp_transaction_event a

,css_tp_txn_event_activity b WHERE a.trdng_ptnr_id

= :"SYS_B_01" AND a.event_type_id in (:"SYS

_B_02",:"SYS_B_03") AND a.txn_event_id = b.txn_ev

SQL ordered by Reads for DB:

SQL ordered by Reads for DB: ICSSPRD Instance: icssprd Snaps: 179 -181

-> End Disk Reads Threshold: 1000

CPU Elapsd

Physical Reads Executions Reads per Exec %Total Time (s) Time (s) Hash Value

--------------- ------------ -------------- ------ -------- --------- ----------

11,993,968 120 99,949.7 36.2 443.71 552.02 4292891611

Module: JDBC Connect Client

BEGIN CSSPLSQL.B2B_MSG_DETECTOR.GET_B2B_OUT_MSG_DETECTOR ( :V1,

:V2, :V3); END;

Tablespace IO Stats for DB & File IO Stats for DB

Tablespace IO Stats for DB: ICSSPRD Instance: icssprd Snaps: 179 -181

->ordered by IOs (Reads + Writes) desc

Tablespace

------------------------------

Av Av Av Av Buffer Av Buf

Reads Reads/s Rd(ms) Blks/Rd Writes Writes/s Waits Wt(ms)

-------------- ------- ------ ------- ------------ -------- ---------- ------

CSS_BKG_DATA

1,699,930 472 4.3 4.3 109,215 30 5,034 10.9

Generate SQL report for further study

SQL> @?rdbmsadminsprepsql.sql

……

Enter value for begin_snap: 886

…

Enter value for end_snap: 910

…

Enter value for hash_value: 4206207905

Generate SQL report for further study

Plans in shared pool between Begin and End Snap Ids

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Shows the Execution Plans found in the shared pool between the begin and end

snapshots specified. The values for Rows, Bytes and Cost shown below are those

which existed at the time the first-ever snapshot captured this plan - these

values often change over time, and so may not be indicative of current values

-> Rows indicates Cardinality, PHV is Plan Hash Value

-> ordered by Plan Hash Value

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------

|SELECT STATEMENT |----- 235857360 -----| | | 2021 |

|SORT UNIQUE | | 194 | 16K| 2021 |

SQL Trace on top SQL

• SET TIMING ON

• SET AUTOT[RACE] TRACE

• Statistics

• ----------------------------------------------------------

• 0 recursive calls

• 0 db block gets

• 159414 consistent gets

• 0 physical reads

• 0 redo size

• 146 bytes sent via SQL*Net to client

• 235 bytes received via SQL*Net from client

• 1 SQL*Net roundtrips to/from client

• 1 sorts (memory)

• 0 sorts (disk)

• 0 rows processed

DB Hung Issue

• It’s fatal fault

• Mostly, need work with OS & Oracle support team

• Run RDA in DB server host to collect information as soon

• Log SR in Oracle Metalink

Internal Error

• ORA-00600 to ORA-07445

• Mostly relate to Oracle internal bug

• Log SR in Metalink

• Send Trace file (under bdump folder) to Oracle

Lock Issue

• Top Event Enqueue

• Find out the TM, TX lock; Find out the holder

• Work with dev team to change the logic to reduce/release the locks

select * from DBA_WAITERS;

SELECT /*+ rule */ lpad(' ',decode(l.xidusn ,0,3,0))||l.oracle_username User_name,

o.owner, o.object_name, o.object_type, s.sid, s.serial#

FROM v$locked_object l,dba_objects o,v$session s

WHERE l.object_id=o.object_id

AND l.session_id=s.sid

ORDER BY o.object_id,xidusn DESC;

select sid,type,id1,id2,lmode,request,ctime,block from v$lock where type

in ('TM','TX');

• V$lock

• dba_waiters

• v$locked_objects

Dead Lock:

• It’s logical issue

• Two or more sessions locking each other

• Find out the related SQL from trace file

• Work with dev team to change the logic

DEADLOCK DETECTED

Current SQL statement for this session:

DELETE FROM CSS_SI_BL_DISTRIBUTION WHERE (SI_REQUEST_ID = :"SYS_B_0")

The following deadlock is not an ORACLE error. It is a

deadlock due to user error in the design of an application

or from issuing incorrect ad-hoc SQL. The following

information may aid in determining the deadlock:

Deadlock graph:

---------Blocker(s)-------- ---------Waiter(s)---------

Resource Name process session holds waits process session holds waits

TM-00005ed2-00000000 622 573 SX 156 226 SX SSX

TX-00060004-00063409 156 226 X 622 573 X

session 573: DID 0001-026E-00000012 session 226: DID 0001-009C-0000256C

session 226: DID 0001-009C-0000256C session 573: DID 0001-026E-00000012

Rows waited on:

Session 226: obj - rowid = 0000700D - AAAHANAC4AAAAAAAAA

(dictionary objn - 28685, file - 184, block - 0, slot - 0) -- CSS_SI_CARGO_FK4

Session 573: obj - rowid = 00005EB6 - AAAF62AC2AAApFQAAh

(dictionary objn - 24246, file - 182, block - 168272, slot - 33) -- CSS_SI_BL_DISTRIBUTION

Information on the OTHER waiting sessions:

Session 226:

pid=156 serial=25463 audsid=96195353 user: 33/CSSJAVA

O/S info: user: oracle, term: unknown, ospid: , machine: as04.cargosmart.com

program: JDBC Thin Client

application name: JDBC Thin Client, hash value=0

Current SQL Statement:

DELETE FROM CSS_SI_RELATED_BOOKING WHERE (SI_REQUEST_ID = :"SYS_B_0")

End of information on OTHER waiting sessions.

Other Issues

• Rollback segment can not be extend

• Snapshot of rollback segment is too old

Rollback segment can not be extend

• Long transaction involve much of undo log

• Find out the related SQL from the error information

• SQL tuning

• Split the transaction

• Transfer to batch insert / direct loading

• Extent rollback segment

Snapshot of rollback segment is too old

• Long transaction running, unable to get the consistent snapshot in rollback segment

• Find out the related SQL from the error information

• SQL Tuning

• Improve other much-undo consuming/holding transactions

• Setup special large segment for long transactions

• Extent rollback segment

Other OS commands for collect information

• sar

• iostat

• vmstat

• swapinfo

DB Issue Trouble Shooting Guideline (資料庫問題處理指導)

相關文章