在生產環境透過sql monitor監控到有一條sql執行效率很差。執行了大約5個小時，得到的sql monitor報告如下：

Global Information: EXECUTING

Instance ID

:

1

Buffer Gets

IO Requests

Database Time

Wait Activity

.

40M

.

17M

.

8450s

.

100%

Session

:

PRODUSER(14:22343)

SQL ID

:

fkzafpjs28d6d

SQL Execution ID

:

16777216

Execution Started

:

07/17/2014 12:02:17

First Refresh Time

:

07/17/2014 12:02:21

Last Refresh Time

:

07/17/2014 16:51:01

Duration

:

17328s

Module/Action

:

xxxxxxx (TNS V1-V3)/-

Service

:

PRODB

Program

:

NextPricePl@XXXX(TNS V1-V3)

Fetch Calls

:

2671

--》對應的sql語句如下：
select document.period_key,
document.cycle_seq_no,
document.ba_no,
document.customer_no,
bill_statement.pay_channel_no
from document, --千萬資料量 12671016 rows
cycle_control, --資料字典表，2118 rows
bill_statement, --千萬資料量 12671016 rows
cyc_payer_pop --百萬資料量 5400326 rows
where cycle_control.cycle_code = 2
and cycle_control.cycle_instance = 7
and cycle_control.cycle_year = 2014
and cyc_payer_pop.cycle_seq_no = cycle_control.cycle_seq_no
and cyc_payer_pop.db_status = 'BL'
and document.ba_no = cyc_payer_pop.ba_no
and document.cycle_seq_no = cyc_payer_pop.cycle_seq_no
and document.cycle_seq_run = cyc_payer_pop.cycle_seq_run
and document.period_key = cyc_payer_pop.period_key
and document.customer_key = cyc_payer_pop.customer_key
and document.doc_produce_ind in ('Y ', ' E ')
and document.document_status != ' N'
and bill_statement.ba_no = cyc_payer_pop.ba_no
and bill_statement.cycle_seq_no = document.cycle_seq_no
and bill_statement.cycle_seq_run = document.cycle_seq_run
and bill_statement.period_key = cyc_payer_pop.period_key
and bill_statement.customer_key = cyc_payer_pop.customer_key
and bill_statement.document_seq_no = document.doc_seq_no
可以透過執行計劃看到，效能的瓶頸主要在兩個地方，一個是做了全表掃描的部分表CYC_PAYER_POP，另外一個就是CPU資源的過度消耗，表DOCUMENT
SQL Plan Monitoring Details (Plan Hash Value=1606258714)

Id

Operation

Name

Estimated
Rows

Cost

Active Period
(17328s)

Execs

Rows

Memory

Temp

IO Requests

CPU Activity

Wait Activity

Progress

->

0

SELECT STATEMENT

.

1

270K

.

.12%

.

->

1

. NESTED LOOPS

.

1

270K

.

->

2

.. NESTED LOOPS

.

1

16500

.

1

270K

.

.12%

.

->

3

... NESTED LOOPS

.

23

16497

.

1

270K

.

->

4

.... NESTED LOOPS

.

56

16441

.

1

270K

.

5

..... TABLE ACCESS BY INDEX ROWID

CYCLE_CONTROL

1

.

1

.

->

6

...... INDEX UNIQUE SCAN

CYCLE_CONTROL_1UQ

1

.

1

.

->

7

.....PARTITION RANGE ALL

.

56

16440

.

1

270K

.

->

8

...... TABLE ACCESS FULL

CYC_PAYER_POP

56

16440

.

171

270K

.

4925 (<0.1%)

.12%

.07%

2859s

->

9

....PARTITION RANGE ITERATOR

.

1

.

270K

.

.35%

.

->

10

..... TABLE ACCESS BY LOCAL INDEX ROWID

DOCUMENT

1

.

270K

.

17M (98%)

.

96%

.

96%

.

->

11

...... INDEX RANGE SCAN

DOCUMENT_1IX

10

1

.

270K

2M

.

261K (1.5%)

.

1.5%

.

1.9%

.

->

12

...PARTITION RANGE ITERATOR

.

1

.

301K

270K

.

.47%

.

->

13

.... INDEX UNIQUE SCAN

BILL_STATEMENT_1IX

1

.

301K

270K

.

56807 (.3%)

.70%

.

1.3%

.

->

14

.. TABLE ACCESS BY LOCAL INDEX ROWID

BL1_BILL_STATEMENT

1

.

324K

270K

.

17599 (.1%)

.47%

.23%

.

可以看到CYC_PAYER_POP 做了全表掃描，估算出的資料條數是56條。
而絕大多數的IO都是在DOCUMENT,IO請求達1700萬次，大約是128G的資料量，而且還不停的走索引做資料查取。
奇怪的是估算的資料返回量和實際執行的資料返回差距太大,返回結果大約有27萬條。

->

8

...... TABLE ACCESS FULL

CYC_PAYER_POP

56

16440

.

171

270K

.

4925 (<0.1%)

但是第8步的資料得到。一次56條資料返回* 執行171次=9.5k 但是實際的返回結果得到了270K,差距實在太大。這是執行計劃不正確導致的。

明確了上面一步，下面DOCUMENT表做了1700萬次的io查詢就可以理解了，這些都是連帶的問題。

從千萬的資料中得到27萬的資料，還是很小的資料範圍。
首先排查得到從資料字典表cycle_control中根據如下的條件，得到的資料只有一行。這和執行計劃是一致的。
cycle_control.cycle_code = 2
and cycle_control.cycle_instance = 7
and cycle_control.cycle_year = 2014

因為document和bill_statement都是千萬資料量的大表。所以在和它們兩個表做資料關聯的時候應該儘可能的過濾掉大部分資料。因為資料最終的返回只有27萬，相對千萬的資料還是很小的一部分。從百萬的資料中走全表掃描還是不小的消耗，看看能不能從索引的角度入手。

索引資訊如下：
INDEX_NAME TABLESPACE INDEX_TYPE UNIQUENES PAR COLUMN_LIST TABLE_TYPE STATUS NUM_ROWS LAST_ANAL
------------------------------ ---------- ---------- --------- --- ------------------------------ ---------- ------ ---------- ---------
CYC_PAYER_POP_1IX NORMAL NONUNIQUE YES CUSTOMER_NO TABLE N/A 5320775 16-JUL-14 N
CYC_PAYER_POP_2IX NORMAL NONUNIQUE YES CONFIRM_GROUP,CYCLE_SEQ_NO TABLE N/A 5642000 16-JUL-14 N
CYC_PAYER_POP_3IX NORMAL NONUNIQUE YES FORMAT_EXT_GROUP,CYCLE_SEQ_NO, TABLE N/A 5623545 16-JUL-14 N
DB_STATUS
CYC_PAYER_POP_4IX NORMAL NONUNIQUE YES GROUP_ID,CYCLE_SEQ_NO TABLE N/A 5142606 16-JUL-14 N
CYC_PAYER_POP_5IX NORMAL NONUNIQUE YES QA_GROUP,CYCLE_SEQ_NO TABLE N/A 5776258 16-JUL-14 N
CYC_PAYER_POP_PK NORMAL UNIQUE YES BA_NO,CYCLE_SEQ_NO,PERIOD_KEY, TABLE N/A 5368484 16-JUL-14 N

碰巧的是在資料的連線條件和輸出列中，都是和主鍵相關的一些列。這樣就可以考慮透過hint來啟用索引了。當然啟用索引也有一定的標準，在這個查詢中。
透過索引和過濾條件查到的資料有不到30萬，資料量是500多萬，佔到的資料比例不到10%，是可以考慮啟用索引的。如果資料結果集較大，啟用索引反而不利於資料的查詢速度。

明確了這一點，我嘗試把CYC_PAYER_POP的查詢和資料字典表結合起來，過濾掉絕大部分資料。形成一個子查詢。
在子查詢中，啟用了hint來強制查詢按照計劃的順序和索引來執行。
(select /*+ leading (c p) index(p CYC_PAYER_POP_PK)*/
p.ba_no,p.cycle_seq_no,p.cycle_seq_run,p.period_key,p.customer_key
from cyc_payer_pop p, cycle_control c
where c.cycle_code = 2
and c.cycle_instance = 7
and c.cycle_year = 2014
and p.cycle_seq_no = c.cycle_seq_no
and p.db_status = 'BL' ) cyc_payer_pop

然後在這個基礎上，再和兩個大表做關聯，

最佳化後的sql語句如下：
select /*+ leading( cyc_payer_pop bill_statement document)*/
document.period_key,
document.cycle_seq_no,
document.ba_no,
document.customer_no,
bill_statement.pay_channel_no
from document, --千萬資料量 12671016 rows
bill_statement ,--千萬資料量 12671016 rows
(select /*+ leading (c p) index(p CYC_PAYER_POP_PK)*/
p.ba_no,p.cycle_seq_no,p.cycle_seq_run,p.period_key,p.customer_key
from cyc_payer_pop p, cycle_control c
where c.cycle_code = 2
and c.cycle_instance = 7
and c.cycle_year = 2014
and p.cycle_seq_no = c.cycle_seq_no
and p.db_status = 'BL' ) cyc_payer_pop
where
and document.ba_no = cyc_payer_pop.ba_no
and document.cycle_seq_no = cyc_payer_pop.cycle_seq_no
and document.cycle_seq_run = cyc_payer_pop.cycle_seq_run
and document.period_key = cyc_payer_pop.period_key
and document.customer_key = cyc_payer_pop.customer_key
and document.doc_produce_ind in ('Y ', ' E ')
and document.document_status != ' N'
and bill_statement.ba_no = cyc_payer_pop.ba_no
and bill_statement.cycle_seq_no = document.cycle_seq_no
and bill_statement.cycle_seq_run = document.cycle_seq_run
and bill_statement.period_key = cyc_payer_pop.period_key
and bill_statement.customer_key = cyc_payer_pop.customer_key
and bill_statement.document_seq_no = document.doc_seq_no

最佳化後的執行計劃如下。document表的io請求數從1700萬次，降低到了將近8萬次。解決了效能瓶頸。
SQL Plan Monitoring Details (Plan Hash Value=1573871804)

Id

Operation

Name

Estimated
Rows

Cost

Active Period
(247s)

Execs

Rows

Memory
(Max)

Temp
(Max)

IO Requests

CPU Activity

Wait Activity

.

0

SELECT STATEMENT

.

1

291K

.

1

. NESTED LOOPS

.

1

291K

.

2

.. NESTED LOOPS

.

1

5406

.

1

291K

.

3

... NESTED LOOPS

.

27

5403

.

1

291K

.

4

.... NESTED LOOPS

.

56

5347

.

1

291K

.

5

..... TABLE ACCESS BY INDEX ROWID

CYCLE_CONTROL

1

.

1

.

6

...... INDEX UNIQUE SCAN

CYCLE_CONTROL_1UQ

1

.

1

.

7

.....PARTITION RANGE ALL

.

56

5346

.

1

291K

.

8

...... TABLE ACCESS BY LOCAL INDEX ROWID

CYC_PAYER_POP

56

5346

.

181

291K

.

10002 (8.3%)

.

20%

.

4.6%

.

9

.......INDEX FULL SCAN

CYC_PAYER_POP_PK

29672

2540

.

181

479K

.

4673 (3.9%)

.

6.1%

.

10

....PARTITION RANGE ITERATOR

.

1

.

291K

.

11

..... TABLE ACCESS BY LOCAL INDEX ROWID

DOCUMENT

1

.

291K

.

79597 (66%)

.

52%

.

68%

.

12

...... INDEX RANGE SCAN

DOCUMENT_1IX

10

1

.

291K

2M

.

3877 (3.2%)

.

8.0%

.

2.6%

.

13

...PARTITION RANGE ITERATOR

.

1

.

294K

291K

.

14

.... INDEX UNIQUE SCAN

BILL_STATEMENT_1IX

1

.

294K

291K

.

4574 (3.8%)

.

12%

.

3.6%

.

15

.. TABLE ACCESS BY LOCAL INDEX ROWID

BILL_STATEMENT

1

.

308K

291K

.

15323 (12%)

.

8.0%

.

14%

當然了關鍵的還是查取速度。
查詢速度也從原本的5個小時降低到了4-5分鐘。
在測試和生產環境中實際執行的速度和預期是一致的。
290896 rows selected.
Elapsed: 00:04:08.04

291001 rows selected.
Elapsed: 00:05:08.66

生產環境sql語句調優實戰第二篇

相關文章