[20120214]異常資料導致執行計劃改變.txt

lfree發表於2012-02-15
今天上午,使用者反應一條sql執行有點慢。我檢查發現,原來使用索引的語句現在變成了全表掃描,而且昨晚oracle資料庫自動分析過這個表。

語句很複雜,抽取有問題的部分:

SELECT *
  FROM med_operation_schedule a
 WHERE (       scheduled_date_time >= TO_DATE ('2012-02-15 00:00', 'yyyy-mm-dd hh24:mi')
           AND scheduled_date_time < TO_DATE ('2012-02-15 23:59', 'yyyy-mm-dd hh24:mi')
        OR scheduled_date_time IS NULL
        OR NVL (emergency_indicator, 0) = 1 AND scheduled_date_time > TRUNC (SYSDATE, 'dd')
       )
執行計劃是全表掃描。把那個or單獨拆開來分析,發現這個條件走的是全表掃描NVL (emergency_indicator, 0) = 1 AND scheduled_date_time > TRUNC (SYSDATE, 'dd')。奇怪!這個條件scheduled_date_time > TRUNC (SYSDATE, 'dd')的記錄不會很多。

select * from med_operation_schedule a where a.scheduled_date_time > TRUNC (SYSDATE, 'dd');

這才發現原來裡面存在一條scheduled_date_time='5011-7-17 16:30:00' 異常記錄。

這樣造成最佳化器認為大於TRUNC (SYSDATE, 'dd')的記錄不會很多,執行計劃選擇全表掃描。

解決方法:
1.要求操作員更正資料,再分析表,這個不能保證以後不再出現,或者程式要做必要的檢查,不能輸入這樣的日期。
2.在該欄位建立直方圖,不過10g很麻煩,後臺的分析Method_Opt=> 'FOR ALL COLUMNS SIZE AUTO ',這樣可能下一次分析直方圖又會被取消。看來自己該修改自動分析的預設引數為Method_Opt=> 'FOR ALL COLUMNS SIZE REPEAT'
3.我實在不想跟他們提,我選擇的方法是修改統計資訊。

方法如下:
1.取出表定義:
exp system/xxxx@yyyy tables=(zzzz.med_operation_schedule) rows=N file=med.dmp

2.過濾出指令碼:[注意要加-3引數,具體看man strings文件,不然會丟失資訊]
strings -3 med.dmp > med.txt

3.找到如下內容:

DECLARE  SREC DBMS_STATS.STATREC; BEGIN SREC.MINVAL := '303231363032'; SREC.MAXVAL := '303231363036'; SREC.EAVS := 0; SREC.CHVALS := NULL; #
SREC.NOVALS := DBMS_STATS.NUMARRAY(I
250248268640273000000000000000000000,250248268640292000000000000000000000&
); SREC.BKVALS := DBMS_STATS.NUMARRAY(
0,1
); SREC.EPC := 2; DBMS_STATS.SET_COLUMN_STATS(NULL,'"MED_OPERATION_SCHEDULE"','"OPERATING_ROOM"', NULL ,NULL,NULL,4,.25,0,srec,7,6); END;

--刪除一些怪異的字元,重新排版,這樣可以很好檢查是否寫錯!
DECLARE
   srec   DBMS_STATS.statrec;
BEGIN
   srec.minval := '786E0B1E090101';
   srec.maxval := '966F0711111F01';
   srec.eavs := 0;
   srec.chvals := NULL;
   srec.novals := DBMS_STATS.numarray (2455531.33333333, 3551487.6875);
   srec.bkvals := DBMS_STATS.numarray (0,1);
   srec.epc := 2;
   DBMS_STATS.set_column_stats (NULL,
                                '"MED_OPERATION_SCHEDULE"',
                                '"SCHEDULED_DATE_TIME"',
                                NULL,
                                NULL,
                                NULL,
                                7555,
                                .000132362673726009,
                                0,
                                srec,
                                8,
                                6
                               );
END;

4.最大最小如何修改呢?需要了解srec.minval以及srec.maxval轉換。google找到如下連結:


SELECT column_name, data_type, low_value,high_value, density,rtrim(
               to_char(100*(to_number(substr(low_value,1,2),'XX')-100)
                      + (to_number(substr(low_value,3,2),'XX')-100),'fm0000')||'-'||
               to_char(to_number(substr(low_value,5,2),'XX'),'fm00')||'-'||
               to_char(to_number(substr(low_value,7,2),'XX'),'fm00')||' '||
               to_char(to_number(substr(low_value,9,2),'XX')-1,'fm00')||':'||
               to_char(to_number(substr(low_value,11,2),'XX')-1,'fm00')||':'||
               to_char(to_number(substr(low_value,13,2),'XX')-1,'fm00')) l,
               rtrim(
               to_char(100*(to_number(substr(high_value,1,2),'XX')-100)
                      + (to_number(substr(high_value,3,2),'XX')-100),'fm0000')||'-'||
               to_char(to_number(substr(high_value,5,2),'XX'),'fm00')||'-'||
               to_char(to_number(substr(high_value,7,2),'XX'),'fm00')||' '||
               to_char(to_number(substr(high_value,9,2),'XX')-1,'fm00')||':'||
               to_char(to_number(substr(high_value,11,2),'XX')-1,'fm00')||':'||
               to_char(to_number(substr(high_value,13,2),'XX')-1,'fm00')) h
  FROM dba_tab_cols
 WHERE table_name = 'MED_OPERATION_SCHEDULE' AND column_name = 'SCHEDULED_DATE_TIME'

COLUMN_NAME,DATA_TYPE,LOW_VALUE,HIGH_VALUE,DENSITY,L,H
SCHEDULED_DATE_TIME,DATE,786E0B1E090101,966F0711111F01,0.000132362673726009,2010-11-30 08:00:00,5011-07-17 16:30:00,

--比較麻煩。放棄這樣算的方法!

5.採用建立一個表的方法,在測試庫建立:
export NLS_DATE_FORMAT="YYYY-MM-DD HH24:MI:SS"

SQL> create table t(vd date);
Table created.

SQL> insert into t values('2010-11-30 08:00:00');
1 row created.

SQL> insert into t values('2012-2-15 18:00:00');
1 row created.

SQL> commit;
Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=> 't', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');
PL/SQL procedure successfully completed.

SQL> column column_name format a10
SQL> column data_type format a10
SQL> column lower_value format a20
SQL> column high_value format a20
SQL> SELECT column_name, data_type, low_value, high_value, density   FROM dba_tab_cols WHERE table_name = 'T' AND column_name = 'VD';
COLUMN_NAM DATA_TYPE  LOW_VALUE                                                        HIGH_VALUE              DENSITY
---------- ---------- ---------------------------------------------------------------- -------------------- ----------
VD         DATE       786E0B1E090101                                                   7870020F130101               .5

HIGH_VALUE=7870020F130101

這個如何修改呢?srec.novals := DBMS_STATS.numarray (2455531.33333333, 3551487.6875);
還是使用emp的方法(略),與上面相同:

SREC.NOVALS := DBMS_STATS.NUMARRAY(2455531.33333333,2455973.75)

最後修改如下:

DECLARE
   srec   DBMS_STATS.statrec;
BEGIN
   srec.minval := '786E0B1E090101';
   srec.maxval := '7870020F130101';
   srec.eavs := 0;
   srec.chvals := NULL;
--   srec.novals := DBMS_STATS.numarray (2455531.33333333, 3551487.6875);
   SREC.NOVALS := DBMS_STATS.NUMARRAY(2455531.33333333,2455973.75);
   srec.bkvals := DBMS_STATS.numarray (0,1);
   srec.epc := 2;
   DBMS_STATS.set_column_stats (NULL,
                                '"MED_OPERATION_SCHEDULE"',
                                '"SCHEDULED_DATE_TIME"',
                                NULL,
                                NULL,
                                NULL,
                                7555,
                                .000132362673726009,
                                0,
                                srec,
                                8,
                                6
                               );
END;

6.在測試上面的sql語句,發現可以使用索引了。只不過加了bitmap convert+bitmap or操作。

7.鎖定以後不分析表。
BEGIN
  SYS.DBMS_STATS.LOCK_TABLE_STATS (
      OwnName        => 'MEDSURGERY'
     ,TabName        => 'MED_OPERATION_SCHEDULE');
END;
/

整個最佳化完成!

8.BTW最終沒有選擇這樣的方式,我還是建立了直方圖。僅僅是為了學習!

9.補充學習:
關於SREC.NOVALS := DBMS_STATS.NUMARRAY(2455531.33333333,2455973.75);裡面的數字,表示的是Julian format, ie number of days since 1st Jan 4712BC.


SQL> select 2455973.75+to_date(1,'J')-1 from dual;

2455973.75+TO_DATE
-------------------
2012-02-15 18:00:00

SQL>  select TO_DATE(TRUNC(2455973.75),'J')+(2455973.75-TRUNC(2455973.75)) from dual;
TO_DATE(TRUNC(24559
-------------------
2012-02-15 18:00:00

反過來呢?如果知道日期如何轉換呢?'J'僅僅取到整數部分。修改如下:

SQL> select to_char(to_date('2012-02-15 18:00:00','YYYY-MM-DD HH24:MI:SS') ,'J')+to_date('2012-02-15 18:00:00','YYYY-MM-DD HH24:MI:SS')
-trunc(to_date('2012-02-15 18:00:00','YYYY-MM-DD HH24:MI:SS')) x from dual
SQL> /
         X
----------
2455973.75

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/267265/viewspace-716298/,如需轉載,請註明出處,否則將追究法律責任。

相關文章