MAX or MIN of Indexed Column

shiyihai發表於2007-07-26

最近在檢視一資料庫主機的效能時意外發現一個程式佔用的系統資源特別高,並且持續佔用著。
Load averages: 2.02, 1.87, 2.09
538 processes: 530 sleeping, 8 running
Cpu states:
CPU LOAD USER NICE SYS IDLE BLOCK SWAIT INTR SSYS
0 1.83 54.0% 0.0% 7.1% 38.9% 0.0% 0.0% 0.0% 0.0%
1 2.21 35.4% 0.0% 7.5% 57.1% 0.0% 0.0% 0.0% 0.0%
--- ---- ----- ----- ----- ----- ----- ----- ----- -----
avg 2.02 44.7% 0.0% 7.3% 48.0% 0.0% 0.0% 0.0% 0.0%

Memory: 2677008K (431320K) real, 4241176K (651432K) virtual, 320544K free Page# 1/16

CPU TTY PID USERNAME PRI NI SIZE RES STATE TIME %WCPU %CPU COMMAND
0 ? 20209 oracle 154 20 2211M 2540K sleep 5:50 59.13 59.03 oracleora9i
0 ? 20491 oracle 235 20 2211M 2540K run 0:50 11.69 11.67 oracleora9i
1 ? 20288 oracle 154 20 2211M 2836K sleep 0:17 3.76 3.76 oracleora9i
0 ? 20062 oracle 154 20 2211M 2848K sleep 0:39 3.74 3.74 oracleora9i
1 ? 20290 oracle 154 20 2211M 2772K sleep 0:19 3.56 3.56 oracleora9i

[@more@]

程式id為20209
SQL>
SQL> select ses.sid
2 from v$session ses,v$process pro
3 where pro.spid=&spid
4 and ses.paddr=pro.addr;

SID
----------
882
得到對應的session的id為882,於是可以知道是那個使用者在執行什麼語句,見如下的查詢:
SQL> select * from v$session where sid=882;

SADDR SID SERIAL# AUDSID PADDR USER# USERNAME COMMAND OWNERID TADDR LOCKWAIT STATUS SERVER SCHEMA# SCHEMANAME OSUSER PROCESS MACHINE TERMINAL PROGRAM TYPE SQL_ADDRESS SQL_HASH_VALUE PREV_SQL_ADDR PREV_HASH_VALUE MODULE MODULE_HASH ACTION ACTION_HASH CLIENT_INFO FIXED_TABLE_SEQUENCE ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK# ROW_WAIT_ROW# LOGON_TIME LAST_CALL_ET PDML_ENABLED FAILOVER_TYPE FAILOVER_METHOD FAILED_OVER RESOURCE_CONSUMER_GROUP PDML_STATUS PDDL_STATUS PQ_STATUS CURRENT_QUEUE_DURATION CLIENT_IDENTIFIER
---------------- ---------- ---------- ---------- ---------------- ---------- ------------------------------ ---------- ---------- ---------------- ---------------- -------- --------- ---------- ------------------------------ ------------------------------ ------------ ---------------------------------------------------------------- ------------------------------ ------------------------------------------------ ---------- ---------------- -------------- ---------------- --------------- ------------------------------------------------ ----------- -------------------------------- ----------- ---------------------------------------------------------------- -------------------- ------------- -------------- --------------- ------------- ----------- ------------ ------------ ------------- --------------- ----------- -------------------------------- ----------- ----------- --------- ---------------------- ----------------------------------------------------------------
C0000000B0E994F0 882 29585 2134187 C0000000B0B38768 202 CEMPUSH 3 2147483644 ACTIVE DEDICATED 202 CEMPUSH 1234 sv378 USER C0000000BB4BEBA0 3128267399 C0000000BB4BEBA0 3128267399 0 0 6670189 0 162 161 0 2007-7-26 1 10 NO NONE NONE NO DEFAULT_CONSUMER_GROUP DISABLED ENABLED ENABLED 0

SQL>
SQL>
SQL> select sql_text
2 from v$sqltext_with_newlines
3 where (hash_value,address) in (select sql_hash_value,sql_address from v$session where sid=&sid) order by address,piece;

SQL_TEXT
----------------------------------------------------------------
select max(id) from CEMMESSAGE where requesttime between to_dat
e(to_char(sysdate-1,'yyyymmdd')||'000000','yyyy-mm-dd hh24:mi:ss
') and to_date(to_char(sysdate,'yyyymmdd')||'235959','yyyy-mm-d
d hh24:mi:ss') and SendHostIP = :1 and id >= :2

這裡我們可以看到,CEMPUSH這個使用者一直在執行上述的這個語句。經分析CEMPUSH為分割槽表,在id列上有local nonprefixed的index。如下的查詢結果多少有點意外:
SQL>
SQL> set timing on
SQL>
SQL> select max(id) from CEMMESSAGE
2 where requesttime between to_date(to_char(sysdate-1,'yyyymmdd')||'000000','yyyy-mm-dd hh24:mi:ss')
3 and to_date(to_char(sysdate,'yyyymmdd')||'235959','yyyy-mm-dd hh24:mi:ss')
4 and SendHostIP = '10.1.3.39' and id >= 263441;

MAX(ID)
----------
289327

Executed in 2.796 seconds

SQL> select max(id) from CEMMESSAGE;

MAX(ID)
----------
289327

Executed in 0.015 seconds
明細第二個求max的語句執行效率比第一個快N倍!看執行計劃結果如下:
SQL>
可看到
SQL> explain plan for
2 select max(id) from CEMMESSAGE
3 where requesttime between to_date(to_char(sysdate-1,'yyyymmdd')||'000000','yyyy-mm-dd hh24:mi:ss')
4 and to_date(to_char(sysdate,'yyyymmdd')||'235959','yyyy-mm-dd hh24:mi:ss')
5 and SendHostIP = '10.1.3.39' and id >= 263441;

Explained

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 34
| 1 | SORT AGGREGATE | | 1 | 34
|* 2 | FILTER | | |
| 3 | PARTITION RANGE ITERATOR | | |
|* 4 | TABLE ACCESS BY LOCAL INDEX ROWID| CEMMESSAGE | 1 | 34
|* 5 | INDEX RANGE SCAN | INDEX_CEMMESSAGE | 1797 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_DATE(TO_CHAR(
hh24:mi:ss')<=TO_DATE(TO_CHAR(
4 - filter("CEMMESSAGE"."REQUESTTIME">=TO_DATE(TO_CHAR(
hh24:mi:ss') AND "CEMMESSAGE"."REQUESTTIME"<=TO_DATE(TO_CHAR(SYSDA
hh24:mi:ss') AND "CEMMESSAGE"."SENDHOSTIP"='10.1.3.39')

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
5 - access("CEMMESSAGE"."ID">=263441)
Note: cpu costing is off

23 rows selected

SQL>
SQL> explain plan for select max(id) from cemmessage;

Explained

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 4
| 1 | SORT AGGREGATE | | 1 | 13 |
| 2 | PARTITION RANGE ALL | | | |
| 3 | INDEX FULL SCAN (MIN/MAX)| INDEX_CEMMESSAGE | 199K| 2534K| 4
--------------------------------------------------------------------------------
Note: cpu costing is off

11 rows selected

SQL>

可以看到,第二個語句走的是INDEX FULL SCAN (MIN/MAX)和PARTITION RANGE ALL,而第一個語句走的是INDEX RANGE SCAN和PARTITION RANGE ITERATOR。透過查詢文件瞭解到max函式時(max的列有對應的索引),是沿著最右邊的root-branch node-leaf node來查詢的,發現最右邊的leaf block是空的,於是沿著逆向指標往左走,一直走到最左邊發現都是空的,於是掃描了所有的leaf blocks。min的原理同max,採取類似的做法。如下是oracle的描述:
This access path is available for a SELECT statement, and all of the following conditions are true:
The query uses the MAX or MIN function to select the maximum or minimum value of either the column of a single-column index or the leading column of a composite index. The index cannot be a cluster index. The argument to the MAX or MIN function can be any expression involving the column, a constant, or the addition operator (+), the concatenation operation (||), or the CONCAT function.
There are no other expressions in the select list.
The statement has no WHERE clause or GROUP BY clause.
To execute the query, Oracle performs a range scan of the index to find the maximum or minimum indexed value. Because only this value is selected, Oracle need not access the table after scanning the index.

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/38542/viewspace-928325/,如需轉載,請註明出處,否則將追究法律責任。

相關文章