[20150709]慎用標量子查詢.txt

lfree發表於2015-07-09

[20150709]慎用標量子查詢.txt

--花了幾天的時間除錯sql語句,最終發現是標量子查詢在作怪,原始的語句太複雜,我拿scott做一個例子來說明問題。

1.建立測試環境:

SCOTT@test> @ver1
PORT_STRING                    VERSION        BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx            11.2.0.3.0     Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

create view v_deptemp1 as select emp.*,dname,loc from emp,dept where dept.deptno=emp.deptno ;
create view v_deptemp2 as select emp.*,(select dname from dept where dept.deptno=emp.deptno) dname ,(select loc from dept where dept.deptno=emp.deptno )loc from emp ;

--第2種寫法就是標量子查詢。

2.測試1:
create index i_dept_dname on scott.dept (dname);
create index i_emp_deptno on scott.emp (deptno)

SCOTT@test> select * from v_deptemp1 where dname='ACCOUNTING' ;
     EMPNO ENAME      JOB              MGR HIREDATE                   SAL       COMM     DEPTNO DNAME          LOC
---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------- -------------
      7782 CLARK      MANAGER         7839 1981-06-09 00:00:00       2450                    10 ACCOUNTING     NEW YORK
      7839 KING       PRESIDENT            1981-11-17 00:00:00       5000                    10 ACCOUNTING     NEW YORK
      7934 MILLER     CLERK           7782 1982-01-23 00:00:00       1300                    10 ACCOUNTING     NEW YORK

SCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  09tbznuqjnxa4, child number 0
-------------------------------------
select * from v_deptemp1 where dname='ACCOUNTING'
Plan hash value: 2863776355
-----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |        |       |     3 (100)|          |
|   1 |  NESTED LOOPS                 |              |        |       |            |          |
|   2 |   NESTED LOOPS                |              |      5 |   295 |     3   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| DEPT         |      1 |    20 |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | I_DEPT_DNAME |      1 |       |     1   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | I_EMP_DEPTNO |      5 |       |     0   (0)|          |
|   6 |   TABLE ACCESS BY INDEX ROWID | EMP          |      5 |   195 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$F5BB74E1
   3 - SEL$F5BB74E1 / DEPT@SEL$2
   4 - SEL$F5BB74E1 / DEPT@SEL$2
   5 - SEL$F5BB74E1 / EMP@SEL$2
   6 - SEL$F5BB74E1 / EMP@SEL$2
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("DNAME"='ACCOUNTING')
   5 - access("DEPT"."DEPTNO"="EMP"."DEPTNO")

--oracle 很好的分析確定索引,並且對emp表也可以使用i_emp_deptno。

--看看第2種情況:
SCOTT@test> select * from v_deptemp2 where dname='ACCOUNTING' ;
     EMPNO ENAME      JOB              MGR HIREDATE                   SAL       COMM     DEPTNO DNAME          LOC
---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------- -------------
      7782 CLARK      MANAGER         7839 1981-06-09 00:00:00       2450                    10 ACCOUNTING     NEW YORK
      7839 KING       PRESIDENT            1981-11-17 00:00:00       5000                    10 ACCOUNTING     NEW YORK
      7934 MILLER     CLERK           7782 1982-01-23 00:00:00       1300                    10 ACCOUNTING     NEW YORK

SCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  45acjyf0f8x54, child number 0
-------------------------------------
select * from v_deptemp2 where dname='ACCOUNTING'
Plan hash value: 2698003519
-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |        |       |     3 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| DEPT       |      1 |    13 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | PK_DEPT    |      1 |       |     0   (0)|          |
|   3 |  TABLE ACCESS BY INDEX ROWID| DEPT       |      1 |    11 |     1   (0)| 00:00:01 |
|*  4 |   INDEX UNIQUE SCAN         | PK_DEPT    |      1 |       |     0   (0)|          |
|*  5 |  VIEW                       | V_DEPTEMP2 |     14 |  1456 |     3   (0)| 00:00:01 |
|   6 |   TABLE ACCESS FULL         | EMP        |     14 |   546 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$3 / DEPT@SEL$3
   2 - SEL$3 / DEPT@SEL$3
   3 - SEL$4 / DEPT@SEL$4
   4 - SEL$4 / DEPT@SEL$4
   5 - SEL$2 / V_DEPTEMP2@SEL$1
   6 - SEL$2 / EMP@SEL$2
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("DEPT"."DEPTNO"=:B1)
   4 - access("DEPT"."DEPTNO"=:B1)
   5 - filter("DNAME"='ACCOUNTING')

--這樣無論如何都不會使用emp的i_emp_deptno。當然我實際上的情況實際上開發想避免dname 為null的情況。
--因為標量子查詢如果查詢dname為null不是他所需要的。

3.測試2:
--再建立一個表tx
create table tx ( id number,name varchar2(20));
insert into tx values (7654,'MARTIN');
commit;

SCOTT@test> select * from v_deptemp1 v ,tx where tx.id=v.empno and dname is not null;
     EMPNO ENAME      JOB              MGR HIREDATE                   SAL       COMM     DEPTNO DNAME          LOC                   ID NAME
---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------- ------------- ---------- ------------
      7654 MARTIN     SALESMAN        7698 1981-09-28 00:00:00       1250       1400         30 SALES          CHICAGO             7654 MARTIN

SCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  gskw0uw49rw0k, child number 0
-------------------------------------
select * from v_deptemp1 v ,tx where tx.id=v.empno and dname is not null
Plan hash value: 2230440165
------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name         | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |              |        |       |     5 (100)|          |
|   1 |  NESTED LOOPS                  |              |        |       |            |          |
|   2 |   NESTED LOOPS                 |              |      1 |    70 |     5   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                |              |      1 |    50 |     4   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL          | TX           |      1 |    11 |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS BY INDEX ROWID| EMP          |      1 |    39 |     1   (0)| 00:00:01 |
|*  6 |      INDEX UNIQUE SCAN         | PK_EMP_EMPNO |      1 |       |     0   (0)|          |
|*  7 |    INDEX UNIQUE SCAN           | PK_DEPT      |      1 |       |     0   (0)|          |
|*  8 |   TABLE ACCESS BY INDEX ROWID  | DEPT         |      1 |    20 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$F5BB74E1
   4 - SEL$F5BB74E1 / TX@SEL$1
   5 - SEL$F5BB74E1 / EMP@SEL$2
   6 - SEL$F5BB74E1 / EMP@SEL$2
   7 - SEL$F5BB74E1 / DEPT@SEL$2
   8 - SEL$F5BB74E1 / DEPT@SEL$2
Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access("TX"."ID"="EMP"."EMPNO")
   7 - access("DEPT"."DEPTNO"="EMP"."DEPTNO")
   8 - filter("DNAME" IS NOT NULL)

--而如果換成v_deptemp2
SCOTT@test> select * from v_deptemp2 v ,tx where tx.id=v.empno and dname is not null;
     EMPNO ENAME      JOB              MGR HIREDATE                   SAL       COMM     DEPTNO DNAME          LOC                   ID NAME
---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------- ------------- ---------- --------
      7654 MARTIN     SALESMAN        7698 1981-09-28 00:00:00       1250       1400         30 SALES          CHICAGO             7654 MARTIN

SCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  7b8950at9688x, child number 0
-------------------------------------
select * from v_deptemp2 v ,tx where tx.id=v.empno and dname is not null
Plan hash value: 2887512270
----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |        |       |     7 (100)|          |       |       |          |
|   1 |  TABLE ACCESS BY INDEX ROWID| DEPT       |      1 |    13 |     1   (0)| 00:00:01 |       |       |          |
|*  2 |   INDEX UNIQUE SCAN         | PK_DEPT    |      1 |       |     0   (0)|          |       |       |          |
|   3 |  TABLE ACCESS BY INDEX ROWID| DEPT       |      1 |    11 |     1   (0)| 00:00:01 |       |       |          |
|*  4 |   INDEX UNIQUE SCAN         | PK_DEPT    |      1 |       |     0   (0)|          |       |       |          |
|*  5 |  HASH JOIN                  |            |      1 |   106 |     7  (15)| 00:00:01 |  1156K|  1156K|  383K (0)|
|   6 |   TABLE ACCESS FULL         | TX         |      1 |    11 |     3   (0)| 00:00:01 |       |       |          |
|*  7 |   VIEW                      | V_DEPTEMP2 |     14 |  1330 |     3   (0)| 00:00:01 |       |       |          |
|   8 |    TABLE ACCESS FULL        | EMP        |     14 |   546 |     3   (0)| 00:00:01 |       |       |          |
----------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$3 / DEPT@SEL$3
   2 - SEL$3 / DEPT@SEL$3
   3 - SEL$4 / DEPT@SEL$4
   4 - SEL$4 / DEPT@SEL$4
   5 - SEL$1
   6 - SEL$1 / TX@SEL$1
   7 - SEL$2 / V@SEL$1
   8 - SEL$2 / EMP@SEL$2

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("DEPT"."DEPTNO"=:B1)
   4 - access("DEPT"."DEPTNO"=:B1)
   5 - access("TX"."ID"="V"."EMPNO")
   7 - filter("DNAME" IS NOT NULL)

--emp始終選擇的是全表掃描。

4.繼續測試:
--當把dname is not null刪除後,結果呢?

SCOTT@test> select * from v_deptemp2 v ,tx where tx.id=v.empno ;
     EMPNO ENAME      JOB              MGR HIREDATE                   SAL       COMM     DEPTNO DNAME          LOC                   ID NAME
---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------- ------------- ---------- ---------
      7654 MARTIN     SALESMAN        7698 1981-09-28 00:00:00       1250       1400         30 SALES          CHICAGO             7654 MARTIN

SCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  1sd5umnwjt823, child number 0
-------------------------------------
select * from v_deptemp2 v ,tx where tx.id=v.empno
Plan hash value: 3698375752
----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name         | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |              |        |       |     4 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID | DEPT         |      1 |    13 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN          | PK_DEPT      |      1 |       |     0   (0)|          |
|   3 |  TABLE ACCESS BY INDEX ROWID | DEPT         |      1 |    11 |     1   (0)| 00:00:01 |
|*  4 |   INDEX UNIQUE SCAN          | PK_DEPT      |      1 |       |     0   (0)|          |
|   5 |  NESTED LOOPS                |              |        |       |            |          |
|   6 |   NESTED LOOPS               |              |      1 |    50 |     4   (0)| 00:00:01 |
|   7 |    TABLE ACCESS FULL         | TX           |      1 |    11 |     3   (0)| 00:00:01 |
|*  8 |    INDEX UNIQUE SCAN         | PK_EMP_EMPNO |      1 |       |     0   (0)|          |
|   9 |   TABLE ACCESS BY INDEX ROWID| EMP          |      1 |    39 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$3        / DEPT@SEL$3
   2 - SEL$3        / DEPT@SEL$3
   3 - SEL$4        / DEPT@SEL$4
   4 - SEL$4        / DEPT@SEL$4
   5 - SEL$F5BB74E1
   7 - SEL$F5BB74E1 / TX@SEL$1
   8 - SEL$F5BB74E1 / EMP@SEL$2
   9 - SEL$F5BB74E1 / EMP@SEL$2
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("DEPT"."DEPTNO"=:B1)
   4 - access("DEPT"."DEPTNO"=:B1)
   8 - access("TX"."ID"="EMP"."EMPNO")

--可以發現沒有這個條件(dname is not null),emp可以很好的選擇索引。
--實際我們的檢視定義有多處使用標量子查詢,當然我只要把在where出現的改寫成不使用標量子查詢。
--如果不願意修改還有1個簡單的改寫方法,就是:

SCOTT@test> select * from (select * from v_deptemp2 v ,tx where tx.id=v.empno ) where dname is not null ;
     EMPNO ENAME      JOB              MGR HIREDATE                   SAL       COMM     DEPTNO DNAME          LOC                   ID NAME
---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------- ------------- ---------- --------
      7654 MARTIN     SALESMAN        7698 1981-09-28 00:00:00       1250       1400         30 SALES          CHICAGO             7654 MARTIN

SCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  8yy4n9t1dqhmm, child number 0
-------------------------------------
select * from (select * from v_deptemp2 v ,tx where tx.id=v.empno )
where dname is not null
Plan hash value: 3563298867
-----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              |        |       |     4 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID  | DEPT         |      1 |    13 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN           | PK_DEPT      |      1 |       |     0   (0)|          |
|   3 |  TABLE ACCESS BY INDEX ROWID  | DEPT         |      1 |    11 |     1   (0)| 00:00:01 |
|*  4 |   INDEX UNIQUE SCAN           | PK_DEPT      |      1 |       |     0   (0)|          |
|*  5 |  VIEW                         |              |      1 |   129 |     4   (0)| 00:00:01 |
|   6 |   NESTED LOOPS                |              |        |       |            |          |
|   7 |    NESTED LOOPS               |              |      1 |    50 |     4   (0)| 00:00:01 |
|   8 |     TABLE ACCESS FULL         | TX           |      1 |    11 |     3   (0)| 00:00:01 |
|*  9 |     INDEX UNIQUE SCAN         | PK_EMP_EMPNO |      1 |       |     0   (0)|          |
|  10 |    TABLE ACCESS BY INDEX ROWID| EMP          |      1 |    39 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$4        / DEPT@SEL$4
   2 - SEL$4        / DEPT@SEL$4
   3 - SEL$5        / DEPT@SEL$5
   4 - SEL$5        / DEPT@SEL$5
   5 - SEL$335DD26A / from$_subquery$_001@SEL$1
   6 - SEL$335DD26A
   8 - SEL$335DD26A / TX@SEL$2
   9 - SEL$335DD26A / EMP@SEL$3
  10 - SEL$335DD26A / EMP@SEL$3
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("DEPT"."DEPTNO"=:B1)
   4 - access("DEPT"."DEPTNO"=:B1)
   5 - filter("DNAME" IS NOT NULL)
   9 - access("TX"."ID"="EMP"."EMPNO")

--這樣改寫後emp可以使用索引!看來oracle cbo 還是不夠成熟,無法識別這種情況。

5.也許有人問v_deptemp1 與 v_deptemp2是不等價的。繼續測試:

SCOTT@test> create or replace  view v_deptemp3 as select emp.*,dname,loc from emp,dept where dept.deptno(+)=emp.deptno ;
View created.

alter table emp modify constraint fk_deptno disable novalidate;
--取消主外來鍵約束。

SCOTT@test> update emp set deptno=90 where empno=7654;
1 row updated.

SCOTT@test> commit ;
Commit complete.

SCOTT@test> select * from v_deptemp3 v ,tx where tx.id=v.empno and dname is not null;
no rows selected

SCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID  bwytb3gt4vtx1, child number 0
-------------------------------------
select * from v_deptemp3 v ,tx where tx.id=v.empno and dname is not null
Plan hash value: 2230440165
------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name         | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |              |        |       |     5 (100)|          |
|   1 |  NESTED LOOPS                  |              |        |       |            |          |
|   2 |   NESTED LOOPS                 |              |      1 |    70 |     5   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                |              |      1 |    50 |     4   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL          | TX           |      1 |    11 |     3   (0)| 00:00:01 |
|   5 |     TABLE ACCESS BY INDEX ROWID| EMP          |      1 |    39 |     1   (0)| 00:00:01 |
|*  6 |      INDEX UNIQUE SCAN         | PK_EMP_EMPNO |      1 |       |     0   (0)|          |
|*  7 |    INDEX UNIQUE SCAN           | PK_DEPT      |      1 |       |     0   (0)|          |
|*  8 |   TABLE ACCESS BY INDEX ROWID  | DEPT         |      1 |    20 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$EB0D93E2
   4 - SEL$EB0D93E2 / TX@SEL$1
   5 - SEL$EB0D93E2 / EMP@SEL$2
   6 - SEL$EB0D93E2 / EMP@SEL$2
   7 - SEL$EB0D93E2 / DEPT@SEL$2
   8 - SEL$EB0D93E2 / DEPT@SEL$2
Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access("TX"."ID"="EMP"."EMPNO")
   7 - access("DEPT"."DEPTNO"="EMP"."DEPTNO")
   8 - filter("DNAME" IS NOT NULL)

--就是這樣定義,oracle 一樣選擇合理的執行方式。

總結:
1.不要大量不顧各種情況的使用標量子查詢,實際上標量子查詢存在許多缺陷,僅僅能查詢1個欄位,僅僅返回1行。
2.並且導致內部巢狀連線,而不能採用hash join等連線方式。假如emp表1萬行,這樣查詢dept表要有1萬次(我上面執行2次標量子查詢,也就是2萬次)。
  即使dept有索引,邏輯讀的數量不可小看.
3.另外一個原因我覺得可能是資料結構問題,這種情況如果太遵守正規化,資料結構設計耦合太緊,開發很容易選擇標量子查詢的寫法。
  例如:1個表有6個欄位有科室程式碼欄位,這樣要顯示科室名稱,這樣from中寫6次科室表,做連線6次查詢,這樣許多開發更願意選擇標量子查詢寫sql語句。
4.總之要開發合理選擇這種寫法,不能到處濫用。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/267265/viewspace-1727892/,如需轉載,請註明出處,否則將追究法律責任。

相關文章