mysql子查詢的缺陷以及5.6的優化

myownstars發表於2015-02-05
 

分類

子查詢分為from clausewhere clause,在執行計劃的select_type分別顯示為derived[dependent] subquery;

根據和外部查詢的依賴關係又可分為兩種,

相關子查詢:子查詢依賴外層連線的返回值

非相關子查詢:子查詢不依賴外層連線的返回值


缺陷

pre-5.6通常會將非相關子查詢改為相關子查詢,即先遍歷outer table,對於其返回的每一條記錄都執行一次subquery
注:mysql目前只實現nested-loop join,所以dependent subquery才會如此消耗資源,如果是oracle則可進行semi/anti hash join
http://blog.itpub.net/15480802/viewspace-703260

SELECT * FROM t1 WHERE t1.a IN (SELECT t2.b FROM t2 WHERE where_condition);

有可能被重寫為

SELECT * FROM t1 WHERE EXISTS (SELECT t2.b FROM t2 WHERE where_condition AND t1.a=t2.b);


Pre-5.6優化

Mysql可以將from clause的子查詢進行物化,此時先執行inner query並將結果存於臨時表

以下是一個5.0.3版本的優化案例  http://www.xaprb.com/blog/2006/04/30/how-to-optimize-subqueries-and-joins-in-mysql/   

select * from subcategory

where id in (

    select c.id

    from subcategory as c

        inner join item as i on i.subcategory = c.id

    where c.category = 14

    group by c.id

    having count(*) > 2000

);

此時mysql optimizer自作聰明的將非相關子查詢改寫為相關子查詢,執行計劃如下:

因為subquery被轉化為相關子查詢,即先遍歷subcategory(outer table),對於每條記錄都執行一次subquery(總計300783)

*************************** 1. row ***************************

           id: 1

  select_type: PRIMARY

        table: subcategory

         type: ALL

possible_keys: NULL

          key: NULL

      key_len: NULL

          ref: NULL

         rows: 300783

        Extra: Using where

*************************** 2. row ***************************

           id: 2

  select_type: DEPENDENT SUBQUERY

        table: c

         type: ref

possible_keys: PRIMARY,category

          key: category

      key_len: 4

          ref: const

         rows: 100

        Extra: Using where; Using index; Using temporary; Using filesort

*************************** 3. row ***************************

           id: 2

  select_type: DEPENDENT SUBQUERY

        table: i

         type: ref

possible_keys: subcategory

          key: subcategory

      key_len: 4

          ref: c.id

         rows: 28

        Extra: Using index

優化:採用物化子查詢,所做的就是將子查詢改為from clause,即新增一對括號即可;

select * from subcategory

where id in (

    select id from (

        select c.id

        from subcategory as c

            inner join item as i on i.subcategory = c.id

        where c.category = 14

        group by c.id

        having count(*) > 2000

    ) as x

);


5.6優化

引入物化子查詢(針對where clausesubquery)

5.6.5引入此功能,在此之前,優化器有時會把非關聯子查詢重寫為相關子查詢,導致效率變差;

子查詢物化將子查詢結果存入臨時表,確保子查詢只執行一次,該表不記錄重複資料且採用雜湊索引查詢;

Optimizer_switch需設定materialization=on


優化derived table

以前的版本在explain時就會對from clausesubquery進行物化 ,引發了部分執行,5.6消除了這個問題;

另外,優化器可能會為derived table新增索引以加速執行

SELECT * FROM t1 JOIN (SELECT * FROM t2) AS derived_t2 ON t1.f1=derived_t2.f1;

優化器可以為derived_t2f1新增索引以採用ref

http://dev.mysql.com/doc/refman/5.6/en/subquery-materialization.html

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/15480802/viewspace-1427900/,如需轉載,請註明出處,否則將追究法律責任。

相關文章