sql如下
SELECT DISTINCT o.orders_id, o.oa_order_id,os.orders_status_name, o.order_type, o.date_purchased AS add_date,dop.resource, dop.country_code FROM dm_order_products AS dop LEFT JOIN orders AS o ON o.orders_id=dop.orders_id LEFT JOIN orders_total AS ot ON ot.orders_id=o.orders_id AND ot.class=`ot_total` LEFT JOIN orders_status AS os ON os.orders_status_id=o.orders_status WHERE o.date_purchased >= `2014-01-31 10:00:00` AND o.date_purchased <= `2014-02-24 09:59:59` ORDER BY o.orders_id DESC LIMIT 0, 20;
因為需要在大結果集中order by 去重,再顯示20條.
表特性是orders(o)表對dm_order_products(dop)表為一對多關係,而取出來的dop.country_code為一個訂單號對應唯一值,由於表結構設計問題,每次查詢該country_code都需要去dop查詢。所以,每次查詢都放大結果集,然後再去重,得到所要的結果集合。
explain
+----+-------------+-------+-------+----------------------------------+----------------------------+---------+-------------------------------+-------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+----------------------------------+----------------------------+---------+-------------------------------+-------+----------------------------------------------+ | 1 | SIMPLE | o | range | PRIMARY,date_purchased | date_purchased | 9 | NULL | 952922 | Using where; Using temporary; Using filesort | | 1 | SIMPLE | ot | ref | idx_orders_total_orders_id,class | idx_orders_total_orders_id | 4 | banggood_work.o.orders_id | 3 | | | 1 | SIMPLE | os | ref | PRIMARY | PRIMARY | 4 | banggood_work.o.orders_status | 1 | | | 1 | SIMPLE | dop | ref | orders_id | orders_id | 4 | banggood_work.o.orders_id | 2 | | +----+-------------+-------+-------+----------------------------------+----------------------------+---------+-------------------------------+-------+----------------------------------------------+
索引情況使用正常,但是發現需要掃描一個大結果集.
profiling,執行時間為將近20s
mysql> show profile cpu,block io for query 1; +--------------------------------+-----------+----------+------------+--------------+---------------+ | Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out | +--------------------------------+-----------+----------+------------+--------------+---------------+ | starting | 0.000025 | 0.000000 | 0.000000 | 0 | 0 | | Waiting for query cache lock | 0.000004 | 0.000000 | 0.000000 | 0 | 0 | | checking query cache for query | 0.000080 | 0.000000 | 0.000000 | 0 | 0 | | checking permissions | 0.000005 | 0.000000 | 0.000000 | 0 | 0 | | checking permissions | 0.000003 | 0.000000 | 0.000000 | 0 | 0 | | checking permissions | 0.000003 | 0.000000 | 0.000000 | 0 | 0 | | checking permissions | 0.000006 | 0.000000 | 0.000000 | 0 | 0 | | Opening tables | 0.000034 | 0.000000 | 0.000000 | 0 | 0 | | System lock | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | | Waiting for query cache lock | 0.000024 | 0.000000 | 0.000000 | 0 | 0 | | init | 0.000046 | 0.000000 | 0.000000 | 0 | 0 | | optimizing | 0.000018 | 0.000000 | 0.000000 | 0 | 0 | | statistics | 0.000193 | 0.000000 | 0.000000 | 0 | 0 | | preparing | 0.000054 | 0.000000 | 0.000000 | 0 | 0 | | Creating tmp table | 0.000031 | 0.000000 | 0.000000 | 0 | 0 | | executing | 0.000004 | 0.000000 | 0.000000 | 0 | 0 | | Copying to tmp table | 12.491533 | 3.039538 | 3.107527 | 11896 | 824 | | Sorting result | 0.030709 | 0.034995 | 0.004000 | 16 | 496 | | Sending data | 0.000048 | 0.000000 | 0.000000 | 0 | 0 | | end | 0.000004 | 0.000000 | 0.000000 | 0 | 0 | | removing tmp table | 0.010108 | 0.000000 | 0.010998 | 8 | 32 | | end | 0.000013 | 0.000000 | 0.000000 | 0 | 0 | | query end | 0.000004 | 0.000000 | 0.000000 | 0 | 0 | | closing tables | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | | freeing items | 0.000338 | 0.000000 | 0.000000 | 0 | 0 | | logging slow query | 0.000006 | 0.000000 | 0.000000 | 0 | 0 | | logging slow query | 0.000033 | 0.000000 | 0.000000 | 0 | 8 | | cleaning up | 0.000006 | 0.000000 | 0.000000 | 0 | 0 |
可以看到Copying to tmp table 佔了大部分的cpu時間和io,最後sorting result佔比重不大。
我們可以上面描述的結合特性,是否能夠去掉Copying to tmp table 選項!因為是根據orders_id排序,取出最新的20條資料,如果我們在orders表中先把20條資料取出來,再和對應的表連線,這樣一來,就將整個大結果Copying to tmp table 再排序這一步去掉!
看sql語句如下
SELECT DISTINCT o.orders_id, o.oa_order_id,os.orders_status_name, o.order_type, o.date_purchased AS add_date,dop.resource, dop.country_code FROM ( SELECT * FROM orders AS o WHERE o.date_purchased >= `2014-01-31 10:00:00` AND o.date_purchased <= `2014-02-24 09:59:59` ORDER BY o.orders_id DESC LIMIT 0, 20 ) o LEFT JOIN dm_order_products AS dop ON o.orders_id=dop.orders_id LEFT JOIN orders_total AS ot ON ot.orders_id=o.orders_id AND ot.class=`ot_total` LEFT JOIN orders_status AS os ON os.orders_status_id=o.orders_status ORDER BY o.orders_id DESC LIMIT 0, 20; +----+-------------+------------+-------+----------------------------------+----------------------------+---------+-----------------+------+---------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+-------+----------------------------------+----------------------------+---------+-----------------+------+---------------------------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 20 | Using temporary; Using filesort | | 1 | PRIMARY | dop | ref | orders_id | orders_id | 4 | o.orders_id | 2 | | | 1 | PRIMARY | ot | ref | idx_orders_total_orders_id,class | idx_orders_total_orders_id | 4 | o.orders_id | 3 | | | 1 | PRIMARY | os | ref | PRIMARY | PRIMARY | 4 | o.orders_status | 1 | | | 2 | DERIVED | o | index | date_purchased | PRIMARY | 4 | NULL | 330 | Using where | +----+-------------+------------+-------+----------------------------------+----------------------------+---------+-----------------+------+---------------------------------+
可以發現,結果集瞬間縮小N倍,再檢視profiling,執行時間變成0.01s
+--------------------------------+----------+----------+------------+--------------+---------------+ | Status | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out | +--------------------------------+----------+----------+------------+--------------+---------------+ | starting | 0.000028 | 0.000000 | 0.000000 | 0 | 0 | | Waiting for query cache lock | 0.000008 | 0.000000 | 0.000000 | 0 | 0 | | checking query cache for query | 0.000148 | 0.000000 | 0.000000 | 0 | 0 | | checking permissions | 0.000010 | 0.000000 | 0.000000 | 0 | 0 | | checking permissions | 0.000006 | 0.000000 | 0.000000 | 0 | 0 | | checking permissions | 0.000006 | 0.000000 | 0.000000 | 0 | 0 | | checking permissions | 0.000010 | 0.000000 | 0.000000 | 0 | 0 | | Opening tables | 0.000052 | 0.000000 | 0.000000 | 0 | 0 | | System lock | 0.001299 | 0.000000 | 0.001000 | 0 | 8 | | optimizing | 0.000021 | 0.000000 | 0.000000 | 0 | 0 | | statistics | 0.000127 | 0.000000 | 0.000000 | 0 | 0 | | preparing | 0.000127 | 0.001000 | 0.000000 | 0 | 0 | | executing | 0.000009 | 0.000000 | 0.000000 | 0 | 0 | | Sorting result | 0.000012 | 0.000000 | 0.000000 | 0 | 0 | | Sending data | 0.002182 | 0.001000 | 0.001000 | 0 | 784 | | Waiting for query cache lock | 0.000010 | 0.000000 | 0.000000 | 0 | 0 | | Sending data | 0.000053 | 0.000000 | 0.000000 | 0 | 0 | | init | 0.000048 | 0.000000 | 0.000000 | 0 | 0 | | optimizing | 0.000010 | 0.000000 | 0.000000 | 0 | 0 | | statistics | 0.000315 | 0.000000 | 0.000000 | 0 | 0 | | preparing | 0.000025 | 0.000000 | 0.000000 | 0 | 0 | | Creating tmp table | 0.000043 | 0.000000 | 0.000000 | 0 | 0 | | executing | 0.000008 | 0.000000 | 0.000000 | 0 | 0 | | Copying to tmp table | 0.003390 | 0.002000 | 0.002000 | 0 | 0 | | Sorting result | 0.000028 | 0.000000 | 0.000000 | 0 | 0 | | Sending data | 0.000058 | 0.000000 | 0.000000 | 0 | 0 | | end | 0.000008 | 0.000000 | 0.000000 | 0 | 0 | | removing tmp table | 0.000013 | 0.000000 | 0.000000 | 0 | 0 | | end | 0.000008 | 0.000000 | 0.000000 | 0 | 0 | | query end | 0.000008 | 0.000000 | 0.000000 | 0 | 0 | | closing tables | 0.000007 | 0.000000 | 0.000000 | 0 | 0 | | removing tmp table | 0.000591 | 0.000000 | 0.000000 | 0 | 0 | | closing tables | 0.000021 | 0.000000 | 0.000000 | 0 | 0 | | freeing items | 0.000024 | 0.000000 | 0.000000 | 0 | 0 | | Waiting for query cache lock | 0.000007 | 0.000000 | 0.000000 | 0 | 0 | | freeing items | 0.000630 | 0.001000 | 0.000000 | 0 | 0 | | Waiting for query cache lock | 0.000010 | 0.000000 | 0.000000 | 0 | 0 | | freeing items | 0.000006 | 0.000000 | 0.000000 | 0 | 0 | | storing result in query cache | 0.000010 | 0.000000 | 0.000000 | 0 | 0 | | logging slow query | 0.000006 | 0.000000 | 0.000000 | 0 | 0 | | logging slow query | 0.000056 | 0.000000 | 0.000000 | 0 | 8 | | cleaning up | 0.000011 | 0.000000 | 0.000000 | 0 | 0 | +--------------------------------+----------+----------+------------+--------------+---------------+
可以看到Copying to tmp table已經被優化!執行時間提高了幾百倍!
注意:
1. 如果是limit m,n這種語句型別,第一步要想到,是不是可以抽取一部分使用臨時表優化後,再和外面的其他表進行優化
2. distinct和orders by 是天敵。在大結果集的sql語句中,儘量避免直接使用distinct.否則,mysql會將整個結果集copy導臨時表,重新排序後,再返回!