mysql關聯查詢優化
mysql中任何關聯查詢都是nest loop(巢狀迴圈)操作,nest loop是在驅動表中取出一條資料,然後從被驅動表中逐行比較,把符合規則的放入結果集中,然後再取下一行,依次迴圈,驅動表每返回一行,被驅動表就要掃描一次。
針對nest loop關聯機制需要從下面幾個方面著手優化:
1、減少nest loop迴圈次數,使用小結果集做驅動表,驅動大結果集。
2、被驅動表每次迴圈都要被掃描,所以要求關聯鍵上一定要有索引,而且選擇性要好。
3、如果第二條無法滿足,可以通過調join_buffer_size來設定join buffer的大小,不過還是建議新增索引而不是純粹的加大join_buffer_size
接下來通過下面的實驗來了解mysql的nest loop
實驗環境:Percona server5.6.27 大表bill、小表user,表上均有索引
mysql> select count(*) from bill;
+----------+
| count(*) |
+----------+
| 1966789 |
+----------+
mysql> select count(*) from user_tmp;
+----------+
| count(*) |
+----------+
| 36317 |
+----------+
一、執行計劃:
mysql> explain select a.user_id,b.loan_info_id from bill b left JOIN user_tmp a on a.user_id=b.user_id;
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+
| 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 1912096 | NULL |
| 1 | SIMPLE | a | eq_ref | PRIMARY | PRIMARY | 194 | CDM.b.user_id | 1 | Using index |
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+
左連線左表不管有多大總是驅動表,右表總是被驅動表
mysql> explain select a.user_id,b.loan_info_id from bill b INNER JOIN user_tmp a on a.user_id=b.user_id;
+----+-------------+-------+-------+------------------------------+------------------------------+---------+---------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+------------------------------+------------------------------+---------+---------------+-------+-------------+
| 1 | SIMPLE | a | index | PRIMARY | PRIMARY | 194 | NULL | 35970 | Using index |
| 1 | SIMPLE | b | ref | in_bill_user_id | in_bill_user_id | 194 | CDM.a.user_id | 3 | NULL |
+----+-------------+-------+-------+------------------------------+------------------------------+---------+---------------+-------+-------------+
2 rows in set (0.00 sec)
內連線,mysql的優化器會根據統計資訊自動選擇小表user_tmp做驅動表,大家可以看到rows列值和我們剛開始統計的行數不一致,是因為統計資訊和實際是有差異,所以有時候統計資訊的不準確會導致執行計劃不是最優的。內連線可以用STRAIGHT_JOIN按照順序執行,即指定左表為驅動表
mysql> explain select STRAIGHT_JOIN a.user_id,b.loan_info_id from bill b inner JOIN user_tmp a on a.user_id=b.user_id;
+----+-------------+-------+--------+------------------------------+---------+---------+---------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+------------------------------+---------+---------+---------------+---------+-------------+
| 1 | SIMPLE | b | ALL | in_bill_user_id | NULL | NULL | NULL | 1912096 | NULL |
| 1 | SIMPLE | a | eq_ref | PRIMARY | PRIMARY | 194 | CDM.b.user_id | 1 | Using index |
+----+-------------+-------+--------+------------------------------+---------+---------+---------------+---------+-------------+
這個時候mysql就不會根據統計資訊把右邊的小表當做驅動表
刪除被驅動表bill索引
mysql> explain select a.user_id,b.loan_info_id from bill b INNER JOIN user_tmp a on a.user_id=b.user_id;
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+
| 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 1905575 | NULL |
| 1 | SIMPLE | a | eq_ref | PRIMARY | PRIMARY | 194 | CDM.b.user_id | 1 | Using index |
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+
mysql優化器是基於成本的,bill沒有了索引,那麼就要掃描35970次bill全表 ,成本高於掃描1905575次user_tmp索引,所以又改變了執行計劃,變成了把大表做驅動表,進而降低了查詢效率
二、執行效率(關聯鍵都有索引):
當小表是驅動表的時候
mysql> select a.user_id,b.loan_info_id from bill b INNER JOIN user_tmp a on a.user_id=b.user_id;
這裡結果集有幾萬條,省略
耗時:0.202s
使用STRAIGHT_JOIN強制大表是驅動表的時候
mysql>select STRAIGHT_JOIN a.user_id,b.loan_info_id from bill b INNER JOIN user_tmp a on a.user_id=b.user_id
耗時:5.260s
由於兩張表的相差幾十倍,兩種執行計劃的效率也是顯而易見的
注:如果大表的關聯鍵索引選擇性比較差(如重複資料多等),每次迴圈掃太多了,不如讓大表做驅動表,上述實驗是在大表的索引選擇性好的情況下得出的結果。
針對nest loop關聯機制需要從下面幾個方面著手優化:
1、減少nest loop迴圈次數,使用小結果集做驅動表,驅動大結果集。
2、被驅動表每次迴圈都要被掃描,所以要求關聯鍵上一定要有索引,而且選擇性要好。
3、如果第二條無法滿足,可以通過調join_buffer_size來設定join buffer的大小,不過還是建議新增索引而不是純粹的加大join_buffer_size
接下來通過下面的實驗來了解mysql的nest loop
實驗環境:Percona server5.6.27 大表bill、小表user,表上均有索引
mysql> select count(*) from bill;
+----------+
| count(*) |
+----------+
| 1966789 |
+----------+
mysql> select count(*) from user_tmp;
+----------+
| count(*) |
+----------+
| 36317 |
+----------+
一、執行計劃:
mysql> explain select a.user_id,b.loan_info_id from bill b left JOIN user_tmp a on a.user_id=b.user_id;
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+
| 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 1912096 | NULL |
| 1 | SIMPLE | a | eq_ref | PRIMARY | PRIMARY | 194 | CDM.b.user_id | 1 | Using index |
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+
左連線左表不管有多大總是驅動表,右表總是被驅動表
mysql> explain select a.user_id,b.loan_info_id from bill b INNER JOIN user_tmp a on a.user_id=b.user_id;
+----+-------------+-------+-------+------------------------------+------------------------------+---------+---------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+------------------------------+------------------------------+---------+---------------+-------+-------------+
| 1 | SIMPLE | a | index | PRIMARY | PRIMARY | 194 | NULL | 35970 | Using index |
| 1 | SIMPLE | b | ref | in_bill_user_id | in_bill_user_id | 194 | CDM.a.user_id | 3 | NULL |
+----+-------------+-------+-------+------------------------------+------------------------------+---------+---------------+-------+-------------+
2 rows in set (0.00 sec)
內連線,mysql的優化器會根據統計資訊自動選擇小表user_tmp做驅動表,大家可以看到rows列值和我們剛開始統計的行數不一致,是因為統計資訊和實際是有差異,所以有時候統計資訊的不準確會導致執行計劃不是最優的。內連線可以用STRAIGHT_JOIN按照順序執行,即指定左表為驅動表
mysql> explain select STRAIGHT_JOIN a.user_id,b.loan_info_id from bill b inner JOIN user_tmp a on a.user_id=b.user_id;
+----+-------------+-------+--------+------------------------------+---------+---------+---------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+------------------------------+---------+---------+---------------+---------+-------------+
| 1 | SIMPLE | b | ALL | in_bill_user_id | NULL | NULL | NULL | 1912096 | NULL |
| 1 | SIMPLE | a | eq_ref | PRIMARY | PRIMARY | 194 | CDM.b.user_id | 1 | Using index |
+----+-------------+-------+--------+------------------------------+---------+---------+---------------+---------+-------------+
這個時候mysql就不會根據統計資訊把右邊的小表當做驅動表
刪除被驅動表bill索引
mysql> explain select a.user_id,b.loan_info_id from bill b INNER JOIN user_tmp a on a.user_id=b.user_id;
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+
| 1 | SIMPLE | b | ALL | NULL | NULL | NULL | NULL | 1905575 | NULL |
| 1 | SIMPLE | a | eq_ref | PRIMARY | PRIMARY | 194 | CDM.b.user_id | 1 | Using index |
+----+-------------+-------+--------+---------------+---------+---------+---------------+---------+-------------+
mysql優化器是基於成本的,bill沒有了索引,那麼就要掃描35970次bill全表 ,成本高於掃描1905575次user_tmp索引,所以又改變了執行計劃,變成了把大表做驅動表,進而降低了查詢效率
二、執行效率(關聯鍵都有索引):
當小表是驅動表的時候
mysql> select a.user_id,b.loan_info_id from bill b INNER JOIN user_tmp a on a.user_id=b.user_id;
這裡結果集有幾萬條,省略
耗時:0.202s
使用STRAIGHT_JOIN強制大表是驅動表的時候
mysql>select STRAIGHT_JOIN a.user_id,b.loan_info_id from bill b INNER JOIN user_tmp a on a.user_id=b.user_id
耗時:5.260s
由於兩張表的相差幾十倍,兩種執行計劃的效率也是顯而易見的
注:如果大表的關聯鍵索引選擇性比較差(如重複資料多等),每次迴圈掃太多了,不如讓大表做驅動表,上述實驗是在大表的索引選擇性好的情況下得出的結果。
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/29989552/viewspace-2129306/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- MySQL 優化五(關聯查詢子查詢以及 in 的效率問題)(高階篇)MySql優化
- MySQL查詢優化MySql優化
- MySQL多表關聯查詢MySql
- SQL優化之多表關聯查詢-案例一SQL優化
- mysql 三表關聯查詢MySql
- mysql三表關聯查詢MySql
- MySQL 的查詢優化MySql優化
- MySQL 慢查詢優化MySql優化
- MySQL優化COUNT()查詢MySql優化
- mysql查詢優化檢查 explainMySql優化AI
- MySQL調優之查詢優化MySql優化
- Mysql表關聯欄位未建索引導致查詢慢,優化後查詢效率顯著提升MySql索引優化
- mysql三張表關聯查詢MySql
- mysql中的多表關聯查詢MySql
- MySQL查詢優化利刃-EXPLAINMySql優化AI
- MySQL索引與查詢優化MySql索引優化
- MySQL分頁查詢優化MySql優化
- MySQL-效能優化-索引和查詢優化MySql優化索引
- MySQL——優化巢狀查詢和分頁查詢MySql優化巢狀
- MYSQL A、B表陣列關聯查詢MySql陣列
- mysql三表關聯查詢練習MySql
- 【資料庫】MySQL查詢優化資料庫MySql優化
- Mysql 慢查詢優化實踐MySql優化
- 十七、Mysql之SQL優化查詢MySql優化
- MySQL: 使用explain 優化查詢效能MySqlAI優化
- mysql查詢效能優化總結MySql優化
- MySQL子查詢的優化薦MySql優化
- MySQL聯結查詢和子查詢MySql
- MySQL 索引及查詢優化總結MySql索引優化
- MySQL分優化之超大頁查詢MySql優化
- mysql update join優化update in查詢效率MySql優化
- MySQL系列-- 4. 查詢效能優化MySql優化
- MySQL索引原理及慢查詢優化MySql索引優化
- 優化mysql資料字典表查詢優化MySql
- MySQL索引的最左字首原理與查詢的相關優化MySql索引優化
- 區分關聯子查詢和非關聯子查詢
- exist-in和關聯子查詢-非關聯子查詢
- mysql效能優化-慢查詢分析、優化索引和配置MySql優化索引