MySQL複合索引探究

gegeman發表於2021-02-03

原文網址 : https://www.cnblogs.com/lijiaman/p/14364171.html

複合索引(又稱為聯合索引)，是在多個列上建立的索引。建立複合索引最重要的是列順序的選擇，這關係到索引能否使用上，或者影響多少個謂詞條件能使用上索引。複合索引的使用遵循最左匹配原則，只有索引左邊的列匹配到，後面的列才能繼續匹配。本文主要探究複合索引的建立順序與使用情況。

（一）複合索引的概念

在單個列上建立的索引我們稱為單列索引，在2個以上的列上建立的索引稱為複合索引。在單個列上建立索引相對簡單，通常只需要考慮列的選擇率即可，選擇性越好，代表資料越分散，建立出來的索引效能也就更好。通常，某列選擇率的計算公式為：
selectivity = 施加謂詞條件後返回的記錄數 / 未施加謂詞條件後返回的記錄數
可選擇率的取值範圍是(0,1]，值越小，代表選擇性越好。
對於複合索引(又稱為聯合索引)，是在多個列上建立的索引。建立複合索引最重要的是列順序的選擇，這關係到索引能否使用上，或者影響多少個謂詞條件能使用上索引。複合索引的使用遵循最左匹配原則，只有索引左邊的列匹配到，後面的列才能繼續匹配。

（二）什麼情況下會使用複合索引的列

複合索引遵循最左匹配原則，只有索引中最左列匹配到，下一列才有可能被匹配。如果左邊列使用的是非等值查詢，則索引右邊的列將不會被查詢使用，也不會被排序使用。

實驗：哪些情況下會使用到複合索引

複合索引中的哪些欄位被使用到了，是我們非常關心的問題。網路上一個經典的例子：

-- 建立測試表
CREATE TABLE t1(
c1 CHAR(1) not null,
c2 CHAR(1) not null,
c3 CHAR(1) not null,
c4 CHAR(1) not null,
c5 CHAR(1) not null
)ENGINE innodb CHARSET UTF8;

-- 新增索引
alter table t1 add index idx_c1234(c1,c2,c3,c4);

--插入測試資料
insert into t1 values('1','1','1','1','1'),('2','2','2','2','2'),
('3','3','3','3','3'),('4','4','4','4','4'),('5','5','5','5','5');

需要探索下面哪些查詢語句使用到了索引idx_c1234，以及使用到了索引的哪些欄位？

(A) where c1=? and c2=? and c4>? and c3=?
(B) where c1=? and c2=? and c4=? order by c3
(C) where c1=? and c4=? group by c3,c2
(D) where c1=? and c5=? order by c2,c3
(E) where c1=? and c2=? and c5=? order by c2,c3
(F) where c1>? and c2=? and c4>? and c3=?

A選項：

mysql> explain select c1,c2,c3,c4,c5 from t1 where c1='2' and c2='2' and c4>'1' and c3='2';
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | t1    | NULL       | range | idx_c1234     | idx_c1234 | 12      | NULL |    1 |   100.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------+

使用的索引長度為12，代表4個欄位都使用了索引。由於c1、c2、c3都是等值查詢，所以後面的c4列也可以用上。

注：utf8編碼，一個索引長度為3，這裡12代表4個欄位都用到該索引。

B選項：

mysql> explain select c1,c2,c3,c4,c5 from t1 where c1='2' and c2='2' and c4='2' order by c3;
+----+-------------+-------+------------+------+---------------+-----------+---------+-------------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key       | key_len | ref         | rows | filtered | Extra                 |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------------+------+----------+-----------------------+
|  1 | SIMPLE      | t1    | NULL       | ref  | idx_c1234     | idx_c1234 | 6       | const,const |    1 |    20.00 | Using index condition |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------------+------+----------+-----------------------+

使用的索引長度為6，代表2個欄位使用了索引。根據最左使用原則，c1、c2使用了索引。因為查詢中沒有c3謂詞條件，所以索引值使用到c2後就發生了中斷，導致只使用了c1、c2列。這裡SQL使用了order by排序，但是在執行計劃Extra部分未有filesort關鍵字，說明在索引中按照c3欄位順序讀取資料即可。

這裡特別留意，雖然索引中的c3欄位沒有放在索引的最後，但是確實使用到了索引中c2欄位的有序特性，因為執行計劃的Extra部分未出現"fileasort"關鍵字。這是為什麼呢？這裡用到了MySQL5.6版本引入的Index Condition Pushdown (ICP) 優化。其核心思想是使用索引中的欄位做資料過濾。我們來整理一下不使用ICP和使用ICP的區別：

如果沒有使用ICP優化，其SQL執行步驟為：

1.使用索引列c1,c2獲取滿足條件的行資料。where c1='2' and c2='2'
2.回表查詢資料，使用where c4='2'來過濾資料
3.對資料排序輸出

如果使用了ICP優化，其SQL執行步驟為：

1.使用索引列c1,c2獲取滿足條件的行資料。where c1='2' and c2='2'
2.在索引中使用where c4='2'來過濾資料
3.因為資料有序，直接按順序取出滿足條件的資料

C選項：

mysql>  explain select c2,c3 from t1 where c1='2' and c4='2' group by c3,c2;
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-----------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key       | key_len | ref   | rows | filtered | Extra                                                     |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-----------------------------------------------------------+
|  1 | SIMPLE      | t1    | NULL       | ref  | idx_c1234     | idx_c1234 | 3       | const |    2 |    14.29 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-----------------------------------------------------------+

使用的索引長度為3，代表1個欄位使用了索引。根據最左使用原則，c1使用了索引。因為查詢中沒有c2謂詞條件，所以索引值使用到c1後就發生了中斷，導致只使用了c1列。該SQL執行過程為：

1.在c1列使用索引找到c1='2'的所有行，然後回表使用c4='2'過濾掉不匹配的資料
2.根據上一步的結果，對結果中的c3,c2聯合排序，以便於得到連續變化的資料，同時在資料庫內部建立臨時表，用於儲存group by的結果。

C選項擴充套件：

mysql> explain select c2,c3 from t1 where c1='2' and c4='2' group by c2,c3;
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key       | key_len | ref   | rows | filtered | Extra                    |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+--------------------------+
|  1 | SIMPLE      | t1    | NULL       | ref  | idx_c1234     | idx_c1234 | 3       | const |    2 |    14.29 | Using where; Using index |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+--------------------------+

使用的索引長度為3，代表1個欄位使用了索引。根據最左使用原則，c1使用了索引。

D選項：

mysql> explain select c2,c3 from t1 where c1='2' and c5='2' order by c2,c3;
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key       | key_len | ref   | rows | filtered | Extra                              |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+------------------------------------+
|  1 | SIMPLE      | t1    | NULL       | ref  | idx_c1234     | idx_c1234 | 3       | const |    2 |    14.29 | Using index condition; Using where |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+------------------------------------+

使用的索引長度為3，代表1個欄位都使用了索引。根據最左使用原則，c1使用了索引。因為查詢中沒有c2謂詞條件，所以索引值使用到c1後就發生了中斷，導致只使用了c1列。

D選項擴充套件：

mysql> explain select c2,c3 from t1 where c1='2' and c5='2' order by c3,c2;
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key       | key_len | ref   | rows | filtered | Extra                                              |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | t1    | NULL       | ref  | idx_c1234     | idx_c1234 | 3       | const |    2 |    14.29 | Using index condition; Using where; Using filesort |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+----------------------------------------------------+

E選項：

mysql> explain select c1,c2,c3,c4,c5 from t1 where c1='2' and c2='2' and c5='2' order by c2,c3;
+----+-------------+-------+------------+------+---------------+-----------+---------+-------------+------+----------+------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key       | key_len | ref         | rows | filtered | Extra                              |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------------+------+----------+------------------------------------+
|  1 | SIMPLE      | t1    | NULL       | ref  | idx_c1234     | idx_c1234 | 6       | const,const |    2 |    14.29 | Using index condition; Using where |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------------+------+----------+------------------------------------+

使用的索引長度為6，代表2個欄位都使用了索引。根據最左使用原則，c1、c2使用了索引。這裡SQL使用了order by排序，但是在執行計劃Extra部分未有filesort關鍵字，說明在索引中按照c3欄位順序讀取資料即可(c2是常量)。

F選項：

mysql> explain select c1,c2,c3,c4,c5 from t1 where c1>'4' and c2='2' and c3='2' and c4='1';
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | t1    | NULL       | range | idx_c1234     | idx_c1234 | 3       | NULL |    1 |    20.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------+

使用的索引長度為3，代表1個欄位都使用了索引。根據最左使用原則，c1使用了索引。這裡c1使用了不等值查詢，導致後面的c2查詢無法使用索引。該案例非常值得警惕，謂詞條件中含有等值查詢和範圍查詢時，如果範圍查詢在索引前面，則等值查詢將無法使用索引；如果等值查詢在前面，範圍查詢在後面，則都可以使用到索引。

（三）如何建立複合索引

複合索引建立的難點在於欄位順序選擇，我的觀點如下：

如果存在等值查詢和排序，則在建立複合索引時，將等值查詢欄位放在前面，排序放在最後面；
如果存在多個等值查詢，則選擇性好的放在前面，選擇性差的放在後面；
如果存在等值查詢、範圍查詢、排序。等值查詢放在最前面，範圍查詢和排序需根據實際情況決定索引順序；

此外，《阿里巴巴Java開發手冊-2020最新嵩山版》中有幾個關於複合索引的規約，我們可以看一下：
1.如果有order by的場景，請注意利用索引的有序性。order by後的欄位是組合索引的一部分，並且放在組合索引的最後，避免出現filesort的情況，影響查詢效能。
正例：where a=? b=? order by c; 索引a_b_c
反例：索引如果存在範圍查詢，那麼索引有序性將無法使用。如：where a>10 order by b; 索引a_b無法排序。
2.建複合索引的時候，區分度最高的在最左邊，如果where a=? and b=?，a列的值幾乎接近唯一值，那麼只需建單列索引idx_a即可。
說明：存在等號和非等號混合判斷條件時，在建索引時，請把等號條件的列前置。如：where c>? and d=?,那麼即使c的區分度更高，也必須把d放在索引的最前列，即建立索引idx_d_c。

實驗：應該如何建立複合索引

在有的文件裡面講到過複合索引的建立規則：ESR原則：精確(Equal)匹配的欄位放在最前面，排序(Sort)條件放中間，範圍(Range)匹配的欄位放在最後面。接下來我們來探索一下該方法是否正確。

例子：存在員工表employees

mysql> show create table employees;
+-----------+-------------------------------
| Table     | Create Table                                                                                                                                                                                                                                                                          
+-----------+-------------------------------------
| employees | CREATE TABLE `employees` (
  `emp_no` int(11) NOT NULL,
  `birth_date` date NOT NULL,
  `first_name` varchar(14) NOT NULL,
  `last_name` varchar(16) NOT NULL,
  `gender` enum('M','F') NOT NULL,
  `hire_date` date NOT NULL,
  PRIMARY KEY (`emp_no`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-----------+-------------------------------------

-- 資料量約30萬行
mysql> select count(*) from employees;
+----------+
| count(*) |
+----------+
|   300024 |
+----------+

現在需要查詢1998年後入職的first_name為"Ebbe"員工，並按照出生日期升序排序。

其SQL語句如下：

select  emp_no,birth_date,first_name,last_name,gender,hire_date 
from    employees 
where   hire_date >= '1998-01-01'
and     first_name = 'Ebbe'
order by birth_date;

為了優化該SQL語句的效能，需要在表上建立索引，為了保證where與order by都使用到索引，決定建立複合索引，有如下建立順序：

（A）hire_date,first_name,birth_date
（B）hire_date,birth_date,first_name
（C）first_name,hire_date,birth_date
（D）first_name,birth_date,hire_date
（E）birth_date,first_name,hire_date
（F）birth_date,hire_date,first_name
確認哪種順序建立索引是最優的。

Note:
1.date型別佔3個位元組的空間，hire_date和 birth_date都佔用3個位元組的空間。
2.first_name是變長欄位，多使用2個位元組，如果允許為NULL值，還需多使用1個位元組，佔用16個位元組

A選項：hire_date,first_name,birth_date

create index idx_a on employees(hire_date,first_name,birth_date);

其執行計劃如下：

+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+---------------------------------------+
| id | select_type | table     | partitions | type  | possible_keys | key   | key_len | ref  | rows | filtered | Extra                                 |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+---------------------------------------+
|  1 | SIMPLE      | employees | NULL       | range | idx_a         | idx_a | 19      | NULL | 5678 |    10.00 | Using index condition; Using filesort |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+---------------------------------------+

這裡key_len長度為19，令人不解，hire_date是非等值查詢，理論上key_len應該為3，通過使用MySQL workbench檢視執行計劃，也可以發現索引只使用了hire_date列(如下圖)。為什麼會是19而不是3呢？實在令人費解，思考了好久也沒有想明白，如有知道，望各位大神不吝解答。

B選項：hire_date,birth_date,first_name

為避免干擾，刪除上面建立的索引idx_a，然後建立idx_b。

create index idx_b on employees(hire_date,birth_date,first_name);

其執行計劃如下：

+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+---------------------------------------+
| id | select_type | table     | partitions | type  | possible_keys | key   | key_len | ref  | rows | filtered | Extra                                 |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+---------------------------------------+
|  1 | SIMPLE      | employees | NULL       | range | idx_b         | idx_b | 3       | NULL | 5682 |    10.00 | Using index condition; Using filesort |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+---------------------------------------+

這裡key_len長度為3，hire_date是非等值查詢，導致後面的索引列無法使用到。

C選項：first_name,hire_date,birth_date

為避免干擾，刪除上面建立的索引idx_b，然後建立idx_c。

create index idx_c on employees(first_name,hire_date,birth_date);

其執行計劃如下：

+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+---------------------------------------+
| id | select_type | table     | partitions | type  | possible_keys | key   | key_len | ref  | rows | filtered | Extra                                 |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+---------------------------------------+
|  1 | SIMPLE      | employees | NULL       | range | idx_c         | idx_c | 19      | NULL |    5 |   100.00 | Using index condition; Using filesort |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+---------------------------------------+

這裡key_len長度為19，first_name是等值查詢，可以繼續使用hire_date列，因為hire_date列是非等值查詢，導致索引無法繼續使用birth_date。

D選項：first_name,birth_date,hire_date

為避免干擾，刪除上面建立的索引idx_c，然後建立idx_d。

create index idx_d on employees(first_name,birth_date,hire_date);

其執行計劃如下：

+----+-------------+-----------+------------+------+---------------+-------+---------+-------+------+----------+-----------------------+
| id | select_type | table     | partitions | type | possible_keys | key   | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+-----------+------------+------+---------------+-------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | employees | NULL       | ref  | idx_d         | idx_d | 16      | const |  190 |    33.33 | Using index condition |
+----+-------------+-----------+------------+------+---------------+-------+---------+-------+------+----------+-----------------------+

這裡key_len長度為16，first_name是等值查詢，在謂詞過濾中未使用birth_date，導致只有first_name列使用上索引，但是birth_date列用於排序，上面執行計劃顯示SQL最終並沒有排序，說明資料是從索引按照birth_date有序取出的。

E選項：birth_date,first_name,hire_date

為避免干擾，刪除上面建立的索引idx_d，然後建立idx_e。

create index idx_e on employees(birth_date,first_name,hire_date);

其執行計劃如下：

+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-----------------------------+
| id | select_type | table     | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra                       |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-----------------------------+
|  1 | SIMPLE      | employees | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 299468 |     3.33 | Using where; Using filesort |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-----------------------------+

這裡未使用到索引，說明排序列放在複合索引的最前面是無法被使用到的。

F選項：birth_date,hire_date,first_name

為避免干擾，刪除上面建立的索引idx_e，然後建立idx_f。

create index idx_f on employees(birth_date,hire_date,first_name);

其執行計劃如下：

+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-----------------------------+
| id | select_type | table     | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra                       |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-----------------------------+
|  1 | SIMPLE      | employees | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 299468 |     3.33 | Using where; Using filesort |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-----------------------------+

與E選項一樣，這裡未使用到索引，說明排序列放在複合索引的最前面是無法被使用到的。

通過上面的6個索引測試，我們發現，等值查詢列和範圍查詢列放在複合索引前面，複合索引都能被使用到，只是使用到的列可能不一樣。哪種方式建立索引最好呢？MySQL的查詢優化器是基於開銷(cost)來選擇最優的執行計劃的，我們不妨來看看上面的6個索引的執行開銷。

索引         開銷cost
----------   ------------
idx_a        8518
idx_b        8524
idx_c        13
idx_d        228
idx_e        78083
idx_f        78083

通過上面的開銷，可以看到：

idx_a和idx_b：索引使用範圍查詢欄位開頭，導致索引只能使用到第一列，無法消除排序，導致開銷較大；
idx_c和idx_d：索引使用等值查詢欄位開頭，範圍查詢和排序位於後面，開銷是最小的；
idx_e和idx_f ：索引使用排序欄位開頭，導致索引無法被使用到，走的全表掃描，開銷巨大。

更進一步，idx_c和idx_d如何選擇呢？idx_c使用索引進行等值查詢+範圍查詢，然後對資料進行排序；idx_d使用索引進行等值查詢+索引條件下推查詢，然後按照順序直接獲取資料。兩種方式各有優劣，我們不妨再來看一個例子：

把上面6個索引都加到表上，看看如下SQL會選擇哪個索引。

mysql> show index from employees;
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table     | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| employees |          0 | PRIMARY  |            1 | emp_no      | A         |      299468 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_a    |            1 | hire_date   | A         |        5355 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_a    |            2 | first_name  | A         |      290745 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_a    |            3 | birth_date  | A         |      299468 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_b    |            1 | hire_date   | A         |        6237 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_b    |            2 | birth_date  | A         |      297591 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_b    |            3 | first_name  | A         |      299468 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_c    |            1 | first_name  | A         |        1260 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_c    |            2 | hire_date   | A         |      293517 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_c    |            3 | birth_date  | A         |      299468 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_d    |            1 | first_name  | A         |        1218 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_d    |            2 | birth_date  | A         |      294525 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_d    |            3 | hire_date   | A         |      298095 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_e    |            1 | birth_date  | A         |        4767 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_e    |            2 | first_name  | A         |      292761 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_e    |            3 | hire_date   | A         |      299468 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_f    |            1 | birth_date  | A         |        4767 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_f    |            2 | hire_date   | A         |      297864 | NULL     | NULL   |      | BTREE      |         |               |
| employees |          1 | idx_f    |            3 | first_name  | A         |      299468 | NULL     | NULL   |      | BTREE      |         |               |
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

SQL1

mysql> explain select  emp_no,birth_date,first_name,last_name,gender,hire_date 
from    employees 
where   hire_date >= '1998-01-01'
and     first_name = 'Ebbe'
order by birth_date;
+----+-------------+-----------+------------+-------+-------------------------+-------+---------+------+------+----------+---------------------------------------+
| id | select_type | table     | partitions | type  | possible_keys           | key   | key_len | ref  | rows | filtered | Extra                                 |
+----+-------------+-----------+------------+-------+-------------------------+-------+---------+------+------+----------+---------------------------------------+
|  1 | SIMPLE      | employees | NULL       | range | idx_a,idx_b,idx_c,idx_d | idx_c | 19      | NULL |    5 |   100.00 | Using index condition; Using filesort |
+----+-------------+-----------+------------+-------+-------------------------+-------+---------+------+------+----------+---------------------------------------+

這裡MySQL自動選擇了idx_c，是因為first_name+hire_date兩個欄位已經將資料過濾了只有5行，由於資料少，排序非常快。反之，如果選擇idx_d，則需要先通過first_name欄位過濾出符合條件的190行資料，然後再使用hire_date篩選資料，工作量較大。

SQL2

mysql> explain select  emp_no,birth_date,first_name,last_name,gender,hire_date 
from    employees 
where   hire_date >= '1980-01-01'
and     first_name = 'Ebbe'
order by birth_date;
+----+-------------+-----------+------------+------+-------------------------+-------+---------+-------+------+----------+-----------------------+
| id | select_type | table     | partitions | type | possible_keys           | key   | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+-----------+------------+------+-------------------------+-------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | employees | NULL       | ref  | idx_a,idx_b,idx_c,idx_d | idx_d | 16      | const |  190 |    50.00 | Using index condition |
+----+-------------+-----------+------------+------+-------------------------+-------+---------+-------+------+----------+-----------------------+

如果選擇idx_c，first_name+hire_date兩個欄位通過索引過濾資料之後，資料量較大，導致排序非常慢。MySQL自動選擇了idx_d，通過索引的first_name列過濾資料，並通過索引條件下推過濾hire_date欄位，然後從索引中有序的取出資料，相對來說，由於使用idx_d無需排序，速度會更快。

（四）複合索引總結

1.複合索引的建立，如果存在多個等值查詢，則將選擇性好的列放在最前面，選擇性差的列放在後面；

2.複合索引的建立，如果涉及到等值查詢和範圍查詢，不管非等值查詢的列的選擇性如何好，等值查詢的欄位要放在非等值查詢的前面；

3.複合索引的建立，如果涉及到等值查詢和範圍查詢和排序(order by、group by)，則等值查詢放在索引最前面，範圍查詢和排序哪個在前，哪個在後，需要根據實際場景決定。如果範圍查詢在前，則無法使用到索引的有序性，需filesort，適用於返回結果較少的SQL，因為結果少則排序開銷小；如果排序在前，則可以使用到索引的有序性，但是需要回表(或者索引條件下推)去查詢資料，適用於返回結果較多的SQL，因為無需排序，直接取出資料。

4.複合索引的建立，一定不能把order by、group by的列放在索引的最前面，因為查詢中總是where先於order by執行；

5.使用索引進行範圍查詢會導致後續索引欄位無法被使用，如果有排序，無法消除filesort排序。例子：a_b_c索引，where a>? and b = ? order by c，則a可以被使用到，b無法被使用，c欄位需filesort。

【完】

Note：作者是一枚MySQL菜鳥，文章僅代表個人觀點，如有不對，敬請指出，謝謝。

MySQL複合索引
2024-06-29
MySql索引
雜湊索引在MySQL中的探究
2022-05-14
索引MySql
mysql 索引合併
2021-09-10
MySql索引
mysql組合索引，abc索引命中
2020-09-23
MySql索引
DataFrame刪除複合索引
2024-05-03
索引
MongoDB複合索引詳解
2018-03-23
MongoDB索引
mysql覆蓋索引高效能的探究
2021-09-11
MySql索引
MySQL的聯合索引
2024-06-13
MySql索引
mysql的組合索引
2024-07-26
MySql索引
oracle複合索引介紹(多欄位索引)
2019-05-10
Oracle索引
MongoDB中複合索引結構
2020-10-23
MongoDB索引
MySQL 組合索引不生效
2019-09-16
MySql索引
【Mysql】InnoDB 中的聚簇索引、二級索引、聯合索引
2021-07-27
MySql索引
MySQL學習筆記：組合索引-索引下推
2020-09-22
MySql筆記索引
mysql 建立和刪除聯合索引
2020-11-20
MySql索引
MySQL 配置索引頁的合併閾值
2022-03-17
MySql索引
MySQL組合索引和最左匹配原則
2019-03-11
MySql索引
MySQL 優化之 index_merge （索引合併）
2018-05-11
MySql優化Index索引
mysql 聯合索引的兩種特殊場景
2024-03-29
MySql索引
「生產事故」MongoDB複合索引引發的災難
2020-12-09
MongoDB索引
MySQL學習筆記：組合索引-最左原則
2020-09-22
MySql筆記索引
Mysql中聯合索引的最左匹配原則
2019-01-04
MySql索引
MySQL 調優之如何正確使用聯合索引
2021-11-11
MySql索引
mysql索引之字首索引
2019-08-20
MySql索引
ElasticSearch 索引 VS MySQL 索引
2020-10-09
Elasticsearch索引MySql
mysql索引
2024-05-11
MySql索引
[Mysql]索引
2024-07-05
MySql索引
MySQL 索引
2024-12-05
MySql索引
MySQL — 索引
2022-03-22
MySql索引
MySQL中為什麼要使用索引合併(Index Merge)？
2024-07-12
MySql索引Index
MySQL建立表的時候建立聯合索引的方法
2024-06-06
MySql索引
為什麼ElasticSearch比MySQL更適合全文索引
2021-02-20
ElasticsearchMySql索引
MySQL索引系列：全文索引
2020-11-07
MySql索引
MySQL 字串索引和字首索引
2024-11-06
MySql字串索引
索引與null（二）：組合索引
2019-06-01
索引Null
故障案例：MySQL唯一索引有重複值，官方卻說This is not a bug
2023-02-17
MySql索引
MySQL 索引初探
2018-11-09
MySql索引
Mysql索引使用
2024-01-12
MySql索引

MySQL複合索引探究

相關文章