一、索引優化
1,單表索引優化
建表
CREATE TABLE IF NOT EXISTS article( id INT(10) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT, author_id INT(10) UNSIGNED NOT NULL, category_id INT(10) UNSIGNED NOT NULL, views INT(10) UNSIGNED NOT NULL, comments INT(10) UNSIGNED NOT NULL, title VARCHAR(255) NOT NULL, content TEXT NOT NULL ); INSERT INTO article(author_id,category_id,views,comments,title,content) VALUES (1,1,1,1,'1','1'), (2,2,2,2,'2','2'), (1,1,3,3,'3','3');
查詢案例
- 查詢category_id為1且comments 大於1的情況下,views最多的article_id。
- 查詢語句:SELECT id, author_id FROM article WHERE category_id = 1 AND comments > 1 ORDER BY views DESC LIMIT 1;
- 分析語句:EXPLAIN SELECT id, author_id FROM article WHERE category_id = 1 AND comments > 1 ORDER BY views DESC LIMIT 1;
- 分析:可看到上述type為ALL,證明未命中索引,並且Using filesort使用了檔案排序。故而可進行優化到使用索引。
優化一:
- 建立索引:CREATE INDEX idx_article_ccv ON article(category_id,comments,views);
- 檢視當前索引:SHOW INDEX FROM article;
- 檢視執行計劃:EXPLAIN SELECT id, author_id FROM article WHERE category_id = 1 AND comments > 1 ORDER BY views DESC LIMIT 1;
- 分析:(刪除索引:DROP INDEX idx_article_ccv ON article;)
- 因為按照B+Tree索引的工作原理,先排序 category_id,如果遇到相同的 category_id 則再排序comments,如果遇到相同的 comments 則再排序 views。
- 最左字首匹配原則,當comments欄位在聯合索引裡處於中間位置時,因為
comments>1
條件是一個範圍值(所謂 range),MySQL 無法利用索引再對後面的views部分進行檢索,即 range 型別查詢欄位後面的索引無效。 - 如果將條件comments > 1改成comments = 1,則當前索引為型別為ref,並且不存在filesort。但是不滿足題意
優化二:
- 建立索引:create index idx_article_ccv on article(category_id, views);
- 檢視當前索引:SHOW INDEX FROM article;
- 再次檢視執行計劃:EXPLAIN SELECT id, author_id FROM article WHERE category_id = 1 AND comments > 1 ORDER BY views DESC LIMIT 1;
- 可看到此時滿足條件,為方便後續測試此時依然刪除當前索引:DROP INDEX idx_article_ccv ON article;
2,兩表索引優化
建表
CREATE TABLE IF NOT EXISTS class( id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT, card INT(10) UNSIGNED NOT NULL, PRIMARY KEY(id) ); CREATE TABLE IF NOT EXISTS book( bookid INT(10) UNSIGNED NOT NULL AUTO_INCREMENT, card INT(10) UNSIGNED NOT NULL, PRIMARY KEY(bookid) ); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card) VALUES(FLOOR(1+(RAND()*20)));
查詢案例
- 實現兩表的連線,連線條件是 class.card = book.card
- 查詢語句:SELECT * FROM class LEFT JOIN book ON class.card = book.card;
- 分析語句:EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card;
- 分析:type 有 ALL ,rows 為表中資料總行數,說明 class 和 book 進行了全表檢索
新增右表索引
- CREATE INDEX idx_book_card ON book(card);
- 分析語句:EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card;
- 分析:滿足條件,故而刪除當前索引,方便後續測試:DROP INDEX idx_book_card ON book;
- 這是由左連線特性決定的。LEFT JOIN條件用於確定如何從右表搜尋行,左邊一定都有,所以右邊是我們的關鍵點,一定需要建立索引。
- 左表連線右表,則需要拿著左表的資料去右表裡面查,索引需要在右表中建立索引
新增左表索引
- CREATE INDEX idx_class_card ON class(card);
- 分析語句:EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card;
- 分析:有上圖可看出來索引未起到太大的作用,因為是左連線。左表驅動右邊的時候由於左表不論如何都需要迴圈遍歷,故而rows還是原來的,其實這樣並沒有太大的效果
- 將左連線轉成右連線:EXPLAIN SELECT * FROM class RIGHT JOIN book ON class.card = book.card;
- 分析:滿足條件方便,後續測試刪除當前索引:DROP INDEX idx_class_card ON class
- 這是因為RIGHT JOIN條件用於確定如何從左表搜尋行,右邊一定都有,所以左邊是我們的關鍵點,一定需要建立索引。
- class RIGHT JOIN book :book 裡面的資料一定存在於結果集中,我們需要拿著 book 表中的資料,去 class 表中搜尋,所以索引需要建立在 class 表中
3,三表索引優化
建表
CREATE TABLE IF NOT EXISTS phone( phoneid INT(10) UNSIGNED NOT NULL AUTO_INCREMENT, card INT(10) UNSIGNED NOT NULL, PRIMARY KEY(phoneid) )ENGINE=INNODB; INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card) VALUES(FLOOR(1+(RAND()*20)));
查詢案例:
- 實現三表聯查:SELECT * FROM class LEFT JOIN book ON class.card = book.card LEFT JOIN phone ON book.card = phone.card
- 分析語句:EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card LEFT JOIN phone ON book.card = phone.card
- 結論:
- type 有ALL,rows 為表資料總行數,說明 class、 book 和 phone 表都進行了全表檢索
- Extra 中 Using join buffer ,表明連線過程中使用了 join 緩衝區
建立索引:
-
ALTER TABLE book ADD INDEX x (card); ALTER TABLE phone ADD INDEX y (card);
- 分析語句:EXPLAIN SELECT * FROM class LEFT JOIN book ON class.card = book.card LEFT JOIN phone ON book.card = phone.card
- 分析:進行 LEFT JOIN ,永遠都在右表的欄位上建立索引
4,總結
將 left join 看作是兩層巢狀 for 迴圈
- 儘可能減少Join語句中的NestedLoop的迴圈總次數;
- 永遠用小結果集驅動大的結果集(在大結果集中建立索引,在小結果集中遍歷全表);
- 優先優化NestedLoop的內層迴圈;
- 保證Join語句中被驅動表上Join條件欄位已經被索引;
- 當無法保證被驅動表的Join條件欄位被索引且記憶體資源充足的前提下,不要太吝惜JoinBuffer的設定;
二、索引失效
建立表:
CREATE TABLE staffs( id INT PRIMARY KEY AUTO_INCREMENT, `name` VARCHAR(24)NOT NULL DEFAULT'' COMMENT'姓名', `age` INT NOT NULL DEFAULT 0 COMMENT'年齡', `pos` VARCHAR(20) NOT NULL DEFAULT'' COMMENT'職位', `add_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT'入職時間' )CHARSET utf8 COMMENT'員工記錄表'; INSERT INTO staffs(`name`,`age`,`pos`,`add_time`) VALUES('z3',22,'manager',NOW()); INSERT INTO staffs(`name`,`age`,`pos`,`add_time`) VALUES('July',23,'dev',NOW()); INSERT INTO staffs(`name`,`age`,`pos`,`add_time`) VALUES('2000',23,'dev',NOW());
1,索引失效準則
- 全值匹配我最愛
- 最佳左字首法則:如果索引了多例,要遵守最左字首法則。指的是查詢從索引的最左前列開始並且不跳過索引中的列。
- 不在索引列上做任何操作(計算、函式、(自動or手動)型別轉換),會導致索引失效而轉向全表掃描
- 儲存引擎不能使用索引中範圍條件右邊的列
- 儘量使用覆蓋索引(只訪問索引的查詢(索引列和查詢列一致)),減少
select *
- mysql在使用不等於(!=或者<>)的時候無法使用索引會導致全表掃描(並不絕對,需考慮成本問題,例如id!=''時還是會用到索引的)
is null
,is not null
也無法使用索引(並不絕對,需考慮成本問題)- like以萬用字元開頭(’%abc…’)mysql索引失效會變成全表掃描操作(如果是'a%'則使用range索引)
- 字串不加引號索引失效
- 少用or,用它連線時會索引失效
2,索引失效案例
a)建立複合索引
#建立複合索引
ALTER TABLE staffs ADD INDEX index_staffs_nameAgePos(`name`,`age`,`pos`);
#檢視索引
SHOW INDEX FROM staffs;
b)where條件匹配
- 當順序匹配時:
- 當不存在最左索引(name)時:可以看到此時索引失效
- 當中間索引不存在(age)時:可以看到此時有效索引只有一個const,即:只有name有效
- 當再索引列上計算或者使用函式時,會導致索引失效:使用left(name,4)='July'
- 當使用範圍索引之後,後續的索引就會失效:name條件不變,把age由原來的等於變成大於,此時索引型別就由ref -> range
- 儘量使用覆蓋索引(只訪問索引的查詢(索引列和查詢列一致)),減少
select *
- like中使用%的位置決定是否使用索引(如果%在左邊或者左右都有索引會失效,如果%只在右邊索引有效)
-
- 如果確實是需要將%放在左邊可採用覆蓋索引優化(只查詢需要的列並命中到索引上)
- 字串不加單引號索引失效:name=2000和name='2000'
- 少用or,用它連線時會索引失效
- mysql在使用不等於(!=或者<>)的時候無法使用索引會導致全表掃描(當然也並不絕對,這裡有一個回執成本問題)
- is null,is not null 會導致索引失效:key = null 表示索引失效(並不絕對,會考慮成本問題)
注意在in、!=、is null和is not null,到底什麼時候索引,什麼時候採用全表掃描呢? 詳情描述請點選檢視
成本。對於使用二級索引(innodb)進行查詢來說,成本組成主要有兩個方面:
讀取二級索引記錄的成本
將二級索引記錄執行回表操作,也就是到聚簇索引中找到完整的使用者記錄的操作所付出的成本。
很顯然,要掃描的二級索引記錄條數越多,那麼需要執行的回表操作的次數也就越多,達到了某個比例時,使用二級索引執行查詢的成本也就超過了全表掃描的成本
(舉一個極端的例子,比方說要掃描的全部的二級索引記錄,那就要對每條記錄執行一遍回表操作,自然不如直接掃描聚簇索引來的快)。
所以MySQL優化器在真正執行查詢之前,對於每個可能使用到的索引來說,都會預先計算一下需要掃描的二級索引記錄的數量。所以對於以上三種查詢條件是否會命中索引就取決於二級索引查詢的成本與全域性查詢成本的高低。
三、索引案例
1,建表
create table test03( id int primary key not null auto_increment, c1 char(10), c2 char(10), c3 char(10), c4 char(10), c5 char(10) ); insert into test03(c1,c2,c3,c4,c5) values ('a1','a2','a3','a4','a5'); insert into test03(c1,c2,c3,c4,c5) values ('b1','b2','b3','b4','b5'); insert into test03(c1,c2,c3,c4,c5) values ('c1','c2','c3','c4','c5'); insert into test03(c1,c2,c3,c4,c5) values ('d1','d2','d3','d4','d5'); insert into test03(c1,c2,c3,c4,c5) values ('e1','e2','e3','e4','e5'); #建立複合索引 create index idx_test03_c1234 on test03(c1,c2,c3,c4);
2,案例
#只有where EXPLAIN SELECT * FROM test03 WHERE c1='a1' AND c2='a2' AND c3='a3' AND c4='a4'; EXPLAIN SELECT * FROM test03 WHERE c4='a4' AND c3='a3' AND c2='a2' AND c1='a1'; EXPLAIN SELECT * FROM test03 WHERE c1='a1' AND c2='a2' AND c3>'a3' AND c4='a4'; EXPLAIN SELECT * FROM test03 WHERE c1='a1' AND c2='a2' AND c4>'a4' AND c3='a3'; #where條件與order by 結合 EXPLAIN SELECT * FROM test03 WHERE c1='a1' AND c2='a2' AND c4='a4' ORDER BY c3; EXPLAIN SELECT * FROM test03 WHERE c1='a1' AND c2='a2' ORDER BY c3; EXPLAIN SELECT * FROM test03 WHERE c1='a1' AND c2='a2' ORDER BY c4; EXPLAIN SELECT * FROM test03 WHERE c1='a1' AND c5='a5' ORDER BY c2, c3; EXPLAIN SELECT * FROM test03 WHERE c1='a1' AND c5='a5' ORDER BY c3, c2; EXPLAIN SELECT * FROM test03 WHERE c1='a1' AND c2='a2' ORDER BY c2, c3; EXPLAIN SELECT * FROM test03 WHERE c1='a1' AND c2='a2' AND c5='a5' ORDER BY c2, c3; EXPLAIN SELECT * FROM test03 WHERE c1='a1' AND c2='a2' AND c5='a5' ORDER BY c3, c2; #where與group by結合 EXPLAIN SELECT * FROM test03 WHERE c1='a1' AND c4='a4' GROUP BY c2, c3; EXPLAIN SELECT * FROM test03 WHERE c1='a1' AND c4='a4' GROUP BY c3, c2;
3,案例分析
a)where查詢
b)where與order by結合
c)where與group by
結論:group by 基本上都需要進行排序(使用情況基本與order by相同,索引順序均會出現在where之後),但凡使用不當,會有臨時表產生。
4,索引失效總結
a)建議
- 對於單鍵索引,儘量選擇針對當前query過濾性更好的索引
- 在選擇組合索引的時候,當前query中過濾性最好的欄位在索引欄位順序中,位置越靠左越好。
- 在選擇組合索引的時候,儘量選擇可以能包含當前query查詢條件中where子句更多欄位的索引
- 儘可能通過分析統計資訊和調整query的寫法來達到選擇合適索引的目的
b)案例