【案例】MySQL count操作優化案例一則

楊奇龍發表於2015-09-04
一 背景
 某業務的資料庫定期報 thread_runing 飆高,通定位發現一個慢查詢sql導致會話堆積。執行sql 耗時如下 

  1. root@db 05:32:05>select count(item_id) from xxxtable where selid = 345705650 and end_time > now();
  2. +----------------+
  3. | count(item_id) |
  4. +----------------+
  5. | 2247052 |
  6. +----------------+
  7. 1 row in set (4.65 sec)
二 分析  
慢查詢表結構如下 
  1. root@db >show create table xxxtable \G
  2. *************************** 1. row ***************************
  3.        Table: uac_shop_item_promotion_0091
  4. Create Table: CREATE TABLE `uac_shop_item_promotion_0091` (
  5.   `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT '主鍵',
  6.   `gmt_modified` datetime NOT NULL COMMENT '修改時間',
  7.   `selid` bigint(20) NOT NULL COMMENT '分表欄位',
  8.   `end_time` datetime NOT NULL COMMENT '活動結束時間',
  9.   `item_id` bigint(20) NOT NULL COMMENT '商品id',
  10.   PRIMARY KEY (`id`),
  11.   UNIQUE KEY `idx_uq_item` (`item_id`),
  12.   KEY `idx_deller_id_end_time` (`selid`,`end_time`),
  13.   KEY `idx_deller_id_start_time` (`selid`,`start_time`),
  14.   KEY `idx_seller_item_start` (`selid`,`start_time`,`item_id`)
  15. ) ENGINE=InnoDB AUTO_INCREMENT=42132149 DEFAULT CHARSET=gbk COMMENT='索引表'
  16. 1 row in set (0.00 sec)
很明顯出現問題的sql由於使用了count(item_id) ,而item_id欄位並沒有和 selid 和end_time 構成有效索引  故該sql 沒有合理的使用索引 。檢視其直系計劃
  1. root@db >explain select count(item_id) from xxxtable
  2.         >where selid = 345705650 and end_time > now() \G
  3. *************************** 1. row ***************************
  4.            id: 1
  5.   select_type: SIMPLE
  6.         table: xxxtable
  7.          type: ref
  8. possible_keys: idx_deller_id_end_time,idx_deller_id_start_time,idx_seller_item_start
  9.           key: idx_deller_id_end_time
  10.       key_len: 8
  11.           ref: const
  12.          rows: 1726757
  13.         Extra: Using where
  14. 1 row in set (0.00 sec)
從key_len=8 和Extra: Using where 可以看出MySQL沒有完全利用到idx_deller_id_end_time組合索引而是利用到了 selid欄位作為過濾條件回表查詢。
count(item_id)的意思是符合where條件的結果集中item_id非空集合的總和。
三 如何優化
根據該sql的業務需求是需要獲取到某商家參加活動且活動截止時間大於當前時間的商品總數,可以使用如下sql滿足要求:
  1. select count(*) from xxxtable where selid = 345705650 and end_time > now()
執行時間僅為原來的1/4,新的sql釋出之後thread_running報警消失,業務校驗時間明顯縮短。
  1. root@db >select count(*) from xxxtable where selid = 345705650 and end_time > now();
  2. +----------+
  3. | count(*) |
  4. +----------+
  5. | 2247052 |
  6. +----------+
  7. 1 row in set (0.82 sec)
  8. root@db >select count(1) from xxxtable where selid = 345705650 and end_time > now();
  9. +----------+
  10. | count(1) |
  11. +----------+
  12. | 2247052 |
  13. +----------+
  14. 1 row in set (0.79 sec)
優化後的sql的explain 方式如下:
  1. root@db >explain select count(*) from xxxtable where selid = 345705650 and end_time > now() \G
  2. *************************** 1. row ***************************
  3.            id: 1
  4.   select_type: SIMPLE
  5.         table: xxxtable
  6.          type: range
  7. possible_keys: idx_deller_id_end_time,idx_deller_id_start_time,idx_seller_item_start
  8.           key: idx_deller_id_end_time
  9.       key_len: 16
  10.           ref: NULL
  11.          rows: 1726768
  12.         Extra: Using where; Using index
  13. 1 row in set (0.00 sec)
四 小結
 a 這個問題是在沒有修改索引的基礎中做出的優化,老的sql沒有有效的利用當前的索引導致耗時操作
 b 對於不同count型別的sql 總結如下
   count(*)/count(1) 返回結果集的總和包括null和重複的值。
   count(column) 返回結果集中非空 column 的總和,執行查詢的過程中會校驗欄位是否非空。
 c 在業務設計的時候 滿足業務邏輯的前提下推薦使用count(*).
 d 從官方文件中摘錄 Using where 和 Using index 的區別 
  1. Using index
  2.  The column information is retrieved from the table using only information in the index tree without having to do an additional seek to read the actual row. This strategy can be used when the query uses only columns that are part of a single index.
  3.  If the Extra column also says Using where, it means the index is being used to perform lookups of key values. Without Using where, the optimizer may be reading the index to avoid reading data rows but not using it for lookups. For example, if the index is a covering index for the query, the optimizer may scan it without using it for lookups. For InnoDB tables that have a user-defined clustered index, that index can be used even when Using index is absent from the Extra column. This is the case if type is index and key is PRIMARY.
  4.  Using where
  5.  A WHERE clause is used to restrict which rows to match against the next table or send to the client. Unless you specifically intend to fetch or examine all rows from the table, you may have something wrong in your query if the Extra value is not Using where and the table join type is ALL or index. Even if you are using an index for all parts of a WHERE clause, you may see Using where if the column can be NULL.
如果您覺得從這篇文章受益,可以贊助 北在南方 一瓶飲料 ^_^



來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/22664653/viewspace-1791124/,如需轉載,請註明出處,否則將追究法律責任。

相關文章