【MySQL】效能優化之 count(*) VS count(col)

楊奇龍發表於2013-10-20

原文網址 : http://blog.itpub.net/22664653/viewspace-774679/

MySql優化

優化mysql資料庫時，經常有開發詢問 count(1)和count(primary_key) VS count(*)的效能有何差異？看似簡單的問題，估計會有很多人對此存在認知誤區：

1. 認為count(1) 和 count(primary_key) 比 count(*) 的效能好。

2. count(column) 和 count(*) 效果是一樣的。

本文對上述兩點做如下測試，

測試環境:

root@yang 07:17:04>CREATE TABLE `mytab` (

-> `id` int(10) unsigned NOT NULL,

-> `v1` int(11) default NULL,

-> `v2` int(10) unsigned NOT NULL,

-> KEY `idx_id` (`id`)

-> ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Query OK, 0 rows affected (0.00 sec)

1 select語句不含有where 條件

root@yang 07:41:11>select count(*) from mytab;

+----------+

| count(*) |

+----------+

| 2000000 |

+----------+

1 row in set (0.00 sec)

root@yang 07:41:22>select count(id) from mytab;

+-----------+

| count(id) |

+-----------+

| 2000000 |

+-----------+

1 row in set (0.00 sec)

root@yang 07:41:37>select count(v1) from mytab;

+-----------+

| count(v1) |

+-----------+

| 2000000 |

+-----------+

1 row in set (0.12 sec)

root@yang 07:41:41>select count(v2) from mytab;

+-----------+

| count(v2) |

+-----------+

| 2000000 |

+-----------+

1 row in set (0.00 sec)

以上使用了myisam表做了測試，四種查詢方式的結果有所不同，注意到count(V1) 的時間是0.12s 因為myisam 表的特性其已經儲存了表的總行數， count(*)相對非常快。

coun(v2) 比count(v1) 快是因為v1 欄位可以為空，mysql 在執行count(col) 是表示結果集中有多少個col欄位不為空的記錄，mysql 儲存引擎會去檢查表中說有行記錄中col欄位是否為空，並計算出非空的個數。

2 使用帶有where 條件的查詢：

root@yang 01:13:13>select count(*) from t1 where id < 200000;

+----------+

| count(*) |

+----------+

| 200000 |

+----------+

1 row in set (0.06 sec)

root@yang 01:15:22>explain select count(*) from t1 where id < 200000;

+----+-------------+-------+-------+---------------+--------+---------+------+--------+--------------------------+

+----+-------------+-------+-------+---------------+--------+---------+------+--------+--------------------------+

+----+-------------+-------+-------+---------------+--------+---------+------+--------+--------------------------+

1 row in set (0.00 sec)

root@yang 01:15:30> select count(v1) from t1 where id < 200000;

+-----------+

| count(v1) |

+-----------+

| 200000 |

+-----------+

1 row in set (0.17 sec)

root@yang 01:15:37>explain select count(v1) from t1 where id < 200000;

+----+-------------+-------+-------+---------------+--------+---------+------+--------+-------------+

+----+-------------+-------+-------+---------------+--------+---------+------+--------+-------------+

+----+-------------+-------+-------+---------------+--------+---------+------+--------+-------------+

1 row in set (0.00 sec)

root@yang 01:15:42> select count(v2) from t1 where id < 200000;

+-----------+

| count(v2) |

+-----------+

| 200000 |

+-----------+

1 row in set (0.16 sec)

root@yang 01:15:49>explain select count(v2) from t1 where id < 200000;

+----+-------------+-------+-------+---------------+--------+---------+------+--------+-------------+

+----+-------------+-------+-------+---------------+--------+---------+------+--------+-------------+

+----+-------------+-------+-------+---------------+--------+---------+------+--------+-------------+

1 row in set (0.00 sec)

count(*) 可以使用覆蓋索引，而count(col)不行。v2 是非空列，處理起來應該和count(*)類似才對，這裡顯示卻和v1 欄位的處理一致。如果mysql 優化器處理的足夠好，檢查到欄位為非空時，即可和count(*) 做同樣的處理，這樣速度會更快一些。下面修改索引結構使用複合索引。

root@yang 01:17:07>alter table t1 drop key idx_id,add key idx_id_v1(id,v1);

Query OK, 2000000 rows affected (1.49 sec)

Records: 2000000 Duplicates: 0 Warnings: 0

root@yang 01:17:36> select count(v1) from t1 where id < 200000;

+-----------+

| count(v1) |

+-----------+

| 200000 |

+-----------+

1 row in set (0.07 sec)

root@yang 01:17:49>explain select count(v1) from t1 where id < 200000;

+----+-------------+-------+-------+---------------+-----------+---------+------+--------+--------------------------+

+----+-------------+-------+-------+---------------+-----------+---------+------+--------+--------------------------+

+----+-------------+-------+-------+---------------+-----------+---------+------+--------+--------------------------+

1 row in set (0.00 sec)

對於欄位v1 的查詢效能相對上例中提升兩倍多，當然如果是生產環境可能提升更高。最終面向開發是，最好先有開發修改應用程式中的sql 避免使其選擇count(col)。

對於第二個誤區：認為 count(column) 和 count(*) 是一樣的，其實是有差別的。請看下面的例子：

root@yang 10:01:38>create table t3 (id int ,v1 int ) engine =innodb;

Query OK, 0 rows affected (0.01 sec)

root@yang 10:03:54>insert t3 values (null,null),(1,null),(null,1),(1,null),(null,1),(1,null),(null,null);

Query OK, 7 rows affected (0.00 sec)

Records: 7 Duplicates: 0 Warnings: 0

root@yang 10:03:57>select count(id),count(id),count(v1) from t3;

+-----------+-----------+-----------+

| count(id) | count(id) | count(v1) |

+-----------+-----------+-----------+

| 3 | 3 | 2 |

+-----------+-----------+-----------+

1 row in set (0.00 sec)

count(col) 是表示結果集中有多少個column欄位不為空的記錄。
count(*) 是表示整個結果集有多少條記錄。

3 增加對innodb 儲存引擎的測試

root@yang 01:29:53>alter table t1 engine=innodb;

Query OK, 2000000 rows affected (11.25 sec)

Records: 2000000 Duplicates: 0 Warnings: 0

root@yang 01:30:26>select count(*) from t1;

+----------+

| count(*) |

+----------+

| 2000000 |

+----------+

1 row in set (0.69 sec)

對於 innodb表比myisam 表查詢總行數效能慢是因為innodb 儲存引擎並沒有儲存行的總數，innodb 表支援mvcc ，不同的事務可能看到不同的行記錄數。因此每次count(*) 和count(col)(沒有使用索引的情況下) 都要對錶進行索引掃描，可能大家對最終獲取結果的時間有疑問，為什麼myisam 表是0.17s 左右，而innodb 是0.77s ，因為innodb 表在磁碟儲存的大小比myisam大，掃描的物理page更多。

root@yang 01:30:32>select count(v1) from t1;

+-----------+

| count(v1) |

+-----------+

| 2000000 |

+-----------+

1 row in set (0.77 sec)

root@yang 01:30:40>select count(v2) from t1;

+-----------+

| count(v2) |

+-----------+

| 2000000 |

+-----------+

1 row in set (0.73 sec)

在使用where條件的情況下：等值查詢和使用到索引情況下，myisam 表和innodb的速度是幾乎無差別的，具體的效能表現和where 條件有關。

root@yang 10:14:03>select count(v2) from t1 where id =20000;

+-----------+

| count(v2) |

+-----------+

| 1 |

+-----------+

1 row in set (0.00 sec)

【MySQL】效能優化之 count(*) VS count(col)

相關文章