Why Is an Indexed Query Slower on a Table with Tens of Millions of Rows? The Back-to-Table Lookup Problem

Posted by yoylee_web on 2018-08-01

Tags: database, index

Environment
- Database: TiDB (a database that is very similar to MySQL)
- Table name: index_basedata
- Table size: 13,000,000 rows
- Table indexes: one ordinary (non-unique) index on the column "year"
- Test SQL:
  - SQL1: select brand from index_basedata where year = 2018 group by day limit 5;
  - SQL2: select brand from index_basedata where month = 201807 group by day limit 5;
  - SQL3: select brand from index_basedata where year = 2018 limit 5;
  - SQL4: select brand from index_basedata where month = 201807 limit 5;
  - SQL1 is compared with SQL2, and SQL3 with SQL4.
Problem
- Measured execution times:
  - SQL1: 23.6s
  - SQL2: 4.5s
  - SQL3: 0.007s
  - SQL4: 1.4s
- explain output (for background on TiDB and explain, see https://pingcap.com/docs-cn/overview/#tidb-%e6%95%b4%e4%bd%93%e6%9e%b6%e6%9e%84):
  - SQL1: select brand from xcar_index_basedata_noprice where year = 2018 group by day limit 5;
  - SQL2: select brand from xcar_index_basedata_noprice where month = 201807 group by day limit 5;
  - SQL3: select brand from xcar_index_basedata_noprice where year = 2018 limit 5;
  - SQL4: select brand from xcar_index_basedata_noprice where month = 201807 limit 5;
- The explain plans (screenshots in the original post) show that SQL1 and SQL3 use the index, while SQL2 and SQL4 do not.
- For SQL1 and SQL2, the query that uses the index should be the faster one, yet the indexed SQL1 takes about 5 times as long as the non-indexed SQL2. Why?
- For SQL3 and SQL4, things are back to normal: the indexed query is faster than the non-indexed one. Why do the two comparisons differ so much?
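Answer (this is my personal understanding; corrections are welcome)
- Between SQL1 and SQL2, SQL1 locates the matching rows through the index faster than SQL2 finds them by scanning. However, the index on year only yields row addresses. To evaluate group by day and return brand, SQL1 has to go back to the table and fetch the real data for every address, so the back-to-table cost is severe.
- SQL2 reads the row data directly, so its group by needs no back-to-table lookup.
- SQL2, SQL3 and SQL4 behave as expected. SQL3 also goes back to the table, but without group by it can presumably stop as soon as limit 5 is satisfied, so only a handful of lookups are needed.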
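The explanation above can be sketched with a toy in-memory model. This is only an illustration, not how TiDB stores data: the table is a Python dict, the index on year is a sorted list of (year, row id) pairs, and all function names and generated values are made up for the example. The point it shows is that the index path pays one table lookup per matching row before group by day can run, while the scan path already has day and brand in hand.

```python
from bisect import bisect_left, bisect_right


def build_table(n_rows):
    """Toy table: row id -> (year, month, day, brand). All values are made up."""
    table = {}
    for rid in range(n_rows):
        year = 2018 if rid % 10 else 2017          # 90% of rows are year 2018
        month = year * 100 + (rid % 12) + 1        # e.g. 201807
        day = month * 100 + (rid % 28) + 1         # e.g. 20180715
        table[rid] = (year, month, day, f"brand{rid % 7}")
    return table


def build_year_index(table):
    """Secondary index on year: sorted (year, row id) pairs, no other columns."""
    return sorted((row[0], rid) for rid, row in table.items())


def group_by_day_via_index(table, index, year):
    """Index path: find row ids in the index, then fetch each row from the table."""
    lo = bisect_left(index, (year, -1))
    hi = bisect_right(index, (year, float("inf")))
    lookups = 0
    groups = {}
    for _, rid in index[lo:hi]:
        row = table[rid]                 # back-to-table lookup, one per matching row
        lookups += 1
        groups.setdefault(row[2], row[3])  # group by day, keep one brand per group
    return groups, lookups


def group_by_day_via_scan(table, month):
    """Scan path: read every row once; day and brand come with the row itself."""
    groups = {}
    for row in table.values():
        if row[1] == month:
            groups.setdefault(row[2], row[3])
    return groups, 0                     # no separate back-to-table lookups
```

In a real distributed store each of those per-row lookups can be a random read, which is far more expensive than the sequential reads of a scan. That cost difference is not modeled here; the sketch only counts the lookups.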
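What is a back-to-table lookup?
- In plain terms: if the columns the select needs are all indexed columns, no back-to-table lookup is required (in MySQL an index is sorted by the values of the indexed columns, so the index nodes already hold those values). If the select needs non-indexed columns, the engine has to follow the index entry into the table and read those columns from the row. That extra step is the back-to-table lookup.
- Example:
- Test environment: same as above
- Test SQL:
  - SQL1: select brand from index_basedata where year = 2018 group by day limit 5;
  - Execution time: 21.8s
  - explain:
  - The query uses the index "year", so the indexed column is year. But brand in select brand from ... is not an indexed column, so a back-to-table lookup is required (the plan shows a TableScan, and the IndexLookUp operator also indicates a back-to-table lookup). That is why it takes so long.
  - SQL2: select brand from index_basedata where year = 2018 group by year limit 5;
  - Execution time: 21.7s
  - explain:
  - The query uses the index "year", so the indexed column is year. But brand in select brand from ... is not an indexed column, so a back-to-table lookup is required (the plan shows a TableScan, and the IndexLookUp operator also indicates a back-to-table lookup). That is why it takes so long. In addition, the group by in SQL2 is on the indexed column, so the plan uses StreamAgg, unlike SQL1.
  - SQL3: select year from index_basedata where year = 2018 group by year limit 5;
  - Execution time: 2.5s
  - explain:
  - The plan has no TableScan and uses IndexReader instead of IndexLookUp, which means the indexed column is read directly from the index.
- Summary: in this example, SQL3 references only the indexed column and needs no back-to-table lookup, while SQL1 and SQL2 do, which is why they take so long. In other words, when back-to-table lookups are heavy, an indexed query can be slower than one that uses no index at all.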
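The difference between an index-only read and an index read followed by a back-to-table lookup can be reproduced on a small scale. The sketch below uses SQLite from the Python standard library as a stand-in, not TiDB, so the plan text differs from TiDB's IndexReader and IndexLookUp operators. The index name idx_year and the column types are assumptions made for the example. SQLite reports an index that satisfies the whole query as a COVERING INDEX.

```python
import sqlite3


def plan_details(conn, sql):
    """Return the human-readable detail column of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
# Assumed schema: only the columns used in the article's queries.
conn.execute(
    "CREATE TABLE index_basedata "
    "(year INTEGER, month INTEGER, day INTEGER, brand TEXT)"
)
# Hypothetical index name; the article only says the index is on year.
conn.execute("CREATE INDEX idx_year ON index_basedata (year)")

# brand is not in the index, so each index hit must be followed into the table.
needs_table = plan_details(
    conn, "SELECT brand FROM index_basedata WHERE year = 2018"
)
# year is the indexed column, so the index alone can answer the query.
index_only = plan_details(
    conn, "SELECT year FROM index_basedata WHERE year = 2018"
)

print(needs_table)
print(index_only)
```

The first plan mentions the index without the word COVERING, the second one with it. This mirrors the IndexLookUp versus IndexReader distinction described above, under the assumption that both engines treat a single-column secondary index in the same basic way.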
So, can you now make sense of the execution times of the following 5 SQL statements?
- select year from xcar_index_basedata_noprice where year = 2018 group by year limit 5;
- select month from xcar_index_basedata_noprice where month = 201807 group by month limit 5;
- select brand_id from xcar_index_basedata_noprice where month = 201807 group by month limit 5;
- select brand_id from xcar_index_basedata_noprice where year = 2018 group by year limit 5;
- select brand_id from xcar_index_basedata_noprice where year = 2018 group by month limit 5;
- Corresponding execution times:
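One way to reason about the five statements is to classify each by the access path that the article's simplified model predicts. The helper below is a hypothetical sketch: it assumes the table has a single index on year (as in the environment section), and it ignores the cost-based choices a real optimizer makes, so it predicts the kind of plan, not the measured time.

```python
def access_path(index_cols, where_cols, needed_cols):
    """Classify a query under the simplified model used in this article.

    index_cols:  columns stored in the single secondary index, in order
    where_cols:  columns filtered in the where clause
    needed_cols: every column the query references (select, where, group by)
    """
    # The index helps only if the filter hits its leading column.
    if not set(where_cols) & set(index_cols[:1]):
        return "full table scan"
    # Every referenced column is in the index: no back-to-table lookup.
    if set(needed_cols) <= set(index_cols):
        return "index only"
    return "index + back-to-table lookup"


# (where columns, all referenced columns) for the five statements, in order.
FIVE_QUERIES = [
    (["year"], ["year"]),                        # select year ... group by year
    (["month"], ["month"]),                      # select month ... group by month
    (["month"], ["brand_id", "month"]),          # select brand_id ... where month
    (["year"], ["brand_id", "year"]),            # select brand_id ... group by year
    (["year"], ["brand_id", "year", "month"]),   # select brand_id ... group by month
]


def classify_all(index_cols=("year",)):
    """Apply access_path to the five statements; the index on year is an assumption."""
    return [access_path(list(index_cols), w, n) for w, n in FIVE_QUERIES]


for number, path in enumerate(classify_all(), start=1):
    print(f"statement {number}: {path}")
```

Under these assumptions, only the first statement can be answered from the index alone. The two statements that filter on month scan the table, and the last two use the index but still go back to the table for brand_id.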

