為什麼升級到 MySQL 8.x 後,帶有多個 IN 值的查詢會更昂貴?

abce發表於2024-07-01

MySQL的範圍最佳化

有多個 IN 值的查詢在 MySQL 查詢最佳化器中,會使用 "等值範圍最佳化"(Equality Range Optimization)。假設我們的查詢是這樣的:

SELECT COUNT(*) FROM test.sbtest1 WHERE id IN (10,50,200,...,30822);

比較列表中有一萬個值。在 MySQL 5.7 下執行這個程式會得到以下執行統計結果:

mysql57 > source query1.sql
+----------+
| count(*) |
+----------+
|    17433 |
+----------+
1 row in set (0.16 sec)

對應的慢查詢日誌(為了便於閱讀做了擷取):

# Time: 2024-06-23T13:34:05.912909Z
# User@Host: msandbox[msandbox] @ localhost []  Id:     6
# Query_time: 0.161071  Lock_time: 0.021591 Rows_sent: 1  Rows_examined: 17433
SET timestamp=1719149645;
select count(*) from test.sbtest1 where id in (...);

handler 的狀態資訊表明最佳化器恰當的使用了索引,最佳化起了作用:

mysql57 > show status like 'ha%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 1     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 2     |
| Handler_mrr_init           | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 0     |
| Handler_read_key           | 17433 |
| Handler_read_last          | 0     |
| Handler_read_next          | 0     |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_next      | 0     |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_update             | 0     |
| Handler_write              | 0     |
+----------------------------+-------+
18 rows in set (0.00 sec)

檢視 EXPLAIN 可以確認範圍型別和索引:

mysql57 > source explain1.sql
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: sbtest1
   partitions: NULL
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 17433
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.06 sec)

然而,在我們的升級例項中,使用相同的表,我們最終得到了以下結果:

mysql80 > source query1.sql
+----------+
| count(*) |
+----------+
|    17433 |
+----------+
1 row in set, 1 warning (0.38 sec)

mysql80 > show warnings;
*************************** 1. row ***************************
  Level: Warning
   Code: 3170
Message: Memory capacity of 8388608 bytes for 'range_optimizer_max_mem_size' exceeded. Range optimization was not done for this query.
1 row in set (0.00 sec)

上述警告已經告訴我們一些問題,這些問題會導致更糟糕的最佳化方案,如下所示:

# Time: 2024-06-23T13:44:51.610282Z
# User@Host: msandbox[msandbox] @ localhost []  Id:    18
# Query_time: 0.385067  Lock_time: 0.000004 Rows_sent: 1  Rows_examined: 1200000
SET timestamp=1719150291;
select count(*) from test.sbtest1 where id in (...);

mysql80 > show status like 'ha%';
+----------------------------+---------+
| Variable_name              | Value   |
+----------------------------+---------+
| Handler_commit             | 1       |
| Handler_delete             | 0       |
| Handler_discover           | 0       |
| Handler_external_lock      | 2       |
| Handler_mrr_init           | 0       |
| Handler_prepare            | 0       |
| Handler_read_first         | 1       |
| Handler_read_key           | 1       |
| Handler_read_last          | 0       |
| Handler_read_next          | 1200000 |
| Handler_read_prev          | 0       |
| Handler_read_rnd           | 0       |
| Handler_read_rnd_next      | 0       |
| Handler_rollback           | 0       |
| Handler_savepoint          | 0       |
| Handler_savepoint_rollback | 0       |
| Handler_update             | 0       |
| Handler_write              | 0       |
+----------------------------+---------+
18 rows in set (0.00 sec)

查詢計劃出人意料地顯示了不同列上的二級索引,而且幾乎掃描了所有錶行:

mysql80 > source explain1.sql
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: sbtest1
   partitions: NULL
         type: index
possible_keys: PRIMARY
          key: k
      key_len: 4
          ref: NULL
         rows: 1183608
     filtered: 50.00
        Extra: Using where; Using index
1 row in set, 2 warnings (0.05 sec)

範圍最佳化需要的記憶體限制由 range_optimizer_max_mem_size 變數定義,但在 MySQL 5.7 和 8.0 中是相同的(8MB)!

那麼,為什麼在新版本中不能對同一個表進行相同的查詢呢?

讓我們檢查一下 Performance Schema 中的相關記憶體佔用情況。在 5.7 版本中(截斷 Performance Schema 表並執行查詢後獲得),記憶體佔用約為 5.5 MB:

mysql57 > select * from performance_schema.memory_summary_by_thread_by_event_name where thread_id=(select THREAD_ID from performance_schema.threads where processlist_id=CONNECTION_ID()) and
    event_name='memory/sql/test_quick_select'G
*************************** 1. row ***************************
                   THREAD_ID: 32
                  EVENT_NAME: memory/sql/test_quick_select
                 COUNT_ALLOC: 104
                  COUNT_FREE: 104
   SUM_NUMBER_OF_BYTES_ALLOC: 5705856
    SUM_NUMBER_OF_BYTES_FREE: 5705856
              LOW_COUNT_USED: 0
          CURRENT_COUNT_USED: 0
             HIGH_COUNT_USED: 104
    LOW_NUMBER_OF_BYTES_USED: 0
CURRENT_NUMBER_OF_BYTES_USED: 0
   HIGH_NUMBER_OF_BYTES_USED: 5705856
1 row in set (0.00 sec)

但在 MySQL 8.0 ,它要大得多,約為 11.5 MB,高於允許的 8 MB 限制:

mysql8.0 > select * from performance_schema.memory_summary_by_thread_by_event_name where thread_id=PS_CURRENT_THREAD_ID() and event_name='memory/sql/test_quick_select'G
*************************** 1. row ***************************
                   THREAD_ID: 47
                  EVENT_NAME: memory/sql/test_quick_select
                 COUNT_ALLOC: 18
                  COUNT_FREE: 18
   SUM_NUMBER_OF_BYTES_ALLOC: 12099576
    SUM_NUMBER_OF_BYTES_FREE: 12099576
              LOW_COUNT_USED: 0
          CURRENT_COUNT_USED: 0
             HIGH_COUNT_USED: 18
    LOW_NUMBER_OF_BYTES_USED: 0
CURRENT_NUMBER_OF_BYTES_USED: 0
   HIGH_NUMBER_OF_BYTES_USED: 12099576
1 row in set (0.00 sec)

增加變數的值,可修復查詢計劃:

mysql80 > set range_optimizer_max_mem_size=12*1024*1024;
Query OK, 0 rows affected (0.00 sec)

mysql80 > source query1.sql
+----------+
| count(*) |
+----------+
|    17433 |
+----------+
1 row in set (0.10 sec)

Przemysław Malkowski 認為,這種記憶體需求差異是不合理的。因此,他報告了這一回歸 bug:

https://bugs.mysql.com/bug.php?id=115327

另外,Przemysław Malkowski 還報告了一個誤導性文件錯誤:

https://bugs.mysql.com/bug.php?id=115062

表上的索引越多,這種迴歸的影響就越大,因為 MySQL 8.0 也受到了另一個相關錯誤的影響:

https://bugs.mysql.com/bug.php?id=104000

最後

升級到 MySQL 8.x 可能具有挑戰性,在實施前應進行充分測試,以避免出現一些非常糟糕的意外情況,如上述情況!不過,從長遠來看,升級是不可避免的,因為 5.7 已經到了 EOL。

另外,有一個工具可以用來檢查任何查詢在新版本中是否有不同的表現。下面是一個快速更新,說明了它是如何發現額外警告的:

$ pt-upgrade h=127.0.0.1,P=5756,u=msandbox,p=***,D=test h=127.0.0.1,P=8055,u=msandbox,p=msandbox,D=test slow57.log

#-----------------------------------------------------------------------
# Logs
#-----------------------------------------------------------------------

File: slow57.log
Size: 141681

#-----------------------------------------------------------------------
# Hosts
#-----------------------------------------------------------------------

host1:

  DSN:       h=127.0.0.1,P=5756
  hostname:  przemek-dbg
  MySQL:     MySQL Community Server (GPL) 5.7.44

host2:

  DSN:       h=127.0.0.1,P=8055
  hostname:  przemek-dbg
  MySQL:     MySQL Community Server - GPL 8.0.37

########################################################################
# Query class F4A5056EC85D02D0
########################################################################

Reporting class because it has diffs, but hasn't been reported yet.

Total queries      1
Unique queries     1
Discarded queries  0

select count(*) from test.sbtest? where id in(?+)

##
## Warning diffs: 1
##

-- 1.

No warning 3170

vs.

   Code: 3170
  Level: Warning
Message: Memory capacity of 8388608 bytes for 'range_optimizer_max_mem_size' exceeded. Range optimization was not done for this query.

select count(*) from test.sbtest1 where id in (90, ... ,13668,15161)

#-----------------------------------------------------------------------
# Stats
#-----------------------------------------------------------------------

failed_queries        0
not_select            0
queries_filtered      0
queries_no_diffs      0
queries_read          1
queries_with_diffs    1
queries_with_errors   0

相關文章