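Quantitative cost analysis
* Original content from the GreatSQL community may not be used without authorization. To reprint it, please contact the editor and credit the source.
Introduction:
In day-to-day database maintenance we often run into slow queries. The usual first step is to run EXPLAIN and look at the execution plan, but the plan only gives basic information, such as whether an index is used, plus a few extra hints such as the use of a temporary table or a sort. It cannot show why other candidate plans were rejected, for example why a query did not use the expected index even though one or several indexes exist.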
explain select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20;
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+
| 1 | SIMPLE | t2 | NULL | range | id_num_unique,idx_age,idx_age_id_num | idx_age | 1 | NULL | 9594 | 100.00 | Using index condition |
| 1 | SIMPLE | t1 | NULL | eq_ref | id_num_unique,idx_age | id_num_unique | 60 | test.t2.id_num | 1 | 50.00 | Using where |
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+
2 rows in set, 1 warning (0.01 sec)
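Take the example above: several possible indexes are listed for table t2, yet idx_age was chosen. EXPLAIN alone does not show why the optimizer picked that index. In this situation we can enable optimizer_trace and follow how MySQL arrives at the optimal execution plan.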
OPTIMIZER_TRACE:
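What is optimizer_trace:
optimizer_trace is a tracing facility that follows the parsing, optimization, and execution of a statement and records what it finds in the INFORMATION_SCHEMA.OPTIMIZER_TRACE table. Each session can only trace the statements it executes itself, and by default the table keeps only the trace of the most recent query.
Usage: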
# Enable the optimizer trace (it is off by default):
set optimizer_trace="enabled=on";
select ...; # put your own query here
SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
# When you have finished inspecting the optimization process, turn the optimizer trace off
set optimizer_trace="enabled=off";
Related parameters:
mysql> show variables like '%optimizer_trace%';
+------------------------------+----------------------------------------------------------------------------+
| Variable_name | Value |
+------------------------------+----------------------------------------------------------------------------+
| optimizer_trace | enabled=on,one_line=off |
| optimizer_trace_features | greedy_search=on,range_optimizer=on,dynamic_range=on,repeated_subselect=on |
| optimizer_trace_limit | 1 |
| optimizer_trace_max_mem_size | 1048576 |
| optimizer_trace_offset | -1 |
+------------------------------+----------------------------------------------------------------------------+
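optimizer_trace: enabled turns optimizer_trace on or off; one_line controls single-line output. With one_line off the trace is printed as formatted JSON, and it is normally left off.
optimizer_trace_features: the items that may be printed in the trace; normally left at the default, which prints all items.
optimizer_trace_limit: the number of traced SQL statements to keep.
optimizer_trace_offset: the offset of the first statement to record; a negative value counts back from the most recently executed statement.
optimizer_trace_max_mem_size: the memory available to optimizer_trace; a trace that exceeds this size is truncated.
optimizer_trace table:
The table has 4 columns:
QUERY: the query statement.
TRACE: the optimization process as JSON text (this is the column to focus on).
MISSING_BYTES_BEYOND_MAX_MEM_SIZE: the optimization process can produce a lot of output; when it exceeds the limit, the excess text is not shown, and this column gives the number of bytes that were dropped.
INSUFFICIENT_PRIVILEGES: whether the user lacks the privileges to view the optimization process. The default is 0 and it is 1 only in certain special cases; we ignore this column for now.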
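Reading the trace:
The TRACE column of the optimizer_trace table shows that a statement goes through three main phases:
"join_preparation": {}, (preparation phase)
"join_optimization": {}, (optimization phase)
"join_execution": {}, (execution phase)
Details of each phase: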
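preparation:
expanded_query : formats the statement and fills in the implicit column names, table names, and so on
transformations_to_nested_joins : query rewriting, for example turning the ON condition of a join into a WHERE condition
optimization:
condition_processing{ : processing of the condition clauses.
transformation{ : the type of transformation applied. The three transformations are:
equality_propagation (equality propagation), e.g. a = b and b = c and c = 5
constant_propagation (constant propagation), e.g. a = 1 AND b > a
trivial_condition_removal (removal of trivial conditions), e.g. 1 = 1
}
}
substitute_generated_columns : substitutes virtual generated columns. In tests with many SQL statements this item never showed any useful information.
table_dependencies : works out the dependencies between tables.
ref_optimizer_key_uses : if the optimizer thinks the query can use ref access, the usable indexes are listed here.
rows_estimation{ : estimates the row count of each table and the cost of scanning it. If the query contains a range scan, the range scan is analyzed and costed here.
table_scan: the number of rows (rows) for a full table scan and its cost (cost).
potential_range_indexes: lists every index on the table, says whether each one is usable, and lists the usable key parts of the index.
analyzing_range_alternatives : analyzes the cost of the available alternatives.
}
considered_execution_plans{ : compares the cost of the feasible plans and picks the relatively best one.
plan_prefix: the part of the plan that precedes the current table.
best_access_path: the best access information found so far.
access_type: how the index is used; corresponds to the type column in EXPLAIN.
condition_filtering_pct: similar to the filtered column in EXPLAIN; this is an estimate.
rows_for_plan: the final number of rows scanned by this plan. This is also an estimate, obtained by multiplying together the resulting_rows values of considered_access_paths and then multiplying by condition_filtering_pct.
cost_for_plan: the cost of this plan, obtained by adding up the cost values of considered_access_paths.
chosen: whether this plan was chosen.
}
attaching_conditions_to_tables : attaches conditions to tables so that each condition filters single-table data as far as possible.
refine_plan : the plan after optimization.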
The JSON in the trace is very long. Because we care about the cost differences between execution plans, we only need to focus on two parts: rows_estimation and considered_execution_plans.
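To pull just those two parts out of a long trace, the JSON can be walked recursively. The following is a minimal Python sketch, not part of the original article: the helper name find_sections and the shortened SAMPLE_TRACE are made up for illustration, and a real trace would come from the TRACE column of INFORMATION_SCHEMA.OPTIMIZER_TRACE.

```python
import json


def find_sections(node, wanted):
    """Recursively collect (key, value) pairs whose key is in `wanted`."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key in wanted:
                found.append((key, value))
            found.extend(find_sections(value, wanted))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_sections(item, wanted))
    return found


# Heavily shortened, hand-written trace used only for illustration.
SAMPLE_TRACE = """
{"steps": [
  {"join_preparation": {"select#": 1, "steps": []}},
  {"join_optimization": {"select#": 1, "steps": [
    {"rows_estimation": [
      {"table": "`basic_person_info2` `t2`",
       "range_analysis": {"table_scan": {"rows": 73845, "cost": 7538.85}}}]},
    {"considered_execution_plans": [
      {"table": "`basic_person_info2` `t2`",
       "rows_for_plan": 9594, "cost_for_plan": 4317.56}]}
  ]}},
  {"join_execution": {"select#": 1, "steps": []}}
]}
"""

sections = find_sections(
    json.loads(SAMPLE_TRACE),
    {"rows_estimation", "considered_execution_plans"},
)
for name, body in sections:
    # Print only the start of each section to keep the output short.
    print(name, json.dumps(body)[:80])
```

The same function works on a full trace, as long as the trace was not truncated by optimizer_trace_max_mem_size and is therefore still valid JSON.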
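Cost model calculation:
Statistics and cost parameters:
The cost calculation uses the number of data pages in the table's primary key index (the clustered index) and the number of rows in the table. Both are available from the InnoDB table statistics in mysql.innodb_table_stats: n_rows is the row count and clustered_index_size is the number of clustered index pages.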
mysql> select * from mysql.innodb_table_stats where table_name='basic_person_info';
+---------------+-------------------+---------------------+--------+----------------------+--------------------------+
| database_name | table_name | last_update | n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+-------------------+---------------------+--------+----------------------+--------------------------+
| test | basic_person_info | 2022-12-23 18:27:24 | 86632 | 737 | 1401 |
+---------------+-------------------+---------------------+--------+----------------------+--------------------------+
1 row in set (0.01 sec)
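The cost model splits operations into two classes, the Server layer and the Engine (storage engine) layer. Server-layer cost is mainly CPU cost and Engine-layer cost is mainly IO cost. For example, reading a data page from disk costs io_block_read_cost = 1, reading one from the buffer pool costs memory_block_read_cost = 0.25, and evaluating whether a row matches the conditions costs row_evaluate_cost = 0.1. In addition there are:
memory_temptable_create_cost (default 1.0): cost of creating an in-memory temporary table.
memory_temptable_row_cost (default 0.1): per-row cost of an in-memory temporary table.
key_compare_cost (default 0.05, as the server_cost output below shows): cost of comparing keys, for example when sorting.
disk_temptable_create_cost (default 20.0): cost of creating an internal MyISAM or InnoDB temporary table.
disk_temptable_row_cost (default 0.5): per-row cost of an internal MyISAM or InnoDB temporary table.
All of these can be looked up in mysql.server_cost and mysql.engine_cost, which show both the default values and any configured values.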
mysql> select * from mysql.server_cost;
+------------------------------+------------+---------------------+---------+---------------+
| cost_name | cost_value | last_update | comment | default_value |
+------------------------------+------------+---------------------+---------+---------------+
| disk_temptable_create_cost | NULL | 2022-05-11 16:09:37 | NULL | 20 |
| disk_temptable_row_cost | NULL | 2022-05-11 16:09:37 | NULL | 0.5 |
| key_compare_cost | NULL | 2022-05-11 16:09:37 | NULL | 0.05 |
| memory_temptable_create_cost | NULL | 2022-05-11 16:09:37 | NULL | 1 |
| memory_temptable_row_cost | NULL | 2022-05-11 16:09:37 | NULL | 0.1 |
| row_evaluate_cost | NULL | 2022-05-11 16:09:37 | NULL | 0.1 |
+------------------------------+------------+---------------------+---------+---------------+
mysql> select * from mysql.engine_cost;
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
| engine_name | device_type | cost_name | cost_value | last_update | comment | default_value |
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
| default | 0 | io_block_read_cost | NULL | 2022-05-11 16:09:37 | NULL | 1 |
| default | 0 | memory_block_read_cost | NULL | 2023-01-09 11:17:39 | NULL | 0.25 |
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
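Formulas:
As described above, the cost model has two components, io_cost and cpu_cost, and io_cost + cpu_cost is the total cost. The calculations are as follows:
Full table scan:
full table scan cost = io_cost + 1.1 + cpu_cost + 1 (the +1.1 on io_cost and the +1 on cpu_cost are hard-coded in the source code; the reason is not known to the author, so simply add them when calculating)
io_cost = clustered_index_size (number of primary key pages in the statistics) * avg_single_page_cost (average cost of reading one page)
avg_single_page_cost = pages_in_memory_percent * 0.25 (memory_block_read_cost) + pages_on_disk_percent * 1.0 (io_block_read_cost)
pages_in_memory_percent is the fraction of leaf pages already loaded into the buffer pool.
pages_on_disk_percent is the fraction of leaf pages not yet loaded into the buffer pool.
So when all the data has already been read into the buffer pool:
io_cost = clustered_index_size * 0.25
When none of it has been read into the buffer pool:
io_cost = clustered_index_size * 1.0
When part of the data is in the buffer pool and part has to be read from disk, the coefficient lies between 0.25 and 1.
cpu_cost = n_rows (row count in the statistics) * 0.1 (row_evaluate_cost)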
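The full table scan formula can be checked with a few lines of Python. This is a sketch of the arithmetic described above, not MySQL's actual implementation; the function names are made up, the constants are the default cost values quoted in this article, and in_memory_fraction = 1.0 assumes every page is already in the buffer pool.

```python
MEMORY_BLOCK_READ_COST = 0.25  # default cost of reading a page from the buffer pool
IO_BLOCK_READ_COST = 1.0       # default cost of reading a page from disk
ROW_EVALUATE_COST = 0.1        # default cost of evaluating one row


def avg_single_page_cost(in_memory_fraction):
    """Average cost of reading one page, weighted by where the pages live."""
    on_disk_fraction = 1.0 - in_memory_fraction
    return (in_memory_fraction * MEMORY_BLOCK_READ_COST
            + on_disk_fraction * IO_BLOCK_READ_COST)


def table_scan_cost(clustered_index_size, n_rows, in_memory_fraction=1.0):
    """io_cost + 1.1 + cpu_cost + 1, as in the formula above."""
    io_cost = clustered_index_size * avg_single_page_cost(in_memory_fraction)
    cpu_cost = n_rows * ROW_EVALUATE_COST
    return io_cost + 1.1 + cpu_cost + 1.0


# t1: 737 clustered index pages, 86734 rows estimated by the optimizer
print(round(table_scan_cost(737, 86734), 2))
# t2: 609 clustered index pages, 73845 rows estimated by the optimizer
print(round(table_scan_cost(609, 73845), 2))
```

With these inputs the sketch gives 8859.75 for t1 and 7538.85 for t2, which are the table_scan costs that appear in the example trace in this article.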
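Cost of an index scan:
The calculation is similar to the full table scan, except that io_cost depends on the number of ranges searched. For example, where a between 1 and 10 or a between 20 and 30 or a between 40 and 50 scans three ranges, so:
io_cost = 3 * avg_single_page_cost
cpu_cost = rows * 0.1 (row_evaluate_cost) + 0.01 (a fine-tuning constant in the code)
A secondary index also needs lookups back to the clustered index:
MySQL treats every lookup as an access to one page, so every lookup counts as one IO. This part of the cost is:
io_cost = rows * avg_single_page_cost
The rows fetched by the lookups also have to be evaluated once more:
cpu_cost = rows * 0.1 (row_evaluate_cost) (note that when the index needs lookups, this value is not counted in the rows_estimation phase; it is added in the considered_execution_plans phase)
So for a query that needs lookups:
io_cost = ranges * avg_single_page_cost + rows * avg_single_page_cost
cpu_cost = rows * 0.1 (row_evaluate_cost) + 0.01 (fine-tuning constant in the code) + rows * 0.1 (row_evaluate_cost)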
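The index formulas can be sketched the same way. The Python below is an illustration of the arithmetic above, not MySQL source code; the function name and the include_lookup_eval flag are assumptions. The flag separates the rows_estimation figure, which leaves out the second row evaluation, from the considered_execution_plans figure, which includes it.

```python
ROW_EVALUATE_COST = 0.1  # default row_evaluate_cost
FINE_TUNING = 0.01       # fine-tuning constant mentioned above


def range_scan_cost(ranges, rows, page_cost=0.25, include_lookup_eval=False):
    """Cost of a secondary-index range scan that looks up every row.

    page_cost defaults to 0.25, i.e. all pages are assumed to be in the
    buffer pool.
    """
    io_cost = ranges * page_cost + rows * page_cost
    cpu_cost = rows * ROW_EVALUATE_COST + FINE_TUNING
    if include_lookup_eval:
        # evaluation of the rows fetched by lookup, added in the
        # considered_execution_plans phase
        cpu_cost += rows * ROW_EVALUATE_COST
    return io_cost + cpu_cost


# idx_age on t2: 1 range, 9594 rows
print(round(range_scan_cost(1, 9594), 2))
print(round(range_scan_cost(1, 9594, include_lookup_eval=True), 2))
# idx_age_id_num on t2: 1 range, 19086 rows
print(round(range_scan_cost(1, 19086), 2))
```

For t2 this reproduces the 3358.16 and 6680.36 reported under rows_estimation in the example trace, and the 4317.56 reported under considered_execution_plans.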
Example:
mysql> set optimizer_trace='enabled=on';
Query OK, 0 rows affected (0.00 sec)
mysql>explain select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20;
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+
| 1 | SIMPLE | t2 | NULL | range | id_num_unique,idx_age,idx_age_id_num | idx_age | 1 | NULL | 9594 | 100.00 | Using index condition |
| 1 | SIMPLE | t1 | NULL | eq_ref | id_num_unique,idx_age | id_num_unique | 60 | test.t2.id_num | 1 | 50.00 | Using where |
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+------+----------+-----------------------+
2 rows in set, 1 warning (0.04 sec)
View the contents of optimizer_trace:
select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20 | {
"steps": [
{
"join_preparation": {
"select#": 1,
"steps": [
{
"expanded_query": "/* select#1 */ select `t1`.`id` AS `id`,`t1`.`id_num` AS `id_num`,`t1`.`lastname` AS `lastname`,`t1`.`firstname` AS `firstname`,`t1`.`mobile` AS `mobile`,`t1`.`sex` AS `sex`,`t1`.`birthday` AS `birthday`,`t1`.`age` AS `age`,`t1`.`top_education` AS `top_education`,`t1`.`address` AS `address`,`t1`.`income_by_year` AS `income_by_year`,`t1`.`create_time` AS `create_time`,`t1`.`update_time` AS `update_time`,`t2`.`id` AS `id`,`t2`.`id_num` AS `id_num`,`t2`.`lastname` AS `lastname`,`t2`.`firstname` AS `firstname`,`t2`.`mobile` AS `mobile`,`t2`.`sex` AS `sex`,`t2`.`birthday` AS `birthday`,`t2`.`age` AS `age`,`t2`.`top_education` AS `top_education`,`t2`.`address` AS `address`,`t2`.`income_by_year` AS `income_by_year`,`t2`.`create_time` AS `create_time`,`t2`.`update_time` AS `update_time` from (`basic_person_info` `t1` join `basic_person_info2` `t2` on((`t1`.`id_num` = `t2`.`id_num`))) where ((`t1`.`age` > 10) and (`t2`.`age` < 20))"
},
{
"transformations_to_nested_joins": {
"transformations": [
"JOIN_condition_to_WHERE",
"parenthesis_removal"
],
"expanded_query": "/* select#1 */ select `t1`.`id` AS `id`,`t1`.`id_num` AS `id_num`,`t1`.`lastname` AS `lastname`,`t1`.`firstname` AS `firstname`,`t1`.`mobile` AS `mobile`,`t1`.`sex` AS `sex`,`t1`.`birthday` AS `birthday`,`t1`.`age` AS `age`,`t1`.`top_education` AS `top_education`,`t1`.`address` AS `address`,`t1`.`income_by_year` AS `income_by_year`,`t1`.`create_time` AS `create_time`,`t1`.`update_time` AS `update_time`,`t2`.`id` AS `id`,`t2`.`id_num` AS `id_num`,`t2`.`lastname` AS `lastname`,`t2`.`firstname` AS `firstname`,`t2`.`mobile` AS `mobile`,`t2`.`sex` AS `sex`,`t2`.`birthday` AS `birthday`,`t2`.`age` AS `age`,`t2`.`top_education` AS `top_education`,`t2`.`address` AS `address`,`t2`.`income_by_year` AS `income_by_year`,`t2`.`create_time` AS `create_time`,`t2`.`update_time` AS `update_time` from `basic_person_info` `t1` join `basic_person_info2` `t2` where ((`t1`.`age` > 10) and (`t2`.`age` < 20) and (`t1`.`id_num` = `t2`.`id_num`))"
}
}
]
}
},
{
"join_optimization": {
"select#": 1,
"steps": [
{
"condition_processing": {
"condition": "WHERE",
"original_condition": "((`t1`.`age` > 10) and (`t2`.`age` < 20) and (`t1`.`id_num` = `t2`.`id_num`))",
"steps": [
{
"transformation": "equality_propagation",
"resulting_condition": "((`t1`.`age` > 10) and (`t2`.`age` < 20) and multiple equal(`t1`.`id_num`, `t2`.`id_num`))"
},
{
"transformation": "constant_propagation",
"resulting_condition": "((`t1`.`age` > 10) and (`t2`.`age` < 20) and multiple equal(`t1`.`id_num`, `t2`.`id_num`))"
},
{
"transformation": "trivial_condition_removal",
"resulting_condition": "((`t1`.`age` > 10) and (`t2`.`age` < 20) and multiple equal(`t1`.`id_num`, `t2`.`id_num`))"
}
]
}
},
{
"substitute_generated_columns": {
}
},
{
"table_dependencies": [
{
"table": "`basic_person_info` `t1`",
"row_may_be_null": false,
"map_bit": 0,
"depends_on_map_bits": [
]
},
{
"table": "`basic_person_info2` `t2`",
"row_may_be_null": false,
"map_bit": 1,
"depends_on_map_bits": [
]
}
]
},
{
"ref_optimizer_key_uses": [
{
"table": "`basic_person_info` `t1`",
"field": "id_num",
"equals": "`t2`.`id_num`",
"null_rejecting": true
},
{
"table": "`basic_person_info2` `t2`",
"field": "id_num",
"equals": "`t1`.`id_num`",
"null_rejecting": true
}
]
},
{
"rows_estimation": [
{
"table": "`basic_person_info` `t1`",
"range_analysis": {
"table_scan": {
"rows": 86734,
"cost": 8859.75
scan cost of t1 = clustered index pages * 0.25 + rows * 0.1 + 1.1 + 1
737*0.25+1.1+86734*0.1+1=8859.75
},
"potential_range_indexes": [
{
"index": "PRIMARY",
"usable": false,
"cause": "not_applicable"
},
{
"index": "id_num_unique",
"usable": false,
"cause": "not_applicable"
},
{
"index": "mobile_unique",
"usable": false,
"cause": "not_applicable"
},
{
"index": "idx_basic_person_info_name",
"usable": false,
"cause": "not_applicable"
},
{
"index": "idx_basic_person_info_top_education",
"usable": false,
"cause": "not_applicable"
},
{
"index": "idx_basic_person_info_create_time",
"usable": false,
"cause": "not_applicable"
},
{
"index": "idx_basic_person_info_mobile",
"usable": false,
"cause": "not_applicable"
},
{
"index": "idx_age",
"usable": true,
"key_parts": [
"age",
"id"
]
}
],
"setup_range_conditions": [
],
"group_index_range": {
"chosen": false,
"cause": "not_single_table"
},
"skip_scan_range": {
"chosen": false,
"cause": "not_single_table"
},
"analyzing_range_alternatives": {
"range_scan_alternatives": [
{
"index": "idx_age",
"ranges": [
"10 < age"
],
"index_dives_for_eq_ranges": true,
"rowid_ordered": false,
"using_mrr": false,
"index_only": false,
"rows": 43367,
"cost": 15178.7,
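Reading data through index idx_age:
io_cost = ranges * 0.25 + rows * 0.25
io_cost=1*0.25+43367*0.25=10,842
cpu_cost = rows * 0.1 + 0.01 (the evaluation cost of the rows fetched by lookup is not included)
cpu_cost=43367*0.1+0.01=4,336.71
cost=10,842+4,336.71=15,178.71 (shown as 15178.7 in the trace)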
"chosen": false,
"cause": "cost"
}
],
"analyzing_roworder_intersect": {
"usable": false,
"cause": "too_few_roworder_scans"
}
}
}
},
{
"table": "`basic_person_info2` `t2`",
"range_analysis": {
"table_scan": {
"rows": 73845,
"cost": 7538.85
scan cost of t2 = clustered index pages * 0.25 + rows * 0.1 + 1.1 + 1
609*0.25+1+73845*0.1+1.1=7538.85
},
"potential_range_indexes": [
{
"index": "PRIMARY",
"usable": false,
"cause": "not_applicable"
},
{
"index": "id_num_unique",
"usable": false,
"cause": "not_applicable"
},
{
"index": "mobile_unique",
"usable": false,
"cause": "not_applicable"
},
{
"index": "idx_basic_person_info_name",
"usable": false,
"cause": "not_applicable"
},
{
"index": "idx_basic_person_info_top_education",
"usable": false,
"cause": "not_applicable"
},
{
"index": "idx_basic_person_info_create_time",
"usable": false,
"cause": "not_applicable"
},
{
"index": "idx_basic_person_info_mobile",
"usable": false,
"cause": "not_applicable"
},
{
"index": "idx_age",
"usable": true,
"key_parts": [
"age",
"id"
]
},
{
"index": "idx_age_id_num",
"usable": true,
"key_parts": [
"age",
"id_num",
"id"
]
}
],
"setup_range_conditions": [
],
"group_index_range": {
"chosen": false,
"cause": "not_single_table"
},
"skip_scan_range": {
"chosen": false,
"cause": "not_single_table"
},
"analyzing_range_alternatives": {
"range_scan_alternatives": [
{
"index": "idx_age",
"ranges": [
"age < 20"
],
"index_dives_for_eq_ranges": true,
"rowid_ordered": false,
"using_mrr": false,
"index_only": false,
"rows": 9594,
"cost": 3358.16,
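Reading data through index idx_age:
io_cost = ranges * 0.25 + rows * 0.25
io_cost=1*0.25+9594*0.25=2,398.75
cpu_cost = rows * 0.1 + 0.01 (the evaluation cost of the rows fetched by lookup is not included)
cpu_cost=9594*0.1+0.01=959.41
cost=2,398.75+959.41=3,358.16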
"chosen": true
},
{
"index": "idx_age_id_num",
"ranges": [
"age < 20"
],
"index_dives_for_eq_ranges": true,
"rowid_ordered": false,
"using_mrr": false,
"index_only": false,
"rows": 19086,
"cost": 6680.36,
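Reading data through index idx_age_id_num:
io_cost = ranges * 0.25 + rows * 0.25
io_cost=1*0.25+19086*0.25=4,771.75
cpu_cost = rows * 0.1 + 0.01 (the evaluation cost of the rows fetched by lookup is not included)
cpu_cost=19086*0.1+0.01=1,908.61
cost=4,771.75+1,908.61=6,680.36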
"chosen": false,
"cause": "cost"
}
],
"analyzing_roworder_intersect": {
"usable": false,
"cause": "too_few_roworder_scans"
}
},
"chosen_range_access_summary": {
"range_access_plan": {
"type": "range_scan",
"index": "idx_age",
"rows": 9594,
"ranges": [
"age < 20"
]
},
"rows_for_plan": 9594,
"cost_for_plan": 3358.16,
"chosen": true
}
}
}
]
},
{
"considered_execution_plans": [
{
"plan_prefix": [
],
"table": "`basic_person_info2` `t2`",
"best_access_path": {
"considered_access_paths": [
{
"access_type": "ref",
"index": "id_num_unique",
"usable": false,
"chosen": false
},
{
"rows_to_scan": 9594,
"filtering_effect": [
],
"final_filtering_effect": 1,
"access_type": "range",
"range_details": {
"used_index": "idx_age"
},
"resulting_rows": 9594,
"cost": 4317.56,
Reading data through index idx_age:
io_cost = ranges * 0.25 + rows * 0.25
io_cost=1*0.25+9594*0.25=2,398.75
cpu_cost = rows * 0.1 + 0.01 + rows * 0.1
cpu_cost=9594*0.1*2+0.01=1,918.81
cost=2,398.75+1,918.81=4,317.56
"chosen": true
}
]
},
"condition_filtering_pct": 100,
"rows_for_plan": 9594,
"cost_for_plan": 4317.56,
"rest_of_plan": [
{
"plan_prefix": [
"`basic_person_info2` `t2`"
],
"table": "`basic_person_info` `t1`",
"best_access_path": {
"considered_access_paths": [
{
"access_type": "eq_ref",
"index": "id_num_unique",
"rows": 1,
"cost": 3357.9,
io_cost = rows from t2 * 0.25 = 9594*0.25=2398.5
cpu_cost = rows * 0.1 = 9594*0.1=959.4
cost=2398.5+959.4=3357.9
"chosen": true
},
{
"rows_to_scan": 86734,
"filtering_effect": [
],
"final_filtering_effect": 0.5,
"access_type": "scan",
"using_join_cache": true,
"buffers_needed": 14,
"resulting_rows": 43367,
"cost": 4.16701e+07,
"chosen": false
}
]
},
"condition_filtering_pct": 100,
"rows_for_plan": 9594,
"cost_for_plan": 7675.46,
total cost=4,317.56+3,357.9=7,675.46
"chosen": true
}
]
},
{
"plan_prefix": [
],
"table": "`basic_person_info` `t1`",
"best_access_path": {
"considered_access_paths": [
{
"access_type": "ref",
"index": "id_num_unique",
"usable": false,
"chosen": false
},
{
"rows_to_scan": 86734,
"filtering_effect": [
],
"final_filtering_effect": 0.5,
"access_type": "scan",
"resulting_rows": 43367,
"cost": 8857.65,
scan cost of t1 (this equals the 8859.75 from rows_estimation minus the hard-coded 1.1 + 1)
"chosen": true
}
]
},
"condition_filtering_pct": 100,
"rows_for_plan": 43367,
"cost_for_plan": 8857.65,
"pruned_by_cost": true
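the rest of this plan is not evaluated: 8857.65 is already higher than the 7675.46 of the complete plan found above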
}
]
},
{
"attaching_conditions_to_tables": {
"original_condition": "((`t1`.`id_num` = `t2`.`id_num`) and (`t1`.`age` > 10) and (`t2`.`age` < 20))",
"attached_conditions_computation": [
],
"attached_conditions_summary": [
{
"table": "`basic_person_info2` `t2`",
"attached": "(`t2`.`age` < 20)"
},
{
"table": "`basic_person_info` `t1`",
"attached": "((`t1`.`id_num` = `t2`.`id_num`) and (`t1`.`age` > 10))"
}
]
}
},
{
"finalizing_table_conditions": [
{
"table": "`basic_person_info2` `t2`",
"original_table_condition": "(`t2`.`age` < 20)",
"final_table_condition ": "(`t2`.`age` < 20)"
},
{
"table": "`basic_person_info` `t1`",
"original_table_condition": "((`t1`.`id_num` = `t2`.`id_num`) and (`t1`.`age` > 10))",
"final_table_condition ": "(`t1`.`age` > 10)"
}
]
},
{
"refine_plan": [
{
"table": "`basic_person_info2` `t2`",
"pushed_index_condition": "(`t2`.`age` < 20)",
"table_condition_attached": null
},
{
"table": "`basic_person_info` `t1`"
}
]
}
]
}
},
{
"join_execution": {
"select#": 1,
"steps": [
]
}
}
]
}
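The numbers in considered_execution_plans can be tied together with a short Python sketch. It is illustrative only: eq_ref_cost is a made-up helper that follows the annotation in the trace (one page read and one row evaluation per row coming from t2), and the pruning check assumes that a partial plan is dropped once its cost exceeds the best complete plan found so far.

```python
def eq_ref_cost(outer_rows, page_cost=0.25, row_evaluate_cost=0.1):
    """Cost of one eq_ref lookup into t1 for every row produced by t2."""
    io_cost = outer_rows * page_cost
    cpu_cost = outer_rows * row_evaluate_cost
    return io_cost + cpu_cost


t2_range_cost = 4317.56              # range access on t2 via idx_age (from the trace)
t1_lookup_cost = eq_ref_cost(9594)   # 9594 rows are expected from t2
plan_cost = t2_range_cost + t1_lookup_cost

# The alternative join order starts with a full scan of t1.
t1_first_cost = 8857.65              # from the trace
pruned = t1_first_cost > plan_cost   # already more expensive than the full plan

print(round(t1_lookup_cost, 2), round(plan_cost, 2), pruned)
```

The sketch gives 3357.9 for the eq_ref step and 7675.46 for the whole plan, and it marks the plan that starts with t1 as pruned, in line with "pruned_by_cost": true in the trace.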
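Changing the cost constants:
As described earlier, the cost constants are stored in the server_cost and engine_cost tables of the built-in mysql system database. The server_cost table holds the Server-layer cost constants and the engine_cost table holds the Engine-layer cost constants.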
mysql> select * from mysql.server_cost;
+------------------------------+------------+---------------------+---------+---------------+
| cost_name | cost_value | last_update | comment | default_value |
+------------------------------+------------+---------------------+---------+---------------+
| disk_temptable_create_cost | NULL | 2022-05-11 16:09:37 | NULL | 20 |
| disk_temptable_row_cost | NULL | 2022-05-11 16:09:37 | NULL | 0.5 |
| key_compare_cost | NULL | 2022-05-11 16:09:37 | NULL | 0.05 |
| memory_temptable_create_cost | NULL | 2022-05-11 16:09:37 | NULL | 1 |
| memory_temptable_row_cost | NULL | 2022-05-11 16:09:37 | NULL | 0.1 |
| row_evaluate_cost | NULL | 2022-05-11 16:09:37 | NULL | 0.1 |
+------------------------------+------------+---------------------+---------+---------------+
mysql> select * from mysql.engine_cost;
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
| engine_name | device_type | cost_name | cost_value | last_update | comment | default_value |
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
| default | 0 | io_block_read_cost | NULL | 2022-05-11 16:09:37 | NULL | 1 |
| default | 0 | memory_block_read_cost | NULL | 2023-01-09 11:17:39 | NULL | 0.25 |
+-------------+-------------+------------------------+------------+---------------------+---------+---------------+
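The default_value column holds the system default and cannot be changed. The cost_value column can be changed: when it is not NULL, the system uses it in place of the default. It is modified with an UPDATE statement: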
mysql> update mysql.engine_cost set cost_value=10 where cost_name='memory_block_read_cost';
Query OK, 0 rows affected (0.00 sec)
mysql> update mysql.engine_cost set cost_value=10 where cost_name='io_block_read_cost';
Query OK, 0 rows affected (0.00 sec)
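Many sources say that running flush optimizer_costs is enough for the change to take effect. In my test, however, the change did not take effect right after running flush optimizer_costs, and it only took effect after the database instance was restarted. This may be a difference between database versions, so please verify it yourself. Note also that, according to the MySQL manual, FLUSH OPTIMIZER_COSTS only affects sessions that start after the flush, and existing sessions keep the values they started with, so it is worth re-testing from a new connection.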
mysql> explain select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20;
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+-------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+-------+----------+-------------+
| 1 | SIMPLE | t2 | NULL | ALL | id_num_unique,idx_age,idx_age_id_num | NULL | NULL | NULL | 73990 | 12.97 | Using where |
| 1 | SIMPLE | t1 | NULL | eq_ref | id_num_unique,idx_age | id_num_unique | 60 | test.t2.id_num | 1 | 50.00 | Using where |
+----+-------------+-------+------------+--------+--------------------------------------+---------------+---------+----------------+-------+----------+-------------+
"table": "`basic_person_info2` `t2`",
"range_analysis": {
"table_scan": {
"rows": 73990,
"cost": 13491.1
full table scan cost=609*10+73990*0.1+1.1+1= 13491.1
},
"index": "idx_age",
"ranges": [
"age < 20"
],
"index_dives_for_eq_ranges": true,
"rowid_ordered": false,
"using_mrr": false,
"index_only": false,
"rows": 9594,
"cost": 96909.4,
idx_age index scan cost=1*10+9594*10+9594*0.1=96,909.4
"chosen": false,
"cause": "cost"
},
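In the execution plan after the change, t2 uses a full table scan instead of the idx_age index. Comparing the two costs for t2, the full table scan costs 13491.1 while the idx_age scan costs 96,909.4. Because the full table scan is cheaper than the index scan, the optimizer did not choose idx_age.
This example shows that changing the cost constants directly affects the plan the optimizer chooses, so be careful and do not change them without a specific reason.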
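The flip from index scan to full table scan can be reproduced with the same arithmetic. The Python sketch below is illustrative only: the function names are made up, the 0.01 fine-tuning constant is left out as in the annotations above, and the inputs (609 pages, 73990 rows, 9594 rows in the range) are the t2 figures from the trace fragment above.

```python
def table_scan_cost(pages, rows, page_cost, row_evaluate_cost=0.1):
    """Full table scan: page reads plus row evaluation plus the hard-coded 1.1 + 1."""
    return pages * page_cost + rows * row_evaluate_cost + 1.1 + 1.0


def index_scan_cost(ranges, rows, page_cost, row_evaluate_cost=0.1):
    """Secondary-index range scan with one page read per looked-up row."""
    return ranges * page_cost + rows * page_cost + rows * row_evaluate_cost


def cheaper(page_cost):
    """Return which access method is cheaper for t2 at the given page cost."""
    scan = table_scan_cost(609, 73990, page_cost)
    index = index_scan_cost(1, 9594, page_cost)
    return "idx_age" if index < scan else "table scan"


for page_cost in (0.25, 10.0):
    print(page_cost,
          round(table_scan_cost(609, 73990, page_cost), 2),
          round(index_scan_cost(1, 9594, page_cost), 2),
          cheaper(page_cost))
```

At a page cost of 0.25 the index scan is cheaper; at 10 the per-row page reads of the index scan dominate and the full table scan becomes cheaper, which matches the plan change shown above.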
explain format=json
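optimizer_trace shows the optimizer's decisions in great detail, but it is cumbersome to use and there is a lot of information to filter out. explain format=json, which prints the analysis as JSON, is a good alternative. It also contains cost information for the statement about to be executed:
query_cost: total cost of the query
read_cost: IO cost plus the CPU cost other than eval_cost
eval_cost: cost of evaluating rows * filtered records
prefix_cost: cumulative cost of the plan up to and including this table; for the first table it equals read_cost + eval_cost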
mysql> explain format=json select * from basic_person_info t1 join basic_person_info2 t2 on t1.id_num=t2.id_num where t1.age >10 and t2.age<20;
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "7675.46"
},
"nested_loop": [
{
"table": {
"table_name": "t2",
"access_type": "range",
"possible_keys": [
"id_num_unique",
"idx_age",
"idx_age_id_num"
],
"key": "idx_age",
"used_key_parts": [
"age"
],
"key_length": "1",
"rows_examined_per_scan": 9594,
"rows_produced_per_join": 9594,
"filtered": "100.00",
"index_condition": "(`test`.`t2`.`age` < 20)",
"cost_info": {
"read_cost": "3358.16",
all IO cost + (CPU cost - eval_cost)
"eval_cost": "959.40",
CPU cost of computing the fan-out; the optimizer uses heuristics to estimate the fraction of rows that satisfy all conditions (filtered)
=rows_examined_per_scan*filtered*0.1
"prefix_cost": "4317.56",
total cost of accessing this single table
"data_read_per_join": "3M"
},
"used_columns": [
"id",
"id_num",
"lastname",
"firstname",
"mobile",
"sex",
"birthday",
"age",
"top_education",
"address",
"income_by_year",
"create_time",
"update_time"
]
}
},
{
"table": {
"table_name": "t1",
"access_type": "eq_ref",
"possible_keys": [
"id_num_unique",
"idx_age"
],
"key": "id_num_unique",
"used_key_parts": [
"id_num"
],
"key_length": "60",
"ref": [
"test.t2.id_num"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 4797,
"filtered": "50.00",
"cost_info": {
"read_cost": "2398.50",
all IO cost + (CPU cost - eval_cost)
"eval_cost": "479.70",
CPU cost of computing the fan-out; the optimizer uses heuristics to estimate the fraction of rows that satisfy all conditions (filtered)
=rows_produced_per_join*0.1=4797*0.1
"prefix_cost": "7675.46",
total cost of the two-table query
"data_read_per_join": "1M"
},
"used_columns": [
"id",
"id_num",
"lastname",
"firstname",
"mobile",
"sex",
"birthday",
"age",
"top_education",
"address",
"income_by_year",
"create_time",
"update_time"
],
"attached_condition": "(`test`.`t1`.`age` > 10)"
}
}
]
}
}
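The cost fields of explain format=json are easy to read programmatically. The Python sketch below is an illustration only: SAMPLE_EXPLAIN is a hand-shortened copy of the output above, per_table_costs is a made-up helper, and it only handles a query_block that contains a nested_loop list. It derives each table's share of the total by subtracting the previous prefix_cost.

```python
import json

# Hand-shortened copy of the EXPLAIN FORMAT=JSON output, for illustration only.
SAMPLE_EXPLAIN = """
{"query_block": {
  "select_id": 1,
  "cost_info": {"query_cost": "7675.46"},
  "nested_loop": [
    {"table": {"table_name": "t2", "access_type": "range",
               "cost_info": {"read_cost": "3358.16", "eval_cost": "959.40",
                             "prefix_cost": "4317.56"}}},
    {"table": {"table_name": "t1", "access_type": "eq_ref",
               "cost_info": {"read_cost": "2398.50", "eval_cost": "479.70",
                             "prefix_cost": "7675.46"}}}
  ]}}
"""


def per_table_costs(explain_json):
    """Return (query_cost, [(table, access_type, cost added by that table)])."""
    block = json.loads(explain_json)["query_block"]
    rows = []
    previous_prefix = 0.0
    for entry in block["nested_loop"]:
        table = entry["table"]
        prefix = float(table["cost_info"]["prefix_cost"])
        added = round(prefix - previous_prefix, 2)
        rows.append((table["table_name"], table["access_type"], added))
        previous_prefix = prefix
    return float(block["cost_info"]["query_cost"]), rows


total, rows = per_table_costs(SAMPLE_EXPLAIN)
print(total)
for name, access_type, added in rows:
    print(name, access_type, added)
```

For this sample the per-table shares are 4317.56 for t2 and 3357.9 for t1, the same two figures that cost_for_plan adds up in the optimizer trace.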
In addition, running show warnings right after explain shows the statement as rewritten by the optimizer:
mysql> show warnings;
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select `test`.`t1`.`id` AS `id`,`test`.`t1`.`id_num` AS `id_num`,`test`.`t1`.`lastname` AS `lastname`,`test`.`t1`.`firstname` AS `firstname`,`test`.`t1`.`mobile` AS `mobile`,`test`.`t1`.`sex` AS `sex`,`test`.`t1`.`birthday` AS `birthday`,`test`.`t1`.`age` AS `age`,`test`.`t1`.`top_education` AS `top_education`,`test`.`t1`.`address` AS `address`,`test`.`t1`.`income_by_year` AS `income_by_year`,`test`.`t1`.`create_time` AS `create_time`,`test`.`t1`.`update_time` AS `update_time`,`test`.`t2`.`id` AS `id`,`test`.`t2`.`id_num` AS `id_num`,`test`.`t2`.`lastname` AS `lastname`,`test`.`t2`.`firstname` AS `firstname`,`test`.`t2`.`mobile` AS `mobile`,`test`.`t2`.`sex` AS `sex`,`test`.`t2`.`birthday` AS `birthday`,`test`.`t2`.`age` AS `age`,`test`.`t2`.`top_education` AS `top_education`,`test`.`t2`.`address` AS `address`,`test`.`t2`.`income_by_year` AS `income_by_year`,`test`.`t2`.`create_time` AS `create_time`,`test`.`t2`.`update_time` AS `update_time` from `test`.`basic_person_info` `t1` join `test`.`basic_person_info2` `t2` where ((`test`.`t1`.`id_num` = `test`.`t2`.`id_num`) and (`test`.`t1`.`age` > 10) and (`test`.`t2`.`age` < 20)) |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
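Summary:
MySQL's optimizer is cost-based: it chooses the plan with the lowest cost, so the key is working out the cost of each candidate execution plan.
Cost consists of CPU cost and IO cost. Every cost constant can be adjusted, but do not adjust them unless it is necessary, to avoid affecting plan selection across the whole database.
Enabling optimizer_trace lets you follow each step of the optimizer's analysis and work out why a query sometimes used a full table scan instead of an index.
explain with the format=json option shows cost information split into read_cost and eval_cost, but only for the execution plan that was chosen. show warnings additionally shows the statement as rewritten by the optimizer.
Source: ITPUB blog, link: http://blog.itpub.net/70024922/viewspace-2938596/. If you reprint this article, please credit the source; otherwise legal action may be taken.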