【GreatSQL最佳化器-02】索引和Sargable謂詞

GreatSQL發表於2024-11-15

【GreatSQL最佳化器-02】索引和Sargable謂詞

一、Sargable謂詞介紹

GreatSQL的最佳化器在有過濾條件的時候,需要先把條件按照是否有索引來進行區分,可以用索引來加速查詢的條件稱為Sargable,其中 arge 來源於 Search Argument(搜尋引數)的首字母拼成的"SARG"。GreatSQL用keyuse_array索引陣列和Sargables陣列來儲存Sargable謂詞,其中Sargable陣列是對keyuse_array的補充使用,比如``a<b``條件如果 a 列上建立有索引並且b不是常量,a存入keyuse_array陣列,而這個b就存入Sargable陣列。因此 Sargable 也可以叫做 可索引謂詞

Sargable謂詞在最佳化器執行make_join_plan的一開始就透過update_ref_and_keys提取出來了,用於對keyuse_array索引陣列沒有儲存到的索引做補充。

生成的keyuse_array陣列在後面的JOIN::optimize_keyuse_array函式里面會給每個索引的ref_table_rows變數賦值:一個索引資料對應多少行表資料,這樣在後續Optimize_TABLE_order::find_best_ref計算最佳執行計劃表順序的時候可以根據ref_table_rows變數估計掃描開銷,提取到的keyuse_array會讓表的掃描方式走ref或者eq_ref方式掃描(見下表<<join_type表掃描方式>>),如果沒有keyuse_array的話就是用固定數值進行預估,這樣算出來的表的read_cost是不準確的,表的掃描方式走range或者scan方式掃描,這樣執行的是全表或索引掃描,這會引起查詢的效能下降。

名稱 說明
Sargable謂詞 可索引謂詞,就是可以使用index的謂詞,=, >, <, >=, <=, BETWEEN, IS [NOT] NULL, IN,不以%開頭的LIKE
不可索引謂詞 無法使用索引的謂詞,比如NOT EXISTS, NOT LIKE,以%開頭的LIKE

下面用一個簡單的例子來說明Sargable謂詞是什麼。

CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 INT,date1 DATETIME);
INSERT INTO t1 VALUES (1,10,'2021-03-25 16:44:00.123456'),(2,1,'2022-03-26 16:44:00.123456'),(3,4,'2023-03-27 16:44:00.123456');
CREATE TABLE t2 (cc1 INT PRIMARY KEY, cc2 INT);
INSERT INTO t2 VALUES (1,3),(2,1),(3,2),(4,3);
CREATE INDEX idx1 ON t1(c2);
CREATE INDEX idx2 ON t1(c2,date1);
CREATE INDEX idx2_1 ON t2(cc2);

SET optimizer_trace = 'enabled=ON' ;
greatsql> SELECT * FROM t1 where c1 in (SELECT cc1 FROM t2) and  c1 >c2;
+----+------+---------------------+
| c1 | c2   | date1               |
+----+------+---------------------+
|  2 |    1 | 2022-03-26 16:44:00 |
+----+------+---------------------+
> SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
| SELECT * FROM t1 where c1 in (SELECT cc1 FROM t2) and  c1 >c2 | {
  "steps": [
    {// 1、sql語句轉換成更快執行的語句
      "join_preparatiON": {
        "SELECT#": 1,
        "steps": [
          {
            "join_preparatiON": {
              "SELECT#": 2,
              "steps": [
                {
                  "expanded_query": "/* SELECT#2 */ SELECT `t2`.`cc1` FROM `t2`"
                }
              ]
            }
          },
          {
            "expanded_query": "/* SELECT#1 */ SELECT `t1`.`c1` AS `c1`,`t1`.`c2` AS `c2`,`t1`.`date1` AS `date1` FROM `t1` where (`t1`.`c1` in (/* SELECT#2 */ SELECT `t2`.`cc1` FROM `t2`) and (`t1`.`c1` > `t1`.`c2`))"
          },
          {
            "transformatiON": {
              "SELECT#": 2,
              "FROM": "IN (SELECT)",
              "to": "semijoin",
              "chosen": true,
              "transformatiON_to_semi_join": {
                "subquery_predicate": "`t1`.`c1` in (/* SELECT#2 */ SELECT `t2`.`cc1` FROM `t2`)",
                "embedded in": "WHERE",
                "semi-join cONditiON": "(`t1`.`c1` = `t2`.`cc1`)",
                "decorrelated_predicates": [
                  {
                    "outer": "`t1`.`c1`",
                    "inner": "`t2`.`cc1`"
                  }
                ]
              }
            }
          },
          {
            "transformatiONs_to_nested_joins": {
              "transformatiONs": [
                "semijoin"
              ],// 這個sql語句被轉換成以下最終的語句,可以發現最後以semi join的形式查詢的。
              "expanded_query": "/* SELECT#1 */ SELECT `t1`.`c1` AS `c1`,`t1`.`c2` AS `c2`,`t1`.`date1` AS `date1` FROM `t1` semi join (`t2`) where ((`t1`.`c1` > `t1`.`c2`) and (`t1`.`c1` = `t2`.`cc1`))"
            }
          }
        ]
      }
    },
    {// 2、最佳化器執行計劃生成
      "join_optimizatiON": {
        "SELECT#": 1,
        "steps": [
          {
            "cONditiON_processing": {
              "cONditiON": "WHERE",
              "original_cONditiON": "((`t1`.`c1` > `t1`.`c2`) and (`t1`.`c1` = `t2`.`cc1`))",
              "steps": [
                {
                  "transformatiON": "equality_propagatiON",
                  "resulting_cONditiON": "((`t1`.`c1` > `t1`.`c2`) and multiple equal(`t1`.`c1`, `t2`.`cc1`))"
                },
                {
                  "transformatiON": "cONstant_propagatiON",
                  "resulting_cONditiON": "((`t1`.`c1` > `t1`.`c2`) and multiple equal(`t1`.`c1`, `t2`.`cc1`))"
                },
                {
                  "transformatiON": "trivial_cONditiON_removal",
                  "resulting_cONditiON": "((`t1`.`c1` > `t1`.`c2`) and multiple equal(`t1`.`c1`, `t2`.`cc1`))"
                }
              ]
            }
          },
          {
            "substitute_generated_columns": {
            }
          },
          {// 表依賴,因為是兩張表做join,因此這裡顯示了2張表
            "TABLE_dependencies": [
              {
                "TABLE": "`t1`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_ON_map_bits": [
                ]
              },
              {
                "TABLE": "`t2`",
                "row_may_be_null": false,
                "map_bit": 1,
                "depends_ON_map_bits": [
                ]
              }
            ]
          },
          {
            "ref_optimizer_key_uses": [ //這裡就是keyuse_array的結果,實際用到了兩個索引,c1和cc2的唯一索引
              {
                "TABLE": "`t1`",
                "field": "c1",
                "equals": "`t2`.`cc1`",
                "null_rejecting": true
              },
              {
                "TABLE": "`t2`",
                "field": "cc1",
                "equals": "`t1`.`c1`",
                "null_rejecting": true
              }
            ]
          },
      
// 透過檢視系統表可以檢視到keyuse_array資訊,但是Sargables陣列資訊沒有顯示。
// 這個透過debug程式碼發現過程中提取出了一個謂片語,就是c1>c2,field值為cc1列,arg_value值為c2列,num_VALUES為所有Sargable謂詞數量,這裡為1。
// 這個Sargable陣列資訊在後面函式update_sargable_FROM_cONst補充檢查是否可以使用這裡面的cc1索引加速查詢。
struct SARGABLE_PARAM {
  Field *field;     // t2的cc1列
  Item **arg_value; // t1的c2列
  uINT num_VALUES;  // 值為1,因為陣列只有一個值
};

二、update_ref_and_keys程式碼執行過程

update_ref_and_keys函式里面透過add_key_fields把查詢sql的cond條件包含的索引資訊新增到Key_use_array,注意只有等於的條件才會透過add_key_field新增key_field。cond條件分為兩種:FUNC_ITEMCOND_ITEM,其中and_level用於在merge_key_fields時候把用不到的key_field刪掉。

條件型別 說明
FUNC_ITEM 只由一個條件組成
COND_ITEM 由若干個 AND 和 OR 連線起來的條件,包含Item_cond_or和Item_cond_and兩種 COND_AND_FUNC:同一個and條件的and_level不變 COND_OR_FUNC:處理前and_level需要自增

實際程式碼執行過程:

bool JOIN::make_join_plan() {
  // Build the key access informatiON, which is the basis for ref access.
  //如果有查詢條件或者查詢語句是outer join的話,需要執行update_ref_and_keys獲取所有sargables謂詞。
  if (where_cond || query_block->outer_join) {
    if (update_ref_and_keys(thd, keyuse_array, join_tab, TABLEs, where_cONd,
                            ~query_block->outer_join, query_block, &sargables))
      return true;
  }
}

// 這裡keyuse_array和sargables都是最後生成的結果,即所需的Sargable謂詞
static bool update_ref_and_keys(THD *thd, Key_use_array *keyuse_array,
                                JOIN_TAB *join_tab, uINT TABLEs, Item *cONd,
                                TABLE_map normal_TABLEs,
                                Query_block *query_block,
                                SARGABLE_PARAM **sargables) {
  if (cONd) {
    if (add_key_fields(thd, join, &end, &and_level, cONd, normal_TABLEs,
                       sargables))
      return true;
  }
  /* fill keyuse_array with found key parts */
  // 把上面找到的帶有索引的field加入到keyuse_array陣列
  for (; field != end; field++) {
    if (add_key_part(keyuse_array, field)) return true;//說明見下面
  }
  if (!keyuse_array->empty()) {
      // 對找到的索引按照sort_keyuse_array條件進行比大小排序
      std::sTABLE_sort(keyuse_array->begin(), keyuse_array->begin() + keyuse_array->size(),
                     sort_keyuse);
     // 刪除不用的索引列以及重複的索引列,這裡最後裝入的就是t1.c1和t2.cc1兩個索引列。
  }
}

bool add_key_fields(THD *thd, JOIN *join, Key_field **key_fields,
                    uINT *and_level, Item *cONd, TABLE_map usable_TABLEs,
                    SARGABLE_PARAM **sargables) {
    List_iterator_fast<Item> li(*((Item_cONd *)cONd)->argument_list());
    if (down_cast<Item_cONd *>(cONd)->functype() == Item_func::COND_AND_FUNC) {
      //對AND條件的所有引數進行廣度和深度遍歷,找出涉及的表的列相關索引,and_level不變,也就是說同一層AND連線的Key_field的and_level相同。
      while ((item = li++)) {
        if (add_key_fields(thd, join, key_fields, and_level, item,
                           usable_TABLEs, sargables))
          return true;
      }
    } else {
      //對OR條件的所有引數進行廣度和深度遍歷,找出涉及的表的列相關索引,這裡多一個merge_key_fields操作,用於對 OR 連線的謂詞之間儘可能做 merge 操作
      (*and_level)++;
      add_key_fields();
      merge_key_fields();
    }
  //按照不同函式的不同SELECT_optimize屬性來抽取引數涉及的表列,見下表
  auto optimize = cONd_func->SELECT_optimize(thd);
  switch (optimize) {
    case Item_func::OPTIMIZE_NONE:
      break;
    case Item_func::OPTIMIZE_KEY:
    case Item_func::OPTIMIZE_OP:
    case Item_func::OPTIMIZE_NULL:
    case Item_func::OPTIMIZE_EQUAL:
}

static Key_field *merge_key_fields(Key_field *start, Key_field *new_fields,
                                   Key_field *end, uint and_level) {
  for (; new_fields != end; new_fields++) {
    for (Key_field *old = start; old != first_free; old++) {
      //如果當前or條件的item_field等於之前已經標記的條件的item_field
      1、or條件的值不為常量:改變以下3個值
        old->level = and_level;
        old->optimize
        old->null_rejecting
      2、or條件的值等於之前已經標記的條件的值:改變以下3個值
        old->level = and_level;
        old->optimize
        old->null_rejecting
      3、or條件的值等於null,並且之前已經標記的條件的值為null並且為常量:改變以下3個值
        old->level = and_level;
        old->optimize = KEY_OPTIMIZE_REF_OR_NULL;
        old->null_rejecting = false;
      4、以上都不是,那麼可以合併,把之前條件的key_field刪除。
    }
  }
}

函式的SELECT_optimize屬性見下表。

函式的查詢最佳化型別 涉及函式 對應索引操作
OPTIMIZE_NONE
OPTIMIZE_KEY <> 、between 、IN函式 Item_func::BETWEEN : 把between轉換為a>1 and a<10這樣的結構,當最小值和最大值相等時,可以建立對應的 Key_field Item_func::MEMBER_OF_FUNC : IN ()尋找索引列 Item_func::IN_FUNC 和 Item_func::NE_FUNC(即 <>): (column1, column2, … ) IN ((const1_1, const1_2), …),存在一個 (column1, column2, …) 上的索引
OPTIMIZE_OP xor,strcmp,nullif,LIKE函式 新增EQUAL_FUNC條件的Item包含的列索引
OPTIMIZE_NULL isnull,isnotnull函式 column IS [NOT] NULL:對 column IS NULL 才會生成 Key_field,column IS NOT NULL 不是等值表示式,因此不會有對應的 Key_field 生成。
OPTIMIZE_EQUAL 連等號,比如x=y=z field1=field2=…=常量 型別的條件,可以對每個 field=常量 條件生成對應的 Key_field field1=field2=…=fieldn 型別的條件,對於任意兩個不同的 field 組成的等值式,儘可能生成 Key_field

三、Key_field舉例說明

INSERT INTO t1 VALUES (5,5,'2024-03-25 16:44:00.123456');
INSERT INTO t2 VALUES (5,15);
例子1:SELECT * FROM t1 join t2 ON t1.c1=t2.cc1 and t2.cc1=t1.c1 or t1.c1=5;
轉換為:where (((`t1`.`c1` = `) and (`t2`.`cc1` = `t1`.`c1`)) or (`t1`.`c1` = 5))
對於 t1.c1=t2.cc1 and t2.cc1 = t1.c1 or t1.c1=5
可以生成三個 Key_field:
1. Key_field(c1=cc1, and_level=1)
2. Key_field(cc1=c1, and_level=1)
3. Key_field(c1=5, and_level=2)

做合併之前,先賦值field="cc1=c1"
merge_key_fields()函式執行merge由 OR 連線的左右條件:
->對於 c1=cc1:
        ->判斷c1=5是否可以合併
          -> c1相等。可以合併
                  ->將 c1=5 的Key_field刪除,剩下c1=cc1,注意這裡返回的end="cc1=c1"
->對於 cc1=c1:
        ->判斷 c1=5 是否可以合併
          ->cc1不等於c1,不能合併
->將所有沒有被合併的 Key_field 去掉

最終剩下2個 Key_field:
Key_field(c1=cc1, and_level=1, optimize=0, null_rejecting=true)
Key_field(cc1=c1, and_level=1, optimize=0, null_rejecting=true)
最後因為透過merge_key_fields算出來的field==end,因此不加入keyuse_array,注意只有or條件才會執行merge_key_fields。這裡條件如果去掉or t1.c1=5這兩個key_field就會加入keyuse_array

於是看到如下的trace,這裡面的access_type全是scan方式,說明沒有用索引提升查詢效能。
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ],
                "TABLE": "`t1`",
                "best_access_path": {
                  "cONsidered_access_paths": [
                    {
                      "rows_to_scan": 4,
                      "filtering_effect": [
                      ],
                      "final_filtering_effect": 1,
                      "access_type": "scan", 這裡t1的掃描方式是索引掃描
                      "resulting_rows": 4,
                      "cost": 0.65,
                      "chosen": true
                    }
                  ]
                },
                "rest_of_plan": [
                  {
                    "plan_prefix": [
                      "`t1`"
                    ],
                    "TABLE": "`t2`",
                    "best_access_path": {
                      "cONsidered_access_paths": [
                        {
                          "rows_to_scan": 5,
                          "filtering_effect": [
                          ],
                          "final_filtering_effect": 1,
                          "access_type": "scan", 這裡t2的掃描方式是索引掃描
                          "using_join_cache": true,
                          "buffers_needed": 1,
                          "resulting_rows": 5,
                          "cost": 2.25005,
                          "chosen": true
                        }
                      ]
                    },
-- 下面的結果中,type=index,表明選擇了索引掃描,跟上面算出來的結論一致
greatsql> EXPLAIN SELECT * FROM t1 join t2 ON t1.c1=t2.cc1 and t2.cc1=t1.c1 or t1.c1=5;
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+---------------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key    | key_len | ref  | rows | filtered | Extra                                                   |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+---------------------------------------------------------+
|  1 | SIMPLE      | t1    | NULL       | index | PRIMARY       | idx2   | 11      | NULL |    4 |   100.00 | Using index                                             |   這裡選擇了索引掃描,跟上面算出來的結論一致
|  1 | SIMPLE      | t2    | NULL       | index | PRIMARY       | idx2_1 | 5       | NULL |    5 |   100.00 | Using where; Using index; Using join buffer (hash join) |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+---------------------------------------------------------+

例子2:SELECT * FROM t1 join t2 ON t1.c1=t2.cc1 and t1.c2<5;
where條件被轉換為:where ((`t1`.`c1` = `t2`.`cc1`) and (`t1`.`c2` < 5))
t1.c2<5不是等於條件,因此不生成Key_field,最後生成2個 Key_field:
1. Key_field(c1=cc1, and_level=0)
2. Key_field(cc1=c1, and_level=0)

因為沒有OR條件因此不需要進行合併操作,最終剩下以上2個 Key_field:
Key_field(c1=cc1, and_level=0, optimize=0, null_rejecting=true)
Key_field(cc1=c1, and_level=0, optimize=0, null_rejecting=true)
這兩個Key_field會加入keyuse_array,在後面Optimize_TABLE_order::find_best_ref被用於執行最佳化,讓表的掃描方式為eq_ref,見下面。

          {
            "ref_optimizer_key_uses": [ 這裡用到了2條包含等號的索引列
              {
                "TABLE": "`t1`",
                "field": "c1",
                "equals": "`t2`.`cc1`",
                "null_rejecting": true
              },
              {
                "TABLE": "`t2`",
                "field": "cc1",
                "equals": "`t1`.`c1`",
                "null_rejecting": true
              }
            ]
          },
            "cONsidered_executiON_plans": [
              {
                "plan_prefix": [
                ],
                "TABLE": "`t1`",
                "best_access_path": {
                  "cONsidered_access_paths": [
                    {
                      "access_type": "ref", 這種ref掃描不被選擇
                      "INDEX": "PRIMARY",
                      "usable": false,
                      "chosen": false
                    },
                    {
                      "rows_to_scan": 3,
                      "filtering_effect": [
                      ],
                      "final_filtering_effect": 1,
                      "access_type": "range", 第一次需要對於t1做範圍掃描
                      "range_details": {
                        "used_INDEX": "PRIMARY"
                      },
                      "resulting_rows": 3,
                      "cost": 0.860732,
                      "chosen": true
                    }
                  ]
                "rest_of_plan": [
                  {
                    "plan_prefix": [
                      "`t1`"
                    ],
                    "TABLE": "`t2`",
                    "best_access_path": {
                      "cONsidered_access_paths": [
                        {
                          "access_type": "eq_ref", 對於t2.cc1=t1.c1條件,因為t1.c1固定了因此這裡t2的掃描方式可以用eq_ref的方式
                          "INDEX": "PRIMARY",
                          "rows": 1,
                          "cost": 1.05,
                          "chosen": true,
                          "cause": "clustered_pk_chosen_by_heuristics"
                        },
                        {
                          "access_type": "range", 這種範圍掃描不被選擇
                          "range_details": {
                            "used_INDEX": "PRIMARY"
                          },
                          "chosen": false,
                          "cause": "heuristic_INDEX_cheaper"
                        }
                      ]
                    },
                    "cONditiON_filtering_pct": 100,
                    "rows_for_plan": 3,
                    "cost_for_plan": 1.91073,
                    "chosen": true
                  }
                ]
              {
                "plan_prefix": [
                ],
                "TABLE": "`t2`",
                "best_access_path": {
                  "cONsidered_access_paths": [
                    {
                      "access_type": "ref", 這種ref掃描不被選擇
                      "INDEX": "PRIMARY",
                      "usable": false,
                      "chosen": false
                    },
                    {
                      "rows_to_scan": 4,
                      "filtering_effect": [
                      ],
                      "final_filtering_effect": 1,
                      "access_type": "range", 對於t2表用到了範圍掃描
                      "range_details": {
                        "used_INDEX": "PRIMARY"
                      },
                      "resulting_rows": 4,
                      "cost": 1.06082,
                      "chosen": true
                    }
                  ]
                },
                "rest_of_plan": [
                  {
                    "plan_prefix": [
                      "`t2`"
                    ],
                    "TABLE": "`t1`",
                    "best_access_path": {
                      "cONsidered_access_paths": [
                        {
                          "access_type": "eq_ref", 對於t1.c1=t2.cc1條件,因為t2.cc1固定了因此這裡t1的掃描方式可以用eq_ref的方式
                          "INDEX": "PRIMARY",
                          "rows": 1,
                          "cost": 1.4,
                          "chosen": true,
                          "cause": "clustered_pk_chosen_by_heuristics"
                        },
                        {
                          "access_type": "range", 這種範圍掃描不被選擇
                          "range_details": {
                            "used_INDEX": "PRIMARY"
                          },
                          "chosen": false,
                          "cause": "heuristic_INDEX_cheaper"
                        }
                      ]
                    },
-- t1表:type=range,這個掃描方式,下一期講
-- t2表:type=eq_ref方式掃描
greatsql> EXPLAIN SELECT * FROM t1 join t2 ON t1.c1=t2.cc1 and t1.c1<5;
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref       | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------+------+----------+-------------+
|  1 | SIMPLE      | t1    | NULL       | range  | PRIMARY       | PRIMARY | 4       | NULL      |    3 |   100.00 | Using where |
|  1 | SIMPLE      | t2    | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | db1.t1.c1 |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------+------+----------+-------------+

從上面的例子發現有唯一索引查詢的時候,條件帶有or的話是無法使用Sargable謂詞提高效率的。如果是and條件的話,只有等於條件才會被判定為Sargable謂詞。因此做join連線的時候儘量避免or條件的過濾。

-- 對比上面2個最後生成的執行計劃,第一個預估cost=2.90,第二個預估cost=1.91,效率更高
greatsql> EXPLAIN FORMAT=TREE SELECT * FROM t1 join t2 ON t1.c1=t2.cc1 and t2.cc1=t1.c1 or t1.c1=5;
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                   |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Filter: ((t2.cc1 = t1.c1) or (t1.c1 = 5))  (cost=2.90 rows=20)
    -> Inner hash join (no cONditiON)  (cost=2.90 rows=20)
        -> INDEX scan ON t2 using idx2_1  (cost=0.19 rows=5)
        -> Hash
            -> INDEX scan ON t1 using idx2  (cost=0.65 rows=4)
 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

greatsql> EXPLAIN FORMAT=TREE SELECT * FROM t1 join t2 ON t1.c1=t2.cc1 and t1.c1<5;
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                            |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Nested loop inner join  (cost=1.91 rows=3)
    -> Filter: (t1.c1 < 5)  (cost=0.86 rows=3)
        -> INDEX range scan ON t1 using PRIMARY over (c1 < 5)  (cost=0.86 rows=3)
    -> Single-row INDEX lookup ON t2 using PRIMARY (cc1=t1.c1)  (cost=0.28 rows=1)
 |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

附錄:join_type表掃描方式

JT_UNKNOWN
JT_SYSTEM 表只有一行,比如SELECT * FROM (SELECT 1)
JT_CONST 表最多隻有一行滿足,比如WHERE TABLE.pk = 3
JT_EQ_REF =符號用在唯一索引
JT_REF =符號用在非唯一索引
JT_ALL 全表掃描
JT_RANGE 範圍掃描
JT_INDEX_SCAN 索引掃描
JT_FT Fulltext索引掃描
JT_REF_OR_NULL 包含null值,比如"WHERE col = ... OR col IS NULL
JT_INDEX_MERGE 一張表執行多次範圍掃描最後合併結果

四、總結

從上面最佳化器最早的步驟我們認識了Sargable謂詞的定義和判定方法,如果查詢用到了Sargable謂詞是可以進行eq_ref掃描方式的,有效提高了查詢效率。透過實際例子發現,在做多表連線的時候用OR條件會降低執行效率,同時用唯一索引列作為連線條件的話會提高效率。因此實際寫查詢sql的時候,儘量用唯一索引作為連線條件,少用OR條件進行過濾。


Enjoy GreatSQL 😃

關於 GreatSQL

GreatSQL是適用於金融級應用的國內自主開源資料庫,具備高效能、高可靠、高易用性、高安全等多個核心特性,可以作為MySQL或Percona Server的可選替換,用於線上生產環境,且完全免費併相容MySQL或Percona Server。

相關連結: GreatSQL社群 Gitee GitHub Bilibili

GreatSQL社群:

社群部落格有獎徵稿詳情:https://greatsql.cn/thread-100-1-1.html

image-20230105161905827

技術交流群:

微信:掃碼新增GreatSQL社群助手微信好友,傳送驗證資訊加群

image-20221030163217640

相關文章