PostgreSQL 原始碼解讀(25)- 查詢語句#10(查詢優化概覽)

husthxd發表於2018-08-22

本節簡單介紹了PG執行查詢語句中優化器部分(Optimizer)的相關函式和資料結構總體說明。查詢優化包括查詢邏輯優化和查詢物理優化,查詢邏輯優化是指使用關係代數中的等價規則,通過選擇下推、投影下推、連線交換等方法對SQL語句進行優化;查詢物理優化是指通過CBO對各種物理訪問資料的方法進行評估,得出最優的執行計劃。

一、總體說明

下面是PG原始碼目錄(/src/backend/optimizer)中的README檔案對優化器相關函式和資料結構的總體說明:

Optimizer Functions
-------------------

The primary entry point is planner().

planner()//優化器主入口函式
set up for recursive handling of subqueries//為子查詢配置處理器(遞迴方式)
-subquery_planner()//呼叫(子)查詢優化函式
 pull up sublinks and subqueries from rangetable, if possible//可以的話,上拉子連結和子查詢
 canonicalize qual//表示式規範化
     Attempt to simplify WHERE clause to the most useful form; this includes
     flattening nested AND/ORs and detecting clauses that are duplicated in
     different branches of an OR.//簡化WHERE語句
 simplify constant expressions//簡化常量表示式
 process sublinks//處理子連結
 convert Vars of outer query levels into Params//轉換外查詢的Vars變數到Params中
--grouping_planner()//
  preprocess target list for non-SELECT queries//預處理非SELECT語句的投影列
  handle UNION/INTERSECT/EXCEPT, GROUP BY, HAVING, aggregates,//處理集合操作/聚集函式/排序等
    ORDER BY, DISTINCT, LIMIT
--query_planner()//
   make list of base relations used in query//構造查詢中的基錶連結串列
   split up the qual into restrictions (a=1) and joins (b=c)//拆分表示式為限制條件和連線
   find qual clauses that enable merge and hash joins//查詢可以讓Merge和Hash連線生效的表示式
----make_one_rel()//
     set_base_rel_pathlists()//設定基表路徑連結串列
      find seqscan and all index paths for each base relation//遍歷每個基表,尋找順序掃描和所有可能的索引掃描路徑
      find selectivity of columns used in joins//查詢連線中使用的列的選擇性
     make_rel_from_joinlist()//通過join連結串列構造Relation
      hand off join subproblems to a plugin, GEQO, or standard_join_search()//
-----standard_join_search()//標準的連線搜尋函式
      call join_search_one_level() for each level of join tree needed//每一個join tree呼叫join_search_one_level
      join_search_one_level():
        For each joinrel of the prior level, do make_rels_by_clause_joins()//對於上一層的每一個joinrel,執行make_rels_by_clause_joins
        if it has join clauses, or make_rels_by_clauseless_joins() if not.
        Also generate "bushy plan" joins between joinrels of lower levels.
      Back at standard_join_search(), generate gather paths if needed for//回到standard_join_search函式,需要的話,收集相關的路徑並應用set_cheapest函式獲取代價最小的路徑
      each newly constructed joinrel, then apply set_cheapest() to extract
      the cheapest path for it.
      Loop back if this was not the top join level.//如果不是最頂層連線,迴圈
  Back at grouping_planner://回到grouping_planner函式
  do grouping (GROUP BY) and aggregation//處理分組和聚集
  do window functions//處理視窗函式
  make unique (DISTINCT)//處理唯一性
  do sorting (ORDER BY)//處理排序
  do limit (LIMIT/OFFSET)//處理Limit
Back at planner()://回到planner函式
convert finished Path tree into a Plan tree//轉換最終的路徑樹到計劃樹
do final cleanup after planning//收尾工作


Optimizer Data Structures
-------------------------

PlannerGlobal   - global information for a single planner invocation//全域性優化資訊

PlannerInfo     - information for planning a particular Query (we make//某個Planner的優化資訊
                  a separate PlannerInfo node for each sub-Query)

RelOptInfo      - a relation or joined relations//某個Relation(包括連線)的優化資訊

 RestrictInfo   - WHERE clauses, like "x = 3" or "y = z"//限制條件
                  (note the same structure is used for restriction and
                   join clauses)

 Path           - every way to generate a RelOptInfo(sequential,index,joins)//構造該關係(注意:中間結果也是關係的一種)的路徑
  SeqScan       - represents a sequential scan plan
  IndexPath     - index scan
  BitmapHeapPath - top of a bitmapped index scan
  TidPath       - scan by CTID
  SubqueryScanPath - scan a subquery-in-FROM
  ForeignPath   - scan a foreign table, foreign join or foreign upper-relation
  CustomPath    - for custom scan providers
  AppendPath    - append multiple subpaths together
  MergeAppendPath - merge multiple subpaths, preserving their common sort order
  ResultPath    - a childless Result plan node (used for FROM-less SELECT)
  MaterialPath  - a Material plan node
  UniquePath    - remove duplicate rows (either by hashing or sorting)
  GatherPath    - collect the results of parallel workers
  GatherMergePath - collect parallel results, preserving their common sort order
  ProjectionPath - a Result plan node with child (used for projection)
  ProjectSetPath - a ProjectSet plan node applied to some sub-path
  SortPath      - a Sort plan node applied to some sub-path
  GroupPath     - a Group plan node applied to some sub-path
  UpperUniquePath - a Unique plan node applied to some sub-path
  AggPath       - an Agg plan node applied to some sub-path
  GroupingSetsPath - an Agg plan node used to implement GROUPING SETS
  MinMaxAggPath - a Result plan node with subplans performing MIN/MAX
  WindowAggPath - a WindowAgg plan node applied to some sub-path
  SetOpPath     - a SetOp plan node applied to some sub-path
  RecursiveUnionPath - a RecursiveUnion plan node applied to two sub-paths
  LockRowsPath  - a LockRows plan node applied to some sub-path
  ModifyTablePath - a ModifyTable plan node applied to some sub-path(s)
  LimitPath     - a Limit plan node applied to some sub-path
  NestPath      - nested-loop joins
  MergePath     - merge joins
  HashPath      - hash joins

 EquivalenceClass - a data structure representing a set of values known equal//等價類

 PathKey        - a data structure representing the sort ordering of a path//排序鍵

下一節開始將根據總體說明中的函式逐個進行分析解讀.

二、小結

1、優化器函式總覽:大體介紹了優化器函式的呼叫過程等資訊;
2、資料結構:優化器相關的資料結構,如PlannerInfo等。

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/6906/viewspace-2374891/,如需轉載,請註明出處,否則將追究法律責任。

相關文章