PostgreSQL Source Code Walkthrough (77) - Query Statements #62 (create_plan function #1 - main...

Posted by husthxd on 2018-11-05

This section briefly walks through the implementation of create_plan, the main function that builds the execution plan.

1. Data Structures

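Plan
All plan nodes "derive" from the Plan structure by embedding it as their first field. This guarantees that everything works when a node pointer is cast to Plan *. (Node pointers are frequently cast to Plan * when they are passed around generically in the executor.)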

/* ----------------
 *      Plan node
 *
 * All plan nodes "derive" from the Plan structure by having the
 * Plan structure as the first field.  This ensures that everything works
 * when nodes are cast to Plan's.  (node pointers are frequently cast to Plan*
 * when passed around generically in the executor)
 *
 * We never actually instantiate any Plan nodes; this is just the common
 * abstract superclass for all Plan-type nodes.
 * ----------------
 */
typedef struct Plan
{
    NodeTag     type;           /* node type tag */

    /*
     * estimated execution costs for plan (see costsize.c for more info)
     */
    Cost        startup_cost;   /* cost expended before fetching any tuples */
    Cost        total_cost;     /* total cost (assuming all tuples fetched) */

    /*
     * planner's estimate of result size of this plan step
     */
    double      plan_rows;      /* number of rows plan is expected to emit */
    int         plan_width;     /* average row width in bytes */

    /*
     * information needed for parallel query
     */
    bool        parallel_aware; /* engage parallel-aware logic? */
    bool        parallel_safe;  /* OK to use as part of parallel plan? */

    /*
     * Common structural data for all Plan types.
     */
     */
    int         plan_node_id;   /* unique across entire final plan tree */
    List       *targetlist;     /* target list to be computed at this node */
    List       *qual;           /* implicitly-ANDed qual conditions */
    struct Plan *lefttree;      /* input plan tree(s) */
    struct Plan *righttree;
    List       *initPlan;       /* Init Plan nodes (un-correlated expr
                                 * subselects) */

    /*
     * Information for management of parameter-change-driven rescanning
     *
     * extParam includes the paramIDs of all external PARAM_EXEC params
     * affecting this plan node or its children.  setParam params from the
     * node's initPlans are not included, but their extParams are.
     *
     * allParam includes all the extParam paramIDs, plus the IDs of local
     * params that affect the node (i.e., the setParams of its initplans).
     * These are _all_ the PARAM_EXEC params that affect this node.
     */
    Bitmapset  *extParam;
    Bitmapset  *allParam;
} Plan;
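
The "derive by first field" convention described above can be tried in isolation. The following is a minimal sketch with made-up types (ToyPlan, ToySort, and the T_Toy* tags are hypothetical stand-ins, not PostgreSQL definitions): because the base struct is the first member, a pointer to the derived struct can be cast to a pointer to the base struct and back.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration; these are NOT the PostgreSQL definitions. */
typedef enum ToyNodeTag { T_ToySeqScan, T_ToySort } ToyNodeTag;

typedef struct ToyPlan
{
    ToyNodeTag  type;               /* first field: identifies the concrete node */
    double      plan_rows;
    struct ToyPlan *lefttree;
} ToyPlan;

typedef struct ToySort
{
    ToyPlan     plan;               /* "base" struct embedded as the first field */
    int         numCols;            /* data specific to the Sort node */
} ToySort;

/* Generic code receives a ToyPlan * and dispatches on the tag. */
static const char *
toy_node_name(const ToyPlan *node)
{
    switch (node->type)
    {
        case T_ToySeqScan:
            return "SeqScan";
        case T_ToySort:
            return "Sort";
    }
    return "unknown";
}

/* Downcast only after checking the tag, in the spirit of IsA(). */
static int
toy_sort_columns(ToyPlan *node)
{
    assert(node->type == T_ToySort);
    assert(offsetof(ToySort, plan) == 0);   /* the cast relies on this layout */
    return ((ToySort *) node)->numCols;
}
```

A caller would build a ToySort, pass `(ToyPlan *) &sort` to generic code such as toy_node_name, and call toy_sort_columns only where the tag is known to be T_ToySort.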

2. Source Code Walkthrough

create_plan calls create_plan_recurse, which walks the access-path tree recursively and builds the corresponding Plan node for each path node.

/*
 * create_plan
 *    Creates the access plan for a query by recursively processing the
 *    desired tree of pathnodes, starting at the node 'best_path'.  For
 *    every pathnode found, we create a corresponding plan node containing
 *    appropriate id, target list, and qualification information.
 *
 *    The tlists and quals in the plan tree are still in planner format,
 *    ie, Vars still correspond to the parser's numbering.  This will be
 *    fixed later by setrefs.c.
 *
 *    best_path is the best access path
 *
 *    Returns a Plan tree.
 */
Plan *
create_plan(PlannerInfo *root, Path *best_path)
{
    Plan       *plan;

    /* plan_params should not be in use in current query level */
    Assert(root->plan_params == NIL);

    /* Initialize this module's private workspace in PlannerInfo */
    root->curOuterRels = NULL;
    root->curOuterParams = NIL;

    /* Recursively process the path tree, demanding the correct tlist result */
    plan = create_plan_recurse(root, best_path, CP_EXACT_TLIST);

    /*
     * Make sure the topmost plan node's targetlist exposes the original
     * column names and other decorative info.  Targetlists generated within
     * the planner don't bother with that stuff, but we must have it on the
     * top-level tlist seen at execution time.  However, ModifyTable plan
     * nodes don't have a tlist matching the querytree targetlist.
     */
    if (!IsA(plan, ModifyTable))
        apply_tlist_labeling(plan->targetlist, root->processed_tlist);

    /*
     * Attach any initPlans created in this query level to the topmost plan
     * node.  (In principle the initplans could go in any plan node at or
     * above where they're referenced, but there seems no reason to put them
     * any lower than the topmost node for the query level.  Also, see
     * comments for SS_finalize_plan before you try to change this.)
     */
    SS_attach_initplans(root, plan);

    /* Check we successfully assigned all NestLoopParams to plan nodes */
    if (root->curOuterParams != NIL)
        elog(ERROR, "failed to assign all NestLoopParams to plan nodes");

    /*
     * Reset plan_params to ensure param IDs used for nestloop params are not
     * re-used later
     */
    root->plan_params = NIL;

    return plan;
}

//------------------------------------------------------------------------ create_plan_recurse
/*
 * create_plan_recurse
 *    Recursive guts of create_plan().
 */
static Plan *
create_plan_recurse(PlannerInfo *root, Path *best_path, int flags)
{
    Plan       *plan;

    /* Guard against stack overflow due to overly complex plans */
    check_stack_depth();

    switch (best_path->pathtype) // dispatch on the path's pathtype
    {
        case T_SeqScan: // sequential scan
        case T_SampleScan: // sample scan (TABLESAMPLE)
        case T_IndexScan: // index scan
        case T_IndexOnlyScan: // index-only scan
        case T_BitmapHeapScan: // bitmap heap scan
        case T_TidScan: // TID scan
        case T_SubqueryScan: // subquery scan
        case T_FunctionScan: // function scan
        case T_TableFuncScan: // table function scan
        case T_ValuesScan: // VALUES scan
        case T_CteScan: // CTE scan
        case T_WorkTableScan: // work table scan
        case T_NamedTuplestoreScan: // named tuplestore scan
        case T_ForeignScan: // foreign table scan
        case T_CustomScan: // custom scan
            plan = create_scan_plan(root, best_path, flags); // build a scan plan
            break;
        case T_HashJoin: // hash join
        case T_MergeJoin: // merge join
        case T_NestLoop: // nested loop join
            plan = create_join_plan(root,
                                    (JoinPath *) best_path); // build a join plan
            break;
        case T_Append: // append
            plan = create_append_plan(root,
                                      (AppendPath *) best_path); // build an Append plan
            break;
        case T_MergeAppend: // merge append
            plan = create_merge_append_plan(root,
                                            (MergeAppendPath *) best_path);
            break;
        case T_Result: // Result node: projection, MIN/MAX aggregate, or plain result
            if (IsA(best_path, ProjectionPath))
            {
                plan = create_projection_plan(root,
                                              (ProjectionPath *) best_path,
                                              flags);
            }
            else if (IsA(best_path, MinMaxAggPath))
            {
                plan = (Plan *) create_minmaxagg_plan(root,
                                                      (MinMaxAggPath *) best_path);
            }
            else
            {
                Assert(IsA(best_path, ResultPath));
                plan = (Plan *) create_result_plan(root,
                                                   (ResultPath *) best_path);
            }
            break;
        case T_ProjectSet: // ProjectSet (set-returning functions in the tlist)
            plan = (Plan *) create_project_set_plan(root,
                                                    (ProjectSetPath *) best_path);
            break;
        case T_Material: // materialize
            plan = (Plan *) create_material_plan(root,
                                                 (MaterialPath *) best_path,
                                                 flags);
            break;
        case T_Unique: // duplicate elimination
            if (IsA(best_path, UpperUniquePath))
            {
                plan = (Plan *) create_upper_unique_plan(root,
                                                         (UpperUniquePath *) best_path,
                                                         flags);
            }
            else
            {
                Assert(IsA(best_path, UniquePath));
                plan = create_unique_plan(root,
                                          (UniquePath *) best_path,
                                          flags);
            }
            break;
        case T_Gather: // gather rows from parallel workers
            plan = (Plan *) create_gather_plan(root,
                                               (GatherPath *) best_path);
            break;
        case T_Sort: // sort
            plan = (Plan *) create_sort_plan(root,
                                             (SortPath *) best_path,
                                             flags);
            break;
        case T_Group: // grouping
            plan = (Plan *) create_group_plan(root,
                                              (GroupPath *) best_path);
            break;
        case T_Agg: // aggregation
            if (IsA(best_path, GroupingSetsPath))
                plan = create_groupingsets_plan(root,
                                                (GroupingSetsPath *) best_path);
            else
            {
                Assert(IsA(best_path, AggPath));
                plan = (Plan *) create_agg_plan(root,
                                                (AggPath *) best_path);
            }
            break;
        case T_WindowAgg: // window functions
            plan = (Plan *) create_windowagg_plan(root,
                                                  (WindowAggPath *) best_path);
            break;
        case T_SetOp: // set operation
            plan = (Plan *) create_setop_plan(root,
                                              (SetOpPath *) best_path,
                                              flags);
            break;
        case T_RecursiveUnion: // recursive UNION
            plan = (Plan *) create_recursiveunion_plan(root,
                                                       (RecursiveUnionPath *) best_path);
            break;
        case T_LockRows: // row locking (e.g. FOR UPDATE)
            plan = (Plan *) create_lockrows_plan(root,
                                                 (LockRowsPath *) best_path,
                                                 flags);
            break;
        case T_ModifyTable: // table modification (INSERT/UPDATE/DELETE)
            plan = (Plan *) create_modifytable_plan(root,
                                                    (ModifyTablePath *) best_path);
            break;
        case T_Limit: // LIMIT/OFFSET
            plan = (Plan *) create_limit_plan(root,
                                              (LimitPath *) best_path,
                                              flags);
            break;
        case T_GatherMerge: // gather merge (order-preserving gather)
            plan = (Plan *) create_gather_merge_plan(root,
                                                     (GatherMergePath *) best_path);
            break;
        default: // any other type is an error
            elog(ERROR, "unrecognized node type: %d",
                 (int) best_path->pathtype);
            plan = NULL;        /* keep compiler quiet */
            break;
    }

    return plan;
}
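
The shape of this recursion (one plan node per path node, children converted by recursive calls) can be seen in a much smaller model. The sketch below is only an illustration under simplifying assumptions: ToyPath, ToyPlan, and the toy_* functions are hypothetical names, and unlike the real planner the toy hash join does not insert a separate Hash node above its inner input.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy types; the real Path and Plan structs carry far more. */
typedef enum ToyTag { TOY_SEQSCAN, TOY_HASHJOIN, TOY_SORT } ToyTag;

typedef struct ToyPath
{
    ToyTag      pathtype;
    struct ToyPath *outer;      /* child paths, NULL when absent */
    struct ToyPath *inner;
} ToyPath;

typedef struct ToyPlan
{
    ToyTag      type;
    struct ToyPlan *lefttree;
    struct ToyPlan *righttree;
} ToyPlan;

static ToyPlan *
toy_make_plan(ToyTag type, ToyPlan *left, ToyPlan *right)
{
    ToyPlan    *plan = malloc(sizeof(ToyPlan));

    if (plan == NULL)
        abort();
    plan->type = type;
    plan->lefttree = left;
    plan->righttree = right;
    return plan;
}

/* Build one plan node per path node; child paths are converted recursively. */
static ToyPlan *
toy_create_plan_recurse(const ToyPath *path)
{
    switch (path->pathtype)
    {
        case TOY_SEQSCAN:       /* leaf: no children */
            return toy_make_plan(TOY_SEQSCAN, NULL, NULL);
        case TOY_HASHJOIN:      /* two inputs */
            return toy_make_plan(TOY_HASHJOIN,
                                 toy_create_plan_recurse(path->outer),
                                 toy_create_plan_recurse(path->inner));
        case TOY_SORT:          /* one input, stored in lefttree */
            return toy_make_plan(TOY_SORT,
                                 toy_create_plan_recurse(path->outer),
                                 NULL);
    }
    abort();                    /* unrecognized tag */
    return NULL;                /* keep compiler quiet */
}

/* Count the nodes of a plan tree. */
static int
toy_count_nodes(const ToyPlan *plan)
{
    if (plan == NULL)
        return 0;
    return 1 + toy_count_nodes(plan->lefttree) + toy_count_nodes(plan->righttree);
}

/* Release a plan tree built by toy_create_plan_recurse. */
static void
toy_free_plan(ToyPlan *plan)
{
    if (plan == NULL)
        return;
    toy_free_plan(plan->lefttree);
    toy_free_plan(plan->righttree);
    free(plan);
}
```

Feeding it a Sort path over a HashJoin path over two SeqScan paths yields a four-node plan tree whose root has a lefttree and no righttree.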


//------------------------------------------------------------------------ apply_tlist_labeling

 /*
  * apply_tlist_labeling
  *      Apply the TargetEntry labeling attributes of src_tlist to dest_tlist
  *
  * This is useful for reattaching column names etc to a plan's final output
  * targetlist.
  */
 void
 apply_tlist_labeling(List *dest_tlist, List *src_tlist)
 {
     ListCell   *ld,
                *ls;
 
     Assert(list_length(dest_tlist) == list_length(src_tlist));
     forboth(ld, dest_tlist, ls, src_tlist)
     {
         TargetEntry *dest_tle = (TargetEntry *) lfirst(ld);
         TargetEntry *src_tle = (TargetEntry *) lfirst(ls);
 
         Assert(dest_tle->resno == src_tle->resno);
         dest_tle->resname = src_tle->resname;
         dest_tle->ressortgroupref = src_tle->ressortgroupref;
         dest_tle->resorigtbl = src_tle->resorigtbl;
         dest_tle->resorigcol = src_tle->resorigcol;
         dest_tle->resjunk = src_tle->resjunk;
     }
 }
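
The labeling step is a pairwise copy over two lists of equal length. As a rough sketch only, the same idea can be written with plain arrays in place of PostgreSQL's List; ToyTargetEntry and toy_apply_labeling are hypothetical names, and just three of the fields are modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for TargetEntry. */
typedef struct ToyTargetEntry
{
    int         resno;          /* column position, 1-based */
    const char *resname;        /* column name, or NULL */
    int         resjunk;        /* nonzero if hidden from the final result */
} ToyTargetEntry;

/*
 * Copy labeling fields from src to dest, entry by entry.  Both arrays must
 * hold n entries with matching resno, mirroring the asserts in the original.
 */
static void
toy_apply_labeling(ToyTargetEntry *dest, const ToyTargetEntry *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        assert(dest[i].resno == src[i].resno);
        dest[i].resname = src[i].resname;
        dest[i].resjunk = src[i].resjunk;
    }
}
```

Only the decorative fields are overwritten; the expression each destination entry computes is left alone, which is why the planner can postpone this step until the top-level plan node exists.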
 

//------------------------------------------------------------------------ SS_attach_initplans
 /*
  * SS_attach_initplans - attach initplans to topmost plan node
  *
  * Attach any initplans created in the current query level to the specified
  * plan node, which should normally be the topmost node for the query level.
  * (In principle the initPlans could go in any node at or above where they're
  * referenced; but there seems no reason to put them any lower than the
  * topmost node, so we don't bother to track exactly where they came from.)
  * We do not touch the plan node's cost; the initplans should have been
  * accounted for in path costing.
  */
 void
 SS_attach_initplans(PlannerInfo *root, Plan *plan)
 {
     plan->initPlan = root->init_plans;
 }

3. Trace Analysis

The test script is as follows:

testdb=# explain select dw.*,grjf.grbh,grjf.xm,grjf.ny,grjf.je 
testdb-# from t_dwxx dw,lateral (select gr.grbh,gr.xm,jf.ny,jf.je 
testdb(#                         from t_grxx gr inner join t_jfxx jf 
testdb(#                                        on gr.dwbh = dw.dwbh 
testdb(#                                           and gr.grbh = jf.grbh) grjf
testdb-# order by dw.dwbh;
                                        QUERY PLAN                                        
------------------------------------------------------------------------------------------
 Sort  (cost=20070.93..20320.93 rows=100000 width=47)
   Sort Key: dw.dwbh
   ->  Hash Join  (cost=3754.00..8689.61 rows=100000 width=47)
         Hash Cond: ((gr.dwbh)::text = (dw.dwbh)::text)
         ->  Hash Join  (cost=3465.00..8138.00 rows=100000 width=31)
               Hash Cond: ((jf.grbh)::text = (gr.grbh)::text)
               ->  Seq Scan on t_jfxx jf  (cost=0.00..1637.00 rows=100000 width=20)
               ->  Hash  (cost=1726.00..1726.00 rows=100000 width=16)
                     ->  Seq Scan on t_grxx gr  (cost=0.00..1726.00 rows=100000 width=16)
         ->  Hash  (cost=164.00..164.00 rows=10000 width=20)
               ->  Seq Scan on t_dwxx dw  (cost=0.00..164.00 rows=10000 width=20)
(11 rows)

Start gdb, set a breakpoint, and continue until create_plan is entered:

(gdb) info break
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   0x00000000007b76c1 in create_plan at createplan.c:313
(gdb) c
Continuing.

Breakpoint 2, create_plan (root=0x26c1258, best_path=0x2722d00) at createplan.c:313
313     Assert(root->plan_params == NIL);

Step into create_plan_recurse:

313     Assert(root->plan_params == NIL);
(gdb) n
316     root->curOuterRels = NULL;
(gdb) 
317     root->curOuterParams = NIL;
(gdb) 
320     plan = create_plan_recurse(root, best_path, CP_EXACT_TLIST);
(gdb) step
create_plan_recurse (root=0x26c1258, best_path=0x2722d00, flags=1) at createplan.c:364
364     check_stack_depth();
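
The flags=1 shown in the frame above is the CP_EXACT_TLIST value passed by create_plan. The flags argument is a bitmask, and the sketch below illustrates the usual test for one bit. The TOY_CP_* constants copy the CP_* values as recalled from createplan.c of that era, which is an assumption worth checking against your own source tree; toy_wants_exact_tlist is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy copies of the CP_* bits; the values are an assumption, verify in createplan.c. */
#define TOY_CP_EXACT_TLIST  0x0001  /* plan must return the specified tlist */
#define TOY_CP_SMALL_TLIST  0x0002  /* prefer narrower tlists */
#define TOY_CP_LABEL_TLIST  0x0004  /* tlist must carry sortgrouprefs */

/* True when the caller demands exactly the requested target list. */
static bool
toy_wants_exact_tlist(int flags)
{
    return (flags & TOY_CP_EXACT_TLIST) != 0;
}
```

Because each flag occupies its own bit, callers can combine several requests with the bitwise OR operator and each create_*_plan function can test only the bits it cares about.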

The branch is chosen by the path's pathtype, here T_Result; the node itself is a T_ProjectionPath:

(gdb) p *best_path
$1 = {type = T_ProjectionPath, pathtype = T_Result, parent = 0x2722998, pathtarget = 0x27226f8, param_info = 0x0, 
  parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 100000, startup_cost = 20070.931487218411, 
  total_cost = 20320.931487218411, pathkeys = 0x26cfe98}

create_projection_plan is called:

(gdb) n
400             if (IsA(best_path, ProjectionPath))
(gdb) 
402                 plan = create_projection_plan(root,

This produces the corresponding Plan (a T_Sort node with only a left subtree; righttree is 0x0). The next section explains create_projection_plan in detail.

(gdb) 
504     return plan;
(gdb) p *plan
$1 = {type = T_Sort, startup_cost = 20070.931487218411, total_cost = 20320.931487218411, plan_rows = 100000, 
  plan_width = 47, parallel_aware = false, parallel_safe = true, plan_node_id = 0, targetlist = 0x2724548, qual = 0x0, 
  lefttree = 0x27243d0, righttree = 0x0, initPlan = 0x0, extParam = 0x0, allParam = 0x0}

Execution returns to create_plan:

(gdb) 
create_plan (root=0x270f9c8, best_path=0x2722d00) at createplan.c:329
329     if (!IsA(plan, ModifyTable))
(gdb) 
330         apply_tlist_labeling(plan->targetlist, root->processed_tlist);
(gdb) 
339     SS_attach_initplans(root, plan);
(gdb) 
342     if (root->curOuterParams != NIL)
(gdb) 
349     root->plan_params = NIL;
(gdb) 
351     return plan;

DONE!

4. References

createplan.c
PG Document: Query Planning

From the "ITPUB blog", link: http://blog.itpub.net/6906/viewspace-2374813/, please credit the source when reposting; otherwise legal liability will be pursued.
