PostgreSQL Source Code Walkthrough (83) - Query Statements #68 (the PortalStart Function)

Posted by husthxd on 2018-11-12

This section covers the PortalStart function. It is called from exec_simple_query and initializes the relevant fields of the portal structure before execution begins.

I. Data Structures

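Portal
This part covers the PortalStrategy enum (execution strategies), the PortalStatus enum (portal states), and the PortalData struct. Portal is a pointer to a PortalData struct; see the code comments for details.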

/*
 * We have several execution strategies for Portals, depending on what
 * query or queries are to be executed.  (Note: in all cases, a Portal
 * executes just a single source-SQL query, and thus produces just a
 * single result from the user's viewpoint.  However, the rule rewriter
 * may expand the single source query to zero or many actual queries.)
 * 
 * PORTAL_ONE_SELECT: the portal contains one single SELECT query.  We run
 * the Executor incrementally as results are demanded.  This strategy also
 * supports holdable cursors (the Executor results can be dumped into a
 * tuplestore for access after transaction completion).
 * 
 * PORTAL_ONE_RETURNING: the portal contains a single INSERT/UPDATE/DELETE
 * query with a RETURNING clause (plus possibly auxiliary queries added by
 * rule rewriting).  On first execution, we run the portal to completion
 * and dump the primary query's results into the portal tuplestore; the
 * results are then returned to the client as demanded.  (We can't support
 * suspension of the query partway through, because the AFTER TRIGGER code
 * can't cope, and also because we don't want to risk failing to execute
 * all the auxiliary queries.)
 * 
 * PORTAL_ONE_MOD_WITH: the portal contains one single SELECT query, but
 * it has data-modifying CTEs.  This is currently treated the same as the
 * PORTAL_ONE_RETURNING case because of the possibility of needing to fire
 * triggers.  It may act more like PORTAL_ONE_SELECT in future.
 * 
 * PORTAL_UTIL_SELECT: the portal contains a utility statement that returns
 * a SELECT-like result (for example, EXPLAIN or SHOW).  On first execution,
 * we run the statement and dump its results into the portal tuplestore;
 * the results are then returned to the client as demanded.
 * 
 * PORTAL_MULTI_QUERY: all other cases.  Here, we do not support partial
 * execution: the portal's queries will be run to completion on first call.
 */
typedef enum PortalStrategy
{
    PORTAL_ONE_SELECT,
    PORTAL_ONE_RETURNING,
    PORTAL_ONE_MOD_WITH,
    PORTAL_UTIL_SELECT,
    PORTAL_MULTI_QUERY
} PortalStrategy;

/*
 * A portal is always in one of these states.  It is possible to transit
 * from ACTIVE back to READY if the query is not run to completion;
 * otherwise we never back up in status.
 */
typedef enum PortalStatus
{
    PORTAL_NEW,                 /* freshly created */
    PORTAL_DEFINED,             /* PortalDefineQuery done */
    PORTAL_READY,               /* PortalStart complete, can run it */
    PORTAL_ACTIVE,              /* portal is running (can't delete it) */
    PORTAL_DONE,                /* portal is finished (don't re-run it) */
    PORTAL_FAILED               /* portal got error (can't re-run it) */
} PortalStatus;

typedef struct PortalData *Portal;  /* Portal is a pointer to PortalData */

typedef struct PortalData
{
    /* Bookkeeping data */
    const char *name;           /* portal's name */
    const char *prepStmtName;   /* source prepared statement (NULL if none) */
    MemoryContext portalContext;    /* subsidiary memory for portal */
    ResourceOwner resowner;     /* resources owned by portal */
    void        (*cleanup) (Portal portal); /* cleanup hook */

    /*
     * State data for remembering which subtransaction(s) the portal was
     * created or used in.  If the portal is held over from a previous
     * transaction, both subxids are InvalidSubTransactionId.  Otherwise,
     * createSubid is the creating subxact and activeSubid is the last subxact
     * in which we ran the portal.
     */
    SubTransactionId createSubid;   /* the creating subxact */
    SubTransactionId activeSubid;   /* the last subxact with activity */

    /* The query or queries the portal will execute */
    const char *sourceText;     /* text of query (as of 8.4, never NULL) */
    const char *commandTag;     /* command tag for original query */
    List       *stmts;          /* list of PlannedStmts */
    CachedPlan *cplan;          /* CachedPlan, if stmts are from one */

    ParamListInfo portalParams; /* params to pass to query */
    QueryEnvironment *queryEnv; /* environment for query */

    /* Features/options */
    PortalStrategy strategy;    /* execution strategy; see above */
    int         cursorOptions;  /* DECLARE CURSOR option bits */
    bool        run_once;       /* portal will only be run once */

    /* Status data */
    PortalStatus status;        /* portal state; see above */
    bool        portalPinned;   /* a pinned portal can't be dropped */
    bool        autoHeld;       /* was automatically converted from pinned to
                                 * held (see HoldPinnedPortals()) */

    /* If not NULL, Executor is active; call ExecutorEnd eventually: */
    QueryDesc  *queryDesc;      /* info needed for executor invocation */

    /* If portal returns tuples, this is their tupdesc: */
    TupleDesc   tupDesc;        /* descriptor for result tuples */
    /* and these are the format codes to use for the columns: */
    int16      *formats;        /* a format code for each column */

    /*
     * Where we store tuples for a held cursor or a PORTAL_ONE_RETURNING or
     * PORTAL_UTIL_SELECT query.  (A cursor held past the end of its
     * transaction no longer has any active executor state.)
     */
    Tuplestorestate *holdStore; /* store for holdable cursors */
    MemoryContext holdContext;  /* memory containing holdStore */

    /*
     * Snapshot under which tuples in the holdStore were read.  We must keep a
     * reference to this snapshot if there is any possibility that the tuples
     * contain TOAST references, because releasing the snapshot could allow
     * recently-dead rows to be vacuumed away, along with any toast data
     * belonging to them.  In the case of a held cursor, we avoid needing to
     * keep such a snapshot by forcibly detoasting the data.
     */
    Snapshot    holdSnapshot;   /* registered snapshot, or NULL if none */

    /*
     * atStart, atEnd and portalPos indicate the current cursor position.
     * portalPos is zero before the first row, N after fetching N'th row of
     * query.  After we run off the end, portalPos = # of rows in query, and
     * atEnd is true.  Note that atStart implies portalPos == 0, but not the
     * reverse: we might have backed up only as far as the first row, not to
     * the start.  Also note that various code inspects atStart and atEnd, but
     * only the portal movement routines should touch portalPos.
     */
    bool        atStart;        /* positioned at the start? */
    bool        atEnd;          /* positioned at the end? */
    uint64      portalPos;      /* current row position */

    /* Presentation data, primarily used by the pg_cursors system view */
    TimestampTz creation_time;  /* time at which this portal was defined */
    bool        visible;        /* include this portal in pg_cursors? */
}           PortalData;

/*
 * PortalIsValid
 *      True iff portal is valid.
 */
#define PortalIsValid(p) PointerIsValid(p)
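
The PortalStatus comment above amounts to a simple ordering rule. As a rough sketch only (this is not PostgreSQL code, and every `Sketch`/`sketch_` name is invented here for illustration), the rule could be modelled as follows. It encodes just the ordering described in the comment, not the extra assertions that individual PostgreSQL functions make about the status they expect.

```c
#include <stdbool.h>

/* Hypothetical mirror of PortalStatus, declared in the same order. */
typedef enum SketchPortalStatus
{
    SKETCH_PORTAL_NEW,
    SKETCH_PORTAL_DEFINED,
    SKETCH_PORTAL_READY,
    SKETCH_PORTAL_ACTIVE,
    SKETCH_PORTAL_DONE,
    SKETCH_PORTAL_FAILED
} SketchPortalStatus;

/*
 * Returns true if moving from "from" to "to" respects the rule in the
 * comment: the status never backs up, except ACTIVE -> READY when the
 * query was not run to completion.
 */
static bool
sketch_status_change_ok(SketchPortalStatus from, SketchPortalStatus to)
{
    if (from == SKETCH_PORTAL_ACTIVE && to == SKETCH_PORTAL_READY)
        return true;            /* the one permitted backward step */
    return to >= from;          /* otherwise forward (or unchanged) only */
}
```

For example, `sketch_status_change_ok(SKETCH_PORTAL_DEFINED, SKETCH_PORTAL_READY)` models the transition that PortalStart performs when it finishes.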

QueryDesc
A QueryDesc encapsulates everything the executor needs to execute a query.

/* ----------------
 *      query descriptor:
 *
 *  a QueryDesc encapsulates everything that the executor
 *  needs to execute the query.
 *
 *  For the convenience of SQL-language functions, we also support QueryDescs
 *  containing utility statements; these must not be passed to the executor
 *  however.
 * ---------------------
 */
typedef struct QueryDesc
{
    /* These fields are provided by CreateQueryDesc */
    CmdType     operation;      /* CMD_SELECT, CMD_UPDATE, etc. */
    PlannedStmt *plannedstmt;   /* planner's output (could be utility, too) */
    const char *sourceText;     /* source text of the query */
    Snapshot    snapshot;       /* snapshot to use for query */
    Snapshot    crosscheck_snapshot;    /* crosscheck for RI update/delete */
    DestReceiver *dest;         /* the destination for tuple output */
    ParamListInfo params;       /* param values being passed in */
    QueryEnvironment *queryEnv; /* query environment passed in */
    int         instrument_options; /* OR of InstrumentOption flags */

    /* These fields are set by ExecutorStart */
    TupleDesc   tupDesc;        /* descriptor for result tuples */
    EState     *estate;         /* executor's query-wide state */
    PlanState  *planstate;      /* tree of per-plan-node state */

    /* This field is set by ExecutorRun */
    bool        already_executed;   /* true if previously executed */

    /* This is always set NULL by the core system, but plugins can change it */
    struct Instrumentation *totaltime;  /* total time spent in ExecutorRun */
} QueryDesc;

II. Source Code Walkthrough

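PortalStart
PortalStart initializes the relevant fields of the portal structure before the SQL statement is executed. Two important data structures are set up along the way:
1. CreateQueryDesc is called to build the QueryDesc struct.
2. ExecutorStart is called to initialize the EState struct. ExecutorStart (through standard_ExecutorStart) calls InitPlan, covered in the next section, to initialize the plan state tree.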

/*
 * PortalStart
 *      Prepare a portal for execution.
 * 
 * Caller must already have created the portal, done PortalDefineQuery(),
 * and adjusted portal options if needed.
 * 
 * If parameters are needed by the query, they must be passed in "params"
 * (caller is responsible for giving them appropriate lifetime).
 * 
 * The caller can also provide an initial set of "eflags" to be passed to
 * ExecutorStart (but note these can be modified internally, and they are
 * currently only honored for PORTAL_ONE_SELECT portals).  Most callers
 * should simply pass zero.
 *
 * The caller can optionally pass a snapshot to be used; pass InvalidSnapshot
 * for the normal behavior of setting a new snapshot.  This parameter is
 * presently ignored for non-PORTAL_ONE_SELECT portals (it's only intended
 * to be used for cursors).
 * 
 * On return, portal is ready to accept PortalRun() calls, and the result
 * tupdesc (if any) is known.
 */
void
PortalStart(Portal portal, ParamListInfo params,
            int eflags, Snapshot snapshot)
{
    Portal      saveActivePortal;
    ResourceOwner saveResourceOwner;
    MemoryContext savePortalContext;
    MemoryContext oldContext;
    QueryDesc  *queryDesc;
    int         myeflags;

    AssertArg(PortalIsValid(portal));
    AssertState(portal->status == PORTAL_DEFINED);

    /*
     * Set up global portal context pointers.
     */
    /* Save the current global state so that it can be restored afterwards. */
    saveActivePortal = ActivePortal;
    saveResourceOwner = CurrentResourceOwner;
    savePortalContext = PortalContext;
    PG_TRY();
    {
        ActivePortal = portal;
        if (portal->resowner)
            CurrentResourceOwner = portal->resowner;
        PortalContext = portal->portalContext;

        oldContext = MemoryContextSwitchTo(PortalContext);

        /* Must remember portal param list, if any */
        portal->portalParams = params;

        /*
         * Determine the portal execution strategy
         */
        portal->strategy = ChoosePortalStrategy(portal->stmts);

        /*
         * Fire her up according to the strategy
         */
        switch (portal->strategy)
        {
            case PORTAL_ONE_SELECT:

                /* Must set snapshot before starting executor. */
                if (snapshot)
                    PushActiveSnapshot(snapshot);
                else
                    PushActiveSnapshot(GetTransactionSnapshot());

                /*
                 * Create QueryDesc in portal's context; for the moment, set
                 * the destination to DestNone.
                 */
                queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
                                            portal->sourceText,
                                            GetActiveSnapshot(),
                                            InvalidSnapshot,
                                            None_Receiver,
                                            params,
                                            portal->queryEnv,
                                            0);

                /*
                 * If it's a scrollable cursor, executor needs to support
                 * REWIND and backwards scan, as well as whatever the caller
                 * might've asked for.
                 */
                if (portal->cursorOptions & CURSOR_OPT_SCROLL)
                    myeflags = eflags | EXEC_FLAG_REWIND | EXEC_FLAG_BACKWARD;
                else
                    myeflags = eflags;

                /*
                 * Call ExecutorStart to prepare the plan for execution
                 */
                ExecutorStart(queryDesc, myeflags);

                /*
                 * This tells PortalCleanup to shut down the executor
                 */
                portal->queryDesc = queryDesc;

                /*
                 * Remember tuple descriptor (computed by ExecutorStart)
                 */
                portal->tupDesc = queryDesc->tupDesc;

                /*
                 * Reset cursor position data to "start of query"
                 */
                portal->atStart = true;
                portal->atEnd = false;  /* allow fetches */
                portal->portalPos = 0;

                PopActiveSnapshot();
                break;

            case PORTAL_ONE_RETURNING:
            case PORTAL_ONE_MOD_WITH:

                /*
                 * We don't start the executor until we are told to run the
                 * portal.  We do need to set up the result tupdesc.
                 */
                {
                    PlannedStmt *pstmt;

                    pstmt = PortalGetPrimaryStmt(portal);   /* fetch the primary statement */
                    portal->tupDesc =
                        ExecCleanTypeFromTL(pstmt->planTree->targetlist,
                                            false); /* build the result tuple descriptor */
                }

                /*
                 * Reset cursor position data to "start of query"
                 */
                portal->atStart = true;
                portal->atEnd = false;  /* allow fetches */
                portal->portalPos = 0;
                break;

            case PORTAL_UTIL_SELECT:

                /*
                 * We don't set snapshot here, because PortalRunUtility will
                 * take care of it if needed.
                 */
                {
                    PlannedStmt *pstmt = PortalGetPrimaryStmt(portal);

                    Assert(pstmt->commandType == CMD_UTILITY);
                    portal->tupDesc = UtilityTupleDescriptor(pstmt->utilityStmt);
                }

                /*
                 * Reset cursor position data to "start of query"
                 */
                portal->atStart = true;
                portal->atEnd = false;  /* allow fetches */
                portal->portalPos = 0;
                break;

            case PORTAL_MULTI_QUERY:
                /* Need do nothing now */
                portal->tupDesc = NULL;
                break;
        }
    }
    PG_CATCH();
    {
        /* Uncaught error while executing portal: mark it dead */
        MarkPortalFailed(portal);

        /* Restore global vars and propagate error */
        ActivePortal = saveActivePortal;
        CurrentResourceOwner = saveResourceOwner;
        PortalContext = savePortalContext;

        PG_RE_THROW();
    }
    PG_END_TRY();

    MemoryContextSwitchTo(oldContext);

    ActivePortal = saveActivePortal;
    CurrentResourceOwner = saveResourceOwner;
    PortalContext = savePortalContext;

    portal->status = PORTAL_READY;
}
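
In the PORTAL_ONE_SELECT branch above, the flags handed to ExecutorStart are the caller's eflags plus, for scrollable cursors, the rewind and backward-scan bits. A minimal sketch of just that computation follows. The `SKETCH_` constants are stand-ins defined here for illustration, and their numeric values are assumptions rather than values copied from PostgreSQL's headers.

```c
/* Hypothetical stand-ins for CURSOR_OPT_SCROLL and the EXEC_FLAG_* bits. */
#define SKETCH_CURSOR_OPT_SCROLL   0x0002
#define SKETCH_EXEC_FLAG_REWIND    0x0002
#define SKETCH_EXEC_FLAG_BACKWARD  0x0004

/*
 * Compute the executor flags for a portal: scrollable cursors need REWIND
 * and BACKWARD in addition to whatever the caller asked for.
 */
static int
sketch_portal_eflags(int cursor_options, int caller_eflags)
{
    if (cursor_options & SKETCH_CURSOR_OPT_SCROLL)
        return caller_eflags | SKETCH_EXEC_FLAG_REWIND | SKETCH_EXEC_FLAG_BACKWARD;
    return caller_eflags;
}
```

Caller-supplied bits are passed through unchanged when the cursor is not scrollable, which matches the `myeflags = eflags` branch in the listing.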


/*
 * ChoosePortalStrategy
 *      Select portal execution strategy given the intended statement list.
 *
 * The list elements can be Querys or PlannedStmts.
 * That's more general than portals need, but plancache.c uses this too.
 *
 * See the comments in portal.h.
 * 參見portal.h中的註釋.
 */
PortalStrategy
ChoosePortalStrategy(List *stmts)
{
    int         nSetTag;
    ListCell   *lc;

    /*
     * PORTAL_ONE_SELECT and PORTAL_UTIL_SELECT need only consider the
     * single-statement case, since there are no rewrite rules that can add
     * auxiliary queries to a SELECT or a utility command. PORTAL_ONE_MOD_WITH
     * likewise allows only one top-level statement.
     */
    if (list_length(stmts) == 1)    /* exactly one statement */
    {
        Node       *stmt = (Node *) linitial(stmts);

        if (IsA(stmt, Query))
        {
            Query      *query = (Query *) stmt;

            if (query->canSetTag)
            {
                if (query->commandType == CMD_SELECT)
                {
                    if (query->hasModifyingCTE)
                        return PORTAL_ONE_MOD_WITH; /* SELECT with a data-modifying CTE */
                    else
                        return PORTAL_ONE_SELECT;   /* plain single SELECT */
                }
                if (query->commandType == CMD_UTILITY)
                {
                    if (UtilityReturnsTuples(query->utilityStmt))
                        return PORTAL_UTIL_SELECT;  /* utility statement that returns tuples */
                    /* it can't be ONE_RETURNING, so give up */
                    return PORTAL_MULTI_QUERY;
                }
                }
            }
        }
        else if (IsA(stmt, PlannedStmt))    /* same logic as the Query branch */
        {
            PlannedStmt *pstmt = (PlannedStmt *) stmt;

            if (pstmt->canSetTag)
            {
                if (pstmt->commandType == CMD_SELECT)
                {
                    if (pstmt->hasModifyingCTE)
                        return PORTAL_ONE_MOD_WITH;
                    else
                        return PORTAL_ONE_SELECT;
                }
                if (pstmt->commandType == CMD_UTILITY)
                {
                    if (UtilityReturnsTuples(pstmt->utilityStmt))
                        return PORTAL_UTIL_SELECT;
                    /* it can't be ONE_RETURNING, so give up */
                    return PORTAL_MULTI_QUERY;
                }
            }
        }
        else
            elog(ERROR, "unrecognized node type: %d", (int) nodeTag(stmt));
    }

    /* Reached for multiple statements, and for a single statement not decided above */
    /*
     * PORTAL_ONE_RETURNING has to allow auxiliary queries added by rewrite.
     * Choose PORTAL_ONE_RETURNING if there is exactly one canSetTag query and
     * it has a RETURNING list.
     */
    nSetTag = 0;
    foreach(lc, stmts)
    {
        Node       *stmt = (Node *) lfirst(lc);

        if (IsA(stmt, Query))
        {
            Query      *query = (Query *) stmt;

            if (query->canSetTag)
            {
                if (++nSetTag > 1)
                    return PORTAL_MULTI_QUERY;  /* no need to look further */
                if (query->commandType == CMD_UTILITY ||
                    query->returningList == NIL)
                    return PORTAL_MULTI_QUERY;  /* no need to look further */
            }
        }
        else if (IsA(stmt, PlannedStmt))
        {
            PlannedStmt *pstmt = (PlannedStmt *) stmt;

            if (pstmt->canSetTag)
            {
                if (++nSetTag > 1)
                    return PORTAL_MULTI_QUERY;  /* no need to look further */
                if (pstmt->commandType == CMD_UTILITY ||
                    !pstmt->hasReturning)
                    return PORTAL_MULTI_QUERY;  /* no need to look further */
            }
        }
        else
            elog(ERROR, "unrecognized node type: %d", (int) nodeTag(stmt));
    }
    if (nSetTag == 1)
        return PORTAL_ONE_RETURNING;

    /* Else, it's the general case... */
    return PORTAL_MULTI_QUERY;
}
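
The decision made by ChoosePortalStrategy can be easier to follow with the node types stripped away. The sketch below is a simplified model, not PostgreSQL code: `SketchStmt` is a hypothetical flattened statement that keeps only the fields the decision inspects, and `utility_returns_tuples` stands in for the result of UtilityReturnsTuples().

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { SK_CMD_SELECT, SK_CMD_INSERT, SK_CMD_UPDATE,
               SK_CMD_DELETE, SK_CMD_UTILITY } SketchCmdType;

typedef enum { SK_ONE_SELECT, SK_ONE_RETURNING, SK_ONE_MOD_WITH,
               SK_UTIL_SELECT, SK_MULTI_QUERY } SketchStrategy;

/* Hypothetical flattened statement: only what the decision looks at. */
typedef struct SketchStmt
{
    SketchCmdType cmd;
    bool        can_set_tag;
    bool        has_modifying_cte;
    bool        has_returning;
    bool        utility_returns_tuples;
} SketchStmt;

static SketchStrategy
sketch_choose_strategy(const SketchStmt *stmts, size_t n)
{
    size_t      i;
    int         n_set_tag = 0;

    /* Single-statement cases: SELECT and utility commands. */
    if (n == 1 && stmts[0].can_set_tag)
    {
        if (stmts[0].cmd == SK_CMD_SELECT)
            return stmts[0].has_modifying_cte ? SK_ONE_MOD_WITH : SK_ONE_SELECT;
        if (stmts[0].cmd == SK_CMD_UTILITY)
            return stmts[0].utility_returns_tuples ? SK_UTIL_SELECT : SK_MULTI_QUERY;
    }

    /* ONE_RETURNING: exactly one canSetTag statement, and it has RETURNING. */
    for (i = 0; i < n; i++)
    {
        if (!stmts[i].can_set_tag)
            continue;           /* auxiliary query added by rewriting */
        if (++n_set_tag > 1)
            return SK_MULTI_QUERY;
        if (stmts[i].cmd == SK_CMD_UTILITY || !stmts[i].has_returning)
            return SK_MULTI_QUERY;
    }
    return (n_set_tag == 1) ? SK_ONE_RETURNING : SK_MULTI_QUERY;
}
```

Under this model, an INSERT with RETURNING accompanied by one rule-generated auxiliary statement (whose `can_set_tag` is false) still yields SK_ONE_RETURNING, while two tag-setting statements yield SK_MULTI_QUERY.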


/*
 * CreateQueryDesc
 *      Build a QueryDesc struct.
 */
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
                const char *sourceText,
                Snapshot snapshot,
                Snapshot crosscheck_snapshot,
                DestReceiver *dest,
                ParamListInfo params,
                QueryEnvironment *queryEnv,
                int instrument_options)
{
    QueryDesc  *qd = (QueryDesc *) palloc(sizeof(QueryDesc));

    qd->operation = plannedstmt->commandType;   /* operation */
    qd->plannedstmt = plannedstmt;  /* plan */
    qd->sourceText = sourceText;    /* query text */
    qd->snapshot = RegisterSnapshot(snapshot);  /* snapshot */
    /* RI check snapshot */
    qd->crosscheck_snapshot = RegisterSnapshot(crosscheck_snapshot);
    qd->dest = dest;            /* output dest */
    qd->params = params;        /* parameter values passed into query */
    qd->queryEnv = queryEnv;    /* query environment */
    qd->instrument_options = instrument_options;    /* instrumentation wanted? */

    /* null these fields until set by ExecutorStart */
    qd->tupDesc = NULL;
    qd->estate = NULL;
    qd->planstate = NULL;
    qd->totaltime = NULL;

    /* not yet executed */
    qd->already_executed = false;

    return qd;
}


/* ----------------------------------------------------------------
 *      ExecutorStart
 *
 *      This routine must be called at the beginning of any execution of any
 *      query plan
 *
 * Takes a QueryDesc previously created by CreateQueryDesc (which is separate
 * only because some places use QueryDescs for utility commands).  The tupDesc
 * field of the QueryDesc is filled in to describe the tuples that will be
 * returned, and the internal fields (estate and planstate) are set up.
 * 
 * eflags contains flag bits as described in executor.h.
 * 
 * NB: the CurrentMemoryContext when this is called will become the parent
 * of the per-query context used for this Executor invocation.
 *
 * We provide a function hook variable that lets loadable plugins
 * get control when ExecutorStart is called.  Such a plugin would
 * normally call standard_ExecutorStart().
 *
 * ----------------------------------------------------------------
 */
void
ExecutorStart(QueryDesc *queryDesc, int eflags)
{
    if (ExecutorStart_hook)     /* a plugin has installed a hook */
        (*ExecutorStart_hook) (queryDesc, eflags);
    else
        standard_ExecutorStart(queryDesc, eflags);
}
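
The hook dispatch in ExecutorStart follows a common function-pointer pattern. The sketch below reproduces only that pattern with invented `sketch_` names, so it should be read as an illustration and not as how any real extension is written. The counters exist purely so the call order can be observed.

```c
#include <stddef.h>

typedef void (*sketch_start_hook_type) (int eflags);

/* NULL unless a "plugin" installs something, like ExecutorStart_hook. */
static sketch_start_hook_type sketch_start_hook = NULL;

static int  sketch_standard_calls = 0;
static int  sketch_plugin_calls = 0;

/* Stand-in for standard_ExecutorStart. */
static void
sketch_standard_start(int eflags)
{
    (void) eflags;
    sketch_standard_calls++;
}

/* Stand-in for ExecutorStart: use the hook if set, else the default. */
static void
sketch_start(int eflags)
{
    if (sketch_start_hook)
        (*sketch_start_hook) (eflags);
    else
        sketch_standard_start(eflags);
}

/* A hypothetical plugin hook: do extra work, then defer to the default. */
static void
sketch_plugin_start(int eflags)
{
    sketch_plugin_calls++;
    sketch_standard_start(eflags);
}
```

Assigning `sketch_start_hook = sketch_plugin_start;` models a plugin taking control; since the plugin calls the standard routine itself, the default behaviour is preserved.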

void
standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
    EState     *estate;
    MemoryContext oldcontext;

    /* sanity checks: queryDesc must not be started already */
    Assert(queryDesc != NULL);
    Assert(queryDesc->estate == NULL);

    /*
     * If the transaction is read-only, we need to check if any writes are
     * planned to non-temporary tables.  EXPLAIN is considered read-only.
     * 
     * Don't allow writes in parallel mode.  Supporting UPDATE and DELETE
     * would require (a) storing the combocid hash in shared memory, rather
     * than synchronizing it just once at the start of parallelism, and (b) an
     * alternative to heap_update()'s reliance on xmax for mutual exclusion.
     * INSERT may have no such troubles, but we forbid it to simplify the
     * checks.
     * 
     * We have lower-level defenses in CommandCounterIncrement and elsewhere
     * against performing unsafe operations in parallel mode, but this gives a
     * more user-friendly error message.
     */
    if ((XactReadOnly || IsInParallelMode()) &&
        !(eflags & EXEC_FLAG_EXPLAIN_ONLY))
        ExecCheckXactReadOnly(queryDesc->plannedstmt);

    /*
     * Build EState, switch into per-query memory context for startup.
     */
    estate = CreateExecutorState();
    queryDesc->estate = estate;

    oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);

    /*
     * Fill in external parameters, if any, from queryDesc; and allocate
     * workspace for internal parameters
     */
    estate->es_param_list_info = queryDesc->params;

    if (queryDesc->plannedstmt->paramExecTypes != NIL)
    {
        int         nParamExec;

        nParamExec = list_length(queryDesc->plannedstmt->paramExecTypes);
        estate->es_param_exec_vals = (ParamExecData *)
            palloc0(nParamExec * sizeof(ParamExecData));
    }

    estate->es_sourceText = queryDesc->sourceText;

    /*
     * Fill in the query environment, if any, from queryDesc.
     */
    estate->es_queryEnv = queryDesc->queryEnv;

    /*
     * If non-read-only query, set the command ID to mark output tuples with
     */
    switch (queryDesc->operation)
    {
        case CMD_SELECT:

            /*
             * SELECT FOR [KEY] UPDATE/SHARE and modifying CTEs need to mark
             * tuples
             */
            if (queryDesc->plannedstmt->rowMarks != NIL ||
                queryDesc->plannedstmt->hasModifyingCTE)
                estate->es_output_cid = GetCurrentCommandId(true);

            /*
             * A SELECT without modifying CTEs can't possibly queue triggers,
             * so force skip-triggers mode. This is just a marginal efficiency
             * hack, since AfterTriggerBeginQuery/AfterTriggerEndQuery aren't
             * all that expensive, but we might as well do it.
             */
            if (!queryDesc->plannedstmt->hasModifyingCTE)
                eflags |= EXEC_FLAG_SKIP_TRIGGERS;
            break;

        case CMD_INSERT:
        case CMD_DELETE:
        case CMD_UPDATE:
            estate->es_output_cid = GetCurrentCommandId(true);
            break;

        default:
            elog(ERROR, "unrecognized operation code: %d",
                 (int) queryDesc->operation);
            break;
    }

    /*
     * Copy other important information into the EState
     */
    estate->es_snapshot = RegisterSnapshot(queryDesc->snapshot);
    estate->es_crosscheck_snapshot = RegisterSnapshot(queryDesc->crosscheck_snapshot);
    estate->es_top_eflags = eflags;
    estate->es_instrument = queryDesc->instrument_options;
    estate->es_jit_flags = queryDesc->plannedstmt->jitFlags;

    /*
     * Set up an AFTER-trigger statement context, unless told not to, or
     * unless it's EXPLAIN-only mode (when ExecutorFinish won't be called).
     */
    if (!(eflags & (EXEC_FLAG_SKIP_TRIGGERS | EXEC_FLAG_EXPLAIN_ONLY)))
        AfterTriggerBeginQuery();

    /*
     * Initialize the plan state tree
     */
    InitPlan(queryDesc, eflags);

    MemoryContextSwitchTo(oldcontext);
}
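
Two pieces of standard_ExecutorStart interact: the switch may add the skip-triggers bit for a plain SELECT, and the later test decides whether AfterTriggerBeginQuery is called. The sketch below isolates that interaction. The `SKETCH_` names and the flag values are assumptions made for illustration, not PostgreSQL definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the EXEC_FLAG_* bits used in the listing. */
#define SKETCH_EXEC_FLAG_EXPLAIN_ONLY   0x0001
#define SKETCH_EXEC_FLAG_SKIP_TRIGGERS  0x0010

typedef enum { SKETCH_CMD_SELECT, SKETCH_CMD_INSERT,
               SKETCH_CMD_UPDATE, SKETCH_CMD_DELETE } SketchOp;

/*
 * Returns true if an AFTER-trigger statement context would be set up:
 * a SELECT without modifying CTEs forces skip-triggers mode, and both
 * skip-triggers and EXPLAIN-only suppress the context.
 */
static bool
sketch_begins_after_trigger_query(SketchOp op, bool has_modifying_cte, int eflags)
{
    if (op == SKETCH_CMD_SELECT && !has_modifying_cte)
        eflags |= SKETCH_EXEC_FLAG_SKIP_TRIGGERS;

    return !(eflags & (SKETCH_EXEC_FLAG_SKIP_TRIGGERS |
                       SKETCH_EXEC_FLAG_EXPLAIN_ONLY));
}
```

A plain SELECT therefore never sets up the trigger context in this model, whereas an INSERT does unless the caller passed the skip-triggers or EXPLAIN-only bit.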

III. Trace Analysis

The test script is as follows:

testdb=# explain select dw.*,grjf.grbh,grjf.xm,grjf.ny,grjf.je 
testdb-# from t_dwxx dw,lateral (select gr.grbh,gr.xm,jf.ny,jf.je 
testdb(#                         from t_grxx gr inner join t_jfxx jf 
testdb(#                                        on gr.dwbh = dw.dwbh 
testdb(#                                           and gr.grbh = jf.grbh) grjf
testdb-# order by dw.dwbh;
                                        QUERY PLAN                                        
------------------------------------------------------------------------------------------
 Sort  (cost=20070.93..20320.93 rows=100000 width=47)
   Sort Key: dw.dwbh
   ->  Hash Join  (cost=3754.00..8689.61 rows=100000 width=47)
         Hash Cond: ((gr.dwbh)::text = (dw.dwbh)::text)
         ->  Hash Join  (cost=3465.00..8138.00 rows=100000 width=31)
               Hash Cond: ((jf.grbh)::text = (gr.grbh)::text)
               ->  Seq Scan on t_jfxx jf  (cost=0.00..1637.00 rows=100000 width=20)
               ->  Hash  (cost=1726.00..1726.00 rows=100000 width=16)
                     ->  Seq Scan on t_grxx gr  (cost=0.00..1726.00 rows=100000 width=16)
         ->  Hash  (cost=164.00..164.00 rows=10000 width=20)
               ->  Seq Scan on t_dwxx dw  (cost=0.00..164.00 rows=10000 width=20)
(11 rows)

Start gdb, set a breakpoint, and enter the PortalStart function:

(gdb) b PortalStart
Breakpoint 1 at 0x8cb67b: file pquery.c, line 455.
(gdb) c
Continuing.

Breakpoint 1, PortalStart (portal=0x25cd468, params=0x0, eflags=0, snapshot=0x0) at pquery.c:455
455     AssertArg(PortalIsValid(portal));

Validate the arguments and save the current state (active portal, resource owner, portal context):

455     AssertArg(PortalIsValid(portal));
(gdb) n
456     AssertState(portal->status == PORTAL_DEFINED);
(gdb) 
461     saveActivePortal = ActivePortal;
(gdb) 
462     saveResourceOwner = CurrentResourceOwner;
(gdb) 
463     savePortalContext = PortalContext;

Set the active portal, resource owner, memory context, and parameters:

466         ActivePortal = portal;
(gdb) 
467         if (portal->resowner)
(gdb) 
468             CurrentResourceOwner = portal->resowner;
(gdb) 
469         PortalContext = portal->portalContext;
(gdb) 
471         oldContext = MemoryContextSwitchTo(PortalContext);
(gdb) 
474         portal->portalParams = params;
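
The save / switch / restore handling of the global state can be sketched in isolation. This is a simplified model, not the real code: the globals are plain void pointers, and SavedPortalState, enter_portal and leave_portal are hypothetical names (PortalStart keeps the saved values in local variables). The real function also restores the globals on the error path, which this sketch omits.

```c
#include <stddef.h>

/* Simplified stand-ins for the backend globals; the real types differ. */
static void *ActivePortal = NULL;
static void *CurrentResourceOwner = NULL;
static void *PortalContext = NULL;

/* Hypothetical holder for the values saved on entry. */
typedef struct SavedPortalState
{
    void *saveActivePortal;
    void *saveResourceOwner;
    void *savePortalContext;
} SavedPortalState;

/* Save the current globals, then point them at the portal being started. */
static void
enter_portal(SavedPortalState *saved, void *portal, void *resowner, void *context)
{
    saved->saveActivePortal = ActivePortal;
    saved->saveResourceOwner = CurrentResourceOwner;
    saved->savePortalContext = PortalContext;

    ActivePortal = portal;
    if (resowner != NULL)       /* mirrors "if (portal->resowner)" */
        CurrentResourceOwner = resowner;
    PortalContext = context;
}

/* Put the globals back the way they were before enter_portal(). */
static void
leave_portal(const SavedPortalState *saved)
{
    ActivePortal = saved->saveActivePortal;
    CurrentResourceOwner = saved->saveResourceOwner;
    PortalContext = saved->savePortalContext;
}
```

Pairing every enter_portal with a leave_portal keeps nested portals from leaking their state into the caller.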

The strategy in this case is PORTAL_ONE_SELECT:

(gdb) p portal->strategy
$1 = PORTAL_ONE_SELECT
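
The strategy decides how much work PortalStart does up front. The sketch below assumes my reading of pquery.c is right: only PORTAL_ONE_SELECT creates a QueryDesc and starts the executor inside PortalStart, while the other strategies defer execution to PortalRun. The enumerators match PortalStrategy; the helper name starts_executor_in_portal_start is hypothetical.

```c
#include <stdbool.h>

/* Same enumerators as PortalStrategy in portal.h. */
typedef enum PortalStrategy
{
    PORTAL_ONE_SELECT,
    PORTAL_ONE_RETURNING,
    PORTAL_ONE_MOD_WITH,
    PORTAL_UTIL_SELECT,
    PORTAL_MULTI_QUERY
} PortalStrategy;

/* Hypothetical helper: does PortalStart call ExecutorStart for this strategy? */
static bool
starts_executor_in_portal_start(PortalStrategy strategy)
{
    switch (strategy)
    {
        case PORTAL_ONE_SELECT:
            return true;        /* QueryDesc is built and the executor started */
        case PORTAL_ONE_RETURNING:
        case PORTAL_ONE_MOD_WITH:
        case PORTAL_UTIL_SELECT:
        case PORTAL_MULTI_QUERY:
            return false;       /* execution is deferred to PortalRun */
    }
    return false;               /* not reached for valid enumerators */
}
```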

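Enter the branch that matches the strategy (PORTAL_ONE_SELECT) and set the snapshot.
No snapshot was passed in (snapshot=0x0), so the transaction snapshot is pushed as the active snapshot: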

489                 if (snapshot)
(gdb) 
492                     PushActiveSnapshot(GetTransactionSnapshot());

Create the QueryDesc structure:

(gdb) 
498                 queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),

Inspect the contents of the QueryDesc structure:

(gdb) n
512                 if (portal->cursorOptions & CURSOR_OPT_SCROLL)
(gdb) p *queryDesc
$2 = {operation = CMD_SELECT, plannedstmt = 0x2650df0, 
  sourceText = 0x2567eb8 "select dw.*,grjf.grbh,grjf.xm,grjf.ny,grjf.je \nfrom t_dwxx dw,lateral (select gr.grbh,gr.xm,jf.ny,jf.je \n", ' ' <repeats 24 times>, "from t_grxx gr inner join t_jfxx jf \n", ' ' <repeats 34 times>..., 
  snapshot = 0x260ce10, crosscheck_snapshot = 0x0, dest = 0xf8f280 <donothingDR>, params = 0x0, queryEnv = 0x0, 
  instrument_options = 0, tupDesc = 0x0, estate = 0x0, planstate = 0x0, already_executed = false, totaltime = 0x0}

Set the executor flags:

(gdb) n
515                     myeflags = eflags;
(gdb) p eflags
$3 = 0

Enter ExecutorStart, which here calls standard_ExecutorStart:

(gdb) n
147         standard_ExecutorStart(queryDesc, eflags);
(gdb) step
standard_ExecutorStart (queryDesc=0x2657f68, eflags=0) at execMain.c:157
157     Assert(queryDesc != NULL);

ExecutorStart-->sanity checks

157     Assert(queryDesc != NULL);
(gdb) n
158     Assert(queryDesc->estate == NULL);
(gdb) 
175     if ((XactReadOnly || IsInParallelMode()) &&

ExecutorStart-->create and initialize the EState structure

(gdb) 
182     estate = CreateExecutorState();
(gdb) 
183     queryDesc->estate = estate;
(gdb) p *estate
$4 = {type = T_EState, es_direction = ForwardScanDirection, es_snapshot = 0x0, es_crosscheck_snapshot = 0x0, 
  es_range_table = 0x0, es_plannedstmt = 0x0, es_sourceText = 0x0, es_junkFilter = 0x0, es_output_cid = 0, 
  es_result_relations = 0x0, es_num_result_relations = 0, es_result_relation_info = 0x0, es_root_result_relations = 0x0, 
  es_num_root_result_relations = 0, es_tuple_routing_result_relations = 0x0, es_trig_target_relations = 0x0, 
  es_trig_tuple_slot = 0x0, es_trig_oldtup_slot = 0x0, es_trig_newtup_slot = 0x0, es_param_list_info = 0x0, 
  es_param_exec_vals = 0x0, es_queryEnv = 0x0, es_query_cxt = 0x2653e30, es_tupleTable = 0x0, es_rowMarks = 0x0, 
  es_processed = 0, es_lastoid = 0, es_top_eflags = 0, es_instrument = 0, es_finished = false, es_exprcontexts = 0x0, 
  es_subplanstates = 0x0, es_auxmodifytables = 0x0, es_per_tuple_exprcontext = 0x0, es_epqTuple = 0x0, 
  es_epqTupleSet = 0x0, es_epqScanDone = 0x0, es_use_parallel_mode = false, es_query_dsa = 0x0, es_jit_flags = 0, 
  es_jit = 0x0, es_jit_worker_instr = 0x0}

ExecutorStart-->assign fields of the EState structure

(gdb) n
185     oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
(gdb) 
191     estate->es_param_list_info = queryDesc->params;
(gdb) 
193     if (queryDesc->plannedstmt->paramExecTypes != NIL)
(gdb) 
202     estate->es_sourceText = queryDesc->sourceText;
(gdb) 
207     estate->es_queryEnv = queryDesc->queryEnv;

ExecutorStart-->handling that depends on queryDesc->operation

(gdb) 
212     switch (queryDesc->operation)
(gdb) 
220             if (queryDesc->plannedstmt->rowMarks != NIL ||
(gdb) p queryDesc->operation
$5 = CMD_SELECT
(gdb) n
221                 queryDesc->plannedstmt->hasModifyingCTE)
(gdb) 
220             if (queryDesc->plannedstmt->rowMarks != NIL ||
(gdb) 
230             if (!queryDesc->plannedstmt->hasModifyingCTE)
(gdb) 
231                 eflags |= EXEC_FLAG_SKIP_TRIGGERS;
(gdb) 
232             break;
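
The branch taken above can be modeled as a small pure function. This is a sketch under assumptions: CmdType lists only the four commands relevant here (the real enum has more), EXEC_FLAG_SKIP_TRIGGERS is taken as 0x0010, and StartDecision / decide_for_operation are hypothetical names. The needs_output_cid field stands in for the call to GetCurrentCommandId(true).

```c
#include <stdbool.h>

#define EXEC_FLAG_SKIP_TRIGGERS 0x0010  /* value assumed from executor.h */

/* Subset of the real CmdType, enough for this sketch. */
typedef enum CmdType { CMD_SELECT, CMD_UPDATE, CMD_INSERT, CMD_DELETE } CmdType;

/* Hypothetical result type: what the switch decides for a given query. */
typedef struct StartDecision
{
    int  eflags;            /* input flags, possibly with SKIP_TRIGGERS added */
    bool needs_output_cid;  /* would call GetCurrentCommandId(true) */
} StartDecision;

static StartDecision
decide_for_operation(CmdType op, bool has_row_marks, bool has_modifying_cte,
                     int eflags)
{
    StartDecision d = { eflags, false };

    switch (op)
    {
        case CMD_SELECT:
            /* SELECT FOR UPDATE/SHARE and modifying CTEs need to mark tuples */
            if (has_row_marks || has_modifying_cte)
                d.needs_output_cid = true;
            /* a SELECT without modifying CTEs cannot queue triggers */
            if (!has_modifying_cte)
                d.eflags |= EXEC_FLAG_SKIP_TRIGGERS;
            break;
        case CMD_INSERT:
        case CMD_DELETE:
        case CMD_UPDATE:
            d.needs_output_cid = true;
            break;
    }
    return d;
}
```

For the traced query (plain SELECT, no row marks, no modifying CTE, eflags = 0) the model yields eflags = 16, which agrees with es_top_eflags = 16 in the EState dump printed later in the session.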

ExecutorStart-->set the snapshot

(gdb) n
249     estate->es_snapshot = RegisterSnapshot(queryDesc->snapshot);
(gdb) p *queryDesc->snapshot
$6 = {satisfies = 0xa923ca <HeapTupleSatisfiesMVCC>, xmin = 1689, xmax = 1689, xip = 0x0, xcnt = 0, subxip = 0x0, 
  subxcnt = 0, suboverflowed = false, takenDuringRecovery = false, copied = true, curcid = 0, speculativeToken = 0, 
  active_count = 1, regd_count = 1, ph_node = {first_child = 0x0, next_sibling = 0x0, prev_or_parent = 0x0}, whenTaken = 0, 
  lsn = 0}

ExecutorStart-->set the remaining EState fields

(gdb) n
250     estate->es_crosscheck_snapshot = RegisterSnapshot(queryDesc->crosscheck_snapshot);
(gdb) p *queryDesc->crosscheck_snapshot
Cannot access memory at address 0x0
(gdb) n
251     estate->es_top_eflags = eflags;
(gdb) 
252     estate->es_instrument = queryDesc->instrument_options;
(gdb) 
253     estate->es_jit_flags = queryDesc->plannedstmt->jitFlags;
(gdb) 
259     if (!(eflags & (EXEC_FLAG_SKIP_TRIGGERS | EXEC_FLAG_EXPLAIN_ONLY)))

ExecutorStart-->run InitPlan (AfterTriggerBeginQuery is skipped because EXEC_FLAG_SKIP_TRIGGERS is set)

(gdb) 
265     InitPlan(queryDesc, eflags);
(gdb) 
267     MemoryContextSwitchTo(oldcontext);
(gdb) 
268 }

ExecutorStart-->inspect QueryDesc and EState

(gdb) p *queryDesc
$7 = {operation = CMD_SELECT, plannedstmt = 0x2650df0, 
  sourceText = 0x2567eb8 "select dw.*,grjf.grbh,grjf.xm,grjf.ny,grjf.je \nfrom t_dwxx dw,lateral (select gr.grbh,gr.xm,jf.ny,jf.je \n", ' ' <repeats 24 times>, "from t_grxx gr inner join t_jfxx jf \n", ' ' <repeats 34 times>..., 
  snapshot = 0x25e46c0, crosscheck_snapshot = 0x0, dest = 0xf8f280 <donothingDR>, params = 0x0, queryEnv = 0x0, 
  instrument_options = 0, tupDesc = 0x2665058, estate = 0x2653f48, planstate = 0x2654160, already_executed = false, 
  totaltime = 0x0}
(gdb) p *estate
$8 = {type = T_EState, es_direction = ForwardScanDirection, es_snapshot = 0x25e46c0, es_crosscheck_snapshot = 0x0, 
  es_range_table = 0x264ec98, es_plannedstmt = 0x2650df0, 
  es_sourceText = 0x2567eb8 "select dw.*,grjf.grbh,grjf.xm,grjf.ny,grjf.je \nfrom t_dwxx dw,lateral (select gr.grbh,gr.xm,jf.ny,jf.je \n", ' ' <repeats 24 times>, "from t_grxx gr inner join t_jfxx jf \n", ' ' <repeats 34 times>..., 
  es_junkFilter = 0x0, es_output_cid = 0, es_result_relations = 0x0, es_num_result_relations = 0, 
  es_result_relation_info = 0x0, es_root_result_relations = 0x0, es_num_root_result_relations = 0, 
  es_tuple_routing_result_relations = 0x0, es_trig_target_relations = 0x0, es_trig_tuple_slot = 0x0, 
  es_trig_oldtup_slot = 0x0, es_trig_newtup_slot = 0x0, es_param_list_info = 0x0, es_param_exec_vals = 0x0, 
  es_queryEnv = 0x0, es_query_cxt = 0x2653e30, es_tupleTable = 0x2654af8, es_rowMarks = 0x0, es_processed = 0, 
  es_lastoid = 0, es_top_eflags = 16, es_instrument = 0, es_finished = false, es_exprcontexts = 0x2654550, 
  es_subplanstates = 0x0, es_auxmodifytables = 0x0, es_per_tuple_exprcontext = 0x0, es_epqTuple = 0x0, 
  es_epqTupleSet = 0x0, es_epqScanDone = 0x0, es_use_parallel_mode = false, es_query_dsa = 0x0, es_jit_flags = 0, 
  es_jit = 0x0, es_jit_worker_instr = 0x0}

ExecutorStart-->return to PortalStart

(gdb) n
ExecutorStart (queryDesc=0x2657f68, eflags=0) at execMain.c:148
148 }
(gdb) n
PortalStart (portal=0x25cd468, params=0x0, eflags=0, snapshot=0x0) at pquery.c:525
525                 portal->queryDesc = queryDesc;

Set portal fields such as atStart:

525                 portal->queryDesc = queryDesc;
(gdb) n
530                 portal->tupDesc = queryDesc->tupDesc;
(gdb) 
535                 portal->atStart = true;
(gdb) 
536                 portal->atEnd = false;  /* allow fetches */
(gdb) 
537                 portal->portalPos = 0;
(gdb) 
539                 PopActiveSnapshot();
(gdb) 
540                 break;

Execution finished; return to exec_simple_query:

(gdb) 
613     portal->status = PORTAL_READY;
(gdb) 
614 }
(gdb) 
exec_simple_query (
    query_string=0x2567eb8 "select dw.*,grjf.grbh,grjf.xm,grjf.ny,grjf.je \nfrom t_dwxx dw,lateral (select gr.grbh,gr.xm,jf.ny,jf.je \n", ' ' <repeats 24 times>, "from t_grxx gr inner join t_jfxx jf \n", ' ' <repeats 34 times>...) at postgres.c:1091
warning: Source file is more recent than executable.
1091            format = 0;             /* TEXT is default */
(gdb) 
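
The status change seen in the trace (PORTAL_DEFINED on entry, PORTAL_READY on exit) can be summarized as a tiny state model. This is a sketch, not the real code: MiniPortal and mini_portal_start are hypothetical names, the real function asserts the entry state instead of returning false, and the startup_ok parameter stands in for whether the startup work raised an error.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same enumerators as PortalStatus in portal.h. */
typedef enum PortalStatus
{
    PORTAL_NEW,
    PORTAL_DEFINED,
    PORTAL_READY,
    PORTAL_ACTIVE,
    PORTAL_DONE,
    PORTAL_FAILED
} PortalStatus;

/* Hypothetical cut-down portal with only the fields touched in the trace. */
typedef struct MiniPortal
{
    PortalStatus status;
    bool         atStart;
    bool         atEnd;
    uint64_t     portalPos;
} MiniPortal;

/* Returns true when the portal moved from PORTAL_DEFINED to PORTAL_READY. */
static bool
mini_portal_start(MiniPortal *portal, bool startup_ok)
{
    if (portal->status != PORTAL_DEFINED)
        return false;                   /* the real code asserts this */

    if (!startup_ok)
    {
        portal->status = PORTAL_FAILED; /* error path marks the portal failed */
        return false;
    }

    portal->atStart = true;
    portal->atEnd = false;              /* allow fetches */
    portal->portalPos = 0;
    portal->status = PORTAL_READY;
    return true;
}
```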

DONE!

4. References

postgres.c
PG Document:Query Planning

From the ITPUB blog, link: http://blog.itpub.net/6906/viewspace-2374807/. Please credit the source when reposting; otherwise legal liability will be pursued.
