PostgreSQL Source Code Walkthrough (96) - Partitioned Tables #3 (Insert Tuple Routing #3 - Fetching the Partition Key Values)

Posted by husthxd on 2018-11-27

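This section covers FormPartitionKeyDatum, reached through the call chain ExecPrepareTupleRouting->ExecFindPartition->FormPartitionKeyDatum. The function extracts the partition key values of a tuple.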

1. Data Structures

ModifyTable
Applies the rows produced by the subplan(s) to the result table(s) by inserting, updating, or deleting.

/* ----------------
 *   ModifyTable node -
 *      Apply rows produced by subplan(s) to result table(s),
 *      by inserting, updating, or deleting.
 *
 * If the originally named target table is a partitioned table, both
 * nominalRelation and rootRelation contain the RT index of the partition
 * root, which is not otherwise mentioned in the plan.  Otherwise rootRelation
 * is zero.  However, nominalRelation will always be set, as it's the rel that
 * EXPLAIN should claim is the INSERT/UPDATE/DELETE target.
 *
 * Note that rowMarks and epqParam are presumed to be valid for all the
 * subplan(s); they can't contain any info that varies across subplans.
 * ----------------
 */
typedef struct ModifyTable
{
    Plan        plan;
    CmdType     operation;      /* INSERT, UPDATE, or DELETE */
    bool        canSetTag;      /* do we set the command tag/es_processed? */
    Index       nominalRelation;    /* Parent RT index for use of EXPLAIN */
    Index       rootRelation;   /* Root RT index, if target is partitioned */
    bool        partColsUpdated;    /* some part key in hierarchy updated */
    List       *resultRelations;    /* integer list of RT indexes */
    int         resultRelIndex; /* index of first resultRel in plan's list */
    int         rootResultRelIndex; /* index of the partitioned table root */
    List       *plans;          /* plan(s) producing source data */
    List       *withCheckOptionLists;   /* per-target-table WCO lists */
    List       *returningLists; /* per-target-table RETURNING tlists */
    List       *fdwPrivLists;   /* per-target-table FDW private data lists */
    Bitmapset  *fdwDirectModifyPlans;   /* indices of FDW DM plans */
    List       *rowMarks;       /* PlanRowMarks (non-locking only) */
    int         epqParam;       /* ID of Param for EvalPlanQual re-eval */
    OnConflictAction onConflictAction;  /* ON CONFLICT action */
    List       *arbiterIndexes; /* List of ON CONFLICT arbiter index OIDs  */
    List       *onConflictSet;  /* SET for INSERT ON CONFLICT DO UPDATE */
    Node       *onConflictWhere;    /* WHERE for ON CONFLICT UPDATE */
    Index       exclRelRTI;     /* RTI of the EXCLUDED pseudo relation */
    List       *exclRelTlist;   /* tlist of the EXCLUDED pseudo relation */
} ModifyTable;

ResultRelInfo
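The ResultRelInfo struct.
Whenever an existing relation is updated, the indexes on that relation must be updated too, and triggers may need to fire. ResultRelInfo holds all the information needed about a result relation, including its indexes.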

/*
 * ResultRelInfo
 *
 * Whenever we update an existing relation, we have to update indexes on the
 * relation, and perhaps also fire triggers.  ResultRelInfo holds all the
 * information needed about a result relation, including indexes.
 *
 * Normally, a ResultRelInfo refers to a table that is in the query's
 * range table; then ri_RangeTableIndex is the RT index and ri_RelationDesc
 * is just a copy of the relevant es_relations[] entry.  But sometimes,
 * in ResultRelInfos used only for triggers, ri_RangeTableIndex is zero
 * and ri_RelationDesc is a separately-opened relcache pointer that needs
 * to be separately closed.  See ExecGetTriggerResultRel.
 */
typedef struct ResultRelInfo
{
    NodeTag     type;

    /* result relation's range table index, or 0 if not in range table */
    Index       ri_RangeTableIndex;

    /* relation descriptor for result relation */
    Relation    ri_RelationDesc;

    /* # of indices existing on result relation */
    int         ri_NumIndices;

    /* array of relation descriptors for indices */
    // each index is opened as a relation in its own right
    RelationPtr ri_IndexRelationDescs;

    /* array of key/attr info for indices */
    IndexInfo **ri_IndexRelationInfo;

    /* triggers to be fired, if any */
    TriggerDesc *ri_TrigDesc;

    /* cached lookup info for trigger functions */
    FmgrInfo   *ri_TrigFunctions;

    /* array of trigger WHEN expr states */
    ExprState **ri_TrigWhenExprs;

    /* optional runtime measurements for triggers */
    Instrumentation *ri_TrigInstrument;

    /* FDW callback functions, if foreign table */
    struct FdwRoutine *ri_FdwRoutine;

    /* available to save private state of FDW */
    void       *ri_FdwState;

    /* true when modifying foreign table directly */
    bool        ri_usesFdwDirectModify;

    /* list of WithCheckOption's to be checked */
    List       *ri_WithCheckOptions;

    /* list of WithCheckOption expr states */
    List       *ri_WithCheckOptionExprs;

    /* array of constraint-checking expr states */
    ExprState **ri_ConstraintExprs;

    /* for removing junk attributes from tuples */
    JunkFilter *ri_junkFilter;

    /* list of RETURNING expressions */
    List       *ri_returningList;

    /* for computing a RETURNING list */
    ProjectionInfo *ri_projectReturning;

    /* list of arbiter indexes to use to check conflicts */
    List       *ri_onConflictArbiterIndexes;

    /* ON CONFLICT evaluation state */
    OnConflictSetState *ri_onConflict;

    /* partition check expression */
    List       *ri_PartitionCheck;

    /* partition check expression state */
    ExprState  *ri_PartitionCheckExpr;

    /* relation descriptor for root partitioned table */
    Relation    ri_PartitionRoot;

    /* Additional information specific to partition tuple routing */
    struct PartitionRoutingInfo *ri_PartitionInfo;
} ResultRelInfo;

PartitionRoutingInfo
The PartitionRoutingInfo struct.
Partition routing information: the extra result relation state needed to route tuples to a table partition.

/*
 * PartitionRoutingInfo
 *
 * Additional result relation information specific to routing tuples to a
 * table partition.
 */
typedef struct PartitionRoutingInfo
{
    /*
     * Map for converting tuples in root partitioned table format into
     * partition format, or NULL if no conversion is required.
     */
    TupleConversionMap *pi_RootToPartitionMap;

    /*
     * Map for converting tuples in partition format into the root partitioned
     * table format, or NULL if no conversion is required.
     */
    TupleConversionMap *pi_PartitionToRootMap;

    /*
     * Slot to store tuples in partition format, or NULL when no translation
     * is required between root and partition.
     */
    TupleTableSlot *pi_PartitionTupleSlot;
} PartitionRoutingInfo;

TupleConversionMap
The TupleConversionMap struct holds the information needed to convert a tuple from one row type to another.


typedef struct TupleConversionMap
{
    TupleDesc   indesc;         /* tupdesc for source rowtype */
    TupleDesc   outdesc;        /* tupdesc for result rowtype */
    AttrNumber *attrMap;        /* indexes of input fields, or 0 for null */
    Datum      *invalues;       /* workspace for deconstructing source */
    bool       *inisnull;       /* is-null flags matching invalues */
    Datum      *outvalues;      /* workspace for constructing result */
    bool       *outisnull;      /* is-null flags matching outvalues */
} TupleConversionMap;
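
To make the role of attrMap concrete, here is a minimal, self-contained sketch of an attrMap-driven conversion. It is not PostgreSQL code: ToyDatum and toy_convert_row are hypothetical names, and the real conversion routine works on TupleDesc-described tuples rather than bare arrays. The sketch only assumes what the comment on attrMap states, namely that each output column names the input column it is taken from, with 0 meaning "no source, produce NULL".

```c
#include <assert.h>
#include <stdbool.h>

typedef long ToyDatum;          /* hypothetical stand-in for PostgreSQL's Datum */

/*
 * Toy attrMap-driven row conversion (illustrative names only).
 * attr_map[i] is the 1-based input column that feeds output column i,
 * or 0 when output column i has no source and must be NULL.
 */
void
toy_convert_row(const int *attr_map, int out_natts,
                const ToyDatum *in_values, const bool *in_isnull, int in_natts,
                ToyDatum *out_values, bool *out_isnull)
{
    for (int i = 0; i < out_natts; i++)
    {
        int         src = attr_map[i];

        assert(src >= 0 && src <= in_natts);
        if (src == 0)
        {
            /* no matching input column: emit NULL */
            out_values[i] = 0;
            out_isnull[i] = true;
        }
        else
        {
            /* copy value and null flag from the mapped input column */
            out_values[i] = in_values[src - 1];
            out_isnull[i] = in_isnull[src - 1];
        }
    }
}
```

With a map of {2, 0, 1}, a two-column input row (10, 20) becomes the three-column row (20, NULL, 10). This is the kind of reordering a root-to-partition map performs when a partition's column order differs from the root's.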

2. Source Code Walkthrough

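FormPartitionKeyDatum extracts the partition key values of a tuple. It fills two caller-supplied arrays: values[] with the key values and isnull[] with the matching is-null flags.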


/* ----------------
 *      FormPartitionKeyDatum
 *          Construct values[] and isnull[] arrays for the partition key
 *          of a tuple.
 *
 *  pd              Partition dispatch object of the partitioned table
 *
 *  slot            Heap tuple from which to extract partition key
 *
 *  estate          executor state for evaluating any partition key
 *                  expressions (must be non-NULL)
 *
 *  values          Array of partition key Datums (output area)
 *  isnull          Array of is-null indicators (output area)
 *
 * the ecxt_scantuple slot of estate's per-tuple expr context must point to
 * the heap tuple passed in.
 * ----------------
 */
static void
FormPartitionKeyDatum(PartitionDispatch pd,
                      TupleTableSlot *slot,
                      EState *estate,
                      Datum *values,
                      bool *isnull)
{
    ListCell   *partexpr_item;
    int         i;

    if (pd->key->partexprs != NIL && pd->keystate == NIL)
    {
        /* Check caller has set up context correctly */
        Assert(estate != NULL &&
               GetPerTupleExprContext(estate)->ecxt_scantuple == slot);

        /* First time through, set up expression evaluation state */
        pd->keystate = ExecPrepareExprList(pd->key->partexprs, estate);
    }

    partexpr_item = list_head(pd->keystate);    /* first key expression state, if any */
    for (i = 0; i < pd->key->partnatts; i++)    /* loop over the partition key columns */
    {
        AttrNumber  keycol = pd->key->partattrs[i]; /* attribute number of this key column */
        Datum       datum;  /* typedef uintptr_t Datum; sizeof(Datum) == sizeof(void *) == 4 or 8 */
        bool        isNull;

        if (keycol != 0)    /* nonzero: a plain table column */
        {
            /* Plain column; get the value directly from the heap tuple */
            datum = slot_getattr(slot, keycol, &isNull);
        }
        else
        {
            /* Expression; need to evaluate it */
            if (partexpr_item == NULL)  /* ran out of expression states */
                elog(ERROR, "wrong number of partition key expressions");
            datum = ExecEvalExprSwitchContext((ExprState *) lfirst(partexpr_item),
                                              GetPerTupleExprContext(estate),
                                              &isNull);
            /* advance to the next expression state */
            partexpr_item = lnext(partexpr_item);
        }
        values[i] = datum;
        isnull[i] = isNull;
    }

    if (partexpr_item != NULL)  /* leftover expression states: count mismatch */
        elog(ERROR, "wrong number of partition key expressions");
}
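
The control flow of FormPartitionKeyDatum can be tried outside the server with a simplified model. The sketch below is an assumption-laden toy, not PostgreSQL code: ToyRow, ToyKeyExpr, toy_form_partition_key and toy_sum_first_two are all hypothetical names, expression states are modelled as plain function pointers, and the elog(ERROR) calls are replaced by a -1 return. What it keeps is the shape of the loop: an attribute number of 0 means "take the next expression", anything else is read straight from the row, and the expression list must be consumed exactly.

```c
#include <assert.h>
#include <stdbool.h>

typedef long ToyDatum;          /* hypothetical stand-in for Datum */

typedef struct ToyRow
{
    const ToyDatum *values;     /* column values */
    const bool *isnull;         /* column is-null flags */
    int         natts;          /* number of columns */
} ToyRow;

/* A key expression is modelled as a function over the row (an assumption). */
typedef ToyDatum (*ToyKeyExpr) (const ToyRow *row, bool *isnull);

/*
 * Fill values[]/isnull[] with the partition key of one row.
 * partattrs[i] != 0 : plain column with that 1-based attribute number.
 * partattrs[i] == 0 : consume the next entry of exprs[].
 * Returns 0 on success, -1 if the number of expressions does not match.
 */
int
toy_form_partition_key(const int *partattrs, int partnatts,
                       const ToyKeyExpr *exprs, int nexprs,
                       const ToyRow *row,
                       ToyDatum *values, bool *isnull)
{
    int         next_expr = 0;

    for (int i = 0; i < partnatts; i++)
    {
        int         keycol = partattrs[i];

        if (keycol != 0)
        {
            /* plain column: read it directly from the row */
            assert(keycol > 0 && keycol <= row->natts);
            values[i] = row->values[keycol - 1];
            isnull[i] = row->isnull[keycol - 1];
        }
        else
        {
            /* expression: evaluate the next one in the list */
            if (next_expr >= nexprs)
                return -1;      /* too few expressions */
            values[i] = exprs[next_expr++] (row, &isnull[i]);
        }
    }
    return (next_expr == nexprs) ? 0 : -1;  /* leftovers are an error too */
}

/* Example expression key: first column plus second column. */
ToyDatum
toy_sum_first_two(const ToyRow *row, bool *isnull)
{
    *isnull = row->isnull[0] || row->isnull[1];
    return *isnull ? 0 : row->values[0] + row->values[1];
}
```

For a row (20, 5) and a key made of column 1 followed by one expression, the call produces the key (20, 25). Passing too few or too many expressions returns -1, mirroring the two "wrong number of partition key expressions" checks in the listing.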



/*
 * slot_getattr - fetch one attribute of the slot's contents.
 */
static inline Datum
slot_getattr(TupleTableSlot *slot, int attnum,
             bool *isnull)
{
    AssertArg(attnum > 0);

    if (attnum > slot->tts_nvalid)
        slot_getsomeattrs(slot, attnum);

    *isnull = slot->tts_isnull[attnum - 1];

    return slot->tts_values[attnum - 1];
}


/*
 * This function forces the entries of the slot's Datum/isnull arrays to be
 * valid at least up through the attnum'th entry.
 */
static inline void
slot_getsomeattrs(TupleTableSlot *slot, int attnum)
{
    if (slot->tts_nvalid < attnum)
        slot_getsomeattrs_int(slot, attnum);
}


/*
 * slot_getsomeattrs_int - workhorse for slot_getsomeattrs()
 */
void
slot_getsomeattrs_int(TupleTableSlot *slot, int attnum)
{
    /* Check for caller errors */
    Assert(slot->tts_nvalid < attnum); /* slot_getsomeattr checked */
    Assert(attnum > 0);

    if (unlikely(attnum > slot->tts_tupleDescriptor->natts))
        elog(ERROR, "invalid attribute number %d", attnum);

    /* Fetch as many attributes as possible from the underlying tuple. */
    slot->tts_ops->getsomeattrs(slot, attnum);

    /*
     * If the underlying tuple doesn't have enough attributes, tuple descriptor
     * must have the missing attributes.
     */
    if (unlikely(slot->tts_nvalid < attnum))
    {
        slot_getmissingattrs(slot, slot->tts_nvalid, attnum);
        slot->tts_nvalid = attnum;
    }
}
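
The three functions above cooperate to decode a tuple lazily: tts_nvalid records how many leading attributes are already available in tts_values/tts_isnull, and the slow path runs only when a caller asks for an attribute beyond that point. The sketch below models just that caching idea. It is a toy under stated assumptions: ToySlot, toy_getattr and toy_getsomeattrs are hypothetical names, the "stored tuple" is a plain int array with no NULLs, and the decode_calls counter exists only to make the behaviour observable.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ATTS 8

typedef struct ToySlot
{
    int         natts;          /* number of columns in the row */
    int         nvalid;         /* leading columns already decoded (cf. tts_nvalid) */
    const int  *raw;            /* "stored" row, decoded on demand */
    int         values[TOY_MAX_ATTS];
    bool        isnull[TOY_MAX_ATTS];
    int         decode_calls;   /* counts slow-path runs, for illustration only */
} ToySlot;

/* Slow path: decode columns nvalid+1 .. attnum and remember how far we got. */
void
toy_getsomeattrs(ToySlot *slot, int attnum)
{
    assert(attnum > 0 && attnum <= slot->natts && attnum <= TOY_MAX_ATTS);
    slot->decode_calls++;
    for (int i = slot->nvalid; i < attnum; i++)
    {
        slot->values[i] = slot->raw[i];
        slot->isnull[i] = false;    /* this toy has no NULLs */
    }
    slot->nvalid = attnum;
}

/* Fast path: serve from the cache, decoding only when attnum is not yet valid. */
int
toy_getattr(ToySlot *slot, int attnum, bool *isnull)
{
    assert(attnum > 0);
    if (attnum > slot->nvalid)
        toy_getsomeattrs(slot, attnum);
    *isnull = slot->isnull[attnum - 1];
    return slot->values[attnum - 1];
}
```

Fetching column 2 first decodes columns 1 and 2 in one pass. A later fetch of column 1 is then answered from the cache without touching the raw row, and only a request for column 3 triggers the slow path again.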

3. Tracing with gdb

The test script is as follows:

-- Hash Partition
drop table if exists t_hash_partition;
create table t_hash_partition (c1 int not null,c2  varchar(40),c3 varchar(40)) partition by hash(c1);
create table t_hash_partition_1 partition of t_hash_partition for values with (modulus 6,remainder 0);
create table t_hash_partition_2 partition of t_hash_partition for values with (modulus 6,remainder 1);
create table t_hash_partition_3 partition of t_hash_partition for values with (modulus 6,remainder 2);
create table t_hash_partition_4 partition of t_hash_partition for values with (modulus 6,remainder 3);
create table t_hash_partition_5 partition of t_hash_partition for values with (modulus 6,remainder 4);
create table t_hash_partition_6 partition of t_hash_partition for values with (modulus 6,remainder 5);

insert into t_hash_partition(c1,c2,c3) VALUES(20,'HASH0','HAHS0');
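
The script creates six hash partitions that share modulus 6 and cover remainders 0 through 5, so every hash value matches exactly one of them. The following sketch shows only that modulus/remainder matching. It rests on loud assumptions: ToyHashBound, toy_hash_int and toy_find_hash_partition are hypothetical names, and toy_hash_int is a generic 64-bit mixer, not PostgreSQL's hash function, so the partition it picks for the value 20 says nothing about where PostgreSQL actually stores that row.

```c
#include <assert.h>
#include <stdint.h>

typedef struct ToyHashBound
{
    int         modulus;        /* as in FOR VALUES WITH (modulus ...) */
    int         remainder;      /* as in FOR VALUES WITH (... remainder ...) */
} ToyHashBound;

/* Stand-in 64-bit mixer. NOT the hash function PostgreSQL uses. */
uint64_t
toy_hash_int(int32_t key)
{
    uint64_t    x = (uint64_t) (uint32_t) key;

    x ^= x >> 30;
    x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27;
    x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

/*
 * Return the index of the first partition whose (modulus, remainder)
 * accepts the hash value, or -1 if no partition accepts it.
 */
int
toy_find_hash_partition(const ToyHashBound *bounds, int nparts, uint64_t hash)
{
    for (int i = 0; i < nparts; i++)
    {
        assert(bounds[i].modulus > 0);
        if (hash % (uint64_t) bounds[i].modulus == (uint64_t) bounds[i].remainder)
            return i;
    }
    return -1;
}
```

With the six bounds from the script, toy_find_hash_partition(bounds, 6, toy_hash_int(20)) always returns an index between 0 and 5. If only the remainder-0 partition existed, a hash value such as 7 would match nothing and the function would return -1, which corresponds to the "no partition found" case on the server side.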

Start gdb and set the breakpoints:

(gdb) b FormPartitionKeyDatum
Breakpoint 5 at 0x6e30d2: file execPartition.c, line 1087.
(gdb) b slot_getattr
Breakpoint 6 at 0x489d9b: file heaptuple.c, line 1510.
(gdb) c
Continuing.

Breakpoint 5, FormPartitionKeyDatum (pd=0x2e1bfa0, slot=0x2e1b8a0, estate=0x2e1aeb8, values=0x7fff4e2407a0, 
    isnull=0x7fff4e240780) at execPartition.c:1087
1087        if (pd->key->partexprs != NIL && pd->keystate == NIL)

Enter the loop, which fetches the value for each partition key column:

1087        if (pd->key->partexprs != NIL && pd->keystate == NIL)
(gdb) n
1097        partexpr_item = list_head(pd->keystate);
(gdb) 
1098        for (i = 0; i < pd->key->partnatts; i++)
(gdb) 
1100            AttrNumber  keycol = pd->key->partattrs[i];
(gdb) 
1104            if (keycol != 0)
(gdb) 
1107                datum = slot_getattr(slot, keycol, &isNull);

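Step into slot_getattr. In this build it is found in heaptuple.c and begins by reading slot->tts_tuple, so its body differs from the inline version listed in section 2: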

(gdb) step

Breakpoint 6, slot_getattr (slot=0x2e1b8a0, attnum=1, isnull=0x7fff4e240735) at heaptuple.c:1510
1510        HeapTuple   tuple = slot->tts_tuple;

Inspect the result: the partition key value is 20.

...
(gdb) p *isnull
$31 = false
(gdb) p slot->tts_values[attnum - 1]
$32 = 20

Return to FormPartitionKeyDatum:

(gdb) n
1593    }
(gdb) 
FormPartitionKeyDatum (pd=0x2e1bfa0, slot=0x2e1b8a0, estate=0x2e1aeb8, values=0x7fff4e2407a0, isnull=0x7fff4e240780)
    at execPartition.c:1119
1119            values[i] = datum;

Finish the call and return to ExecFindPartition:

1119            values[i] = datum;
(gdb) n
1120            isnull[i] = isNull;
(gdb) 
1098        for (i = 0; i < pd->key->partnatts; i++)
(gdb) 
1123        if (partexpr_item != NULL)
(gdb) 
1125    }
(gdb) 
ExecFindPartition (resultRelInfo=0x2e1b108, pd=0x2e1c5b8, slot=0x2e1b8a0, estate=0x2e1aeb8) at execPartition.c:282
282         if (partdesc->nparts == 0)

DONE!

4. References

PG 11.1 Source Code.
Note: the source code shown on doxygen is not identical to the PG 11.1 source. The analysis in this section is based on 11.1.

From the "ITPUB Blog", link: http://blog.itpub.net/6906/viewspace-2374794/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
