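PostgreSQL Source Code Walkthrough (94) - Partitioned Tables #2 (Insert Routing #2)
This section covers ExecFindPartition, which is called from ExecPrepareTupleRouting and finds the appropriate partition for a heap tuple.
1. Data Structures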
ModifyTable
ModifyTable node
Applies rows produced by the subplan(s) to the result table(s) by inserting, updating, or deleting.
/* ----------------
* ModifyTable node -
* Apply rows produced by subplan(s) to result table(s),
* by inserting, updating, or deleting.
*
* If the originally named target table is a partitioned table, both
* nominalRelation and rootRelation contain the RT index of the partition
* root, which is not otherwise mentioned in the plan. Otherwise rootRelation
* is zero. However, nominalRelation will always be set, as it's the rel that
* EXPLAIN should claim is the INSERT/UPDATE/DELETE target.
*
* Note that rowMarks and epqParam are presumed to be valid for all the
* subplan(s); they can't contain any info that varies across subplans.
* ----------------
*/
typedef struct ModifyTable
{
Plan plan;
CmdType operation; /* INSERT, UPDATE, or DELETE */
bool canSetTag; /* do we set the command tag/es_processed? */
Index nominalRelation; /* Parent RT index for use of EXPLAIN */
Index rootRelation; /* Root RT index, if target is partitioned */
bool partColsUpdated; /* some part key in hierarchy updated */
List *resultRelations; /* integer list of RT indexes */
int resultRelIndex; /* index of first resultRel in plan's list */
int rootResultRelIndex; /* index of the partitioned table root */
List *plans; /* plan(s) producing source data */
List *withCheckOptionLists; /* per-target-table WCO lists */
List *returningLists; /* per-target-table RETURNING tlists */
List *fdwPrivLists; /* per-target-table FDW private data lists */
Bitmapset *fdwDirectModifyPlans; /* indices of FDW DM plans */
List *rowMarks; /* PlanRowMarks (non-locking only) */
int epqParam; /* ID of Param for EvalPlanQual re-eval */
OnConflictAction onConflictAction; /* ON CONFLICT action */
List *arbiterIndexes; /* List of ON CONFLICT arbiter index OIDs */
List *onConflictSet; /* SET for INSERT ON CONFLICT DO UPDATE */
Node *onConflictWhere; /* WHERE for ON CONFLICT UPDATE */
Index exclRelRTI; /* RTI of the EXCLUDED pseudo relation */
List *exclRelTlist; /* tlist of the EXCLUDED pseudo relation */
} ModifyTable;
ResultRelInfo
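ResultRelInfo struct
Whenever an existing relation is updated, the indexes on that relation must be updated too, and triggers may need to fire. ResultRelInfo holds all the information needed about a result relation, including its indexes.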
/*
* ResultRelInfo
*
* Whenever we update an existing relation, we have to update indexes on the
* relation, and perhaps also fire triggers. ResultRelInfo holds all the
* information needed about a result relation, including indexes.
*
* Normally, a ResultRelInfo refers to a table that is in the query's
* range table; then ri_RangeTableIndex is the RT index and ri_RelationDesc
* is just a copy of the relevant es_relations[] entry. But sometimes,
* in ResultRelInfos used only for triggers, ri_RangeTableIndex is zero
* and ri_RelationDesc is a separately-opened relcache pointer that needs
* to be separately closed. See ExecGetTriggerResultRel.
*/
typedef struct ResultRelInfo
{
NodeTag type;
/* result relation's range table index, or 0 if not in range table */
Index ri_RangeTableIndex;
/* relation descriptor for result relation */
Relation ri_RelationDesc;
/* # of indices existing on result relation */
int ri_NumIndices;
/* array of relation descriptors for indices */
RelationPtr ri_IndexRelationDescs;
/* array of key/attr info for indices */
IndexInfo **ri_IndexRelationInfo;
/* triggers to be fired, if any */
TriggerDesc *ri_TrigDesc;
/* cached lookup info for trigger functions */
FmgrInfo *ri_TrigFunctions;
/* array of trigger WHEN expr states */
ExprState **ri_TrigWhenExprs;
/* optional runtime measurements for triggers */
Instrumentation *ri_TrigInstrument;
/* FDW callback functions, if foreign table */
struct FdwRoutine *ri_FdwRoutine;
/* available to save private state of FDW */
void *ri_FdwState;
/* true when modifying foreign table directly */
bool ri_usesFdwDirectModify;
/* list of WithCheckOption's to be checked */
List *ri_WithCheckOptions;
/* list of WithCheckOption expr states */
List *ri_WithCheckOptionExprs;
/* array of constraint-checking expr states */
ExprState **ri_ConstraintExprs;
/* for removing junk attributes from tuples */
JunkFilter *ri_junkFilter;
/* list of RETURNING expressions */
List *ri_returningList;
/* for computing a RETURNING list */
ProjectionInfo *ri_projectReturning;
/* list of arbiter indexes to use to check conflicts */
List *ri_onConflictArbiterIndexes;
/* ON CONFLICT evaluation state */
OnConflictSetState *ri_onConflict;
/* partition check expression */
List *ri_PartitionCheck;
/* partition check expression state */
ExprState *ri_PartitionCheckExpr;
/* relation descriptor for root partitioned table */
Relation ri_PartitionRoot;
/* Additional information specific to partition tuple routing */
struct PartitionRoutingInfo *ri_PartitionInfo;
} ResultRelInfo;
PartitionRoutingInfo
PartitionRoutingInfo struct
Partition routing information: the additional result relation information needed to route tuples to a table partition.
/*
* PartitionRoutingInfo
*
* Additional result relation information specific to routing tuples to a
* table partition.
*/
typedef struct PartitionRoutingInfo
{
/*
* Map for converting tuples in root partitioned table format into
* partition format, or NULL if no conversion is required.
*/
TupleConversionMap *pi_RootToPartitionMap;
/*
* Map for converting tuples in partition format into the root partitioned
* table format, or NULL if no conversion is required.
*/
TupleConversionMap *pi_PartitionToRootMap;
/*
* Slot to store tuples in partition format, or NULL when no translation
* is required between root and partition.
*/
TupleTableSlot *pi_PartitionTupleSlot;
} PartitionRoutingInfo;
TupleConversionMap
The TupleConversionMap struct stores the mapping used to convert tuples from one row type to another.
typedef struct TupleConversionMap
{
TupleDesc indesc; /* tupdesc for source rowtype */
TupleDesc outdesc; /* tupdesc for result rowtype */
AttrNumber *attrMap; /* indexes of input fields, or 0 for null */
Datum *invalues; /* workspace for deconstructing source */
bool *inisnull; // null flags for the source values
Datum *outvalues; /* workspace for constructing result */
bool *outisnull; // null flags for the result values
} TupleConversionMap;
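To make the role of attrMap concrete, here is a minimal sketch of the remapping step. It is not PostgreSQL's conversion routine: toy_datum and toy_convert_row are hypothetical names, and the sketch assumes only what the field comment states, namely that each attrMap entry is a 1-based source column number, or 0 when the result column has no source and must be NULL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a column value; PostgreSQL uses Datum. */
typedef long toy_datum;

/*
 * Remap one row from the source layout to the result layout.
 * attr_map[i] is the 1-based source column that feeds result column i,
 * or 0 when result column i has no source column and becomes NULL.
 */
static void
toy_convert_row(const int *attr_map, int out_natts,
                const toy_datum *in_values, const bool *in_isnull,
                int in_natts,
                toy_datum *out_values, bool *out_isnull)
{
    for (int i = 0; i < out_natts; i++)
    {
        int src = attr_map[i];

        assert(src >= 0 && src <= in_natts);
        if (src == 0)
        {
            /* no matching source column: emit NULL */
            out_values[i] = 0;
            out_isnull[i] = true;
        }
        else
        {
            /* copy value and null flag from the mapped source column */
            out_values[i] = in_values[src - 1];
            out_isnull[i] = in_isnull[src - 1];
        }
    }
}
```

For example, with attr_map = {2, 0, 1} a source row (10, 20, NULL) becomes (20, NULL, 10): the columns are reordered and the unmapped middle column is filled with NULL.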
2. Source Code Walkthrough
ExecFindPartition finds the target (leaf) partition, in the partition tree rooted at parent, for the tuple contained in *slot.
/*
* ExecFindPartition -- Find a leaf partition in the partition tree rooted
* at parent, for the heap tuple contained in *slot
*
* estate must be non-NULL; we'll need it to compute any expressions in the
* partition key(s)
*
* If no leaf partition is found, this routine errors out with the appropriate
* error message, else it returns the leaf partition sequence number
* as an index into the array of (ResultRelInfos of) all leaf partitions in
* the partition tree.
*/
int
ExecFindPartition(ResultRelInfo *resultRelInfo, PartitionDispatch *pd,
TupleTableSlot *slot, EState *estate)
{
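int result;// index of the leaf partition found
Datum values[PARTITION_MAX_KEYS];// partition key values
bool isnull[PARTITION_MAX_KEYS];// null flags for the key values
Relation rel;// relation at the current level
PartitionDispatch dispatch;// dispatch info for the current level
ExprContext *ecxt = GetPerTupleExprContext(estate);// per-tuple expression context
TupleTableSlot *ecxt_scantuple_old = ecxt->ecxt_scantuple;// original scan tuple slot, restored on exit
TupleTableSlot *myslot = NULL;// dedicated slot of the current level, if any
MemoryContext oldcxt;// caller's memory context
HeapTuple tuple;// tuple being routed
/* use per-tuple context here to avoid leaking memory */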
oldcxt = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
/*
* First check the root table's partition constraint, if any. No point in
* routing the tuple if it doesn't belong in the root table itself.
*/
if (resultRelInfo->ri_PartitionCheck)
ExecPartitionCheck(resultRelInfo, slot, estate, true);
/* start with the root partitioned table */
tuple = ExecFetchSlotTuple(slot);// fetch the heap tuple from the slot
dispatch = pd[0];// root
while (true)
{
PartitionDesc partdesc;// partition descriptor
TupleConversionMap *map = dispatch->tupmap;// tuple conversion map
int cur_index = -1;// index of the matching partition at this level
rel = dispatch->reldesc;// relation at this level
partdesc = RelationGetPartitionDesc(rel);// partition descriptor of rel
/*
* Convert the tuple to this parent's layout, if different from the
* current relation.
*/
myslot = dispatch->tupslot;
if (myslot != NULL && map != NULL)
{
tuple = do_convert_tuple(tuple, map);
ExecStoreTuple(tuple, myslot, InvalidBuffer, true);
slot = myslot;
}
/*
* Extract partition key from tuple. Expression evaluation machinery
* that FormPartitionKeyDatum() invokes expects ecxt_scantuple to
* point to the correct tuple slot. The slot might have changed from
* what was used for the parent table if the table of the current
* partitioning level has different tuple descriptor from the parent.
* So update ecxt_scantuple accordingly.
*/
ecxt->ecxt_scantuple = slot;
FormPartitionKeyDatum(dispatch, slot, estate, values, isnull);
/*
* Nothing for get_partition_for_tuple() to do if there are no
* partitions to begin with.
*/
if (partdesc->nparts == 0)
{
result = -1;
break;
}
// look up the matching partition at this level
cur_index = get_partition_for_tuple(rel, values, isnull);
/*
* cur_index < 0 means we failed to find a partition of this parent.
* cur_index >= 0 means we either found the leaf partition, or the
* next parent to find a partition of.
*/
if (cur_index < 0)
{
result = -1;
break;// not found, exit the loop
}
else if (dispatch->indexes[cur_index] >= 0)
{
result = dispatch->indexes[cur_index];
/* success! */
break;// found, exit the loop
}
else
{
/* move down one level */
dispatch = pd[-dispatch->indexes[cur_index]];
/*
* Release the dedicated slot, if it was used. Create a copy of
* the tuple first, for the next iteration.
*/
if (slot == myslot)
{
tuple = ExecCopySlotTuple(myslot);
ExecClearTuple(myslot);
}
}
}
/* Release the tuple in the lowest parent's dedicated slot. */
if (slot == myslot)
ExecClearTuple(myslot);
/* A partition was not found. */
if (result < 0)
{
char *val_desc;
val_desc = ExecBuildSlotPartitionKeyDescription(rel,
values, isnull, 64);
Assert(OidIsValid(RelationGetRelid(rel)));
ereport(ERROR,
(errcode(ERRCODE_CHECK_VIOLATION),
errmsg("no partition of relation \"%s\" found for row",
RelationGetRelationName(rel)),
val_desc ? errdetail("Partition key of the failing row contains %s.", val_desc) : 0));
}
MemoryContextSwitchTo(oldcxt);
ecxt->ecxt_scantuple = ecxt_scantuple_old;
return result;
}
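The loop in ExecFindPartition relies on one encoding: a non-negative entry in dispatch->indexes is the index of a leaf partition, while a negative entry is the negated position of a sub-partitioned table in the pd array. The following is a minimal sketch of that descent only, under simplifying assumptions: ToyDispatch, toy_pick_child, and toy_find_leaf are hypothetical names, the key is a single integer, and toy_pick_child stands in for get_partition_for_tuple with a simplified "first child whose exclusive upper bound exceeds the key" rule. Unlike the real function, the sketch returns -1 instead of raising an error.

```c
#include <assert.h>

#define TOY_MAX_PARTS 8

/* Hypothetical, simplified counterpart of PartitionDispatch. */
typedef struct ToyDispatch
{
    int nparts;                  /* number of children at this level */
    int bounds[TOY_MAX_PARTS];   /* exclusive upper bound of each child, ascending */
    int indexes[TOY_MAX_PARTS];  /* >= 0: leaf index; < 0: negated index into pd[] */
} ToyDispatch;

/* Stand-in for get_partition_for_tuple: pick a child at one level, or -1. */
static int
toy_pick_child(const ToyDispatch *d, int key)
{
    for (int i = 0; i < d->nparts; i++)
    {
        if (key < d->bounds[i])
            return i;
    }
    return -1;
}

/* Walk down from pd[0] until a leaf is reached or no child accepts the key. */
static int
toy_find_leaf(const ToyDispatch *pd, int key)
{
    const ToyDispatch *dispatch = &pd[0];   /* start with the root */

    for (;;)
    {
        int cur = toy_pick_child(dispatch, key);

        if (cur < 0)
            return -1;                       /* no partition at this level */
        if (dispatch->indexes[cur] >= 0)
            return dispatch->indexes[cur];   /* leaf partition found */
        /* negative entry: move down one level */
        dispatch = &pd[-dispatch->indexes[cur]];
    }
}
```

With a root whose second child is itself partitioned (indexes = {0, -1}) and a sub-level pd[1] with indexes = {1, 2}, a key routed to the second root child is looked up again in pd[1] and ends at leaf 1 or 2.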
/*
* get_partition_for_tuple
* Finds partition of relation which accepts the partition key specified
* in values and isnull
*
* Return value is index of the partition (>= 0 and < partdesc->nparts) if one
* found or -1 if none found.
*/
static int
get_partition_for_tuple(Relation relation, Datum *values, bool *isnull)
{
int bound_offset;
int part_index = -1;
PartitionKey key = RelationGetPartitionKey(relation);
PartitionDesc partdesc = RelationGetPartitionDesc(relation);
PartitionBoundInfo boundinfo = partdesc->boundinfo;
/* Route as appropriate based on partitioning strategy. */
switch (key->strategy)
{
case PARTITION_STRATEGY_HASH:// hash partitioning
{
int greatest_modulus;
uint64 rowHash;
greatest_modulus = get_hash_partition_greatest_modulus(boundinfo);
rowHash = compute_partition_hash_value(key->partnatts,
key->partsupfunc,
values, isnull);
part_index = boundinfo->indexes[rowHash % greatest_modulus];
}
break;
case PARTITION_STRATEGY_LIST:// list partitioning
if (isnull[0])
{
if (partition_bound_accepts_nulls(boundinfo))
part_index = boundinfo->null_index;
}
else
{
bool equal = false;
bound_offset = partition_list_bsearch(key->partsupfunc,
key->partcollation,
boundinfo,
values[0], &equal);
if (bound_offset >= 0 && equal)
part_index = boundinfo->indexes[bound_offset];
}
break;
case PARTITION_STRATEGY_RANGE:// range partitioning
{
bool equal = false,
range_partkey_has_null = false;
int i;
/*
* No range includes NULL, so this will be accepted by the
* default partition if there is one, and otherwise rejected.
*/
for (i = 0; i < key->partnatts; i++)
{
if (isnull[i])
{
range_partkey_has_null = true;
break;
}
}
if (!range_partkey_has_null)
{
bound_offset = partition_range_datum_bsearch(key->partsupfunc,
key->partcollation,
boundinfo,
key->partnatts,
values,
&equal);
/*
* The bound at bound_offset is less than or equal to the
* tuple value, so the bound at offset+1 is the upper
* bound of the partition we're looking for, if there
* actually exists one.
*/
part_index = boundinfo->indexes[bound_offset + 1];
}
}
break;
default:
elog(ERROR, "unexpected partition strategy: %d",
(int) key->strategy);// no other strategies are supported
}
/*
* part_index < 0 means we failed to find a partition of this parent. Use
* the default partition, if there is one.
*/
if (part_index < 0)
part_index = boundinfo->default_index;
return part_index;
}
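The range branch is the least obvious one: the binary search returns the offset of the greatest bound that is less than or equal to the key, and the partition is then read from indexes[bound_offset + 1]. Below is a minimal sketch of that lookup for a single integer key. The names toy_range_bsearch and toy_range_route are hypothetical, and the array layout is an assumption made for illustration: indexes has one more entry than bounds, and indexes[i] names the partition whose upper bound is bounds[i], or -1 where no partition covers that interval.

```c
#include <assert.h>

/*
 * Return the offset of the greatest bound <= value, or -1 if every bound
 * is greater than value. bounds[] must be sorted in ascending order.
 */
static int
toy_range_bsearch(const int *bounds, int nbounds, int value)
{
    int lo = -1;
    int hi = nbounds - 1;

    while (lo < hi)
    {
        int mid = (lo + hi + 1) / 2;

        if (bounds[mid] <= value)
            lo = mid;           /* bounds[mid] still qualifies */
        else
            hi = mid - 1;       /* bounds[mid] is too large */
    }
    return lo;
}

/*
 * Route a key: the bound after the one found is the upper bound of the
 * candidate partition. Fall back to default_index (-1 if there is no
 * default partition) when the interval has no partition.
 */
static int
toy_range_route(const int *bounds, const int *indexes, int nbounds,
                int value, int default_index)
{
    int offset = toy_range_bsearch(bounds, nbounds, value);
    int part_index = indexes[offset + 1];

    if (part_index < 0)
        part_index = default_index;
    return part_index;
}
```

For partitions covering [0, 10), [10, 20), and [30, 40), the sorted bounds are {0, 10, 20, 30, 40} and indexes is {-1, 0, 1, -1, 2, -1}. The key 25 falls into the uncovered gap, so it goes to the default partition if one exists and is rejected otherwise.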
Supporting functions
/*
* get_hash_partition_greatest_modulus
*
* Returns the greatest modulus of the hash partition bound. The greatest
* modulus will be at the end of the datums array because hash partitions are
* arranged in the ascending order of their moduli and remainders.
*/
int
get_hash_partition_greatest_modulus(PartitionBoundInfo bound)
{
Assert(bound && bound->strategy == PARTITION_STRATEGY_HASH);
Assert(bound->datums && bound->ndatums > 0);
Assert(DatumGetInt32(bound->datums[bound->ndatums - 1][0]) > 0);
return DatumGetInt32(bound->datums[bound->ndatums - 1][0]);
}
/*
* compute_partition_hash_value
*
* Compute the hash value for given partition key values.
*/
uint64
compute_partition_hash_value(int partnatts, FmgrInfo *partsupfunc,
Datum *values, bool *isnull)
{
int i;
uint64 rowHash = 0;// accumulated result
Datum seed = UInt64GetDatum(HASH_PARTITION_SEED);
for (i = 0; i < partnatts; i++)
{
/* Nulls are just ignored */
if (!isnull[i])
{
Datum hash;
Assert(OidIsValid(partsupfunc[i].fn_oid));
/*
* Compute hash for each datum value by calling respective
* datatype-specific hash functions of each partition key
* attribute.
*/
hash = FunctionCall2(&partsupfunc[i], values[i], seed);
/* Form a single 64-bit hash value */
rowHash = hash_combine64(rowHash, DatumGetUInt64(hash));
}
}
return rowHash;
}
/*
* Combine two 64-bit hash values, resulting in another hash value, using the
* same kind of technique as hash_combine(). Testing shows that this also
* produces good bit mixing.
*/
static inline uint64
hash_combine64(uint64 a, uint64 b)
{
/* 0x49a0f4dd15e5a8e3 is 64bit random data */
a ^= b + UINT64CONST(0x49a0f4dd15e5a8e3) + (a << 54) + (a >> 7);
return a;
}
// macro for calling a function with two arguments
#define FunctionCall2(flinfo, arg1, arg2) \
FunctionCall2Coll(flinfo, InvalidOid, arg1, arg2)
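The hash branch can be reproduced outside the server as a small sketch. The combine step below copies the arithmetic of hash_combine64 shown above. Everything else is a simplification with hypothetical names (toy_row_hash, toy_fill_hash_indexes, toy_hash_route): per-column hash values are passed in directly instead of being computed by datatype-specific hash functions, and the way the indexes array is filled (each partition claims remainder, remainder + modulus, and so on, up to the greatest modulus) is an assumption about how the bound info is built, not something shown in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same arithmetic as hash_combine64 above. */
static uint64_t
toy_hash_combine64(uint64_t a, uint64_t b)
{
    a ^= b + UINT64_C(0x49a0f4dd15e5a8e3) + (a << 54) + (a >> 7);
    return a;
}

/* Combine per-column hashes into one row hash; NULL columns are skipped. */
static uint64_t
toy_row_hash(const uint64_t *col_hash, const bool *isnull, int natts)
{
    uint64_t row_hash = 0;

    for (int i = 0; i < natts; i++)
    {
        if (!isnull[i])
            row_hash = toy_hash_combine64(row_hash, col_hash[i]);
    }
    return row_hash;
}

/*
 * Assumed layout: indexes[] has greatest_modulus entries; partition p owns
 * every slot r with r % modulus[p] == remainder[p]. Unowned slots stay -1.
 */
static void
toy_fill_hash_indexes(int *indexes, int greatest_modulus,
                      const int *modulus, const int *remainder, int nparts)
{
    for (int i = 0; i < greatest_modulus; i++)
        indexes[i] = -1;
    for (int p = 0; p < nparts; p++)
    {
        for (int r = remainder[p]; r < greatest_modulus; r += modulus[p])
            indexes[r] = p;
    }
}

/* Mirror of: part_index = boundinfo->indexes[rowHash % greatest_modulus] */
static int
toy_hash_route(const int *indexes, int greatest_modulus, uint64_t row_hash)
{
    assert(greatest_modulus > 0);
    return indexes[row_hash % (uint64_t) greatest_modulus];
}
```

Under that assumption, six partitions that all use modulus 6 with remainders 0 through 5 give indexes = {0, 1, 2, 3, 4, 5}, so the partition index is simply the row hash modulo 6.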
3. Trace Analysis
The test script is as follows:
-- Hash Partition
drop table if exists t_hash_partition;
create table t_hash_partition (c1 int not null,c2 varchar(40),c3 varchar(40)) partition by hash(c1);
create table t_hash_partition_1 partition of t_hash_partition for values with (modulus 6,remainder 0);
create table t_hash_partition_2 partition of t_hash_partition for values with (modulus 6,remainder 1);
create table t_hash_partition_3 partition of t_hash_partition for values with (modulus 6,remainder 2);
create table t_hash_partition_4 partition of t_hash_partition for values with (modulus 6,remainder 3);
create table t_hash_partition_5 partition of t_hash_partition for values with (modulus 6,remainder 4);
create table t_hash_partition_6 partition of t_hash_partition for values with (modulus 6,remainder 5);
insert into t_hash_partition(c1,c2,c3) VALUES(0,'HASH0','HAHS0');
Start gdb, set a breakpoint, and enter ExecFindPartition
(gdb) b ExecFindPartition
Breakpoint 1 at 0x6e19e7: file execPartition.c, line 227.
(gdb) c
Continuing.
Breakpoint 1, ExecFindPartition (resultRelInfo=0x14299a8, pd=0x142ae58, slot=0x142a140, estate=0x1429758)
at execPartition.c:227
227 ExprContext *ecxt = GetPerTupleExprContext(estate);
Initialize the variables and switch the memory context
227 ExprContext *ecxt = GetPerTupleExprContext(estate);
(gdb) n
228 TupleTableSlot *ecxt_scantuple_old = ecxt->ecxt_scantuple;
(gdb)
229 TupleTableSlot *myslot = NULL;
(gdb)
234 oldcxt = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
(gdb) p ecxt_scantuple_old
$1 = (TupleTableSlot *) 0x0
Fetch the tuple and get the root dispatch entry
(gdb) n
244 tuple = ExecFetchSlotTuple(slot);
(gdb)
245 dispatch = pd[0];
(gdb) n
249 TupleConversionMap *map = dispatch->tupmap;
(gdb) p *tuple
$2 = {t_len = 40, t_self = {ip_blkid = {bi_hi = 65535, bi_lo = 65535}, ip_posid = 0}, t_tableOid = 0, t_data = 0x142b158}
(gdb)
Inspect the dispatcher (dispatch)
(gdb) p *dispatch
$3 = {reldesc = 0x7fbfa6900950, key = 0x1489860, keystate = 0x0, partdesc = 0x149b130, tupslot = 0x0, tupmap = 0x0,
indexes = 0x142ade8}
(gdb) p *dispatch->reldesc
$4 = {rd_node = {spcNode = 1663, dbNode = 16402, relNode = 16986}, rd_smgr = 0x0, rd_refcnt = 1, rd_backend = -1,
rd_islocaltemp = false, rd_isnailed = false, rd_isvalid = true, rd_indexvalid = 0 '\000', rd_statvalid = false,
rd_createSubid = 0, rd_newRelfilenodeSubid = 0, rd_rel = 0x7fbfa6900b68, rd_att = 0x7fbfa6900c80, rd_id = 16986,
rd_lockInfo = {lockRelId = {relId = 16986, dbId = 16402}}, rd_rules = 0x0, rd_rulescxt = 0x0, trigdesc = 0x0,
rd_rsdesc = 0x0, rd_fkeylist = 0x0, rd_fkeyvalid = false, rd_partkeycxt = 0x1489710, rd_partkey = 0x1489860,
rd_pdcxt = 0x149afe0, rd_partdesc = 0x149b130, rd_partcheck = 0x0, rd_indexlist = 0x0, rd_oidindex = 0, rd_pkindex = 0,
rd_replidindex = 0, rd_statlist = 0x0, rd_indexattr = 0x0, rd_projindexattr = 0x0, rd_keyattr = 0x0, rd_pkattr = 0x0,
rd_idattr = 0x0, rd_projidx = 0x0, rd_pubactions = 0x0, rd_options = 0x0, rd_index = 0x0, rd_indextuple = 0x0,
rd_amhandler = 0, rd_indexcxt = 0x0, rd_amroutine = 0x0, rd_opfamily = 0x0, rd_opcintype = 0x0, rd_support = 0x0,
rd_supportinfo = 0x0, rd_indoption = 0x0, rd_indexprs = 0x0, rd_indpred = 0x0, rd_exclops = 0x0, rd_exclprocs = 0x0,
rd_exclstrats = 0x0, rd_amcache = 0x0, rd_indcollation = 0x0, rd_fdwroutine = 0x0, rd_toastoid = 0, pgstat_info = 0x0}
----------------------------------------------------------------------------
testdb=# select relname from pg_class where oid=16986;
relname
------------------
t_hash_partition -->the hash-partitioned table
(1 row)
----------------------------------------------------------------------------
(gdb) p *dispatch->key
$5 = {strategy = 104 'h', partnatts = 1, partattrs = 0x14898f8, partexprs = 0x0, partopfamily = 0x1489918,
partopcintype = 0x1489938, partsupfunc = 0x1489958, partcollation = 0x14899b0, parttypid = 0x14899d0,
parttypmod = 0x14899f0, parttyplen = 0x1489a10, parttypbyval = 0x1489a30,
parttypalign = 0x1489a50 "i~\177\177\177\177\177\177\b", parttypcoll = 0x1489a70}
(gdb) p *dispatch->partdesc
$6 = {nparts = 6, oids = 0x149b168, boundinfo = 0x149b1a0}
(gdb) p *dispatch->partdesc->boundinfo
$8 = {strategy = 104 'h', ndatums = 6, datums = 0x149b1f8, kind = 0x0, indexes = 0x149b288, null_index = -1,
default_index = -1}
(gdb) p *dispatch->partdesc->boundinfo->datums
$9 = (Datum *) 0x149b2c0
(gdb) p **dispatch->partdesc->boundinfo->datums
$10 = 6
(gdb) p *dispatch->indexes
$15 = 0
The OIDs in the partition descriptor (corresponding to t_hash_partition_1 through t_hash_partition_6, in order)
(gdb) p dispatch->partdesc->oids[0]
$11 = 16989
(gdb) p dispatch->partdesc->oids[1]
$12 = 16992
...
(gdb) p dispatch->partdesc->oids[5]
$13 = 17004
The dispatcher's indexes array
(gdb) p dispatch->indexes[0]
$16 = 0
...
(gdb) p dispatch->indexes[5]
$18 = 5
Set the current index (-1), get the relation, and get its partition descriptor
(gdb) n
250 int cur_index = -1;
(gdb)
252 rel = dispatch->reldesc;
(gdb)
253 partdesc = RelationGetPartitionDesc(rel);
(gdb)
259 myslot = dispatch->tupslot;
(gdb) p *partdesc
$19 = {nparts = 6, oids = 0x149b168, boundinfo = 0x149b1a0}
(gdb)
myslot is NULL
(gdb) n
260 if (myslot != NULL && map != NULL)
(gdb) p myslot
$20 = (TupleTableSlot *) 0x0
Extract the partition key from the tuple
(gdb) n
275 ecxt->ecxt_scantuple = slot;
(gdb)
276 FormPartitionKeyDatum(dispatch, slot, estate, values, isnull);
(gdb)
282 if (partdesc->nparts == 0)
(gdb) p *partdesc
$21 = {nparts = 6, oids = 0x149b168, boundinfo = 0x149b1a0}
(gdb) p *slot
$22 = {type = T_TupleTableSlot, tts_isempty = false, tts_shouldFree = true, tts_shouldFreeMin = false, tts_slow = false,
tts_tuple = 0x142b140, tts_tupleDescriptor = 0x1429f28, tts_mcxt = 0x1429640, tts_buffer = 0, tts_nvalid = 1,
tts_values = 0x142a1a0, tts_isnull = 0x142a1b8, tts_mintuple = 0x0, tts_minhdr = {t_len = 0, t_self = {ip_blkid = {
bi_hi = 0, bi_lo = 0}, ip_posid = 0}, t_tableOid = 0, t_data = 0x0}, tts_off = 4, tts_fixedTupleDescriptor = true}
(gdb) p values
$23 = {0, 7152626, 21144656, 21144128, 7141053, 21143088, 21144128, 16372128, 140722434628688, 0, 0, 0, 21143872,
140722434628736, 140461078524324, 21141056, 21144128, 0, 21143088, 21141056, 7152279, 0, 7421941, 21141056, 21143088,
21614576, 140722434628800, 7422189, 21143872, 140722434628839, 21143088, 21144128}
(gdb) p isnull
$24 = {false, 91, 186, 126, 252, 127, false, false, 208, 166, 71, false, false, false, false, false, 2,
false <repeats 15 times>}
(gdb) p *estate
$25 = {type = T_EState, es_direction = ForwardScanDirection, es_snapshot = 0x1451ee0, es_crosscheck_snapshot = 0x0,
es_range_table = 0x14a71c0, es_plannedstmt = 0x14a72b8,
es_sourceText = 0x13acec8 "insert into t_hash_partition(c1,c2,c3) VALUES(0,'HASH0','HAHS0');", es_junkFilter = 0x0,
es_output_cid = 0, es_result_relations = 0x14299a8, es_num_result_relations = 1, es_result_relation_info = 0x14299a8,
es_root_result_relations = 0x0, es_num_root_result_relations = 0, es_tuple_routing_result_relations = 0x0,
es_trig_target_relations = 0x0, es_trig_tuple_slot = 0x142afc0, es_trig_oldtup_slot = 0x0, es_trig_newtup_slot = 0x0,
es_param_list_info = 0x0, es_param_exec_vals = 0x1429970, es_queryEnv = 0x0, es_query_cxt = 0x1429640,
es_tupleTable = 0x142a200, es_rowMarks = 0x0, es_processed = 0, es_lastoid = 0, es_top_eflags = 0, es_instrument = 0,
es_finished = false, es_exprcontexts = 0x1429ef0, es_subplanstates = 0x0, es_auxmodifytables = 0x0,
es_per_tuple_exprcontext = 0x142b080, es_epqTuple = 0x0, es_epqTupleSet = 0x0, es_epqScanDone = 0x0,
es_use_parallel_mode = false, es_query_dsa = 0x0, es_jit_flags = 0, es_jit = 0x0, es_jit_worker_instr = 0x0}
(gdb)
Step into get_partition_for_tuple
(gdb) n
288 cur_index = get_partition_for_tuple(rel, values, isnull);
(gdb) step
get_partition_for_tuple (relation=0x7fbfa6900950, values=0x7ffc7eba5bb0, isnull=0x7ffc7eba5b90) at execPartition.c:1139
1139 int part_index = -1;
(gdb)
get_partition_for_tuple -> get the partition key
1139 int part_index = -1;
(gdb) n
1140 PartitionKey key = RelationGetPartitionKey(relation);
(gdb)
1141 PartitionDesc partdesc = RelationGetPartitionDesc(relation);
(gdb) p key
$26 = (PartitionKey) 0x1489860
(gdb) p *key
$27 = {strategy = 104 'h', partnatts = 1, partattrs = 0x14898f8, partexprs = 0x0, partopfamily = 0x1489918,
partopcintype = 0x1489938, partsupfunc = 0x1489958, partcollation = 0x14899b0, parttypid = 0x14899d0,
parttypmod = 0x14899f0, parttyplen = 0x1489a10, parttypbyval = 0x1489a30,
parttypalign = 0x1489a50 "i~\177\177\177\177\177\177\b", parttypcoll = 0x1489a70}
get_partition_for_tuple -> get the partition descriptor and the partition bound info
(gdb) n
1142 PartitionBoundInfo boundinfo = partdesc->boundinfo;
(gdb)
1145 switch (key->strategy)
(gdb) p *partdesc
$28 = {nparts = 6, oids = 0x149b168, boundinfo = 0x149b1a0}
(gdb) p *boundinfo
$29 = {strategy = 104 'h', ndatums = 6, datums = 0x149b1f8, kind = 0x0, indexes = 0x149b288, null_index = -1,
default_index = -1}
get_partition_for_tuple -> enter the hash partitioning branch
(gdb) n
1152 greatest_modulus = get_hash_partition_greatest_modulus(boundinfo);
(gdb) p key->strategy
$30 = 104 'h'
get_partition_for_tuple -> compute the greatest modulus and the row hash, then obtain the partition index
(gdb) n
1153 rowHash = compute_partition_hash_value(key->partnatts,
(gdb) n
1157 part_index = boundinfo->indexes[rowHash % greatest_modulus];
(gdb)
1159 break;
(gdb) p part_index
$31 = 2
(gdb)
get_partition_for_tuple -> return
(gdb) n
1228 if (part_index < 0)
(gdb)
1231 return part_index;
(gdb)
1232 }
(gdb)
ExecFindPartition (resultRelInfo=0x14299a8, pd=0x142ae58, slot=0x142a140, estate=0x1429758) at execPartition.c:295
295 if (cur_index < 0)
(gdb)
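The partition has been found (partition index = 2, which by the OID order shown earlier corresponds to t_hash_partition_3)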
(gdb) n
300 else if (dispatch->indexes[cur_index] >= 0)
(gdb)
302 result = dispatch->indexes[cur_index];
(gdb) p dispatch->indexes[cur_index]
$32 = 2
(gdb) n
304 break;
(gdb)
324 if (slot == myslot)
(gdb)
328 if (result < 0)
(gdb)
342 MemoryContextSwitchTo(oldcxt);
(gdb)
343 ecxt->ecxt_scantuple = ecxt_scantuple_old;
(gdb)
345 return result;
(gdb)
The function call completes
(gdb) n
346 }
(gdb)
ExecPrepareTupleRouting (mtstate=0x1429ac0, estate=0x1429758, proute=0x142a7a8, targetRelInfo=0x14299a8, slot=0x142a140)
at nodeModifyTable.c:1716
1716 Assert(partidx >= 0 && partidx < proute->num_partitions);
DONE!
4. References
PG 11.1 Source Code.
Note: the source code shown on doxygen does not match the PG 11.1 source; this section is based on 11.1.
Source: ITPUB Blog, http://blog.itpub.net/6906/viewspace-2374796/. Please credit the source when reposting.