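PostgreSQL Source Code Walkthrough (188) - Query #104 (Aggregate Functions #8 - ExecAgg Review)
This section takes a first pass through the ExecAgg function and outlines the logic of its implementation.
1. Data structures
AggState
Run-time state structure for an aggregate (Agg) node; it contains AggStatePerAgg and related structures.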
/* ---------------------
 *  AggState information
 *
 *  ss.ss_ScanTupleSlot refers to output of underlying plan.
 *  (ss = ScanState, ps = PlanState)
 *
 *  Note: ss.ps.ps_ExprContext contains ecxt_aggvalues and
 *  ecxt_aggnulls arrays, which hold the computed agg values for the current
 *  input group during evaluation of an Agg node's output tuple(s).  We
 *  create a second ExprContext, tmpcontext, in which to evaluate input
 *  expressions and run the aggregate transition functions.
 * ---------------------
 */
/* these structs are private in nodeAgg.c: */
typedef struct AggStatePerAggData *AggStatePerAgg;
typedef struct AggStatePerTransData *AggStatePerTrans;
typedef struct AggStatePerGroupData *AggStatePerGroup;
typedef struct AggStatePerPhaseData *AggStatePerPhase;
typedef struct AggStatePerHashData *AggStatePerHash;
typedef struct AggState
{
    ScanState   ss;                 /* its first field is NodeTag (inherited via ScanState) */
    List       *aggs;               /* all Aggref nodes in targetlist & quals */
    int         numaggs;            /* length of list (could be zero!) */
    int         numtrans;           /* number of pertrans items */
    AggStrategy aggstrategy;        /* strategy mode */
    AggSplit    aggsplit;           /* agg-splitting mode, see nodes.h */
    AggStatePerPhase phase;         /* pointer to current phase data */
    int         numphases;          /* number of phases (including phase 0) */
    int         current_phase;      /* current phase number */
    AggStatePerAgg peragg;          /* per-Aggref information */
    AggStatePerTrans pertrans;      /* per-Trans state information */
    ExprContext *hashcontext;       /* econtexts for long-lived data (hashtable) */
    ExprContext **aggcontexts;      /* econtexts for long-lived data (per grouping set) */
    ExprContext *tmpcontext;        /* econtext for input expressions */
#define FIELDNO_AGGSTATE_CURAGGCONTEXT 14
    ExprContext *curaggcontext;     /* currently active aggcontext */
    AggStatePerAgg curperagg;       /* currently active aggregate, if any */
#define FIELDNO_AGGSTATE_CURPERTRANS 16
    AggStatePerTrans curpertrans;   /* currently active trans state, if any */
    bool        input_done;         /* indicates end of input */
    bool        agg_done;           /* indicates completion of Agg scan */
    int         projected_set;      /* The last projected grouping set */
#define FIELDNO_AGGSTATE_CURRENT_SET 20
    int         current_set;        /* The current grouping set being evaluated */
    Bitmapset  *grouped_cols;       /* grouped cols in current projection */
    List       *all_grouped_cols;   /* list of all grouped cols in DESC order */

    /* These fields are for grouping set phase data */
    int         maxsets;            /* The max number of sets in any phase */
    AggStatePerPhase phases;        /* array of all phases */
    Tuplesortstate *sort_in;        /* sorted input to phases > 1 */
    Tuplesortstate *sort_out;       /* input is copied here for next phase */
    TupleTableSlot *sort_slot;      /* slot for sort results */

    /* these fields are used in AGG_PLAIN and AGG_SORTED modes: */
    AggStatePerGroup *pergroups;    /* grouping set indexed array of per-group
                                     * pointers */
    HeapTuple   grp_firstTuple;     /* copy of first tuple of current group */

    /* these fields are used in AGG_HASHED and AGG_MIXED modes: */
    bool        table_filled;       /* hash table filled yet? */
    int         num_hashes;         /* number of hash tables (one per hashed
                                     * grouping set), not a bucket count */
    AggStatePerHash perhash;        /* array of per-hashtable data */
    AggStatePerGroup *hash_pergroup;    /* grouping set indexed array of
                                         * per-group pointers */

    /* support for evaluation of agg input expressions: */
#define FIELDNO_AGGSTATE_ALL_PERGROUPS 34
    AggStatePerGroup *all_pergroups;    /* array of first ->pergroups, then
                                         * ->hash_pergroup */
    ProjectionInfo *combinedproj;   /* projection machinery */
} AggState;
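The "its first field is NodeTag" remark matters because ExecAgg receives a PlanState pointer and converts it back with castNode(AggState, pstate). The sketch below illustrates that embedding convention with toy types only; ToyTag, ToyPlanState, ToyScanState, ToyAggState and both helper functions are hypothetical stand-ins, not PostgreSQL definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for NodeTag, PlanState, ScanState and AggState. */
typedef enum ToyTag { TOY_SCANSTATE = 1, TOY_AGGSTATE = 2 } ToyTag;

typedef struct ToyPlanState { ToyTag type; } ToyPlanState;     /* tag comes first */
typedef struct ToyScanState { ToyPlanState ps; } ToyScanState; /* parent embedded first */
typedef struct ToyAggState
{
    ToyScanState ss;        /* parent embedded first, so the tag is at offset 0 */
    int          numaggs;
} ToyAggState;

/* Downcast with a tag check, in the spirit of castNode(AggState, pstate). */
static ToyAggState *
toy_cast_agg(ToyPlanState *pstate)
{
    assert(pstate != NULL && pstate->type == TOY_AGGSTATE);
    return (ToyAggState *) pstate;
}

/* Upcast: the first member lives at the same address as the whole struct. */
static ToyPlanState *
toy_as_planstate(ToyAggState *agg)
{
    return &agg->ss.ps;
}
```

Because each parent struct is the first member of its child, one address is valid for all three views, which is what lets the executor pass every node around as a generic pointer.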
/* Primitive options supported by nodeAgg.c: */
#define AGGSPLITOP_COMBINE      0x01    /* substitute combinefn for transfn */
#define AGGSPLITOP_SKIPFINAL    0x02    /* skip finalfn, return state as-is */
#define AGGSPLITOP_SERIALIZE    0x04    /* apply serializefn to output */
#define AGGSPLITOP_DESERIALIZE  0x08    /* apply deserializefn to input */

/* Supported operating modes (i.e., useful combinations of these options): */
typedef enum AggSplit
{
    /* Basic, non-split aggregation: */
    AGGSPLIT_SIMPLE = 0,
    /* Initial phase of partial aggregation, with serialization: */
    AGGSPLIT_INITIAL_SERIAL = AGGSPLITOP_SKIPFINAL | AGGSPLITOP_SERIALIZE,
    /* Final phase of partial aggregation, with deserialization: */
    AGGSPLIT_FINAL_DESERIAL = AGGSPLITOP_COMBINE | AGGSPLITOP_DESERIALIZE
} AggSplit;

/* Test whether an AggSplit value selects each primitive option: */
#define DO_AGGSPLIT_COMBINE(as)     (((as) & AGGSPLITOP_COMBINE) != 0)
#define DO_AGGSPLIT_SKIPFINAL(as)   (((as) & AGGSPLITOP_SKIPFINAL) != 0)
#define DO_AGGSPLIT_SERIALIZE(as)   (((as) & AGGSPLITOP_SERIALIZE) != 0)
#define DO_AGGSPLIT_DESERIALIZE(as) (((as) & AGGSPLITOP_DESERIALIZE) != 0)
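To make the bit-flag encoding concrete, here is a small standalone sketch that repeats the definitions above and decodes each AggSplit mode into the primitive options it selects. The helper toy_describe_split is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same values as the definitions quoted above. */
#define AGGSPLITOP_COMBINE      0x01
#define AGGSPLITOP_SKIPFINAL    0x02
#define AGGSPLITOP_SERIALIZE    0x04
#define AGGSPLITOP_DESERIALIZE  0x08

typedef enum AggSplit
{
    AGGSPLIT_SIMPLE = 0,
    AGGSPLIT_INITIAL_SERIAL = AGGSPLITOP_SKIPFINAL | AGGSPLITOP_SERIALIZE,
    AGGSPLIT_FINAL_DESERIAL = AGGSPLITOP_COMBINE | AGGSPLITOP_DESERIALIZE
} AggSplit;

#define DO_AGGSPLIT_COMBINE(as)     (((as) & AGGSPLITOP_COMBINE) != 0)
#define DO_AGGSPLIT_SKIPFINAL(as)   (((as) & AGGSPLITOP_SKIPFINAL) != 0)
#define DO_AGGSPLIT_SERIALIZE(as)   (((as) & AGGSPLITOP_SERIALIZE) != 0)
#define DO_AGGSPLIT_DESERIALIZE(as) (((as) & AGGSPLITOP_DESERIALIZE) != 0)

/* Hypothetical helper: write the names of the selected options into buf. */
static const char *
toy_describe_split(AggSplit as, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    snprintf(buf, buflen, "%s%s%s%s",
             DO_AGGSPLIT_COMBINE(as) ? "combine " : "",
             DO_AGGSPLIT_SKIPFINAL(as) ? "skipfinal " : "",
             DO_AGGSPLIT_SERIALIZE(as) ? "serialize " : "",
             DO_AGGSPLIT_DESERIALIZE(as) ? "deserialize " : "");
    return buf;
}
```

AGGSPLIT_SIMPLE selects no option, the initial partial phase skips the final function and serializes its state, and the final phase deserializes incoming states and combines them instead of running the transition function.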
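2. Source code walkthrough
ExecAgg first obtains the AggState run-time state and then dispatches on the strategy (aggstrategy) of the current phase (aggstate->phase). With hash aggregation there is a single Agg node. In the AGG_HASHED case, the first call runs agg_fill_hash_table, which hashes each input tuple on its grouping column values, runs the transition functions to compute the intermediate results, and caches them in the hash table. Control then falls through into the AGG_MIXED case, which shares the same retrieval step: agg_retrieve_hash_table fetches result tuples from the hash table and returns them one per call (each result is one result row). The strategy itself does not change; the fallthrough only reuses the retrieval code.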
/*
 * ExecAgg -
 *
 *    ExecAgg receives tuples from its outer subplan and aggregates over
 *    the appropriate attribute for each aggregate function use (Aggref
 *    node) appearing in the targetlist or qual of the node.  The number
 *    of tuples to aggregate over depends on whether grouped or plain
 *    aggregation is selected.  In grouped aggregation, we produce a result
 *    row for each group; in plain aggregation there's a single result row
 *    for the whole query.  In either case, the value of each aggregate is
 *    stored in the expression context to be used when ExecProject evaluates
 *    the result tuple.
 */
static TupleTableSlot *
ExecAgg(PlanState *pstate)
{
    AggState   *node = castNode(AggState, pstate);
    TupleTableSlot *result = NULL;

    CHECK_FOR_INTERRUPTS();

    if (!node->agg_done)
    {
        /* Dispatch based on strategy */
        switch (node->phase->aggstrategy)
        {
            case AGG_HASHED:
                if (!node->table_filled)
                    agg_fill_hash_table(node);
                /* FALLTHROUGH */
                /* once the table is filled, share the retrieval step with AGG_MIXED */
            case AGG_MIXED:
                result = agg_retrieve_hash_table(node);
                break;
            case AGG_PLAIN:
            case AGG_SORTED:
                result = agg_retrieve_direct(node);
                break;
        }

        if (!TupIsNull(result))
            return result;
    }

    return NULL;
}
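The dispatch pattern is easier to see in a reduced model. The sketch below is a toy under stated assumptions: ToyAgg, toy_fill, toy_retrieve and toy_exec_agg are hypothetical names, result rows are plain integers (0 stands for "no more rows"), and the plain/sorted branch reuses the same toy retrieval function purely for brevity.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum ToyStrategy { TOY_PLAIN, TOY_SORTED, TOY_HASHED, TOY_MIXED } ToyStrategy;

typedef struct ToyAgg
{
    ToyStrategy strategy;
    bool        table_filled;
    bool        agg_done;
    int         fill_calls;     /* how many times the fill step ran */
    int         remaining;      /* result rows still to hand out */
} ToyAgg;

/* Stands in for agg_fill_hash_table: runs once, then sets the flag. */
static void
toy_fill(ToyAgg *a)
{
    a->fill_calls++;
    a->table_filled = true;
}

/* Stands in for the retrieval step: returns a row id, or 0 when exhausted. */
static int
toy_retrieve(ToyAgg *a)
{
    if (a->remaining == 0)
    {
        a->agg_done = true;
        return 0;
    }
    return a->remaining--;
}

/* Same control flow as ExecAgg: fill on first call, then fall through. */
static int
toy_exec_agg(ToyAgg *a)
{
    int result = 0;

    if (!a->agg_done)
    {
        switch (a->strategy)
        {
            case TOY_HASHED:
                if (!a->table_filled)
                    toy_fill(a);
                /* FALLTHROUGH */
            case TOY_MIXED:
                result = toy_retrieve(a);
                break;
            case TOY_PLAIN:
            case TOY_SORTED:
                result = toy_retrieve(a);   /* simplification for the toy */
                break;
        }
    }
    return result;
}
```

Calling toy_exec_agg repeatedly on a TOY_HASHED node runs the fill step exactly once; every later call goes straight to retrieval until agg_done is set.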
agg_fill_hash_table
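Reads the input and builds the hash table.
lookup_hash_entries finds or creates the hash table entry that matches the grouping columns of the input tuple, and advance_aggregates calls the transition functions to compute the intermediate results and store them.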
/*
 * ExecAgg for hashed case: read input and build hash table
 */
static void
agg_fill_hash_table(AggState *aggstate)
{
    TupleTableSlot *outerslot;
    ExprContext *tmpcontext = aggstate->tmpcontext;

    /*
     * Process each outer-plan tuple, and then fetch the next one, until we
     * exhaust the outer plan.
     */
    for (;;)
    {
        /* fetch the next input tuple */
        outerslot = fetch_input_tuple(aggstate);
        if (TupIsNull(outerslot))
            break;              /* input exhausted, leave the loop */

        /* set up for lookup_hash_entries and advance_aggregates */
        /* make the input tuple visible through the temporary ExprContext */
        tmpcontext->ecxt_outertuple = outerslot;

        /* Find or build hashtable entries */
        lookup_hash_entries(aggstate);

        /* Advance the aggregates (or combine functions) */
        advance_aggregates(aggstate);

        /*
         * Reset per-input-tuple context after each tuple, but note that the
         * hash lookups do this too
         */
        ResetExprContext(aggstate->tmpcontext);
    }

    aggstate->table_filled = true;
    /* Initialize to walk the first hash table */
    select_current_set(aggstate, 0, true);
    ResetTupleHashIterator(aggstate->perhash[0].hashtable,
                           &aggstate->perhash[0].hashiter);
}
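The fill loop boils down to "find or create the group, then advance its transition state". The following minimal sketch models that with a fixed-size open-addressing table and a count/sum transition state. All names here (ToyTable, ToyGroup, ToyRow, toy_lookup, toy_advance, toy_fill_hash_table) are hypothetical, and the real code works on tuples, per-grouping-set hash tables and memory contexts rather than plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NBUCKETS 16         /* must exceed the number of distinct keys */

typedef struct ToyGroup { bool used; int key; long count; long sum; } ToyGroup;
typedef struct ToyTable { ToyGroup buckets[TOY_NBUCKETS]; } ToyTable;
typedef struct ToyRow   { int key; int value; } ToyRow;

/* Find or create the group for 'key' (the role of lookup_hash_entries). */
static ToyGroup *
toy_lookup(ToyTable *t, int key)
{
    unsigned h = ((unsigned) key * 2654435761u) % TOY_NBUCKETS;

    for (unsigned probe = 0; probe < TOY_NBUCKETS; probe++)
    {
        ToyGroup *g = &t->buckets[(h + probe) % TOY_NBUCKETS];

        if (!g->used)
        {
            /* new group: initialize its transition state */
            g->used = true;
            g->key = key;
            g->count = 0;
            g->sum = 0;
            return g;
        }
        if (g->key == key)
            return g;
    }
    return NULL;                /* table full */
}

/* Transition step for count(*) and sum(value) (the role of advance_aggregates). */
static void
toy_advance(ToyGroup *g, const ToyRow *row)
{
    g->count++;
    g->sum += row->value;
}

/* Consume all input rows, like the for (;;) loop in agg_fill_hash_table. */
static void
toy_fill_hash_table(ToyTable *t, const ToyRow *rows, size_t nrows)
{
    for (size_t i = 0; i < nrows; i++)
    {
        ToyGroup *g = toy_lookup(t, rows[i].key);

        assert(g != NULL);
        toy_advance(g, &rows[i]);
    }
}
```

No result row is produced during this pass; the table only accumulates per-group state, which is why ExecAgg needs a separate retrieval step afterwards.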
agg_retrieve_hash_table
agg_retrieve_hash_table retrieves results from the hash table and performs projection and the related steps.
/*
 * ExecAgg for hashed case: retrieving groups from hash table
 */
static TupleTableSlot *
agg_retrieve_hash_table(AggState *aggstate)
{
    ExprContext *econtext;
    AggStatePerAgg peragg;
    AggStatePerGroup pergroup;
    TupleHashEntryData *entry;
    TupleTableSlot *firstSlot;
    TupleTableSlot *result;
    AggStatePerHash perhash;

    /*
     * get state info from node.
     *
     * econtext is the per-output-tuple expression context.
     */
    econtext = aggstate->ss.ps.ps_ExprContext;
    peragg = aggstate->peragg;
    firstSlot = aggstate->ss.ss_ScanTupleSlot;

    /*
     * Note that perhash (and therefore anything accessed through it) can
     * change inside the loop, as we change between grouping sets.
     */
    perhash = &aggstate->perhash[aggstate->current_set];

    /*
     * We loop retrieving groups until we find one satisfying
     * aggstate->ss.ps.qual
     */
    while (!aggstate->agg_done)
    {
        /* slot used to unpack the entry's stored tuple */
        TupleTableSlot *hashslot = perhash->hashslot;
        int         i;

        CHECK_FOR_INTERRUPTS();

        /*
         * Find the next entry in the hash table
         */
        entry = ScanTupleHashTable(perhash->hashtable, &perhash->hashiter);
        if (entry == NULL)
        {
            /* current hash table is exhausted, try the next grouping set */
            int         nextset = aggstate->current_set + 1;

            if (nextset < aggstate->num_hashes)
            {
                /*
                 * Switch to next grouping set, reinitialize, and restart the
                 * loop.
                 */
                select_current_set(aggstate, nextset, true);

                perhash = &aggstate->perhash[aggstate->current_set];

                ResetTupleHashIterator(perhash->hashtable, &perhash->hashiter);

                continue;
            }
            else
            {
                /* No more hashtables, so done */
                aggstate->agg_done = true;
                return NULL;
            }
        }

        /*
         * Clear the per-output-tuple context for each group
         *
         * We intentionally don't use ReScanExprContext here; if any aggs have
         * registered shutdown callbacks, they mustn't be called yet, since we
         * might not be done with that agg.
         */
        ResetExprContext(econtext);

        /*
         * Transform representative tuple back into one with the right
         * columns.
         */
        ExecStoreMinimalTuple(entry->firstTuple, hashslot, false);
        slot_getallattrs(hashslot);

        /* clear firstSlot and mark every column as NULL */
        ExecClearTuple(firstSlot);
        memset(firstSlot->tts_isnull, true,
               firstSlot->tts_tupleDescriptor->natts * sizeof(bool));

        for (i = 0; i < perhash->numhashGrpCols; i++)
        {
            /* copy each grouping column back to its input column position */
            int         varNumber = perhash->hashGrpColIdxInput[i] - 1;

            firstSlot->tts_values[varNumber] = hashslot->tts_values[i];
            firstSlot->tts_isnull[varNumber] = hashslot->tts_isnull[i];
        }
        ExecStoreVirtualTuple(firstSlot);

        pergroup = (AggStatePerGroup) entry->additional;

        /*
         * Use the representative input tuple for any references to
         * non-aggregated input columns in the qual and tlist.
         */
        econtext->ecxt_outertuple = firstSlot;

        /* prepare the slot used for projection */
        prepare_projection_slot(aggstate,
                                econtext->ecxt_outertuple,
                                aggstate->current_set);

        /* compute the final aggregate values for this group */
        finalize_aggregates(aggstate, peragg, pergroup);

        /* check the qual and project the result tuple */
        result = project_aggregates(aggstate);
        if (result)
            return result;
    }

    /* No more groups */
    return NULL;
}
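The control flow of the retrieval loop (scan one hash table, move to the next grouping set when it is exhausted, skip groups rejected by the qual) can be sketched without any executor machinery. This is a simplified model: ToyHashSet, ToyGroup, ToyAgg, toy_retrieve and the having_min_count filter are hypothetical, and arrays stand in for hash table iterators.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyGroup { int key; long count; } ToyGroup;

/* One "hash table" per grouping set, modeled as an array plus an iterator. */
typedef struct ToyHashSet
{
    const ToyGroup *groups;
    size_t          ngroups;
    size_t          iter;
} ToyHashSet;

typedef struct ToyAgg
{
    ToyHashSet *perhash;
    int         num_hashes;
    int         current_set;
    bool        agg_done;
    long        having_min_count;   /* toy qual: keep groups with count >= this */
} ToyAgg;

/* Return the next group that passes the qual, or NULL when all sets are done. */
static const ToyGroup *
toy_retrieve(ToyAgg *agg)
{
    ToyHashSet *perhash = &agg->perhash[agg->current_set];

    while (!agg->agg_done)
    {
        const ToyGroup *g;

        if (perhash->iter >= perhash->ngroups)
        {
            int nextset = agg->current_set + 1;

            if (nextset < agg->num_hashes)
            {
                /* switch to the next grouping set and restart the loop */
                agg->current_set = nextset;
                perhash = &agg->perhash[nextset];
                perhash->iter = 0;
                continue;
            }
            /* no more hash tables, so done */
            agg->agg_done = true;
            return NULL;
        }

        g = &perhash->groups[perhash->iter++];
        if (g->count >= agg->having_min_count)
            return g;           /* qual satisfied: hand out this group */
    }
    return NULL;
}
```

As in the real function, a group that fails the qual does not end the call; the loop simply continues with the next entry, so the caller only ever sees qualifying rows or NULL.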
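3. Trace analysis
N/A
4. References
PostgreSQL Source Code Walkthrough (178) - Query #95 (Aggregate Functions) #1 Related Data Structures
PostgreSQL Source Code Walkthrough (160) - Query #80 (How Expression Evaluation Is Implemented)
Source: ITPUB blog, link: http://blog.itpub.net/6906/viewspace-2644146/. Please credit the source when reposting; otherwise legal action may be taken.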