PostgreSQL 原始碼解讀(70)- 查詢語句#55(make_one_rel函式#20-...
本節大體介紹了動態規劃演算法實現(standard_join_search)中的join_search_one_level->make_join_rel->populate_joinrel_with_paths->add_paths_to_joinrel函式中的hash_inner_and_outer函式,該函式嘗試構造hash join訪問路徑。
一、資料結構
Cost相關
注意:實際使用的引數值透過系統配置檔案定義,而不是這裡的常量定義!
typedef double Cost; /* execution cost (in page-access units) */
/* defaults for costsize.c's Cost parameters */
/* NB: cost-estimation code should use the variables, not these constants! */
/* 注意:實際值透過系統配置檔案定義,而不是這裡的常量定義! */
/* If you change these, update backend/utils/misc/postgresql.sample.conf */
#define DEFAULT_SEQ_PAGE_COST 1.0 //順序掃描page的成本
#define DEFAULT_RANDOM_PAGE_COST 4.0 //隨機掃描page的成本
#define DEFAULT_CPU_TUPLE_COST 0.01 //處理一個元組的CPU成本
#define DEFAULT_CPU_INDEX_TUPLE_COST 0.005 //處理一個索引元組的CPU成本
#define DEFAULT_CPU_OPERATOR_COST 0.0025 //執行一次操作或函式的CPU成本
#define DEFAULT_PARALLEL_TUPLE_COST 0.1 //並行執行,從一個worker傳輸一個元組到另一個worker的成本
#define DEFAULT_PARALLEL_SETUP_COST 1000.0 //構建並行執行環境的成本
#define DEFAULT_EFFECTIVE_CACHE_SIZE 524288 /*先前已有介紹, measured in pages */
double seq_page_cost = DEFAULT_SEQ_PAGE_COST;
double random_page_cost = DEFAULT_RANDOM_PAGE_COST;
double cpu_tuple_cost = DEFAULT_CPU_TUPLE_COST;
double cpu_index_tuple_cost = DEFAULT_CPU_INDEX_TUPLE_COST;
double cpu_operator_cost = DEFAULT_CPU_OPERATOR_COST;
double parallel_tuple_cost = DEFAULT_PARALLEL_TUPLE_COST;
double parallel_setup_cost = DEFAULT_PARALLEL_SETUP_COST;
int effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;
Cost disable_cost = 1.0e10;//1後面10個0,透過設定一個巨大的成本,讓最佳化器自動放棄此路徑
int max_parallel_workers_per_gather = 2;//每次gather使用的worker數
二、原始碼解讀
hash join的演算法實現虛擬碼如下:
Step 1
FOR small_table_row IN (SELECT * FROM small_table)
LOOP
slot := HASH(small_table_row.join_key);
INSERT_HASH_TABLE(slot,small_table_row);
END LOOP;
Step 2
FOR large_table_row IN (SELECT * FROM large_table) LOOP
slot := HASH(large_table_row.join_key);
small_table_row = LOOKUP_HASH_TABLE(slot,large_table_row.join_key);
IF small_table_row FOUND THEN
output small_table_row + large_table_row;
END IF;
END LOOP;
hash_inner_and_outer
該函式建立hash join訪問路徑。
//------------------------------------------------ hash_inner_and_outer
/*
* hash_inner_and_outer
* Create hashjoin join paths by explicitly hashing both the outer and
* inner keys of each available hash clause.
* 透過顯式對外表和內表(應用每個可用的hash條件)進行hash操作,建立hash join訪問路徑
*
* 'joinrel' is the join relation
* 'outerrel' is the outer join relation
* 'innerrel' is the inner join relation
* 'jointype' is the type of join to do
* 'extra' contains additional input values
*/
static void
hash_inner_and_outer(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
JoinType jointype,
JoinPathExtraData *extra)
{
JoinType save_jointype = jointype;
bool isouterjoin = IS_OUTER_JOIN(jointype);
List *hashclauses;
ListCell *l;
/*
* We need to build only one hashclauses list for any given pair of outer
* and inner relations; all of the hashable clauses will be used as keys.
* 只需要為給定的外表和內表對構建一個hashclauses條件連結串列;所有的hashable子句將用作hash鍵。
*
* Scan the join's restrictinfo list to find hashjoinable clauses that are
* usable with this pair of sub-relations.
* 掃描連線的約束條件restrictinfo連結串列,找到可用於這對子關係的hash連線hashjoinable子句。
*/
hashclauses = NIL;
foreach(l, extra->restrictlist)
{
RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(l);
/*
* If processing an outer join, only use its own join clauses for
* hashing. For inner joins we need not be so picky.
* 如果處理外連線,則僅使用其自己的連線子句進行雜湊操作。對於內連線,則無需如此操作。
*/
if (isouterjoin && RINFO_IS_PUSHED_DOWN(restrictinfo, joinrel->relids))
continue;
if (!restrictinfo->can_join ||
restrictinfo->hashjoinoperator == InvalidOid)
continue; /* 不能被hash.not hashjoinable */
/*
* Check if clause has the form "outer op inner" or "inner op outer".
* 檢查條件是否有形如outer op inner或者inner op outer的形式
*/
if (!clause_sides_match_join(restrictinfo, outerrel, innerrel))
continue; /* no good for these input relations */
hashclauses = lappend(hashclauses, restrictinfo);//加入到hash條件中
}
/* If we found any usable hashclauses, make paths */
//如發現可用於hash連線的條件,則構建hash連線訪問路徑,如無則無法構建
if (hashclauses)
{
/*
* We consider both the cheapest-total-cost and cheapest-startup-cost
* outer paths. There's no need to consider any but the
* cheapest-total-cost inner path, however.
* 外表:既考慮了成本最低的總成本,也考慮了外表啟動成本最低的訪問路徑。
* 內表:除了成本最低的內部路徑之外,不需要考慮任何其他路徑。
*/
Path *cheapest_startup_outer = outerrel->cheapest_startup_path;
Path *cheapest_total_outer = outerrel->cheapest_total_path;
Path *cheapest_total_inner = innerrel->cheapest_total_path;
/*
* If either cheapest-total path is parameterized by the other rel, we
* can't use a hashjoin. (There's no use looking for alternative
* input paths, since these should already be the least-parameterized
* available paths.)
* 如果其中一個關係引數化了其中一個成本最低的訪問路徑,那麼不能使用hash join。
* (沒有必要尋找替代的輸入路徑,因為這些路徑應該已經是引數化最少的可用路徑了。)
*/
if (PATH_PARAM_BY_REL(cheapest_total_outer, innerrel) ||
PATH_PARAM_BY_REL(cheapest_total_inner, outerrel))
return;//直接退出
/* Unique-ify if need be; we ignore parameterized possibilities */
//如果需要保證唯一性,丟棄引數化
if (jointype == JOIN_UNIQUE_OUTER)
{
cheapest_total_outer = (Path *)
create_unique_path(root, outerrel,
cheapest_total_outer, extra->sjinfo);
Assert(cheapest_total_outer);
jointype = JOIN_INNER;
try_hashjoin_path(root,
joinrel,
cheapest_total_outer,
cheapest_total_inner,
hashclauses,
jointype,
extra);
/* no possibility of cheap startup here */
}
else if (jointype == JOIN_UNIQUE_INNER)
{
cheapest_total_inner = (Path *)
create_unique_path(root, innerrel,
cheapest_total_inner, extra->sjinfo);
Assert(cheapest_total_inner);
jointype = JOIN_INNER;
try_hashjoin_path(root,
joinrel,
cheapest_total_outer,
cheapest_total_inner,
hashclauses,
jointype,
extra);
if (cheapest_startup_outer != NULL &&
cheapest_startup_outer != cheapest_total_outer)
try_hashjoin_path(root,
joinrel,
cheapest_startup_outer,
cheapest_total_inner,
hashclauses,
jointype,
extra);
}
else//其他連線型別
{
/*
* For other jointypes, we consider the cheapest startup outer
* together with the cheapest total inner, and then consider
* pairings of cheapest-total paths including parameterized ones.
* There is no use in generating parameterized paths on the basis
* of possibly cheap startup cost, so this is sufficient.
* 對於其他連線型別,我們考慮成本最低的的外表啟動和內表啟動訪問路徑,
* 然後考慮包括引數化路徑在內的成本最低的訪問路徑對。
* 在基於可能較低的啟動成本的基礎上生成引數化路徑是沒有用的,上面的做法就足夠了。
*/
ListCell *lc1;
ListCell *lc2;
if (cheapest_startup_outer != NULL)//啟動成本最低的外表訪問路徑
try_hashjoin_path(root,
joinrel,
cheapest_startup_outer,
cheapest_total_inner,
hashclauses,
jointype,
extra);//構建hash join訪問路徑
foreach(lc1, outerrel->cheapest_parameterized_paths)//遍歷外表引數化路徑
{
Path *outerpath = (Path *) lfirst(lc1);
/*
* We cannot use an outer path that is parameterized by the
* inner rel.
* 不能使用被內表引數化使用的外表訪問路徑
*/
if (PATH_PARAM_BY_REL(outerpath, innerrel))
continue;
foreach(lc2, innerrel->cheapest_parameterized_paths)//遍歷內表引數化路徑
{
Path *innerpath = (Path *) lfirst(lc2);
/*
* We cannot use an inner path that is parameterized by
* the outer rel, either.
* 同樣的,不能使用被外表引數化使用的內表訪問路徑
*/
if (PATH_PARAM_BY_REL(innerpath, outerrel))
continue;
if (outerpath == cheapest_startup_outer &&
innerpath == cheapest_total_inner)
continue; /* already tried it */
try_hashjoin_path(root,
joinrel,
outerpath,
innerpath,
hashclauses,
jointype,
extra);//構建hash連線訪問路徑
}
}
}
/*
* If the joinrel is parallel-safe, we may be able to consider a
* partial hash join. However, we can't handle JOIN_UNIQUE_OUTER,
* because the outer path will be partial, and therefore we won't be
* able to properly guarantee uniqueness. Similarly, we can't handle
* JOIN_FULL and JOIN_RIGHT, because they can produce false null
* extended rows. Also, the resulting path must not be parameterized.
* We would be able to support JOIN_FULL and JOIN_RIGHT for Parallel
* Hash, since in that case we're back to a single hash table with a
* single set of match bits for each batch, but that will require
* figuring out a deadlock-free way to wait for the probe to finish.
* 如果連線是並行安全的,可以考慮並行雜湊連線。
* 但是,我們不能處理JOIN_UNIQUE_OUTER,因為外部路徑是部分的,因此我們不能正確地保證惟一性。
* 類似地,我們不能處理JOIN_FULL和JOIN_RIGHT,因為它們會產生假空擴充套件行。
* 此外,生成的路徑不能被引數化。
* 我們將能夠支援JOIN_FULL和JOIN_RIGHT用於並行雜湊,
* 因為在這種情況下,我們將返回到一個雜湊表,每個批處理只有一組匹配位,
* 但這需要找到一種沒有死鎖的方式來等待探測完成。
*/
if (joinrel->consider_parallel &&
save_jointype != JOIN_UNIQUE_OUTER &&
save_jointype != JOIN_FULL &&
save_jointype != JOIN_RIGHT &&
outerrel->partial_pathlist != NIL &&
bms_is_empty(joinrel->lateral_relids))
{
Path *cheapest_partial_outer;
Path *cheapest_partial_inner = NULL;
Path *cheapest_safe_inner = NULL;
cheapest_partial_outer =
(Path *) linitial(outerrel->partial_pathlist);
/*
* Can we use a partial inner plan too, so that we can build a
* shared hash table in parallel?
* 我們是否也可以使用部分內表訪問路徑,以便並行構建共享雜湊表?
*/
if (innerrel->partial_pathlist != NIL && enable_parallel_hash)
{
cheapest_partial_inner =
(Path *) linitial(innerrel->partial_pathlist);
try_partial_hashjoin_path(root, joinrel,
cheapest_partial_outer,
cheapest_partial_inner,
hashclauses, jointype, extra,
true /* parallel_hash */ );
}
/*
* Normally, given that the joinrel is parallel-safe, the cheapest
* total inner path will also be parallel-safe, but if not, we'll
* have to search for the cheapest safe, unparameterized inner
* path. If doing JOIN_UNIQUE_INNER, we can't use any alternative
* inner path.
* 通常,假設連線是並行安全的,最便宜的總內表訪問路徑也是並行安全的,
* 但如果不是,我們將不得不尋找成本最低的安全的、非引數化的內表訪問路徑。
* 如果執行JOIN_UNIQUE_INNER,則不能使用任何替代的內表訪問路徑。
*/
if (cheapest_total_inner->parallel_safe)
cheapest_safe_inner = cheapest_total_inner;
else if (save_jointype != JOIN_UNIQUE_INNER)
cheapest_safe_inner =
get_cheapest_parallel_safe_total_inner(innerrel->pathlist);
if (cheapest_safe_inner != NULL)
try_partial_hashjoin_path(root, joinrel,
cheapest_partial_outer,
cheapest_safe_inner,
hashclauses, jointype, extra,
false /* parallel_hash */ );
}
}
}
//----------------------------- try_hashjoin_path
/*
* try_hashjoin_path
* Consider a hash join path; if it appears useful, push it into
* the joinrel's pathlist via add_path().
* 嘗試構造hash join訪問路徑.
* 如果該訪問路徑可用,透過add_path函式新增到連線新生成的關係joinrel中的pathlist連結串列中
*/
static void
try_hashjoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
Path *outer_path,
Path *inner_path,
List *hashclauses,
JoinType jointype,
JoinPathExtraData *extra)
{
Relids required_outer;
JoinCostWorkspace workspace;
/*
* Check to see if proposed path is still parameterized, and reject if the
* parameterization wouldn't be sensible.
* 檢查建議的路徑是否仍然是引數化的,如果引數化不合理,則拒絕。
*
*/
required_outer = calc_non_nestloop_required_outer(outer_path,
inner_path);
if (required_outer &&
!bms_overlap(required_outer, extra->param_source_rels))
{
/* Waste no memory when we reject a path here */
bms_free(required_outer);
return;
}
/*
* See comments in try_nestloop_path(). Also note that hashjoin paths
* never have any output pathkeys, per comments in create_hashjoin_path.
* 參見try_nestloop_path()中的註釋。
* 還要注意,hash join訪問路徑從來沒有任何輸出路徑鍵,參見create_hashjoin_path中的註釋.
*/
initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
outer_path, inner_path, extra, false);//初步估算成本
if (add_path_precheck(joinrel,
workspace.startup_cost, workspace.total_cost,
NIL, required_outer))//初始判斷
{
add_path(joinrel, (Path *)
create_hashjoin_path(root,
joinrel,
jointype,
&workspace,
extra,
outer_path,
inner_path,
false, /* parallel_hash */
extra->restrictlist,
required_outer,
hashclauses));//建立hash join訪問路徑,並新增
}
else
{
/* Waste no memory when we reject a path here */
bms_free(required_outer);
}
}
//------------------ create_hashjoin_path
/*
* create_hashjoin_path
* Creates a pathnode corresponding to a hash join between two relations.
* 建立hash join訪問路徑Node
*
* 'joinrel' is the join relation
* 'jointype' is the type of join required
* 'workspace' is the result from initial_cost_hashjoin
* 'extra' contains various information about the join
* 'outer_path' is the cheapest outer path
* 'inner_path' is the cheapest inner path
* 'parallel_hash' to select Parallel Hash of inner path (shared hash table)
* 'restrict_clauses' are the RestrictInfo nodes to apply at the join
* 'required_outer' is the set of required outer rels
* 'hashclauses' are the RestrictInfo nodes to use as hash clauses
* (this should be a subset of the restrict_clauses list)
*/
HashPath *
create_hashjoin_path(PlannerInfo *root,
RelOptInfo *joinrel,
JoinType jointype,
JoinCostWorkspace *workspace,
JoinPathExtraData *extra,
Path *outer_path,
Path *inner_path,
bool parallel_hash,
List *restrict_clauses,
Relids required_outer,
List *hashclauses)
{
HashPath *pathnode = makeNode(HashPath);
pathnode->jpath.path.pathtype = T_HashJoin;
pathnode->jpath.path.parent = joinrel;
pathnode->jpath.path.pathtarget = joinrel->reltarget;
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
outer_path,
inner_path,
extra->sjinfo,
required_outer,
&restrict_clauses);
pathnode->jpath.path.parallel_aware =
joinrel->consider_parallel && parallel_hash;
pathnode->jpath.path.parallel_safe = joinrel->consider_parallel &&
outer_path->parallel_safe && inner_path->parallel_safe;
/* This is a foolish way to estimate parallel_workers, but for now... */
pathnode->jpath.path.parallel_workers = outer_path->parallel_workers;
/*
* A hashjoin never has pathkeys, since its output ordering is
* unpredictable due to possible batching. XXX If the inner relation is
* small enough, we could instruct the executor that it must not batch,
* and then we could assume that the output inherits the outer relation's
* ordering, which might save a sort step. However there is considerable
* downside if our estimate of the inner relation size is badly off. For
* the moment we don't risk it. (Note also that if we wanted to take this
* seriously, joinpath.c would have to consider many more paths for the
* outer rel than it does now.)
* hashjoin從來沒有路徑鍵,因為由於可能的批處理,其輸出順序不可預測。
* 如果內部關係足夠小,可以指示執行器它不執行批處理,然後可以假設輸出繼承外部關係的順序,這樣可以節省排序步驟。
* 然而,如果對內部關係大小的估計嚴重不足,就會有相當大的負面影響。
* (還要注意,如果我們想認真對待這個問題,那就是joinpath.c將不得不考慮比現在更多的外表訪問路徑。)
*/
pathnode->jpath.path.pathkeys = NIL;
pathnode->jpath.jointype = jointype;
pathnode->jpath.inner_unique = extra->inner_unique;
pathnode->jpath.outerjoinpath = outer_path;
pathnode->jpath.innerjoinpath = inner_path;
pathnode->jpath.joinrestrictinfo = restrict_clauses;
pathnode->path_hashclauses = hashclauses;
/* final_cost_hashjoin will fill in pathnode->num_batches */
final_cost_hashjoin(root, pathnode, workspace, extra);//最終的成本估算
return pathnode;
}
三、跟蹤分析
測試指令碼如下
testdb=# explain verbose select dw.*,grjf.grbh,grjf.xm,grjf.ny,grjf.je
testdb-# from t_dwxx dw,lateral (select gr.grbh,gr.xm,jf.ny,jf.je
testdb(# from t_grxx gr inner join t_jfxx jf
testdb(# on gr.dwbh = dw.dwbh
testdb(# and gr.grbh = jf.grbh) grjf
testdb-# order by dw.dwbh;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Sort (cost=20070.93..20320.93 rows=100000 width=47)
Output: dw.dwmc, dw.dwbh, dw.dwdz, gr.grbh, gr.xm, jf.ny, jf.je
Sort Key: dw.dwbh
-> Hash Join (cost=3754.00..8689.61 rows=100000 width=47)
Output: dw.dwmc, dw.dwbh, dw.dwdz, gr.grbh, gr.xm, jf.ny, jf.je
Inner Unique: true
Hash Cond: ((gr.dwbh)::text = (dw.dwbh)::text)
-> Hash Join (cost=3465.00..8138.00 rows=100000 width=31)
Output: gr.grbh, gr.xm, gr.dwbh, jf.ny, jf.je
Hash Cond: ((jf.grbh)::text = (gr.grbh)::text)
-> Seq Scan on public.t_jfxx jf (cost=0.00..1637.00 rows=100000 width=20)
Output: jf.ny, jf.je, jf.grbh
-> Hash (cost=1726.00..1726.00 rows=100000 width=16)
Output: gr.grbh, gr.xm, gr.dwbh
-> Seq Scan on public.t_grxx gr (cost=0.00..1726.00 rows=100000 width=16)
Output: gr.grbh, gr.xm, gr.dwbh
-> Hash (cost=164.00..164.00 rows=10000 width=20)
Output: dw.dwmc, dw.dwbh, dw.dwdz
-> Seq Scan on public.t_dwxx dw (cost=0.00..164.00 rows=10000 width=20)
Output: dw.dwmc, dw.dwbh, dw.dwdz
(20 rows)
啟動gdb,設定斷點跟蹤
(gdb) b hash_inner_and_outer
Breakpoint 1 at 0x7b066b: file joinpath.c, line 1684.
(gdb) c
Continuing.
Breakpoint 1, hash_inner_and_outer (root=0x2676078, joinrel=0x26d2bc0, outerrel=0x26814e0, innerrel=0x2682a10,
jointype=JOIN_INNER, extra=0x7ffd6ea6b9d0) at joinpath.c:1684
1684 JoinType save_jointype = jointype;
連線型別為JOIN_INNER
(gdb) p jointype
$1 = JOIN_INNER
1號和3號RTE的連線(即t_dwxx和t_grxx)
(gdb) p *joinrel->relids->words
$3 = 10
開始遍歷連線條件,獲取hash連線條件
1697 foreach(l, extra->restrictlist)
(gdb)
1699 RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(l);
成功獲取,t_dwxx.dwbh = t_grxx.dwbh
(gdb)
1697 foreach(l, extra->restrictlist)
(gdb)
1722 if (hashclauses)
(gdb) p *hashclauses
$4 = {type = T_List, length = 1, head = 0x26d4068, tail = 0x26d4068}
獲取成本最低的外表啟動路徑/成本最低的外表訪問路徑/成本最低的內部訪問路徑
分別是外表順序掃描/外表順序掃描/內部順序掃描
(gdb) n
1729 Path *cheapest_startup_outer = outerrel->cheapest_startup_path;
(gdb)
1730 Path *cheapest_total_outer = outerrel->cheapest_total_path;
(gdb)
1731 Path *cheapest_total_inner = innerrel->cheapest_total_path;
(gdb) p *cheapest_startup_outer
$5 = {type = T_Path, pathtype = T_SeqScan, parent = 0x26814e0, pathtarget = 0x2681718, param_info = 0x0,
parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 10000, startup_cost = 0, total_cost = 164,
pathkeys = 0x0}
(gdb) p *cheapest_total_outer
$6 = {type = T_Path, pathtype = T_SeqScan, parent = 0x26814e0, pathtarget = 0x2681718, param_info = 0x0,
parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 10000, startup_cost = 0, total_cost = 164,
pathkeys = 0x0}
(gdb) p *cheapest_total_inner
$7 = {type = T_Path, pathtype = T_SeqScan, parent = 0x2682a10, pathtarget = 0x2682c48, param_info = 0x0,
parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 100000, startup_cost = 0, total_cost = 1726,
pathkeys = 0x0}
如外表成本最低的啟動路徑不為NULL,則嘗試hash連線
(gdb) n
1740 PATH_PARAM_BY_REL(cheapest_total_inner, outerrel))
(gdb)
1739 if (PATH_PARAM_BY_REL(cheapest_total_outer, innerrel) ||
(gdb)
1744 if (jointype == JOIN_UNIQUE_OUTER)
(gdb)
1760 else if (jointype == JOIN_UNIQUE_INNER)
(gdb)
1796 if (cheapest_startup_outer != NULL)
(gdb)
1797 try_hashjoin_path(root,
進入try_hashjoin_path
(gdb) step
try_hashjoin_path (root=0x2676078, joinrel=0x26d2bc0, outer_path=0x26853b8, inner_path=0x26cf610, hashclauses=0x26d4090,
jointype=JOIN_INNER, extra=0x7ffd6ea6b9d0) at joinpath.c:737
737 required_outer = calc_non_nestloop_required_outer(outer_path,
try_hashjoin_path->初步估算成本
...
751 initial_cost_hashjoin(root, &workspace, jointype, hashclauses,
(gdb) p workspace
$9 = {startup_cost = 3465, total_cost = 4261, run_cost = 796, inner_run_cost = 0,
inner_rescan_run_cost = 6.9528109284473596e-310, outer_rows = 3.7882102964330281e-317,
inner_rows = 2.0115578425988515e-316, outer_skip_rows = 2.0115578425988515e-316,
inner_skip_rows = 6.9528109284331305e-310, numbuckets = 131072, numbatches = 2, inner_rows_total = 100000}
try_hashjoin_path->進入函式create_hashjoin_path
(gdb) n
759 create_hashjoin_path(root,
(gdb) step
create_hashjoin_path (root=0x2676078, joinrel=0x26d2bc0, jointype=JOIN_INNER, workspace=0x7ffd6ea6b850,
extra=0x7ffd6ea6b9d0, outer_path=0x26853b8, inner_path=0x26cf610, parallel_hash=false, restrict_clauses=0x26d3098,
required_outer=0x0, hashclauses=0x26d4090) at pathnode.c:2330
2330 HashPath *pathnode = makeNode(HashPath);
try_hashjoin_path->create_hashjoin_path->計算成本並返回
(gdb)
2370 final_cost_hashjoin(root, pathnode, workspace, extra);
(gdb)
2372 return pathnode;
(gdb)
2373 }
(gdb) p *pathnode
$10 = {jpath = {path = {type = T_HashPath, pathtype = T_HashJoin, parent = 0x26d2bc0, pathtarget = 0x26d2df8,
param_info = 0x0, parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 100000,
startup_cost = 3465, total_cost = 5386, pathkeys = 0x0}, jointype = JOIN_INNER, inner_unique = false,
outerjoinpath = 0x26853b8, innerjoinpath = 0x26cf610, joinrestrictinfo = 0x26d3098}, path_hashclauses = 0x26d4090,
num_batches = 2, inner_rows_total = 100000}
try_hashjoin_path->新增路徑
(gdb) n
try_hashjoin_path (root=0x2676078, joinrel=0x26d2bc0, outer_path=0x26853b8, inner_path=0x26cf610, hashclauses=0x26d4090,
jointype=JOIN_INNER, extra=0x7ffd6ea6b9d0) at joinpath.c:758
758 add_path(joinrel, (Path *)
(gdb)
776 }
(gdb)
回到hash_inner_and_outer,繼續迴圈
(gdb)
hash_inner_and_outer (root=0x2676078, joinrel=0x26d2bc0, outerrel=0x26814e0, innerrel=0x2682a10, jointype=JOIN_INNER,
extra=0x7ffd6ea6b9d0) at joinpath.c:1805
1805 foreach(lc1, outerrel->cheapest_parameterized_paths)
結束函式呼叫
1904 }
(gdb)
add_paths_to_joinrel (root=0x2676078, joinrel=0x26d2bc0, outerrel=0x26814e0, innerrel=0x2682a10, jointype=JOIN_INNER,
sjinfo=0x7ffd6ea6bac0, restrictlist=0x26d3098) at joinpath.c:315
315 if (joinrel->fdwroutine &&
(gdb) p *joinrel->pathlist
$11 = {type = T_List, length = 2, head = 0x26d4160, tail = 0x26d3e30}
檢視joinrel的路徑連結串列
(gdb) p *(Node *)joinrel->pathlist->head->data.ptr_value
$12 = {type = T_HashPath}
(gdb) p *(Node *)joinrel->pathlist->head->next->data.ptr_value
$13 = {type = T_MergePath}
(gdb) p *(HashPath *)joinrel->pathlist->head->data.ptr_value
$14 = {jpath = {path = {type = T_HashPath, pathtype = T_HashJoin, parent = 0x26d2bc0, pathtarget = 0x26d2df8,
param_info = 0x0, parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 100000,
startup_cost = 3465, total_cost = 5386, pathkeys = 0x0}, jointype = JOIN_INNER, inner_unique = false,
outerjoinpath = 0x26853b8, innerjoinpath = 0x26cf610, joinrestrictinfo = 0x26d3098}, path_hashclauses = 0x26d4090,
num_batches = 2, inner_rows_total = 100000}
(gdb) p *(MergePath *)joinrel->pathlist->head->next->data.ptr_value
$15 = {jpath = {path = {type = T_MergePath, pathtype = T_MergeJoin, parent = 0x26d2bc0, pathtarget = 0x26d2df8,
param_info = 0x0, parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 100000,
startup_cost = 10035.66023721841, total_cost = 11955.396048959938, pathkeys = 0x2685928}, jointype = JOIN_INNER,
inner_unique = false, outerjoinpath = 0x26ce070, innerjoinpath = 0x26cf610, joinrestrictinfo = 0x26d3098},
path_mergeclauses = 0x26d3eb8, outersortkeys = 0x0, innersortkeys = 0x26d3f18, skip_mark_restore = false,
materialize_inner = false}
DONE!
函式initial_cost_hashjoin和final_cost_hashjoin在下一小節介紹.
四、參考資料
allpaths.c
cost.h
costsize.c
PG Document:Query Planning
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/6906/viewspace-2374828/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- PostgreSQL 原始碼解讀(55)- 查詢語句#40(make_one_rel函式#5-索...SQL原始碼函式
- PostgreSQL 原始碼解讀(65)- 查詢語句#50(make_one_rel函式#15-...SQL原始碼函式
- PostgreSQL 原始碼解讀(62)- 查詢語句#47(make_one_rel函式#12-...SQL原始碼函式
- PostgreSQL 原始碼解讀(63)- 查詢語句#48(make_one_rel函式#13-...SQL原始碼函式
- PostgreSQL 原始碼解讀(64)- 查詢語句#49(make_one_rel函式#14-...SQL原始碼函式
- PostgreSQL 原始碼解讀(60)- 查詢語句#45(make_one_rel函式#10-...SQL原始碼函式
- PostgreSQL 原始碼解讀(61)- 查詢語句#46(make_one_rel函式#11-...SQL原始碼函式
- PostgreSQL 原始碼解讀(69)- 查詢語句#54(make_one_rel函式#19-...SQL原始碼函式
- PostgreSQL 原始碼解讀(66)- 查詢語句#51(make_one_rel函式#16-...SQL原始碼函式
- PostgreSQL 原始碼解讀(67)- 查詢語句#52(make_one_rel函式#17-...SQL原始碼函式
- PostgreSQL 原始碼解讀(68)- 查詢語句#53(make_one_rel函式#18-...SQL原始碼函式
- PostgreSQL 原始碼解讀(71)- 查詢語句#56(make_one_rel函式#21-...SQL原始碼函式
- PostgreSQL 原始碼解讀(72)- 查詢語句#57(make_one_rel函式#22-...SQL原始碼函式
- PostgreSQL 原始碼解讀(52)- 查詢語句#37(make_one_rel函式#2-估...SQL原始碼函式
- PostgreSQL 原始碼解讀(53)- 查詢語句#38(make_one_rel函式#3-順...SQL原始碼函式
- PostgreSQL 原始碼解讀(54)- 查詢語句#39(make_one_rel函式#4-生...SQL原始碼函式
- PostgreSQL 原始碼解讀(56)- 查詢語句#41(make_one_rel函式#6-索...SQL原始碼函式
- PostgreSQL 原始碼解讀(57)- 查詢語句#42(make_one_rel函式#7-索...SQL原始碼函式
- PostgreSQL 原始碼解讀(58)- 查詢語句#43(make_one_rel函式#8-B...SQL原始碼函式
- PostgreSQL 原始碼解讀(59)- 查詢語句#44(make_one_rel函式#9-B...SQL原始碼函式
- PostgreSQL 原始碼解讀(49)- 查詢語句#34(make_one_rel函式#1-概覽)SQL原始碼函式
- PostgreSQL 原始碼解讀(83)- 查詢語句#68(PortalStart函式)SQL原始碼函式
- PostgreSQL 原始碼解讀(90)- 查詢語句#75(ExecHashJoin函式#1)SQL原始碼函式
- PostgreSQL 原始碼解讀(91)- 查詢語句#76(ExecHashJoin函式#2)SQL原始碼函式
- PostgreSQL 原始碼解讀(93)- 查詢語句#77(ExecHashJoin函式#3)SQL原始碼函式
- PostgreSQL 原始碼解讀(85)- 查詢語句#70(PortalRun->InitPla...SQL原始碼
- PostgreSQL 原始碼解讀(52)- 查詢語句#37(make_one_rel函式#2-估算關係大小)SQL原始碼函式
- PostgreSQL 原始碼解讀(95)- 查詢語句#78(ExecHashJoin函式#4-H...SQL原始碼函式
- PostgreSQL 原始碼解讀(97)- 查詢語句#79(ExecHashJoin函式#5-H...SQL原始碼函式
- PostgreSQL 原始碼解讀(88)- 查詢語句#73(SeqNext函式#1)SQL原始碼函式
- PostgreSQL 原始碼解讀(89)- 查詢語句#74(SeqNext函式#2)SQL原始碼函式
- PostgreSQL 原始碼解讀(46)- 查詢語句#31(query_planner函式#7)SQL原始碼函式
- PostgreSQL 原始碼解讀(47)- 查詢語句#32(query_planner函式#8)SQL原始碼函式
- PostgreSQL 原始碼解讀(48)- 查詢語句#33(query_planner函式#9)SQL原始碼函式
- PostgreSQL 原始碼解讀(41)- 查詢語句#26(query_planner函式#4)SQL原始碼函式
- PostgreSQL 原始碼解讀(40)- 查詢語句#25(query_planner函式#3)SQL原始碼函式
- PostgreSQL 原始碼解讀(43)- 查詢語句#28(query_planner函式#5)SQL原始碼函式
- PostgreSQL 原始碼解讀(45)- 查詢語句#30(query_planner函式#6)SQL原始碼函式