PostgreSQL 原始碼解讀(58)- 查詢語句#43(make_one_rel函式#8-B...
這一小節繼續介紹查詢物理最佳化中的create_index_paths->create_bitmap_heap_path函式,該函式建立點陣圖堆掃描訪問路徑節點。
關於BitmapHeapScan的相關知識,請參照PostgreSQL DBA(6) - SeqScan vs IndexScan vs BitmapHeapScan這篇文章.
本節沒有描述具體的Cost成本計算方法(公式),後續再行詳述。
一、資料結構
Cost相關
注意:實際使用的引數值透過系統配置檔案定義,而不是這裡的常量定義!
typedef double Cost; /* execution cost (in page-access units) */
/* defaults for costsize.c's Cost parameters */
/* NB: cost-estimation code should use the variables, not these constants! */
/* 注意:實際值透過系統配置檔案定義,而不是這裡的常量定義! */
/* If you change these, update backend/utils/misc/postgresql.sample.conf */
#define DEFAULT_SEQ_PAGE_COST 1.0 //順序掃描page的成本
#define DEFAULT_RANDOM_PAGE_COST 4.0 //隨機掃描page的成本
#define DEFAULT_CPU_TUPLE_COST 0.01 //處理一個元組的CPU成本
#define DEFAULT_CPU_INDEX_TUPLE_COST 0.005 //處理一個索引元組的CPU成本
#define DEFAULT_CPU_OPERATOR_COST 0.0025 //執行一次操作或函式的CPU成本
#define DEFAULT_PARALLEL_TUPLE_COST 0.1 //並行執行,從一個worker傳輸一個元組到另一個worker的成本
#define DEFAULT_PARALLEL_SETUP_COST 1000.0 //構建並行執行環境的成本
#define DEFAULT_EFFECTIVE_CACHE_SIZE 524288 /*先前已有介紹, measured in pages */
double seq_page_cost = DEFAULT_SEQ_PAGE_COST;
double random_page_cost = DEFAULT_RANDOM_PAGE_COST;
double cpu_tuple_cost = DEFAULT_CPU_TUPLE_COST;
double cpu_index_tuple_cost = DEFAULT_CPU_INDEX_TUPLE_COST;
double cpu_operator_cost = DEFAULT_CPU_OPERATOR_COST;
double parallel_tuple_cost = DEFAULT_PARALLEL_TUPLE_COST;
double parallel_setup_cost = DEFAULT_PARALLEL_SETUP_COST;
int effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;
Cost disable_cost = 1.0e10;//1後面10個0,透過設定一個巨大的成本,讓最佳化器自動放棄此路徑
int max_parallel_workers_per_gather = 2;//每次gather使用的worker數
二、原始碼解讀
create_bitmap_heap_path函式
create_index_paths->create_bitmap_heap_path函式,建立點陣圖堆掃描訪問路徑節點.
/*
* create_bitmap_heap_path
* Creates a path node for a bitmap scan.
* 建立點陣圖堆掃描訪問路徑節點
*
* 'bitmapqual' is a tree of IndexPath, BitmapAndPath, and BitmapOrPath nodes.
* bitmapqual-IndexPath, BitmapAndPath, and BitmapOrPath節點組成的樹
* 'required_outer' is the set of outer relids for a parameterized path.
* required_outer-引數化路徑中依賴的外部relids
* 'loop_count' is the number of repetitions of the indexscan to factor into
* estimates of caching behavior.
* loop_count-上一節已介紹
*
* loop_count should match the value used when creating the component
* IndexPaths.
*/
BitmapHeapPath *
create_bitmap_heap_path(PlannerInfo *root,
RelOptInfo *rel,
Path *bitmapqual,
Relids required_outer,
double loop_count,
int parallel_degree)
{
BitmapHeapPath *pathnode = makeNode(BitmapHeapPath);//建立節點
pathnode->path.pathtype = T_BitmapHeapScan;
pathnode->path.parent = rel;
pathnode->path.pathtarget = rel->reltarget;
pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->path.parallel_aware = parallel_degree > 0 ? true : false;
pathnode->path.parallel_safe = rel->consider_parallel;
pathnode->path.parallel_workers = parallel_degree;
pathnode->path.pathkeys = NIL; /* always unordered */
pathnode->bitmapqual = bitmapqual;
cost_bitmap_heap_scan(&pathnode->path, root, rel,
pathnode->path.param_info,
bitmapqual, loop_count);//成本估算
return pathnode;//返回結果
}
//-------------------------------------------------------- cost_bitmap_heap_scan
/*
* cost_bitmap_heap_scan
* Determines and returns the cost of scanning a relation using a bitmap
* index-then-heap plan.
* 確定並返回使用BitmapIndexScan和BitmapHeapScan的成本.
*
* 'baserel' is the relation to be scanned
* baserel-需掃描的Relation
* 'param_info' is the ParamPathInfo if this is a parameterized path, else NULL
* param_info-引數化資訊
* 'bitmapqual' is a tree of IndexPaths, BitmapAndPaths, and BitmapOrPaths
* bitmapqual-點陣圖條件表示式,IndexPath, BitmapAndPath, and BitmapOrPath節點組成的樹
* 'loop_count' is the number of repetitions of the indexscan to factor into
* estimates of caching behavior
*
* Note: the component IndexPaths in bitmapqual should have been costed
* using the same loop_count.
*/
void
cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
ParamPathInfo *param_info,
Path *bitmapqual, double loop_count)
{
Cost startup_cost = 0;//啟動成本
Cost run_cost = 0;//執行成本
Cost indexTotalCost;//索引掃描總成本
QualCost qpqual_cost;//表示式成本
Cost cpu_per_tuple;
Cost cost_per_page;
Cost cpu_run_cost;
double tuples_fetched;
double pages_fetched;
double spc_seq_page_cost,
spc_random_page_cost;
double T;
/* Should only be applied to base relations */
Assert(IsA(baserel, RelOptInfo));
Assert(baserel->relid > 0);
Assert(baserel->rtekind == RTE_RELATION);
/* Mark the path with the correct row estimate */
if (param_info)
path->rows = param_info->ppi_rows;
else
path->rows = baserel->rows;
if (!enable_bitmapscan)//不允許點陣圖掃描
startup_cost += disable_cost;//禁用之
pages_fetched = compute_bitmap_pages(root, baserel, bitmapqual,
loop_count, &indexTotalCost,
&tuples_fetched);//計算頁面數
startup_cost += indexTotalCost;//啟動成本為BitmapIndexScan的總成本
T = (baserel->pages > 1) ? (double) baserel->pages : 1.0;//頁面數
/* Fetch estimated page costs for tablespace containing table. */
get_tablespace_page_costs(baserel->reltablespace,
&spc_random_page_cost,
&spc_seq_page_cost);//訪問表空間頁面成本
/*
* For small numbers of pages we should charge spc_random_page_cost
* apiece, while if nearly all the table's pages are being read, it's more
* appropriate to charge spc_seq_page_cost apiece. The effect is
* nonlinear, too. For lack of a better idea, interpolate like this to
* determine the cost per page.
* 對於少量的頁面,每個頁面的成本為spc_random_page_cost,
* 而如果幾乎所有的頁面都被讀取,則每個頁面的成本為spc_seq_page_cost。
* 這種影響也是非線性的。由於缺乏更好的方法,透過插值法確定每頁的成本。
*/
if (pages_fetched >= 2.0)
cost_per_page = spc_random_page_cost -
(spc_random_page_cost - spc_seq_page_cost)
* sqrt(pages_fetched / T);
else
cost_per_page = spc_random_page_cost;
run_cost += pages_fetched * cost_per_page;//執行成本
/*
* Estimate CPU costs per tuple.
* 為每個元組估算CPU成本(Rechck步驟的成本)
*
* Often the indexquals don't need to be rechecked at each tuple ... but
* not always, especially not if there are enough tuples involved that the
* bitmaps become lossy. For the moment, just assume they will be
* rechecked always. This means we charge the full freight for all the
* scan clauses.
* 通常情況下,索引約束條件不需要在每個元組上重新檢查,但現實並非如此理想,
* 尤其是當涉及到較多的元組時。就目前而言,
* 最佳化器會假設它們總是會被重新檢查。這意味著我們需要為所有掃描條件計算成本。
*/
get_restriction_qual_cost(root, baserel, param_info, &qpqual_cost);//獲取條件表示式
startup_cost += qpqual_cost.startup;//增加啟動成本
cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;//增加處理每個元組的CPU成本
cpu_run_cost = cpu_per_tuple * tuples_fetched;//CPU執行成本
/* Adjust costing for parallelism, if used. */
if (path->parallel_workers > 0)//是否並行?
{
double parallel_divisor = get_parallel_divisor(path);
/* The CPU cost is divided among all the workers. */
cpu_run_cost /= parallel_divisor;
path->rows = clamp_row_est(path->rows / parallel_divisor);
}
//計算最終成本
run_cost += cpu_run_cost;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->pathtarget->cost.startup;
run_cost += path->pathtarget->cost.per_tuple * path->rows;
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
}
//--------------------------------------- compute_bitmap_pages
/*
* compute_bitmap_pages
*
* compute number of pages fetched from heap in bitmap heap scan.
* 計算頁面數
*/
double
compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
int loop_count, Cost *cost, double *tuple)
{
Cost indexTotalCost;
Selectivity indexSelectivity;
double T;
double pages_fetched;
double tuples_fetched;
double heap_pages;
long maxentries;
/*
* Fetch total cost of obtaining the bitmap, as well as its total
* selectivity.
* 獲取點陣圖的總成本,以及它的總選擇性。
*/
cost_bitmap_tree_node(bitmapqual, &indexTotalCost, &indexSelectivity);
/*
* Estimate number of main-table pages fetched.
* 估算主表的頁面數
*/
tuples_fetched = clamp_row_est(indexSelectivity * baserel->tuples);//計算總元組數
T = (baserel->pages > 1) ? (double) baserel->pages : 1.0;
/*
* For a single scan, the number of heap pages that need to be fetched is
* the same as the Mackert and Lohman formula for the case T <= b (ie, no
* re-reads needed).
* 對於單個掃描,需要獲取的堆頁面數量與T <= b(即不需要重新讀取)的Mackert和Lohman公式相同。
*/
pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
/*
* Calculate the number of pages fetched from the heap. Then based on
* current work_mem estimate get the estimated maxentries in the bitmap.
* (Note that we always do this calculation based on the number of pages
* that would be fetched in a single iteration, even if loop_count > 1.
* That's correct, because only that number of entries will be stored in
* the bitmap at one time.)
* 計算從堆中讀取的頁面數.
* 根據當前的work_mem估算得到點陣圖中粗略的最大訪問入口(entries)。
* (請注意,我們總是根據單個迭代中獲取的頁面數來進行計算,
* 即使loop_count > 1也是如此。因為只有該數量的條目在點陣圖中只儲存一次。
*/
heap_pages = Min(pages_fetched, baserel->pages);//堆頁面數
maxentries = tbm_calculate_entries(work_mem * 1024L);//點陣圖最大入口數
if (loop_count > 1)
{
/*
* For repeated bitmap scans, scale up the number of tuples fetched in
* the Mackert and Lohman formula by the number of scans, so that we
* estimate the number of pages fetched by all the scans. Then
* pro-rate for one scan.
*/
pages_fetched = index_pages_fetched(tuples_fetched * loop_count,
baserel->pages,
get_indexpath_pages(bitmapqual),
root);
pages_fetched /= loop_count;
}
if (pages_fetched >= T)
pages_fetched = T;//資料字典中的頁面數
else
pages_fetched = ceil(pages_fetched);
if (maxentries < heap_pages)//最大入口數小於堆頁面數
{
double exact_pages;
double lossy_pages;
/*
* Crude approximation of the number of lossy pages. Because of the
* way tbm_lossify() is coded, the number of lossy pages increases
* very sharply as soon as we run short of memory; this formula has
* that property and seems to perform adequately in testing, but it's
* possible we could do better somehow.
* 粗略估計缺頁的數目。由於tbm_lossify()的編碼方式,
* 一旦記憶體不足,缺頁的數量就會急劇增加;
* 這個公式有這個性質,在測試中表現得很好,但有可能做得更好。
*/
lossy_pages = Max(0, heap_pages - maxentries / 2);
exact_pages = heap_pages - lossy_pages;
/*
* If there are lossy pages then recompute the number of tuples
* processed by the bitmap heap node. We assume here that the chance
* of a given tuple coming from an exact page is the same as the
* chance that a given page is exact. This might not be true, but
* it's not clear how we can do any better.
* 如果存在缺頁面,則重新計算點陣圖堆節點處理的元組數量。
* 這裡假設給定元組來自精確頁面的機率與給定頁面的機率相同。
* 但這可能不符合實際情況,但我們不清楚如何才能做得更好:(
*/
if (lossy_pages > 0)
tuples_fetched =
clamp_row_est(indexSelectivity *
(exact_pages / heap_pages) * baserel->tuples +
(lossy_pages / heap_pages) * baserel->tuples);
}
if (cost)
*cost = indexTotalCost;
if (tuple)
*tuple = tuples_fetched;
return pages_fetched;
}
//--------------------------- tbm_calculate_entries
/*
* tbm_calculate_entries
*
* Estimate number of hashtable entries we can have within maxbytes.
*/
long
tbm_calculate_entries(double maxbytes)
{
long nbuckets;
/*
* Estimate number of hashtable entries we can have within maxbytes. This
* estimates the hash cost as sizeof(PagetableEntry), which is good enough
* for our purpose. Also count an extra Pointer per entry for the arrays
* created during iteration readout.
* 估計maxbytes中可以包含的雜湊表條目的數量。
* 這將雜湊成本估計為sizeof(PagetableEntry),這已經足夠好了。
* 還要為迭代讀出期間建立的陣列中每個條目計算額外的指標。
*/
nbuckets = maxbytes /
(sizeof(PagetableEntry) + sizeof(Pointer) + sizeof(Pointer));//桶數
nbuckets = Min(nbuckets, INT_MAX - 1); /* safety limit */
nbuckets = Max(nbuckets, 16); /* sanity limit */
return nbuckets;
}
//--------------------------- cost_bitmap_tree_node
/*
* cost_bitmap_tree_node
* Extract cost and selectivity from a bitmap tree node (index/and/or)
*/
void
cost_bitmap_tree_node(Path *path, Cost *cost, Selectivity *selec)
{
if (IsA(path, IndexPath))//索引訪問路徑
{
*cost = ((IndexPath *) path)->indextotalcost;
*selec = ((IndexPath *) path)->indexselectivity;
/*
* Charge a small amount per retrieved tuple to reflect the costs of
* manipulating the bitmap. This is mostly to make sure that a bitmap
* scan doesn't look to be the same cost as an indexscan to retrieve a
* single tuple.
* 對每個檢索到的元組計算少量成本,以反映操作點陣圖的成本。
* 這主要是為了確保點陣圖掃描與索引掃描檢索單個元組的成本不一樣。
*/
*cost += 0.1 * cpu_operator_cost * path->rows;
}
else if (IsA(path, BitmapAndPath))//BitmapAndPath
{
*cost = path->total_cost;
*selec = ((BitmapAndPath *) path)->bitmapselectivity;
}
else if (IsA(path, BitmapOrPath))//BitmapOrPath
{
*cost = path->total_cost;
*selec = ((BitmapOrPath *) path)->bitmapselectivity;
}
else
{
elog(ERROR, "unrecognized node type: %d", nodeTag(path));
*cost = *selec = 0; /* keep compiler quiet */
}
}
三、跟蹤分析
測試指令碼如下
select t1.*
from t_dwxx t1
where dwbh > '10000' and dwbh < '30000';
啟動gdb跟蹤
(gdb) b create_bitmap_heap_path
Breakpoint 1 at 0x78f1c1: file pathnode.c, line 1090.
(gdb) c
Continuing.
Breakpoint 1, create_bitmap_heap_path (root=0x23d93d8, rel=0x248a788, bitmapqual=0x2473a08, required_outer=0x0,
loop_count=1, parallel_degree=0) at pathnode.c:1090
1090 BitmapHeapPath *pathnode = makeNode(BitmapHeapPath);
建立節點,並賦值
1090 BitmapHeapPath *pathnode = makeNode(BitmapHeapPath);
(gdb) n
1092 pathnode->path.pathtype = T_BitmapHeapScan;
(gdb) n
1093 pathnode->path.parent = rel;
(gdb) n
1094 pathnode->path.pathtarget = rel->reltarget;
(gdb) n
1095 pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
(gdb)
1097 pathnode->path.parallel_aware = parallel_degree > 0 ? true : false;
(gdb)
1098 pathnode->path.parallel_safe = rel->consider_parallel;
(gdb)
1099 pathnode->path.parallel_workers = parallel_degree;
(gdb)
1100 pathnode->path.pathkeys = NIL; /* always unordered */
(gdb)
1102 pathnode->bitmapqual = bitmapqual;
進入cost_bitmap_heap_scan函式
(gdb)
1104 cost_bitmap_heap_scan(&pathnode->path, root, rel,
(gdb) step
cost_bitmap_heap_scan (path=0x24737d8, root=0x23d93d8, baserel=0x248a788, param_info=0x0, bitmapqual=0x2473a08,
loop_count=1) at costsize.c:949
949 Cost startup_cost = 0;
輸入引數,其中bitmapqual為T_IndexPath節點
路徑的其他關鍵資訊:rows = 2223, startup_cost = 0.28500000000000003, total_cost = 169.23871600907944
(gdb) p *(IndexPath *)bitmapqual
$2 = {path = {type = T_IndexPath, pathtype = T_IndexScan, parent = 0x248a788, pathtarget = 0x248a998, param_info = 0x0,
parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 2223, startup_cost = 0.28500000000000003,
total_cost = 169.23871600907944, pathkeys = 0x0}, indexinfo = 0x23b63b8, indexclauses = 0x2473948,
indexquals = 0x2473b38, indexqualcols = 0x2473b88, indexorderbys = 0x0, indexorderbycols = 0x0,
indexscandir = ForwardScanDirection, indextotalcost = 50.515000000000001, indexselectivity = 0.22227191011235958}
開始計算成本
...
980 startup_cost += indexTotalCost;
(gdb) p indexTotalCost
$16 = 51.070750000000004
(gdb) p startup_cost
$17 = 0
(gdb) p pages_fetched
$18 = 64
(gdb) p baserel->pages
$19 = 64
...
(gdb) p qpqual_cost
$20 = {startup = 0, per_tuple = 0.0050000000000000001}
最終的訪問路徑資訊
(gdb) p *(BitmapHeapPath *)path
$22 = {path = {type = T_BitmapHeapPath, pathtype = T_BitmapHeapScan, parent = 0x248a788, pathtarget = 0x248a998,
param_info = 0x0, parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 2223,
startup_cost = 51.070750000000004, total_cost = 148.41575, pathkeys = 0x0}, bitmapqual = 0x2473a08}
除了BitmapHeapPath,還有BitmapOr和BitmapAnd,這兩種Path的解析後續再詳述.
四、參考資料
allpaths.c
cost.h
costsize.c
PG Document:Query Planning
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/6906/viewspace-2374849/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- PostgreSQL 原始碼解讀(73)- 查詢語句#58(grouping_planner函式...SQL原始碼函式
- PostgreSQL 原始碼解讀(43)- 查詢語句#28(query_planner函式#5)SQL原始碼函式
- PostgreSQL 原始碼解讀(70)- 查詢語句#55(make_one_rel函式#20-...SQL原始碼函式
- PostgreSQL 原始碼解讀(71)- 查詢語句#56(make_one_rel函式#21-...SQL原始碼函式
- PostgreSQL 原始碼解讀(67)- 查詢語句#52(make_one_rel函式#17-...SQL原始碼函式
- PostgreSQL 原始碼解讀(68)- 查詢語句#53(make_one_rel函式#18-...SQL原始碼函式
- PostgreSQL 原始碼解讀(69)- 查詢語句#54(make_one_rel函式#19-...SQL原始碼函式
- PostgreSQL 原始碼解讀(66)- 查詢語句#51(make_one_rel函式#16-...SQL原始碼函式
- PostgreSQL 原始碼解讀(72)- 查詢語句#57(make_one_rel函式#22-...SQL原始碼函式
- PostgreSQL 原始碼解讀(65)- 查詢語句#50(make_one_rel函式#15-...SQL原始碼函式
- PostgreSQL 原始碼解讀(62)- 查詢語句#47(make_one_rel函式#12-...SQL原始碼函式
- PostgreSQL 原始碼解讀(63)- 查詢語句#48(make_one_rel函式#13-...SQL原始碼函式
- PostgreSQL 原始碼解讀(64)- 查詢語句#49(make_one_rel函式#14-...SQL原始碼函式
- PostgreSQL 原始碼解讀(60)- 查詢語句#45(make_one_rel函式#10-...SQL原始碼函式
- PostgreSQL 原始碼解讀(61)- 查詢語句#46(make_one_rel函式#11-...SQL原始碼函式
- PostgreSQL 原始碼解讀(52)- 查詢語句#37(make_one_rel函式#2-估...SQL原始碼函式
- PostgreSQL 原始碼解讀(53)- 查詢語句#38(make_one_rel函式#3-順...SQL原始碼函式
- PostgreSQL 原始碼解讀(54)- 查詢語句#39(make_one_rel函式#4-生...SQL原始碼函式
- PostgreSQL 原始碼解讀(55)- 查詢語句#40(make_one_rel函式#5-索...SQL原始碼函式
- PostgreSQL 原始碼解讀(56)- 查詢語句#41(make_one_rel函式#6-索...SQL原始碼函式
- PostgreSQL 原始碼解讀(57)- 查詢語句#42(make_one_rel函式#7-索...SQL原始碼函式
- PostgreSQL 原始碼解讀(59)- 查詢語句#44(make_one_rel函式#9-B...SQL原始碼函式
- PostgreSQL 原始碼解讀(49)- 查詢語句#34(make_one_rel函式#1-概覽)SQL原始碼函式
- PostgreSQL 原始碼解讀(83)- 查詢語句#68(PortalStart函式)SQL原始碼函式
- PostgreSQL 原始碼解讀(82)- 查詢語句#67(PortalXXX系列函式)SQL原始碼函式
- PostgreSQL 原始碼解讀(89)- 查詢語句#74(SeqNext函式#2)SQL原始碼函式
- PostgreSQL 原始碼解讀(90)- 查詢語句#75(ExecHashJoin函式#1)SQL原始碼函式
- PostgreSQL 原始碼解讀(91)- 查詢語句#76(ExecHashJoin函式#2)SQL原始碼函式
- PostgreSQL 原始碼解讀(88)- 查詢語句#73(SeqNext函式#1)SQL原始碼函式
- PostgreSQL 原始碼解讀(93)- 查詢語句#77(ExecHashJoin函式#3)SQL原始碼函式
- PostgreSQL 原始碼解讀(52)- 查詢語句#37(make_one_rel函式#2-估算關係大小)SQL原始碼函式
- PostgreSQL 原始碼解讀(95)- 查詢語句#78(ExecHashJoin函式#4-H...SQL原始碼函式
- PostgreSQL 原始碼解讀(97)- 查詢語句#79(ExecHashJoin函式#5-H...SQL原始碼函式
- PostgreSQL 原始碼解讀(45)- 查詢語句#30(query_planner函式#6)SQL原始碼函式
- PostgreSQL 原始碼解讀(46)- 查詢語句#31(query_planner函式#7)SQL原始碼函式
- PostgreSQL 原始碼解讀(47)- 查詢語句#32(query_planner函式#8)SQL原始碼函式
- PostgreSQL 原始碼解讀(48)- 查詢語句#33(query_planner函式#9)SQL原始碼函式
- PostgreSQL 原始碼解讀(38)- 查詢語句#23(query_planner函式#1)SQL原始碼函式