通過10104閱讀hash join工作機制 ---OnePass
三、Onepass hash join
1、記憶體hash表總體資訊
kxhfWrite: hash-join is spilling to disk
-- 發生了dump disk的動作
*** RowSrcId: 1 HASH JOIN BUILD HASH TABLE (PHASE 1) ***
Total number of partitions: 8
--hash表分為8個分割槽
Number of partitions which could fit in memory: 1
--適合存放在記憶體中分割槽數量為1
Number of partitions left in memory: 1
--留在記憶體分割槽數量為1
Total number of slots in in-memory partitions: 4
--在記憶體中分割槽的slot數量為4
Total number of rows in in-memory partitions: 222
--記憶體分割槽中存放小表的行數為222.
Estimated max # of build rows that can fit in avail memory: 1872
--評估所分配記憶體,最大能存放小表的行數位1872,測試中,小表的行數為2000行,所以可用預估,需要onepass hash join
2、分割槽表具體存放行數的資訊
### Partition Distribution ###
Partition:0 rows:249 clusters:4 slots:1 kept=0
Partition:1 rows:272 clusters:4 slots:1 kept=0
Partition:2 rows:243 clusters:4 slots:1 kept=0
Partition:3 rows:269 clusters:4 slots:1 kept=0
Partition:4 rows:251 clusters:4 slots:1 kept=0
Partition:5 rows:249 clusters:4 slots:1 kept=0
Partition:6 rows:222 clusters:4 slots:4 kept=1
Partition:7 rows:245 clusters:4 slots:2 kept=0
--存放2000行資料小表分佈在8個分割槽,目前partition 6儲存在記憶體中(kept=1),其他的存放在磁碟中。
Number of buckets: 512
--hash 表有512個hash bucket管理
3、hash表的具體存放資訊:
352+114+36+5+4+1=512
### Hash table ###
# NOTE: The calculated number of rows in non-empty buckets may be smaller
# than the true number.
Number of buckets with 0 rows: 352
Number of buckets with 1 rows: 114
Number of buckets with 2 rows: 36
Number of buckets with 3 rows: 5
Number of buckets with 4 rows: 4
Number of buckets with 5 rows: 1
Number of buckets with 6 rows: 0
Number of buckets with 7 rows: 0
Number of buckets with 8 rows: 0
Number of buckets with 9 rows: 0
Number of buckets with between 10 and 19 rows: 0
Number of buckets with between 20 and 29 rows: 0
Number of buckets with between 30 and 39 rows: 0
Number of buckets with between 40 and 49 rows: 0
Number of buckets with between 50 and 59 rows: 0
Number of buckets with between 60 and 69 rows: 0
Number of buckets with between 70 and 79 rows: 0
Number of buckets with between 80 and 89 rows: 0
Number of buckets with between 90 and 99 rows: 0
Number of buckets with 100 or more rows: 0
對應的行數量=114 + 26*2 + 5*3 +4*4 + 1*5=202,我們發現計算的行數,正如所註釋一樣,少於實際行數,但不清楚為什麼存在此現象?
4、partition 6的hash 表統計資訊
Total buckets: 512
Empty buckets: 352
Non-empty buckets: 160
Total number of rows: 222
Maximum number of rows in a bucket: 5
Average number of rows in non-empty buckets: 1.387500
5、大表對應小表partition 6進行過濾
Used bitmap filtering: filtered rows=23 minimum required=50 out f=1000
Used bitmap filtering: filtered rows=237 minimum required=50 out f=1000
Used bitmap filtering: filtered rows=448 minimum required=50 out f=1000
..............................
以上第一階段與optimal hash join機制基本相同。以下是onepass處理的第二階段
1、從磁碟中提取分割槽,進行hash join資訊:
*** RowSrcId: 1 HASH JOIN GET FLUSHED PARTITIONS (PHASE 2) ***
Getting a pair of flushed partions.
BUILD PARTION: nrows:249 size=(4 slots, 288K)
PROBE PARTION: nrows:210 size=(3 slots, 216K)
ROLE REVERSAL OCCURRED
--從磁碟提取的另一個分割槽表,行數為249,我們可以從上面分割槽分配情況得知其為partition 0,對應進行處理的大表行數為210行。我們從早先的跟蹤檔案資訊知道multiblock IO=9,這就意味著4 slots對應著36個blocks。
2、hash join處理重要資訊
Number of blocks that may be used to build the hash hable 72
Number of rows left to be iterated over (start of function): 210
Number of rows iterated over this function call: 210
Number of rows left to be iterated over (end of function): 0
-- 最後一行為0,表示此次處理只需要一次onepass就能完成,如果非0,意味著需要multi-pass處理。
3、partition 0 hash表資訊
Number of buckets with 0 rows: 346
Number of buckets with 1 rows: 132
Number of buckets with 2 rows: 24
Number of buckets with 3 rows: 10
Number of buckets with 4 rows: 0
Number of buckets with 5 rows: 0
Number of buckets with 6 rows: 0
Number of buckets with 7 rows: 0
Number of buckets with 8 rows: 0
Number of buckets with 9 rows: 0
Number of buckets with between 10 and 19 rows: 0
Number of buckets with between 20 and 29 rows: 0
Number of buckets with between 30 and 39 rows: 0
Number of buckets with between 40 and 49 rows: 0
Number of buckets with between 50 and 59 rows: 0
Number of buckets with between 60 and 69 rows: 0
Number of buckets with between 70 and 79 rows: 0
Number of buckets with between 80 and 89 rows: 0
Number of buckets with between 90 and 99 rows: 0
Number of buckets with 100 or more rows: 0
### Hash table overall statistics ###
Total buckets: 512 Empty buckets: 346 Non-empty buckets: 166
Total number of rows: 210
Maximum number of rows in a bucket: 3
Average number of rows in non-empty buckets: 1.265060
其他分割槽處理與以上類似。
OnePass工作機制:
1、hash 記憶體區域中分配了8個分割槽,每個分割槽對應4K bitmap點陣圖,小錶行存放到hash表中,因為記憶體限制,在記憶體中只保留一個分割槽,其餘分割槽dump到磁碟中。其處理機制與optimal相同。
2、大表按照小表分割槽表進行整理分類,當從磁碟中提取其他分割槽時候,對應提取整理過大表分割槽。獲取所需要的行
3、從跟蹤檔案中,小表處理過程先讀取資料到磁碟,然後將部分資料dump到磁碟,然後再從磁碟讀取到記憶體中,而大表處理過程,也是先從磁碟讀到記憶體,然後dump到磁碟,再從磁碟讀取。對磁碟的寫和讀為1次。
1、記憶體hash表總體資訊
kxhfWrite: hash-join is spilling to disk
-- 發生了dump disk的動作
*** RowSrcId: 1 HASH JOIN BUILD HASH TABLE (PHASE 1) ***
Total number of partitions: 8
--hash表分為8個分割槽
Number of partitions which could fit in memory: 1
--適合存放在記憶體中分割槽數量為1
Number of partitions left in memory: 1
--留在記憶體分割槽數量為1
Total number of slots in in-memory partitions: 4
--在記憶體中分割槽的slot數量為4
Total number of rows in in-memory partitions: 222
--記憶體分割槽中存放小表的行數為222.
Estimated max # of build rows that can fit in avail memory: 1872
--評估所分配記憶體,最大能存放小表的行數位1872,測試中,小表的行數為2000行,所以可用預估,需要onepass hash join
2、分割槽表具體存放行數的資訊
### Partition Distribution ###
Partition:0 rows:249 clusters:4 slots:1 kept=0
Partition:1 rows:272 clusters:4 slots:1 kept=0
Partition:2 rows:243 clusters:4 slots:1 kept=0
Partition:3 rows:269 clusters:4 slots:1 kept=0
Partition:4 rows:251 clusters:4 slots:1 kept=0
Partition:5 rows:249 clusters:4 slots:1 kept=0
Partition:6 rows:222 clusters:4 slots:4 kept=1
Partition:7 rows:245 clusters:4 slots:2 kept=0
--存放2000行資料小表分佈在8個分割槽,目前partition 6儲存在記憶體中(kept=1),其他的存放在磁碟中。
Number of buckets: 512
--hash 表有512個hash bucket管理
3、hash表的具體存放資訊:
352+114+36+5+4+1=512
### Hash table ###
# NOTE: The calculated number of rows in non-empty buckets may be smaller
# than the true number.
Number of buckets with 0 rows: 352
Number of buckets with 1 rows: 114
Number of buckets with 2 rows: 36
Number of buckets with 3 rows: 5
Number of buckets with 4 rows: 4
Number of buckets with 5 rows: 1
Number of buckets with 6 rows: 0
Number of buckets with 7 rows: 0
Number of buckets with 8 rows: 0
Number of buckets with 9 rows: 0
Number of buckets with between 10 and 19 rows: 0
Number of buckets with between 20 and 29 rows: 0
Number of buckets with between 30 and 39 rows: 0
Number of buckets with between 40 and 49 rows: 0
Number of buckets with between 50 and 59 rows: 0
Number of buckets with between 60 and 69 rows: 0
Number of buckets with between 70 and 79 rows: 0
Number of buckets with between 80 and 89 rows: 0
Number of buckets with between 90 and 99 rows: 0
Number of buckets with 100 or more rows: 0
對應的行數量=114 + 26*2 + 5*3 +4*4 + 1*5=202,我們發現計算的行數,正如所註釋一樣,少於實際行數,但不清楚為什麼存在此現象?
4、partition 6的hash 表統計資訊
Total buckets: 512
Empty buckets: 352
Non-empty buckets: 160
Total number of rows: 222
Maximum number of rows in a bucket: 5
Average number of rows in non-empty buckets: 1.387500
5、大表對應小表partition 6進行過濾
Used bitmap filtering: filtered rows=23 minimum required=50 out f=1000
Used bitmap filtering: filtered rows=237 minimum required=50 out f=1000
Used bitmap filtering: filtered rows=448 minimum required=50 out f=1000
..............................
以上第一階段與optimal hash join機制基本相同。以下是onepass處理的第二階段
1、從磁碟中提取分割槽,進行hash join資訊:
*** RowSrcId: 1 HASH JOIN GET FLUSHED PARTITIONS (PHASE 2) ***
Getting a pair of flushed partions.
BUILD PARTION: nrows:249 size=(4 slots, 288K)
PROBE PARTION: nrows:210 size=(3 slots, 216K)
ROLE REVERSAL OCCURRED
--從磁碟提取的另一個分割槽表,行數為249,我們可以從上面分割槽分配情況得知其為partition 0,對應進行處理的大表行數為210行。我們從早先的跟蹤檔案資訊知道multiblock IO=9,這就意味著4 slots對應著36個blocks。
2、hash join處理重要資訊
Number of blocks that may be used to build the hash hable 72
Number of rows left to be iterated over (start of function): 210
Number of rows iterated over this function call: 210
Number of rows left to be iterated over (end of function): 0
-- 最後一行為0,表示此次處理只需要一次onepass就能完成,如果非0,意味著需要multi-pass處理。
3、partition 0 hash表資訊
Number of buckets with 0 rows: 346
Number of buckets with 1 rows: 132
Number of buckets with 2 rows: 24
Number of buckets with 3 rows: 10
Number of buckets with 4 rows: 0
Number of buckets with 5 rows: 0
Number of buckets with 6 rows: 0
Number of buckets with 7 rows: 0
Number of buckets with 8 rows: 0
Number of buckets with 9 rows: 0
Number of buckets with between 10 and 19 rows: 0
Number of buckets with between 20 and 29 rows: 0
Number of buckets with between 30 and 39 rows: 0
Number of buckets with between 40 and 49 rows: 0
Number of buckets with between 50 and 59 rows: 0
Number of buckets with between 60 and 69 rows: 0
Number of buckets with between 70 and 79 rows: 0
Number of buckets with between 80 and 89 rows: 0
Number of buckets with between 90 and 99 rows: 0
Number of buckets with 100 or more rows: 0
### Hash table overall statistics ###
Total buckets: 512 Empty buckets: 346 Non-empty buckets: 166
Total number of rows: 210
Maximum number of rows in a bucket: 3
Average number of rows in non-empty buckets: 1.265060
其他分割槽處理與以上類似。
OnePass工作機制:
1、hash 記憶體區域中分配了8個分割槽,每個分割槽對應4K bitmap點陣圖,小錶行存放到hash表中,因為記憶體限制,在記憶體中只保留一個分割槽,其餘分割槽dump到磁碟中。其處理機制與optimal相同。
2、大表按照小表分割槽表進行整理分類,當從磁碟中提取其他分割槽時候,對應提取整理過大表分割槽。獲取所需要的行
3、從跟蹤檔案中,小表處理過程先讀取資料到磁碟,然後將部分資料dump到磁碟,然後再從磁碟讀取到記憶體中,而大表處理過程,也是先從磁碟讀到記憶體,然後dump到磁碟,再從磁碟讀取。對磁碟的寫和讀為1次。
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/354732/viewspace-623427/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- 通過10104閱讀hash join工作機制 ---Multipass
- 通過10104閱讀hash join工作機制 ---Optimal
- 通過swap_join_inputs來控制多表連線hash join中的build tableUI
- ORACLE Hash JoinOracle
- 通過分析 JDK 原始碼研究 Hash 儲存機制JDK原始碼
- HASH JOIN ,MERGE JOIN ,NESTED LOOPOOP
- HASH join詳解
- nested loop,sort merge join,hash joinOOP
- Mysql join 的本質,Nested Loop Join 和 Hash JoinMySqlOOP
- 通過兩個案例初步瞭解Linux下selinux的安全機制工作機制Linux
- 如何通過閱讀文件,構建概念模型?模型
- hash join\nest loop join\sort merge join的實驗OOP
- HASH JOIN ,MERGE JOIN ,NESTED LOOP(R2)OOP
- HASH JOIN ,MERGE JOIN ,NESTED LOOP的比較OOP
- 【譯】通過閱讀原始碼來提高 JS 知識原始碼JS
- 閱讀原始碼,通過LinkedList回顧基礎原始碼
- Hash join演算法原理演算法
- Hash join演算法原理(轉)演算法
- Oracle中的Hash Join祥解Oracle
- Oracle 的 hash join連線方式Oracle
- oracle hash join演算法原理Oracle演算法
- Oracle中的Hash Join詳解Oracle
- Data Warehourse Guide閱讀筆記(五):Bitmap join indexGUIIDE筆記Index
- oracle工作機制(1)Oracle
- oracle工作機制(2)Oracle
- ORACLE的工作機制Oracle
- oracle工作機制(轉)Oracle
- oracle hash join原理及注意事項Oracle
- Oracle中的Hash Join詳解 ztOracle
- 表的連線方式:NESTED LOOP、HASH JOIN、SORT MERGE JOIN(轉)OOP
- MapReduce之MapTask工作機制APT
- oracle工作機制導圖Oracle
- ORACLE的工作機制-1Oracle
- ORACLE的工作機制-2Oracle
- ORACLE的工作機制-3Oracle
- ORACLE的工作機制-4Oracle
- ORACLE的工作機制-5Oracle
- ORACLE的工作機制(簡)Oracle