Reading the hash join mechanism through event 10104 --- Multipass
4. Multipass hash join
1. In-memory hash table information
Hash-join fanout: 8
Number of partitions: 8
Number of slots: 14
Multiblock IO: 1
Block size(KB): 8
Cluster (slot) size(KB): 8
Minimum number of bytes per block: 8160
Bit vector memory allocation(KB): 0
Per partition bit vector length(KB): 0
Maximum possible row length: 1077
Estimated build size (KB): 2046
Estimated Build Row Length (includes overhead): 1048
.....................
kxhfWrite: hash-join is spilling to disk
*** RowSrcId: 1 HASH JOIN BUILD HASH TABLE (PHASE 1) ***
Total number of partitions: 8
Number of partitions which could fit in memory: 0
-- None of the 8 partitions can be placed entirely in memory
Number of partitions left in memory: 0
-- No partition can be kept in memory in its entirety
Total number of slots in in-memory partitions: 0
Total number of rows in in-memory partitions: 0
(used as preliminary number of buckets in hash table)
Estimated max # of build rows that can fit in avail memory: 224
-- From the initialization information above it is easy to tell whether the hash join is running as optimal, onepass, or multipass: here no partition fits or stays in memory, so this is a multipass join. A rough classification rule is sketched below.
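The following Python sketch shows one way to turn the PHASE 1 counters into that classification. It is purely illustrative: the function name and the decision rule are my own paraphrase of the trace fields, not an Oracle-documented algorithm.

def classify_hash_join(total_partitions, fit_in_memory, left_in_memory):
    # Rough rule of thumb based on the 10104 "PHASE 1" counters (illustrative only).
    if left_in_memory == total_partitions:
        return "optimal"     # nothing spilled: the whole build input stayed in memory
    if fit_in_memory > 0:
        return "onepass"     # spilled, but a spilled partition can still be rebuilt in memory in one pass
    return "multipass"       # no spilled partition fits in memory; several passes are needed

# Values taken from the PHASE 1 trace above:
print(classify_hash_join(total_partitions=8, fit_in_memory=0, left_in_memory=0))  # -> multipass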
2. Partition distribution information
### Partition Distribution ###
Partition:0 rows:262 clusters:34 slots:1 kept=0
Partition:1 rows:269 clusters:35 slots:1 kept=0
Partition:2 rows:245 clusters:32 slots:1 kept=0
Partition:3 rows:231 clusters:30 slots:1 kept=0
Partition:4 rows:244 clusters:32 slots:1 kept=0
Partition:5 rows:254 clusters:33 slots:1 kept=0
Partition:6 rows:254 clusters:33 slots:1 kept=0
Partition:7 rows:241 clusters:31 slots:6 kept=0
-- In a multipass join every partition shows kept=0, meaning that no partition's data can be held completely in the hash memory; a quick size check based on the trace values follows.
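As a sanity check, the numbers already printed in the trace explain why kept=0 for every partition. This is only a rough sketch; Oracle's real memory accounting is more involved than a plain slots-times-size comparison.

slot_kb = 8                                   # "Cluster (slot) size(KB): 8"
memory_slots = 14                             # "Number of slots: 14"
memory_kb = memory_slots * slot_kb            # ~112 KB of slot memory in the hash area

partition_clusters = [34, 35, 32, 30, 32, 33, 33, 31]   # from "### Partition Distribution ###"

for i, clusters in enumerate(partition_clusters):
    need_kb = clusters * slot_kb
    print(f"Partition {i}: needs ~{need_kb} KB, slot memory ~{memory_kb} KB, kept={int(need_kb <= memory_kb)}")

# Every partition needs 240 KB or more while only about 112 KB of slot memory is
# available, which matches the kept=0 reported for all eight partitions.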
3. Phase 2 processing
Because no partition can be held in memory on its own, each partition has to be split again into several smaller pieces (sub-passes); the trace below shows how they are processed.
Processing of the first flushed partition pair:
*** RowSrcId: 1 HASH JOIN GET FLUSHED PARTITIONS (PHASE 2) ***
Getting a pair of flushed partions.
BUILD PARTION: nrows:231 size=(30 slots, 240K)
PROBE PARTION: nrows:246 size=(32 slots, 256K)
*** RowSrcId: 1 HASH JOIN BUILD HASH TABLE (PHASE 2) ***
Number of blocks that may be used to build the hash table 9
-- The trace shows that only 9 blocks of hash memory are available to process 30 blocks of build data (30 slots = 30 blocks), so this partition has to be handled in four passes (ceil(30/9) = 4). It also suggests that the hash area size would need to be raised by roughly a factor of 4 before this join could run as onepass (see the worked calculation after the summary below).
Number of rows left to be iterated over (start of function): 231
-- All 231 rows of the build partition are still to be iterated over at the start of this call
Number of rows iterated over this function call: 70
Number of rows left to be iterated over (end of function): 161
-- 161 rows remain unprocessed and will be handled in the following passes
### Hash table ###
# NOTE: The calculated number of rows in non-empty buckets may be smaller
# than the true number.
Number of buckets with 0 rows: 448
Number of buckets with 1 rows: 58
Number of buckets with 2 rows: 6
Number of buckets with 3 rows: 0
Number of buckets with 4 rows: 0
Number of buckets with 5 rows: 0
Number of buckets with 6 rows: 0
Number of buckets with 7 rows: 0
Number of buckets with 8 rows: 0
Number of buckets with 9 rows: 0
Number of buckets with between 10 and 19 rows: 0
Number of buckets with between 20 and 29 rows: 0
Number of buckets with between 30 and 39 rows: 0
Number of buckets with between 40 and 49 rows: 0
Number of buckets with between 50 and 59 rows: 0
Number of buckets with between 60 and 69 rows: 0
Number of buckets with between 70 and 79 rows: 0
Number of buckets with between 80 and 89 rows: 0
Number of buckets with between 90 and 99 rows: 0
Number of buckets with 100 or more rows: 0
### Hash table overall statistics ###
Total buckets: 512 Empty buckets: 448 Non-empty buckets: 64
Total number of rows: 70
Maximum number of rows in a bucket: 2
Average number of rows in non-empty buckets: 1.093750
kxhfResetIter(1105ed460)
qerhjFetchPhase2(): building a hash table
*** RowSrcId: 1 HASH JOIN BUILD HASH TABLE (PHASE 2) ***
Number of blocks that may be used to build the hash hable 9
Number of rows left to be iterated over (start of function): 161
Number of rows iterated over this function call: 71
Number of rows left to be iterated over (end of function): 90
(The remaining iterations are omitted.)
The experiment above illustrates how the multipass mechanism works:
1. The hash memory is divided into 8 partitions to hold the build (smaller) table's data, but the memory provided cannot support a hash join on even one whole partition. Each partition is therefore split into several sub-passes, which are read back from disk into the hash memory one at a time and matched against the corresponding partition of the probe (larger) table, using the same matching principle described earlier.
2. Each build-side partition is scanned and matched only once in total, but the corresponding probe-side partition must be scanned and matched once for every build pass. If a build partition is processed in four sub-passes, the matching probe partition is scanned and matched 4 times, so for the probe side multipass = 4.
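A short worked calculation for the partition pair traced above, using only numbers that appear in the trace (the variable names are mine):

import math

build_slots   = 30    # "BUILD PARTION: nrows:231 size=(30 slots, 240K)"
memory_blocks = 9     # "Number of blocks that may be used to build the hash table 9"

build_passes = math.ceil(build_slots / memory_blocks)   # ceil(30 / 9) = 4
print(build_passes)   # -> 4, matching the iteration counts 231 -> 161 -> 90 -> ...

# The probe partition (32 slots, 256K) is scanned once per build pass,
# so it is read and matched build_passes = 4 times: multipass = 4.
# It also follows that the hash area would need to be roughly 4x larger
# (enough to hold all 30 build slots at once) for this join to run as onepass.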
From the ITPUB blog; link: http://blog.itpub.net/354732/viewspace-623433/. Please credit the source when reproducing.