HybridDBPostgreSQL”Sort、Group、distinct聚合、JOIN”不懼怕資料傾斜的黑科技和原理-多階段聚合

德哥發表於2017-11-23

原文網址 : https://flycode.co/archives/186314

SQL

背景

對於分散式系統，資料分佈儲存，例如隨機、雜湊分佈。

Greenplum資料庫支援兩種資料分佈模式：

1、雜湊（指定單個、或多個欄位）

2、隨機分佈（無需指定任何欄位）

資料分佈儲存後，面臨一些挑戰：

JOIN，排序，group by，distinct。

1、JOIN涉及非分佈鍵欄位

2、排序，如何保證輸出順序全域性有序

3、group by非分佈鍵欄位

4、distinct設計非分佈鍵欄位

一些功能不完整的資料庫，可能無法支援以上功能。

Greenplum商業化數十年，功能方面非常完善，那麼它有什麼祕密法寶呢？

（ HybridDB for PostgreSQL基於GPDB開源版本改進而來，已包含這個功能。）

非分佈鍵 JOIN，排序，group by，distinct

1、非分佈鍵 group by

例子，

tbl_ao_col表是c1的分佈鍵，但是我們group by使用了c398欄位，因此看看它是怎麼做的呢？請看執行計劃的解釋。

postgres=# explain analyze select c398,count(*),sum(c399),avg(c399),min(c399),max(c399) from tbl_ao_col group by c398;      
                                                                       QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------      
 Gather Motion 48:1  (slice2; segments: 48)  (cost=123364.18..123582.28 rows=9693 width=96)      
 // 返回結果    
   Rows out:  10001 rows at destination with 120 ms to end, start offset by 1.921 ms.      
   ->  HashAggregate  (cost=123364.18..123582.28 rows=202 width=96)      
   // 重分佈後再次聚合。    
         Group By: tbl_ao_col.c398      
         Rows out:  Avg 208.4 rows x 48 workers.  Max 223 rows (seg17) with 0.001 ms to first row, 54 ms to end, start offset by 35 ms.      
         ->  Redistribute Motion 48:48  (slice1; segments: 48)  (cost=122928.00..123121.86 rows=202 width=96)      
         // 第一次聚合後，記錄數以及降低到了幾千行，因此重分佈後即使出現傾斜，關係也不大。    
               Hash Key: tbl_ao_col.c398      
               Rows out:  Avg 8762.2 rows x 48 workers at destination.  Max 9422 rows (seg46) with 31 ms to end, start offset by 63 ms.      
               ->  HashAggregate  (cost=122928.00..122928.00 rows=202 width=96)      
               // 這一步是在segment節點聚合    
                     Group By: tbl_ao_col.c398      
                     Rows out:  Avg 8762.2 rows x 48 workers.  Max 8835 rows (seg2) with 0.004 ms to first row, 8.004 ms to end, start offset by 82 ms.      
                     ->  Append-only Columnar Scan on tbl_ao_col  (cost=0.00..107928.00 rows=20834 width=16)      
                           Rows out:  0 rows (seg0) with 28 ms to end, start offset by 64 ms.      
 Slice statistics:      
   (slice0)    Executor memory: 377K bytes.      
   (slice1)    Executor memory: 1272K bytes avg x 48 workers, 1272K bytes max (seg0).      
   (slice2)    Executor memory: 414K bytes avg x 48 workers, 414K bytes max (seg0).      
 Statement statistics:      
   Memory used: 128000K bytes      
 Settings:  optimizer=off      
 Optimizer status: legacy query optimizer      
 Total runtime: 122.173 ms      
(22 rows)

執行計劃解讀：

非分佈鍵 GROUP BY，首先會在本地節點group by，然後按GROUP BY欄位進行資料重分佈，然後再在本地節點GROUP BY，最後返回GROUP BY結果給master節點，返回給使用者。

Greenplum會根據group by的欄位的distinct值的比例，考慮是直接重分佈資料，還是先在本地聚合後再重分佈資料（減少重分佈的資料量）。

2、非分佈鍵 distinct

例子，

tbl 為隨機分佈

postgres=# explain analyze select count(distinct c2) from tbl;  
                                                                         QUERY PLAN                                                                           
------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=1549462.55..1549462.56 rows=1 width=8)  
   Rows out:  1 rows with 0.002 ms to first row, 0.645 ms to end, start offset by 1.681 ms.  
   ->  Gather Motion 48:1  (slice2; segments: 48)  (cost=1548947.03..1549450.04 rows=1001 width=4)  
         Rows out:  1001 rows at destination with 498 ms to end, start offset by 1.684 ms.  
         ->  HashAggregate  (cost=1548947.03..1548959.55 rows=21 width=4)  
               Group By: tbl.c2  
               Rows out:  Avg 20.9 rows x 48 workers.  Max 31 rows (seg17) with 0.002 ms to first row, 152 ms to end, start offset by 39 ms.  
               ->  Redistribute Motion 48:48  (slice1; segments: 48)  (cost=1548912.00..1548932.02 rows=21 width=4)  
                     Hash Key: tbl.c2  
                     Rows out:  Avg 1001.0 rows x 48 workers at destination.  Max 1488 rows (seg17) with 309 ms to end, start offset by 39 ms.  
                     ->  HashAggregate  (cost=1548912.00..1548912.00 rows=21 width=4)  
                           Group By: tbl.c2  
                           Rows out:  Avg 1001.0 rows x 48 workers.  Max 1001 rows (seg0) with 0.006 ms to first row, 271 ms to end, start offset by 42 ms.  
                           ->  Append-only Columnar Scan on tbl  (cost=0.00..1048912.00 rows=2083334 width=4)  
                                 Rows out:  0 rows (seg0) with 25 ms to end, start offset by 42 ms.  
 Slice statistics:  
   (slice0)    Executor memory: 327K bytes.  
   (slice1)    Executor memory: 764K bytes avg x 48 workers, 764K bytes max (seg0).  
   (slice2)    Executor memory: 292K bytes avg x 48 workers, 292K bytes max (seg0).  
 Statement statistics:  
   Memory used: 128000K bytes  
 Settings:  enable_bitmapscan=off; enable_seqscan=off; optimizer=off  
 Optimizer status: legacy query optimizer  
 Total runtime: 502.576 ms  
(24 rows)

執行計劃解讀：

非分佈鍵求distinct，首先會在本地節點hash 聚合，然後按distinct欄位進行資料重分佈，然後再在本地節點hash 聚合，最後返回結果給master節點，返回給使用者。

Greenplum會根據欄位的distinct值的比例，考慮是直接重分佈資料，還是先在本地聚合後再重分佈資料（減少重分佈的資料量）。

3、非分佈鍵 distinct + 非分佈鍵 group by

tbl 為隨機分佈

postgres=# explain analyze select count(distinct c2) from tbl group by c3;  
                                                                           QUERY PLAN                                                                              
-----------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Gather Motion 48:1  (slice2; segments: 48)  (cost=1805483.56..1805484.83 rows=101 width=12)  
   Rows out:  101 rows at destination with 990 ms to end, start offset by 519 ms.  
   ->  HashAggregate  (cost=1805483.56..1805484.83 rows=3 width=12)  
         Group By: partial_aggregation.c3  
         Rows out:  Avg 2.5 rows x 41 workers.  Max 4 rows (seg9) with 0.005 ms to first row, 0.284 ms to end, start offset by 577 ms.  
         ->  HashAggregate  (cost=1802703.29..1803967.05 rows=2107 width=8)  
               Group By: tbl.c3, tbl.c2  
               Rows out:  Avg 2465.9 rows x 41 workers.  Max 4004 rows (seg9) with 0.001 ms to first row, 260 ms to end, start offset by 577 ms.  
               ->  Redistribute Motion 48:48  (slice1; segments: 48)  (cost=1798912.00..1800934.02 rows=2107 width=8)  
                     Hash Key: tbl.c3  
                     Rows out:  Avg 118362.0 rows x 41 workers at destination.  Max 192192 rows (seg9) with 663 ms to end, start offset by 577 ms.  
                     ->  HashAggregate  (cost=1798912.00..1798912.00 rows=2107 width=8)  
                           Group By: tbl.c3, tbl.c2  
                           Rows out:  Avg 101100.9 rows x 48 workers.  Max 101101 rows (seg0) with 0.005 ms to first row, 747 ms to end, start offset by 562 ms.  
                           ->  Append-only Columnar Scan on tbl  (cost=0.00..1048912.00 rows=2083334 width=8)  
                                 Rows out:  0 rows (seg0) with 40 ms to end, start offset by 560 ms.  
 Slice statistics:  
   (slice0)    Executor memory: 327K bytes.  
   (slice1)    Executor memory: 1117K bytes avg x 48 workers, 1117K bytes max (seg0).  
   (slice2)    Executor memory: 435K bytes avg x 48 workers, 452K bytes max (seg0).  
 Statement statistics:  
   Memory used: 128000K bytes  
 Settings:  enable_bitmapscan=off; enable_seqscan=off; optimizer=off  
 Optimizer status: legacy query optimizer  
 Total runtime: 1511.120 ms  
(25 rows)

distinct和group by都是非分佈鍵，Greenplum分散式執行計劃優雅的解決了非分佈鍵group by與distinct資料重分佈帶來的網路傳輸的問題。

4、非分佈鍵 join

對於兩個表JOIN時，採用了非分佈鍵時，Greenplum會自動對資料進行重分佈（或者小表使用廣播模式）。

join欄位有資料傾斜時，需要注意。

本例為1000萬個重複ID作為JOIN欄位。JOIN重分佈後，會落到一個節點。

postgres=# explain analyze select a.c1,count(*) from a join b on (a.id=b.id) group by a.c1;  
                                                                                          QUERY PLAN                                                                                             
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Gather Motion 48:1  (slice3; segments: 48)  (cost=0.00..2730.45 rows=1 width=12)  
   Rows out:  1 rows at destination with 7190 ms to end, start offset by 2.357 ms.  
   ->  GroupAggregate  (cost=0.00..2730.45 rows=1 width=12)  
         Group By: a.c1  
         Rows out:  1 rows (seg22) with 0.001 ms to first row, 0.320 ms to end, start offset by 54 ms.  
         ->  Sort  (cost=0.00..2730.44 rows=1 width=12)  
               Sort Key: a.c1  
               Rows out:  1 rows (seg22) with 0.001 ms to end, start offset by 54 ms.  
               Executor memory:  33K bytes avg, 33K bytes max (seg0).  
               Work_mem used:  33K bytes avg, 33K bytes max (seg0). Workfile: (0 spilling, 0 reused)  
               ->  Redistribute Motion 48:48  (slice2; segments: 48)  (cost=0.00..2730.44 rows=1 width=12)  
                     Hash Key: a.c1  
                     Rows out:  1 rows at destination (seg22) with 7138 ms to end, start offset by 54 ms.  
                     ->  Result  (cost=0.00..2730.44 rows=1 width=12)  
                           Rows out:  1 rows (seg42) with 0.003 ms to end, start offset by 77 ms.  
                           ->  GroupAggregate  (cost=0.00..2730.44 rows=1 width=12)  
                                 Group By: a.c1  
                                 Rows out:  1 rows (seg42) with 0.002 ms to first row, 1054 ms to end, start offset by 77 ms.  
                                 ->  Sort  (cost=0.00..2730.44 rows=1 width=4)  
                                       Sort Key: a.c1  
                                       Rows out:  10000000 rows (seg42) with 0.003 ms to end, start offset by 77 ms.  
                                       Executor memory:  1400K bytes avg, 65676K bytes max (seg42).  
                                       Work_mem used:  1400K bytes avg, 65676K bytes max (seg42). Workfile: (1 spilling, 0 reused)  
                                       Work_mem wanted: 481337K bytes avg, 481337K bytes max (seg42) to lessen workfile I/O affecting 1 workers.  
                                       ->  Hash Join  (cost=0.00..2730.44 rows=1 width=4)  
                                             Hash Cond: b.id = a.id  
                                             Rows out:  10000000 rows (seg42) with 0.014 ms to first row, 4989 ms to end, start offset by 77 ms.  
                                             Executor memory:  6511K bytes avg, 6513K bytes max (seg18).  
                                             Work_mem used:  6511K bytes avg, 6513K bytes max (seg18). Workfile: (0 spilling, 0 reused)  
                                             ->  Redistribute Motion 48:48  (slice1; segments: 48)  (cost=0.00..471.72 rows=208130 width=4)  
                                                   Hash Key: b.id  
                                                   Rows out:  10000000 rows at destination (seg42) with 0.004 ms to end, start offset by 77 ms.  
                                                   ->  Table Scan on b  (cost=0.00..436.27 rows=208130 width=4)  
                                                         Rows out:  Avg 208333.3 rows x 48 workers.  Max 208430 rows (seg17) with 4.815 ms to first row, 824 ms to end, start offset by 92 ms.  
                                             ->  Hash  (cost=436.27..436.27 rows=208475 width=8)  
                                                   Rows in:  (No row requested) 0 rows (seg0) with 0 ms to end.  
                                                   ->  Table Scan on a  (cost=0.00..436.27 rows=208475 width=8)  
                                                         Rows out:  Avg 208333.3 rows x 48 workers.  Max 208401 rows (seg18) with 34 ms to first row, 46 ms to end, start offset by 63 ms.  
 Slice statistics:  
   (slice0)    Executor memory: 330K bytes.  
   (slice1)    Executor memory: 1129K bytes avg x 48 workers, 1129K bytes max (seg0).  
   (slice2)  * Executor memory: 2139K bytes avg x 48 workers, 66504K bytes max (seg42).  Work_mem: 65676K bytes max, 481337K bytes wanted.  
   (slice3)    Executor memory: 372K bytes avg x 48 workers, 388K bytes max (seg22).  Work_mem: 33K bytes max.  
 Statement statistics:  
   Memory used: 128000K bytes  
   Memory wanted: 1444908K bytes  
 Settings:  enable_bitmapscan=on; enable_seqscan=on; optimizer=on  
 Optimizer status: PQO version 1.602  
 Total runtime: 7193.902 ms  
(49 rows)

JOIN兩個非分佈鍵

對於兩個表JOIN時，採用了非分佈鍵時，Greenplum會自動對資料進行重分佈（或者小表使用廣播模式）。

postgres=# explain analyze select a.c1,count(*) from a join b on (a.id=b.id) group by a.c1;  
                                                                                               QUERY PLAN                                                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 Gather Motion 48:1  (slice4; segments: 48)  (cost=0.00..990.85 rows=101 width=12)  
   Rows out:  101 rows at destination with 752 ms to first row, 753 ms to end, start offset by 732 ms.  
   ->  GroupAggregate  (cost=0.00..990.85 rows=3 width=12)  
         Group By: a.c1  
         Rows out:  Avg 2.5 rows x 41 workers.  Max 4 rows (seg9) with 746 ms to end, start offset by 738 ms.  
         ->  Sort  (cost=0.00..990.85 rows=3 width=12)  
               Sort Key: a.c1  
               Rows out:  Avg 118.2 rows x 41 workers.  Max 192 rows (seg9) with 746 ms to end, start offset by 738 ms.  
               Executor memory:  58K bytes avg, 58K bytes max (seg0).  
               Work_mem used:  58K bytes avg, 58K bytes max (seg0). Workfile: (0 spilling, 0 reused)  
               ->  Redistribute Motion 48:48  (slice3; segments: 48)  (cost=0.00..990.85 rows=3 width=12)  
                     Hash Key: a.c1  
                     Rows out:  Avg 118.2 rows x 41 workers at destination.  Max 192 rows (seg9) with 594 ms to first row, 746 ms to end, start offset by 738 ms.  
                     ->  Result  (cost=0.00..990.85 rows=3 width=12)  
                           Rows out:  Avg 101.0 rows x 48 workers.  Max 101 rows (seg0) with 675 ms to first row, 676 ms to end, start offset by 740 ms.  
                           ->  HashAggregate  (cost=0.00..990.85 rows=3 width=12)  
                                 Group By: a.c1  
                                 Rows out:  Avg 101.0 rows x 48 workers.  Max 101 rows (seg0) with 675 ms to first row, 676 ms to end, start offset by 740 ms.  
                                 Executor memory:  4185K bytes avg, 4185K bytes max (seg0).  
                                 ->  Hash Join  (cost=0.00..964.88 rows=208191 width=4)  
                                       Hash Cond: a.id = b.id  
                                       Rows out:  Avg 208333.3 rows x 48 workers.  Max 208401 rows (seg18) with 282 ms to first row, 661 ms to end, start offset by 767 ms.  
                                       Executor memory:  4883K bytes avg, 4885K bytes max (seg18).  
                                       Work_mem used:  4883K bytes avg, 4885K bytes max (seg18). Workfile: (0 spilling, 0 reused)  
                                       (seg18)  Hash chain length 1.3 avg, 4 max, using 159471 of 262151 buckets.  
                                       ->  Redistribute Motion 48:48  (slice1; segments: 48)  (cost=0.00..444.59 rows=208378 width=8)  
                                             Hash Key: a.id  
                                             Rows out:  Avg 208333.3 rows x 48 workers at destination.  Max 208401 rows (seg18) with 0.112 ms to first row, 104 ms to end, start offset by 1048 ms.  
                                             ->  Table Scan on a  (cost=0.00..436.27 rows=208378 width=8)  
                                                   Rows out:  Avg 208333.3 rows x 48 workers.  Max 208422 rows (seg31) with 0.117 ms to first row, 64 ms to end, start offset by 749 ms.  
                                       ->  Hash  (cost=440.42..440.42 rows=208191 width=4)  
                                             Rows in:  Avg 208333.3 rows x 48 workers.  Max 208401 rows (seg18) with 250 ms to end, start offset by 798 ms.  
                                             ->  Redistribute Motion 48:48  (slice2; segments: 48)  (cost=0.00..440.42 rows=208191 width=4)  
                                                   Hash Key: b.id  
                                                   Rows out:  Avg 208333.3 rows x 48 workers at destination.  Max 208401 rows (seg18) with 0.219 ms to first row, 132 ms to end, start offset by 798 ms.  
                                                   ->  Table Scan on b  (cost=0.00..436.27 rows=208191 width=4)  
                                                         Rows out:  Avg 208333.3 rows x 48 workers.  Max 208388 rows (seg3) with 0.146 ms to first row, 77 ms to end, start offset by 760 ms.  
 Slice statistics:  
   (slice0)    Executor memory: 313K bytes.  
   (slice1)    Executor memory: 1096K bytes avg x 48 workers, 1096K bytes max (seg0).  
   (slice2)    Executor memory: 1096K bytes avg x 48 workers, 1096K bytes max (seg0).  
   (slice3)    Executor memory: 25518K bytes avg x 48 workers, 25518K bytes max (seg0).  Work_mem: 4885K bytes max.  
   (slice4)    Executor memory: 374K bytes avg x 48 workers, 382K bytes max (seg0).  Work_mem: 58K bytes max.  
 Statement statistics:  
   Memory used: 128000K bytes  
 Settings:  enable_bitmapscan=on; enable_seqscan=on; optimizer=on  
 Optimizer status: PQO version 1.602  
 Total runtime: 1486.335 ms  
(48 rows)

非分佈鍵排序

1、merge sort

為了保證全域性有序，以及資料排序的效率。

Greenplum使用了merge sort，首先在資料節點本地排序（所有節點並行），然後master節點向segment請求資料，在master節點merge sort合併。

體現了排序的效率。

非分佈鍵 group by 和 distinct 的原理

對於非分佈鍵的分組聚合請求，Greenplum採用了多階段聚合如下：

第一階段，在SEGMENT本地聚合。（Greenplum會根據欄位的distinct值的比例，考慮是直接重分佈資料，還是先在本地聚合後再重分佈資料（減少重分佈的資料量）。）

第二階段，根據分組欄位，將結果資料重分佈。

第三階段，再次在SEGMENT本地聚合。

第四階段，返回結果給master，有必要的話master節點呼叫聚合函式的final func（已經是很少的記錄數和運算量）。

非分佈鍵 JOIN 的原理

1、對於JOIN為分佈鍵的表，Greenplum根據表的大小，選擇對這張表根據JOIN列重分佈（大表），或廣播（小表）。

2、重分佈完成後，SEGMENT節點並行的執行本地JOIN。

參考

《Greenplum 行存、列存，堆表、AO表的原理和選擇》

《分散式DB(Greenplum)中資料傾斜的原因和解法 – 阿里雲HybridDB for PostgreSQL最佳實踐》

視窗，強制重分佈

《日增量萬億+級實時分析、資料規整 – 阿里雲HybridDB for PostgreSQL最佳實踐》

【Hive】資料傾斜優化 shuffle, join, group by
2020-09-24
Hive優化
資料庫的sort group by和hash group by
2023-04-10
資料庫
Spark SQL三種join和資料傾斜的產生和解決辦法
2022-02-28
SparkSQL
Spark學習——資料傾斜
2019-05-04
Spark
sql case when, Exist ,group by ,聚合
2024-03-15
SQL
資料傾斜解決辦法
2018-09-06
IoT資料傾斜如何解決
2022-09-30
ES資料聚合
2024-10-18
PostgreSQL DBA(193) - 資料傾斜下的HashJoin
2022-11-04
SQL
Redis 切片叢集的資料傾斜分析
2022-06-22
Redis
Flink sql 之兩階段聚合與 TwoStageOptimizedAggregateRule（原始碼分析）
2022-01-06
SQLZed原始碼
hive優化-資料傾斜優化
2018-08-16
Hive優化
Spark 資料傾斜及其解決方案
2019-12-30
Spark
DDD之4聚合和聚合根
2020-06-07
oracle中distinct和group by的區別
2018-12-30
Oracle
SQL -去重Group by 和Distinct的效率
2021-01-22
SQL
hadoop 透過cachefile來避免資料傾斜
2018-09-03
Hadoop
Hive千億級資料傾斜解決方案
2021-04-29
Hive
Elasticsearch 系列（五）- 資料聚合
2024-03-31
Elasticsearch
mongoDB資料庫之聚合
2020-11-28
MongoDB資料庫
4G多卡聚合路由器工作原理
2020-07-08
路由器
es筆記七之聚合操作之桶聚合和矩陣聚合
2023-11-17
筆記矩陣
MySQL 中的 distinct 和 group by 的效能比較
2023-02-20
MySql
【Spark篇】---Spark解決資料傾斜問題
2018-03-04
Spark
聚合路由器工作原理
2020-02-20
路由器
Elasticsearch系列---聚合查詢原理
2020-04-17
Elasticsearch
多卡多鏈路聚合路由器的關鍵技術、原理分析
2020-08-07
路由器
Spring Boot 高效資料聚合之道
2019-06-12
Spring Boot
大資料SQL優化之資料傾斜解決案例全集
2022-01-06
大資料SQL優化
一種自平衡解決資料傾斜的分表方法
2023-03-30
聚合函式及分組與過濾（GROUP BY … HAVING）
2020-11-01
函式
SQL Server資料庫————模糊查詢和聚合函式
2018-12-03
SQLServer資料庫函式
如何解決 Redis 資料傾斜、熱點等問題
2022-12-13
Redis
Spark效能最佳化篇三：資料傾斜調優
2021-09-09
Spark
Oracle資料傾斜導致的問題-無繫結變數
2020-02-22
Oracle變數
Oracle資料傾斜導致的問題-有繫結變數
2020-02-23
Oracle變數
es java 聚合方法——聚合後根據count排序並取前2條資料
2018-07-13
Java排序
聚合
2024-08-22
五款傾斜攝影與三維資料處理工具介紹：GISBox、Cesiumlab、OSGBLab、靈易智模、傾斜伴侶
2024-10-08

HybridDBPostgreSQL”Sort、Group、distinct聚合、JOIN”不懼怕資料傾斜的黑科技和原理-多階段聚合

標籤

背景

資料分佈儲存後，面臨一些挑戰：

非分佈鍵 JOIN，排序，group by，distinct

1、非分佈鍵 group by

2、非分佈鍵 distinct

3、非分佈鍵 distinct + 非分佈鍵 group by

4、非分佈鍵 join

非分佈鍵排序

非分佈鍵 group by 和 distinct 的原理

非分佈鍵 JOIN 的原理

參考

相關文章

HybridDBPostgreSQL”Sort、Group、distinct聚合、JOIN”不懼怕資料傾斜的黑科技和原理-多階段聚合

標籤

背景

資料分佈儲存後，面臨一些挑戰：

非分佈鍵 JOIN，排序，group by，distinct

1、非分佈鍵 group by

2、非分佈鍵 distinct

3、非分佈鍵 distinct + 非分佈鍵 group by

4、非分佈鍵 join

非分佈鍵 排序

非分佈鍵 group by 和 distinct 的原理

非分佈鍵 JOIN 的原理

參考

相關文章

非分佈鍵排序