Question: does the avg aggregate function in Hive make use of the combiner?

Posted by 菜鳥coder on 2018-11-23

Original URL: https://flycode.co/archives/141515

For example, the following SQL statement definitely makes use of the combiner:

select deptno, sum(sal) as sum_sal from emp group by deptno

hive (test)> explain select deptno, sum(sal) as sum_sal from emp group by deptno;
OK
Explain
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: emp
            Statistics: Num rows: 5 Data size: 603 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: deptno (type: int), sal (type: decimal(22,2))
              outputColumnNames: deptno, sal
              Statistics: Num rows: 5 Data size: 603 Basic stats: COMPLETE Column stats: NONE
              Group By Operator
                aggregations: sum(sal)
                keys: deptno (type: int)
                mode: hash
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 5 Data size: 603 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: _col0 (type: int)
                  sort order: +
                  Map-reduce partition columns: _col0 (type: int)
                  Statistics: Num rows: 5 Data size: 603 Basic stats: COMPLETE Column stats: NONE
                  value expressions: _col1 (type: decimal(32,2))
      Reduce Operator Tree:
        Group By Operator
          aggregations: sum(VALUE._col0)
          keys: KEY._col0 (type: int)
          mode: mergepartial
          outputColumnNames: _col0, _col1
          Statistics: Num rows: 2 Data size: 241 Basic stats: COMPLETE Column stats: NONE
          File Output Operator
            compressed: false
            Statistics: Num rows: 2 Data size: 241 Basic stats: COMPLETE Column stats: NONE
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink

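What about the following SQL statement: can it make use of the combiner as well? When I was learning this, I was told that a combiner cannot handle functions such as avg.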

select deptno, avg(sal) as avg_sal from emp group by deptno

As far as I can tell, its execution plan looks essentially the same as the one for the sum aggregate:

hive (test)> explain select deptno, avg(sal) as avg_sal from emp group by deptno;
OK
Explain
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: emp
            Statistics: Num rows: 5 Data size: 603 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: deptno (type: int), sal (type: decimal(22,2))
              outputColumnNames: deptno, sal
              Statistics: Num rows: 5 Data size: 603 Basic stats: COMPLETE Column stats: NONE
              Group By Operator
                aggregations: avg(sal)
                keys: deptno (type: int)
                mode: hash
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 5 Data size: 603 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: _col0 (type: int)
                  sort order: +
                  Map-reduce partition columns: _col0 (type: int)
                  Statistics: Num rows: 5 Data size: 603 Basic stats: COMPLETE Column stats: NONE
                  value expressions: _col1 (type: struct<count:bigint,sum:decimal(32,2),input:decimal(22,2)>)
      Reduce Operator Tree:
        Group By Operator
          aggregations: avg(VALUE._col0)
          keys: KEY._col0 (type: int)
          mode: mergepartial
          outputColumnNames: _col0, _col1
          Statistics: Num rows: 2 Data size: 241 Basic stats: COMPLETE Column stats: NONE
          File Output Operator
            compressed: false
            Statistics: Num rows: 2 Data size: 241 Basic stats: COMPLETE Column stats: NONE
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
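
One detail in the avg plan above may help with the question: the map-side value expression for _col1 is a struct carrying count and sum rather than a finished average. The sketch below is a minimal Python illustration of that idea, not Hive's actual implementation. The function names, the two "mappers", and the salary figures are all made up for illustration, and the struct's input field is not modeled. It shows why averaging per-mapper averages gives the wrong answer, while merging (count, sum) partials and dividing once at the end gives the right one.

```python
from collections import defaultdict
from decimal import Decimal


def map_side_partial(rows):
    """Pre-aggregate one mapper's (deptno, sal) rows into {deptno: (count, sum)}."""
    partial = defaultdict(lambda: (0, Decimal("0")))
    for deptno, sal in rows:
        count, total = partial[deptno]
        partial[deptno] = (count + 1, total + sal)
    return dict(partial)


def reduce_side_merge(partials):
    """Merge the partials from all mappers, then finish with avg = sum / count."""
    merged = defaultdict(lambda: (0, Decimal("0")))
    for partial in partials:
        for deptno, (count, total) in partial.items():
            merged_count, merged_total = merged[deptno]
            merged[deptno] = (merged_count + count, merged_total + total)
    return {deptno: total / count for deptno, (count, total) in merged.items()}


# Hypothetical input: deptno 10 has two rows on mapper A and one row on mapper B.
mapper_a = [(10, Decimal("1000.00")), (10, Decimal("2000.00")), (20, Decimal("1500.00"))]
mapper_b = [(10, Decimal("6000.00"))]

# Correct: carry (count, sum) through the shuffle and divide once at the end.
result = reduce_side_merge([map_side_partial(mapper_a), map_side_partial(mapper_b)])

# Incorrect: average the per-mapper averages for deptno 10 (1500.00 and 6000.00).
naive = (Decimal("1500.00") + Decimal("6000.00")) / 2

print(result[10], naive)
```

With this input the merged result for deptno 10 is 9000.00 / 3, whereas the naive average of averages is 7500.00 / 2, so the two disagree whenever the mappers hold different numbers of rows for a key. That is the usual reason given for "a combiner cannot handle avg": the restriction applies to combining finished averages, not to combining (count, sum) pairs.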
