Hive列合併與元素蒐集

鴨梨山大哎發表於2020-10-24

舉例,某個表形式如下

+--+--------+------+-----+----+
|id|name    |course|score|year|
+--+--------+------+-----+----+
|1 |zhangsan|數學    |80   |2015|
|2 |lisi    |語文    |90   |2016|
|3 |lisi    |數學    |70   |2016|
|4 |wangwu  |化學    |80   |2017|
|5 |zhangsan|語文    |85   |2015|
|6 |zhangsan|化學    |90   |2015|
+--+--------+------+-----+----+

concat相當於兩列資料合併

select concat(id,":",name),course,score,year from course_score;
+----------+------+-----+----+
|c0        |course|score|year|
+----------+------+-----+----+
|1:zhangsan|數學    |80   |2015|
|2:lisi    |語文    |90   |2016|
|3:lisi    |數學    |70   |2016|
|4:wangwu  |化學    |80   |2017|
|5:zhangsan|語文    |85   |2015|
|6:zhangsan|化學    |90   |2015|
+----------+------+-----+----+

concat_ws

也是合併多列的,不過分隔符只需要寫一次.
只能連線字串,所以要轉型一下,利用cast函式

select concat_ws(":",cast(id as string),course,cast(score as string)) ,year from course_score;
+-------+----+
|c0     |year|
+-------+----+
|1:數學:80|2015|
|2:語文:90|2016|
|3:數學:70|2016|
|4:化學:80|2017|
|5:語文:85|2015|
|6:化學:90|2015|
+-------+----+

collect_set

收集某一列的,和上面兩個功能完全不同. 另外,也可以去重

select  collect_set(name) from course_score;
+----------------------------+
|c0                          |
+----------------------------+
|["zhangsan","lisi","wangwu"]|
+----------------------------+

相關文章