Using Hive Complex Data Types

Posted by 哇塞兒 on 2021-02-25

1. Overview

arrays: ARRAY<data_type>
maps: MAP<primitive_type, data_type>
structs: STRUCT<col_name : data_type [COMMENT col_comment], ...>
union: UNIONTYPE<data_type, data_type, ...>

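Note: Hive's support for UNIONTYPE is still incomplete. Queries that reference UNIONTYPE fields in JOIN, WHERE, or GROUP BY clauses will fail, and Hive defines no syntax for extracting the tag or value fields of a UNIONTYPE.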

Constructors for complex types:

Constructor	Operands	Description
map	(key1, value1, key2, value2, ...)	Creates a map with the given key/value pairs.
struct	(val1, val2, val3, ...)	Creates a struct with the given field values. Struct field names will be col1, col2, ....
named_struct	(name1, val1, name2, val2, ...)	Creates a struct with the given field names and values. (As of Hive 0.8.0.)
array	(val1, val2, ...)	Creates an array with the given elements.
create_union	(tag, val1, val2, ...)	Creates a union type with the value that is being pointed to by the tag parameter.

Note: the tag in create_union indicates which member of the union is in use.
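
As a quick illustration of the constructors listed above, the following sketch builds one value of each type in a single SELECT. It assumes a Hive version that allows SELECT without a FROM clause (0.13 or later). The column aliases and literal values are made up for this example.

```sql
-- Build one value of each complex type; aliases are illustrative only.
select
    array('a', 'b', 'c')                        as arr,
    map('pf', 500, 'epf', 200)                  as m,
    struct('zhangsan', 'male')                  as s,   -- fields are named col1, col2
    named_struct('name', 'zhangsan', 'age', 33) as ns,
    create_union(1, 10, 'ten')                  as u;   -- tag 1 selects 'ten'
```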

Accessing elements of complex types:

Operator	Operand types	Description
A[n]	A is an Array and n is an int	Returns the nth element in the array A. The first element has index 0. For example, if A is an array comprising ['foo', 'bar'], then A[0] returns 'foo' and A[1] returns 'bar'.
M[key]	M is a Map<K, V> and key has type K	Returns the value corresponding to the key in the map. For example, if M is a map comprising {'f' -> 'foo', 'b' -> 'bar', 'all' -> 'foobar'}, then M['all'] returns 'foobar'.
S.x	S is a struct	Returns the x field of S. For example for the struct foobar {int foo, int bar}, foobar.foo returns the integer stored in the foo field of the struct.
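
The three access operators can be tried on literal values before any table exists. This is a minimal sketch: the subquery alias t and the column names a, m, and s are hypothetical, and it again assumes a Hive version that accepts SELECT without FROM.

```sql
select
    t.a[0]     as first_elem,    -- array index, zero-based
    t.m['all'] as map_value,     -- map lookup by key
    t.s.foo    as struct_field   -- struct field by name
from (
    select
        array('foo', 'bar')                          as a,
        map('f', 'foo', 'b', 'bar', 'all', 'foobar') as m,
        named_struct('foo', 1, 'bar', 2)             as s
) t;
```

The values mirror the examples in the table above, so the three columns should correspond to 'foo', 'foobar', and the integer stored in foo.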

2. Tests

-- ------------------------------ ARRAY ------------------------------

-- ARRAY<data_type>
create table arraytest (id int,info array<string>) 
row format delimited 
fields terminated by '\t'
collection items terminated by ',' 
stored as textfile;

-- Do not omit `collection items terminated by ','`
-- It specifies the delimiter between array elements.
-- If it is omitted, the output looks like this:
hive> select * from arraytest;
OK
1       ["zhangsan,male"]
2       ["lisi,male"]

-- Data
1	zhangsan,male
2	lisi,male

-- Load
load data local inpath '/root/data/arraytest.txt' into table arraytest;

-- Query
hive> select * from arraytest;
OK
1       ["zhangsan","male"]
2       ["lisi","male"]

-- Access an array element by index
hive> select id,info[0] from arraytest;
OK
1       zhangsan
2       lisi

-- Expand all array elements into separate rows
hive> select explode(info) from arraytest;
OK
zhangsan
male
lisi
male
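
Used on its own like this, explode() typically cannot be selected together with other columns such as id. A common way around that is LATERAL VIEW. The sketch below assumes the arraytest table defined above; the aliases t, item, n_items, and has_male are arbitrary.

```sql
-- One output row per (id, element) pair.
select id, item
from arraytest
lateral view explode(info) t as item;

-- Built-in helpers for arrays: element count and membership test.
select id,
       size(info)                   as n_items,
       array_contains(info, 'male') as has_male
from arraytest;
```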

-- ------------------------------ MAP ------------------------------

-- MAP<primitive_type, data_type>
create table maptest (id int,info map<string,string>) 
row format delimited 
fields terminated by '\t'
collection items terminated by ','
map keys terminated by ':' 
stored as textfile;

-- Do not omit `map keys terminated by ':'`
-- It specifies the delimiter between a key and its value.

-- Data
1	name:zhangsan,sex:male
2	name:lisi,sex:male

-- Load
load data local inpath '/root/data/maptest.txt' into table maptest;

-- Query
hive> select * from maptest;
OK
1       {"name":"zhangsan","sex":"male"}
2       {"name":"lisi","sex":"male"}

-- Look up a map value by key
hive> select id,info["name"] from maptest;
OK
1       zhangsan
2       lisi
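
Beyond single-key lookups, a map can be inspected as a whole. The following sketch assumes the maptest table defined above and uses Hive's built-in map_keys, map_values, size, and explode functions; the aliases ks, vs, n_entries, t, k, and v are illustrative.

```sql
-- Keys and values as arrays, plus the number of entries.
select id,
       map_keys(info)   as ks,
       map_values(info) as vs,
       size(info)       as n_entries
from maptest;

-- One output row per (id, key, value) triple.
select id, k, v
from maptest
lateral view explode(info) t as k, v;
```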


-- ------------------------------ STRUCT ------------------------------

-- STRUCT<col_name : data_type [COMMENT col_comment], ...>
create table structtest (id int,info struct<name:string,sex:string>) 
row format delimited 
fields terminated by '\t'
collection items terminated by ','
stored as textfile;

-- Data
1	zhangsan,male
2	lisi,male

-- Load
load data local inpath '/root/data/structtest.txt' into table structtest;

-- Query
hive> select * from structtest;
OK
1       {"name":"zhangsan","sex":"male"}
2       {"name":"lisi","sex":"male"}

hive> select id,info.name from structtest;
OK
1       zhangsan
2       lisi

-- ------------------------------ Combining ARRAY / MAP / STRUCT ------------------------------

create table alltest(
    id int,
    name string,
    salary bigint,
    sub array<string>,
    details map<string, int>,
    address struct<city:string, state:string, pin:int>
) 
row format delimited 
fields terminated by ','
collection items terminated by '$'
map keys terminated by '#' 
stored as textfile;

-- Data
1,abc,40000,a$b$c,pf#500$epf#200,hyd$ap$500001
2,def,3000,d$f,pf#500,bang$kar$600038
4,abc,40000,a$b$c,pf#500$epf#200,bhopal$MP$452013
5,def,3000,d$f,pf#500,Indore$MP$452014

-- Load
load data local inpath '/root/data/alltest.txt' into table alltest;

-- Query
hive> select * from alltest;
OK
1       abc     40000   ["a","b","c"]   {"pf":500,"epf":200}    {"city":"hyd","state":"ap","pin":500001}
2       def     3000    ["d","f"]       {"pf":500}      {"city":"bang","state":"kar","pin":600038}
4       abc     40000   ["a","b","c"]   {"pf":500,"epf":200}    {"city":"bhopal","state":"MP","pin":452013}
5       def     3000    ["d","f"]       {"pf":500}      {"city":"Indore","state":"MP","pin":452014}
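
The three access forms can also be mixed in one query, and struct fields can be used in a WHERE clause. This is a small sketch against the alltest table above; the output aliases are made up. With the four sample rows shown, the filter should keep only ids 4 and 5.

```sql
select id,
       sub[0]        as first_sub,   -- first array element
       details['pf'] as pf,          -- map lookup
       address.city  as city         -- struct field
from alltest
where address.state = 'MP';
```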

-- ------------------------------ UNIONTYPE ------------------------------

-- create_union(tag, val1, val2, ...)
-- Creates a union type with the value that is being pointed to by the tag parameter. 

-- ---- Simple example: primitive member types only ------

create table uniontest(
    id int,
    info uniontype<string,string>
) 
row format delimited 
fields terminated by '\t'
collection items terminated by ','
stored as textfile;

-- Insert data with INSERT INTO
-- The tag is a zero-based index into the values that follow it
insert into table uniontest 
    values
    (1,create_union(0,"zhangsan","male")),  -- selects "zhangsan"
    (1,create_union(1,"zhangsan","male")),  -- selects "male"
    (2,create_union(0,"lisi","female")),
    (2,create_union(1,"lisi","female"));

-- Query
hive> select * from uniontest;
OK
1       {0:"zhangsan"}
1       {1:"male"}
2       {0:"lisi"}
2       {1:"female"}

-- Data
1	0,zhangsan
1	1,male
2	0,lisi
2	1,female

-- Insert data with LOAD DATA
load data local inpath '/root/data/uniontest.txt' into table uniontest;

-- Query
hive> select * from uniontest;
OK
1       {0:"zhangsan"}
1       {1:"male"}
2       {0:"lisi"}
2       {1:"female"}

-- If the data file looks like this instead:
-- 1	0,zhangsan,male
-- 1	1,zhangsan,male
-- 2	0,lisi,female
-- 2	1,lisi,female
-- everything after the tag is treated as a single string, and the output is:
-- 1       {0:"zhangsan,male"}
-- 1       {1:"zhangsan,male"}
-- 2       {0:"lisi,female"}
-- 2       {1:"lisi,female"}


-- ---- Complex example: complex member types ------

create table uniontest_comp(
    id int,
    info uniontype<int, 
                   string,
                   array<string>,
                   map<string,string>,
                   struct<sex:string,age:string>>
) 
row format delimited 
fields terminated by '\t'
collection items terminated by ','
stored as textfile;

-- Insert data
-- `insert into table ... select ...` also works
insert into table uniontest_comp
    values
    (1,create_union(0,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33"))),
    (1,create_union(1,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33"))),
    (1,create_union(2,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33"))),
    (1,create_union(3,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33"))),
    (1,create_union(4,1,"zhangsan",array("male","33"),map("sex","male","age","33"),named_struct("sex","male","age","33")));

-- Query
hive> select * from uniontest_comp;
OK
1       {0:1}
1       {1:"zhangsan"}
1       {2:["male","33"]}
1       {3:{"sex":"male","age":"33"}}
1       {4:{"sex":"male","age":"33"}}

Reference: http://querydb.blogspot.com/2015/11/hive-complex-data-types.html
