Creating Several Kinds of Tables in Hive

Posted by 榴芒姐姐 on 2020-11-14

Original URL: https://blog.csdn.net/alisa_Ge/article/details/109698111

1. Creating a managed (internal) table

create table table_name (
    -- general form: column_name column_type, ...
    -- for example:
    name struct<first:string,last:string>,
    age int,
    hobbies array<string>,
    deliveryAdd map<string,string>
)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n'
stored as textfile
;
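
To make the delimiter clauses concrete, here is a minimal Python sketch of how one line of a text file maps onto the four columns declared above. It only imitates the splitting rules ('|' between fields, ',' between collection items and struct members, ':' between a map key and its value); it is not Hive's own parser, and the sample row and the helper name parse_row are made up for illustration.

```python
# Sketch only: imitates the delimiters declared in the DDL above.
# parse_row and the sample line are hypothetical, not part of Hive.

FIELD_DELIM = "|"     # fields terminated by '|'
ITEM_DELIM = ","      # collection items terminated by ','
MAP_KEY_DELIM = ":"   # map keys terminated by ':'


def parse_row(line):
    """Split one text line into name / age / hobbies / deliveryAdd."""
    name_raw, age_raw, hobbies_raw, address_raw = line.rstrip("\n").split(FIELD_DELIM)

    # struct<first:string,last:string>: members use the collection item delimiter
    first, last = name_raw.split(ITEM_DELIM)

    # array<string>: elements use the collection item delimiter
    hobbies = hobbies_raw.split(ITEM_DELIM)

    # map<string,string>: entries split on ',', key and value split on ':'
    delivery_add = dict(
        entry.split(MAP_KEY_DELIM, 1) for entry in address_raw.split(ITEM_DELIM)
    )

    return {
        "name": {"first": first, "last": last},
        "age": int(age_raw),
        "hobbies": hobbies,
        "deliveryAdd": delivery_add,
    }


row = parse_row("Ann,Lee|30|reading,chess|home:1 Main St,office:2 Side Rd\n")
print(row)
```

A data file for this table would therefore contain one such line per row, with values that do not themselves contain the delimiter characters.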

2. Creating an external table

create external table table_name (
    -- general form: column_name column_type, ...
    -- for example:
    name struct<first:string,last:string>,
    age int,
    hobbies array<string>,
    deliveryAdd map<string,string>
)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n'
stored as textfile
;

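Note that for an external table the data files live on HDFS, so dropping the table in Hive removes only the table definition (its metadata); the data itself remains. To delete the data as well, use the following command:

hdfs dfs -rm -r -f /file_path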

3. Creating a partitioned table

create external table table_name (
    -- general form: column_name column_type, ...
    -- for example:
    age int,
    hobbies array<string>,
    deliveryAdd map<string,string>
)
partitioned by(username string)
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n'
stored as textfile
;

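Note that the table above is partitioned by username. To load files into it, use the following commands (the overwrite keyword is optional; when present, it replaces the data already in that partition):

load data local inpath '/file_path/table1.log' into table table_name partition(username='table1');
load data local inpath '/file_path/table2.log' [overwrite] into table table_name partition(username='table2');

To list the partitions of a table, use the following command:

show partitions table_name;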
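
Each partition corresponds to a subdirectory named column=value under the table's location, which is also the form in which show partitions lists it. The Python sketch below only builds such a path to show the layout. The function partition_dir and the table location used here are assumptions for illustration; an external table can live at any location, and the sketch ignores the escaping Hive applies to special characters in partition values.

```python
import posixpath


def partition_dir(table_location, partition_spec):
    """Build the directory for a partition as location/col1=val1/col2=val2/..."""
    parts = [f"{column}={value}" for column, value in partition_spec.items()]
    return posixpath.join(table_location, *parts)


# Assumed location; the real one depends on the warehouse settings or the
# location clause of the table.
table_location = "/user/hive/warehouse/table_name"

table1_dir = partition_dir(table_location, {"username": "table1"})
table2_dir = partition_dir(table_location, {"username": "table2"})
print(table1_dir)
print(table2_dir)
```

Because the partition value is encoded in the directory name, the username column is not stored inside the data files themselves.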

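4. Creating a bucketed table (a logical division of the data; it makes sampling easier and improves join query efficiency)

Choose one of the two settings:
-- optimization: let Hive enforce bucketing when writing
set hive.enforce.bucketing = true;
-- optimization: set the number of reduce tasks to match the number of buckets
set mapreduce.job.reduces = num;

create external table table_name (
    -- general form: column_name column_type, ...
    -- for example:
    name struct<first:string,last:string>,
    age int,
    hobbies array<string>,
    deliveryAdd map<string,string>
)
clustered by(name) into n buckets
row format delimited
fields terminated by '|'
collection items terminated by ','
map keys terminated by ':'
lines terminated by '\n'
stored as textfile
;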
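
The idea behind bucketing is that each row is assigned to one of the n buckets by hashing the clustered by column and taking the result modulo n, so equal keys always land in the same bucket. The Python sketch below illustrates that idea with a Java String.hashCode-style hash as a stand-in. This is an assumption for illustration: the hash Hive actually applies depends on the Hive version and the column type, and the helper names string_hash and bucket_for are hypothetical.

```python
def string_hash(text):
    """Stand-in hash in the style of Java's String.hashCode (32-bit)."""
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # keep the value within 32 bits
    return h


def bucket_for(key, num_buckets):
    """Map a key to a bucket index: non-negative hash modulo bucket count."""
    return (string_hash(key) & 0x7FFFFFFF) % num_buckets


num_buckets = 4
keys = ["ann", "bob", "cid", "ann"]
assignments = [bucket_for(key, num_buckets) for key in keys]
print(assignments)
```

Since the bucket of a key is deterministic, sampling can read a single bucket, and a join between two tables bucketed the same way on the join key only needs to match up corresponding buckets.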

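Once a table has been created, its data needs to be loaded into it:

load data [local] inpath 'file_path' into table table_name;

local: load from the local file system (without it, the path refers to HDFS)

To put a data file onto HDFS, use the following command:

hdfs dfs -put data_file /directory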

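5. The with syntax: it can be thought of as a view. Its purpose is encapsulation and reuse; it defines a temporary result set.

with
temp_name as (select ... from table_name where column_name=' ')
select * from temp_name;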
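
To show the with clause end to end, here is a small runnable sketch. It uses Python's built-in sqlite3 module rather than Hive, on the assumption that a simple common table expression like this one behaves the same way in both; the table people, its rows, and the name adults are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for Hive; illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("create table people (name text, age integer)")
conn.executemany(
    "insert into people values (?, ?)",
    [("Ann", 30), ("Bob", 17), ("Cid", 45)],
)

# The with clause names a temporary result set (adults) that the main
# query then reads from as if it were a table.
query = """
with adults as (
    select name, age from people where age >= 18
)
select * from adults order by name
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```

Note that there is no comma between the closing parenthesis of the last named subquery and the main select; commas only separate multiple named subqueries.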
