Hive常用命令總結

longliqiang88發表於2015-08-12

本文只是總結一些在Hive中常用的命令,並且假設需要的目錄或者資料已經存在。

建立表,\t作為列的分隔符

create table trade_detail (id bigint,income double,expenses double,time string) row formate delimited fields terminated by '\t';
create table user_info(id bigint, account string, name string, age int) row format delimited fields terminated by '\t';

接下來是稍複雜的語句,建立表的的同時進行賦值

create table result row format delimited fields terminated by '\t' as select t1.account, t1.income, t1.expenses, t1.surplus, t2.name from user_info t2 join (select account, sum(income) as income, sum(expenses) as expenses, sum(income-expenses) as surplus from trade_detail group by account) t1 on(t1.account = t2.account);

載入本地檔案到資料表中

load data local inpath '/home/hadoop/data/student.txt' overwrite into table student;
load data local inpath '/home/hadoop/data/user_info.doc' overwrite into table user_info;

建立外部表 ,建立外部表的一般情況指的是:先有檔案存放著資料,之後我們再來建立表,也就是說建立一張表,然後指向這個有資料的目錄。以後只要是向這個目錄中上傳符合格式的資料會被自動裝在到資料庫表中,因為在metastore(後設資料)會記錄這些資訊

create external table t_detail(id bigint, account string, income double, expenses double, time string) ) row format delimited fields terminated by '\t' location '/hive/td_partition';

建立分割槽表,一般用於資料量比較大的情況下, partitioned by (logdate string)用來指定按照什麼進行分割槽

create external table t_detail(id bigint, account string, income double, expenses double, time string)  row format delimited fields terminated by '\t' location '/hive/td_partition' partitioned by (logdate string);

將mysql中的資料直接儲存到Hive中

sqoop export --connect jdbc:mysql://192.168.8.103:3306/hmbbs --username root --password hadoop --export-dir '/user/hive/warehouse/pv_2013_05_31/000000_0' --table pv

基本的插入語法

insert overwrite table tablename [partiton(partcol1=val1,partclo2=val2)]select_statement from t_statement
insert overwrite table test_insert select * from test_table;

更新表的名稱

hive> alter table source RENAME TO target;

新增新一列

alter table invites add columns (new_col2 INT COMMENT 'a comment');

刪除表:

DROP TABLE records;

刪除表中資料,但要保持表的結構定義

dfs -rmr /user/hive/warehouse/records;

顯示所有函式

show functions;

檢視函式用法

describe function substr;

內連線

SELECT sales.*, things.* FROM sales JOIN things ON (sales.id = things.id);

檢視hive為某個查詢使用多少個MapReduce作業

Explain SELECT sales.*, things.* FROM sales JOIN things ON (sales.id = things.id);

外連線

SELECT sales.*, things.* FROM sales LEFT OUTER JOIN things ON (sales.id = things.id);
SELECT sales.*, things.* FROM sales RIGHT OUTER JOIN things ON (sales.id = things.id);
SELECT sales.*, things.* FROM sales FULL OUTER JOIN things ON (sales.id = things.id);

建立檢視

hive> CREATE VIEW valid_records AS SELECT * FROM records2 WHERE temperature !=9999;

檢視檢視詳細資訊

hive> DESCRIBE EXTENDED valid_records;

相關文章