[Hadoop] Hive r0.9.0 Chinese Documentation (Part 3): Hive Commands

Posted by 大搜車-自娛 on 2012-10-24
[size=large][b]1. Hive Command-Line Options[/b][/size]
Usage: hive [-hiveconf x=y]* [<-i filename>]* [<-f filename>|<-e query-string>] [-S]

-i <filename>              Initialization SQL from file (executed automatically and silently before any other commands)
-e 'quoted query string'   SQL from command line
-f <filename>              SQL from file
-S                         Silent mode in interactive shell, where only data is emitted
-hiveconf x=y              Use this to set Hive/Hadoop configuration variables

The -e and -f options cannot be specified together. If neither is given, the interactive shell is started. The -i option, however, can be combined with any other option.

To see this usage help, run hive -h


The following example runs a query from the command line:
$HIVE_HOME/bin/hive -e 'select a.col from tab1 a'


The following example sets Hive configuration variables for the query:
$HIVE_HOME/bin/hive -e 'select a.col from tab1 a' -hiveconf hive.exec.scratchdir=/home/my/hive_scratch  -hiveconf mapred.reduce.tasks=32


The following example uses silent mode to dump the query results into a text file:
$HIVE_HOME/bin/hive -S -e 'select a.col from tab1 a' > a.txt


The following example runs the statements in a SQL script file:
$HIVE_HOME/bin/hive -f /home/my/hive-script.sql


The following example runs an initialization script before entering the interactive shell:
$HIVE_HOME/bin/hive -i /home/my/hive-init.sql
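
The -S, -e and -hiveconf options shown above combine naturally in a script. As a minimal sketch (the function name hive_query_to_file and the HIVE_BIN override are hypothetical and not part of Hive), a POSIX shell helper could run one query in silent mode, pass any key=value arguments as -hiveconf options, and redirect the rows to a file:

```shell
# Hypothetical helper, not part of Hive: run one query with -S and -e,
# turn extra key=value arguments into -hiveconf options, and save the rows.
# HIVE_BIN is an assumed override; by default $HIVE_HOME/bin/hive is used.
hive_query_to_file() (
  if [ $# -lt 2 ]; then
    echo "usage: hive_query_to_file <query> <outfile> [key=value]..." >&2
    exit 2
  fi
  query=$1
  outfile=$2
  shift 2
  # Rewrite the remaining arguments as "-hiveconf key=value" pairs.
  for kv in "$@"; do
    shift
    set -- "$@" -hiveconf "$kv"
  done
  "${HIVE_BIN:-$HIVE_HOME/bin/hive}" -S -e "$query" "$@" > "$outfile"
)
```

A call such as hive_query_to_file 'select a.col from tab1 a' a.txt mapred.reduce.tasks=32 would then be equivalent to the silent-mode example above with one configuration override added.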


[size=large][b]2. The hiverc File[/b][/size]
When invoked without the -i option, Hive goes straight to the interactive shell and attempts to load $HIVE_HOME/bin/.hiverc and $HOME/.hiverc as initialization files.
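
A .hiverc file simply holds CLI commands, one statement per line. The following is a minimal sketch of creating one from the shell; the function name write_sample_hiverc is hypothetical, and the two settings are only borrowed from the command-line examples in this article, not recommended values:

```shell
# Hypothetical helper: write a sample init file to the given path.
# It refuses to overwrite a file that already exists.
write_sample_hiverc() (
  target=${1:?usage: write_sample_hiverc <path>}
  if [ -e "$target" ]; then
    echo "$target already exists; leaving it untouched" >&2
    exit 1
  fi
  # Both settings are taken from the command-line examples in this article.
  cat > "$target" <<'EOF'
set mapred.reduce.tasks=32;
set hive.exec.scratchdir=/home/my/hive_scratch;
EOF
)
```

Running write_sample_hiverc "$HOME/.hiverc" would create the per-user file; the same file could also be passed explicitly with -i.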


[size=large][b]3. Hive Interactive Shell Commands[/b][/size]

Command                    Description
quit                       Use quit or exit to leave the interactive shell.
set key=value              Sets the value of a particular configuration variable. Note that if you misspell the variable name, the CLI will not show an error.
set                        Prints the list of configuration variables that are overridden by the user or Hive.
set -v                     Prints all Hadoop and Hive configuration variables.
add FILE [file] [file]*    Adds a file to the list of resources.
list FILE                  Lists all the files added to the distributed cache.
list FILE [file]*          Checks whether the given resources are already added to the distributed cache.
! [cmd]                    Executes a shell command from the Hive shell.
dfs [dfs cmd]              Executes a dfs command from the Hive shell.
[query]                    Executes a Hive query and prints the results to standard output.
source FILE                Executes a script file inside the CLI.


Examples:

hive> set mapred.reduce.tasks=32;
hive> set;
hive> select a.* from tab1 a;
hive> !ls;
hive> dfs -ls;


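[size=large][b]4. Hive Logging[/b][/size]

Hive uses Log4j for logging. By default these logs are not emitted to standard output. Hive uses the hive-log4j configuration file in the conf directory, which writes the logs to /tmp/$USER/hive.log at the WARN level.

For debugging, you can send the logs to the console and change the logging level. From the command line, use: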

$HIVE_HOME/bin/hive -hiveconf hive.root.logger=INFO,console 


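hive.root.logger specifies both the logging level and the log destination. Here the destination is the console, so the logs are not written to the log file.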
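
Since the value of hive.root.logger is just a level followed by a destination, switching levels is easy to wrap. The sketch below is an assumption-laden convenience, not part of Hive: the function name hive_with_console_log and the HIVE_BIN override are hypothetical, and it deliberately accepts only four common Log4j levels.

```shell
# Hypothetical helper, not part of Hive: start the CLI with Log4j output
# sent to the console at the requested level (default INFO).
# HIVE_BIN is an assumed override for the path to the hive launcher.
hive_with_console_log() (
  level=${1:-INFO}
  if [ $# -gt 0 ]; then shift; fi
  case "$level" in
    DEBUG|INFO|WARN|ERROR) ;;
    *) echo "unsupported log level: $level" >&2; exit 2 ;;
  esac
  # Any remaining arguments (for example -e 'query') are passed through.
  "${HIVE_BIN:-$HIVE_HOME/bin/hive}" -hiveconf "hive.root.logger=$level,console" "$@"
)
```

For instance, hive_with_console_log DEBUG -e 'select a.col from tab1 a' would run a single query with debug logging on the console.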

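[size=large][b]5. Hive Resources[/b][/size]
Hive can manage additional resources that a query needs by adding them to the session. Any locally accessible file can be added to the session. Once Hive has loaded the file into the session, the related map/reduce tasks can use it. Hive uses the Hadoop distributed cache to handle the added files.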

ADD { FILE[S] | JAR[S] | ARCHIVE[S] } <filepath1> [<filepath2>]*
LIST { FILE[S] | JAR[S] | ARCHIVE[S] } [<filepath1> <filepath2> ..]
DELETE { FILE[S] | JAR[S] | ARCHIVE[S] } [<filepath1> <filepath2> ..]


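FILE resources are simply added to the distributed cache. JAR resources are also added to the Java classpath. ARCHIVE resources are automatically unarchived as part of distributing them.
For example: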

hive> add FILE /tmp/tt.py;
hive> list FILES;
/tmp/tt.py
hive> from networks a MAP a.networkid USING 'python tt.py' as nn where a.ds = '2009-01-04' limit 10;
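
The article does not show the contents of tt.py. For a MAP ... USING clause, Hive streams each input row to the script's standard input as one line, with columns separated by tabs, and reads the script's standard output back as rows. A minimal sketch of a script honoring that contract, written in shell rather than Python and performing a purely hypothetical transformation (prefixing each id with net_), might look like this:

```shell
# Hypothetical streaming mapper (a stand-in for tt.py, whose contents the
# article does not show). Reads one networkid per line from stdin and
# writes one output row per non-empty input line to stdout.
map_networkid() {
  while IFS= read -r networkid; do
    # Skip blank lines rather than emitting empty rows.
    [ -n "$networkid" ] || continue
    printf 'net_%s\n' "$networkid"
  done
}
```

The function can be tried locally without Hive, for example with printf '1\n22\n' | map_networkid. Saved in a script that ends by calling map_networkid (say, a hypothetical /tmp/tt.sh), it could be registered with add FILE and referenced as USING 'sh tt.sh' in the same way as tt.py above.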


If the command or file used by the query is already available on every node, there is no need to add it to the session. For example:

... MAP a.networkid USING 'wc -l' ...: here wc is an executable available on all machines.
... MAP a.networkid USING '/home/nfsserv1/hadoopscripts/tt.py' ...: here tt.py may be accessible via an NFS mount point that is configured identically on all the cluster nodes.
