Installing Hive 2.3.2 on Hadoop: Study Notes — Hive + MySQL Setup and the Pitfalls Encountered
I. Environment Preparation
Hadoop version: hadoop 2.8.1 (this tutorial installs Hive on an existing hadoop 2.8.1 deployment).
Hive version: apache-hive-2.3.2-bin.tar.gz
II. Installing and Configuring Hive + MySQL (Remote Mode)
1. First, install the MySQL database.
2. Download the Hive package apache-hive-2.3.2-bin.tar.gz from the official Hive site and extract it:
tar -zxvf apache-hive-2.3.2-bin.tar.gz
cd apache-hive-2.3.2-bin
[chenxun@chen.local 22:13 ~/software/apache-hive-2.3.2-bin]$ll
total 6840
-rw-r--r-- 1 chenxun staff 20798 Nov 10 00:26 LICENSE
-rw-r--r-- 1 chenxun staff 230 Nov 10 00:26 NOTICE
-rw-r--r-- 1 chenxun staff 1979 Nov 10 00:58 RELEASE_NOTES.txt
drwxr-xr-x 13 chenxun staff 442 Jan 28 01:14 bin
drwxr-xr-x 21 chenxun staff 714 Jan 25 22:08 binary-package-licenses
drwxr-xr-x 13 chenxun staff 442 Jan 28 01:47 conf
drwxr-xr-x 4 chenxun staff 136 Jan 25 22:08 examples
drwxr-xr-x 7 chenxun staff 238 Jan 25 22:08 hcatalog
drwxr-xr-x 3 chenxun staff 102 Jan 25 22:08 jdbc
drwxr-xr-x 256 chenxun staff 8704 Jan 25 22:41 lib
drwxr-xr-x 9 chenxun staff 306 Nov 22 15:31 mysql-connector-java-5.1.45
-rw-r--r--@ 1 chenxun staff 3467861 Nov 22 14:31 mysql-connector-java-5.1.45.tar.gz
drwxr-xr-x 4 chenxun staff 136 Jan 25 22:08 scripts
drwxr-xr-x 16 chenxun staff 544 Jan 28 02:03 tmp
Like Hadoop, Hive has three run modes:
- Embedded mode: metadata is stored in a local embedded Derby database. This is the simplest way to use Hive, but it has an obvious drawback: an embedded Derby database can only access one data file at a time, which means multi-session connections are not supported.
- Local mode: metadata is stored in a separate local database (usually MySQL), which supports multiple sessions and multiple users.
- Remote mode: used when there are many Hive clients. MySQL is split out onto its own host and the metadata is stored in that remote standalone MySQL service, avoiding the redundancy of installing MySQL on every client.
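As a rough sketch of how the modes differ in hive-site.xml (the Derby URL below is the default from hive-default.xml.template; hive.metastore.uris is the standard property a client would set for a remote metastore service, and is not used later in this post):
<!-- embedded mode (template default): Derby, one active session -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
</property>
<!-- local/remote mode: point ConnectionURL at MySQL instead, as done below;
     clients of a standalone metastore service would additionally set
     hive.metastore.uris to thrift://<metastore-host>:9083 -->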
Hive's configuration files live in the conf directory:
cp hive-default.xml.template hive-site.xml
Edit the hive-site.xml file:
1. Replace every occurrence of ${system:java.io.tmpdir} with /Users/chenxun/software/apache-hive-2.3.2-bin/tmp (this path is a directory you create yourself).
2. Replace every occurrence of ${system:user.name} with ${user.name}.
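Both substitutions can also be scripted (a sketch, not from the original post; sed -i.bak works with both GNU and BSD/macOS sed, and the tmp path is the one created above):
sed -i.bak -e 's|\${system:java.io.tmpdir}|/Users/chenxun/software/apache-hive-2.3.2-bin/tmp|g' \
    -e 's|\${system:user.name}|\${user.name}|g' hive-site.xml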
3. Configure MySQL.
Find the XML properties below and update each value in turn. The MySQL user and password for the connection are both hive, and hivedb is the name of the database to create; make sure these values stay consistent throughout the configuration. Note that the & characters in the JDBC URL must be escaped as &amp; inside the XML file:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://172.16.110.140:3306/hivedb?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
</property>
On the MySQL side, run the following commands to create the database and grant the hive user login and access privileges:
create database hivedb;
grant all on hivedb.* to hive@'%' identified by 'hive';
grant all on hivedb.* to hive@'localhost' identified by 'hive';
flush privileges;
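It is worth confirming the grants actually work before moving on (an optional sanity check, not from the original post; run from the Hive host):
mysql -h 172.16.110.140 -u hive -phive -e 'show databases;'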
Download the JDBC connector from https://dev.mysql.com/downloads/connector/j/ and put mysql-connector-java-5.1.45-bin.jar into the lib directory:
wget https://cdn.mysql.com//Downloads/Connector-J/mysql-connector-java-5.1.45.tar.gz
tar -zxvf mysql-connector-java-5.1.45.tar.gz
cd mysql-connector-java-5.1.45
mv mysql-connector-java-5.1.45-bin.jar ../lib
Initialize the metastore schema: run the following command in the bin directory.
schematool -initSchema -dbType mysql
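To double-check the result, schematool in Hive 2.x also accepts an -info switch that prints the schema version it finds in the metastore (optional, not part of the original steps):
schematool -dbType mysql -info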
Then check on the MySQL side whether it succeeded:
use hivedb;
show tables;
If you create the t1 and t2 tables successfully by following the steps later in this article, you can inspect some of their metadata here:
mysql> select * from TBLS;
+--------+-------------+-------+------------------+---------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT | IS_REWRITE_ENABLED |
+--------+-------------+-------+------------------+---------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
| 1 | 1516893558 | 1 | 0 | chenxun | 0 | 1 | t1 | MANAGED_TABLE | NULL | NULL | |
| 6 | 1516969402 | 1 | 0 | chenxun | 0 | 6 | t2 | MANAGED_TABLE | NULL | NULL | |
+--------+-------------+-------+------------------+---------+-----------+-------+----------+---------------+--------------------+--------------------+--------------------+
2 rows in set (0.00 sec)
mysql> select * from DBS;
+-------+-----------------------+-----------------------------------------------------------+--------------+------------+------------+
| DB_ID | DESC | DB_LOCATION_URI | NAME | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+-----------------------------------------------------------+--------------+------------+------------+
| 1 | Default Hive database | hdfs://localhost:9000/user/hive/warehouse | default | public | ROLE |
| 6 | NULL | hdfs://localhost:9000/user/hive/warehouse/db_hive_test.db | db_hive_test | chenxun | USER |
+-------+-----------------------+-----------------------------------------------------------+--------------+------------+------------+
2 rows in set (0.00 sec)
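The metadata tables can also be queried together; for example, this join over the metastore schema (TBLS.DB_ID references DBS.DB_ID, as the outputs above show) lists each table with the database it belongs to:
select d.NAME, t.TBL_NAME, t.TBL_TYPE from TBLS t join DBS d on t.DB_ID = d.DB_ID;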
Create tables t1 and t2 on the Hive side:
hive> CREATE TABLE t1(id int);    -- managed table t1 with a single int column, id
hive> CREATE TABLE t2(id int, name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';    -- managed table t2 with two columns separated by a tab
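To confirm the tab delimiter was registered on t2, the table definition can be inspected with standard HiveQL (not shown in the original post):
hive> DESCRIBE FORMATTED t2;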
Create the files t1.txt and t2.txt locally; their contents are:
[root@master temp]# cat t1.txt
1
2
3
4
5
6
7
9
[root@master temp]# cat t2.txt
1 a
2 b
3 c
9 x
Upload t2.txt to HDFS:
[chenxun@chen.local 23:12 ~]$hadoop fs -put -f /Users/chenxun/t2.txt /
[chenxun@chen.local 23:12 ~]$hadoop fs -ls /
Found 3 items
-rw-r--r-- 1 chenxun supergroup 16 2018-01-28 23:09 /t2.txt
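The uploaded file can be sanity-checked straight from HDFS (an optional step, not in the original transcript):
hadoop fs -cat /t2.txt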
Load the data into Hive:
hive> LOAD DATA LOCAL INPATH '/Users/chenxun/t1.txt' INTO TABLE t1;    -- load from the local filesystem
hive> LOAD DATA INPATH '/t2.txt' INTO TABLE t2;    -- load from HDFS
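Note that LOAD DATA INPATH moves the file within HDFS rather than copying it: after the load, /t2.txt is gone from the HDFS root and sits under the table's warehouse directory instead. A quick way to see this (path assumed from the default warehouse location shown in the DBS output above):
hadoop fs -ls /user/hive/warehouse/t2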
III. Starting the hiveserver2 Service (solutions to problems and errors appear later in this article)
Configure the hiveserver2 options in hive-site.xml:
<property>
  <name>hive.server2.thrift.bind.host</name>
  <value>127.0.0.1</value>
</property>
<property>
  <name>hive.server2.thrift.port</name>
  <value>10000</value>
</property>
Start the hiveserver2 service in the background:
hive --service hiveserver2 &
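Before connecting, you can confirm that the Thrift port is actually listening (a generic check, not from the original post):
lsof -i :10000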
Start beeline, then connect to Hive with !connect jdbc:hive2://127.0.0.1:10000:
[chenxun@chen.local 23:24 ~/software/apache-hive-2.3.2-bin/bin]$beeline
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/chenxun/software/apache-hive-2.3.2-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/chenxun/software/hadoop-2.8.1/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 2.3.2 by Apache Hive
beeline> !connect jdbc:hive2://127.0.0.1:10000
Connecting to jdbc:hive2://127.0.0.1:10000
Enter username for jdbc:hive2://127.0.0.1:10000:
Enter password for jdbc:hive2://127.0.0.1:10000:
Connected to: Apache Hive (version 2.3.2)
Driver: Hive JDBC (version 2.3.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://127.0.0.1:10000> show databases;
OK
+----------------+
| database_name |
+----------------+
| db_hive_test |
| default |
+----------------+
2 rows selected (2.367 seconds)
0: jdbc:hive2://127.0.0.1:10000> select * from t1
. . . . . . . . . . . . . . . .> ;
OK
+--------+
| t1.id |
+--------+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 9 |
| NULL |
+--------+
9 rows selected (2.416 seconds)
0: jdbc:hive2://127.0.0.1:10000> select * from t2;
OK
+--------+----------+
| t2.id | t2.name |
+--------+----------+
| 1 | a |
| 2 | b |
| 3 | c |
| 9 | x |
+--------+----------+
4 rows selected (0.254 seconds)
0: jdbc:hive2://127.0.0.1:10000>
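Equivalently, the connection can be made in a single step using beeline's standard -u and -n flags (not shown in the original transcript; substitute your own user):
beeline -u jdbc:hive2://127.0.0.1:10000 -n chenxun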
At this point simple queries against Hive already work, but to make it generate a MapReduce job, let's write a slightly more involved statement:
0: jdbc:hive2://127.0.0.1:10000> select t2.name from t1 left join t2 on t1.id = t2.id;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = chenxun_20180128232948_92e7d150-7613-43e0-9e14-fefcecdd3aff
Total jobs = 1
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/chenxun/software/apache-hive-2.3.2-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/chenxun/software/hadoop-2.8.1/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2018-01-28 23:29:55 Starting to launch local task to process map join; maximum memory = 477626368
2018-01-28 23:29:56 Dump the side-table for tag: 1 with group count: 4 into file: file:/Users/chenxun/software/apache-hive-2.3.2-bin/tmp/chenxun/cc45ef96-0ff8-4482-aca9-8552cba0b159/hive_2018-01-28_23-29-48_650_9212450484633339980-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile01--.hashtable
2018-01-28 23:29:57 Uploaded 1 File to: file:/Users/chenxun/software/apache-hive-2.3.2-bin/tmp/chenxun/cc45ef96-0ff8-4482-aca9-8552cba0b159/hive_2018-01-28_23-29-48_650_9212450484633339980-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile01--.hashtable (344 bytes)
2018-01-28 23:29:57 End of local task; Time Taken: 1.493 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1517075946129_0001, Tracking URL = http://chen.local:8088/proxy/application_1517075946129_0001/
Kill Command = /Users/chenxun/software/hadoop-2.8.1/bin/hadoop job -kill job_1517075946129_0001
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2018-01-28 23:30:17,415 Stage-3 map = 0%, reduce = 0%
2018-01-28 23:30:26,953 Stage-3 map = 100%, reduce = 0%
Ended Job = job_1517075946129_0001
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1 HDFS Read: 5584 HDFS Write: 218 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
+----------+
| t2.name |
+----------+
| a |
| b |
| c |
| NULL |
| NULL |
| NULL |
| NULL |
| x |
| NULL |
+----------+
9 rows selected (40.835 seconds)
0: jdbc:hive2://127.0.0.1:10000>
IV. Pitfalls Encountered
1. When running in HiveServer2 + Beeline mode, the following error appeared after starting HiveServer2:
java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException):
User root is not allowed to impersonate anonymous 錯誤。
Add the following to core-site.xml under Hadoop's etc/hadoop directory to grant the proxy-user permission:
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
Because my user is root, the property names are hadoop.proxyuser.root.hosts and hadoop.proxyuser.root.groups.
If the error message names a different user, say chen, replace root with chen in the property names above.
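For example, for user chen the entries would become the following; note that the NameNode must be restarted, or told to reload its proxy-user settings with the standard HDFS refresh command shown last (the refresh command is not part of the original post):
<property>
  <name>hadoop.proxyuser.chen.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.chen.groups</name>
  <value>*</value>
</property>
hdfs dfsadmin -refreshSuperUserGroupsConfiguration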
2. Error when connecting with !connect jdbc:hive2://127.0.0.1:10000:
Error: Could not open client transport with JDBC Uri: jdbc:hive2://127.0.0.1:10000: Failed to open new session:
Permission denied: user=anonymous, access=EXECUTE, inode="/tmp"
This happens because the connecting user has no permission on the HDFS /tmp directory; granting it fixes the error:
hadoop fs -chmod -R 777 /tmp
For reference, the full failure looked like this:
beeline> !connect jdbc:hive2://127.0.0.1:10000
Connecting to jdbc:hive2://127.0.0.1:10000
Enter username for jdbc:hive2://127.0.0.1:10000:
Enter password for jdbc:hive2://127.0.0.1:10000:
18/01/28 01:59:20 [main]: WARN jdbc.HiveConnection: Failed to connect to 127.0.0.1:10000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://127.0.0.1:10000: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: Permission denied: user=anonymous, access=EXECUTE, inode="/tmp":chenxun:supergroup:drwx------
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:310)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:271)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:206)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:189)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:499)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1603)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkTraverse(FSDirectory.java:1621)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.resolvePath(FSDirectory.java:542)
at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:110)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2929)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1106)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:858)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:788)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2455) (state=08S01,code=0)
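After the chmod, the new permissions can be verified before retrying the connection (an assumed check, not part of the original transcript):
hadoop fs -ls -d /tmp
# expect the mode to show as drwxrwxrwx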