Configuring Hadoop Hive to Store Its Metadata in MySQL

Posted by longqidong on 2012-03-18

First make sure that both Hive and MySQL have been installed successfully.


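Add the properties below to hive-site.xml. They specify the metastore address and how Hive connects to it.

A freshly installed Hive has no hive-site.xml under conf. Copy hive-default.xml to hive-site.xml first, then edit the copy.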
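The copy step can be sketched as a small shell helper. This is only an illustration: the function name init_hive_site and the argument (the Hive install directory, such as the hive-0.6.0 directory used in this post) are assumptions, not part of Hive. It refuses to overwrite an existing hive-site.xml so that earlier edits are not lost.

```shell
# Hypothetical helper (not shipped with Hive): create conf/hive-site.xml
# from the hive-default.xml template if it does not exist yet.
# Usage: init_hive_site /path/to/hive-0.6.0
init_hive_site() {
    conf_dir="$1/conf"
    if [ -f "$conf_dir/hive-site.xml" ]; then
        # Keep any edits that were already made.
        echo "hive-site.xml already exists, leaving it untouched"
        return 0
    fi
    cp "$conf_dir/hive-default.xml" "$conf_dir/hive-site.xml"
}
```

Once the file exists, the metastore properties go inside its configuration element.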

     
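<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?characterEncoding=UTF-8</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
  <description>username to use against metastore database</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123</value>
  <description>password to use against metastore database</description>
</property>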
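Before starting Hive, it can help to confirm that all four settings made it into the file. The sketch below is a rough check only: check_metastore_props is a hypothetical helper, and it assumes each property name is written as a single-line name element with no extra whitespace, as in the usual Hadoop-style property layout. It does not validate the values.

```shell
# Hypothetical helper: report which of the four JDBC metastore
# properties are missing from a hive-site.xml file.
# Usage: check_metastore_props /path/to/hive-0.6.0/conf/hive-site.xml
check_metastore_props() {
    site_file="$1"
    missing=0
    for key in ConnectionURL ConnectionDriverName ConnectionUserName ConnectionPassword; do
        # -F matches the string literally, so the dots are not treated as wildcards.
        if ! grep -qF "<name>javax.jdo.option.$key</name>" "$site_file"; then
            echo "missing property: javax.jdo.option.$key"
            missing=1
        fi
    done
    # Returns 0 when all four names are present, 1 otherwise.
    return "$missing"
}
```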

 

Next, log in to the Hive client and try creating a table.

[gpadmin1@hadoop5 hive-0.6.0]$ bin/hive
Hive history file=/tmp/gpadmin1/hive_job_log_gpadmin1_201106081130_1156785421.txt
hive> show tables;
FAILED: Error in metadata: javax.jdo.JDOFatalDataStoreException: Unknown database 'hive'
NestedThrowables:
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown database 'hive'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

 

It fails, and the message is clear: there is no database named hive. The database has to be created manually in MySQL.

[Intranet root@hadoop6 /var/lib/mysql]
#mysql -u root -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 41
Server version: 5.5.12 MySQL Community Server (GPL)

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)

mysql> create database hive;
Query OK, 1 row affected (0.00 sec)

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.00 sec)
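
The database name has to match the last path segment of the javax.jdo.option.ConnectionURL value, which is hive in this setup. As a small illustration, the hypothetical helper below pulls that name out of a jdbc:mysql URL using plain shell parameter expansion. It assumes the simple host:port/database?options shape used in this post and nothing more elaborate.

```shell
# Hypothetical helper: print the database name from a jdbc:mysql URL.
# Usage: jdbc_db_name 'jdbc:mysql://localhost:3306/hive?characterEncoding=UTF-8'
jdbc_db_name() {
    url="$1"
    # Drop the scheme, host and port up to the first slash after the host.
    rest="${url#jdbc:mysql://*/}"
    # Drop the query string, if there is one.
    echo "${rest%%\?*}"
}

# One possible non-interactive use (prompts for the MySQL password):
#   mysql -u root -p -e "CREATE DATABASE IF NOT EXISTS $(jdbc_db_name "$url")"
```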

 

Now log in to Hive again and take a look.

[gpadmin1@hadoop5 hive-0.6.0]$ bin/hive
Hive history file=/tmp/gpadmin1/hive_job_log_gpadmin1_201106081130_544334815.txt
hive> show table;                       
FAILED: Parse Error: line 0:-1 mismatched input '' expecting EXTENDED in show statement

hive> show tables;
OK
Time taken: 5.173 seconds
hive>         CREATE TABLE cite (id1 INT,
    >            id2 int
    >            )
    >          ROW FORMAT DELIMITED
    >          FIELDS TERMINATED BY ',';
OK
Time taken: 0.266 seconds
hive> show tables;                         
OK
cite
Time taken: 0.197 seconds
hive>

 

It works now, so the missing database was indeed the problem.

Hive also creates a set of tables in this database to hold the metadata. We can see which ones:

mysql> use hive;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+----------------+
| Tables_in_hive |
+----------------+
| BUCKETING_COLS |
| COLUMNS        |
| DBS            |
| PARTITION_KEYS |
| SDS            |
| SD_PARAMS      |
| SEQUENCE_TABLE |
| SERDES         |
| SERDE_PARAMS   |
| SORT_COLS      |
| TABLE_PARAMS   |
| TBLS           |
+----------------+
12 rows in set (0.00 sec)
mysql> select * from TBLS;     
+--------+-------------+-------+------------------+----------+-----------+-------+----------+---------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER    | RETENTION | SD_ID | TBL_NAME | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+----------+-----------+-------+----------+---------------+--------------------+--------------------+
|      1 |  1307504073 |     1 |                0 | gpadmin1 |         0 |     1 | cite     | MANAGED_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+----------+-----------+-------+----------+---------------+--------------------+--------------------+
1 row in set (0.00 sec)

mysql>

 

The cite table we just created shows up here as well.

 

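One more thing to watch out for: when MySQL stores the metadata, Hive must be able to reach MySQL, and for that it needs the MySQL JDBC driver. Copy the jar mysql-connector-java-5.1.15-bin.jar into Hive's lib directory. Otherwise statements fail with an error like this: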

hive> show tables;
FAILED: Error in metadata: javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
NestedThrowables:
java.lang.reflect.InvocationTargetException
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

Copying the jar into place fixes it.
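
A quick way to confirm the driver is in place before starting Hive is to look for the jar in the lib directory. The helper below is a sketch: has_mysql_connector is a hypothetical name, and the file pattern assumes the jar keeps its usual mysql-connector-java-<version> prefix. It only checks that such a file exists, not that Hive can load it.

```shell
# Hypothetical helper: succeed if a MySQL Connector/J jar is present
# in the lib directory of the given Hive install.
# Usage: has_mysql_connector /path/to/hive-0.6.0
has_mysql_connector() {
    lib_dir="$1/lib"
    for jar in "$lib_dir"/mysql-connector-java-*.jar; do
        # If nothing matches, the pattern stays literal and -e fails.
        [ -e "$jar" ] && return 0
    done
    echo "no mysql-connector-java jar found in $lib_dir" >&2
    return 1
}

# If the check fails, copy the driver in, for example:
#   cp mysql-connector-java-5.1.15-bin.jar /path/to/hive-0.6.0/lib/
```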

Source: ITPUB blog, link: http://blog.itpub.net/22418990/viewspace-718885/. If you repost this article, please credit the source; otherwise legal action may be pursued.
