(十二)SparkSQL Catalog訪問Hive後設資料資訊
SparkSQL如何直接訪問hive後設資料資訊
不再需要去mysql裡的表裡去查詢
[hadoop@hadoop001 bin]$ ./spark-shell --master local[2] --jars ~/software/mysql-connector-java-5.1.27.jar
scala> val catalog = spark.catalog
scala> catalog.listDatabases.show(false)
scala> catalog.listDatabases.show(false)
+---------------+----------------------------------+------------------------------------------------------------------+
|name |description |locationUri |
+---------------+----------------------------------+------------------------------------------------------------------+
|default |Default Hive database |hdfs://192.168.137.141:9000/user/hive/warehouse |
|hive | |hdfs://192.168.137.141:9000/user/hive/warehouse/hive.db |
|hive2_ruozedata|this is ruozedata 03 test database|hdfs://192.168.137.141:9000/user/hive/warehouse/hive2_ruozedata.db|
|hive3 | |hdfs://192.168.137.141:9000/zh |
|ruozedata | |hdfs://192.168.137.141:9000/user/hive/warehouse/ruozedata.db |
+---------------+----------------------------------+------------------------------------------------------------------+
scala> catalog.listDatabases.select("name").show(false)
+---------------+
|name |
+---------------+
|default |
|hive |
|hive2_ruozedata|
|hive3 |
|ruozedata |
+---------------+
scala> catalog.listTables("default").show(false)
+---------------+--------+-----------+---------+-----------+
|name |database|description|tableType|isTemporary|
+---------------+--------+-----------+---------+-----------+
|hive_array |default |null |MANAGED |false |
|hive_map |default |null |MANAGED |false |
|hive_rownumber |default |null |MANAGED |false |
|hive_struct |default |null |MANAGED |false |
|hive_wc |default |null |MANAGED |false |
|rating_json |default |null |MANAGED |false |
|ruoze_test |default |null |MANAGED |false |
|user_click1 |default |null |MANAGED |false |
|user_click_tmp1|default |null |MANAGED |false |
|zh |default |null |MANAGED |false |
+---------------+--------+-----------+---------+-----------+
scala> catalog.listColumns("ruozedata","ruozedata_emp").show(false)
+--------+-----------+--------+--------+-----------+--------+
|name |description|dataType|nullable|isPartition|isBucket|
+--------+-----------+--------+--------+-----------+--------+
|empno |null |int |true |false |false |
|ename |null |string |true |false |false |
|job |null |string |true |false |false |
|mgr |null |int |true |false |false |
|hiredate|null |string |true |false |false |
|salary |null |double |true |false |false |
|comm |null |double |true |false |false |
|deptno |null |int |true |false |false |
+--------+-----------+--------+--------+-----------+--------+
檢視MySQL驗證一下:
mysql> show databases;
mysql> use ruozedata_basic03;
//查詢所有資料庫
mysql> select * from dbs;
+-------+------------------------------------+--------------------------------------------------------------------+-----------------+------------+------------+
| DB_ID | DESC | DB_LOCATION_URI | NAME | OWNER_NAME | OWNER_TYPE |
+-------+------------------------------------+--------------------------------------------------------------------+-----------------+------------+------------+
| 1 | Default Hive database | hdfs://192.168.137.141:9000/user/hive/warehouse | default | public | ROLE |
| 6 | NULL | hdfs://192.168.137.141:9000/user/hive/warehouse/hive.db | hive | hadoop | USER |
| 9 | this is ruozedata 03 test database | hdfs://192.168.137.141:9000/user/hive/warehouse/hive2_ruozedata.db | hive2_ruozedata | hadoop | USER |
| 10 | NULL | hdfs://192.168.137.141:9000/zh | hive3 | hadoop | USER |
| 11 | NULL | hdfs://192.168.137.141:9000/user/hive/warehouse/ruozedata.db | ruozedata | hadoop | USER |
+-------+------------------------------------+--------------------------------------------------------------------+-----------------+------------+------------+
5 rows in set (0.00 sec)
//查詢資料庫default的所有表格
mysql> select * from tbls where DB_ID=1;
+--------+-------------+-------+------------------+--------+-----------+-------+-----------------+---------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+--------+-----------+-------+-----------------+---------------+--------------------+--------------------+
| 1 | 1529135635 | 1 | 0 | hadoop | 0 | 1 | zh | MANAGED_TABLE | NULL | NULL |
| 31 | 1529287737 | 1 | 0 | hadoop | 0 | 41 | hive_wc | MANAGED_TABLE | NULL | NULL |
| 32 | 1529288829 | 1 | 0 | hadoop | 0 | 42 | rating_json | MANAGED_TABLE | NULL | NULL |
| 33 | 1529289493 | 1 | 0 | hadoop | 0 | 43 | hive_rownumber | MANAGED_TABLE | NULL | NULL |
| 76 | 1531056372 | 1 | 0 | hadoop | 0 | 91 | user_click_tmp1 | MANAGED_TABLE | NULL | NULL |
| 78 | 1531056697 | 1 | 0 | hadoop | 0 | 95 | user_click1 | MANAGED_TABLE | NULL | NULL |
| 81 | 1532851994 | 1 | 0 | hadoop | 0 | 101 | hive_array | MANAGED_TABLE | NULL | NULL |
| 83 | 1532855555 | 1 | 0 | hadoop | 0 | 103 | hive_map | MANAGED_TABLE | NULL | NULL |
| 85 | 1532856628 | 1 | 0 | hadoop | 0 | 105 | hive_struct | MANAGED_TABLE | NULL | NULL |
| 91 | 1535883337 | 1 | 0 | hadoop | 0 | 111 | ruoze_test | MANAGED_TABLE | NULL | NULL |
+--------+-------------+-------+------------------+--------+-----------+-------+-----------------+---------------+--------------------+--------------------+
10 rows in set (0.00 sec)
//查詢ruozedata資料庫中ruoze_emp的所有列,需分幾步才能完成
mysql> select * from tbls where DB_ID=11;
+--------+-------------+-------+------------------+--------+-----------+-------+-------------------------+---------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+--------+-----------+-------+-------------------------+---------------+--------------------+--------------------+
| 11 | 1529212083 | 11 | 0 | hadoop | 0 | 11 | ruozedata_person | MANAGED_TABLE | NULL | NULL |
| 12 | 1529213071 | 11 | 0 | hadoop | 0 | 12 | ruozedata_emp | MANAGED_TABLE | NULL | NULL |
| 13 | 1529214712 | 11 | 0 | hadoop | 0 | 13 | ruozedata_emp2 | MANAGED_TABLE | NULL | NULL |
| 14 | 1529215446 | 11 | 0 | hadoop | 0 | 14 | ruozedata_emp3_new | MANAGED_TABLE | NULL | NULL |
| 17 | 1529220164 | 11 | 0 | hadoop | 0 | 17 | ruozedata_emp4 | MANAGED_TABLE | NULL | NULL |
| 22 | 1529234706 | 11 | 0 | hadoop | 0 | 22 | a | MANAGED_TABLE | NULL | NULL |
| 24 | 1529235629 | 11 | 0 | hadoop | 0 | 24 | b | MANAGED_TABLE | NULL | NULL |
| 25 | 1529237177 | 11 | 0 | hadoop | 0 | 25 | order_partition | MANAGED_TABLE | NULL | NULL |
| 26 | 1529241702 | 11 | 0 | hadoop | 0 | 29 | order_4_partition | MANAGED_TABLE | NULL | NULL |
| 27 | 1529242467 | 11 | 0 | hadoop | 0 | 31 | order_mulit_partition | MANAGED_TABLE | NULL | NULL |
| 28 | 1529244966 | 11 | 0 | hadoop | 0 | 33 | ruozedata_static_emp | MANAGED_TABLE | NULL | NULL |
| 29 | 1529245719 | 11 | 0 | hadoop | 0 | 35 | ruozedata_dynamic_emp | MANAGED_TABLE | NULL | NULL |
| 30 | 1529247858 | 11 | 0 | hadoop | 0 | 40 | dual | MANAGED_TABLE | NULL | NULL |
| 36 | 1530106021 | 11 | 0 | hadoop | 0 | 46 | emp_sqoop | MANAGED_TABLE | NULL | NULL |
| 41 | 1530107925 | 11 | 0 | hadoop | 0 | 51 | ruozedata_emp_partition | MANAGED_TABLE | NULL | NULL |
| 51 | 1530622246 | 11 | 0 | hadoop | 0 | 66 | city_info | MANAGED_TABLE | NULL | NULL |
| 52 | 1530622255 | 11 | 0 | hadoop | 0 | 67 | product_info | MANAGED_TABLE | NULL | NULL |
| 72 | 1531051049 | 11 | 0 | hadoop | 0 | 87 | user_click_tmp | MANAGED_TABLE | NULL | NULL |
| 79 | 1531060915 | 11 | 0 | hadoop | 0 | 96 | product_rank | MANAGED_TABLE | NULL | NULL |
| 80 | 1531061272 | 11 | 0 | hadoop | 0 | 97 | user_click | MANAGED_TABLE | NULL | NULL |
| 86 | 1532861861 | 11 | 0 | hadoop | 0 | 106 | ruoze_dept | MANAGED_TABLE | NULL | NULL |
+--------+-------------+-------+------------------+--------+-----------+-------+-------------------------+---------------+--------------------+--------------------+
21 rows in set (0.00 sec)
mysql> select SD_ID,CD_ID from sds where SD_ID=12;
+-------+-------+
| SD_ID | CD_ID |
+-------+-------+
| 12 | 12 |
+-------+-------+
1 row in set (0.01 sec)
mysql> select * from columns_v2 where CD_ID=12;
+-------+---------+-------------+-----------+-------------+
| CD_ID | COMMENT | COLUMN_NAME | TYPE_NAME | INTEGER_IDX |
+-------+---------+-------------+-----------+-------------+
| 12 | NULL | comm | double | 6 |
| 12 | NULL | deptno | int | 7 |
| 12 | NULL | empno | int | 0 |
| 12 | NULL | ename | string | 1 |
| 12 | NULL | hiredate | string | 4 |
| 12 | NULL | job | string | 2 |
| 12 | NULL | mgr | int | 3 |
| 12 | NULL | salary | double | 5 |
+-------+---------+-------------+-----------+-------------+
8 rows in set (0.00 sec)
相關文章
- 基於Hive進行數倉建設的資源後設資料資訊統計:Hive篇Hive
- SparkSQL:Parquet資料來源之合併後設資料SparkSQL
- hive(4)——後設資料概述Hive
- SparkSQL與Hive metastore ParquetSparkSQLHiveAST
- 從Hive遷移到SparkSQL,有讚的大資料實踐HiveSparkSQL大資料
- Salesforce和SAPHANA的後設資料訪問加速Salesforce
- Salesforce和SAP HANA的後設資料訪問加速Salesforce
- hive的安裝(後設資料庫: MySQL)Hive資料庫MySql
- 基於Hive進行數倉建設的資源後設資料資訊統計:Spark篇HiveSpark
- hive後設資料和mysql表的對應HiveMySql
- 【Spark篇】---SparkSQL on Hive的配置和使用SparkSQLHive
- 開源大資料叢集部署(十二)Ranger 整合 hive大資料RangerHive
- 配置hadoop HIVE後設資料儲存在mysql中HadoopHiveMySql
- Hive -------- 使用mysql儲存hive後設資料,Mysql的安裝以及配置步驟HiveMySql
- Hive學習之JDBC訪問HiveJDBC
- 2- hive後設資料與hadoop的關係HiveHadoop
- 大資料元件-Hive部署基於MySQL作為後設資料儲存大資料元件HiveMySql
- SparkSQL外部資料來源SparkSQL
- Hive安裝(本地獨立模式,MySql為後設資料庫)Hive模式MySql資料庫
- 使用AMDU DUMP ASM磁碟後設資料資訊ASM
- C#與資料庫訪問技術總結(十二)資料閱讀器(DataReader)2C#資料庫
- 重灌Windows XP系統後資料夾不能訪問Windows
- 大資料量資料遷移後統計資訊問題大資料
- 資料訪問層
- 帶著問題讀 TiDB 原始碼:Hive 後設資料使用 TiDB 啟動報錯TiDB原始碼Hive
- 【Hive】hive資料遷移Hive
- rman連線catalog註冊資料庫問題資料庫
- 【RMAN】catalog資料庫資料庫
- 建立catalog 資料庫資料庫
- 建立CATALOG資料庫。資料庫
- Mysql資料庫的遠端訪問設定MySql資料庫
- 資料訪問層基礎結構設計
- 解決hive資料庫 插入資料很慢的問題Hive資料庫
- Spring資料訪問Spring
- JDBC資料庫訪問JDBC資料庫
- 資料訪問 - EntityFramework整合Framework
- 控制資料訪問(一)
- 好程式設計師大資料學習路線分享SparkSQl程式設計師大資料SparkSQL