How to import data into a Hive data warehouse
1. Importing data from a local file
[hadoop@tong1 ~]$ cat 1.txt  --all fields are separated by tabs
2 3
1 1
[hadoop@tong1 ~]$ cat 2.txt
1 2
3 4
5 6
7 8
1 2
3 4
0 0
[hadoop@tong1 ~]$ cat 3.txt
5 6
7 8
hive> create table q(a int,b int) row format delimited fields terminated by '\t' stored as textfile;
OK
Time taken: 0.093 seconds
hive> desc q;
OK
a int
b int
Time taken: 0.117 seconds, Fetched: 2 row(s)
hive> load data local inpath '/home/hadoop/1.txt' into table q;  --INTO appends to the table's data; OVERWRITE would replace it
Loading data to table tong.q
Table tong.q stats: [numFiles=1, totalSize=8]
OK
Time taken: 0.307 seconds
hive> select * from q;
OK
2 3
1 1
Time taken: 0.121 seconds, Fetched: 2 row(s)
hive> load data local inpath '/home/hadoop/2.txt' overwrite into table q;  --OVERWRITE replaces the table's existing data
Loading data to table tong.q
Table tong.q stats: [numFiles=1, numRows=0, totalSize=28, rawDataSize=0]
OK
Time taken: 0.315 seconds
hive> select * from q;
OK
1 2
3 4
5 6
7 8
1 2
3 4
0 0
Time taken: 0.051 seconds, Fetched: 7 row(s)
hive>
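Note that LOAD DATA does not parse the files at load time; it only places them in the table's directory, and the `row format delimited fields terminated by '\t'` clause is applied when the table is read. A minimal Python sketch of that read-time parsing (the function name and sample data are illustrative, not part of Hive):

```python
def parse_rows(lines, delimiter="\t"):
    """Split each line on the delimiter and cast fields to int,
    mirroring the table definition (a int, b int)."""
    return [tuple(int(f) for f in line.split(delimiter)) for line in lines]

# Contents of 1.txt from the example above.
rows = parse_rows(["2\t3", "1\t1"])
print(rows)  # [(2, 3), (1, 1)]
```

Because parsing happens at read time, a file whose fields are not tab-separated would still load without error, but would produce NULL columns when queried.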
2. Importing data from the HDFS file system
[hadoop@tong1 ~]$ hadoop fs -put /home/hadoop/3.txt /user/hive/warehouse/  --copy 3.txt into the HDFS file system
[hadoop@tong1 ~]$ hadoop fs -ls /user/hive/warehouse/
Found 6 items
-rw-r--r-- 2 hadoop supergroup 8 2015-01-26 14:41 /user/hive/warehouse/3.txt
[hadoop@tong1 ~]$ hive
Logging initialized using configuration in jar:file:/usr/local/hive-0.14.0/lib/hive-common-0.14.0.jar!/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hive-0.14.0/lib/hive-jdbc-0.14.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
hive> load data inpath '/user/hive/warehouse/3.txt' into table q;  --no LOCAL keyword: the path is on HDFS
Loading data to table tong.q
Table tong.q stats: [numFiles=2, numRows=0, totalSize=36, rawDataSize=0]
OK
Time taken: 0.295 seconds
hive> select * from q;
OK
1 2
3 4
5 6
7 8
1 2
3 4
0 0
5 6
7 8
Time taken: 0.063 seconds, Fetched: 9 row(s)
hive>
[hadoop@tong1 ~]$ hadoop fs -ls /user/hive/warehouse/  --after the load, 3.txt has been moved into the table's directory, so it no longer appears here
Found 5 items
drwxr-xr-x - hadoop supergroup 0 2015-01-12 13:31 /user/hive/warehouse/hwz
drwxr-xr-x - hadoop supergroup 0 2015-01-13 15:21 /user/hive/warehouse/hwz1
drwxr-xr-x - hadoop supergroup 0 2015-01-12 15:11 /user/hive/warehouse/t
drwxr-xr-x - hadoop supergroup 0 2015-01-12 17:42 /user/hive/warehouse/t1
drwxr-xr-x - hadoop supergroup 0 2015-01-26 14:32 /user/hive/warehouse/tong.db
[hadoop@tong1 ~]$
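LOAD DATA without LOCAL moves the file rather than copying it, which is why 3.txt disappears from /user/hive/warehouse/ in the listing above. A rough Python analogy of that move (the directories here are temporary stand-ins, not real HDFS paths):

```python
import os
import shutil
import tempfile

# Stand-ins for the HDFS source path and the table's warehouse directory.
staging = tempfile.mkdtemp()
table_dir = tempfile.mkdtemp()

src = os.path.join(staging, "3.txt")
with open(src, "w") as f:
    f.write("5\t6\n7\t8\n")

# LOAD DATA ... INTO TABLE: the file is moved into the table directory,
# so the original path no longer exists afterwards.
shutil.move(src, os.path.join(table_dir, "3.txt"))

print(os.path.exists(src))    # False: the source file is gone
print(os.listdir(table_dir))  # ['3.txt']
```

With LOCAL, by contrast, the local file is copied up to HDFS and the original stays on the local disk.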
3. Importing data by creating a table from a query (CTAS)
hive> create table q1 as select * from q;
Query ID = hadoop_20150126144747_dbc07ce3-40b0-441c-8ec3-08a48092593d
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1422249676009_0007, Tracking URL =
Kill Command = /usr/local/hadoop-2.6.0/bin/hadoop job -kill job_1422249676009_0007
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-01-26 14:47:41,941 Stage-1 map = 0%, reduce = 0%
2015-01-26 14:47:49,247 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.63 sec
MapReduce Total cumulative CPU time: 1 seconds 630 msec
Ended Job = job_1422249676009_0007
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://tong1:9000/tmp/hive/hadoop/c8ed6f95-d55d-4d1d-ba74-10170523f138/hive_2015-01-26_14-47-32_330_8683539045786824558-1/-ext-10001
Moving data to: hdfs://tong1:9000/user/hive/warehouse/tong.db/q1
Table tong.q1 stats: [numFiles=1, numRows=9, totalSize=36, rawDataSize=27]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 1.63 sec HDFS Read: 313 HDFS Write: 99 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 630 msec
OK
Time taken: 18.243 seconds
hive> select * from q1;
OK
1 2
3 4
5 6
7 8
1 2
3 4
0 0
5 6
7 8
Time taken: 0.046 seconds, Fetched: 9 row(s)
hive>
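CREATE TABLE ... AS SELECT runs the query as a MapReduce job and materializes its result set as the new table's data, as the job log above shows. The effect, sketched in Python with the nine rows from table q (names are illustrative):

```python
# The nine rows held by table q at this point in the walkthrough.
q = [(1, 2), (3, 4), (5, 6), (7, 8), (1, 2), (3, 4), (0, 0), (5, 6), (7, 8)]

def ctas(select_result):
    """Materialize a query's result set as a brand-new table
    (here modeled as a new, independent list)."""
    return list(select_result)

# create table q1 as select * from q;
q1 = ctas(row for row in q)
print(len(q1))  # 9
```

The new table is an independent copy: later changes to q do not affect q1, just as in Hive.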
4. Importing data with an INSERT statement
hive> select * from q1;  --data before the insert
OK
1 2
3 4
5 6
7 8
1 2
3 4
0 0
5 6
7 8
Time taken: 0.07 seconds, Fetched: 9 row(s)
hive> insert into table q1 select * from q where a=1;  --append rows to the table
Query ID = hadoop_20150126144949_2f36c732-219d-463c-847a-fe03534892d2
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1422249676009_0008, Tracking URL =
Kill Command = /usr/local/hadoop-2.6.0/bin/hadoop job -kill job_1422249676009_0008
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-01-26 14:49:49,122 Stage-1 map = 0%, reduce = 0%
2015-01-26 14:49:57,461 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.19 sec
MapReduce Total cumulative CPU time: 3 seconds 190 msec
Ended Job = job_1422249676009_0008
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://tong1:9000/tmp/hive/hadoop/c8ed6f95-d55d-4d1d-ba74-10170523f138/hive_2015-01-26_14-49-39_626_4862096867748585368-1/-ext-10000
Loading data to table tong.q1
Table tong.q1 stats: [numFiles=2, numRows=11, totalSize=44, rawDataSize=33]
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 3.19 sec HDFS Read: 313 HDFS Write: 71 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 190 msec
OK
Time taken: 19.199 seconds
hive> select * from q1;  --data after the insert
OK
1 2
3 4
5 6
7 8
1 2
3 4
0 0
5 6
7 8
1 2
1 2
Time taken: 0.033 seconds, Fetched: 11 row(s)
hive>
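INSERT INTO ... SELECT appends the query's result to the target table's existing rows, which is why q1 grows from 9 to 11 rows: the two rows of q with a = 1 are added at the end. The same filter-and-append, sketched in Python (the lists are illustrative stand-ins for the tables):

```python
# Table q and its CTAS copy q1, as they stood before the insert.
q = [(1, 2), (3, 4), (5, 6), (7, 8), (1, 2), (3, 4), (0, 0), (5, 6), (7, 8)]
q1 = list(q)

# insert into table q1 select * from q where a=1;
# appends the matching rows; it does not deduplicate or overwrite.
q1.extend(row for row in q if row[0] == 1)

print(len(q1))  # 11: the two (1, 2) rows were appended
```

Unlike OVERWRITE in LOAD DATA, INSERT INTO never removes existing rows, so repeating the statement would keep adding duplicates.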
From the "ITPUB Blog", link: http://blog.itpub.net/25854343/viewspace-1415605/. Please credit the source when reprinting.