Sqoop 1.99.6: Installation and MySQL-to-HDFS Import
Environment: Hadoop 2.7.2 (NameNode HA, ResourceManager HA)
1. Download
[root@sht-sgmhadoopnn-01 hadoop]# wget
[root@sht-sgmhadoopnn-01 hadoop]# tar -zxvf sqoop-1.99.6-bin-hadoop200.tar.gz
[root@sht-sgmhadoopnn-01 hadoop]# mv sqoop-1.99.6-bin-hadoop200 sqoop2
2. Configure environment variables
[root@sht-sgmhadoopnn-01 ~]$ vim .bash_profile
Add the following:
export SQOOP_HOME=/hadoop/sqoop2
export CATALINA_HOME=$SQOOP_HOME/server
export LOGDIR=$SQOOP_HOME/logs
export PATH=$SQOOP_HOME/bin:$PATH
Save and exit, then apply immediately:
[root@sht-sgmhadoopnn-01 ~]# source .bash_profile
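After sourcing the profile, a quick sketch to confirm the variables expanded as intended (the paths are the ones used in this walkthrough; adjust them to your install prefix):

```shell
# Verify the Sqoop environment variables from the walkthrough expand as
# intended. /hadoop/sqoop2 is the install path used above; adjust to yours.
export SQOOP_HOME=/hadoop/sqoop2
export CATALINA_HOME=$SQOOP_HOME/server
export LOGDIR=$SQOOP_HOME/logs
export PATH=$SQOOP_HOME/bin:$PATH
echo "$CATALINA_HOME"   # expected: /hadoop/sqoop2/server
echo "$LOGDIR"          # expected: /hadoop/sqoop2/logs
```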
3. Configure the server
[root@sht-sgmhadoopnn-01 hadoop]# cd server/conf
[root@sht-sgmhadoopnn-01 conf]# pwd
/hadoop/sqoop2/server/conf
[root@sht-sgmhadoopnn-01 conf]#
[root@sht-sgmhadoopnn-01 conf]$ vi catalina.properties
common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar,/hadoop/hadoop/share/hadoop/common/*.jar,/hadoop/hadoop/share/hadoop/common/lib/*.jar,/hadoop/hadoop/share/hadoop/hdfs/*.jar,/hadoop/hadoop/share/hadoop/hdfs/lib/*.jar,/hadoop/hadoop/share/hadoop/mapreduce/*.jar,/hadoop/hadoop/share/hadoop/mapreduce/lib/*.jar,/hadoop/hadoop/share/hadoop/tools/lib/*.jar,/hadoop/hadoop/share/hadoop/yarn/*.jar,/hadoop/hadoop/share/hadoop/yarn/lib/*.jar,/hadoop/hadoop/share/hadoop/httpfs/tomcat/lib/*.jar
### Append this node's Hadoop jar directories to common.loader
[root@sht-sgmhadoopnn-01 conf]$ vi sqoop.properties
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/hadoop/hadoop/etc/hadoop
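If you script the setup, the same property can be set non-interactively with sed. The sketch below runs against a scratch copy so it is safe to dry-run anywhere; to apply it for real, point `$conf` at `$SQOOP_HOME/server/conf/sqoop.properties` instead:

```shell
# Sketch: point the mapreduce configuration directory at the cluster's Hadoop
# conf dir without opening an editor. Demonstrated on a scratch copy of the
# file; set conf to the real sqoop.properties to apply it.
conf=/tmp/sqoop.properties.demo
echo 'org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/etc/hadoop/conf/' > "$conf"
sed -i 's|^\(org\.apache\.sqoop\.submission\.engine\.mapreduce\.configuration\.directory=\).*|\1/hadoop/hadoop/etc/hadoop|' "$conf"
cat "$conf"
# prints: org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/hadoop/hadoop/etc/hadoop
```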
4. Add the MySQL JDBC driver
[root@sht-sgmhadoopnn-01 conf]$ cd /hadoop/sqoop2/server/lib
[root@sht-sgmhadoopnn-01 lib]# rz
rz waiting to receive.
Starting zmodem transfer. Press Ctrl+C to cancel.
Transferring mysql-connector-java-5.1.36-bin.jar...
100% 952 KB 952 KB/sec 00:00:01 0 Errors
[root@sht-sgmhadoopnn-01 lib]# ll
total 10224
-rw-r--r-- 1 root root 15241 Apr 29 2013 annotations-api.jar
-rw-r--r-- 1 root root 54567 Apr 29 2013 catalina-ant.jar
-rw-r--r-- 1 root root 132134 Apr 29 2013 catalina-ha.jar
-rw-r--r-- 1 root root 1244212 Apr 29 2013 catalina.jar
-rw-r--r-- 1 root root 237615 Apr 29 2013 catalina-tribes.jar
-rw-r--r-- 1 root root 1801636 Apr 29 2013 ecj-4.2.2.jar
-rw-r--r-- 1 root root 33315 Apr 29 2013 el-api.jar
-rw-r--r-- 1 root root 112556 Apr 29 2013 jasper-el.jar
-rw-r--r-- 1 root root 527739 Apr 29 2013 jasper.jar
-rw-r--r-- 1 root root 76691 Apr 29 2013 jsp-api.jar
-rw-r--r-- 1 root root 989497 May 4 19:11 mysql-connector-java-5.1.36-bin.jar
-rw-r--r-- 1 root root 88575 Apr 29 2013 servlet-api.jar
-rw-r--r-- 1 root root 8821 Apr 29 2015 sqoop-tomcat-1.99.6.jar
-rw-r--r-- 1 root root 776946 Apr 29 2013 tomcat-coyote.jar
-rw-r--r-- 1 root root 253635 Apr 29 2013 tomcat-dbcp.jar
-rw-r--r-- 1 root root 70014 Apr 29 2013 tomcat-i18n-es.jar
-rw-r--r-- 1 root root 51903 Apr 29 2013 tomcat-i18n-fr.jar
-rw-r--r-- 1 root root 54511 Apr 29 2013 tomcat-i18n-ja.jar
5. Start the server
[root@sht-sgmhadoopnn-01 bin]# ./sqoop2-server start
Sqoop home directory: /hadoop/sqoop2
Setting SQOOP_HTTP_PORT: 12000
Setting SQOOP_ADMIN_PORT: 12001
Using CATALINA_OPTS:
Adding to CATALINA_OPTS: -Dsqoop.http.port=12000 -Dsqoop.admin.port=12001
Using CATALINA_BASE: /hadoop/sqoop2/server
Using CATALINA_HOME: /hadoop/sqoop2/server
Using CATALINA_TMPDIR: /hadoop/sqoop2/server/temp
Using JRE_HOME: /usr/java/jdk1.7.0_67-cloudera
Using CLASSPATH: /hadoop/sqoop2/server/bin/bootstrap.jar
6. Start the client
[root@sht-sgmhadoopnn-01 bin]# sqoop2-shell
Sqoop home directory: /hadoop/sqoop2
Sqoop Shell: Type 'help' or '\h' for help.
# Connect the shell to the server
sqoop:000> set server --host sht-sgmhadoopnn-01 --port 12000 --webapp sqoop
Server is set successfully
sqoop:000>
# Verify the connection
sqoop:000> show version --all
client version:
Sqoop 1.99.6 source revision 07244c3915975f26f03d9e1edf09ab7d06619bb8
Compiled by root on Wed Apr 29 10:40:43 CST 2015
server version:
Sqoop 1.99.6 source revision 07244c3915975f26f03d9e1edf09ab7d06619bb8
Compiled by root on Wed Apr 29 10:40:43 CST 2015
API versions:
[v1]
sqoop:000>
# List the connectors registered with the Sqoop server
sqoop:000> show connector --all
+----+------------------------+---------+------------------------------------------------------+----------------------+
| Id | Name | Version | Class | Supported Directions |
+----+------------------------+---------+------------------------------------------------------+----------------------+
| 1 | kite-connector | 1.99.6 | org.apache.sqoop.connector.kite.KiteConnector | FROM/TO |
| 2 | kafka-connector | 1.99.6 | org.apache.sqoop.connector.kafka.KafkaConnector | TO |
| 3 | hdfs-connector | 1.99.6 | org.apache.sqoop.connector.hdfs.HdfsConnector | FROM/TO |
| 4 | generic-jdbc-connector | 1.99.6 | org.apache.sqoop.connector.jdbc.GenericJdbcConnector | FROM/TO |
+----+------------------------+---------+------------------------------------------------------+----------------------+
### Create a link for the MySQL connector
sqoop:000> create link -c 4   # Note: 4 is the connector id; this creates a generic-jdbc-connector link
Creating link for connector with id 4
Please fill following values to create new link object
Name: mysql-link   # Note: the link name must be unique
Link configuration
JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://sht-sgmhadoopnn-01:3306/test1   # Note: jdbc:mysql://host(or IP):port/database
Username: root
Password: **********
JDBC Connection Properties:
There are currently 0 values in the map:
entry#   # press Enter
There were warnings while create or update, but saved successfully.
Warning message: Can't connect to the database with given credentials: No suitable driver found for jdbc:jdbc://sht-sgmhadoopnn-01:/test1
New link was successfully created with validation status WARNING and persistent id 1
sqoop:000>
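The warning above traces back to a malformed JDBC URI (note the `jdbc:jdbc://...` scheme and missing port in the message); the link still saved, but a hypothetical pre-flight shape check like this can catch such typos before they reach the server:

```shell
# Hypothetical sanity check for a MySQL JDBC connection string. A well-formed
# URI is jdbc:mysql://host:port/database; the warning above came from a
# string shaped like jdbc:jdbc://host:/db (bad scheme, no port).
url="jdbc:mysql://sht-sgmhadoopnn-01:3306/test1"
if printf '%s' "$url" | grep -Eq '^jdbc:mysql://[^:/]+:[0-9]+/[A-Za-z0-9_]+$'; then
  echo "URL looks well-formed"
else
  echo "URL malformed: $url"
fi
```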
### Create a link for the HDFS connector
sqoop:000> create link -c 3
Creating link for connector with id 3
Please fill following values to create new link object
Name: hdfs-link
Link configuration
HDFS URI: hdfs://mycluster/   # Note: mycluster is the NameNode HA nameservice; you could instead point at the active NameNode, e.g. hdfs://sht-sgmhadoopnn-01:8020/
Hadoop conf directory: /hadoop/hadoop/etc/hadoop
New link was successfully created with validation status OK and persistent id 2
sqoop:000>
### Show the created links
sqoop:000> show link
+----+------------+--------------+------------------------+---------+
| Id | Name | Connector Id | Connector Name | Enabled |
+----+------------+--------------+------------------------+---------+
| 1 | mysql-link | 4 | generic-jdbc-connector | true |
| 2 | hdfs-link | 3 | hdfs-connector | true |
+----+------------+--------------+------------------------+---------+
sqoop:000>
sqoop:000> show link -all
2 link(s) to show:
link with id 1 and name mysql-link (Enabled: true, Created by root at 7/29/16 10:56 PM, Updated by root at 7/29/16 10:56 PM)
Using Connector generic-jdbc-connector with id 4
Link configuration
JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://sht-sgmhadoopnn-01:3306/test1
Username: root
Password:
JDBC Connection Properties:
link with id 2 and name hdfs-link (Enabled: true, Created by root at 7/29/16 11:00 PM, Updated by root at 7/29/16 11:00 PM)
Using Connector hdfs-connector with id 3
Link configuration
HDFS URI: hdfs://mycluster/
Hadoop conf directory: /hadoop/hadoop/etc/hadoop
sqoop:000>
##### Create the job object. Careful here: use the LINK ids, not the Connector ids.
##### The two link ids bind the job's From and To sides; in plain terms, where the data is read from (From) and where it is imported to (To).
sqoop:000> create job -f 1 -t 2
Creating job for links with from id 1 and to id 2
Please fill following values to create new job object
Name: testsqoop   # the job name must be unique
From database configuration
Schema name: test1   ## the source database name
Table name: person   ## the table in that database
Table SQL statement:   # optional
Table column names:   # optional
Partition column name:   # optional
Null value allowed for the partition column:   # optional
Boundary query:   # optional
Incremental read
Check column:   # optional
Last value:   # optional
To HDFS configuration
Override null value: #可選
Null value: #可選
Output format:
0 : TEXT_FILE
1 : SEQUENCE_FILE
Choose: 0   # required; text format
Compression format:
0 : NONE
1 : DEFAULT
2 : DEFLATE
3 : GZIP
4 : BZIP2
5 : LZO
6 : LZ4
7 : SNAPPY
8 : CUSTOM
Choose: 0   # required; no compression
Custom compression format:
Output directory: /sqoop/test   ### required; the HDFS output path
Append mode:
Throttling resources
Extractors: 1   # optional; corresponds to the number of map tasks in the MapReduce job
Loaders: 1   # optional; corresponds to the number of reduce tasks in the MapReduce job
New job was successfully created with validation status OK and persistent id 1
sqoop:000>
### Show jobs
sqoop:000> show job
+----+-----------+----------------+--------------+---------+
| Id | Name | From Connector | To Connector | Enabled |
+----+-----------+----------------+--------------+---------+
| 1 | testsqoop | 4 | 3 | true |
+----+-----------+----------------+--------------+---------+
sqoop:000>
### Enable verbose error output
sqoop:000> set option --name verbose --value true
Verbose option was changed to true
sqoop:000>
### Start job 1
sqoop:000> start job -j 1 -s
Submission details
Job ID: 1
Server URL:
Created by: root
Creation date: 2016-07-29 23:40:16 CST
Lastly updated by: root
External ID: job_1469795998430_0001
Source Connector schema: Schema{name=test1.person,columns=[
FixedPoint{name=ID,nullable=true,type=FIXED_POINT,byteSize=4,signed=true},
Text{name=NAME,nullable=true,type=TEXT,charSize=null},
Text{name=TITLE,nullable=true,type=TEXT,charSize=null}]}
2016-07-29 23:40:16 CST: BOOTING - Progress is not available
2016-07-29 23:40:39 CST: RUNNING - 0.00 %
2016-07-29 23:40:50 CST: RUNNING - 0.00 %
2016-07-29 23:41:00 CST: RUNNING - 0.00 %
2016-07-29 23:41:10 CST: RUNNING - 0.00 %
2016-07-29 23:41:20 CST: RUNNING - 100.00 %
2016-07-29 23:41:37 CST: FAILED
Exception: Job Failed with status:3
Stack trace:
sqoop:000>
##### The job failed; open the failed task's log to find the cause
Log Length: 3070
2016-07-29 23:40:51,075 [main] INFO org.apache.sqoop.job.mr.SqoopMapper - Starting progress service
2016-07-29 23:40:51,077 [main] INFO org.apache.sqoop.job.mr.SqoopMapper - Running extractor class org.apache.sqoop.connector.jdbc.GenericJdbcExtractor
2016-07-29 23:40:51,613 [main] ERROR org.apache.sqoop.connector.jdbc.GenericJdbcExecutor - Caught SQLException:
java.sql.SQLException: Access denied for user (using password: YES)
### Fix: the host configured in mysql-link is not allowed to connect; change the host of one of root's rows in mysql.user to '%'
sht-sgmhadoopnn-01:mysqladmin:/usr/local/mysql:>mysql $c
mysql> select host,user,password from mysql.user;
+-------------------------------+-----------+-------------------------------------------+
| host | user | password |
+-------------------------------+-----------+-------------------------------------------+
| localhost | root | *6340BE3C15D246B0D74BAF3F135915ED19E0063F |
| sht-sgmhadoopnn-01.telenav.cn | root | *6340BE3C15D246B0D74BAF3F135915ED19E0063F |
| 127.0.0.1 | root | *6340BE3C15D246B0D74BAF3F135915ED19E0063F |
| ::1 | root | *6340BE3C15D246B0D74BAF3F135915ED19E0063F |
| % | repl_user | *A8E5E2B374324130F4B4CC73C63F5032D1E4195E |
| % | repl | *A424E797037BF97C19A2E88CF7891C5C2038C039 |
| % | hive | *4DF1D66463C18D44E3B001A8FB1BBFBEA13E27FC |
+-------------------------------+-----------+-------------------------------------------+
7 rows in set (0.67 sec)
mysql> update mysql.user set host='%' where host='127.0.0.1';
Query OK, 1 row affected (0.61 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> flush privileges;
Query OK, 0 rows affected (1.70 sec)
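As an alternative sketch to editing mysql.user in place, remote access for root can be granted explicitly (on this MySQL 5.x; 'YourPassword' is a placeholder for the real root password):

```sql
-- Alternative fix: grant remote access on the source database explicitly
-- instead of rewriting the host column of an existing privilege row.
GRANT ALL PRIVILEGES ON test1.* TO 'root'@'%' IDENTIFIED BY 'YourPassword';
FLUSH PRIVILEGES;
```

This is usually preferable in production because it leaves the localhost rows untouched and scopes the remote grant to the one database Sqoop reads.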
mysql>
### Run job 1 again
sqoop:000> start job -j 1
Submission details
Job ID: 1
Server URL:
Created by: root
Creation date: 2016-07-30 00:00:41 CST
Lastly updated by: root
External ID: job_1469795998430_0002
Source Connector schema: Schema{name=test1.person,columns=[
FixedPoint{name=ID,nullable=true,type=FIXED_POINT,byteSize=4,signed=true},
Text{name=NAME,nullable=true,type=TEXT,charSize=null},
Text{name=TITLE,nullable=true,type=TEXT,charSize=null}]}
2016-07-30 00:00:41 CST: BOOTING - Progress is not available
### Verify the result on HDFS
[root@sht-sgmhadoopnn-01 lib]# hadoop fs -text /sqoop/test/1b89afc4-c882-4fd2-abcb-9758a5d709ee.txt
1,'rc ','dba'
[root@sht-sgmhadoopnn-01 lib]#
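The export lands as plain comma-separated text with single-quoted string columns. A small sanity check on a sample row (mirroring the output above) that the row count and ID column line up with the source table; against a real export you would pipe `hadoop fs -text /sqoop/test/*` into the same commands:

```shell
# Sanity-check the exported row format using an inline sample that mirrors
# the hdfs -text output above.
sample=/tmp/sqoop_export_sample.txt
printf "1,'rc ','dba'\n" > "$sample"
wc -l < "$sample"       # row count; expected: 1
cut -d, -f1 "$sample"   # ID column; expected: 1
```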
Example:
start job --jid 1
start job --jid 1 -s
Argument Description
-j, --jid <x> Start job with id <x>
-s, --synchronous Synchronous job execution
From the "ITPUB blog"; link: http://blog.itpub.net/30089851/viewspace-2122748/. Please credit the source when reposting; unauthorized reproduction may be pursued legally.