資料整合工具—FlinkX

真好吃啊發表於2021-12-07

@

FlinkX的安裝與簡單使用

@

FlinkX的安裝

安裝unzip:yum install unzip

1、上傳並解壓

unzip flinkx-1.10.zip -d /usr/local/soft/

2、配置環境變數

3、給bin/flinkx這個檔案加上執行許可權

chmod a+x flinkx

4、修改配置檔案,設定執行埠

vim flinkconf/flink-conf.yaml
## web服務埠,不指定的話會隨機生成一個
rest.bind-port: 8888

配置環境變數、

vim /etc/profile

FLINKX_HOME=

flinkX開源網址:https://github.com/DTStack/flinkx

FlinkX的簡單使用

讀取mysql中student表中資料

{
  "job": {
    "content": [
      {
        "reader": {
          "parameter": {
            "username": "root",
            "password": "123456",
            "connection": [{
              "jdbcUrl": ["jdbc:mysql://master:3306/student?userSSL=false&useUnicode=true&characterEncoding=utf8"],
              "table": ["student"]
            }],
            "column": ["*"],
            "customSql": "",
            "where": "id > 1500100900",
            "splitPk": "id",
            "queryTimeOut": 1000
          },
          "name": "mysqlreader"
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "print": true
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 3,
        "bytes": 0
      },
      "errorLimit": {
        "record": 100
      },
      "restore": {
        "maxRowNumForCheckpoint": 0,
        "isRestore": false,
        "restoreColumnName": "",
        "restoreColumnIndex": 0
      },
      "log" : {
        "isLogger": false,
        "level" : "debug",
        "path" : "",
        "pattern":""
      }
    }
  }
}

FlinkX本地執行

flinkx -mode local -job flinkx3.json -pluginRoot ../syncplugins -flinkconf ../flinkconf
執行時檔案所處路徑為:/usr/local/soft/flinkx-1.10/package

MySQLToHDFS

  • 配置檔案
{
    "job": {
        "content": [
            {
                "reader": {
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "connection": [
                            {
                                "jdbcUrl": [
                                    "jdbc:mysql://master:3306/student?characterEncoding=utf8"
                                ],
                                "table": [
                                    "student"
                                ]
                            }
                        ],
                        "column": [
                            "*"
                        ],
                        "customSql": "",
                        "where": "clazz = '理科二班'",
                        "splitPk": "",
                        "queryTimeOut": 1000,
                        "requestAccumulatorInterval": 2
                    },
                    "name": "mysqlreader"
                },
                "writer": {
                    "name": "hdfswriter",
                    "parameter": {
                        "path": "hdfs://master:9000/data/flinkx/student",
                        "defaultFS": "hdfs://master:9000",
                        "column": [
                            {
                                "name": "col1",
                                "index": 0,
                                "type": "string"
                            },
                            {
                                "name": "col2",
                                "index": 1,
                                "type": "string"
                            },
                            {
                                "name": "col3",
                                "index": 2,
                                "type": "string"
                            },
                            {
                                "name": "col4",
                                "index": 3,
                                "type": "string"
                            },
                            {
                                "name": "col5",
                                "index": 4,
                                "type": "string"
                            },
                            {
                                "name": "col6",
                                "index": 5,
                                "type": "string"
                            }
                        ],
                        "fieldDelimiter": ",",
                        "fileType": "text",
                        "writeMode": "overwrite"
                    }
                }
            }
        ],
        "setting": {
            "restore": {
                "isRestore": false,
                "isStream": false
            },
            "errorLimit": {},
            "speed": {
                "channel": 1
            }
        }
    }
}
  • 啟動任務
flinkx -mode local -job /usr/local/soft/flinkx-1.10/jsonConf/mysqlToHDFS.json -pluginRoot /usr/local/soft/flinkx-1.10/syncplugins/ -flinkconf /usr/local/soft/flinkx-1.10/flinkconf/
  • 監聽日誌

flinkx 任務啟動後,會在執行命令的目錄下生成一個nohup.out檔案

tail -f nohup.out
  • 通過web介面檢視任務執行情況
http://master:8888

MySQLToHive

  • 配置檔案
{
    "job": {
        "content": [
            {
                "reader": {
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "connection": [
                            {
                                "jdbcUrl": [
                                    "jdbc:mysql://master:3306/student?characterEncoding=utf8"
                                ],
                                "table": [
                                    "student"
                                ]
                            }
                        ],
                        "column": [
                            "*"
                        ],
                        "customSql": "",
                        "where": "clazz = '文科二班'",
                        "splitPk": "id",
                        "queryTimeOut": 1000,
                        "requestAccumulatorInterval": 2
                    },
                    "name": "mysqlreader"
                },
                "writer": {
                    "name": "hivewriter",
                    "parameter": {
                        "jdbcUrl": "jdbc:hive2://master:10000/testflinkx",
                        "username": "",
                        "password": "",
                        "fileType": "text",
                        "fieldDelimiter": ",",
                        "writeMode": "overwrite",
                        "compress": "",
                        "charsetName": "UTF-8",
                        "maxFileSize": 1073741824,
                        "tablesColumn": "{\"student\":[{\"key\":\"id\",\"type\":\"string\"},{\"key\":\"name\",\"type\":\"string\"},{\"key\":\"age\",\"type\":\"string\"}]}",
                        "defaultFS": "hdfs://master:9000"
                    }
                }
            }
        ],
        "setting": {
            "restore": {
                "isRestore": false,
                "isStream": false
            },
            "errorLimit": {},
            "speed": {
                "channel": 3
            }
        }
    }
}
  • 在hive中建立testflinkx資料庫,並建立student分割槽表
create database testflinkx;
use testflinkx;
CREATE TABLE `student`(
  `id` string, 
  `name` string, 
  `age` string)
PARTITIONED BY ( 
  `pt` string)
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY ','
  • 啟動hiveserver2
# 第一種方式:
hiveserver2
# 第二種方式:
hive --service hiveserver2
  • 啟動任務
flinkx -mode local -job /usr/local/soft/flinkx-1.10/jsonConf/mysqlToHive.json -pluginRoot /usr/local/soft/flinkx-1.10/syncplugins/ -flinkconf /usr/local/soft/flinkx-1.10/flinkconf/
  • 檢視日誌及執行情況同上

MySQLToHBase

  • 配置檔案
{
    "job": {
        "content": [
            {
                "reader": {
                    "parameter": {
                        "username": "root",
                        "password": "123456",
                        "connection": [
                            {
                                "jdbcUrl": [
                                    "jdbc:mysql://master:3306/student?characterEncoding=utf8"
                                ],
                                "table": [
                                    "score"
                                ]
                            }
                        ],
                        "column": [
                            "*"
                        ],
                        "customSql": "",
                        "splitPk": "student_id",
                        "queryTimeOut": 1000,
                        "requestAccumulatorInterval": 2
                    },
                    "name": "mysqlreader"
                },
                "writer": {
                    "name": "hbasewriter",
                    "parameter": {
                        "hbaseConfig": {
                            "hbase.zookeeper.property.clientPort": "2181",
                            "hbase.rootdir": "hdfs://master:9000/hbase",
                            "hbase.cluster.distributed": "true",
                            "hbase.zookeeper.quorum": "master,node1,node2",
                            "zookeeper.znode.parent": "/hbase"
                        },
                        "table": "testFlinkx",
                        "rowkeyColumn": "$(cf1:student_id)_$(cf1:course_id)",
                        "column": [
                            {
                                "name": "cf1:student_id",
                                "type": "string"
                            },
                            {
                                "name": "cf1:course_id",
                                "type": "string"
                            },
                            {
                                "name": "cf1:score",
                                "type": "string"
                            }
                        ]
                    }
                }
            }
        ],
        "setting": {
            "restore": {
                "isRestore": false,
                "isStream": false
            },
            "errorLimit": {},
            "speed": {
                "channel": 3
            }
        }
    }
}
  • 啟動hbase 並建立testflinkx表
create 'testFlinkx','cf1'
  • 啟動任務
flinkx -mode local -job /usr/local/soft/flinkx-1.10/jsonConf/mysqlToHBase.json -pluginRoot /usr/local/soft/flinkx-1.10/syncplugins/ -flinkconf /usr/local/soft/flinkx-1.10/flinkconf/
  • 檢視日誌及執行情況同上

MySQLToMySQL

  • 配置檔案
{
    "job": {
      "content": [
        {
          "reader": {
            "name": "mysqlreader",
            "parameter": {
              "column": [
                {
                  "name": "id",
                  "type": "int"
                },
                {
                  "name": "name",
                  "type": "string"
                },
                {
                  "name": "age",
                  "type": "int"
                },
                {
                  "name": "gender",
                  "type": "string"
                },
                {
                  "name": "clazz",
                  "type": "string"
                }
              ],
              "username": "root",
              "password": "123456",
              "connection": [
                {
                  "jdbcUrl": [
                    "jdbc:mysql://master:3306/student?useSSL=false"
                  ],
                  "table": [
                    "student"
                  ]
                }
              ]
            }
          },
          "writer": {
            "name": "mysqlwriter",
            "parameter": {
              "username": "root",
              "password": "123456",
              "connection": [
                {
                  "jdbcUrl": "jdbc:mysql://master:3306/student?useSSL=false",
                  "table": [
                    "student2"
                  ]
                }
              ],
              "writeMode": "insert",
              "column": [
                {
                    "name": "id",
                    "type": "int"
                  },
                  {
                    "name": "name",
                    "type": "string"
                  },
                  {
                    "name": "age",
                    "type": "int"
                  },
                  {
                    "name": "gender",
                    "type": "string"
                  },
                  {
                    "name": "clazz",
                    "type": "string"
                  }
              ]
            }
          }
        }
      ],
      "setting": {
        "speed": {
          "channel": 1,
          "bytes": 0
        }
      }
    }
  }

相關文章