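I. Quick Start
1. Preface
Quick start | Elasticsearch Guide 8.12
These notes follow the official documentation to learn Elasticsearch.
2. Sign up for Elastic Cloud to get a free trial
Instead of downloading and installing Elasticsearch myself, I used the 14-day [free trial](Sign up for the Elasticsearch Service with a free 14-day trial | Elastic) of Elastic Cloud.
- Sign up for a free trial account.
- Log in to Elastic Cloud.
- Click Create deployment.
- Give your deployment a name.
Open Kibana's main menu (the "☰" near the Elastic logo), then go to Dev Tools > Console.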
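3. Quick start
Index a single document
/books is the index name; the index is created automatically if it does not exist.
/_doc indicates that a single document is being indexed.
?pretty asks Elasticsearch to return the JSON response in a human-readable (pretty-printed) form.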
POST /books/_doc?pretty
{"name": "Snow Crash", "author": "Neal Stephenson", "release_date": "1992-06-01", "page_count": 470}
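Bulk indexing
/_bulk indicates that multiple documents are indexed in a single request. Each document takes two lines: an action line and a source line.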
POST /_bulk
{ "index" : { "_index" : "books" } }
{"name": "Revelation Space", "author": "Alastair Reynolds", "release_date": "2000-03-15", "page_count": 585}
{ "index" : { "_index" : "books" } }
{"name": "1984", "author": "George Orwell", "release_date": "1985-06-01", "page_count": 328}
{ "index" : { "_index" : "books" } }
{"name": "Fahrenheit 451", "author": "Ray Bradbury", "release_date": "1953-10-15", "page_count": 227}
{ "index" : { "_index" : "books" } }
{"name": "Brave New World", "author": "Aldous Huxley", "release_date": "1932-06-01", "page_count": 268}
{ "index" : { "_index" : "books" } }
{"name": "The Handmaids Tale", "author": "Margaret Atwood", "release_date": "1985-06-01", "page_count": 311}
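The bulk body above is newline-delimited JSON: one action line followed by one source line per document. As a minimal sketch of how such a body could be generated from a list of records, the helper below uses only the standard library. The name build_bulk_body is hypothetical and is not part of any Elasticsearch client; sending the body to the cluster is left out.

```python
import json

def build_bulk_body(index_name, docs):
    """Build an NDJSON body for POST /_bulk (hypothetical helper, not a client API)."""
    lines = []
    for doc in docs:
        # Action line: tells Elasticsearch to index the next line into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline character.
    return "\n".join(lines) + "\n"

books = [
    {"name": "Revelation Space", "author": "Alastair Reynolds",
     "release_date": "2000-03-15", "page_count": 585},
    {"name": "1984", "author": "George Orwell",
     "release_date": "1985-06-01", "page_count": 328},
]

body = build_bulk_body("books", books)
print(body, end="")
```

The printed text can be pasted under POST /_bulk in the Console.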
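Search documents
/books is the index to search.
/_search indicates that the operation is a search. With no request body it matches every document in the index (the response shows the first 10 hits by default).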
GET /books/_search?pretty
match query
match is the query type, and "name": "brave" finds documents whose name field contains the term brave. This is the standard query for full-text search.
GET /books/_search?pretty
{
"query": {
"match": {
"name": "brave"
}
}
}
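To make the matching behaviour concrete, here is a rough sketch of why the query above finds "Brave New World": both the field value and the query text are analyzed into lowercase terms, and a document matches when it shares at least one term with the query (the match query's default OR behaviour). The analyze function is a deliberately simplified stand-in for Elasticsearch's standard analyzer, which is an assumption of this sketch; real analysis and relevance scoring are more involved.

```python
import re

def analyze(text):
    """Simplified stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return [term for term in re.split(r"[^0-9a-z]+", text.lower()) if term]

def matches(field_value, query_text):
    """True if the field shares at least one analyzed term with the query text."""
    return bool(set(analyze(field_value)) & set(analyze(query_text)))

names = [
    "Snow Crash", "Revelation Space", "1984",
    "Fahrenheit 451", "Brave New World", "The Handmaids Tale",
]

hits = [name for name in names if matches(name, "brave")]
print(hits)
```

Matching works on whole terms rather than substrings, so in this sketch a query for "brav" finds nothing.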
- Search your data. Jump here to learn about exact value search, full-text search, vector search, and more, using the search API.
- Reading documentation alone is dry; it is better to look up these APIs when you actually need them.
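II. Using Canal to sync MySQL data to Elasticsearch
1. Install canal-deployer
Edit /etc/mysql/mysql.conf.d/mysqld.cnf (MySQL has several configuration files; the comments inside them explain their roles, and the path may differ on your system).
Add the following three lines under [mysqld]:
[mysqld]
log-bin=mysql-bin # enable binlog
binlog-format=ROW # use ROW format
server_id=1 # required for MySQL replication; must not be the same as canal's slaveId
Restart MySQL:
service mysql restart
Grant the MySQL account that canal connects with the privileges of a MySQL replica (slave). If the account already exists, you can run the GRANT directly.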
CREATE USER canal IDENTIFIED BY 'canal';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
-- GRANT ALL PRIVILEGES ON *.* TO 'canal'@'%' ;
FLUSH PRIVILEGES;
Download canal: visit the release page and choose the package you need. Using version 1.1.7 as an example:
wget https://github.com/alibaba/canal/releases/download/canal-1.1.7/canal.deployer-1.1.7.tar.gz
Extract the archive:
mkdir /opt/canal
tar zxvf canal.deployer-1.1.7.tar.gz -C /opt/canal
Change into /opt/canal; you should see the following structure:
drwxr-xr-x 7 root root 4096 Mar 4 20:13 ./
drwxr-xr-x 4 root root 4096 Mar 4 20:12 ../
drwxr-xr-x 2 root root 4096 Mar 4 20:21 bin/
drwxr-xr-x 5 root root 4096 Mar 4 20:13 conf/
drwxr-xr-x 2 root root 4096 Mar 4 20:13 lib/
drwxrwxrwx 4 root root 4096 Mar 4 20:21 logs/
drwxrwxrwx 2 root root 4096 Oct 13 14:09 plugin/
Edit the instance configuration:
vim conf/example/instance.properties
## mysql serverId
canal.instance.mysql.slaveId = 1234
# position info; change this to your own database settings
canal.instance.master.address = 127.0.0.1:3306
canal.instance.master.journal.name =
canal.instance.master.position =
canal.instance.master.timestamp =
#canal.instance.standby.address =
#canal.instance.standby.journal.name =
#canal.instance.standby.position =
#canal.instance.standby.timestamp =
# username/password; change these to your own database credentials
canal.instance.dbUsername = canal
canal.instance.dbPassword = canal
canal.instance.defaultDatabaseName =
canal.instance.connectionCharset = UTF-8
#table regex
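canal.instance.filter.regex = .*\\..*
I only changed slaveId and the username/password.
- canal.instance.connectionCharset is the database character encoding, expressed as the corresponding Java charset name, for example UTF-8, GBK, or ISO-8859-1.
- If the machine has only one CPU, set canal.instance.parser.parallel to false.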
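Since canal's slaveId must not collide with MySQL's server_id, a quick check can catch the mistake before startup. The following is a minimal sketch that parses a properties-style text and compares the two values. The function parse_properties and the inlined sample text are illustrative assumptions; in practice you would read conf/example/instance.properties from disk.

```python
def parse_properties(text):
    """Parse simple key = value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

# Sample content mirroring the instance.properties excerpt above.
INSTANCE_PROPERTIES = """\
## mysql serverId
canal.instance.mysql.slaveId = 1234
canal.instance.master.address = 127.0.0.1:3306
canal.instance.dbUsername = canal
canal.instance.dbPassword = canal
"""

MYSQL_SERVER_ID = 1  # server_id configured in mysqld.cnf

props = parse_properties(INSTANCE_PROPERTIES)
slave_id = int(props["canal.instance.mysql.slaveId"])
conflict = slave_id == MYSQL_SERVER_ID
print("slaveId conflicts with server_id" if conflict else "slaveId OK")
```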
Start canal:
sh bin/startup.sh
View the server log:
cat logs/canal/canal.log
2024-03-04 20:21:04.233 [main] INFO com.alibaba.otter.canal.deployer.CanalLauncher - ## set default uncaught exception handler
2024-03-04 20:21:04.238 [main] INFO com.alibaba.otter.canal.deployer.CanalLauncher - ## load canal configurations
2024-03-04 20:21:04.245 [main] INFO com.alibaba.otter.canal.deployer.CanalStarter - ## start the canal server.
2024-03-04 20:21:04.267 [main] INFO com.alibaba.otter.canal.deployer.CanalController - ## start the canal server[172.18.0.1(172.18.0.1):11111]
2024-03-04 20:21:05.025 [main] INFO com.alibaba.otter.canal.deployer.CanalStarter - ## the canal server is running now ......
View the instance log:
cat logs/example/example.log
2024-03-04 20:21:04.588 [main] INFO c.a.otter.canal.instance.spring.CanalInstanceWithSpring - start CannalInstance for 1-example
2024-03-04 20:21:05.002 [main] WARN c.a.o.canal.parse.inbound.mysql.dbsync.LogEventConvert - --> init table filter : ^.*\..*$
2024-03-04 20:21:05.003 [main] WARN c.a.o.canal.parse.inbound.mysql.dbsync.LogEventConvert - --> init table black filter : ^mysql\.slave_.*$
2024-03-04 20:21:05.005 [main] INFO c.a.otter.canal.instance.core.AbstractCanalInstance - start successful....
2024-03-04 20:21:05.060 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> begin to find start position, it will be long time for reset or first position
2024-03-04 20:21:05.060 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - prepare to find start position just show master status
2024-03-04 20:21:05.641 [destination = example , address = /127.0.0.1:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=mysql-bin.000002,position=4,serverId=1,gtid=<null>,timestamp=1709554248000] cost : 577ms , the next step is binlog dump
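The last log line reports the binlog position canal starts reading from. As a small sketch, the EntryPosition fields can be pulled out of that line with a regular expression, which is handy when checking which binlog file and offset an instance picked up. The helper name parse_entry_position is hypothetical, and the pattern assumes the log format shown above.

```python
import re

LOG_LINE = (
    "---> find start position successfully, "
    "EntryPosition[included=false,journalName=mysql-bin.000002,position=4,"
    "serverId=1,gtid=<null>,timestamp=1709554248000] cost : 577ms , "
    "the next step is binlog dump"
)

def parse_entry_position(line):
    """Return the key=value pairs inside EntryPosition[...] as a dict, or None if absent."""
    found = re.search(r"EntryPosition\[(.*?)\]", line)
    if found is None:
        return None
    # Each pair looks like key=value; split only on the first '='.
    return dict(pair.split("=", 1) for pair in found.group(1).split(","))

pos = parse_entry_position(LOG_LINE)
print(pos["journalName"], pos["position"])
```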
Stop canal:
sh bin/stop.sh
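2. Install and configure ClientAdapter (Alibaba Cloud ES)
Reference article
Takeaway: after a middleware version upgrade, many older configuration tutorials stop working, so the most important points are:
- Follow the official documentation.
- Many configuration options are self-explanatory and well commented; read them carefully.
- Consult older documents.
- Use them as a reference; do not copy them verbatim!
Prerequisites
MySQL 8.0: in WSL2 Ubuntu
ES 8.5: Alibaba Cloud ES
Canal 1.1.7: in WSL2 Ubuntu
Working environment: WSL2 Ubuntu
A MySQL instance and an Alibaba Cloud ES instance have been created.
- A MySQL instance has been created. This article uses MySQL 8.0.
- An Alibaba Cloud ES instance has been created. For details, see "Create an Alibaba Cloud Elasticsearch instance". This article uses the general-purpose edition of Alibaba Cloud ES 8.5.
Note:
To write data to the ES instance through canal, add the IP address of the machine running canal (an Alibaba Cloud ECS instance in the original guide) to the ES instance's whitelist. For details, see "Configure a public or private IP address whitelist for an ES instance".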
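Limitations
- This approach only supports syncing incremental MySQL data to Alibaba Cloud ES.
- The installed JDK must be version 1.8.0 or later (this article uses 1.8).
- Canal 1.1.4 does not support ES 7.x. Writing to ES 7.x requires Canal 1.1.5; for ES 8.x, choose 1.1.7. You can also sync MySQL data by other means, such as Logstash or DTS.
- Custom index mappings are supported during sync, but the fields defined in the mapping (name and type) must match those in MySQL.
- You are responsible for keeping Canal available so that the business is not disrupted. For example, decide how sync resumes after an ECS restart or an abnormal Canal exit.
- Canal Adapter does not support connecting to an Alibaba Cloud ES instance over HTTPS.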
Procedure
Step 1: Prepare the MySQL data source
The table is created with the following statements.
-- create table
CREATE DATABASE canal;
USE canal;
CREATE TABLE product (
id bigint(20) NOT NULL AUTO_INCREMENT,
title varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
sub_title varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
price decimal(10, 2) NULL DEFAULT NULL,
pic varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
PRIMARY KEY (id) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 2 CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;
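Step 2: Create the index
For the location of Dev Tools, see figures 1-4 below.
- Add your IP address to the public network whitelist here (use an IP lookup tool to find it).
- The entry point can also be found by clicking here.
- Do not forget to add your IP address to the public network whitelist on the security configuration page.
Run the following DSL statement in the console.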
PUT canal_product
{
"mappings": {
"properties": {
"title": {
"type": "text"
},
"sub_title": {
"type": "text"
},
"pic": {
"type": "text"
},
"price": {
"type": "double"
}
}
}
}
On success, the following result is returned.
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "canal_product"
}
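Step 3: Install the JDK
- List the available JDK packages. (These commands are for a yum-based system; on Ubuntu, use the equivalent apt commands.)
sudo yum search java | grep -i --color JDK
- Choose a suitable version and install the JDK.
This article uses java-1.8.0-openjdk-devel.x86_64.
sudo yum install java-1.8.0-openjdk-devel.x86_64
- Configure the environment variables.
  - Open the .bash_profile file in your home directory.
  vim ~/.bash_profile
  - Add the following environment variables to the file.
  export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.362.b08-1.el7_9.x86_64
  export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
  export PATH=$PATH:$JAVA_HOME/bin
  Important: replace JAVA_HOME with the installation path of your JDK, which you can find with the find / -name 'java' command.
  - Press Esc, then use :wq to save the file and exit vi. Then run the following command to apply the configuration.
  source ~/.bash_profile
- Run the following command to verify that the JDK was installed.
java -version
The following output indicates that the JDK was installed successfully.
openjdk version "1.8.0_362"
OpenJDK Runtime Environment (build 1.8.0_362-b08)
OpenJDK 64-Bit Server VM (build 25.362-b08, mixed mode)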
Step 4: Install and start Canal-adapter
First start the canal server installed above.
- Download Canal-adapter.
This article uses version 1.1.7.
wget https://github.com/alibaba/canal/releases/download/canal-1.1.7/canal.adapter-1.1.7.tar.gz
Note
- If you are on ES 8.x, the Canal version must be 1.1.7 or later.
- Canal 1.1.5 supports ES 7.x; if you use ES 7.x, download Canal 1.1.5. For details, see the Canal release notes.
- Extract the archive.
mkdir canal_adapter
tar -zxvf canal.adapter-1.1.7.tar.gz -C canal_adapter/
- Edit the conf/application.yml file.
vim conf/application.yml
server:
  port: 8081
spring:
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
    time-zone: GMT+8
    default-property-inclusion: non_null

canal.conf:
  mode: tcp #tcp kafka rocketMQ rabbitMQ
  flatMessage: true
  zookeeperHosts:
  syncBatchSize: 1000
  retries: 0
  timeout:
  accessKey:
  secretKey:
  consumerProperties:
    # canal tcp consumer
    canal.tcp.server.host: 127.0.0.1:11111
    canal.tcp.zookeeper.hosts:
    canal.tcp.batch.size: 500
    canal.tcp.username:
    canal.tcp.password:
    # kafka consumer
    kafka.bootstrap.servers: 127.0.0.1:9092
    kafka.enable.auto.commit: false
    kafka.auto.commit.interval.ms: 1000
    kafka.auto.offset.reset: latest
    kafka.request.timeout.ms: 40000
    kafka.session.timeout.ms: 30000
    kafka.isolation.level: read_committed
    kafka.max.poll.records: 1000
    # rocketMQ consumer
    rocketmq.namespace:
    rocketmq.namesrv.addr: 127.0.0.1:9876
    rocketmq.batch.size: 1000
    rocketmq.enable.message.trace: false
    rocketmq.customized.trace.topic:
    rocketmq.access.channel:
    rocketmq.subscribe.filter:
    # rabbitMQ consumer
    rabbitmq.host:
    rabbitmq.virtual.host:
    rabbitmq.username:
    rabbitmq.password:
    rabbitmq.resource.ownerId:

  # 1. Uncomment this section and keep the indentation aligned; this cost me a lot of time
  srcDataSources:
    defaultDS:
      url: jdbc:mysql://127.0.0.1:3306/elastic-search-lab?useUnicode=true
      username: root
      password: 123456
  canalAdapters:
  - instance: example # canal instance Name or mq topic name
    groups:
    - groupId: g1
      outerAdapters:
      - name: logger
      # 2. Keep the indentation aligned, and change the name to es6/es7/es8 to match your ES version
      - name: es8
        hosts: http://es-cn-xxxxxxxxxxxxx.public.elasticsearch.aliyuncs.com:9200 # 127.0.0.1:9200 for rest mode
        properties:
          mode: rest # transport or rest
          security.auth: elastic:xxxxxxxxxx # only used for rest mode
          cluster.name: es-cn-xxxxxxxxxxxxx
#      - name: rdb
#        key: mysql1
#        properties:
#          jdbc.driverClassName: com.mysql.jdbc.Driver
#          jdbc.url: jdbc:mysql://127.0.0.1:3306/mytest2?useUnicode=true
#          jdbc.username: root
#          jdbc.password: 121212
#          druid.stat.enable: false
#          druid.stat.slowSqlMillis: 1000
#      - name: rdb
#        key: oracle1
#        properties:
#          jdbc.driverClassName: oracle.jdbc.OracleDriver
#          jdbc.url: jdbc:oracle:thin:@localhost:49161:XE
#          jdbc.username: mytest
#          jdbc.password: m121212
#      - name: rdb
#        key: postgres1
#        properties:
#          jdbc.driverClassName: org.postgresql.Driver
#          jdbc.url: jdbc:postgresql://localhost:5432/postgres
#          jdbc.username: postgres
#          jdbc.password: 121212
#          threads: 1
#          commitSize: 3000
#      - name: hbase
#        properties:
#          hbase.zookeeper.quorum: 127.0.0.1
#          hbase.zookeeper.property.clientPort: 2181
#          zookeeper.znode.parent: /hbase
#      - name: kudu
#        key: kudu
#        properties:
#          kudu.master.address: 127.0.0.1 # ',' split multi address
#      - name: phoenix
#        key: phoenix
#        properties:
#          jdbc.driverClassName: org.apache.phoenix.jdbc.PhoenixDriver
#          jdbc.url: jdbc:phoenix:127.0.0.1:2181:/hbase/db
#          jdbc.username:
#          jdbc.password:
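| Setting | Description |
| --- | --- |
| canal.conf.consumerProperties.canal.tcp.server.host | Address of the canal deployer (server). Keep the default (127.0.0.1:11111). |
| canal.conf.srcDataSources.defaultDS.url | JDBC URL of the MySQL source, containing the ip:port where MySQL runs. |
| canal.conf.srcDataSources.defaultDS.username | MySQL account name (for RDS MySQL, available on the instance's account management page). |
| canal.conf.srcDataSources.defaultDS.password | MySQL account password. |
| canal.conf.canalAdapters.groups.outerAdapters.hosts | Find the adapter entry named es (es8 here) and replace hosts with <public address of the ES instance>:<port>; both are shown on the ES instance's basic information page. For example, es-cn-v64xxxxxxxxx3medp.elasticsearch.aliyuncs.com:9200. |
| canal.conf.canalAdapters.groups.outerAdapters.properties.mode | Must be set to rest. |
| canal.conf.canalAdapters.groups.outerAdapters.properties.security.auth | Set to <ES instance account>:<password>, for example elastic:es_password. |
| canal.conf.canalAdapters.groups.outerAdapters.properties.cluster.name | ID of the ES instance, shown on its basic information page. For example, es-cn-v64xxxxxxxxx3medp. |
- Press Esc, then use the :wq command to save the file and exit vi.
- Edit conf/es8/mytest_user.yml (use the directory that matches your ES version).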
dataSourceKey: defaultDS
destination: example
groupId: g1
esMapping:
  # 1. Change the index
  _index: canal_product
  _id: _id
#  upsert: true
#  pk: id
  # 2. Change the SQL
  sql: "SELECT
          p.id AS _id,
          p.title,
          p.sub_title,
          p.price,
          p.pic
        FROM
          product p"
#  objFields:
#    _labels: array:;
  etlCondition: "where a.c_time>={}"
  commitBatch: 3000
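Because the fields in the index mapping must line up with what the sync SQL returns, it can help to compare the two before starting the adapter. The sketch below extracts the output column names from the SELECT list (honouring AS aliases) and diffs them against the properties of the canal_product mapping from step 2. The helper select_output_names is hypothetical and only handles simple comma-separated SELECT lists like the one above; it is not how Canal itself parses the SQL.

```python
import re

# Field names from the canal_product mapping created in step 2.
MAPPING_FIELDS = {"title", "sub_title", "pic", "price"}

SQL = """SELECT
  p.id AS _id,
  p.title,
  p.sub_title,
  p.price,
  p.pic
FROM
  product p"""

def select_output_names(sql):
    """Return the output column names of a simple SELECT ... FROM statement."""
    found = re.search(r"select\s+(.*?)\s+from\s", sql, flags=re.IGNORECASE | re.DOTALL)
    if found is None:
        raise ValueError("no SELECT ... FROM clause found")
    names = []
    for item in found.group(1).split(","):
        item = item.strip()
        alias = re.search(r"\s+as\s+(\w+)$", item, flags=re.IGNORECASE)
        # Use the alias if present, otherwise the column name without its table prefix.
        names.append(alias.group(1) if alias else item.split(".")[-1])
    return names

columns = select_output_names(SQL)
missing = MAPPING_FIELDS - set(columns)          # mapped fields the SQL never fills
extra = set(columns) - MAPPING_FIELDS - {"_id"}  # SQL columns absent from the mapping
print(columns, missing, extra)
```

An empty missing set and an empty extra set mean the names agree; type compatibility still has to be checked by hand.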
Start the Canal-adapter service and check the log.
./bin/startup.sh
cat logs/adapter/adapter.log
# Log output of a successful start
2024-03-06 01:16:01.939 [main] INFO c.a.otter.canal.adapter.launcher.CanalAdapterApplication - Starting CanalAdapterApplication using Java 1.8.0_392 on JiuYou. with PID 15483 (/opt/canal.adapter-1.1.7/lib/client-adapter.launcher-1.1.7.jar started by root in /opt/canal.adapter-1.1.7/bin)
2024-03-06 01:16:01.949 [main] INFO c.a.otter.canal.adapter.launcher.CanalAdapterApplication - No active profile set, falling back to default profiles: default
2024-03-06 01:16:02.431 [main] INFO org.springframework.cloud.context.scope.GenericScope - BeanFactory id=a6cee8d1-48d1-3e64-a5f9-a1d1e90caee9
2024-03-06 01:16:02.631 [main] INFO o.s.boot.web.embedded.tomcat.TomcatWebServer - Tomcat initialized with port(s): 8081 (http)
2024-03-06 01:16:02.639 [main] INFO org.apache.coyote.http11.Http11NioProtocol - Initializing ProtocolHandler ["http-nio-8081"]
2024-03-06 01:16:02.639 [main] INFO org.apache.catalina.core.StandardService - Starting service [Tomcat]
2024-03-06 01:16:02.639 [main] INFO org.apache.catalina.core.StandardEngine - Starting Servlet engine: [Apache Tomcat/9.0.52]
2024-03-06 01:16:02.681 [main] INFO o.a.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring embedded WebApplicationContext
2024-03-06 01:16:02.681 [main] INFO o.s.b.w.s.context.ServletWebServerApplicationContext - Root WebApplicationContext: initialization completed in 640 ms
2024-03-06 01:16:03.132 [main] INFO com.alibaba.druid.pool.DruidDataSource - {dataSource-1} inited
2024-03-06 01:16:03.453 [main] INFO org.apache.coyote.http11.Http11NioProtocol - Starting ProtocolHandler ["http-nio-8081"]
2024-03-06 01:16:03.461 [main] INFO o.s.boot.web.embedded.tomcat.TomcatWebServer - Tomcat started on port(s): 8081 (http) with context path ''
2024-03-06 01:16:03.464 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## syncSwitch refreshed.
2024-03-06 01:16:03.464 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## start the canal client adapters.
2024-03-06 01:16:03.467 [main] INFO c.a.otter.canal.client.adapter.support.ExtensionLoader - extension classpath dir: /opt/canal.adapter-1.1.7/plugin
2024-03-06 01:16:03.487 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: logger succeed
2024-03-06 01:16:03.633 [main] INFO c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## Start loading es mapping config ...
2024-03-06 01:16:03.655 [main] INFO c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## ES mapping config loaded
2024-03-06 01:16:03.864 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: es8 succeed
2024-03-06 01:16:03.869 [main] INFO c.alibaba.otter.canal.connector.core.spi.ExtensionLoader - extension classpath dir: /opt/canal.adapter-1.1.7/plugin
2024-03-06 01:16:03.880 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Start adapter for canal-client mq topic: example-g1 succeed
2024-03-06 01:16:03.880 [Thread-4] INFO c.a.otter.canal.adapter.launcher.loader.AdapterProcessor - =============> Start to connect destination: example <=============
2024-03-06 01:16:03.880 [main] INFO c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## the canal client adapters are running now ......
2024-03-06 01:16:03.886 [main] INFO c.a.otter.canal.adapter.launcher.CanalAdapterApplication - Started CanalAdapterApplication in 2.246 seconds (JVM running for 2.59)
2024-03-06 01:16:03.959 [Thread-4] INFO c.a.otter.canal.adapter.launcher.loader.AdapterProcessor - =============> Subscribe destination: example succeed <=============
Step 5: Verify incremental data sync
- In the MySQL database, insert, update, or delete rows in the product table.
INSERT INTO product ( id, title, sub_title, price, pic ) VALUES ( 15, '小米8', ' 全面屏遊戲智慧手機 6GB+64GB', 1999.00, NULL );
- Check the log.
2024-03-06 01:17:28.284 [pool-3-thread-1] INFO c.a.o.canal.client.adapter.logger.LoggerAdapterExample - DML: {"data":[{"id":15,"title":"小米8","sub_title":" 全面屏遊戲智慧手機 6GB+64GB","price":1999.0,"pic":null}],"database":"elastic-search-lab","destination":"example","es":1709659047000,"groupId":"g1","isDdl":false,"old":null,"pkNames":["id"],"sql":"","table":"product","ts":1709659048171,"type":"INSERT"}
2024-03-06 01:17:28.790 [pool-3-thread-1] DEBUG c.a.o.canal.client.adapter.es.core.service.ESSyncService - DML: {"data":[{"id":15,"title":"小米8","sub_title":" 全面屏遊戲智慧手機 6GB+64GB","price":1999.0,"pic":null}],"database":"elastic-search-lab","destination":"example","es":1709659047000,"groupId":"g1","isDdl":false,"old":null,"pkNames":["id"],"sql":"","table":"product","ts":1709659048171,"type":"INSERT"}
Affected indexes: canal_product
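- Log in to the Kibana console of the target Alibaba Cloud ES instance. For details, see "Log in to the Kibana console".
- In the left navigation pane, click Dev Tools.
- In Console, run the following command to query the synced data.
GET /canal_product/_search
The response should include the document for the row inserted above.
Canal syncs incremental data only; data that existed before it started is not synced.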
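To see how a DML event from the adapter log in step 5 relates to the document that ends up in canal_product, here is an illustrative sketch that parses the logged JSON payload and derives an index action from it, using the primary key as the document _id (mirroring p.id AS _id in the mapping file). The function to_es_actions and the shape of its return value are assumptions made for illustration; this is not Canal's actual implementation.

```python
import json

# Payload copied from the "DML:" log line in step 5.
LOG_PAYLOAD = (
    '{"data":[{"id":15,"title":"小米8","sub_title":" 全面屏遊戲智慧手機 6GB+64GB",'
    '"price":1999.0,"pic":null}],"database":"elastic-search-lab",'
    '"destination":"example","es":1709659047000,"groupId":"g1","isDdl":false,'
    '"old":null,"pkNames":["id"],"sql":"","table":"product",'
    '"ts":1709659048171,"type":"INSERT"}'
)

def to_es_actions(dml, index_name):
    """Turn one DML event into a list of per-row actions (illustrative shape only)."""
    pk = dml["pkNames"][0]
    actions = []
    for row in dml["data"]:
        # The primary key becomes the document id; the remaining columns form the source.
        source = {key: value for key, value in row.items() if key != pk}
        actions.append({
            "op": dml["type"],
            "_index": index_name,
            "_id": str(row[pk]),
            "_source": source,
        })
    return actions

actions = to_es_actions(json.loads(LOG_PAYLOAD), "canal_product")
print(json.dumps(actions[0], ensure_ascii=False, indent=2))
```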
3. Install and configure ClientAdapter (Elastic Cloud)
To be updated...