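Image Caching for Edge Clusters
1. Background
When a large number of edge nodes pull the same image from a private registry in the central cloud, the cloud server comes under heavy load. If the edge side pulls each image only once and caches it there, later pulls from edge nodes can be served directly from the cache, which greatly reduces the load on the cloud server. This works much like a CDN for images.
This experiment needs at least three nodes: one cloud node that hosts our private image registry, and two edge nodes, one that caches images and one that runs the pull tests.
2. Environment Setup
2.1. Setting up the private Harbor registry in the cloud
2.1.1. Installing Docker
# Install from the official repository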
https://docs.docker.com/install/linux/docker-ce/centos
# Install from binaries
https://docs.docker.com/install/linux/docker-ce/binaries
https://download.docker.com/linux/static/stable # download the binaries
2.1.2. Installing docker-compose
curl -L https://github.com/docker/compose/releases/download/1.27.4/docker-compose-`uname -s`-`uname -m` -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
docker-compose --version
2.1.3. Installing Harbor
wget https://github.com/goharbor/harbor/releases/download/v2.1.1/harbor-offline-installer-v2.1.1.tgz
# Extract into the home directory
tar xvf harbor-offline-installer-v2.1.1.tgz -C /home/xing && cd /home/xing/harbor/
Edit harbor.yml. The directories referenced in the configuration file must be created beforehand.
# Configuration file of Harbor
# The IP address or hostname to access admin UI and registry service.
# DO NOT use localhost or 127.0.0.1, because Harbor needs to be accessed by external clients.
hostname: harbor.xing.com
# http related config
#http:
#  # port for http, default is 80. If https enabled, this port will redirect to https port
#  port: 80
# https related config
https:
  # https port for harbor, default is 443
  port: 443
  # The path of cert and key files for nginx
  certificate: /home/xing/harbor/certs/harbor.crt
  private_key: /home/xing/harbor/certs/harbor.key
# # Uncomment following will enable tls communication between all harbor components
# internal_tls:
# # set enabled to true means internal tls is enabled
# enabled: true
# # put your cert and key files on dir
# dir: /etc/harbor/tls/internal
# Uncomment external_url if you want to enable external proxy
# And when it enabled the hostname will no longer used
# external_url: https://reg.mydomain.com:8433
# The initial password of Harbor admin
# It only works in first time to install harbor
# Remember Change the admin password from UI after launching Harbor.
harbor_admin_password: Harbor12345
# Harbor DB configuration
database:
  # The password for the root user of Harbor DB. Change this before any production use.
  password: root123
  # The maximum number of connections in the idle connection pool. If it <=0, no idle connections are retained.
  max_idle_conns: 50
  # The maximum number of open connections to the database. If it <= 0, then there is no limit on the number of open connections.
  # Note: the default number of connections is 1024 for postgres of harbor.
  max_open_conns: 1000
# The default data volume
data_volume: /home/xing/harbor/data
# Harbor Storage settings by default is using /data dir on local filesystem
# Uncomment storage_service setting If you want to using external storage
# storage_service:
# # ca_bundle is the path to the custom root ca certificate, which will be injected into the truststore
# # of registry's and chart repository's containers. This is usually needed when the user hosts a internal storage with self signed certificate.
# ca_bundle:
# # storage backend, default is filesystem, options include filesystem, azure, gcs, s3, swift and oss
# # for more info about this configuration please refer https://docs.docker.com/registry/configuration/
# filesystem:
# maxthreads: 100
# # set disable to true when you want to disable registry redirect
# redirect:
# disabled: false
# Trivy configuration
#
# Trivy DB contains vulnerability information from NVD, Red Hat, and many other upstream vulnerability databases.
# It is downloaded by Trivy from the GitHub release page https://github.com/aquasecurity/trivy-db/releases and cached
# in the local file system. In addition, the database contains the update timestamp so Trivy can detect whether it
# should download a newer version from the Internet or use the cached one. Currently, the database is updated every
# 12 hours and published as a new release to GitHub.
trivy:
  # ignoreUnfixed The flag to display only fixed vulnerabilities
  ignore_unfixed: false
  # skipUpdate The flag to enable or disable Trivy DB downloads from GitHub
  #
  # You might want to enable this flag in test or CI/CD environments to avoid GitHub rate limiting issues.
  # If the flag is enabled you have to download the `trivy-offline.tar.gz` archive manually, extract `trivy.db` and
  # `metadata.json` files and mount them in the `/home/scanner/.cache/trivy/db` path.
  skip_update: false
  #
  # insecure The flag to skip verifying registry certificate
  insecure: false
# github_token The GitHub access token to download Trivy DB
#
# Anonymous downloads from GitHub are subject to the limit of 60 requests per hour. Normally such rate limit is enough
# for production operations. If, for any reason, it's not enough, you could increase the rate limit to 5000
# requests per hour by specifying the GitHub access token. For more details on GitHub rate limiting please consult
# https://developer.github.com/v3/#rate-limiting
#
# You can create a GitHub token by following the instructions in
# https://help.github.com/en/github/authenticating-to-github/creating-a-personal-access-token-for-the-command-line
#
# github_token: xxx
jobservice:
  # Maximum number of job workers in job service
  max_job_workers: 10
notification:
  # Maximum retry count for webhook job
  webhook_job_max_retry: 10
chart:
  # Change the value of absolute_url to enabled can enable absolute url in chart
  absolute_url: disabled
# Log configurations
log:
  # options are debug, info, warning, error, fatal
  level: info
  # configs for logs in local storage
  local:
    # Log files are rotated log_rotate_count times before being removed. If count is 0, old versions are removed rather than rotated.
    rotate_count: 50
    # Log files are rotated only if they grow bigger than log_rotate_size bytes. If size is followed by k, the size is assumed to be in kilobytes.
    # If the M is used, the size is in megabytes, and if G is used, the size is in gigabytes. So size 100, size 100k, size 100M and size 100G
    # are all valid.
    rotate_size: 200M
    # The directory on your host that store log
    location: /var/log/harbor
# Uncomment following lines to enable external syslog endpoint.
# external_endpoint:
# # protocol used to transmit log to external endpoint, options is tcp or udp
# protocol: tcp
# # The host of external endpoint
# host: localhost
# # Port of external endpoint
# port: 5140
#This attribute is for migrator to detect the version of the .cfg file, DO NOT MODIFY!
_version: 2.2.0
# Uncomment external_database if using external database.
# external_database:
# harbor:
# host: harbor_db_host
# port: harbor_db_port
# db_name: harbor_db_name
# username: harbor_db_username
# password: harbor_db_password
# ssl_mode: disable
# max_idle_conns: 2
# max_open_conns: 0
# notary_signer:
# host: notary_signer_db_host
# port: notary_signer_db_port
# db_name: notary_signer_db_name
# username: notary_signer_db_username
# password: notary_signer_db_password
# ssl_mode: disable
# notary_server:
# host: notary_server_db_host
# port: notary_server_db_port
# db_name: notary_server_db_name
# username: notary_server_db_username
# password: notary_server_db_password
# ssl_mode: disable
# Uncomment external_redis if using external Redis server
# external_redis:
# # support redis, redis+sentinel
# # host for redis: <host_redis>:<port_redis>
# # host for redis+sentinel:
# # <host_sentinel1>:<port_sentinel1>,<host_sentinel2>:<port_sentinel2>,<host_sentinel3>:<port_sentinel3>
# host: redis:6379
# password:
# # sentinel_master_set must be set to support redis+sentinel
# #sentinel_master_set:
# # db_index 0 is for core, it's unchangeable
# registry_db_index: 1
# jobservice_db_index: 2
# chartmuseum_db_index: 3
# trivy_db_index: 5
# idle_timeout_seconds: 30
# Uncomment uaa for trusting the certificate of uaa instance that is hosted via self-signed cert.
# uaa:
# ca_file: /path/to/ca
# Global proxy
# Config http proxy for components, e.g. http://my.proxy.com:3128
# Components doesn't need to connect to each others via http proxy.
# Remove component from `components` array if want disable proxy
# for it. If you want use proxy for replication, MUST enable proxy
# for core and jobservice, and set `http_proxy` and `https_proxy`.
# Add domain to the `no_proxy` field, when you want disable proxy
# for some special registry.
proxy:
  http_proxy:
  https_proxy:
  no_proxy:
  components:
    - core
    - jobservice
    - trivy
# metric:
# enabled: false
# port: 9090
# path: /metrics
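2.1.4. Generating a self-signed certificate with openssl
# 1. Generate the certificate and save it under /home/xing/harbor/certs
openssl req -newkey rsa:4096 -nodes -sha256 -keyout /home/xing/harbor/certs/harbor.key -x509 -out /home/xing/harbor/certs/harbor.crt -subj /C=CN/ST=BJ/L=BJ/O=DEVOPS/CN=harbor.xing.com -days 3650
req the openssl command for creating certificate signing requests
-newkey generate a new private key
rsa:4096 use an RSA key of 4096 bits
-nodes do not encrypt the private key
-sha256 sign with the SHA-256 digest (SHA-2 family)
-keyout file to write the newly created private key to
-x509 output a self-signed X.509 certificate instead of a signing request. X.509 is the most widely used certificate format.
-out file to write the output to
-subj subject information for the certificate
-days validity period in days (3650 is about ten years)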
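2.1.5. Starting the Harbor service
./install.sh
Edit the hosts file to add the domain name. You can also log in directly at https://hostip (HTTPS listens on port 443 as configured above, not port 80). The default username and password are admin / Harbor12345, as set in the configuration file.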
2.2. Pushing images to the private registry
2.2.1. Adding the private registry as a trusted registry
# 1. Add the registry address
vi /etc/docker/daemon.json
{
"registry-mirrors": ["https://k1ktap5m.mirror.aliyuncs.com"],
"insecure-registries": ["172.16.9.3","harbor.xing.com"]
}
# 2. Restart the docker service
systemctl daemon-reload
systemctl restart docker
# Add the certificate
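Editing daemon.json by hand on many edge nodes is error-prone, so the step above can also be scripted. The following is only a minimal sketch: the helper name add_insecure_registry is my own, and it works on the file's text rather than on /etc/docker/daemon.json directly, so reading and writing the real file (and restarting docker) is left to the caller.

```python
import json


def add_insecure_registry(config_text: str, registry: str) -> str:
    """Return daemon.json text with `registry` listed under insecure-registries.

    Existing keys (for example registry-mirrors) are preserved, and the
    registry is not added twice.
    """
    # An empty or missing file is treated as an empty configuration.
    config = json.loads(config_text) if config_text.strip() else {}
    registries = config.setdefault("insecure-registries", [])
    if registry not in registries:
        registries.append(registry)
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    current = '{"insecure-registries": ["172.16.9.3"]}'
    print(add_insecure_registry(current, "harbor.xing.com"))
```

The function is idempotent, so it can be run repeatedly on the same node without duplicating entries.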
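2.2.2. Logging in to the private registry
docker login harbor.xing.com # enter the Harbor username and password when prompted
2.2.3. Pushing and pulling images with Harbor
# 1. Tag the local image for the private registry
# Format: docker tag local-image:tag harbor-host[:port]/project/repository:tag
docker tag nginx:latest harbor.xing.com/xing/mynginx:v1
# 2. Push the image
docker push harbor.xing.com/xing/mynginx:v1
# 3. Pull the image
docker pull harbor.xing.com/xing/mynginx:v1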
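The tag format above (host, optional port, project, repository, tag) is easy to get wrong when typed by hand. As a small illustration, the sketch below assembles the reference from its parts. The function name harbor_reference is hypothetical and only builds the string; it does not talk to Docker or Harbor.

```python
from typing import Optional


def harbor_reference(host: str, project: str, repository: str, tag: str,
                     port: Optional[int] = None) -> str:
    """Build an image reference of the form host[:port]/project/repository:tag."""
    registry = f"{host}:{port}" if port is not None else host
    return f"{registry}/{project}/{repository}:{tag}"


# The same reference used in the commands above.
target = harbor_reference("harbor.xing.com", "xing", "mynginx", "v1")
commands = [
    f"docker tag nginx:latest {target}",
    f"docker push {target}",
]

if __name__ == "__main__":
    for command in commands:
        print(command)
```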
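2.3. Using Harbor in a k8s cluster
k8s accesses Harbor over HTTPS by default. To access it over HTTP, /etc/docker/daemon.json has to be modified on every node in the cluster.
Because Harbor uses username and password authentication, a secret must be configured for pulling images.
# Create a secret for the Docker registry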
kubectl create secret docker-registry registry-secret --namespace=default \
--docker-server=172.16.9.3 \
--docker-username=admin \
--docker-password=Harbor12345
# View the secret
[root@master demo]# kubectl get secret
NAME                  TYPE                                  DATA   AGE
default-token-gdwgn   kubernetes.io/service-account-token   3      2d18h
registry-secret       kubernetes.io/dockerconfigjson        1      116s
# Delete the secret
kubectl delete secret registry-secret
From here on, set the image field under containers to the Harbor image address and reference the secret through imagePullSecrets in the pod spec.
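To make it clearer what the kubernetes.io/dockerconfigjson secret holds and how a pod refers to it, here is a rough sketch. The two helper functions and the pod name mynginx are my own illustrations. The payload has the same general shape as the .dockerconfigjson value that kubectl generates, and the image uses 172.16.9.3 as its registry host on the assumption that it must match the --docker-server value given when the secret was created.

```python
import base64
import json


def docker_config_json(server: str, username: str, password: str) -> str:
    """Build a .dockerconfigjson payload for one registry server."""
    # The auth field is base64("username:password").
    auth = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    config = {
        "auths": {
            server: {"username": username, "password": password, "auth": auth}
        }
    }
    return json.dumps(config)


def pod_manifest(name: str, image: str, pull_secret: str) -> dict:
    """Build a minimal Pod manifest that pulls `image` using `pull_secret`."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{"name": name, "image": image}],
            "imagePullSecrets": [{"name": pull_secret}],
        },
    }


payload = docker_config_json("172.16.9.3", "admin", "Harbor12345")
pod = pod_manifest("mynginx", "172.16.9.3/xing/mynginx:v1", "registry-secret")

if __name__ == "__main__":
    print(payload)
    print(json.dumps(pod, indent=2))
```

The printed manifest can be saved as JSON and applied with kubectl apply -f, since kubectl accepts JSON as well as YAML.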
3. Deploying a registry cache at the edge
3.1. Running the service
The Docker registry cache service can be started either from the official image or from a binary.
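- Start the service
Binary:
git clone https://github.com/distribution/distribution.git
cd distribution
make binaries
# then go into the bin directory and start the registry (the configuration file must already be written)
cd bin
./registry serve /etc/docker/registry/config.yml
Docker: mount the relevant volumes into the container.
docker run -itd -p 5000:5000 -v /var/lib/registry:/var/lib/registry -v /etc/docker/registry/config.yml:/etc/docker/registry/config.yml --name registry registry:2 # the port must be exposed to the outside; it is defined in the configuration file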
- Write the distribution configuration file, which lives at /etc/docker/registry/config.yml by default
version: 0.1
log:
  fields:
    service: registry
storage:
  cache:
    blobdescriptor: inmemory
  filesystem:
    rootdirectory: /var/lib/registry
http:
  addr: :5000 # service port
  headers:
    X-Content-Type-Options: [nosniff]
proxy:
  remoteurl: https://harbor.xing.com # address of the private registry
  username: *** # username
  password: *** # password
health:
  storagedriver:
    enabled: true
    interval: 10s
    threshold: 3
- Edit the hosts file
vi /etc/hosts
# add the domain name
10.10.102.190 harbor.xing.com # this IP is the address of the Harbor registry
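If the service is exposed on port 80, docker pull can omit the port. With the default exposed port 5000, the port must be included when pulling:
docker pull localhost:5000/xing/imagecache:v2
To push an image, first retag it with the new name.
However, this only pushes to the cache; the remote registry is not changed.
Cached images are stored under /var/lib/registry/docker/registry/v2/repositories/<project>/<repository>.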
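When checking whether a given image has actually been cached, it helps to compute that directory instead of typing it. This is just a sketch based on the layout described above. The function name cached_repo_path is hypothetical, and the root is assumed to be the rootdirectory value from config.yml.

```python
from pathlib import PurePosixPath

# rootdirectory from the registry's config.yml (assumed to be the default used above).
REGISTRY_ROOT = PurePosixPath("/var/lib/registry")


def cached_repo_path(project: str, repository: str,
                     root: PurePosixPath = REGISTRY_ROOT) -> PurePosixPath:
    """Return the directory where the cache keeps metadata for project/repository."""
    return root / "docker" / "registry" / "v2" / "repositories" / project / repository


if __name__ == "__main__":
    # Directory to inspect on the cache node after pulling xing/imagecache.
    print(cached_repo_path("xing", "imagecache"))
```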
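Other nodes can use the cached images simply by pointing the domain name at the cache node:
vi /etc/hosts
192.168.123.160 harbor.xing.com # this is the IP of the cache node, NOT the IP of the Harbor registry!
# When the command below runs on another node (the port can be omitted if it is 80), the pull request goes to the cache registry. If the cache has the image, it returns it directly. If not, the cache registry pulls the image from the remoteurl configured in config.yml, that is, our private Harbor registry, caches it, and returns it to the requesting node. Because the cache itself fetches from the source on a miss, this corresponds to the Read-Through strategy of section 4.2 rather than Cache-Aside.
docker pull harbor.xing.com:5000/xing/imagecache:v2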
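3.2. Problems encountered
- When I started the service with docker, pulling an image reported an error.
Our harbor.xing.com domain could not be resolved. I assumed the name was not being resolved inside the container, so I tried editing the hosts file inside the container, which did not help. I then switched the container to host networking, which did not help either. In the end I built the source into a binary and ran that instead; it reported no errors and cached images normally.
- Port issue across platforms: I deployed the service on x86_64 and it ran normally. Because we also have arm64 edge nodes, I built an arm binary and ran it on an edge node. There, docker pull must include the port; even port 80 cannot be omitted. Otherwise it reports the error
invalid character '<' looking for beginning of value
The cause is unknown.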
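3.3. Notes
- docker pull uses HTTPS by default. Because pulling from a private address may use HTTP, daemon.json has to be modified: whenever a requested domain reports an HTTPS error, add that domain to the insecure-registries field.
- Certificates
The harbor.crt generated earlier has to be imported on the cache node and the trusted certificates updated (the procedure depends on the distribution). If the error
certificate relies on legacy Common Name field
appears after the update, either downgrade Go to a version before 1.15, or set the environment variable GODEBUG=x509ignoreCN=0 at run time (this switch works in Go 1.15 and 1.16 only). Reissuing the certificate with a subjectAltName avoids the problem altogether.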
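4. Caching strategies (background)
4.1. Cache-Aside
With this strategy the application talks to both the cache and the data source, and checks the cache before going to the data source.
The application reads from the cache first. On a miss it fetches the data from the data source and then stores it in the cache.
Advantages
- Suited to read-heavy workloads
- Tolerates cache failures to some extent: if the cache service fails, the system can still read from the data source directly
Disadvantages
- Does not guarantee consistency between the data store and the cache
- The first request for any piece of data is always a cache miss (this can be mitigated by warming the cache with manually triggered queries)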
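The read path of Cache-Aside can be sketched in a few lines. This is an illustration only: the dict stands in for a real cache, and fake_source is a hypothetical stand-in for the data source that records how often it is queried.

```python
from typing import Callable, Dict, List


def cache_aside_get(key: str, cache: Dict[str, bytes],
                    load_from_source: Callable[[str], bytes]) -> bytes:
    """Application-side read: check the cache, fall back to the source on a miss."""
    if key in cache:
        return cache[key]            # hit: the data source is not touched
    value = load_from_source(key)    # miss: the application queries the source itself
    cache[key] = value               # and then populates the cache for later reads
    return value


# Hypothetical data source that records every query it receives.
source_calls: List[str] = []


def fake_source(key: str) -> bytes:
    source_calls.append(key)
    return f"layers-of-{key}".encode()


cache: Dict[str, bytes] = {}
first = cache_aside_get("xing/mynginx:v1", cache, fake_source)   # miss
second = cache_aside_get("xing/mynginx:v1", cache, fake_source)  # hit

if __name__ == "__main__":
    print(len(source_calls), "query reached the data source")
```

Note that the application code holds references to both the cache and the source; that is what distinguishes this strategy from Read-Through.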
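4.2. Read-Through
With this strategy the application does not manage the data source and the cache itself; it delegates synchronization with the data source to the cache provider. All data access goes through the cache abstraction layer.
Read-Through suits scenarios where the same data is requested many times.
Advantages
- Reduces load on the data source under heavy reads
- Also has some resilience to cache service failures
Disadvantages
- The first request for a piece of data is a cache miss; cache warming addresses this as well
Compared with Cache-Aside, the actual operations on the cache and the data source are carried out by the cache provider.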
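For comparison, here is a minimal Read-Through sketch. The class ReadThroughCache and the loader pull_from_upstream are hypothetical names. The point is that the caller only ever talks to the cache, and the cache calls the loader on a miss.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Cache provider that loads missing entries from the data source itself."""

    def __init__(self, loader: Callable[[str], bytes]) -> None:
        self._loader = loader
        self._entries: Dict[str, bytes] = {}
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key not in self._entries:
            # Miss: the cache, not the application, fetches from the source.
            self.misses += 1
            self._entries[key] = self._loader(key)
        return self._entries[key]


def pull_from_upstream(key: str) -> bytes:
    """Hypothetical stand-in for fetching a blob from the upstream source."""
    return f"blob:{key}".encode()


image_cache = ReadThroughCache(pull_from_upstream)
a = image_cache.get("xing/imagecache:v2")  # first request: miss, loaded by the cache
b = image_cache.get("xing/imagecache:v2")  # second request: served from the cache

if __name__ == "__main__":
    print("misses:", image_cache.misses)
```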
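4.3. Write-Through
With this strategy, when data is updated the cache provider is responsible for updating both the data source and the cache. The cache stays consistent with the data source, and writes always reach the data source through the cache abstraction layer.
Because data has to be written synchronously to both the cache and the data source, writes are slower. Combined with Read-Through, however, we get all the benefits of Read-Through plus a data consistency guarantee.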
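A rough sketch of Write-Through combined with Read-Through might look as follows. The class name WriteThroughCache is my own, and a plain dict stands in for the data source, so the example shows only the ordering of the writes, not real I/O or failure handling.

```python
from typing import Dict, Optional


class WriteThroughCache:
    """Cache provider that writes to the data source and the cache synchronously."""

    def __init__(self, store: Dict[str, str]) -> None:
        self._store = store                  # stand-in for the data source
        self._entries: Dict[str, str] = {}   # the cache itself

    def put(self, key: str, value: str) -> None:
        # The write returns only after both the source and the cache are updated.
        self._store[key] = value
        self._entries[key] = value

    def get(self, key: str) -> Optional[str]:
        # Read-through on a miss, so reads also go through the cache layer.
        if key not in self._entries and key in self._store:
            self._entries[key] = self._store[key]
        return self._entries.get(key)


backing_store: Dict[str, str] = {"old": "v0"}
wt_cache = WriteThroughCache(backing_store)
wt_cache.put("mynginx", "v1")

if __name__ == "__main__":
    print(backing_store["mynginx"], wt_cache.get("mynginx"), wt_cache.get("old"))
```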
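4.4. Write-Behind
If strong consistency is not required, update requests can simply be queued in the cache and flushed to the data store periodically.
With Write-Behind, a data update is written only to the cache.
Advantages
- Writes are fast, which suits write-heavy workloads
- Combined with Read-Through it works well for mixed workloads: recently updated and recently accessed data is always available in the cache
- Tolerates data source failures and some data source downtime
Disadvantages
- If the cache fails (for example on power loss) before updated data has been written to the data source, that data cannot be recovered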
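The queue-and-flush behaviour, including the window in which data can be lost, can be illustrated with the sketch below. WriteBehindCache is a hypothetical class, a dict stands in for the data source, and flush() is called by hand here, whereas a real deployment would presumably call it from a timer or background worker.

```python
from collections import OrderedDict
from typing import Dict, Optional


class WriteBehindCache:
    """Cache that accepts writes immediately and persists them later in batches."""

    def __init__(self, store: Dict[str, str]) -> None:
        self._store = store                  # stand-in for the data source
        self._entries: Dict[str, str] = {}   # the cache itself
        self._pending: "OrderedDict[str, str]" = OrderedDict()  # queued updates

    def put(self, key: str, value: str) -> None:
        # Only the cache is written; the update is queued for a later flush.
        self._entries[key] = value
        self._pending[key] = value           # repeated writes to a key are coalesced
        self._pending.move_to_end(key)

    def get(self, key: str) -> Optional[str]:
        return self._entries.get(key)

    def flush(self) -> int:
        """Write all queued updates to the data source; return how many were written."""
        count = len(self._pending)
        while self._pending:
            key, value = self._pending.popitem(last=False)  # oldest update first
            self._store[key] = value
        return count


slow_store: Dict[str, str] = {}
wb_cache = WriteBehindCache(slow_store)
wb_cache.put("counter", "1")
wb_cache.put("counter", "2")
before_flush = dict(slow_store)   # still empty: a crash here would lose both updates
flushed = wb_cache.flush()

if __name__ == "__main__":
    print(before_flush, flushed, slow_store)
```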