SOFAJRaft Source Code Reading (5): A First Look at RheaKV

Posted by yuan on 2023-02-03

SOFAJRaft-RheaKV is an embedded, distributed, highly available, strongly consistent KV storage library built on SOFAJRaft and RocksDB. A SOFAJRaft-RheaKV cluster consists of three core components: PD, Store, and Region.
@Author:Akai-yuan
@Updated: 2023/2/3

1. Architecture

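The SOFAJRaft-RheaKV storage library consists of three core components: PD, Store, and Region. It supports use cases such as lightweight state/metadata storage, cluster synchronization, and distributed locking.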

(1) Scheduling

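PD is the global central control node. It is responsible for scheduling across the whole cluster, generating Region IDs, and maintaining the RegionRouteTable. One PDServer can manage multiple clusters, which are isolated from each other by clusterId. The PD Server has to be deployed separately, but many scenarios do not need self-management, so RheaKV also supports running without PD: set the fake option of PlacementDriverOptions to true. PD normally schedules Regions through the responses it returns to Region heartbeats. Once a Region has carried out the instruction, PD receives the Region's changed information in the next heartbeat and updates its routing and state tables.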

(2) Storage

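A Store is a physical storage node in the cluster and contains one or more Regions. Usually one Node is responsible for one Store. A Store can be seen as a container for Regions, holding multiple data shards. The Store proactively sends StoreHeartbeatRequest heartbeats to PD, which are handled by PD's handleStoreHeartbeat. Each heartbeat carries basic information about the Store, such as how many Regions it holds and which Regions have their Leader on this Store.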

(3) Data

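A Region is the smallest unit of KV data and can be thought of as a data partition or shard. Each Region covers a left-closed, right-open interval [startKey, endKey) and corresponds to an actual data range inside a Store. Regions can be split automatically, and their replicas relocated automatically, based on metrics such as request traffic, load, and data size. Each Region has multiple replicas stored on different Store nodes. Together they form a Raft Group, and data is synchronized to every node in the group through Raft log replication. The Leader of a Region proactively sends RegionHeartbeatRequest heartbeats to PD, which are handled by PD's handleRegionHeartbeat. PD uses the Region's Epoch to detect whether the Region has changed.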
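To make the roles of the three core components (PD, Store, and Region) clearer, the original post shows the official architecture diagram at this point:
[Figure: official RheaKV architecture diagram showing PD, Store, and Region]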

2. Initialization

We start from the RheaKV part of the JRaft-Example module, looking first at the main method of com.alipay.sofa.jraft.example.rheakv.Server1.

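  1. It declares two option objects: PlacementDriverOptions and StoreEngineOptions.
  2. It builds a RheaKVStoreOptions, attaches the PD options and the store engine options to it, and sets the cluster name, whether parallel compression is enabled, and the list of server IP:port addresses ("127.0.0.1:8181,127.0.0.1:8182,127.0.0.1:8183").
  3. It creates a Node and starts it.
  4. It registers a shutdown hook to stop the node gracefully. (The author analyzed how shutdown hooks work in an earlier post: SOFAJRaft Source Code Reading (3): How ShutdownHook Shuts Down Gracefully.)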
public static void main(final String[] args) {
    final PlacementDriverOptions pdOpts = PlacementDriverOptionsConfigured.newConfigured()
        .withFake(true) // use a fake pd
        .config();
    final StoreEngineOptions storeOpts = StoreEngineOptionsConfigured.newConfigured() //
        // StoreEngine supports two storage implementations: MemoryDB and RocksDB
        .withStorageType(StorageType.RocksDB)
        .withRocksDBOptions(RocksDBOptionsConfigured.newConfigured().withDbPath(Configs.DB_PATH).config())
        .withRaftDataPath(Configs.RAFT_DATA_PATH)
        .withServerAddress(new Endpoint("127.0.0.1", 8181))
        .config();
    final RheaKVStoreOptions opts = RheaKVStoreOptionsConfigured.newConfigured() //
        .withClusterName(Configs.CLUSTER_NAME) //
        .withUseParallelCompress(true) //
        .withInitialServerList(Configs.ALL_NODE_ADDRESSES)
        .withStoreEngineOptions(storeOpts) //
        .withPlacementDriverOptions(pdOpts) //
        .config();
    System.out.println(opts);
    final Node node = new Node(opts);
    node.start();
    Runtime.getRuntime().addShutdownHook(new Thread(node::stop));
    System.out.println("server1 start OK");
}

The Node class is a thin wrapper that holds a RheaKVStoreOptions and a RheaKVStore:

public class Node {
    private final RheaKVStoreOptions options;
    private RheaKVStore              rheaKVStore;
    public Node(RheaKVStoreOptions options) {
        this.options = options;
    }
    public void start() {
        this.rheaKVStore = new DefaultRheaKVStore();
        this.rheaKVStore.init(this.options);
    }
    public void stop() {
        this.rheaKVStore.shutdown();
    }
    public RheaKVStore getRheaKVStore() {
        return rheaKVStore;
    }
}

(1) Initializing DefaultRheaKVStore

As Node#start shows, a DefaultRheaKVStore is created and its init method is then called to initialize it.

public synchronized boolean init(final RheaKVStoreOptions opts) {
        // Check whether the store has already been started
        if (this.started) {
            LOG.info("[DefaultRheaKVStore] already started.");
            return true;
        }
        DescriberManager.getInstance().addDescriber(RouteTable.getInstance());
        this.opts = opts;
        // Initialize PD according to PlacementDriverOptions
        final PlacementDriverOptions pdOpts = opts.getPlacementDriverOptions();
        final String clusterName = opts.getClusterName();
        Requires.requireNonNull(pdOpts, "opts.placementDriverOptions");
        Requires.requireNonNull(clusterName, "opts.clusterName");
        if (Strings.isBlank(pdOpts.getInitialServerList())) {
            // If blank, inherit the value from the parent options
            pdOpts.setInitialServerList(opts.getInitialServerList());
        }
        // PD is not enabled: instantiate a FakePlacementDriverClient
        if (pdOpts.isFake()) {
            this.pdClient = new FakePlacementDriverClient(opts.getClusterId(), clusterName);
        // PD is enabled: instantiate a RemotePlacementDriverClient
        } else {
            this.pdClient = new RemotePlacementDriverClient(opts.getClusterId(), clusterName);
        }
        // Initialize the FakePlacementDriverClient/RemotePlacementDriverClient
        if (!this.pdClient.init(pdOpts)) {
            LOG.error("Fail to init [PlacementDriverClient].");
            return false;
        }
        // Initialize the compression strategy
        ZipStrategyManager.init(opts);
        // Initialize the storage engine
        final StoreEngineOptions stOpts = opts.getStoreEngineOptions();
        if (stOpts != null) {
            stOpts.setInitialServerList(opts.getInitialServerList());
            this.storeEngine = new StoreEngine(this.pdClient, this.stateListenerContainer);
            if (!this.storeEngine.init(stOpts)) {
                LOG.error("Fail to init [StoreEngine].");
                return false;
            }
        }
        // Get the IP and port of the current node
        final Endpoint selfEndpoint = this.storeEngine == null ? null : this.storeEngine.getSelfEndpoint();
        final RpcOptions rpcOpts = opts.getRpcOptions();
        Requires.requireNonNull(rpcOpts, "opts.rpcOptions");
        // Create an RpcService whose getLeader is overridden to ask the local RegionEngine first
        this.rheaKVRpcService = new DefaultRheaKVRpcService(this.pdClient, selfEndpoint) {
            @Override
            public Endpoint getLeader(final long regionId, final boolean forceRefresh, final long timeoutMillis) {
                final Endpoint leader = getLeaderByRegionEngine(regionId);
                if (leader != null) {
                    return leader;
                }
                return super.getLeader(regionId, forceRefresh, timeoutMillis);
            }
        };
        if (!this.rheaKVRpcService.init(rpcOpts)) {
            LOG.error("Fail to init [RheaKVRpcService].");
            return false;
        }
        // Number of failover retries; defaults to 2
        this.failoverRetries = opts.getFailoverRetries();
        // Future timeout in milliseconds; defaults to 5000
        this.futureTimeoutMillis = opts.getFutureTimeoutMillis();
        // Whether to read only from the leader; defaults to true
        this.onlyLeaderRead = opts.isOnlyLeaderRead();
        // Initialize kvDispatcher; useParallelKVExecutor defaults to true
        if (opts.isUseParallelKVExecutor()) {
            final int numWorkers = Utils.cpus();
            // Multiply by 16
            final int bufSize = numWorkers << 4;
            final String name = "parallel-kv-executor";
            final ThreadFactory threadFactory = Constants.THREAD_AFFINITY_ENABLED
                    ? new AffinityNamedThreadFactory(name, true) : new NamedThreadFactory(name, true);
            // Create the dispatcher
            this.kvDispatcher = new TaskDispatcher(bufSize, numWorkers, WaitStrategyType.LITE_BLOCKING_WAIT, threadFactory);
        }
        this.batchingOpts = opts.getBatchingOptions();
        // allowBatching defaults to true
        if (this.batchingOpts.isAllowBatching()) {
            this.getBatching = new GetBatching(KeyEvent::new, "get_batching",
                    new GetBatchingHandler("get", false));
            this.getBatchingOnlySafe = new GetBatching(KeyEvent::new, "get_batching_only_safe",
                    new GetBatchingHandler("get_only_safe", true));
            this.putBatching = new PutBatching(KVEvent::new, "put_batching",
                    new PutBatchingHandler("put"));
        }
        LOG.info("[DefaultRheaKVStore] start successfully, options: {}.", opts);
        return this.started = true;
    }
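The overridden getLeader above follows a "local first, remote fallback" pattern: ask the region engine in this process for the leader, and only fall back to the generic lookup when nothing is known locally. The following is a minimal, self-contained sketch of that pattern. The class names, the string endpoints, and the remoteCalls counter are hypothetical stand-ins and are not RheaKV types.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the generic (route-table or RPC based) leader lookup.
class RemoteLeaderLookup {
    int remoteCalls = 0; // counts how often the fallback path is taken

    String getLeader(long regionId) {
        remoteCalls++;
        return "127.0.0.1:8183"; // assumed answer from the fallback path
    }
}

// Checks locally known leaders first, then falls back to the parent lookup,
// mirroring the shape of the anonymous subclass in DefaultRheaKVStore#init.
final class LocalFirstLeaderLookup extends RemoteLeaderLookup {
    private final Map<Long, String> localLeaders = new ConcurrentHashMap<>();

    void recordLocalLeader(long regionId, String endpoint) {
        localLeaders.put(regionId, endpoint);
    }

    @Override
    String getLeader(long regionId) {
        final String local = localLeaders.get(regionId);
        return local != null ? local : super.getLeader(regionId);
    }
}
```

The benefit of this shape is that a node hosting the region can usually answer without any network round trip, while clients that host no regions still get a result from the fallback.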

(2) Initializing StoreEngine

Some of the code here does the same thing as parts of DefaultRheaKVStore#init and is not explained again. The rest is annotated with comments in the code block below.

public synchronized boolean init(final StoreEngineOptions opts) {
        if (this.started) {
            LOG.info("[StoreEngine] already started.");
            return true;
        }
        DescriberManager.getInstance().addDescriber(this);
        this.storeOpts = Requires.requireNonNull(opts, "opts");
        Endpoint serverAddress = Requires.requireNonNull(opts.getServerAddress(), "opts.serverAddress");
        // Get the IP and port
        final int port = serverAddress.getPort();
        final String ip = serverAddress.getIp();
        // If no IP was given (or it equals Utils.IP_ANY), use the local host name as the serverAddress IP
        if (ip == null || Utils.IP_ANY.equals(ip)) {
            serverAddress = new Endpoint(NetUtil.getLocalCanonicalHostName(), port);
            opts.setServerAddress(serverAddress);
        }
        // Get the metrics report period
        final long metricsReportPeriod = opts.getMetricsReportPeriod();
        // Initialize the RegionEngineOptions list
        List<RegionEngineOptions> rOptsList = opts.getRegionEngineOptionsList();
        // If no RegionEngineOptions were configured, create a default one
        if (rOptsList == null || rOptsList.isEmpty()) {
            // -1 region
            final RegionEngineOptions rOpts = new RegionEngineOptions();
            rOpts.setRegionId(Constants.DEFAULT_REGION_ID);
            rOptsList = Lists.newArrayList();
            rOptsList.add(rOpts);
            opts.setRegionEngineOptionsList(rOptsList);
        }
        // Get the cluster name
        final String clusterName = this.pdClient.getClusterName();
        // Fill in the parameters of every RegionEngineOptions in rOptsList
        for (final RegionEngineOptions rOpts : rOptsList) {
            // The RaftGroupId is clusterName + "-" + regionId
            rOpts.setRaftGroupId(JRaftHelper.getJRaftGroupId(clusterName, rOpts.getRegionId()));
            rOpts.setServerAddress(serverAddress);
            if (Strings.isBlank(rOpts.getInitialServerList())) {
                // if blank, extends parent's value
                rOpts.setInitialServerList(opts.getInitialServerList());
            }
            if (rOpts.getNodeOptions() == null) {
                // copy common node options
                rOpts.setNodeOptions(opts.getCommonNodeOptions() == null ? new NodeOptions() : opts
                    .getCommonNodeOptions().copy());
            }
            // If the region has no metrics report period of its own, inherit the store's
            if (rOpts.getMetricsReportPeriod() <= 0 && metricsReportPeriod > 0) {
                // extends store opts
                rOpts.setMetricsReportPeriod(metricsReportPeriod);
            }
        }
        // Initialize the Store and the Regions it contains
        final Store store = this.pdClient.getStoreMetadata(opts);
        if (store == null || store.getRegions() == null || store.getRegions().isEmpty()) {
            LOG.error("Empty store metadata: {}.", store);
            return false;
        }
        this.storeId = store.getId();
        this.partRocksDBOptions = SystemPropertyUtil.getBoolean(PART_ROCKSDB_OPTIONS_KEY, false);
        // Initialize the executors
        if (this.readIndexExecutor == null) {
            this.readIndexExecutor = StoreEngineHelper.createReadIndexExecutor(opts.getReadIndexCoreThreads());
        }
        if (this.raftStateTrigger == null) {
            this.raftStateTrigger = StoreEngineHelper.createRaftStateTrigger(opts.getLeaderStateTriggerCoreThreads());
        }
        if (this.snapshotExecutor == null) {
            this.snapshotExecutor = StoreEngineHelper.createSnapshotExecutor(opts.getSnapshotCoreThreads(),
                opts.getSnapshotMaxThreads());
        }
        // init rpc executors
        final boolean useSharedRpcExecutor = opts.isUseSharedRpcExecutor();
        // Create the RPC executors that run the RpcServer's processors
        if (!useSharedRpcExecutor) {
            if (this.cliRpcExecutor == null) {
                this.cliRpcExecutor = StoreEngineHelper.createCliRpcExecutor(opts.getCliRpcCoreThreads());
            }
            if (this.raftRpcExecutor == null) {
                this.raftRpcExecutor = StoreEngineHelper.createRaftRpcExecutor(opts.getRaftRpcCoreThreads());
            }
            if (this.kvRpcExecutor == null) {
                this.kvRpcExecutor = StoreEngineHelper.createKvRpcExecutor(opts.getKvRpcCoreThreads());
            }
        }
        // Start metrics reporting
        startMetricReporters(metricsReportPeriod);
        // Create the rpcServer that other services call
        this.rpcServer = RaftRpcServerFactory.createRaftRpcServer(serverAddress, this.raftRpcExecutor,
            this.cliRpcExecutor);
        // Register the KV request processors on the server
        StoreEngineHelper.addKvStoreRequestProcessor(this.rpcServer, this);
        if (!this.rpcServer.init(null)) {
            LOG.error("Fail to init [RpcServer].");
            return false;
        }
        // init db store
        // Choose the DB implementation according to the storage type
        if (!initRawKVStore(opts)) {
            return false;
        }
        if (this.rawKVStore instanceof Describer) {
            DescriberManager.getInstance().addDescriber((Describer) this.rawKVStore);
        }
        // init all region engine
        // Initialize a RegionEngine for each region
        if (!initAllRegionEngine(opts, store)) {
            LOG.error("Fail to init all [RegionEngine].");
            return false;
        }
        // heartbeat sender
        // If PD is enabled (a self-managed cluster), initialize the heartbeat sender
        if (this.pdClient instanceof RemotePlacementDriverClient) {
            HeartbeatOptions heartbeatOpts = opts.getHeartbeatOptions();
            if (heartbeatOpts == null) {
                heartbeatOpts = new HeartbeatOptions();
            }
            this.heartbeatSender = new HeartbeatSender(this);
            if (!this.heartbeatSender.init(heartbeatOpts)) {
                LOG.error("Fail to init [HeartbeatSender].");
                return false;
            }
        }
        this.startTime = System.currentTimeMillis();
        LOG.info("[StoreEngine] start successfully: {}.", this);
        return this.started = true;
    }

(3) Initializing RegionEngineOptions

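The StoreEngine initialization above includes a step that prepares the options for each regionEngine. Let's pull that part of the code out and analyze it on its own.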

  1. First check whether opts.getRegionEngineOptionsList() is empty. If it is, create a single default RegionEngineOptions.
  2. Iterate over every RegionEngineOptions in the list and set its parameters (RaftGroupId, ServerAddress, InitialServerList, NodeOptions, MetricsReportPeriod).
        // init region options
        List<RegionEngineOptions> rOptsList = opts.getRegionEngineOptionsList();
        if (rOptsList == null || rOptsList.isEmpty()) {
            // -1 region
            final RegionEngineOptions rOpts = new RegionEngineOptions();
            rOpts.setRegionId(Constants.DEFAULT_REGION_ID);
            rOptsList = Lists.newArrayList();
            rOptsList.add(rOpts);
            opts.setRegionEngineOptionsList(rOptsList);
        }
        final String clusterName = this.pdClient.getClusterName();
        for (final RegionEngineOptions rOpts : rOptsList) {
            rOpts.setRaftGroupId(JRaftHelper.getJRaftGroupId(clusterName, rOpts.getRegionId()));
            rOpts.setServerAddress(serverAddress);
            if (Strings.isBlank(rOpts.getInitialServerList())) {
                // if blank, extends parent's value
                rOpts.setInitialServerList(opts.getInitialServerList());
            }
            if (rOpts.getNodeOptions() == null) {
                // copy common node options
                rOpts.setNodeOptions(opts.getCommonNodeOptions() == null ? new NodeOptions() : opts
                    .getCommonNodeOptions().copy());
            }
            if (rOpts.getMetricsReportPeriod() <= 0 && metricsReportPeriod > 0) {
                // extends store opts
                rOpts.setMetricsReportPeriod(metricsReportPeriod);
            }
        }
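The defaulting rules in this loop can be restated as a small standalone sketch: the group id is derived from the cluster name and region id, and blank or unset per-region values inherit the store-level value. RegionOptsSketch and RegionOptsDefaults are hypothetical simplified types, not RheaKV classes, and only three of the fields are modeled.

```java
// Hypothetical, trimmed-down stand-in for RegionEngineOptions.
final class RegionOptsSketch {
    long   regionId;
    String initialServerList;   // blank means "inherit from the store options"
    String raftGroupId;
    long   metricsReportPeriod; // <= 0 means "not configured"
}

final class RegionOptsDefaults {
    // Applies the same inheritance rules as the loop above, on the simplified type.
    static void applyDefaults(RegionOptsSketch r, String clusterName,
                              String parentServerList, long parentMetricsPeriod) {
        // RaftGroupId is clusterName + "-" + regionId
        r.raftGroupId = clusterName + "-" + r.regionId;
        // If blank, inherit the parent's server list
        if (r.initialServerList == null || r.initialServerList.trim().isEmpty()) {
            r.initialServerList = parentServerList;
        }
        // Inherit the store's metrics period only when the region has none of its own
        if (r.metricsReportPeriod <= 0 && parentMetricsPeriod > 0) {
            r.metricsReportPeriod = parentMetricsPeriod;
        }
    }
}
```

A value set explicitly on a region always wins, so individual regions can still override the store-wide server list or metrics period.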

(4) Initializing Store

The Store is initialized by calling pdClient's getStoreMetadata method:

final Store store = this.pdClient.getStoreMetadata(opts);

When FakePlacementDriverClient#getStoreMetadata is called:

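  1. Get the RegionEngineOptions list that was initialized earlier.
  2. Create a Region list with the same capacity as the RegionEngineOptions list, because one Store can contain multiple Regions (as the architecture diagram illustrates).
  3. Call getLocalRegionMetadata for each element of the RegionEngineOptions list and add the result to the Region list.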
    public Store getStoreMetadata(final StoreEngineOptions opts) {
        final Store store = new Store();
        final List<RegionEngineOptions> rOptsList = opts.getRegionEngineOptionsList();
        final List<Region> regionList = Lists.newArrayListWithCapacity(rOptsList.size());
        store.setId(-1);
        store.setEndpoint(opts.getServerAddress());
        for (final RegionEngineOptions rOpts : rOptsList) {
            regionList.add(getLocalRegionMetadata(rOpts));
        }
        store.setRegions(regionList);
        return store;
    }

Now look at AbstractPlacementDriverClient#getLocalRegionMetadata:

  1. Ensure the regionId is within the allowed range.
  2. Set the key range (left-closed, right-open).
  3. Convert initialServerList into peer objects.
  4. Add the Region to the regionRouteTable.
    protected Region getLocalRegionMetadata(final RegionEngineOptions opts) {
        final long regionId = Requires.requireNonNull(opts.getRegionId(), "opts.regionId");
        Requires.requireTrue(regionId >= Region.MIN_ID_WITH_MANUAL_CONF, "opts.regionId must >= "
                                                                         + Region.MIN_ID_WITH_MANUAL_CONF);
        Requires.requireTrue(regionId < Region.MAX_ID_WITH_MANUAL_CONF, "opts.regionId must < "
                                                                        + Region.MAX_ID_WITH_MANUAL_CONF);
        final byte[] startKey = opts.getStartKeyBytes();
        final byte[] endKey = opts.getEndKeyBytes();
        final String initialServerList = opts.getInitialServerList();
        final Region region = new Region();
        final Configuration conf = new Configuration();
        // region
        region.setId(regionId);
        region.setStartKey(startKey);
        region.setEndKey(endKey);
        region.setRegionEpoch(new RegionEpoch(-1, -1));
        // peers
        Requires.requireTrue(Strings.isNotBlank(initialServerList), "opts.initialServerList is blank");
        conf.parse(initialServerList);
        region.setPeers(JRaftHelper.toPeerList(conf.listPeers()));
        this.regionRouteTable.addOrUpdateRegion(region);
        return region;
    }
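Step 3 turns the comma-separated initialServerList into peers. A minimal sketch of that idea follows, assuming only the plain "ip:port" form used in this example. ServerListSketch and its Peer class are hypothetical, and the real Configuration#parse accepts richer peer formats than this sketch handles.

```java
import java.util.ArrayList;
import java.util.List;

final class ServerListSketch {
    // Hypothetical minimal peer: just an IP and a port.
    static final class Peer {
        final String ip;
        final int    port;

        Peer(String ip, int port) {
            this.ip = ip;
            this.port = port;
        }
    }

    // Parses "ip:port,ip:port,..." and rejects a blank list, like the requireTrue check above.
    static List<Peer> parse(String serverList) {
        if (serverList == null || serverList.trim().isEmpty()) {
            throw new IllegalArgumentException("initialServerList is blank");
        }
        final List<Peer> peers = new ArrayList<>();
        for (final String item : serverList.split(",")) {
            final String[] parts = item.trim().split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("bad address: " + item);
            }
            peers.add(new Peer(parts[0], Integer.parseInt(parts[1])));
        }
        return peers;
    }
}
```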

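RegionEpoch carries two version numbers:
(1) confVer: the configuration version, incremented whenever a peer is added or removed
(2) version: the Region version, incremented whenever the Region is split or merged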

    public RegionEpoch(long confVer, long version) {
        this.confVer = confVer;
        this.version = version;
    }
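To illustrate how two version numbers let an observer notice that a Region has changed, here is a small sketch. EpochSketch is a hypothetical class, and the staleness rule in isStalerThan is an assumption made for illustration; it is not taken from the PD source.

```java
// Hypothetical immutable epoch with the same two counters as RegionEpoch.
final class EpochSketch {
    final long confVer; // bumped when a peer is added or removed
    final long version; // bumped when the region is split or merged

    EpochSketch(long confVer, long version) {
        this.confVer = confVer;
        this.version = version;
    }

    EpochSketch afterPeerChange() {
        return new EpochSketch(confVer + 1, version);
    }

    EpochSketch afterSplitOrMerge() {
        return new EpochSketch(confVer, version + 1);
    }

    // Assumed rule: this epoch is stale if the other one is ahead on either counter.
    boolean isStalerThan(EpochSketch other) {
        return this.version < other.version || this.confVer < other.confVer;
    }
}
```

Keeping the two counters separate lets an observer tell a membership change apart from a key-range change.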

About RegionRouteTable#addOrUpdateRegion:

    public void addOrUpdateRegion(final Region region) {
        Requires.requireNonNull(region, "region");
        Requires.requireNonNull(region.getRegionEpoch(), "regionEpoch");
        final long regionId = region.getId();
        final byte[] startKey = BytesUtil.nullToEmpty(region.getStartKey());
        final StampedLock stampedLock = this.stampedLock;
        final long stamp = stampedLock.writeLock();
        try {
            this.regionTable.put(regionId, region.copy());
            this.rangeTable.put(startKey, regionId);
        } finally {
            stampedLock.unlockWrite(stamp);
        }
    }

Let's look at a few fields of RegionRouteTable:

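  • keyBytesComparator: a LexicographicByteArrayComparator that orders byte arrays lexicographically
  • stampedLock: a StampedLock, which supports optimistic reads and often outperforms a plain read-write lock in read-heavy workloads
  • rangeTable: a TreeMap (which implements NavigableMap) sorted by keyBytesComparator, holding <startKey, regionId> entries
  • regionTable: a HashMap holding <regionId, region> entries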
    private static final Comparator<byte[]>  keyBytesComparator = BytesUtil.getDefaultByteArrayComparator();
    private final StampedLock                stampedLock        = new StampedLock();
    private final NavigableMap<byte[], Long> rangeTable         = new TreeMap<>(keyBytesComparator);
    private final Map<Long, Region>          regionTable        = Maps.newHashMap();
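Keeping rangeTable sorted by startKey means the Region that owns a key can be found with a single floor lookup: take the greatest startKey that is less than or equal to the key, then check the key against that Region's endKey. The sketch below shows the idea with hypothetical simplified types. It stores only an endKey per region instead of a full Region object, uses null for "no upper bound", and relies on Arrays.compareUnsigned (Java 9+) as the lexicographic comparator.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Simplified, hypothetical model of the two routing maps; not the RheaKV class itself.
final class RouteTableSketch {
    // Lexicographic order over unsigned bytes
    private static final Comparator<byte[]> KEY_ORDER = Arrays::compareUnsigned;

    // startKey -> regionId
    private final NavigableMap<byte[], Long> rangeTable = new TreeMap<>(KEY_ORDER);
    // regionId -> endKey (null = unbounded); stands in for the full Region object
    private final Map<Long, byte[]> endKeys = new HashMap<>();

    void addRegion(long regionId, String startKey, String endKey) {
        rangeTable.put(bytes(startKey), regionId);
        endKeys.put(regionId, endKey == null ? null : bytes(endKey));
    }

    // Returns the id of the region whose [startKey, endKey) contains key, or -1 if none.
    long findRegion(String key) {
        final byte[] k = bytes(key);
        final Map.Entry<byte[], Long> floor = rangeTable.floorEntry(k);
        if (floor == null) {
            return -1L;
        }
        final byte[] end = endKeys.get(floor.getValue());
        if (end != null && KEY_ORDER.compare(k, end) >= 0) {
            return -1L; // key falls in a gap after this region's right-open end
        }
        return floor.getValue();
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

Because the interval is right-open, a key equal to a Region's endKey belongs to the next Region, which is exactly what the floor lookup returns when the next Region starts at that key.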

(5) Initializing RegionEngine

StoreEngine#initAllRegionEngine

    private boolean initAllRegionEngine(final StoreEngineOptions opts, final Store store) {
        Requires.requireNonNull(opts, "opts");
        Requires.requireNonNull(store, "store");
        // Get the base data directory
        String baseRaftDataPath = opts.getRaftDataPath();
        if (Strings.isNotBlank(baseRaftDataPath)) {
            try {
                FileUtils.forceMkdir(new File(baseRaftDataPath));
            } catch (final Throwable t) {
                LOG.error("Fail to make dir for raftDataPath: {}.", baseRaftDataPath);
                return false;
            }
        } else {
            baseRaftDataPath = "";
        }
        final Endpoint serverAddress = opts.getServerAddress();
        // Get the RegionEngineOptions list and the regions
        final List<RegionEngineOptions> rOptsList = opts.getRegionEngineOptionsList();
        final List<Region> regionList = store.getRegions();
        Requires.requireTrue(rOptsList.size() == regionList.size());
        for (int i = 0; i < rOptsList.size(); i++) {
            final RegionEngineOptions rOpts = rOptsList.get(i);
            if (!inConfiguration(rOpts.getServerAddress().toString(), rOpts.getInitialServerList())) {
                continue;
            }
            final Region region = regionList.get(i);
            // If the region has no raft data path, derive one
            if (Strings.isBlank(rOpts.getRaftDataPath())) {
                final String childPath = "raft_data_region_" + region.getId() + "_" + serverAddress.getPort();
                rOpts.setRaftDataPath(Paths.get(baseRaftDataPath, childPath).toString());
            }
            Requires.requireNonNull(region.getRegionEpoch(), "regionEpoch");
            // Initialize a RegionEngine for this Region
            final RegionEngine engine = new RegionEngine(region, this);
            if (engine.init(rOpts)) {
                // Each RegionKVService corresponds to one Region and only handles requests within that Region's range
                final RegionKVService regionKVService = new DefaultRegionKVService(engine);
                registerRegionKVService(regionKVService);
                // Put it into the ConcurrentMap<Long, RegionEngine> regionEngineTable
                this.regionEngineTable.put(region.getId(), engine);
            } else {
                LOG.error("Fail to init [RegionEngine: {}].", region);
                return false;
            }
        }
        return true;
    }
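Two details of this loop are easy to miss: a region is skipped when the current server is not in that region's server list, and each region gets its own data directory named after the region id and the port. The sketch below restates both with hypothetical helper methods. The membership check is a plain string comparison, which is a simplification of the peer-based check that inConfiguration presumably performs.

```java
import java.nio.file.Paths;

final class RegionPathSketch {
    // Simplified membership test: is "ip:port" one of the comma-separated entries?
    static boolean inConfiguration(String self, String serverList) {
        for (final String item : serverList.split(",")) {
            if (item.trim().equals(self)) {
                return true;
            }
        }
        return false;
    }

    // Builds <base>/raft_data_region_<regionId>_<port>, as in initAllRegionEngine.
    static String regionDataPath(String baseRaftDataPath, long regionId, int port) {
        final String childPath = "raft_data_region_" + regionId + "_" + port;
        return Paths.get(baseRaftDataPath, childPath).toString();
    }
}
```

Including the port in the directory name keeps the data of several nodes apart when they run on the same machine, as the three example servers do.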

RegionEngine#init

    public synchronized boolean init(final RegionEngineOptions opts) {
        if (this.started) {
            LOG.info("[RegionEngine: {}] already started.", this.region);
            return true;
        }
        this.regionOpts = Requires.requireNonNull(opts, "opts");
        // Instantiate the state machine
        this.fsm = new KVStoreStateMachine(this.region, this.storeEngine);

        // node options
        NodeOptions nodeOpts = opts.getNodeOptions();
        if (nodeOpts == null) {
            nodeOpts = new NodeOptions();
        }
        // If the metrics report period is greater than zero, enable metrics
        final long metricsReportPeriod = opts.getMetricsReportPeriod();
        if (metricsReportPeriod > 0) {
            // metricsReportPeriod > 0 means enable metrics
            nodeOpts.setEnableMetrics(true);
        }
        final Configuration initialConf = new Configuration();
        if (!initialConf.parse(opts.getInitialServerList())) {
            LOG.error("Fail to parse initial configuration {}.", opts.getInitialServerList());
            return false;
        }
        // Set the initial cluster configuration
        nodeOpts.setInitialConf(initialConf);
        nodeOpts.setFsm(this.fsm);
        // Set up the paths for the log, raft meta, and snapshot
        final String raftDataPath = opts.getRaftDataPath();
        try {
            FileUtils.forceMkdir(new File(raftDataPath));
        } catch (final Throwable t) {
            LOG.error("Fail to make dir for raftDataPath {}.", raftDataPath);
            return false;
        }
        if (Strings.isBlank(nodeOpts.getLogUri())) {
            final Path logUri = Paths.get(raftDataPath, "log");
            nodeOpts.setLogUri(logUri.toString());
        }
        if (Strings.isBlank(nodeOpts.getRaftMetaUri())) {
            final Path meteUri = Paths.get(raftDataPath, "meta");
            nodeOpts.setRaftMetaUri(meteUri.toString());
        }
        if (Strings.isBlank(nodeOpts.getSnapshotUri())) {
            final Path snapshotUri = Paths.get(raftDataPath, "snapshot");
            nodeOpts.setSnapshotUri(snapshotUri.toString());
        }
        LOG.info("[RegionEngine: {}], log uri: {}, raft meta uri: {}, snapshot uri: {}.", this.region,
            nodeOpts.getLogUri(), nodeOpts.getRaftMetaUri(), nodeOpts.getSnapshotUri());
        final Endpoint serverAddress = opts.getServerAddress();
        final PeerId serverId = new PeerId(serverAddress, 0);
        final RpcServer rpcServer = this.storeEngine.getRpcServer();
        this.raftGroupService = new RaftGroupService(opts.getRaftGroupId(), serverId, nodeOpts, rpcServer, true);
        // Start the raft node
        this.node = this.raftGroupService.start(false);
        // Record the group's initial configuration in the global RouteTable
        RouteTable.getInstance().updateConfiguration(this.raftGroupService.getGroupId(), nodeOpts.getInitialConf());
        if (this.node != null) {
            final RawKVStore rawKVStore = this.storeEngine.getRawKVStore();
            final Executor readIndexExecutor = this.storeEngine.getReadIndexExecutor();
            // RaftRawKVStore is RheaKV's RawKVStore implementation built on the Raft replicated state machine KVStoreStateMachine
            // It is RheaKV's entry point into Raft; the Raft flow starts here
            this.raftRawKVStore = new RaftRawKVStore(this.node, rawKVStore, readIndexExecutor);
            // Intercepts requests to record metrics
            this.metricsRawKVStore = new MetricsRawKVStore(this.region.getId(), this.raftRawKVStore);
            // metrics config
            if (this.regionMetricsReporter == null && metricsReportPeriod > 0) {
                final MetricRegistry metricRegistry = this.node.getNodeMetrics().getMetricRegistry();
                if (metricRegistry != null) {
                    final ScheduledExecutorService scheduler = this.storeEngine.getMetricsScheduler();
                    // start raft node metrics reporter
                    this.regionMetricsReporter = Slf4jReporter.forRegistry(metricRegistry) //
                        .prefixedWith("region_" + this.region.getId()) //
                        .withLoggingLevel(Slf4jReporter.LoggingLevel.INFO) //
                        .outputTo(LOG) //
                        .scheduleOn(scheduler) //
                        .shutdownExecutorOnStop(scheduler != null) //
                        .build();
                    this.regionMetricsReporter.start(metricsReportPeriod, TimeUnit.SECONDS);
                }
            }
            this.started = true;
            LOG.info("[RegionEngine] start successfully: {}.", this);
        }
        return this.started;
    }
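The stores created at the end of init are layered: MetricsRawKVStore wraps RaftRawKVStore, which in turn wraps the underlying RawKVStore. The sketch below shows the same decorator layering with hypothetical types and only two layers. The Raft replication layer is deliberately omitted, and the operation counter merely stands in for real metrics.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical minimal KV interface standing in for RawKVStore.
interface KVSketch {
    void put(String key, String value);

    String get(String key);
}

// Innermost layer: plain in-memory storage.
final class MemoryKV implements KVSketch {
    private final Map<String, String> data = new HashMap<>();

    @Override
    public void put(String key, String value) {
        data.put(key, value);
    }

    @Override
    public String get(String key) {
        return data.get(key);
    }
}

// Outer layer: counts every request, then delegates to the wrapped store.
final class MeteredKV implements KVSketch {
    private final KVSketch   delegate;
    private final AtomicLong ops = new AtomicLong();

    MeteredKV(KVSketch delegate) {
        this.delegate = delegate;
    }

    @Override
    public void put(String key, String value) {
        ops.incrementAndGet();
        delegate.put(key, value);
    }

    @Override
    public String get(String key) {
        ops.incrementAndGet();
        return delegate.get(key);
    }

    long opCount() {
        return ops.get();
    }
}
```

Because every layer implements the same interface, callers work against the outermost store and need not know how many layers sit beneath it.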
