JanusGraph Illustrated Series: Analysis of the Distributed ID Generation Strategy

Published by 洋仔聊程式設計 on 2020-09-01

Original URL: https://www.cnblogs.com/lyycoder/p/13594867.html


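JanusGraph - Distributed ID Generation Strategy

Hi everyone, I'm 洋仔. The illustrated JanusGraph series is updated continuously.

Last updated: 2020-09-01
This article is compiled from the author's reading of the source code and the official documentation. If you find any problems, please contact me or point them out in the comments. Many thanks!

Online resources on graph databases are scarce. Leave a comment or send me a private message and I will invite you to the graph database WeChat discussion group so we can learn together!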

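Illustrated Graph Database JanusGraph Series: The Data Import Flow in One Article (to be published)

Illustrated Graph Database JanusGraph Series: A Brief Analysis of the Query/Read Flow (to be published)

Illustrated Graph Database JanusGraph Series: The Lock Mechanism in One Article (Local Locks + Distributed Locks) (to be published)

Illustrated Graph Database JanusGraph Series: The Distributed ID Generation Strategy in One Article

Illustrated Graph Database JanusGraph Series: The Graph Storage Partitioning Strategy in One Article (to be published)

Storage structure:

Illustrated Graph Database JanusGraph Series: The Underlying Storage Structure of Graph Data in One Article

Other:

Graph Databases Demystified: Do You Know What a Graph Database Is?

Illustrated Graph Database JanusGraph Series: The Official Test Graph, an Analysis of the Graph of the Gods (to be published)

For the source code analysis, see GitHub (stars welcome): https://github.com/YYDreamer/janusgraph

High-resolution version of the flow diagrams below: https://www.processon.com/view/link/5f471b2e7d9c086b9903b629

Version: JanusGraph-0.5.2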

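When reposting this article, please keep the following notice:

Author: 洋仔聊程式設計
WeChat official account: 匠心Java

Main Text

Before introducing JanusGraph's distributed ID generation strategy, let's briefly look at the properties a distributed ID should have.

Globally unique: the ID must be unique across the whole distributed environment; this is the basic requirement.
High performance: ID generation must respond quickly with low latency; otherwise it can become a bottleneck for the business.
High availability: the service that generates distributed IDs must be highly available and must not go down casually, since that would affect the business.
Trend-increasing: this depends on the business scenario. For something like the unique vertex ID in graph storage, IDs should preferably be trend-increasing. For something like e-commerce orders they should preferably not be, because trend-increasing IDs let outsiders estimate the day's order count and transaction volume, leaking company information.
Easy integration: if it is middleware, follow the ready-to-use design principle and keep the system design and implementation as simple as possible.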

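I: Common Distributed ID Generation Strategies

The distributed ID generation strategies in common use today fall into four main categories:

UUID
Database + segment mode (optimization: database + segment + double buffer)
Redis-based
Snowflake algorithm (SnowFlake)

There are a few others, such as database auto-increment IDs and database multi-master mode. These work under low concurrency but do not hold up well under high concurrency.

There are also open-source components for generating distributed IDs, including Didi's TinyID (database + segment), Baidu's UidGenerator (SnowFlake-based), and Meituan's Leaf (supports both segment mode and SnowFlake).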
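To make the "database + segment" idea concrete, here is a minimal sketch. It is not taken from any of the components named above: the class name is hypothetical, and an in-memory `AtomicLong` stands in for the database row that records the largest allocated ID. Each generator claims a whole segment in one round trip and then hands out IDs from it locally.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch of the segment mode; the AtomicLong is an assumed stand-in for a database row. */
final class SegmentIdGeneratorSketch {
    private final AtomicLong storedMax;  // hypothetical "database": largest ID allocated so far
    private final long segmentSize;      // how many IDs are claimed per round trip
    private long next;                   // next ID to hand out from the local segment
    private long end;                    // exclusive end of the local segment

    SegmentIdGeneratorSketch(AtomicLong storedMax, long segmentSize) {
        this.storedMax = storedMax;
        this.segmentSize = segmentSize;
    }

    synchronized long nextId() {
        if (next == end) {
            // One "database" round trip claims the whole segment [end - segmentSize, end).
            end = storedMax.addAndGet(segmentSize);
            next = end - segmentSize;
        }
        return next++;
    }
}
```

Two generators that share the same `storedMax` never overlap, because each segment is claimed atomically; the cost is that IDs are only trend-increasing per generator, not strictly ordered across generators.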

So which approach does JanusGraph use to generate distributed IDs?

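II: JanusGraph's Distributed ID Strategy

JanusGraph generates distributed IDs using the database + segment mode with the double-buffer optimization. Let's analyze it in detail:

The database used for ID generation is simply the third-party storage backend that JanusGraph is configured with. Here we take HBase as the example backend.

Where the metadata required for JanusGraph distributed ID generation is stored:

HBase has the concept of a column family. When JanusGraph initializes the HBase table, it creates 9 column families by default to store different kinds of data; for details see "Illustrated Graph Database JanusGraph Series: The Underlying Storage Structure of Graph Data in One Article".

One of these column families, janusgraph_ids (short name i), mainly stores the metadata required for JanusGraph's distributed ID generation.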

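Structure of a JanusGraph distributed ID:

     // A comment in the source code shows the layout
     /*		--- JanusGraphElement id bit format ---
      *  [ 0 | count | partition | ID padding (if any) ]
     */

It consists of 4 parts: 0, count, partition, and ID padding (a fixed value for each type).

The order of these 4 parts actually changes when the ID is serialized to binary data; the layout here only identifies the components of the ID.

Of these parts, partition + count guarantee the uniqueness of the ID across the distributed deployment:

partition id: the partition ID value. JanusGraph uses 32 logical partitions by default, and a vertex is assigned to a partition at random.
count: each partition has its own count range, from 0 to 2^55. JanusGraph fetches one part of this range at a time to serve as count values for vertices, and it guarantees that the same count value is never handed out twice for the same partition.

As long as count is unique within each partition, the final generated ID is globally unique.

So the uniqueness guarantee for distributed IDs rests on the uniqueness of count within a partition, and the analysis below focuses on how count is obtained.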
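The bit layout above can be illustrated with a small sketch. The class and method names are hypothetical, and the widths (5 partition bits for 32 partitions, 3 padding bits) are assumptions chosen to match the numbers quoted in this article; JanusGraph's real padding depends on the element type.

```java
/** Sketch of packing [ count | partition | padding ] into one long; widths are assumptions. */
final class IdLayoutSketch {
    static final int PARTITION_BITS = 5; // 2^5 = 32 logical partitions
    static final int PADDING_BITS = 3;   // assumed width of the per-type ID padding

    /** Combine the three parts, lowest bits last: count, then partition, then padding. */
    static long compose(long count, long partition, long padding) {
        long id = (count << PARTITION_BITS) + partition;
        return (id << PADDING_BITS) | padding;
    }

    /** Extract the partition by dropping the padding and masking the partition bits. */
    static long partitionOf(long id) {
        return (id >>> PADDING_BITS) & ((1L << PARTITION_BITS) - 1);
    }

    /** Extract the count by dropping both the padding and the partition bits. */
    static long countOf(long id) {
        return id >>> (PADDING_BITS + PARTITION_BITS);
    }
}
```

Because the partition occupies its own bit range, two IDs can only collide if they share both partition and count, which is why uniqueness of count within a partition is sufficient.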

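The main logic flow of JanusGraph distributed ID generation is shown in the figure below (reading it alongside the source code is recommended):

One concept used in the analysis is the id block: the range of the currently acquired segment.

[Figure: main logic flow of JanusGraph distributed ID generation]

JanusGraph mainly uses the PartitionIDPool class to hold StandardIDPool instances of different types. A StandardIDPool mainly contains two id blocks:

current block: the block currently used to generate IDs
next block: the other block of the double buffer, already prepared

Why two blocks?

With only one block, once the current block is used up we have to block and wait while the next block is fetched, which stalls distributed ID generation for a relatively long time.

How can this be optimized? With a double buffer.

In addition to the block currently in use, we keep a next block. Once the block in use has been consumed to a certain point (say 50%), an asynchronous fetch of the next block is triggered, as shown by the blue part of the figure above.

This way, when the current block is used up, we can switch to the next block directly with no delay, as shown by the green part of the figure above.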
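The double-buffer idea can be sketched in a few lines. This is a simplified illustration, not JanusGraph's `StandardIDPool`: the class name, the `LongSupplier` block source (which returns the start of a freshly claimed block) and the use of `CompletableFuture` are all assumptions made for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.LongSupplier;

/** Sketch of a double-buffered ID pool: prefetch the next block before the current one runs out. */
final class DoubleBufferedPoolSketch {
    private final LongSupplier blockSource; // hypothetical: claims a block and returns its start
    private final long blockSize;
    private final long renewIndex;          // index in the current block that triggers the prefetch
    private long currentStart;
    private long index;
    private boolean hasCurrent;
    private CompletableFuture<Long> next;   // the "next block", possibly still being fetched

    DoubleBufferedPoolSketch(LongSupplier blockSource, long blockSize, double renewAtFraction) {
        this.blockSource = blockSource;
        this.blockSize = blockSize;
        this.renewIndex = (long) (blockSize * renewAtFraction);
    }

    private void prefetch() {
        next = CompletableFuture.supplyAsync(() -> blockSource.getAsLong());
    }

    synchronized long nextId() {
        if (!hasCurrent || index == blockSize) {
            if (next == null) prefetch();  // first use, or the prefetch was never triggered
            currentStart = next.join();    // waits only if the prefetch has not finished yet
            next = null;
            index = 0;
            hasCurrent = true;
        }
        if (index == renewIndex && next == null) {
            prefetch();                    // fetch the next block in the background
        }
        return currentStart + index++;
    }
}
```

With a block size of 4 and a renew fraction of 0.5, the prefetch starts when the third ID of a block is handed out, so the switch to the next block normally finds it already available.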

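Some exceptions during execution may cause vertex ID acquisition to fail, in which case it is retried; the maximum number of attempts is 1000:

private static final int MAX_PARTITION_RENEW_ATTEMPTS = 1000;
for (int attempt = 0; attempt < MAX_PARTITION_RENEW_ATTEMPTS; attempt++) {
   // the ID acquisition process
}

PS: the IDPool and blocks described above are shared at the level of the current graph instance.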

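III: Source Code Analysis

The JanusGraph source code mainly consists of two major parts plus some other components:

Graph-related classes: operate on vertices, properties, and edges
Transaction-related classes: handle transactions when performing CRUD on data or the schema
Others: the distributed ID generation classes, serialization classes, third-party index classes, and so on

The class diagram of the Graph- and Transaction-related classes is shown below:

[Figure: class diagram of the Graph- and Transaction-related classes]

The class diagram of the classes involved in distributed ID generation is shown below:

[Figure: class diagram of the ID generation classes]

Initial data: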

    @Test
    public void addVertexTest(){
        List<Object> godProperties = new ArrayList<>();
        godProperties.add(T.label);
        godProperties.add("god");

        godProperties.add("name");
        godProperties.add("lyy");

        godProperties.add("age");
        godProperties.add(18);

        JanusGraphVertex godVertex = graph.addVertex(godProperties.toArray());

        assertNotNull(godVertex);
    }

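We add a vertex whose name is lyy to the Graph of the Gods and follow the execution flow. Note that this analysis focuses on the code that generates the vertex's distributed ID.

1. Call the addVertex method of the JanusGraphBlueprintsGraph class

    @Override
    public JanusGraphVertex addVertex(Object... keyValues) {
        // add the vertex
        return getAutoStartTx().addVertex(keyValues);
    }

2. Call the addVertex method of JanusGraphBlueprintsTransaction

   public JanusGraphVertex addVertex(Object... keyValues) {
        // ... other processing omitted
        // the vertex object is created here, including the logic that generates its unique ID
        final JanusGraphVertex vertex = addVertex(id, label); 
        // ... other processing omitted
        return vertex;
    }

3. Call the addVertex method of StandardJanusGraphTx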

    @Override
    public JanusGraphVertex addVertex(Long vertexId, VertexLabel label) {
        // ... other processing omitted
        if (vertexId != null) {
            vertex.setId(vertexId);
        } else if (config.hasAssignIDsImmediately() || label.isPartitioned()) {
            graph.assignID(vertex,label);  // assign the official vertex ID to the vertex
        }
         // ... other processing omitted
        return vertex;
    }

4. Call the assignID(InternalElement element, IDManager.VertexIDType vertexIDType) method of VertexIDAssigner

    private void assignID(InternalElement element, IDManager.VertexIDType vertexIDType) {
        // start acquiring a distributed unique ID for the element
        // if acquisition fails because of an exception, retry; at most 1000 attempts are made
        for (int attempt = 0; attempt < MAX_PARTITION_RENEW_ATTEMPTS; attempt++) {
            // initialize the partition ID
            long partitionID = -1;
            // obtain a partition ID
            // different element types obtain it in different ways
            if (element instanceof JanusGraphSchemaVertex) {
                // assign the partition ID
            }
            try {
                // actually assign the ID, based on the partition ID and the vertex type
                assignID(element, partitionID, vertexIDType);
            } catch (IDPoolExhaustedException e) {
                continue; //try again on a different partition
            }
            assert element.hasId();
            // ... other code omitted
        }
    }
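The retry loop above can be reduced to the following sketch. All names here are hypothetical stand-ins (`Allocator` for the inner assignID call, `ExhaustedException` for IDPoolExhaustedException), and picking the partition with `Random` is an assumption standing in for the placement strategy.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

/** Sketch of "retry on a different partition when a pool is exhausted". */
final class PartitionRetrySketch {
    static final int MAX_ATTEMPTS = 1000;

    /** Hypothetical stand-in for IDPoolExhaustedException. */
    static final class ExhaustedException extends RuntimeException {}

    /** Hypothetical stand-in for the per-partition ID allocation. */
    interface Allocator {
        long allocate(int partition);
    }

    static long assign(Allocator allocator, int numPartitions, Random random) {
        Set<Integer> exhausted = new HashSet<>(); // partitions known to be used up
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            int partition = random.nextInt(numPartitions);
            if (exhausted.contains(partition)) continue; // skip partitions that already failed
            try {
                return allocator.allocate(partition);
            } catch (ExhaustedException e) {
                exhausted.add(partition);                // remember it and try another partition
            }
        }
        throw new IllegalStateException("Could not assign an id after " + MAX_ATTEMPTS + " attempts");
    }
}
```

Remembering exhausted partitions mirrors the `placementStrategy.exhaustedPartition(partitionID)` call seen in the next step, so that later attempts do not keep landing on a partition that cannot serve IDs.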

5. Call the assignID(final InternalElement element, final long partitionIDl, final IDManager.VertexIDType userVertexIDType) method of VertexIDAssigner

    private void assignID(final InternalElement element, final long partitionIDl, final IDManager.VertexIDType userVertexIDType) {
      
        final int partitionID = (int) partitionIDl;

        // count is one component of the distributed ID and occupies 55 bits
        // the uniqueness guarantee for distributed IDs rests on the uniqueness of `count` within a `partition`
        long count;
        if (element instanceof JanusGraphSchemaVertex) { // schema vertices
            Preconditions.checkArgument(partitionID==IDManager.SCHEMA_PARTITION);
            count = schemaIdPool.nextID();
        } else if (userVertexIDType==IDManager.VertexIDType.PartitionedVertex) { // configured hot-spot (partitioned) vertices, e.g. `makeVertexLabel('product').partition()`
            count = partitionVertexIdPool.nextID();
        } else { // ordinary vertices and edges
            // first get the ID pool for the current partition
            PartitionIDPool partitionPool = idPools.get(partitionID);
            // if there is no pool for this partition yet, create a default one with size = 0
            if (partitionPool == null) {
                // a PartitionIDPool holds one StandardIDPool per pool type
                // each StandardIDPool holds the corresponding block and count information
                partitionPool = new PartitionIDPool(partitionID, idAuthority, idManager, renewTimeoutMS, renewBufferPercentage);
                // cache it
                idPools.putIfAbsent(partitionID,partitionPool);
                // read it back from the cache
                partitionPool = idPools.get(partitionID);
            }
            // make sure partitionPool is not null
            Preconditions.checkNotNull(partitionPool);
            // check whether this partition's ID pool is exhausted (fully used up)
            if (partitionPool.isExhausted()) {
                // if so, record the partition ID as exhausted so that it is not chosen again later
                placementStrategy.exhaustedPartition(partitionID);
                // throw the exhaustion exception; the outermost loop catches it and retries
                throw new IDPoolExhaustedException("Exhausted id pool for partition: " + partitionID);
            }
            // the IDPool for the current type, since partitionPool holds IDPools for several types
            IDPool idPool;
            if (element instanceof JanusGraphRelation) {
                idPool = partitionPool.getPool(PoolType.RELATION);
            } else {
                Preconditions.checkArgument(userVertexIDType!=null);
                idPool = partitionPool.getPool(PoolType.getPoolTypeFor(userVertexIDType));
            }
            try {
                // Important: obtain the count value from the given IDPool.
                // This call covers block initialization and the double-buffer block handling.
                count = idPool.nextID();
                partitionPool.accessed();
            } catch (IDPoolExhaustedException e) { // if this IDPool is used up, the exception propagates to the outermost loop, which retries
                log.debug("Pool exhausted for partition id {}", partitionID);
                placementStrategy.exhaustedPartition(partitionID);
                partitionPool.exhaustedIdPool();
                throw e;
            }
        }

        // assemble the final distributed ID: [count + partition id + ID padding]
        long elementId;
        if (element instanceof InternalRelation) {
            elementId = idManager.getRelationID(count, partitionID);
        } else if (element instanceof PropertyKey) {
            elementId = IDManager.getSchemaId(IDManager.VertexIDType.UserPropertyKey,count);
        } else if (element instanceof EdgeLabel) {
            elementId = IDManager.getSchemaId(IDManager.VertexIDType.UserEdgeLabel, count);
        } else if (element instanceof VertexLabel) {
            elementId = IDManager.getSchemaId(IDManager.VertexIDType.VertexLabel, count);
        } else if (element instanceof JanusGraphSchemaVertex) {
            elementId = IDManager.getSchemaId(IDManager.VertexIDType.GenericSchemaType,count);
        } else {
            elementId = idManager.getVertexID(count, partitionID, userVertexIDType);
        }

        Preconditions.checkArgument(elementId >= 0);
        // set the distributed unique ID on the element
        element.setId(elementId);
    }

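In the code above we obtain the corresponding IdPool, and there are two cases:

The first time a distributed ID is requested, the partition's IDPool is initialized as a default IDPool with size = 0
The partition's IDPool has been used before

Both cases are handled inside the nextID() method of the StandardIDPool class, reached through count = idPool.nextID().

Before analyzing that code, we need to understand the relationship between PartitionIDPool and StandardIDPool:

Each partition has a corresponding PartitionIDPool, which extends EnumMap<PoolType,IDPool>, i.e. it is an enum-keyed map.

Each PartitionIDPool holds a StandardIDPool for each of the following types:

NORMAL_VERTEX: used to allocate vertex IDs
UNMODIFIABLE_VERTEX: used to allocate schema label IDs
RELATION: used to allocate edge IDs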
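The structure described above (one enum-keyed map of pools per partition, created lazily) can be sketched as follows. The class names are hypothetical, an `AtomicLong` counter stands in for each StandardIDPool, and `computeIfAbsent` is used here in place of the putIfAbsent-then-get sequence shown in the source.

```java
import java.util.EnumMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch of per-partition, per-type ID pools; AtomicLong stands in for StandardIDPool. */
final class PartitionPoolsSketch {
    enum PoolType { NORMAL_VERTEX, UNMODIFIABLE_VERTEX, RELATION }

    /** One pool per pool type, like PartitionIDPool extends EnumMap<PoolType, IDPool>. */
    static final class PartitionPool extends EnumMap<PoolType, AtomicLong> {
        PartitionPool() {
            super(PoolType.class);
            for (PoolType type : PoolType.values()) {
                put(type, new AtomicLong());
            }
        }
    }

    private final ConcurrentMap<Integer, PartitionPool> pools = new ConcurrentHashMap<>();

    /** Lazily create the partition's pools, then take the next count from the requested type. */
    long nextCount(int partition, PoolType type) {
        return pools.computeIfAbsent(partition, p -> new PartitionPool())
                    .get(type)
                    .getAndIncrement();
    }
}
```

Each (partition, type) pair counts independently, which matches the idea that vertices and relations in the same partition draw their counts from separate pools.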

StandardIDPool contains a number of fields with different meanings; the most important ones are:

    private static final int RENEW_ID_COUNT = 100; 
    private final long idUpperBound; // maximum ID value, 2^55 by default
    private final int partition; // the partition this pool belongs to
    private final int idNamespace; // identifies which kind of pool this is (NORMAL_VERTEX, UNMODIFIABLE_VERTEX, or RELATION); the value is the ordinal of the enum constant

    private final Duration renewTimeout;// timeout for fetching a new block
    private final double renewBufferPercentage;// for the double buffer: when usage of the first buffer block reaches the configured percentage, fetching of the other buffer block is triggered

    private IDBlock currentBlock; // the current block
    private long currentIndex; // how far the current block has been consumed
    private long renewBlockIndex; // derived from currentBlock.numIds()*renewBufferPercentage; when the current block has been consumed up to this index, fetching of the next buffer block is triggered

    private volatile IDBlock nextBlock;// the other block of the double buffer

    private final ThreadPoolExecutor exec;// thread pool that fetches the double-buffer block asynchronously

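6. Call the nextID method of the StandardIDPool class

From the analysis above we know that the uniqueness of a distributed ID is guaranteed by the uniqueness of the count value within a partition.

The code above obtains the count value by calling nextID on the IDPool.

The code below is the logic that obtains count: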

    @Override
    public synchronized long nextID() {
        // currentIndex must not exceed the number of IDs in the current block
        assert currentIndex <= currentBlock.numIds();

        // Two cases reach this branch:
        // 1. the partition's IDPool is being used for the first time: currentIndex = 0 and currentBlock.numIds() = 0
        // 2. the IDPool has been used before and the last count of the current block has just been consumed
        if (currentIndex == currentBlock.numIds()) {
            try {
                // make next block the current block,
                // clear next block and compute renewBlockIndex
                nextBlock();
            } catch (InterruptedException e) {
                throw new JanusGraphException("Could not renew id block due to interruption", e);
            }
        }
        
        // while consuming the current block, currentIndex == renewBlockIndex triggers the asynchronous fetch of the double buffer's next block
        if (currentIndex == renewBlockIndex) {
            // fetch next block asynchronously
            startIDBlockGetter();
        }
        
        // produce the final count
        long returnId = currentBlock.getId(currentIndex);
        // current index + 1
        currentIndex++;
        if (returnId >= idUpperBound) throw new IDPoolExhaustedException("Reached id upper bound of " + idUpperBound);
        log.trace("partition({})-namespace({}) Returned id: {}", partition, idNamespace, returnId);
        // return the count, which is unique within the partition
        return returnId;
    }

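The code above performs two checks:

currentIndex == currentBlock.numIds():
- First ID generation: the check is 0 == 0, and a new block is then obtained
- Subsequent ID generation: equality means the current block has been used up and we need to switch to the next block
currentIndex == renewBlockIndex:
- renew index: marks how many IDs must be consumed before fetching of the double buffer's next block starts. There is a default value of 100, mainly to accommodate the first ID generation. When the two values are equal, an asynchronous fetch of the next block is triggered

Next we analyze the logic of nextBlock() and startIDBlockGetter() in turn.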

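7. Call the nextBlock method of the StandardIDPool class

    private synchronized void nextBlock() throws InterruptedException {
        // the first time the partition's IDPool is used, the double buffer's nextBlock is null
        if (null == nextBlock && null == idBlockFuture) {
            // start fetching an id block asynchronously
            startIDBlockGetter();
        }

        // also on first use: because the fetch above is asynchronous, nextBlock may not have arrived yet at this point,
        // so we block and wait for the block
        if (null == nextBlock) {
            waitForIDBlockGetter();
        }

        // point the block in use at next block
        currentBlock = nextBlock;
        // reset the index
        currentIndex = 0;
        // clear nextBlock
        nextBlock = null;

        // renewBlockIndex is used by the double buffer: when usage of the first buffer block reaches the configured percentage, fetching of the other buffer block is triggered
        // value: number of counts in the current block - (number of counts in the current block * renewBufferPercentage, the configured percentage of remaining space), keeping a margin of at least RENEW_ID_COUNT
        // while consuming the current block, currentIndex == renewBlockIndex triggers the asynchronous fetch of the double buffer's next block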
        renewBlockIndex = Math.max(0,currentBlock.numIds()-Math.max(RENEW_ID_COUNT, Math.round(currentBlock.numIds()*renewBufferPercentage)));
    }

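It mainly does three things:

1. Check whether nextBlock is empty; if it is, start an asynchronous fetch of a block and wait for it
2. Once nextBlock is available: assign next to current, clear next, and reset the index to zero
3. Compute renewBlockIndex, the index that triggers the fetch of the following nextBlock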
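The renewBlockIndex formula from the code above is easier to follow with numbers plugged in. The sketch below copies the expression into a standalone helper; the class name is hypothetical, and the block sizes and the 0.3 and 0.5 percentages are illustrative inputs only.

```java
/** Worked example of the renewBlockIndex formula shown in nextBlock(). */
final class RenewIndexSketch {
    static final int RENEW_ID_COUNT = 100;

    /** Index at which the prefetch starts: block size minus the remaining margin, never below 0. */
    static long renewBlockIndex(long numIds, double renewBufferPercentage) {
        return Math.max(0, numIds - Math.max(RENEW_ID_COUNT, Math.round(numIds * renewBufferPercentage)));
    }
}
```

For a block of 10000 IDs and a percentage of 0.3, the margin is max(100, 3000) = 3000, so the prefetch starts at index 7000. For a block of 200 IDs the margin is max(100, 60) = 100, giving index 100. For a block of 50 IDs the margin exceeds the block, so the index is clamped to 0 and the prefetch starts with the first ID.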

8. Call the startIDBlockGetter method of the StandardIDPool class

    private synchronized void startIDBlockGetter() {
        Preconditions.checkArgument(idBlockFuture == null, idBlockFuture);
        if (closed) return; //Don't renew anymore if closed
        //Renew buffer
        log.debug("Starting id block renewal thread upon {}", currentIndex);
        // create the task object with the given ID authority, partition, namespace, and timeout
        idBlockGetter = new IDBlockGetter(idAuthority, partition, idNamespace, renewTimeout);
        // submit the task that fetches the double-buffer block; it runs asynchronously
        idBlockFuture = exec.submit(idBlockGetter);
    }

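This creates a task and submits it to the thread pool exec for asynchronous execution.

Looking at the task class, its call method mainly invokes idAuthority.getIDBlock, which uses HBase to obtain a block that has not been used yet: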

    /**
     * Task class that fetches the double-buffer block
     */
    private static class IDBlockGetter implements Callable<IDBlock> {

        // some code omitted
        @Override
        public IDBlock call() {
            Stopwatch running = Stopwatch.createStarted();
            try {
                // idAuthority calls HBase here to claim and fetch a block
                IDBlock idBlock = idAuthority.getIDBlock(partition, idNamespace, renewTimeout);
                return idBlock;
            } catch (BackendException e) {} // exception handling omitted
        }
    }
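The submit-then-wait pattern used by startIDBlockGetter and IDBlockGetter can be sketched with the standard `ExecutorService` API. This is an illustration under assumptions, not JanusGraph code: the class name is hypothetical, a block is represented as a `long[]{start, size}`, and failures are wrapped in `IllegalStateException` for brevity.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Sketch of fetching a block on a background thread and waiting for it with a timeout. */
final class AsyncBlockFetchSketch {
    private final ExecutorService exec = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "id-block-getter");
        t.setDaemon(true); // do not keep the JVM alive for the fetch thread
        return t;
    });
    private Future<long[]> pending; // at most one fetch in flight

    synchronized void start(Callable<long[]> getter) {
        if (pending != null) throw new IllegalStateException("a fetch is already in flight");
        pending = exec.submit(getter);
    }

    synchronized long[] await(long timeoutMillis) {
        if (pending == null) throw new IllegalStateException("no fetch in flight");
        Future<long[]> future = pending;
        pending = null;
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the block", e);
        } catch (ExecutionException | TimeoutException e) {
            future.cancel(true); // give up on a fetch that failed or took too long
            throw new IllegalStateException("could not obtain the block", e);
        }
    }
}
```

The guard in `start` plays the same role as the `Preconditions.checkArgument(idBlockFuture == null, ...)` check in the source: only one background fetch may be outstanding per pool.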

9. Call the getIDBlock method of the ConsistentKeyIDAuthority class

    @Override
    public synchronized IDBlock getIDBlock(final int partition, final int idNamespace, Duration timeout) throws BackendException {
      
        // start time
        final Timer methodTime = times.getTimer().start();

        // block size configured for this namespace; default 10000, configurable
        final long blockSize = getBlockSize(idNamespace);
        // maximum ID value (idUpperBound) configured for this namespace: 2^55
        final long idUpperBound = getIdUpperBound(idNamespace);
        // uniqueIdBitWidth is the number of bits used by uniqueId; uniqueId exists to support the case where the "distributed ID uniqueness guarantee" switch is turned off; uniqueIdBitWidth defaults to 4
        // value: 64 - 1 (leading 0 bit) - 5 (partition bits) - 3 (ID padding bits) - 4 (uniqueIdBitWidth) = 51, so the upper bound within a block is 2^51
        final int maxAvailableBits = (VariableLong.unsignedBitLength(idUpperBound)-1)-uniqueIdBitWidth;

        // upper bound within a block: 2^51
        final long idBlockUpperBound = (1L <<maxAvailableBits);

        // list of uniquePIDs that have been exhausted; by default randomUniqueIDLimit = 0
        final List<Integer> exhaustedUniquePIDs = new ArrayList<>(randomUniqueIDLimit);

        // default 0.3 s; used when a TemporaryBackendException occurs (storage backend problem): sleep for a while, then retry
        Duration backoffMS = idApplicationWaitMS;

        // keep retrying to acquire the IDBlock until the timeout (default 2 minutes), measured from the start, has elapsed
        while (methodTime.elapsed().compareTo(timeout) < 0) {
            final int uniquePID = getUniquePartitionID(); // get uniquePID; by default ("distributed ID uniqueness control" enabled) it is 0; when that control is disabled it is a random value
            final StaticBuffer partitionKey = getPartitionKey(partition,idNamespace,uniquePID); // build a row key from partition + idNamespace + uniquePID
            try {
                long nextStart = getCurrentID(partitionKey); // read from HBase the largest value already allocated for this partition's ID pool; it becomes the start of the new block being requested
                if (idBlockUpperBound - blockSize <= nextStart) { // guard: the number of unallocated IDs left must exceed blockSize; otherwise handle exhaustion
                    // corresponding handling
                }

                long nextEnd = nextStart + blockSize; // upper end of the block we want to claim
                StaticBuffer target = null;

                // attempt to write our claim on the next id block
                boolean success = false;
                try {
                    Timer writeTimer = times.getTimer().start(); // === start: write our block claim to HBase
                    target = getBlockApplication(nextEnd, writeTimer.getStartTime()); // build the column: -nextEnd + current timestamp + uid (uniquely identifies this graph instance)
                    final StaticBuffer finalTarget = target; // copy for the inner class
                    BackendOperation.execute(txh -> { // write the row key and column built above
                        idStore.mutate(partitionKey, Collections.singletonList(StaticArrayEntry.of(finalTarget)), KeyColumnValueStore.NO_DELETIONS, txh);
                        return true;
                    },this,times);
                    writeTimer.stop(); // === end: write finished

                    final boolean distributed = manager.getFeatures().isDistributed();
                    Duration writeElapsed = writeTimer.elapsed(); // === time the write took
                    if (idApplicationWaitMS.compareTo(writeElapsed) < 0 && distributed) { // if the write took longer than the configured wait time, throw TemporaryBackendException, then back off and retry
                        throw new TemporaryBackendException("Wrote claim for id block [" + nextStart + ", " + nextEnd + ") in " + (writeElapsed) + " => too slow, threshold is: " + idApplicationWaitMS);
                    } else {

                        assert 0 != target.length();
                        final StaticBuffer[] slice = getBlockSlice(nextEnd); // build the column range to query under the row key above: (-nextEnd + 0 : -nextEnd + max)

                        final List<Entry> blocks = BackendOperation.execute( // read the values for the given row key and column range
                            (BackendOperation.Transactional<List<Entry>>) txh -> idStore.getSlice(new KeySliceQuery(partitionKey, slice[0], slice[1]), txh),this,times);
                        if (blocks == null) throw new TemporaryBackendException("Could not read from storage");
                        if (blocks.isEmpty())
                            throw new PermanentBackendException("It seems there is a race-condition in the block application. " +
                                    "If you have multiple JanusGraph instances running on one physical machine, ensure that they have unique machine idAuthorities");

                        if (target.equals(blocks.get(0).getColumnAs(StaticBuffer.STATIC_FACTORY))) { // if this graph instance's column is the first in the returned list, the block has been claimed; otherwise the claim failed
                            // build the IDBlock object
                            ConsistentKeyIDBlock idBlock = new ConsistentKeyIDBlock(nextStart,blockSize,uniqueIdBitWidth,uniquePID);

                            if (log.isDebugEnabled()) {
                                // debug logging of idBlock, partition, idNamespace, uid (call elided)
                            }

                            success = true;
                            return idBlock; // return
                        } else { }
                    }
                } finally {
                    if (!success && null != target) { // after a failed claim, delete what we wrote; on success, keep it in HBase to mark the block as taken
                        //Delete claim to not pollute id space
                        for (int attempt = 0; attempt < ROLLBACK_ATTEMPTS; attempt++) { // rollback: delete the written claim, up to 5 attempts
                        }
                    }
                }
            } catch (UniqueIDExhaustedException e) {
                // No need to increment the backoff wait time or to sleep
                log.warn(e.getMessage());
            } catch (TemporaryBackendException e) {
                backoffMS = Durations.min(backoffMS.multipliedBy(2), idApplicationWaitMS.multipliedBy(32));
                sleepAndConvertInterrupts(backoffMS);
            }
        }

        throw new TemporaryLockingException();
    }
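The backoff handling in the TemporaryBackendException branch doubles the wait on every failure and caps it at 32 times the base wait. The helper below replays that arithmetic in isolation; the class and method names are hypothetical, and the 300 ms base is the default quoted in the code comments above.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

/** Worked example of the capped exponential backoff used when claiming a block fails temporarily. */
final class BackoffSketch {
    /** Returns the sleep durations for the given number of consecutive temporary failures. */
    static List<Duration> backoffSchedule(Duration base, int failures) {
        List<Duration> sleeps = new ArrayList<>();
        Duration backoff = base;
        Duration cap = base.multipliedBy(32);
        for (int i = 0; i < failures; i++) {
            Duration doubled = backoff.multipliedBy(2);
            backoff = doubled.compareTo(cap) < 0 ? doubled : cap; // min(backoff * 2, base * 32)
            sleeps.add(backoff);
        }
        return sleeps;
    }
}
```

With a 300 ms base, consecutive failures sleep for 600, 1200, 2400, 4800 and 9600 ms, and every later failure stays at the 9600 ms cap.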

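The main logic is:

Build the row key: partition + idNamespace + uniquePID
Build the column: -nextEnd + current time + uid
Write the row key + column to HBase
Read all columns under that row key in the range (-nextEnd + 0 : -nextEnd + max)
Check whether the first column in the result is the one just written: if so, the block has been claimed; if not, the claim failed, so delete the column just written and retry

Finally: the uniquely claimed block is obtained asynchronously, unique count values are generated from it, and the final unique ID is assembled.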
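The write-then-read-back claim protocol summarized above can be simulated in memory. Everything in this sketch is an assumption made for illustration: a `ConcurrentSkipListSet` stands in for the sorted columns of one HBase row, `Claim` models the (-nextEnd, timestamp, uid) column, and the `currentEnd` helper reflects the guess that negating nextEnd makes the largest allocated end sort first.

```java
import java.util.Comparator;
import java.util.concurrent.ConcurrentSkipListSet;

/** In-memory sketch of claiming an ID block by writing a column and checking that it sorts first. */
final class BlockClaimSketch {
    /** Models one column: -nextEnd + timestamp + uid. */
    static final class Claim {
        final long negEnd;
        final long timestamp;
        final String uid;

        Claim(long nextEnd, long timestamp, String uid) {
            this.negEnd = -nextEnd;
            this.timestamp = timestamp;
            this.uid = uid;
        }
    }

    static final Comparator<Claim> COLUMN_ORDER = Comparator
            .<Claim>comparingLong(c -> c.negEnd)
            .thenComparingLong(c -> c.timestamp)
            .thenComparing(c -> c.uid);

    private final ConcurrentSkipListSet<Claim> row = new ConcurrentSkipListSet<>(COLUMN_ORDER);

    /** Largest block end claimed so far, or 0 if nothing has been claimed (assumed lookup). */
    long currentEnd() {
        return row.isEmpty() ? 0 : -row.first().negEnd;
    }

    /** Write our claim, read back the columns for this nextEnd, and win only if ours is first. */
    boolean tryClaim(long nextEnd, long timestamp, String uid) {
        Claim mine = new Claim(nextEnd, timestamp, uid);
        if (!row.add(mine)) return false;                  // an identical column already exists
        Claim first = row.ceiling(new Claim(nextEnd, Long.MIN_VALUE, ""));
        boolean won = first == mine;
        if (!won) row.remove(mine);                        // roll back the losing claim
        return won;
    }
}
```

When two instances race for the same block end, both columns land in the same slice, and the earlier timestamp sorts first; the loser deletes its column and retries with a fresh read of the current end.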

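The overall call flow is as follows:

[Figure: overall call flow of distributed ID generation]

IV: ID Generation for Other Types

The analysis above followed the generation of vertex IDs.

JanusGraph also generates distributed IDs for edges, properties, schema labels, and so on.

The core idea and logic are almost the same for all types; only some details differ. Once the vertex ID flow is understood, the others follow.

1. Property ID generation

The generation of property IDs in JanusGraph follows largely the same overall logic as vertex IDs.

Property ID generation differs from vertex ID generation in two ways:

ID composition: a vertex ID consists of count + partition + ID padding, while a property ID has no ID padding part and consists of count + partition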

        long id = (count<<partitionBits)+partition;
        if (type!=null) id = type.addPadding(id); // here, type = null
        return id;

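How the partition ID is obtained: when generating a vertex ID, the partition ID is chosen at random; when generating a property ID, the partition ID of the vertex that owns the property is used, and if no partition ID can be obtained from a vertex, one is chosen at random.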

            if (element instanceof InternalRelation) { // properties + edges
                InternalRelation relation = (InternalRelation)element;
                if (attempt < relation.getLen()) { 
                    InternalVertex incident = relation.getVertex(attempt);
                    Preconditions.checkArgument(incident.hasId());
                    if (!IDManager.VertexIDType.PartitionedVertex.is(incident.longId()) || relation.isProperty()) { // use the partition ID the incident vertex already has
                        partitionID = getPartitionID(incident);
                    } else {
                        continue;
                    }
                } else { // if no incident vertex yields one, pick a partition ID at random
                    partitionID = placementStrategy.getPartition(element);
                }

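2. Edge ID generation

The generation of edge IDs in JanusGraph follows largely the same overall logic as vertex IDs.

Edge ID generation differs from vertex ID generation in two ways:

ID composition: a vertex ID consists of count + partition + ID padding, while an edge ID has no ID padding part and consists of count + partition; the code is the same as for property IDs
How the partition ID is obtained: when generating a vertex ID, the partition ID is chosen at random; when generating an edge ID, the partition ID of the edge's source vertex or target vertex is used, and if no partition ID can be obtained from a vertex, one is chosen at random; the code is the same as for property IDs

3. Schema-related ID generation

The generation of schema-related IDs in JanusGraph follows largely the same overall logic as vertex IDs.

Schema-related IDs come in four kinds: PropertyKey, EdgeLabel, VertexLabel, and JanusGraphSchemaVertex

ID composition: a vertex ID consists of count + partition + ID padding; the IDs generated for these four schema kinds all share the same structure: count + a fixed suffix for the corresponding type

return (count << offset()) | suffix();

How the partition ID is obtained: when generating a vertex ID, the partition ID is chosen at random; when generating a schema ID, the default partition ID = 0 is used.

public static final int SCHEMA_PARTITION = 0;
if (element instanceof JanusGraphSchemaVertex) {
                partitionID = IDManager.SCHEMA_PARTITION; // default partition
}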
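The schema ID expression `(count << offset()) | suffix()` can be tried out with a small sketch. The enum, its constants, and the offset and suffix values below are made up for illustration and are not JanusGraph's real constants; only the shift-and-or structure comes from the code above.

```java
/** Sketch of "count + fixed per-type suffix" IDs; offsets and suffixes are illustrative values. */
final class SchemaIdSketch {
    enum SchemaType {
        TYPE_A(4, 0b0101), // hypothetical type with a 4-bit suffix 0101
        TYPE_B(4, 0b1001); // hypothetical type with a 4-bit suffix 1001

        private final int offset;
        private final long suffix;

        SchemaType(int offset, long suffix) {
            this.offset = offset;
            this.suffix = suffix;
        }

        /** Same shape as the source: shift count past the suffix bits, then add the suffix. */
        long toId(long count) {
            return (count << offset) | suffix;
        }

        /** True if the low bits of the ID carry this type's suffix. */
        boolean matches(long id) {
            return (id & ((1L << offset) - 1)) == suffix;
        }

        /** Recover the count by dropping the suffix bits. */
        long countOf(long id) {
            return id >>> offset;
        }
    }
}
```

Because the suffix is fixed per type, the type of a schema element can be read back from the ID itself, and two types with the same count still produce different IDs.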

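Summary

This article summarized the logic by which JanusGraph generates distributed unique IDs and walked through the source code.

The next article analyzes JanusGraph's lock mechanism, covering both local locks and distributed locks. I'm 洋仔; see you next time.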
