[Zookeeper] Source Code Analysis: Persistence (Part 2), FileSnap

Posted by leesf on 2017-01-14

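I. Preface

The previous post analyzed the source code of FileTxnLog. This post continues with FileSnap, the other part of the persistence layer, which provides the snapshot-related operations.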

II. SnapShot source code analysis

SnapShot is the interface implemented by FileSnap. Its methods are as follows:

public interface SnapShot {
    
    /**
     * deserialize a data tree from the last valid snapshot and 
     * return the last zxid that was deserialized
     * @param dt the datatree to be deserialized into
     * @param sessions the sessions to be deserialized into
     * @return the last zxid that was deserialized from the snapshot
     * @throws IOException
     */
    // deserialize
    long deserialize(DataTree dt, Map<Long, Integer> sessions) 
        throws IOException;
    
    /**
     * persist the datatree and the sessions into a persistence storage
     * @param dt the datatree to be serialized
     * @param sessions 
     * @throws IOException
     */
    // serialize
    void serialize(DataTree dt, Map<Long, Integer> sessions, 
            File name) 
        throws IOException;
    
    /**
     * find the most recent snapshot file
     * @return the most recent snapshot file
     * @throws IOException
     */
    // find the most recent snapshot file
    File findMostRecentSnapshot() throws IOException;
    
    /**
     * free resources from this snapshot immediately
     * @throws IOException
     */
    // release resources
    void close() throws IOException;
}

Notes: SnapShot defines only four methods: deserialize, serialize, find the most recent snapshot file, and release resources.

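III. FileSnap source code analysis

FileSnap implements the SnapShot interface. It is responsible for storing, serializing, deserializing, and accessing snapshot files.

3.1 Class fields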

public class FileSnap implements SnapShot {
    // snapshot directory
    File snapDir;
    // flag: whether this FileSnap has been closed
    private volatile boolean close = false;
    // version number
    private static final int VERSION=2;
    // database id
    private static final long dbId=-1;
    // Logger
    private static final Logger LOG = LoggerFactory.getLogger(FileSnap.class);
    // magic number of snapshot files (similar to the magic number of class files)
    public final static int SNAP_MAGIC
        = ByteBuffer.wrap("ZKSN".getBytes()).getInt();
}

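Notes: FileSnap's fields are the snapshot directory, a flag that records whether the instance has been closed, the format version, the database id, and the magic number written at the start of every snapshot file.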
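
To make the magic number concrete, here is a minimal sketch of what the SNAP_MAGIC expression evaluates to. The class name SnapMagicDemo is made up for illustration, and the sketch passes an explicit ASCII charset, whereas the original relies on the platform default.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class SnapMagicDemo {
    // Interprets the four ASCII bytes of "ZKSN" as one big-endian int,
    // mirroring the expression used for SNAP_MAGIC in FileSnap.
    static int snapMagic() {
        return ByteBuffer.wrap("ZKSN".getBytes(StandardCharsets.US_ASCII)).getInt();
    }

    public static void main(String[] args) {
        // 'Z' = 0x5A, 'K' = 0x4B, 'S' = 0x53, 'N' = 0x4E
        System.out.println(Integer.toHexString(snapMagic())); // prints 5a4b534e
    }
}
```

Because ByteBuffer is big-endian by default, the first four bytes of a snapshot file read as the characters Z, K, S, N in a hex dump.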

3.2 Core methods

1. The deserialize method

Its signature is:

public long deserialize(DataTree dt, Map<Long, Integer> sessions), which implements deserialize from SnapShot. Its source code is as follows:

    public long deserialize(DataTree dt, Map<Long, Integer> sessions)
            throws IOException {
        // we run through 100 snapshots (not all of them)
        // if we cannot get it running within 100 snapshots
        // we should  give up
        // look up at most 100 valid snapshot files
        List<File> snapList = findNValidSnapshots(100);
        if (snapList.size() == 0) { // no snapshot file, return immediately
            return -1L;
        }
        // the snapshot file currently being tried
        File snap = null;
        // no valid snapshot found yet
        boolean foundValid = false;
        for (int i = 0; i < snapList.size(); i++) { // iterate over snapList
            snap = snapList.get(i);
            // input streams
            InputStream snapIS = null;
            CheckedInputStream crcIn = null;
            try {
                LOG.info("Reading snapshot " + snap);
                // read the given snapshot file
                snapIS = new BufferedInputStream(new FileInputStream(snap));
                // wrap it so that a checksum is computed while reading
                crcIn = new CheckedInputStream(snapIS, new Adler32());
                InputArchive ia = BinaryInputArchive.getArchive(crcIn);
                // deserialize
                deserialize(dt,sessions, ia);
                // checksum computed over the bytes read so far
                long checkSum = crcIn.getChecksum().getValue();
                // checksum value stored in the file
                long val = ia.readLong("val");
                if (val != checkSum) { // mismatch: throw an exception
                    throw new IOException("CRC corruption in snapshot :  " + snap);
                }
                // valid
                foundValid = true;
                // leave the loop
                break;
            } catch(IOException e) {
                LOG.warn("problem reading snap file " + snap, e);
            } finally { // close the streams
                if (snapIS != null) 
                    snapIS.close();
                if (crcIn != null) 
                    crcIn.close();
            } 
        }
        if (!foundValid) { // none of the files passed verification
            throw new IOException("Not able to find valid snapshots in " + snapDir);
        }
        // parse the zxid from the file name
        dt.lastProcessedZxid = Util.getZxidFromName(snap.getName(), "snapshot");
        return dt.lastProcessedZxid;
    }

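Notes: deserialize restores state from a snapshot and stores the result in dt and sessions. The rough steps are:

① Get up to 100 valid snapshot files, already sorted by zxid in descending order, then go to ②.

② Iterate over these files starting from the one with the largest zxid; open the file and build the corresponding InputArchive, then go to ③.

③ Call deserialize(dt, sessions, ia) to do the actual deserialization, then go to ④.

④ Compare the checksum stored in the file with the checksum computed while reading. If they differ, an IOException is thrown; it is caught and logged inside the loop, and the next (older) snapshot is tried. Otherwise go to ⑤.

⑤ Break out of the loop, close the input streams, parse the zxid from the file name, and return it.

⑥ If no file passes verification after all candidates have been tried, throw an exception.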
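
The verify-and-fall-back loop in the steps above can be sketched with plain JDK streams. This is only an illustration under simplifying assumptions: the class SnapshotFallbackDemo, its methods, and the [length][payload][checksum] layout are made up for the example, DataInputStream/DataOutputStream stand in for ZooKeeper's archive classes, and in-memory byte arrays stand in for snapshot files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.zip.Adler32;
import java.util.zip.CheckedInputStream;
import java.util.zip.CheckedOutputStream;

class SnapshotFallbackDemo {
    // Hypothetical simplified layout: [int payload length][payload][long Adler32].
    static byte[] write(byte[] payload) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        CheckedOutputStream crcOut = new CheckedOutputStream(bos, new Adler32());
        DataOutputStream out = new DataOutputStream(crcOut);
        out.writeInt(payload.length);
        out.write(payload);
        // Capture the checksum before writing it, so it covers only the data part.
        long val = crcOut.getChecksum().getValue();
        out.writeLong(val);
        out.flush();
        return bos.toByteArray();
    }

    // Reads one candidate; throws IOException if the stored checksum does not match.
    static byte[] read(byte[] image) throws IOException {
        CheckedInputStream crcIn =
                new CheckedInputStream(new ByteArrayInputStream(image), new Adler32());
        DataInputStream in = new DataInputStream(crcIn);
        int len = in.readInt();
        if (len < 0 || len > image.length) {
            throw new IOException("bad length " + len);
        }
        byte[] payload = new byte[len];
        in.readFully(payload);
        long computed = crcIn.getChecksum().getValue(); // checksum of the bytes read so far
        long stored = in.readLong();
        if (stored != computed) {
            throw new IOException("checksum mismatch");
        }
        return payload;
    }

    // Tries candidates newest-first and returns the index of the first valid one.
    static int firstValid(List<byte[]> newestFirst) throws IOException {
        for (int i = 0; i < newestFirst.size(); i++) {
            try {
                read(newestFirst.get(i));
                return i;
            } catch (IOException e) {
                // corrupted candidate: fall through to the next, older one
            }
        }
        throw new IOException("no valid snapshot found");
    }

    // Builds one corrupted "newer" image and one intact "older" image.
    static int demo() {
        try {
            byte[] older = write("older state".getBytes());
            byte[] newer = write("newer state".getBytes());
            newer[6] ^= 0x01; // flip one payload bit to simulate corruption
            return firstValid(Arrays.asList(newer, older));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 1: the corrupted newer image is skipped
    }
}
```

As in FileSnap, the checksum is read from the stream only after the computed value has been captured, so the stored value itself is not part of what gets verified.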

deserialize calls findNValidSnapshots and the overloaded deserialize(dt, sessions, ia). The source code of findNValidSnapshots is as follows:

    private List<File> findNValidSnapshots(int n) throws IOException {
        // sort the snapshot files by zxid in descending order
        List<File> files = Util.sortDataDir(snapDir.listFiles(),"snapshot", false);
        int count = 0;
        List<File> list = new ArrayList<File>();
        for (File f : files) { // iterate over the snapshot files
            // we should catch the exceptions
            // from the valid snapshot and continue
            // until we find a valid one
            try {
                // check validity: a snapshot left incomplete because the server
                // crashed while writing it is invalid; so is a non-snapshot file
                if (Util.isValidSnapshot(f)) {
                    // valid: add it to the result
                    list.add(f);
                    // increment the counter
                    count++;
                    if (count == n) { // stop once n files have been collected
                        break;
                    }
                }
            } catch (IOException e) {
                LOG.info("invalid snapshot " + f, e);
            }
        }
        return list;
    }
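
The descending order used by findNValidSnapshots comes from the zxid encoded in each file name. As a rough sketch of that idea, assuming names of the form snapshot.<zxid in hex>, the example below parses the zxid and orders names newest first. SnapshotNameDemo and its methods are hypothetical stand-ins and not ZooKeeper's Util class; unlike the real code, this sketch also drops non-matching names during sorting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SnapshotNameDemo {
    // Returns the zxid encoded in a name like "snapshot.1f",
    // or -1 if the name does not have the form <prefix>.<hex>.
    static long zxidFromName(String name, String prefix) {
        String[] parts = name.split("\\.");
        if (parts.length == 2 && parts[0].equals(prefix)) {
            try {
                return Long.parseLong(parts[1], 16);
            } catch (NumberFormatException e) {
                // not a hexadecimal suffix: fall through and report -1
            }
        }
        return -1L;
    }

    // Keeps only names with a parsable zxid and orders them largest zxid first.
    static List<String> newestFirst(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String n : names) {
            if (zxidFromName(n, "snapshot") != -1L) {
                result.add(n);
            }
        }
        result.sort(Comparator
                .comparingLong((String n) -> zxidFromName(n, "snapshot"))
                .reversed());
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("snapshot.a", "log.5", "snapshot.1f", "snapshot.2");
        System.out.println(newestFirst(names)); // prints [snapshot.1f, snapshot.a, snapshot.2]
    }
}
```

Note that the suffix is compared as a number, not as text: 0x1f (31) sorts ahead of 0xa (10), which a plain string sort would get wrong.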

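Notes: findNValidSnapshots returns up to N valid snapshot files, ordered by zxid from largest to smallest. Util.isValidSnapshot decides validity from the file name and from whether the file ends with the "/" marker (a 4-byte length of 1 followed by the byte '/'). Its source code is as follows: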

    public static boolean isValidSnapshot(File f) throws IOException {
        // a null file or a file whose name is not a snapshot name: return false
        if (f==null || Util.getZxidFromName(f.getName(), "snapshot") == -1)
            return false;

        // Check for a valid snapshot
        // open the file for random access
        RandomAccessFile raf = new RandomAccessFile(f, "r");
        try {
            // including the header and the last / bytes
            // the snapshot should be atleast 10 bytes
            if (raf.length() < 10) { // shorter than 10 bytes: return false
                return false;
            }
            // seek to the fifth byte from the end
            raf.seek(raf.length() - 5);
            byte bytes[] = new byte[5];
            int readlen = 0;
            int l;
            while(readlen < 5 &&
                  (l = raf.read(bytes, readlen, bytes.length - readlen)) >= 0) { // read the last five bytes into bytes
                readlen += l;
            }
            if (readlen != bytes.length) {
                LOG.info("Invalid snapshot " + f
                        + " too short, len = " + readlen);
                return false;
            }
            ByteBuffer bb = ByteBuffer.wrap(bytes);
            int len = bb.getInt();
            byte b = bb.get();
            if (len != 1 || b != '/') { // the trailing record is not "/": invalid
                LOG.info("Invalid snapshot " + f + " len = " + len
                        + " byte = " + (b & 0xff));
                return false;
            }
        } finally {
            raf.close();
        }

        return true;
    }
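
The trailing-marker test can be tried in isolation. The sketch below applies the same check (a big-endian int equal to 1 followed by the byte '/') to small temporary files. SnapshotTailDemo and its methods are hypothetical names, and the 10-byte minimum and the file-name check from isValidSnapshot are left out to keep the example short.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;

class SnapshotTailDemo {
    // Returns true if the file ends with a 4-byte big-endian length of 1 followed by '/'.
    static boolean endsWithSlashRecord(File f) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            if (raf.length() < 5) {
                return false;
            }
            raf.seek(raf.length() - 5);
            byte[] tail = new byte[5];
            raf.readFully(tail);
            ByteBuffer bb = ByteBuffer.wrap(tail);
            return bb.getInt() == 1 && bb.get() == '/';
        }
    }

    // Writes the given bytes to a temporary file, checks it, and cleans up.
    static boolean check(byte[] content) {
        try {
            File f = File.createTempFile("snapshot-demo", ".bin");
            try {
                Files.write(f.toPath(), content);
                return endsWithSlashRecord(f);
            } finally {
                f.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] complete = {9, 9, 9, 0, 0, 0, 1, '/'};
        byte[] truncated = {9, 9, 9, 0, 0, 0}; // writer stopped before the marker
        System.out.println(check(complete));  // prints true
        System.out.println(check(truncated)); // prints false
    }
}
```

Since the marker is the last thing written, a file that was cut off partway through a write will normally not end with it, which is why the check is a cheap way to reject incomplete snapshots.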

The source code of deserialize(dt, sessions, ia) is as follows:

    public void deserialize(DataTree dt, Map<Long, Integer> sessions,
            InputArchive ia) throws IOException {
        FileHeader header = new FileHeader();
        // deserialize into header
        header.deserialize(ia, "fileheader");
        if (header.getMagic() != SNAP_MAGIC) { // verify the magic number
            throw new IOException("mismatching magic headers "
                    + header.getMagic() + 
                    " !=  " + FileSnap.SNAP_MAGIC);
        }
        // deserialize into dt and sessions
        SerializeUtils.deserializeSnapshot(dt,ia,sessions);
    }

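Notes: this method first deserializes the file header and verifies that its magic number equals SNAP_MAGIC, then deserializes the rest of the snapshot into dt and sessions.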

2. The serialize method

Its signature is: protected void serialize(DataTree dt,Map<Long, Integer> sessions, OutputArchive oa, FileHeader header) throws IOException

    protected void serialize(DataTree dt,Map<Long, Integer> sessions,
            OutputArchive oa, FileHeader header) throws IOException {
        // this is really a programmatic error and not something that can
        // happen at runtime
        if(header==null) // the file header is null
            throw new IllegalStateException(
                    "Snapshot's not open for writing: uninitialized header");
        // serialize header
        header.serialize(oa, "fileheader");
        // serialize dt and sessions
        SerializeUtils.serializeSnapshot(dt,oa,sessions);
    }

Notes: this method serializes header, sessions, and dt. It first checks that header is not null, then serializes header, sessions, and dt in that order.

3. The serialize method

Its signature is: public synchronized void serialize(DataTree dt, Map<Long, Integer> sessions, File snapShot) throws IOException

    public synchronized void serialize(DataTree dt, Map<Long, Integer> sessions, File snapShot)
            throws IOException {
        if (!close) { // not closed
            // output streams
            OutputStream sessOS = new BufferedOutputStream(new FileOutputStream(snapShot));
            CheckedOutputStream crcOut = new CheckedOutputStream(sessOS, new Adler32());
            //CheckedOutputStream cout = new CheckedOutputStream()
            OutputArchive oa = BinaryOutputArchive.getArchive(crcOut);
            // create the file header
            FileHeader header = new FileHeader(SNAP_MAGIC, VERSION, dbId);
            // serialize dt, sessions, and header
            serialize(dt,sessions,oa, header);
            // get the checksum of everything written so far
            long val = crcOut.getChecksum().getValue();
            // write the checksum
            oa.writeLong(val, "val");
            // write "/"
            oa.writeString("/", "path");
            // force a flush
            sessOS.flush();
            crcOut.close();
            sessOS.close();
        }
    }

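Notes: this method serializes header, sessions, and dt into a local snapshot file, then appends the checksum and finally the "/" marker. The method is synchronized, so concurrent calls on the same FileSnap instance run one at a time.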

IV. Summary

The FileSnap source code is relatively simple; its main job is to operate on snapshot files. Thanks for reading.
