Source Code Analysis of git-0.1

Posted by azurelaker on 2018-08-14

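git is an open-source distributed source code management (SCM) system, developed by Linus Torvalds in 2005. At the time, the vendor of BitKeeper, the SCM tool used by the Linux kernel project, stopped offering it free of charge. Linus could not find a replacement for BitKeeper that met his requirements, so he designed and wrote git himself. As for the name, Linus explains it in the README of the initial version as follows. In other words, it has no special meaning :-)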

    GIT - the stupid content tracker

"git" can mean anything, depending on your mood.

 - random three-letter combination that is pronounceable, and not
   actually used by any common UNIX command.  The fact that it is a
   mispronounciation of "get" may or may not be relevant.
 - stupid. contemptible and despicable. simple. Take your pick from the
   dictionary of slang.
 - "global information tracker": you're in a good mood, and it actually
   works for you. Angels sing, and a light suddenly fills the room. 
 - "goddamn idiotic truckload of sh*t": when it breaks

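Compared with other SCM tools, git has several notable features and advantages:
1) A clone of a repository contains the complete revision history.
2) Branch management is efficient and fast.
3) Distributed development is highly efficient.
Of course, git has many other powerful and convenient features. For the git source tree, see: https://github.com/git/git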

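The initial version, git-0.1, completed the framework design of the object database and the cache, and implemented only the low-level commands. The commands we use today, such as git add or git rm, invoke a low-level command with specific arguments. Each low-level command is compiled into its own executable. The directory structure and files of the initial version are shown in the figure below:
[Figure: directory structure and files of git-0.1]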

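The header file cache.h declares the types and interfaces used throughout the cache processing flow. These types are explained in the flow analysis. The contents of the file are as follows: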

#ifndef CACHE_H
#define CACHE_H

#include <stdio.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <stdarg.h>
#include <errno.h>
#include <sys/mman.h>

#include <openssl/sha.h>
#include <zlib.h>

/*
 * Basic data structures for the directory cache
 *
 * NOTE NOTE NOTE! This is all in the native CPU byte format. It's
 * not even trying to be portable. It's trying to be efficient. It's
 * just a cache, after all.
 */

#define CACHE_SIGNATURE 0x44495243  /* "DIRC" */
struct cache_header {
    unsigned int signature;
    unsigned int version;
    unsigned int entries;
    unsigned char sha1[20];
};

/*
 * The "cache_time" is just the low 32 bits of the
 * time. It doesn't matter if it overflows - we only
 * check it for equality in the 32 bits we save.
 */
struct cache_time {
    unsigned int sec;
    unsigned int nsec;
};

/*
 * dev/ino/uid/gid/size are also just tracked to the low 32 bits
 * Again - this is just a (very strong in practice) heuristic that
 * the inode hasn't changed.
 */
struct cache_entry {
    struct cache_time ctime;
    struct cache_time mtime;
    unsigned int st_dev;
    unsigned int st_ino;
    unsigned int st_mode;
    unsigned int st_uid;
    unsigned int st_gid;
    unsigned int st_size;
    unsigned char sha1[20];
    unsigned short namelen;
    unsigned char name[0];
};

const char *sha1_file_directory;
struct cache_entry **active_cache;
unsigned int active_nr, active_alloc;

#define DB_ENVIRONMENT "SHA1_FILE_DIRECTORY"
#define DEFAULT_DB_ENVIRONMENT ".dircache/objects"

#define cache_entry_size(len) ((offsetof(struct cache_entry,name) + (len) + 8) & ~7)
#define ce_size(ce) cache_entry_size((ce)->namelen)

#define alloc_nr(x) (((x)+16)*3/2)

/* Initialize the cache information */
extern int read_cache(void);

/* Return a statically allocated filename matching the sha1 signature */
extern char *sha1_file_name(unsigned char *sha1);

/* Write a memory buffer out to the sha file */
extern int write_sha1_buffer(unsigned char *sha1, void *buf, unsigned int size);

/* Read and unpack a sha1 file into memory, write memory to a sha1 file */
extern void * read_sha1_file(unsigned char *sha1, char *type, unsigned long *size);
extern int write_sha1_file(char *buf, unsigned len);

/* Convert to/from hex/sha1 representation */
extern int get_sha1_hex(char *hex, unsigned char *sha1);
extern char *sha1_to_hex(unsigned char *sha1);  /* static buffer! */

/* General helper functions */
extern void usage(const char *err);

#endif /* CACHE_H */

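The source code flow analysis consists mainly of the following parts:
Processing flow of the init-db command
Processing flow of the update-cache command
Processing flow of the write-tree command
Processing flow of the show-diff command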
