Golang Source Code Study (1): os/file File Operations

Posted by Hanson on 2021-12-29

Project repository: github.com/Hanson/golang-learning

In this article we walk through the file module by looking at how files are created, opened, written, and read.

The file struct

Let's start with the definition of file.

package os

// file_unix.go
type file struct {
    pfd         poll.FD
    name        string
    dirinfo     *dirInfo // nil unless directory being read
    nonblock    bool     // whether we set nonblocking mode
    stdoutOrErr bool     // whether this is stdout or stderr
    appendMode  bool     // whether file is opened for appending
}

// file_windows.go
type file struct {
    pfd        poll.FD
    name       string
    dirinfo    *dirInfo // nil unless directory being read
    appendMode bool     // whether file is opened for appending
}

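Because Windows and Unix are different operating systems, the struct differs slightly between them.

The Unix version has two extra fields: nonblock and stdoutOrErr.

Here is what each field means.

pfd: the file descriptor (a poll.FD)
name: the file name
dirinfo: directory information, nil unless the file is a directory being read
appendMode: whether the file was opened for appending
nonblock: whether nonblocking mode has been set (Unix only)
stdoutOrErr: whether the file is stdout or stderr (Unix only)

Whether we create or open a file, we have to specify the open flags and the file mode.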

Open flags

// Open flags
const (
    O_RDONLY int = syscall.O_RDONLY // open the file read-only.
    O_WRONLY int = syscall.O_WRONLY // open the file write-only.
    O_RDWR   int = syscall.O_RDWR   // open the file read-write.
    O_APPEND int = syscall.O_APPEND // append data to the file when writing.
    O_CREATE int = syscall.O_CREAT  // create a new file if none exists.
    O_EXCL   int = syscall.O_EXCL   // used with O_CREATE, file must not exist.
    O_SYNC   int = syscall.O_SYNC   // open for synchronous I/O.
    O_TRUNC  int = syscall.O_TRUNC  // if possible, truncate file when opened.
)
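
As a small illustration of how these flags combine, the sketch below opens the same path twice with O_CREATE|O_EXCL. The helper names createExclusive and exclusiveDemo and the file name demo.txt are made up for this example; only os.OpenFile and the flags come from the listing above.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// createExclusive (hypothetical helper) creates path only if it does not
// exist yet. With O_EXCL the open fails when the file is already there.
func createExclusive(path string) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o644)
	if err != nil {
		return err
	}
	return f.Close()
}

// exclusiveDemo reports whether the first create succeeded and whether the
// second one failed with an "already exists" error.
func exclusiveDemo() (firstOK, secondExists bool) {
	dir, err := os.MkdirTemp("", "flags-demo")
	if err != nil {
		return false, false
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "demo.txt") // hypothetical file name
	firstOK = createExclusive(path) == nil
	secondExists = errors.Is(createExclusive(path), fs.ErrExist)
	return firstOK, secondExists
}

func main() {
	first, second := exclusiveDemo()
	fmt.Println("first create succeeded:", first)
	fmt.Println("second create reported ErrExist:", second)
}
```

Checking with errors.Is against fs.ErrExist keeps the example independent of the platform-specific errno.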

File mode

// File mode bits
const (
    ModeDir        FileMode = 1 << (32 - 1 - iota) // d: is a directory
    ModeAppend                                     // a: append-only
    ModeExclusive                                  // l: exclusive use
    ModeTemporary                                  // T: temporary file (not backed up)
    ModeSymlink                                    // L: symbolic link
    ModeDevice                                     // D: device file
    ModeNamedPipe                                  // p: named pipe (FIFO)
    ModeSocket                                     // S: Unix domain socket
    ModeSetuid                                     // u: setuid
    ModeSetgid                                     // g: setgid
    ModeCharDevice                                 // c: Unix character device, when ModeDevice is set
    ModeSticky                                     // t: sticky
    // Mask for the type bits. For regular files, none will be set.
    ModeType = ModeDir | ModeSymlink | ModeNamedPipe | ModeSocket | ModeDevice
    ModePerm FileMode = 0777 // Unix permission bits
)
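
To show how the type mask and the permission mask are typically used, here is a minimal sketch. The describe helper is hypothetical and not part of the os package; it only relies on os.ModeType, os.ModeDir, os.ModeSymlink, and FileMode.Perm.

```go
package main

import (
	"fmt"
	"os"
)

// describe (hypothetical helper) summarizes a FileMode by looking at the
// type bits first and then at the permission bits.
func describe(mode os.FileMode) string {
	kind := "other"
	switch {
	case mode&os.ModeType == 0:
		// No type bit is set, so this is a regular file.
		kind = "regular file"
	case mode&os.ModeDir != 0:
		kind = "directory"
	case mode&os.ModeSymlink != 0:
		kind = "symlink"
	}
	// Perm() masks the mode with ModePerm (0777).
	return fmt.Sprintf("%s perm=%04o", kind, uint32(mode.Perm()))
}

func main() {
	fmt.Println(describe(os.ModeDir | 0o755)) // prints "directory perm=0755"
	fmt.Println(describe(0o644))              // prints "regular file perm=0644"
}
```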

os.Create: creating a file

f, err := os.Create(fileName)
if err != nil {
    return err // check the error before deferring Close
}
defer f.Close()

// file.go
func Create(name string) (*File, error) {
    return OpenFile(name, O_RDWR|O_CREATE|O_TRUNC, 0666)
}

As you can see, to create a file Go calls OpenFile with three flags (O_RDWR, O_CREATE, O_TRUNC) and the permission bits 0666.

// OpenFile is the generalized open call; most users will use Open
// or Create instead. It opens the named file with specified flag
// (O_RDONLY etc.). If the file does not exist, and the O_CREATE flag
// is passed, it is created with mode perm (before umask). If successful,
// methods on the returned File can be used for I/O.
// If there is an error, it will be of type *PathError.
func OpenFile(name string, flag int, perm FileMode) (*File, error) {
    testlog.Open(name)
    f, err := openFileNolog(name, flag, perm)
    if err != nil {
        return nil, err
    }
    f.appendMode = flag&O_APPEND != 0

    return f, nil
}

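In other words, the comment says that OpenFile is the generalized open call, and most developers will use Open or Create instead. If the file does not exist and the O_CREATE flag is passed, the file is created with mode perm (before umask).

The core of OpenFile is the call to openFileNolog, which receives all the arguments unchanged. The only extra step is recording whether O_APPEND was set in appendMode.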
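
The following sketch checks, from the outside, that opening with the same three flags as Create truncates an existing file. createLike and sizeAfterRecreate are hypothetical helpers written for this example, and demo.txt is an arbitrary name.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// createLike (hypothetical) uses the same flags and permission that
// os.Create passes to OpenFile in the listing above.
func createLike(name string) (*os.File, error) {
	return os.OpenFile(name, os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0o666)
}

// sizeAfterRecreate writes some content, reopens the file with createLike,
// and returns the file size observed afterwards.
func sizeAfterRecreate() (int64, error) {
	dir, err := os.MkdirTemp("", "create-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "demo.txt") // hypothetical file name
	if err := os.WriteFile(path, []byte("old content"), 0o644); err != nil {
		return 0, err
	}

	f, err := createLike(path) // O_TRUNC discards the old content
	if err != nil {
		return 0, err
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

func main() {
	size, err := sizeAfterRecreate()
	fmt.Println("size after re-create:", size, "err:", err)
}
```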

The Windows and Unix implementations differ, because the underlying operating systems and file systems differ. Let's look at Unix first.

// file_unix.go
// openFileNolog is the Unix implementation of OpenFile.
// Changes here should be reflected in openFdAt, if relevant.
func openFileNolog(name string, flag int, perm FileMode) (*File, error) {
    setSticky := false
    if !supportsCreateWithStickyBit && flag&O_CREATE != 0 && perm&ModeSticky != 0 {
        if _, err := Stat(name); IsNotExist(err) {
            setSticky = true
        }
    }

    var r int
    for {
        var e error
        r, e = syscall.Open(name, flag|syscall.O_CLOEXEC, syscallMode(perm))
        if e == nil {
            break
        }

        // We have to check EINTR here, per issues 11180 and 39237.
        if e == syscall.EINTR {
            continue
        }

        return nil, &PathError{Op: "open", Path: name, Err: e}
    }

    // open(2) itself won't handle the sticky bit on *BSD and Solaris
    if setSticky {
        setStickyBit(name)
    }

    // There's a race here with fork/exec, which we are
    // content to live with. See ../syscall/exec_unix.go.
    if !supportsCloseOnExec {
        syscall.CloseOnExec(r)
    }

    return newFile(uintptr(r), name, kindOpenFile), nil
}

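The function first uses the & operator to test the flags. If the platform cannot set the sticky bit at creation time (!supportsCreateWithStickyBit), O_CREATE is set, and perm asks for ModeSticky, it calls Stat; if the file does not exist yet, setSticky is set to true so that the bit can be applied after the open.

On Unix the Open system call is made inside a for loop, which exits as soon as the call succeeds.

A file descriptor opened with O_CLOEXEC is closed automatically when exec starts a new program, and the flag is applied atomically as part of the open call.

If the system call returns syscall.EINTR the loop tries again; any other error is returned wrapped in a *PathError.

Why a for loop?
The commit history shows that when the syscall.EINTR check was first added it was written with goto, and before that there was no retry at all. The for loop is there so that an interrupted system call (EINTR) is retried instead of being reported as a failure.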
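
The retry pattern can be sketched in isolation as follows. retryOnEINTR and simulate are hypothetical names for this example; the interrupted call is simulated with a closure rather than a real system call.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// retryOnEINTR (hypothetical helper) keeps calling fn until it returns
// anything other than EINTR, mirroring the for loop in openFileNolog.
func retryOnEINTR(fn func() error) error {
	for {
		err := fn()
		if !errors.Is(err, syscall.EINTR) {
			return err // nil or a real error: stop retrying
		}
	}
}

// simulate pretends that the first `interrupts` calls are interrupted and
// returns how many calls were needed in total.
func simulate(interrupts int) (calls int, err error) {
	err = retryOnEINTR(func() error {
		calls++
		if calls <= interrupts {
			return syscall.EINTR
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := simulate(2)
	fmt.Println(calls, err) // prints "3 <nil>"
}
```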

The remaining logic is:

  • if setSticky is true, chmod the file to add ModeSticky
  • if !supportsCloseOnExec, make the CloseOnExec system call

Finally, the function returns the result of newFile.

// newFile is like NewFile, but if called from OpenFile or Pipe
// (as passed in the kind parameter) it tries to add the file to
// the runtime poller.
func newFile(fd uintptr, name string, kind newFileKind) *File {
    fdi := int(fd)
    if fdi < 0 {
        return nil
    }
    f := &File{&file{
        pfd: poll.FD{
            Sysfd:         fdi,
            IsStream:      true,
            ZeroReadIsEOF: true,
        },
        name:        name,
        stdoutOrErr: fdi == 1 || fdi == 2,
    }}

    pollable := kind == kindOpenFile || kind == kindPipe || kind == kindNonBlock

    // If the caller passed a non-blocking filedes (kindNonBlock),
    // we assume they know what they are doing so we allow it to be
    // used with kqueue.
    if kind == kindOpenFile {
        switch runtime.GOOS {
        case "darwin", "ios", "dragonfly", "freebsd", "netbsd", "openbsd":
            var st syscall.Stat_t
            err := ignoringEINTR(func() error {
                return syscall.Fstat(fdi, &st)
            })
            typ := st.Mode & syscall.S_IFMT
            // Don't try to use kqueue with regular files on *BSDs.
            // On FreeBSD a regular file is always
            // reported as ready for writing.
            // On Dragonfly, NetBSD and OpenBSD the fd is signaled
            // only once as ready (both read and write).
            // Issue 19093.
            // Also don't add directories to the netpoller.
            if err == nil && (typ == syscall.S_IFREG || typ == syscall.S_IFDIR) {
                pollable = false
            }

            // In addition to the behavior described above for regular files,
            // on Darwin, kqueue does not work properly with fifos:
            // closing the last writer does not cause a kqueue event
            // for any readers. See issue #24164.
            if (runtime.GOOS == "darwin" || runtime.GOOS == "ios") && typ == syscall.S_IFIFO {
                pollable = false
            }
        }
    }

    if err := f.pfd.Init("file", pollable); err != nil {
        // An error here indicates a failure to register
        // with the netpoll system. That can happen for
        // a file descriptor that is not supported by
        // epoll/kqueue; for example, disk files on
        // Linux systems. We assume that any real error
        // will show up in later I/O.
    } else if pollable {
        // We successfully registered with netpoll, so put
        // the file into nonblocking mode.
        if err := syscall.SetNonblock(fdi, true); err == nil {
            f.nonblock = true
        }
    }

    runtime.SetFinalizer(f.file, (*file).close)
    return f
}

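newFile first checks whether the file descriptor returned by the Open system call is negative, and then builds the File struct.

Because kind is kindOpenFile, pollable starts out true (on the BSD family and Darwin it is reset to false for regular files, directories, and, on Darwin, FIFOs). If registering with the poller succeeds and pollable is still true, SetNonblock(fdi, true) is called and nonblock is set to true.

Next, let's look at the Windows code.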

// file_windows.go
// openFileNolog is the Windows implementation of OpenFile.
func openFileNolog(name string, flag int, perm FileMode) (*File, error) {
    if name == "" {
        return nil, &PathError{Op: "open", Path: name, Err: syscall.ENOENT}
    }
    r, errf := openFile(name, flag, perm)
    if errf == nil {
        return r, nil
    }
    r, errd := openDir(name)
    if errd == nil {
        if flag&O_WRONLY != 0 || flag&O_RDWR != 0 {
            r.Close()
            return nil, &PathError{Op: "open", Path: name, Err: syscall.EISDIR}
        }
        return r, nil
    }
    return nil, &PathError{Op: "open", Path: name, Err: errf}
}

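This code is also straightforward. It rejects an empty name, calls openFile, and returns if that succeeds; otherwise it calls openDir. If openDir succeeds but the flags contain O_WRONLY or O_RDWR, the directory is closed and EISDIR is returned. If both calls fail, the error from openFile is returned.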
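
The observable effect can be checked with a short sketch: opening a directory read-only works, while opening it for writing fails. openDirBothWays is a hypothetical helper. On Windows the error comes from the code above; on Unix it comes from the kernel, so the example only checks that an error is returned and does not assume a specific errno.

```go
package main

import (
	"fmt"
	"os"
)

// openDirBothWays (hypothetical helper) opens a temporary directory once
// read-only and once write-only and returns both errors.
func openDirBothWays() (error, error) {
	dir, err := os.MkdirTemp("", "dir-demo")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(dir)

	r, readErr := os.Open(dir) // O_RDONLY
	if readErr == nil {
		r.Close()
	}

	w, writeErr := os.OpenFile(dir, os.O_WRONLY, 0)
	if writeErr == nil {
		w.Close()
	}
	return readErr, writeErr
}

func main() {
	readErr, writeErr := openDirBothWays()
	fmt.Println("read-only open error:", readErr)
	fmt.Println("write-only open error:", writeErr)
}
```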

// file_windows.go
func openFile(name string, flag int, perm FileMode) (file *File, err error) {
    r, e := syscall.Open(fixLongPath(name), flag|syscall.O_CLOEXEC, syscallMode(perm))
    if e != nil {
        return nil, e
    }
    return newFile(r, name, "file"), nil
}

On Windows, openFile is much simpler: if the system call succeeds, it returns the result of newFile.

// file_windows.go
// newFile returns a new File with the given file handle and name.
// Unlike NewFile, it does not check that h is syscall.InvalidHandle.
func newFile(h syscall.Handle, name string, kind string) *File {
    if kind == "file" {
        var m uint32
        if syscall.GetConsoleMode(h, &m) == nil {
            kind = "console"
        }
        if t, err := syscall.GetFileType(h); err == nil && t == syscall.FILE_TYPE_PIPE {
            kind = "pipe"
        }
    }

    f := &File{&file{
        pfd: poll.FD{
            Sysfd:         h,
            IsStream:      true,
            ZeroReadIsEOF: true,
        },
        name: name,
    }}
    runtime.SetFinalizer(f.file, (*file).close)

    // Ignore initialization errors.
    // Assume any problems will show up in later I/O.
    f.pfd.Init(kind, false)

    return f
}

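Besides building the file struct, newFile also calls the Init method of the file descriptor (poll.FD) and passes in the kind determined above.

File descriptors are outside the scope of this section; a later article will cover them in detail.

Next, let's see what openDir, which runs after openFile has failed, actually does.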

// file_windows.go
func openDir(name string) (file *File, err error) {
    var mask string

    path := fixLongPath(name)

    if len(path) == 2 && path[1] == ':' { // it is a drive letter, like C:
        mask = path + `*`
    } else if len(path) > 0 {
        lc := path[len(path)-1]
        if lc == '/' || lc == '\\' {
            mask = path + `*`
        } else {
            mask = path + `\*`
        }
    } else {
        mask = `\*`
    }
    maskp, e := syscall.UTF16PtrFromString(mask)
    if e != nil {
        return nil, e
    }
    d := new(dirInfo)
    r, e := syscall.FindFirstFile(maskp, &d.data)
    if e != nil {
        // FindFirstFile returns ERROR_FILE_NOT_FOUND when
        // no matching files can be found. Then, if directory
        // exists, we should proceed.
        if e != syscall.ERROR_FILE_NOT_FOUND {
            return nil, e
        }
        var fa syscall.Win32FileAttributeData
        pathp, e := syscall.UTF16PtrFromString(path)
        if e != nil {
            return nil, e
        }
        e = syscall.GetFileAttributesEx(pathp, syscall.GetFileExInfoStandard, (*byte)(unsafe.Pointer(&fa)))
        if e != nil {
            return nil, e
        }
        if fa.FileAttributes&syscall.FILE_ATTRIBUTE_DIRECTORY == 0 {
            return nil, e
        }
        d.isempty = true
    }
    d.path = path
    if !isAbs(d.path) {
        d.path, e = syscall.FullPath(d.path)
        if e != nil {
            return nil, e
        }
    }
    f := newFile(r, name, "dir")
    f.dirinfo = d
    return f, nil
}

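It first inspects the format of name (whether it is a bare drive letter such as C:, whether it ends with a path separator, and so on) and builds the mask argument accordingly.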
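
The mask rules can be pulled out into a pure function for illustration. dirMask is a hypothetical name; the branches follow the openDir listing above, applied to plain strings so that the sketch runs on any platform.

```go
package main

import "fmt"

// dirMask (hypothetical helper) reproduces the search-mask rules from
// openDir: the mask must end in a wildcard that matches every entry.
func dirMask(path string) string {
	switch {
	case len(path) == 2 && path[1] == ':':
		// A bare drive letter such as C: gets the wildcard directly.
		return path + `*`
	case len(path) > 0:
		lc := path[len(path)-1]
		if lc == '/' || lc == '\\' {
			// Already ends with a separator.
			return path + `*`
		}
		return path + `\*`
	default:
		// Empty path.
		return `\*`
	}
}

func main() {
	fmt.Println(dirMask(`C:`))      // prints C:*
	fmt.Println(dirMask(`C:\tmp`))  // prints C:\tmp\*
	fmt.Println(dirMask(`C:\tmp\`)) // prints C:\tmp\*
	fmt.Println(dirMask(``))        // prints \*
}
```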

Because the Windows API works with UTF-16 strings, the file path has to be converted to UTF-16.
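
syscall.UTF16PtrFromString is only available on Windows, so the sketch below uses the portable unicode/utf16 package to show what such a conversion produces. toUTF16Z is a hypothetical helper; unlike the syscall function it does not reject strings that contain a NUL byte.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// toUTF16Z (hypothetical helper) encodes s as UTF-16 code units and appends
// the terminating NUL that Windows APIs expect.
func toUTF16Z(s string) []uint16 {
	return append(utf16.Encode([]rune(s)), 0)
}

func main() {
	// ASCII characters take one code unit each.
	fmt.Printf("%#04x\n", toUTF16Z(`C:\*`))
	// Characters outside the Basic Multilingual Plane take a surrogate pair.
	fmt.Printf("%#04x\n", toUTF16Z("😀"))
}
```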

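It then calls the system function FindFirstFile, which writes its result into the dirInfo that was just allocated with new.

After that, if the path is not absolute, a system call (syscall.FullPath) resolves it to a full path. The part in between is not covered in detail here.

That finally wraps up os.Create.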

os.OpenFile: writing to a file

f, err := os.OpenFile(fileName, os.O_WRONLY|os.O_TRUNC, 0600)
if err != nil {
    return err // check the error before deferring Close
}
defer f.Close()
_, err = f.Write([]byte("text"))

// file.go
// Write writes len(b) bytes from b to the File.
// It returns the number of bytes written and an error, if any.
// Write returns a non-nil error when n != len(b).
func (f *File) Write(b []byte) (n int, err error) {
    if err := f.checkValid("write"); err != nil {
        return 0, err
    }
    n, e := f.write(b)
    if n < 0 {
        n = 0
    }
    if n != len(b) {
        err = io.ErrShortWrite
    }

    epipecheck(f, e)

    if e != nil {
        err = f.wrapErr("write", e)
    }

    return n, err
}

// checkValid checks whether f is valid for use.
// If not, it returns an appropriate error, perhaps incorporating the operation name op.
func (f *File) checkValid(op string) error {
    if f == nil {
        return ErrInvalid
    }
    return nil
}

Write first checks whether f is nil; if it is not, it goes on to call write.
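
A consequence of this check is that calling Write on a nil *os.File returns an error instead of panicking. The sketch below demonstrates that; nilWriteIsInvalid is a hypothetical helper name.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// nilWriteIsInvalid (hypothetical helper) writes to a nil *os.File and
// reports whether the resulting error is os.ErrInvalid.
func nilWriteIsInvalid() bool {
	var f *os.File // nil on purpose
	_, err := f.Write([]byte("text"))
	return errors.Is(err, os.ErrInvalid)
}

func main() {
	fmt.Println("nil file write returns ErrInvalid:", nilWriteIsInvalid())
}
```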

// file_posix.go
// write writes len(b) bytes to the File.
// It returns the number of bytes written and an error, if any.
func (f *File) write(b []byte) (n int, err error) {
    n, err = f.pfd.Write(b)
    runtime.KeepAlive(f)
    return n, err
}

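write calls the Write method of the platform-specific file descriptor (poll.FD); fd will be covered in a separate article.

After the write returns, Write compares the number of bytes written with len(b); if they differ, err is set to io.ErrShortWrite. If the underlying write itself returned an error, that error, wrapped by wrapErr, replaces it.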
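
From the caller's side, this means that a nil error from Write implies all bytes were written. A minimal sketch of a write followed by a read-back, assuming a temporary directory is writable; writeAll and roundTrip are hypothetical helpers and demo.txt is an arbitrary name.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeAll (hypothetical helper) writes data to path and returns the byte
// count reported by Write together with the first error encountered.
func writeAll(path string, data []byte) (int, error) {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0o600)
	if err != nil {
		return 0, err
	}
	n, err := f.Write(data)
	// Close can also fail; keep the write error if there was one.
	if cerr := f.Close(); err == nil {
		err = cerr
	}
	return n, err
}

// roundTrip writes text to a temporary file and reads it back.
func roundTrip(text string) (int, string, error) {
	dir, err := os.MkdirTemp("", "write-demo")
	if err != nil {
		return 0, "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "demo.txt") // hypothetical file name
	n, err := writeAll(path, []byte(text))
	if err != nil {
		return n, "", err
	}
	got, err := os.ReadFile(path)
	return n, string(got), err
}

func main() {
	n, got, err := roundTrip("text")
	fmt.Println(n, got, err)
}
```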

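Summary

This section covered creating, opening, and writing files. The os package is a fairly thin layer, though, and has little direct contact with the operating system. For a deeper understanding, read the fd code in internal/poll.

This work is licensed under a CC license; reposts must credit the author and link to this article.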