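Reading APUE with Go: Basic File I/O

Posted by zhaohu on 2017-11-03

Original URL: https://gocn.vip/topics/1271?locale=zh-CN

Original link: https://github.com/zhaohuXing/books-learning/blob/master/APUE/chapter-3/basic_file_io.md

Basic File I/O

I think open, read, write, lseek, and close cover the basic needs of file handling. Admittedly, I say so because that is what the book says.

Nearly every language offers corresponding functions or methods that we can simply call; in that sense, the language is just a tool. I prefer Go's style, so I use Go's library as the example here. Before getting to it, though, one concept needs to be clear: the file descriptor.

Here is the key point:

To the kernel, all open files are referred to by file descriptors. A file descriptor is a non-negative integer.

Is that description still a little vague?

When an existing file is opened or a new file is created, the kernel returns a file descriptor to the process.

When reading or writing a file, the file descriptor returned by open or creat identifies the file, and it is passed as an argument to read or write.

The variable name fd is conventionally used for a file descriptor.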
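To make the idea concrete, here is a minimal Go sketch that prints the descriptor behind an *os.File. It assumes Go 1.16 or later (for os.CreateTemp), and the helper name descriptorOf is made up for this illustration.

```go
package main

import (
	"fmt"
	"os"
)

// descriptorOf creates a temporary file and returns the kernel file
// descriptor that backs the *os.File. The helper name is hypothetical.
func descriptorOf() (uintptr, error) {
	f, err := os.CreateTemp("", "fd-demo-*")
	if err != nil {
		return 0, err
	}
	// Deferred calls run in reverse order: close first, then remove.
	defer os.Remove(f.Name())
	defer f.Close()
	return f.Fd(), nil
}

func main() {
	fd, err := descriptorOf()
	if err != nil {
		fmt.Println("create temp file:", err)
		return
	}
	// Descriptors 0, 1 and 2 normally belong to stdin, stdout and stderr,
	// so a freshly opened file usually gets a small number above 2.
	fmt.Println("stdin fd:", os.Stdin.Fd())
	fmt.Println("temp file fd:", fd)
}
```

The exact number printed for the temporary file depends on what else the process has open, so no particular value should be relied on.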

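The open and openat Functions & the creat Function

Calling open or openat opens or creates a file.

#include <fcntl.h>

int open(const char *path, int oflag, ... /* mode_t mode */);

int openat(int fd, const char *path, int oflag, ... /* mode_t mode */);

Calling creat creates a new file.

#include <fcntl.h>

int creat(const char *path, mode_t mode);

Parameters of the functions above:

path is the name of the file to open or create
oflag is a set of flags describing how the file is opened, for example O_RDWR|O_CREAT|O_TRUNC
mode specifies the access permission bits of the file
fd is a file descriptor; for openat it refers to the directory against which a relative path is resolved

Go's os package defines the following flags for opening files: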

// Flags to OpenFile wrapping those of the underlying system. Not all
// flags may be implemented on a given system.
const (
    O_RDONLY int = syscall.O_RDONLY // open the file read-only.
    O_WRONLY int = syscall.O_WRONLY // open the file write-only.
    O_RDWR   int = syscall.O_RDWR   // open the file read-write.
    O_APPEND int = syscall.O_APPEND // append data to the file when writing.
    O_CREATE int = syscall.O_CREAT  // create a new file if none exists.
    O_EXCL   int = syscall.O_EXCL   // used with O_CREATE, file must not exist
    O_SYNC   int = syscall.O_SYNC   // open for synchronous I/O.
    O_TRUNC  int = syscall.O_TRUNC  // if possible, truncate file when opened.
)

How to open or create a file in Go:

// Open file 
func Open(name string) (*File, error) {
    return OpenFile(name, O_RDONLY, 0)
}

// Create file 
func Create(name string) (*File, error) {
    return OpenFile(name, O_RDWR|O_CREATE|O_TRUNC, 0666)
}

Reading the source shows that both call OpenFile; only the flag and mode differ.
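Calling os.OpenFile directly is useful when neither Open nor Create has the right flag combination. The sketch below combines O_CREATE with O_EXCL so that creation fails if the file already exists. The helper names createExclusive and tryTwice are made up for illustration, and os.MkdirTemp assumes Go 1.16 or later.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// createExclusive creates path only if it does not already exist, the Go
// counterpart of open(path, O_WRONLY|O_CREAT|O_EXCL, 0644) in C.
func createExclusive(path string) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0644)
	if err != nil {
		return err
	}
	return f.Close()
}

// tryTwice calls createExclusive twice on the same path in a scratch
// directory. It returns the first call's error and whether the second call
// failed because the file already existed.
func tryTwice() (first error, secondExists bool) {
	dir, err := os.MkdirTemp("", "openfile-demo-*")
	if err != nil {
		return err, false
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "demo.txt")
	first = createExclusive(path)
	secondExists = errors.Is(createExclusive(path), os.ErrExist)
	return first, secondExists
}

func main() {
	first, secondExists := tryTwice()
	fmt.Println("first create error:", first)
	fmt.Println("second create reports ErrExist:", secondExists)
}
```

Without O_EXCL the second call would simply open the existing file, which is the behaviour Create gets from O_CREATE|O_TRUNC.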

// OpenFile is the generalized open call; most users will use Open
// or Create instead. It opens the named file with specified flag
// (O_RDONLY etc.) and perm, (0666 etc.) if applicable. If successful,
// methods on the returned File can be used for I/O.
// If there is an error, it will be of type *PathError.
func OpenFile(name string, flag int, perm FileMode) (*File, error) {
    chmod := false
    if !supportsCreateWithStickyBit && flag&O_CREATE != 0 && perm&ModeSticky != 0 {
        if _, err := Stat(name); IsNotExist(err) {
            chmod = true
        }
    }

    var r int
    for {
        var e error
        r, e = syscall.Open(name, flag|syscall.O_CLOEXEC, syscallMode(perm))
        if e == nil {
            break
        }

        // On OS X, sigaction(2) doesn't guarantee that SA_RESTART will cause
        // open(2) to be restarted for regular files. This is easy to reproduce on
        // fuse file systems (see http://golang.org/issue/11180).
        if runtime.GOOS == "darwin" && e == syscall.EINTR {
            continue
        }

        return nil, &PathError{"open", name, e}
    }

    // open(2) itself won't handle the sticky bit on *BSD and Solaris
    if chmod {
        Chmod(name, perm)
    }

    // There's a race here with fork/exec, which we are
    // content to live with. See ../syscall/exec_unix.go.
    if !supportsCloseOnExec {
        syscall.CloseOnExec(r)
    }

    return newFile(uintptr(r), name), nil
}

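While reading the code above I got stuck at supportsCreateWithStickyBit; the concept behind it is the sticky bit.

A quick look at the sticky bit:

In early versions of UNIX that did not yet use demand paging, if the sticky bit was set on an executable file, a copy of the program's text segment was kept in the swap area after the program terminated, so that it could be loaded into memory quickly the next time it ran. Most modern UNIX systems have virtual memory and fast file systems, so this technique is no longer needed.

In the OpenFile source, the constant supportsCreateWithStickyBit is true on Ubuntu 16.04, so that branch is never executed there. Developers on Ubuntu 16.04 can therefore skip the if !supportsCreateWithStickyBit ... block. Since I am on Ubuntu 16.04, OpenFile can be simplified as follows: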

// OpenFile is the generalized open call; most users will use Open
// or Create instead. It opens the named file with specified flag
// (O_RDONLY etc.) and perm, (0666 etc.) if applicable. If successful,
// methods on the returned File can be used for I/O.
// If there is an error, it will be of type *PathError.
func OpenFile(name string, flag int, perm FileMode) (*File, error) {
    var r int
    for {
        var e error
        r, e = syscall.Open(name, flag|syscall.O_CLOEXEC, syscallMode(perm))
        if e == nil {
            break
        }
        return nil, &PathError{"open", name, e}
    }
    return newFile(uintptr(r), name), nil
}

After simplifying, the core turns out to be syscall.Open(name, flag|syscall.O_CLOEXEC, syscallMode(perm)), which triggers the system call. Before going deeper, let's get syscallMode(perm) out of the way.

// syscallMode returns the syscall-specific mode bits from Go's portable mode bits.
func syscallMode(i FileMode) (o uint32) {
    o |= uint32(i.Perm())
    if i&ModeSetuid != 0 {
        o |= syscall.S_ISUID
    }
    if i&ModeSetgid != 0 {
        o |= syscall.S_ISGID
    }
    if i&ModeSticky != 0 {
        o |= syscall.S_ISVTX
    }
    // No mapping for Go's ModeTemporary (plan9 only).
    return
}

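Now for FileMode. The source defines it as type FileMode uint32, and reading further shows that i.Perm() is equivalent to i & 0777. Open passes a mode of 0, so syscallMode(0) == 0; Create passes 0666, so syscallMode(0666) == 438, which is octal 0666 written in decimal.

> Tips: At first I thought the files whose names end in posix were meant for some "POSIX system" (no such thing exists); after looking it up, I learned that they are the ones used on Unix systems.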
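The Perm arithmetic described above is easy to check. This is a small sketch, not the library's code: permBits is a hypothetical helper that reproduces only the first line of syscallMode.

```go
package main

import (
	"fmt"
	"os"
)

// permBits returns the permission bits of mode as a plain integer, the
// value syscallMode starts from via i.Perm(). The name is hypothetical.
func permBits(mode os.FileMode) uint32 {
	return uint32(mode.Perm())
}

func main() {
	fmt.Println(permBits(0))    // mode passed by Open: 0
	fmt.Println(permBits(0666)) // mode passed by Create: 438
	// Perm keeps only the low nine bits, so Go's sticky flag is dropped here
	// and has to be mapped separately, as syscallMode does with S_ISVTX.
	fmt.Println(permBits(os.ModeSticky | 0755)) // 493
	fmt.Println(os.FileMode(0666))              // -rw-rw-rw-
}
```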

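Let's turn our attention to syscall.Open(name, flag, perm), which looks much like the C function. That is as deep as I'll dig for now.

Back to the simplified OpenFile and the pieces that remain: PathError and NewFile(uintptr(r), name).

The source of PathError is as follows: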

// PathError records an error and the operation and file path that caused it.
type PathError struct {
    Op   string
    Path string
    Err  error
}

func (e *PathError) Error() string { return e.Op + " " + e.Path + ": " + e.Err.Error() }

error is an interface; any type that implements the Error method satisfies it.
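From the caller's side, the three fields of PathError can be recovered with errors.As. The following is a sketch under the assumption of Go 1.16 or later (for os.MkdirTemp); openMissing and pathErrorOp are hypothetical helpers introduced only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// openMissing tries to open a file that does not exist inside a fresh
// scratch directory and returns the resulting error.
func openMissing() error {
	dir, err := os.MkdirTemp("", "patherror-demo-*")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	_, err = os.Open(filepath.Join(dir, "no-such-file"))
	return err
}

// pathErrorOp returns the Op field if err is (or wraps) an *os.PathError,
// and the empty string otherwise.
func pathErrorOp(err error) string {
	var pe *os.PathError
	if errors.As(err, &pe) {
		return pe.Op
	}
	return ""
}

func main() {
	err := openMissing()
	fmt.Println("op:", pathErrorOp(err))
	fmt.Println("full message:", err)
	// errors.Is looks through the PathError at the underlying errno.
	fmt.Println("is not-exist:", errors.Is(err, os.ErrNotExist))
}
```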

The uintptr in uintptr(r) is defined as follows:

// uintptr is an integer type that is large enough to hold the bit pattern of
// any pointer.
type uintptr uintptr

In uintptr(r), r is an int.

Here is how NewFile is defined:

// NewFile returns a new File with the given file descriptor and name.
func NewFile(fd uintptr, name string) *File {
    fdi := int(fd)
    if fdi < 0 {
        return nil
    }
    f := &File{&file{fd: fdi, name: name}}
    runtime.SetFinalizer(f.file, (*file).close)
    return f
}

In the function above, fd goes full circle and ends up as an int again. File is a wrapper around the file type; the source is as follows:

// File represents an open file descriptor.
type File struct {
    *file // os specific
}

// file is the real representation of *File.
// The extra level of indirection ensures that no clients of os
// can overwrite this data, which could cause the finalizer
// to close the wrong file descriptor.
type file struct {
    fd      int
    name    string
    dirinfo *dirInfo // nil unless directory being read
}

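runtime.SetFinalizer(f.file, (*file).close) in the function above is roughly analogous to a destructor in C++, although a finalizer runs only when the garbage collector reclaims the object rather than at a deterministic point. (That is as far as I'll dig for now.)

The close Function

Calling close closes an open file.

#include <unistd.h>

int close(int fd);

How do we close a file in Go?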

// Close closes the File, rendering it unusable for I/O.
// It returns an error, if any.
func (f *File) Close() error {
    if f == nil {
        return ErrInvalid
    }
    return f.file.close()
}

func (file *file) close() error {
    if file == nil || file.fd == badFd {
        return syscall.EINVAL
    }
    var err error
    if e := syscall.Close(file.fd); e != nil {
        err = &PathError{"close", file.name, e}
    }
    file.fd = -1 // so it can't be closed again

    // no need for a finalizer anymore
    runtime.SetFinalizer(file, nil)
    return err
}

As the code above shows, syscall.Close(file.fd), which resembles close in C, does the key work. Its source is as follows:

// THIS FILE IS GENERATED BY THE COMMAND AT THE TOP; DO NOT EDIT

func Close(fd int) (err error) {
    _, _, e1 := Syscall(SYS_CLOSE, uintptr(fd), 0, 0)
    if e1 != 0 {
        err = errnoErr(e1)
    }
    return
}

Syscall(SYS_CLOSE, uintptr(fd), 0, 0) is presumably an even lower-level call, so I won't dig any further.
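The close method above invalidates the descriptor so that it cannot be closed twice. A minimal sketch of what a caller observes is shown below; closeTwice is a hypothetical helper, os.CreateTemp assumes Go 1.16 or later, and the exact error value returned by the second call may differ between Go versions, so the example only checks that it is non-nil.

```go
package main

import (
	"fmt"
	"os"
)

// closeTwice creates a temporary file and closes it twice, returning the
// error from each Close call.
func closeTwice() (first, second error) {
	f, err := os.CreateTemp("", "close-demo-*")
	if err != nil {
		return err, err
	}
	defer os.Remove(f.Name())

	first = f.Close()  // releases the descriptor
	second = f.Close() // the File is already closed, so this reports an error
	return first, second
}

func main() {
	first, second := closeTwice()
	fmt.Println("first close:", first)
	fmt.Println("second close failed:", second != nil)
}
```

In everyday code the usual pattern is a single defer f.Close() right after a successful open.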

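The lseek Function

Calling lseek explicitly sets the offset of an open file.

#include <unistd.h>

off_t lseek(int fd, off_t offset, int whence);

Parameters of the function above:

fd is the file descriptor
If whence is SEEK_SET, the file's offset is set to offset bytes from the beginning of the file
If whence is SEEK_CUR, the file's offset is set to its current value plus offset; offset may be positive or negative
If whence is SEEK_END, the file's offset is set to the size of the file plus offset; offset may be positive or negative

The same whence values apply in Go, but the os constants that spell them this way are deprecated, as shown here: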

// Seek whence values.
//
// Deprecated: Use io.SeekStart, io.SeekCurrent, and io.SeekEnd.
const (
    SEEK_SET int = 0 // seek relative to the origin of the file
    SEEK_CUR int = 1 // seek relative to the current offset
    SEEK_END int = 2 // seek relative to the end
)

How do we set a file's offset in Go?

// Seek sets the offset for the next Read or Write on file to offset, interpreted
// according to whence: 0 means relative to the origin of the file, 1 means
// relative to the current offset, and 2 means relative to the end.
// It returns the new offset and an error, if any.
// The behavior of Seek on a file opened with O_APPEND is not specified.
func (f *File) Seek(offset int64, whence int) (ret int64, err error) {
    if err := f.checkValid("seek"); err != nil {
        return 0, err
    }
    r, e := f.seek(offset, whence)
    if e == nil && f.dirinfo != nil && r != 0 {
        e = syscall.EISDIR
    }
    if e != nil {
        return 0, f.wrapErr("seek", e)
    }
    }
    return r, nil
}

Clearly f.seek(offset, whence) does the key work.

// seek sets the offset for the next Read or Write on file to offset, interpreted
// according to whence: 0 means relative to the origin of the file, 1 means
// relative to the current offset, and 2 means relative to the end.
// It returns the new offset and an error, if any.
func (f *File) seek(offset int64, whence int) (ret int64, err error) {
    return syscall.Seek(f.fd, offset, whence)
}

syscall.Seek(f.fd, offset, whence) issues a system call; digging any further leads into the lower layers and assembly.
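The three whence values can be tried out with the non-deprecated io constants. This is a minimal sketch: seekDemo is a hypothetical helper, and os.CreateTemp assumes Go 1.16 or later.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// seekDemo writes the 11 bytes "hello world" to a temporary file and returns
// the offsets reported by three Seek calls, one per whence value.
func seekDemo() (start, cur, end int64, err error) {
	f, err := os.CreateTemp("", "seek-demo-*")
	if err != nil {
		return
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if _, err = f.WriteString("hello world"); err != nil {
		return
	}
	// SEEK_SET: 6 bytes from the beginning of the file.
	if start, err = f.Seek(6, io.SeekStart); err != nil {
		return
	}
	// SEEK_CUR: current offset 6 plus -2.
	if cur, err = f.Seek(-2, io.SeekCurrent); err != nil {
		return
	}
	// SEEK_END: file size 11 plus -5.
	end, err = f.Seek(-5, io.SeekEnd)
	return
}

func main() {
	start, cur, end, err := seekDemo()
	if err != nil {
		fmt.Println("seek demo:", err)
		return
	}
	fmt.Println(start, cur, end) // 6 4 6
}
```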

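The read Function

Calling read reads data from an open file.

#include <unistd.h>

ssize_t read(int fd, void *buf, size_t nbytes);

Parameters of the function above:

fd is the file descriptor
buf is the buffer that receives the data; its type is a generic (void) pointer
nbytes is the maximum number of bytes to read

If read succeeds, it returns the number of bytes read; if the end of the file has been reached, it returns 0.

Tips: There are several situations in which the number of bytes actually read is less than the number requested.

How do we read data from an open file in Go?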

// Read reads up to len(b) bytes from the File.
// It returns the number of bytes read and any error encountered.
// At end of file, Read returns 0, io.EOF.
func (f *File) Read(b []byte) (n int, err error) {
    if err := f.checkValid("read"); err != nil {
        return 0, err
    }
    n, e := f.read(b)
    return n, f.wrapErr("read", e)
}

The underlying code is shown above; for more detail, follow the calls down through the Go packages.
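Because a single Read may return fewer bytes than requested, callers normally loop until io.EOF. The sketch below uses a deliberately tiny buffer to force several reads; readAll and roundTrip are hypothetical helpers written for this example, and os.CreateTemp assumes Go 1.16 or later.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// readAll reads f to the end, looping because one Read call may deliver
// fewer bytes than len(buf).
func readAll(f *os.File) ([]byte, error) {
	var out []byte
	buf := make([]byte, 4) // deliberately tiny to force several reads
	for {
		n, err := f.Read(buf)
		out = append(out, buf[:n]...) // consume the n bytes before checking err
		if err == io.EOF {
			return out, nil // end of file is the normal way out of the loop
		}
		if err != nil {
			return out, err
		}
	}
}

// roundTrip writes s to a temporary file, rewinds, and reads it back.
func roundTrip(s string) (string, error) {
	f, err := os.CreateTemp("", "read-demo-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if _, err := f.WriteString(s); err != nil {
		return "", err
	}
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return "", err
	}
	data, err := readAll(f)
	return string(data), err
}

func main() {
	got, err := roundTrip("hello, APUE")
	fmt.Println(got, err)
}
```

The standard library already provides io.ReadAll for this job; the hand-written loop is here only to mirror the read semantics described above.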

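The write Function

Calling write writes data to an open file.

#include <unistd.h>

ssize_t write(int fd, const void *buf, size_t nbytes);

Parameters of the function above:

fd is the file descriptor
buf is the buffer holding the data to write; its type is a generic (void) pointer
nbytes is the number of bytes to write

If write succeeds, it returns the number of bytes written, which is normally equal to nbytes; any other value indicates an error.

How do we write data to an open file in Go?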

func (f *File) Write(b []byte) (n int, err error)
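As a usage sketch for Write, the example below writes a byte slice and then compares the returned count with the file size reported by Stat. The helper writeAndSize is hypothetical, and os.CreateTemp assumes Go 1.16 or later.

```go
package main

import (
	"fmt"
	"os"
)

// writeAndSize writes data to a temporary file and returns the byte count
// reported by Write together with the file size reported by Stat.
func writeAndSize(data []byte) (n int, size int64, err error) {
	f, err := os.CreateTemp("", "write-demo-*")
	if err != nil {
		return 0, 0, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	// os.File.Write returns a non-nil error whenever n != len(data), so
	// checking err is enough to detect a short write.
	n, err = f.Write(data)
	if err != nil {
		return n, 0, err
	}
	info, err := f.Stat()
	if err != nil {
		return n, 0, err
	}
	return n, info.Size(), nil
}

func main() {
	n, size, err := writeAndSize([]byte("hello"))
	fmt.Println(n, size, err) // 5 5 <nil>
}
```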

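Conclusion

Reading APUE on its own is fine for the first few pages, but it gradually becomes hard to keep going. Go's library closely mirrors the Unix interfaces, so I read the book alongside the Go source. Having a rough framework is enough for now; a deeper understanding will come as I work further in.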
