Linux System Programming: File I/O

Posted by elmluo on 2021-05-12

Original URL: https://www.cnblogs.com/elmluo/p/14761271.html

Linux programming

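1. System calls are everywhere

Any operation that touches system resources or can affect other processes needs the operating system to step in, and that happens through system calls. Conceptually a system call is not hard to understand.
It is the programming interface (Application Programming Interface, API) that the operating system implements and exposes to applications, the bridge over which an application exchanges data with the system.

1.1 System calls vs. library functions

A system call is an interface the operating system offers to the layers above it.
A library function is a further wrapper around system calls.
Most applications make system calls indirectly, through the library functions of a high-level language.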
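To make the distinction concrete, here is a minimal sketch that sends a string to standard output in two ways: through the C library (fputs, which buffers in user space) and through the write system call directly. The function names say_with_library and say_with_syscall are made up for this illustration.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Send msg to stdout through the C library; the bytes sit in a
 * user-space buffer until fflush hands them to the kernel. */
int say_with_library(const char *msg)
{
    if (fputs(msg, stdout) == EOF)
        return -1;
    return fflush(stdout) == 0 ? 0 : -1;
}

/* Send msg to stdout with the write system call directly.
 * Returns the number of bytes written, or -1 on error. */
ssize_t say_with_syscall(const char *msg)
{
    return write(STDOUT_FILENO, msg, strlen(msg));
}
```

Calling say_with_library("hi\n") and then say_with_syscall("hi\n") prints the same text twice; only the route to the kernel differs.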

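1.2 A simple view of the call path

The call path of a standard library function and a system call: the application calls the library function, the library function issues the system call, and the kernel talks to the underlying device.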

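2. File I/O functions in the C standard library

fopen, fclose, fseek, fgets, fputs, fread, fwrite, ...
On the command line, man fopen (and likewise for the others) shows the documentation of each standard library function.

2.1 fopen: opening a file

r: read only; r+: read and write; w: write only, truncating to length 0; w+: read and write, truncating to length 0; a: append, write only; a+: append, read and write.
A b may also be added to any of these mode strings, as the man page describes.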

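The fopen function opens the file whose name is the string pointed to by path and associates a stream with it.

       The argument mode points to a string beginning with one of the following sequences (additional characters may follow these sequences):

       r      Open text file for reading. The stream is positioned at the beginning of the file.

       r+     Open for reading and writing. The stream is positioned at the beginning of the file.

       w      Truncate file to zero length or create text file for writing. The stream is positioned at the beginning of the file.

       w+     Open for reading and writing. The file is created if it does not exist, otherwise it is truncated. The stream is positioned at the beginning of the file.

       a      Open for appending (writing at end of file). The file is created if it does not exist. The stream is positioned at the end of the file.

       a+     Open for reading and appending (writing at end of file). The file is created if it does not exist. The initial file position for reading is at the beginning of the file, but output is always appended to the end of the file.

       The mode string can also include the letter ``b'' either as a last character or as a character between the characters in any of the two-character strings described above. This is strictly for compatibility with ANSI X3.159-1989 (``ANSI C'') and has no effect; the ``b'' is ignored on all POSIX conforming systems, including Linux. (Other systems may treat text files and binary files differently, and adding the ``b'' may be a good idea if you do I/O to a binary file and expect that your program may be ported to non-UNIX environments.)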
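The difference between the truncating mode w and the appending mode a is easy to observe by writing to the same file with both and checking its length. The sketch below assumes a scratch file in the current directory; write_with_mode and file_length are helper names invented for this example.

```c
#include <stdio.h>

/* Write text to path using the given fopen mode. Returns 0 on success. */
int write_with_mode(const char *path, const char *mode, const char *text)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    int ok = fputs(text, fp) != EOF;
    if (fclose(fp) != 0)        /* fclose flushes; a failure here loses data */
        ok = 0;
    return ok ? 0 : -1;
}

/* Return the length of the file in bytes, or -1 if it cannot be opened. */
long file_length(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    long len = -1;
    if (fseek(fp, 0L, SEEK_END) == 0)
        len = ftell(fp);
    fclose(fp);
    return len;
}
```

Writing "12345" with mode w and then "678" with mode a leaves 8 bytes in the file; writing "9" with mode w afterwards leaves 1 byte, because w truncates first.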

2.2 Character I/O with fgetc and fputc

Compile and run it, then check the output file and what is printed to the console.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// Character-at-a-time I/O with fgetc() and fputc()
void test01()
{
  // Write the file
  // Open for reading and writing; the file is created if it does not exist
  FILE *f_write = fopen("./test01.txt", "w+");
  if (f_write == NULL)
  {
    return;
  }
  char buf[] = "Read and write as characters";
  for (size_t i = 0; i < strlen(buf); i++)
  {
    fputc(buf[i], f_write);
  }

  // fclose flushes the buffer
  fclose(f_write);

  // Read the file back
  FILE *f_read = fopen("./test01.txt", "r");
  if (f_read == NULL)
  {
    return;
  }
  int ch; // fgetc returns int so that EOF can be told apart from every valid byte
  while ((ch = fgetc(f_read)) != EOF)
  {
    printf("%c", ch);
  }
  fclose(f_read);
}

int main(int argc, char *argv[])
{
  test01();
}

2.3 Line I/O with fgets and fputs

Compile and run it, then check the output file and what is printed to the console.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void test02()
{
  // Write the file
  // Open for writing
  FILE *f_write = fopen("./test02.txt", "w");
  if (f_write == NULL)
  {
    return;
  }
  const char *lines[] = {
      "hello world\n",
      "hello world1\n",
      "hello world2\n"};
  int len = sizeof(lines) / sizeof(lines[0]);
  for (int i = 0; i < len; i++)
  {
    fputs(lines[i], f_write);
  }
  fclose(f_write);

  // Read the file back
  FILE *f_read = fopen("./test02.txt", "r");
  if (f_read == NULL)
  {
    return;
  }
  char buf[1024];
  // fgets returns NULL at end of file or on error, so its return value drives
  // the loop (feof() only turns true after a read has already failed)
  while (fgets(buf, sizeof(buf), f_read) != NULL)
  {
    printf("%s", buf);
  }
  fclose(f_read);
}

int main(int argc, char *argv[])
{
  test02();
}

2.4 Block I/O with fread and fwrite

Mainly used for user-defined data types, which can be read and written in binary form.
Compile and run it, then check the output file and what is printed to the console.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Block I/O for user-defined types in binary form: fread() and fwrite()
void test03()
{
  // Write the file
  FILE *f_write = fopen("./test03.txt", "wb");
  if (f_write == NULL)
  {
    return;
  }

  // A user-defined structure type
  struct Person
  {
    char name[16];
    int age;
  };

  struct Person persons[5] =
      {
          {"zhangsan", 25},
          {"lisi", 25},
          {"wangwu", 25},
          {"zhuliu", 25},
          {"zhuoqi", 25},
      };
  int len = sizeof(persons) / sizeof(struct Person);
  // Arguments: data address, size of one block, number of blocks, stream.
  // A single call writes the whole array; fwrite returns the number of blocks written.
  if (fwrite(persons, sizeof(struct Person), len, f_write) != (size_t)len)
  {
    perror("fwrite error");
  }
  fclose(f_write);

  // Read the file back
  FILE *f_read = fopen("./test03.txt", "rb");
  if (f_read == NULL)
  {
    return;
  }
  struct Person temp[5];
  size_t got = fread(temp, sizeof(struct Person), len, f_read);
  for (size_t i = 0; i < got; i++)
  {
    printf("name: %s, age: %d \n", temp[i].name, temp[i].age);
  }
  fclose(f_read);
}

int main(int argc, char *argv[])
{
  test03();
}

2.5 Formatted I/O with fprintf and fscanf

Compile and run it, then check the output file and what is printed to the console.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void test04()
{
  // Write the file
  FILE *f_write = fopen("./test04.txt", "w");
  if (f_write == NULL)
  {
    return;
  }
  fprintf(f_write, "hello world %d year - %d - month %d - day", 2008, 8, 8);
  fclose(f_write);

  // Read the file back
  FILE *f_read = fopen("./test04.txt", "r");
  if (f_read == NULL)
  {
    return;
  }
  char buf[1024] = {0};

  // fscanf returns the number of items converted; stop when no more words can be read.
  // The width 1023 keeps a long word from overflowing buf.
  while (fscanf(f_read, "%1023s", buf) == 1)
  {
    printf("%s ", buf);
  }
  fclose(f_read);
}


int main(int argc, char *argv[])
{
  test04();
}

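3. The open and close system calls

3.1 Looking up the functions in the man pages

int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
int close(int fd);
Arguments: the file path, the access flags, and the permission bits (used together with O_CREAT; written in octal, for example 0664).

3.2 The flags argument of open

Defined in the header fcntl.h.
O_RDONLY: read only
O_WRONLY: write only
O_RDWR: read and write
O_APPEND: append
O_CREAT: use the file if it exists, create it if it does not
O_EXCL: used together with O_CREAT; create the file if it does not exist, fail with an error if it does
O_TRUNC: truncate the file to length 0
O_NONBLOCK: operate in non-blocking mode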
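3.3 The mode argument of open is not the file's final permission

The octal mode given when creating a file is still subject to the umask. Run umask on the command line to see its value.

The man page of the standard library function fopen mentions the umask as well.

Formula: actual permission of the new file = mode & ~umask

For example, with mode = 777 and a system umask of 002, ~umask gives 775, so the file is actually created with permission 777 & 775 = 775.

  // Step by step
  actual file permission
  ⬇
  mode & ~umask
  ⬇
  777 & ~(002)
  ⬇
  777 & 775
  ⬇
  775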
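The formula mode & ~umask can be written out directly in C. The sketch below uses two helper names invented for this example: effective_mode applies the formula to any mode and mask, and current_umask reads the calling process's umask (the umask function can only report the old value while setting a new one, so the old value is restored straight away).

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Permission bits a newly created file actually gets: mode & ~umask. */
mode_t effective_mode(mode_t requested, mode_t mask)
{
    return requested & ~mask & 0777;
}

/* Read the process umask without leaving it changed. */
mode_t current_umask(void)
{
    mode_t old = umask(0);  /* returns the previous mask */
    umask(old);             /* put it back immediately */
    return old;
}
```

With the values from the text, effective_mode(0777, 0002) yields 0775; with the common combination of mode 0666 and umask 0022 it yields 0644.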

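3.4 Common open errors

The file to open does not exist.
Opening a read-only file for writing (the file lacks the matching permission).
Opening a directory write-only.

3.5 Opening a file with the open system call

Compile, run and look at the output.

#include <unistd.h> // close
#include <fcntl.h>  // open and the O_* flags
#include <stdio.h>
#include <errno.h> // errno

int main(int argc, char *argv[])
{

    // int fd = open("./dict.back", O_RDONLY | O_CREAT | O_TRUNC, 0777);

    int fd = open("./demo.txt", O_RDWR | O_TRUNC);
    int saved_errno = errno; // keep the value before another call can change it

    printf("fd=%d \n", fd);

    if (fd != -1)
    {
        printf("open success\n");
        close(fd);
        // Using fd after this close would fail with errno = 9 (EBADF).
    }
    else
    {
        // On failure open sets errno to a code that identifies the specific error.
        printf("errno=%d \n", saved_errno);
        printf("open failure\n");
    }

    return 0;
}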

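4. PCB, file descriptor table and file structure

4.1 How the file descriptor table, the file structure and the PCB structure relate

The PCB (task_struct) holds a pointer to the file descriptor table, and each entry in that table points to a file structure.

4.2 The task_struct structure

On the console, locate /include/linux/sched.h finds the header; if locate is not installed, install it from the command line as the system suggests.
For example, the file may be found at /usr/src/kernels/3.10.0-1160.11.1.el7.x86_64/include/linux/sched.h.
Opening it shows that task_struct stores files, a pointer to the file descriptor table.

4.3 The file descriptor table

In the header sched.h, the PCB structure has a member struct files_struct *files that points to the file descriptor table.
From an application's point of view, the table can be pictured as an array of pointers: the index [0/1/2/3/4...] leads to the matching file structure.
In essence it is a set of key-value pairs, with [0/1/2/3/4...] each mapped to the address of a specific file structure.
The mapping is automatic: given the index, the kernel finds the file structure that belongs to it.
Opening a new file returns the smallest descriptor not currently in use in the table; the kernel manages this itself.
Three descriptors are open by default; to use them, use the macros the system defines.
- 0 -> macro STDIN_FILENO, standard input.
- 1 -> macro STDOUT_FILENO, standard output.
- 2 -> macro STDERR_FILENO, standard error.
In files_struct, the member fd_array is the array of file pointers indexed by descriptor.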

struct files_struct
{
    // reference count
    atomic_t count;
    ...
    // array of file pointers, indexed by file descriptor
    struct file *fd_array[NR_OPEN_DEFAULT];
};
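The rule that a newly opened file gets the smallest unused descriptor can be checked from user space. The sketch below assumes a single-threaded program and uses /dev/null only because it can always be opened; lowest_fd_is_reused is a name made up for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Open /dev/null twice, close the first descriptor, then open once more.
 * Returns 1 if the third open reuses the number that was just freed,
 * 0 if it does not, and -1 if /dev/null could not be opened.
 */
int lowest_fd_is_reused(void)
{
    int first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;
    int second = open("/dev/null", O_RDONLY);
    if (second < 0) {
        close(first);
        return -1;
    }

    close(first);                             /* frees the lower slot */
    int third = open("/dev/null", O_RDONLY);  /* expected to land in that slot */

    int reused = (third == first);
    close(second);
    if (third >= 0)
        close(third);
    return reused;
}
```

In a program that has only the three standard descriptors open, first is 3 and second is 4, and the third open returns 3 again.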

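4.4 The file structure

The C library's FILE structure mainly holds three things: a file descriptor, the read/write position and an I/O buffer.
For every open of a file, the kernel maintains a file structure of its own, which it uses to operate on the file.
The definition can be located from the command line with locate /include/linux/fs.h.
vim /usr/src/kernels/3.10.0-1160.11.1.el7.x86_64/include/linux/fs.h

Some commonly used members, as examples:

// the inode of the file, which holds its attributes
struct inode            *f_inode;       /* cached value */

// table of function pointers for operating on the file contents
const struct file_operations    *f_op;

// reference count of this file structure
atomic_long_t           f_count;

// open flags such as O_RDONLY, O_NONBLOCK, O_SYNC
unsigned int            f_flags;

// access mode the file was opened with
fmode_t                 f_mode;

// current file offset
loff_t                  f_pos;

// owner that receives I/O signals for this file
struct fown_struct      f_owner;

...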

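4.5 Maximum number of open files

By default a single process can have 1024 files open. ulimit -a shows this as open files, 1024 by default.

The (-n) option shown there changes the limit for the current shell process, for example ulimit -n 4096.
This only affects the running shell: after leaving it and starting a new one, the limit is back at the original 1024.

The value can be changed permanently through the system configuration file (not recommended).
vim /etc/security/limits.conf and edit it following the format described there.

cat /proc/sys/fs/file-max shows the maximum number of files the whole machine can have open, which depends on the amount of memory.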
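A program can query the same per-process limit that ulimit -n reports by calling getrlimit with RLIMIT_NOFILE. This is a minimal sketch; open_file_limits is a helper name chosen for this example.

```c
#include <sys/resource.h>

/*
 * Fetch the limits on open file descriptors for the calling process.
 * *soft is the value `ulimit -n` shows; *hard is the ceiling the soft
 * limit may be raised to. Returns 0 on success, -1 on error.
 */
int open_file_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;
    *soft = lim.rlim_cur;
    *hard = lim.rlim_max;
    return 0;
}
```

On a system with the default described above, the soft value comes back as 1024.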

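5. The read and write system calls

5.1 Looking up the functions in the man pages

ssize_t read(int fd, void *buf, size_t count);
ssize_t write(int fd, const void *buf, size_t count);
read and write look alike, but the third argument means different things: for read it is the size of the buffer, for write it is the number of bytes that should actually be written.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    char buf[1024];
    ssize_t ret = 0;
    int fd = open("./dict.txt", O_RDWR);
    if (fd < 0) {
        perror("open ./dict.txt error");
        exit(1);
    }

    // read returns 0 at end of file and -1 on error
    while ((ret = read(fd, buf, sizeof(buf))) > 0) {
        write(STDOUT_FILENO, buf, ret);
    }

    close(fd);
    return 0;
}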
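The loop above passes the byte count returned by read on to write. One detail it leaves out is that write itself returns how many bytes it accepted, which can be fewer than requested (on pipes, sockets or terminals, for instance). A common pattern, sketched here under the helper name write_all (not part of the original post), is to loop until everything has been written.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Write exactly len bytes from buf to fd, retrying after short writes
 * and after interruption by a signal (EINTR).
 * Returns 0 on success, -1 on error with errno set by write.
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* nothing was written: try again */
            return -1;
        }
        p += n;                 /* skip what the kernel accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

In the copy loop, write(STDOUT_FILENO, buf, ret) could then be replaced by write_all(STDOUT_FILENO, buf, (size_t)ret).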

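5.2 What buffers are for

Suppose a file copy reads just one byte at a time. Which is more efficient: read and write, or the corresponding standard library functions fgetc and fputc?
The call order is standard library function, then system call, then the underlying device. Every system call involves a privilege-level switch, which is relatively expensive.
So for byte-at-a-time I/O the standard library functions should in theory be faster than calling the system calls directly; the next two subsections show why.

5.2.1 The library (user-space) buffer used by fgetc and fputc

Path: fputc --> library buffer --> write system call --> disk
The standard library keeps a buffer of its own, typically 4096 bytes.
write (with its switch from user space to kernel space) is only called once per full buffer, flushing 4096 bytes at a time.
The example below implements a file copy with fgetc and fputc.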

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    FILE *fp, *fp_out;
    int n;
    
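    // Open with the standard library
    // r: read only; r+: read and write
    fp = fopen("./from.txt", "r");
    if (fp == NULL) {
        perror("fopen error");
        exit(1);
    }
    
    // w: write only, truncating to 0; w+: read and write, truncating to 0
    fp_out = fopen("./to.txt", "w");
    if (fp_out == NULL) {
        perror("fopen error");
        exit(1);
    }

    // Bytes go into the library buffer first (4096 bytes); only when it is
    // full is the system call made to write them out.
    while ((n = fgetc(fp)) != EOF) {
        fputc(n, fp_out);
    }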

    fclose(fp);
    fclose(fp_out);

    return 0;
}

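5.2.2 The kernel buffer used by the read and write system calls

Path: write system call --> disk
The kernel has a buffer as well, 4096 bytes by default.
Data written to a file goes into this buffer first and is flushed to disk once the buffer fills.
write (each call switches from user space to kernel space, and each switch is costly; writing one byte per call means a very large number of switches, which is slow).
read and write are also called Unbuffered I/O, meaning there is no user-level buffer. That does not mean the kernel buffer is bypassed.
The example below implements a file copy with read and write.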

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>

// Size of the buffer buf.
// #define N 1024
#define N 1

int main(int argc, char *argv[]) {
    int fd, fd_out;
    int n;
    char buf[N];

    fd = open("from.txt", O_RDONLY);
    if (fd < 0) {
        perror("open from.txt error");
        exit(1);
    }

    fd_out = open("to.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd_out < 0) {
        perror("open to.txt error");
        exit(1);
    }

    while ((n = read(fd, buf, N))) {
        if (n < 0) {
            perror("read error");
            exit(1);
        }
        if (write(fd_out, buf, n) < 0) {
            perror("write error");
            exit(1);
        }
    }

    close(fd);
    close(fd_out);

    return 0;
}

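5.3 Can the standard library functions fully replace system calls?

The library functions cut down the number of privilege-level switches and so beat byte-at-a-time system calls, but they still cannot replace system calls completely.
One example is anything that must stay real-time, such as instant-messaging software like QQ and WeChat, where data has to go out at once instead of waiting in a buffer.

5.4 Read-ahead and delayed write

Switching from user space to kernel space is relatively expensive, so caching is used to make reads and writes more efficient.
Read-ahead: on file input, if the program asks for 100 bytes, the kernel first fills its 4096-byte (4 KB) buffer from disk, and the next read is served from that buffer.
Delayed write: on file output, if the program writes 100 MB to disk, the data fills the 4096-byte (4 KB) kernel buffer first, and the kernel then flushes it to disk one buffer at a time.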

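6. System error-handling functions

6.1 The exit function

Header: stdlib.h
The meaning of the argument is a convention among developers, for example 0 for a normal exit and 1 for an abnormal one. The system itself does not enforce it.

...
if (fd < 0) {
  perror("open to.txt error");
  exit(1);  // 1 means failure, by agreement among the developers
}

while ((n = read(fd, buf, N))) {
    if (n < 0) {
        perror("read error");
        exit(1);  // 1 means failure
    }
    write(fd_out, buf, n);
}

...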

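6.2 The error number errno

Each kind of error has its own number and a description that goes with the number.
Header: errno.h
Where the values are defined: /usr/include/asm-generic/errno-base.h and /usr/include/asm-generic/errno.h

...
// For example, open a file that does not exist and print the errno value
fd = open("test", O_RDONLY);
if (fd < 0)
{
    printf("errno = %d\n", errno);
    exit(1);
}
...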

6.3 The perror function

Prints the given string to standard error, followed by the text description of the current errno.
void perror(const char *s)

...
// Open a directory for writing
// fd = open("testdir", O_RDWR);
fd = open("testdir", O_WRONLY);
if (fd < 0)
{
    perror("open testdir error");
    exit(1);
}
...

6.4 The strerror function

Returns the description that goes with an error number.
Header: string.h
char *strerror(int errnum);

printf("open testdir error: %s\n", strerror(errno));
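As a small illustration of errno and strerror working together, the sketch below tries to open a path and, on failure, builds a message of the form "path: reason". The function name explain_open and the message format are choices made for this example only. Note that errno is copied right after the failing call, since later library calls may overwrite it.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Try to open path read-only.
 * On success: writes "path: ok" into msg and returns 0.
 * On failure: writes "path: <strerror text>" into msg and returns the errno value.
 */
int explain_open(const char *path, char *msg, size_t msg_len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        int saved = errno;  /* copy at once, before any other call */
        snprintf(msg, msg_len, "%s: %s", path, strerror(saved));
        return saved;
    }
    close(fd);
    snprintf(msg, msg_len, "%s: ok", path);
    return 0;
}
```

For a path that does not exist the return value is ENOENT, and the message ends with the text perror would have printed for the same error.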

6.5 Error-handling example

#include <unistd.h> // read write close
#include <fcntl.h>  // open O_WRONLY O_RDONLY O_CREAT O_RDWR
#include <stdlib.h> // exit
#include <errno.h>
#include <stdio.h> // perror
#include <string.h>

int main(void)
{
    int fd;
#if 0
    // Open a file that does not exist
    // fd = open("test", O_RDONLY | O_CREAT);
    fd = open("test", O_RDONLY);
    if (fd < 0)
    {
        printf("errno = %d\n", errno);
        printf("open test error: %s\n", strerror(errno));
        exit(1);
    }
    printf("open success");
#elif 0
    // Open a file without the needed permission (write-only open of a file that only has read permission)
    fd = open("test", O_WRONLY);
    // fd = open("test", O_RDWR);
    if (fd < 0)
    {
        printf("errno = %d\n", errno);
        perror("open test error");
        exit(1);
    }
    printf("open success");

#endif
#if 1
    // Open a directory for writing
    // fd = open("testdir", O_RDWR);
    fd = open("testdir", O_WRONLY);
    if (fd < 0)
    {
        perror("open testdir error");
        exit(1);
    }
#endif

    return 0;
}

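7. Blocking and non-blocking

7.1 The concepts of blocking and non-blocking

Reading a regular file never blocks: however many bytes are requested, read returns within a bounded time. Reading from a terminal device or the network is different. If the data typed at the terminal has no newline yet, a read on the terminal blocks; if no packet has arrived from the network, a read on the network blocks, and for how long is not known: with no data arriving, the call simply stays blocked. Likewise, writing a regular file does not block, while writing to a terminal device or the network may.

To pin down what blocking (Block) means: when a process calls a blocking system function, the process is put into the sleep (Sleep) state and the kernel schedules other processes to run. Only when the event it is waiting for happens (a packet arrives from the network, say, or the time passed to sleep has elapsed) can it run again. The opposite of the sleep state is the running (Running) state, and in the Linux kernel a process in the running state is in one of two situations:

Being executed: the CPU is in the context of this process, the program counter (eip) holds the address of one of its instructions, the general-purpose registers hold the intermediate results of its computation, its instructions are being executed and its address space is being read and written.

Ready: the process is not waiting for any event and could run at any moment, but the CPU is still busy with another process, so this one waits in a ready queue to be scheduled by the kernel. Several processes may be ready at the same time, so which one runs? The kernel's scheduling algorithm is based on priorities and time slices, and it adjusts each process's priority and time slice dynamically according to how the process behaves, so that every process gets a reasonably fair chance to run. It also has to look after user experience: processes that interact with the user must not respond too slowly.

7.2 Terminal devices

File descriptors: STDIN_FILENO, STDOUT_FILENO, STDERR_FILENO.
All three refer to one device file, /dev/tty.
Typing at the console into the device file is a blocking process: a read on STDIN_FILENO waits for the user's input.
Blocking versus non-blocking is a matter for device files (and network files), not regular files.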

7.3 Blocking read from the terminal

...
   // Blocking is the default, so no extra flag is needed
   fd = open("/dev/tty", O_RDONLY);

    if (fd < 0)
    {
        perror("open /dev/tty");
        exit(1);
    }
    else
    {
        printf("fd: %d", fd);
    }
...

7.4 Non-blocking read from the terminal (O_NONBLOCK)

#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MSG_TRY "try again\n"

int main(void)
{
    char buf[10];
    int fd, n;

    // Blocking is the default
    // fd = open("/dev/tty", O_RDONLY);

    // The O_NONBLOCK flag makes reads from the terminal non-blocking
    fd = open("/dev/tty", O_RDONLY | O_NONBLOCK);

    if (fd < 0)
    {
        perror("open /dev/tty");
        exit(1);
    }
    else
    {
        printf("fd: %d", fd);
    }
tryagain:

    // read returns -1 on failure; errno == EAGAIN or EWOULDBLOCK means "no data yet"
    n = read(fd, buf, 10);

    if (n < 0)
    {
        // Because O_NONBLOCK was given to open, a read on the device with no data
        // available returns -1 and sets errno to EAGAIN or EWOULDBLOCK
        
        if (errno != EAGAIN)
        {
            perror("read /dev/tty");
            exit(1);
        }
        sleep(3);
        write(STDOUT_FILENO, MSG_TRY, strlen(MSG_TRY));
        goto tryagain;
    }
    write(STDOUT_FILENO, buf, n);
    close(fd);

    return 0;
}

7.5 Non-blocking read from the terminal with a timeout

#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>

#define MSG_TRY "try again\n"
#define MSG_TIMEOUT "time out\n"

int main(int argc, char *argv[])
{
    char buf[10];
    int i;
    int fd;
    int n;
    
    // Open in non-blocking mode with O_NONBLOCK
    fd = open("/dev/tty", O_RDONLY | O_NONBLOCK); 

    if (fd < 0)
    {
        perror("open /dev/tty");
        exit(1);
    }

    printf("open /dev/tty success ... %d \n", fd);

    // timeout
    for (i = 0; i < 5; ++i)
    {
        n = read(fd, buf, 10);
        if (n > 0)
        {
            // Got some data: leave the loop
            break;
        }
        if (n < 0 && errno != EAGAIN)
        {
            // A real error, not EAGAIN / EWOULDBLOCK
            perror("read /dev/tty");
            exit(1);
        }
        sleep(1);
        write(STDOUT_FILENO, MSG_TRY, strlen(MSG_TRY));
    }
    if (i == 5)
    {
        write(STDOUT_FILENO, MSG_TIMEOUT, strlen(MSG_TIMEOUT));
    }
    else
    {
        write(STDOUT_FILENO, buf, n);
    }
    close(fd);
    return 0;
}

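7.6 Return values of read

7.6.1 Returns > 0

The number of bytes actually read.

7.6.2 Returns 0

End of file was reached.

7.6.3 Returns -1

errno != EAGAIN (and != EWOULDBLOCK): read failed.
errno == EAGAIN (or == EWOULDBLOCK): read itself is fine, there is simply no data yet.
- This happens when a device file is read in non-blocking mode and no data has arrived.
- EAGAIN: "try again", reported as Resource temporarily unavailable; the resource is briefly unavailable and retrying a little later may succeed.
- EWOULDBLOCK: the operation would have blocked in non-blocking mode; on Linux it has the same value as EAGAIN.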
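The cases above can be folded into one small helper that performs a single read and reports which case occurred. This is only a sketch: the enum read_status and the function read_once are names invented here, not part of any library.

```c
#include <errno.h>
#include <unistd.h>

enum read_status { READ_DATA, READ_EOF, READ_NO_DATA_YET, READ_ERROR };

/*
 * Perform one read on fd and classify the result.
 * *out_n receives the number of bytes read (0 unless READ_DATA is returned).
 */
enum read_status read_once(int fd, void *buf, size_t len, ssize_t *out_n)
{
    ssize_t n = read(fd, buf, len);
    *out_n = n > 0 ? n : 0;
    if (n > 0)
        return READ_DATA;           /* n bytes were read */
    if (n == 0)
        return READ_EOF;            /* end of file */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return READ_NO_DATA_YET;    /* non-blocking fd, nothing to read yet */
    return READ_ERROR;              /* a real failure; errno has the cause */
}
```

A polling loop like the ones in 7.4 and 7.5 would sleep and retry on READ_NO_DATA_YET and stop on any of the other three results.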

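8. The lseek function

8.1 File offset

On Linux the lseek system call changes the file offset (the read/write position).
Every open file records its current read/write position. It is 0, the start of the file, when the file is opened, and reading or writing some number of bytes normally moves it forward by that many bytes.
There is one exception: if the file was opened with O_APPEND, every write appends at the end of the file and then moves the position to the new end.
lseek is similar to fseek from the standard I/O library: it moves the current read/write position (also called the offset).

8.2 The standard library fseek function

int fseek(FILE *stream, long offset, int whence)
Common values for whence: SEEK_SET, SEEK_CUR, SEEK_END
Returns 0 on success and -1 on failure.
PS: seeking past the end of the file returns 0; seeking back before the start of the file returns -1.

8.3 The lseek system call

off_t lseek(int fd, off_t offset, int whence)
Common values for whence: SEEK_SET, SEEK_CUR, SEEK_END
Returns -1 on failure; on success returns the resulting offset measured from the start of the file.
PS: lseek allows the offset to be set past the end of the file, and the file grows accordingly once data is written there. Reads and writes on a descriptor share the same offset.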

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
    int fd;
    int n;
    int ret;

    char msg[] = "It's a test for lseek \n";
    char ch;

    fd = open("lseek.txt", O_RDWR | O_CREAT, 0644);

    if (fd < 0)
    {
        perror("open lseek.txt error");
        exit(1);
    }

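    // Write through fd; afterwards the offset sits at the end of what was written.
    write(fd, msg, strlen(msg));

    // Reposition the offset to 12 bytes from the start of the file. The new offset is returned.
    ret = lseek(fd, 12, SEEK_SET);

    printf("offset len: %d \n", ret);

    while ((n = read(fd, &ch, 1)))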
    {
        if (n < 0)
        {
            perror("read error");
            exit(1);
        }

        // Read the text byte by byte and write it to the screen
        write(STDOUT_FILENO, &ch, n);
    }

    close(fd);
    
    return 0;
}

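8.4 Common lseek operations

Reads and writes of a file share one offset, so after writing a file the offset has to be repositioned before reading it back.
PS: the offset lseek returns is always relative to the start of the file.

8.4.1 Extending a file with lseek

Only a write actually extends the file.
lseek alone does not extend it; one real I/O operation has to follow.
Typically something like write(fd, "c", 1); supplies that I/O operation.
od -tcx filename shows the file contents in hexadecimal.
od -tcd filename shows the file contents in decimal.

8.4.2 The truncate function

Truncates a file to a specific length, given the file's path.
int truncate(const char *path, off_t length)
The file must be writable.
Returns 0 on success; on failure returns -1 and sets errno.

8.4.3 The ftruncate function

Truncates a file to a specific length, given a file descriptor.
The file must already be open, with write permission.
int ftruncate(int fd, off_t length)
Returns 0 on success; on failure returns -1 and sets errno.

8.4.4 Getting the file size with lseek

off_t ret = lseek(fd, 0, SEEK_END);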
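Two of the operations above, getting the size and extending a file, fit in a short sketch. file_size seeks to the end to learn the size and then restores the previous offset; grow_with_hole seeks past the start of a fresh file and writes one byte, which is the step that actually extends it. Both function names and the scratch file path passed in by the caller are assumptions made for this example.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Size of the open file in bytes, or -1 on error (for example on a pipe).
 * The current offset is put back before returning. */
off_t file_size(int fd)
{
    off_t here = lseek(fd, 0, SEEK_CUR);    /* remember the current offset */
    if (here == (off_t)-1)
        return -1;
    off_t end = lseek(fd, 0, SEEK_END);     /* offset of the end == size */
    if (end == (off_t)-1)
        return -1;
    if (lseek(fd, here, SEEK_SET) == (off_t)-1)
        return -1;
    return end;
}

/* Create (or truncate) path, seek gap bytes from the start and write one
 * byte there. Returns the resulting size, gap + 1, or -1 on error. */
off_t grow_with_hole(const char *path, off_t gap)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    off_t size = -1;
    if (lseek(fd, gap, SEEK_SET) != (off_t)-1   /* seeking alone changes nothing */
        && write(fd, "c", 1) == 1)              /* the write extends the file */
        size = file_size(fd);
    close(fd);
    return size;
}
```

With a gap of 999 the file ends up 1000 bytes long, and od -tcx on it shows zero bytes followed by the single c.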

8.4.5 A combined example

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>

int main(int argc, char *argv[])
{
    int fd;
    int ret_len;
    int ret_truncate;

    fd = open("lseek.txt", O_RDWR | O_TRUNC | O_CREAT, 0664);
    if (fd < 0)
    {
        perror("open lseek.txt error");
        exit(1);
    }

    // Gives the file length: seek to offset 0 from the end; the return value is the offset from the start of the file
    ret_len = lseek(fd, 0, SEEK_END);

    if (ret_len == -1)
    {
        perror("lseek error");
        exit(1);
    }

    printf("len of msg = %d\n", ret_len);

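    // truncate(const char *path, off_t length): truncate the file to a given length; the file must be writable; returns 0 on success, -1 on failure

    // ftruncate(int fd, off_t length): truncate the file to a given length; the file must be open; returns 0 on success, -1 on failure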

    ret_truncate = ftruncate(fd, 1800);

    if (ret_truncate == -1)
    {
        perror("ftruncate error");
        exit(1);
    }

    printf("ftruncate file success, and ret_truncate is %d \n", ret_truncate);
#if 1

    ret_len = lseek(fd, 999, SEEK_SET);
    if (ret_len == -1)
    {
        perror("lseek seek_set error");
        exit(1);
    }

    int ret = write(fd, "a", 1);
    if (ret == -1)
    {
        perror("write error");
        exit(1);
    }

#endif

#if 0
    off_t cur = lseek(fd, -10, SEEK_SET);
    printf(" ****** %ld \n", cur);
    if (cur == -1) {
        perror("lseek error");
        exit(1);
    }
#endif
    close(fd);
    return 0;
}

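9. The fcntl function

Header: fcntl.h
File control: changes the access-control properties of a file that is already open, without having to reopen it.
int fcntl(int fd, int cmd, ... /* arg */ )
The two commands to know well are F_GETFL and F_SETFL.

9.1 F_GETFL (get file flags)

Gets the status flags of the file the descriptor refers to.

9.2 F_SETFL (set file flags)

Sets the status flags of the file the descriptor refers to.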

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>

#define MSG_TRY "try again \n"

int main(int argc, char *argv[])
{
    char buf[10];
    int flags;
    int n;

    // Get the status flags of stdin
    flags = fcntl(STDIN_FILENO, F_GETFL);
    if (flags == -1)
    {
        perror("fcntl error");
        exit(1);
    }

    // OR in the non-blocking flag (so the file does not have to be reopened with new flags)
    flags |= O_NONBLOCK;
    
    int ret = fcntl(STDIN_FILENO, F_SETFL, flags);
    if (ret == -1)
    {
        perror("fcntl error");
        exit(1);
    }

tryagain:
    n = read(STDIN_FILENO, buf, 10);
    if (n < 0)
    {
        if (errno != EAGAIN)
        {
            perror("read /dev/tty");
            exit(1);
        }
        sleep(3);
        write(STDOUT_FILENO, MSG_TRY, strlen(MSG_TRY));
        goto tryagain;
    }
    write(STDOUT_FILENO, buf, n);
    return 0;
}
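The F_GETFL / F_SETFL pair from the program above is often wrapped in a small helper so that any descriptor can be switched to non-blocking mode. A minimal sketch follows; set_nonblocking and is_nonblocking are names chosen for this example.

```c
#include <fcntl.h>

/* Add O_NONBLOCK to fd's status flags, leaving the other flags untouched.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return 0;
}

/* Returns 1 if O_NONBLOCK is set on fd, 0 if it is not, -1 on error. */
int is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) != 0;
}
```

Reading the flags first and OR-ing in the new bit matters: passing O_NONBLOCK alone to F_SETFL would clear flags such as O_APPEND that were already set.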

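10. The ioctl function

Header: sys/ioctl.h; locate sys/ioctl.h finds the file.
Mainly used with device drivers, to manage a device's I/O channel and control device characteristics.
Commonly used to get the physical characteristics of a file; which characteristics exist differs from one file type to another.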

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ioctl.h>

int main(void) {
    // A structure that holds the window size.
    struct winsize size;
    
    // isatty returns 0 if the descriptor is not a terminal
    if (isatty(STDOUT_FILENO) == 0) {
        exit(1);
    }
    
    if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &size) < 0) {
        perror("ioctl TIOCGWINSZ error");
        exit(1);
    }
    
    // Print the console's rows and columns
    printf("%d rows, %d columns \n", size.ws_row, size.ws_col);
    
    return 0;
}
