A few words up front: a nervous first translation
This is the first time I have tried my hand at translating a foreign-language article. Reading a text and translating it are two different things; what you are looking at is version 3.0…
Thanks to 依雲 for the "信雅達" pointers and all the annotations, to 依雲 and 傳奇老師 for the final round of corrections, and to H 老師 for sharing the article in the first place~
If you spot anything that has been translated poorly, corrections are very welcome. Thank you! (///▽///)
The translation begins
What's the simplest Unix command you know?
There's echo, which prints a string to stdout, and true, which always terminates with an exit code of 0.
Among the ranks of simple Unix commands there is also yes. If you run it without arguments, you get an infinite stream of y's, separated by newlines:
y
y
y
y
(...you get the idea)
What seems pointless at first turns out to be pretty useful:
yes | sh boring_installation.sh
Ever installed a program that asked you to type "y" and hit enter to keep going? yes to the rescue! It will dutifully take care of that for you, so you can keep watching Pootie Tang (a comedy).
Writing yes
Here's a basic version in… uhm… BASIC:
10 PRINT "y"
20 GOTO 10
And here's the same thing in Python:
while True:
    print("y")
Simple, eh? Not so quick!
It turns out that this program is quite slow:
python yes.py | pv -r > /dev/null
[4.17MiB/s]
Compare that with the built-in version on my Mac:
yes | pv -r > /dev/null
[34.2MiB/s]
So I tried to write a faster version in Rust. Here's my first attempt:
use std::env;

fn main() {
    let expletive = env::args().nth(1).unwrap_or("y".into());
    loop {
        println!("{}", expletive);
    }
}
Some explanations:
- The string we want to print in the loop is the first command-line parameter and is called expletive. I learned that word from the yes manpage.
- I use unwrap_or to get the expletive from the parameters; in case no parameter is set, "y" is used as the default (a small sketch of just this line follows the list).
- The default value gets converted from a string slice (&str) into an owned string on the heap (String) using into().
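As a side note, here is a minimal, stand-alone sketch of just that argument-handling line (the printed message is my own, purely for illustration, and not part of the original program):

use std::env;

fn main() {
    // env::args().nth(1) yields an Option<String>: Some(arg) if a first
    // argument was passed, None otherwise. unwrap_or supplies the fallback,
    // and "y".into() turns the &str literal into an owned String so both
    // branches have the same type.
    let expletive: String = env::args().nth(1).unwrap_or("y".into());
    println!("would repeat: {}", expletive);
}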
Let's test it:
cargo run --release | pv -r > /dev/null
Compiling yes v0.1.0
Finished release [optimized] target(s) in 1.0 secs
Running `target/release/yes`
[2.35MiB/s]
Whoops, that doesn't look any better. It's even slower than the Python version! That surprised me, so I decided to dig up the source code of a C implementation.
Here's the very first version of the program, the C implementation that Ken Thompson released with Version 7 Unix on Jan 10, 1979:
main(argc, argv)
char **argv;
{
	for (;;)
		printf("%s\n", argc>1? argv[1]: "y");
}
No magic here.
Compare that to the 128-line version from GNU coreutils, which is mirrored on GitHub. Even after 25 years it is still under active development; the last code change happened about a year ago. And it is quite fast:
# brew install coreutils
gyes | pv -r > /dev/null
[854MiB/s]
The important part is at the end:
/* Repeatedly output the buffer until there is a write error; then fail. */
while (full_write (STDOUT_FILENO, buf, bufused) == bufused)
  continue;
Aha! So they simply use a buffer to make write operations faster. The buffer size is defined by a constant named BUFSIZ, which each system chooses so that I/O stays efficient (for further reading, see the GNU libc manual on controlling buffering: https://www.gnu.org/software/libc/manual/html_node/Controlling-Buffering.html). On my system it was defined as 1024 bytes; I actually got better performance with 8192 bytes.
Okay, here's my extended Rust version:
use std::env;
use std::io::{self, BufWriter, Write};

const BUFSIZE: usize = 8192;

fn main() {
    let expletive = env::args().nth(1).unwrap_or("y".into());
    let mut writer = BufWriter::with_capacity(BUFSIZE, io::stdout());
    loop {
        writeln!(writer, "{}", expletive).unwrap();
    }
}
The important part is that the buffer size is a multiple of four, to ensure memory alignment.
Running that gave me 51.3MiB/s. Faster than the version that ships with my system, but still way slower than the results from a Reddit post I found, where the author talks about 10.2GiB/s.
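If you want to compare buffer sizes without pv, here is a rough, hypothetical micro-benchmark of my own (the LINES constant and the printed message are my choices, not part of the original article). It pushes a fixed number of lines through the BufWriter and reports the throughput on stderr, so you can redirect stdout to /dev/null and vary BUFSIZE:

use std::env;
use std::io::{self, BufWriter, Write};
use std::time::Instant;

const BUFSIZE: usize = 8192;
// Arbitrary amount of work for the measurement; adjust as needed.
const LINES: u64 = 10_000_000;

fn main() {
    let expletive = env::args().nth(1).unwrap_or("y".into());
    let stdout = io::stdout();
    let mut writer = BufWriter::with_capacity(BUFSIZE, stdout.lock());

    let start = Instant::now();
    for _ in 0..LINES {
        writeln!(writer, "{}", expletive).unwrap();
    }
    writer.flush().unwrap();
    let secs = start.elapsed().as_secs_f64();

    // Bytes written: the expletive plus one newline per line.
    let bytes = LINES * (expletive.len() as u64 + 1);
    eprintln!("{:.1} MiB/s", bytes as f64 / (1024.0 * 1024.0) / secs);
}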
Update
Once again, the Rust community did not disappoint.
As soon as this post hit the Rust subreddit, user nwydo pointed out a previous discussion on the same topic. Here's their optimized code, which breaks the 3GB/s mark on my machine:
use std::env;
use std::io::{self, Write};
use std::process;
use std::borrow::Cow;
use std::ffi::OsString;

pub const BUFFER_CAPACITY: usize = 64 * 1024;

pub fn to_bytes(os_str: OsString) -> Vec<u8> {
    use std::os::unix::ffi::OsStringExt;
    os_str.into_vec()
}

// Fill `buffer` with repeated copies of `output` by doubling the
// already-filled region, and return the filled slice.
fn fill_up_buffer<'a>(buffer: &'a mut [u8], output: &'a [u8]) -> &'a [u8] {
    if output.len() > buffer.len() / 2 {
        return output;
    }

    let mut buffer_size = output.len();
    buffer[..buffer_size].clone_from_slice(output);

    while buffer_size < buffer.len() / 2 {
        let (left, right) = buffer.split_at_mut(buffer_size);
        right[..buffer_size].clone_from_slice(left);
        buffer_size *= 2;
    }

    &buffer[..buffer_size]
}

// Lock stdout once and keep writing the pre-filled buffer until a write fails.
fn write(output: &[u8]) {
    let stdout = io::stdout();
    let mut locked = stdout.lock();
    let mut buffer = [0u8; BUFFER_CAPACITY];
    let filled = fill_up_buffer(&mut buffer, output);
    while locked.write_all(filled).is_ok() {}
}

fn main() {
    write(&env::args_os().nth(1).map(to_bytes).map_or(
        Cow::Borrowed(&b"y\n"[..]),
        |mut arg| {
            arg.push(b'\n');
            Cow::Owned(arg)
        },
    ));
    process::exit(1);
}
Now that's a whole different ballgame!
- We prepare a filled string buffer that gets reused on every loop iteration.
- Stdout is protected by a lock, so instead of constantly acquiring and releasing it, we keep holding it the whole time (see the small sketch after this list).
- We use the platform-native std::ffi::OsString and std::borrow::Cow to avoid unnecessary allocations.
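As a quick illustration of the locking point above, here is a minimal sketch of mine (not from the article) that holds the stdout lock once instead of re-acquiring it on every write:

use std::io::{self, Write};

fn main() {
    let stdout = io::stdout();
    // Every write through io::stdout() normally takes and releases an
    // internal lock; grabbing it once up front avoids that per-write cost.
    let mut locked = stdout.lock();
    for _ in 0..5 {
        locked.write_all(b"y\n").unwrap();
    }
}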
The only thing I could contribute was removing an unnecessary mut.
Lessons learned
The trivial yes program turns out not to be so trivial after all. It uses output buffering and memory alignment to improve performance. Re-implementing Unix tools is fun and makes me appreciate the nifty tricks that make our computers fast.