Rust in Practice Series: Base64 Encoding

Posted by ZachLim on 2022-07-12

Original URL: https://www.cnblogs.com/linzhehuang/p/16472298.html

Rust

Preface

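Some channels can carry only ASCII characters, yet often need to transmit non-ASCII data. That calls for an encoding that converts arbitrary data into ASCII characters, and base64 is one such encoding.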

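The principle is simple: split the raw data into groups of 3 bytes (24 bits), divide each group evenly into 4 parts of 6 bits each (64 possible values), map each value to its corresponding character, and concatenate the results. If the last group has fewer than 3 bytes, it is padded with zero bits, and one "=" character is appended to the output for each missing byte.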
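As a worked example of the grouping described above, the following sketch splits the three bytes of "Man" into four 6-bit indexes and looks each one up in the standard base64 alphabet. The names ALPHABET and encode_group are made up for this illustration and are not part of the article's implementation; the sketch handles only a full 3-byte group, with no padding.

```rust
// Standard base64 alphabet: index 0..=63 maps to one ASCII character.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: encode exactly one full 3-byte group.
fn encode_group(group: [u8; 3]) -> String {
    // Pack the 3 bytes into one 24-bit value.
    let n = (group[0] as u32) << 16 | (group[1] as u32) << 8 | (group[2] as u32);
    // Cut it into four 6-bit indexes, most significant first.
    let indexes = [(n >> 18) & 0x3f, (n >> 12) & 0x3f, (n >> 6) & 0x3f, n & 0x3f];
    indexes.iter().map(|&i| ALPHABET[i as usize] as char).collect()
}

fn main() {
    // 'M' = 0x4d, 'a' = 0x61, 'n' = 0x6e
    // Bits: 010011 010110 000101 101110 -> indexes 19, 22, 5, 46
    println!("{}", encode_group(*b"Man")); // prints "TWFu"
}
```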

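That covers the use case and principle of base64 in broad strokes; the finer details are not discussed here. This article focuses on the Rust concepts involved in implementing it in Rust.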

Reading from standard input

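The program reads its data from standard input (stdin). std::io::stdin() returns a Stdin struct, which implements the Read trait; calling the trait's read function reads data from standard input, as in the example below.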

use std::io::{stdin, Read};

let mut buf: [u8; 300] = [0; 300]; // read writes into the buffer, so it must be mutable
let size = stdin().read(&mut buf).unwrap();

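read takes a u8 array as the buffer that receives data from standard input. The number of bytes received is returned as a usize wrapped in a Result; here unwrap() is simply used to extract the byte count.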

The buffer size is fixed, but the amount of input is only known at run time, so a loop keeps reading from standard input until a read returns 0 bytes.

let mut buf: [u8; 300] = [0; 300];

loop {
    let size = stdin().read(&mut buf).unwrap();
    if size == 0 {
        break;
    }
    // Output only the bytes just read, assuming the input is a valid UTF-8 string.
    print!("{}", String::from_utf8(buf[..size].to_vec()).unwrap());
}
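The same loop can be wrapped in a function that collects everything it reads. The sketch below is one possible shape, not the article's code: read_all is a hypothetical helper, the buffer is deliberately tiny so the loop runs several times, and a byte slice stands in for stdin so the example runs without any input.

```rust
use std::io::{self, Read};

// Hypothetical helper: drain a reader through a fixed-size buffer.
fn read_all(mut reader: impl Read) -> io::Result<Vec<u8>> {
    let mut buf = [0u8; 4]; // deliberately tiny so the loop runs several times
    let mut data = Vec::new();
    loop {
        let size = reader.read(&mut buf)?;
        if size == 0 {
            break; // end of input
        }
        // Only the first `size` bytes of the buffer are valid for this read.
        data.extend_from_slice(&buf[..size]);
    }
    Ok(data)
}

fn main() -> io::Result<()> {
    // A byte slice implements Read, so it can stand in for stdin here.
    let data = read_all(&b"hello, world"[..])?;
    println!("{}", String::from_utf8_lossy(&data));
    Ok(())
}
```

Passing std::io::stdin() instead of the byte slice should read real input in the same way. The standard library's Read::read_to_end does this job in a single call.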

The I/O abstraction model

Like Java's InputStream and OutputStream, Rust has its own I/O abstraction: the Read and Write traits.

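The Read and Write traits abstract input and output into a family of functions such as read and write; the details are decided by each implementation.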

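Code that uses them does not need to know whether the implementation is standard input/output, a file, or the network. For example, one can write a function that picks the input source automatically: if the file at the given path does not exist it reads from standard input, otherwise it reads from the file.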

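use std::fs::File;
use std::io::{stdin, Read};
use std::path::Path;

fn main() {
    let mut buf: [u8; 300] = [0; 300];
    // Open the source once; calling input() on every iteration would
    // reopen the file and read it from the start each time.
    let mut reader = input("./input");

    loop {
        let size = reader.read(&mut buf).unwrap();
        if size == 0 {
            break;
        }
        // Print only the bytes read in this iteration.
        print!("{}", String::from_utf8(buf[..size].to_vec()).unwrap())
    }
}

fn input(path: &'static str) -> Box<dyn Read> {
    // Fall back to standard input when the file does not exist.
    if !Path::new(path).exists() {
        return Box::new(stdin());
    }
    Box::new(File::open(path).unwrap())
}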
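The example above covers only the Read side. The Write trait works the same way for output; the following sketch sends the same text to standard output and to an in-memory buffer. The function name write_greeting is invented for this illustration.

```rust
use std::io::{self, Write};

// Hypothetical helper: works with any destination that implements Write.
fn write_greeting(out: &mut impl Write, name: &str) -> io::Result<()> {
    writeln!(out, "hello, {}", name)
}

fn main() -> io::Result<()> {
    // Destination 1: standard output.
    write_greeting(&mut io::stdout(), "stdout")?;

    // Destination 2: an in-memory buffer (Vec<u8> implements Write).
    let mut memory: Vec<u8> = Vec::new();
    write_greeting(&mut memory, "memory")?;
    assert_eq!(memory, b"hello, memory\n".to_vec());
    Ok(())
}
```

Because the function depends only on the trait, a File or a network stream could be passed in without changing it.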

Arrays

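Rust arrays have a fixed length, so the length and element type must be known at declaration in order to allocate memory. Both can either be inferred automatically or specified explicitly.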

let arr: [i32; 4]; // 1. Specify type and length, the format is [Type; length].
let arr = [0; 4]; // 2. Infer the type; [value; length] repeats the value 4 times.
let arr = [0, 0, 0, 0]; // 3. Infer type and length.

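As in most other languages, elements are accessed by index. An out-of-range index panics at run time; when the index is a constant, as in the last line of the example below, the compiler already rejects it at build time.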

let mut arr = [0, 0, 0, 0];
print!("{}", arr[0]); // Output is 0.
arr[0] = 1;
print!("{}", arr[0]); // Output is 1.
arr[4] = 4; // Out of range: a compile error for a constant index, a panic for a computed one.
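When an index may be out of range, the get method offers a way to avoid the panic: it returns an Option instead. A minimal sketch, with element_or as a hypothetical helper name:

```rust
// Hypothetical helper: read an element, falling back to a default.
fn element_or(arr: &[i32], index: usize, default: i32) -> i32 {
    // `get` returns Some(&value) when in range and None when out of range.
    arr.get(index).copied().unwrap_or(default)
}

fn main() {
    let arr = [10, 20, 30, 40];
    println!("{}", element_or(&arr, 0, -1)); // prints 10
    println!("{}", element_or(&arr, 4, -1)); // prints -1
    println!("{}", arr.len()); // prints 4
}
```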

Strings

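Rust has two string types, str and String:

str is a primitive type implemented as a slice. A slice does not own its data, so str normally appears in borrowed form as &str, which is immutable;
String owns its data and can be modified. It stores its contents on the heap, so it costs more than str.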
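The relationship between the two types can be sketched as follows. The function shout is a made-up example; it takes &str so that it accepts both string literals and borrowed String values.

```rust
// Hypothetical helper: borrows a string slice, so it accepts both kinds.
fn shout(text: &str) -> String {
    // to_uppercase allocates and returns a new owned String.
    text.to_uppercase()
}

fn main() {
    let literal: &str = "abc"; // borrowed, stored in the program binary
    let owned: String = literal.to_string(); // owned copy on the heap

    println!("{}", shout(literal)); // a &str passes directly
    println!("{}", shout(&owned)); // a &String coerces to &str
    println!("{}", &owned[0..2]); // a slice borrowed from the String
}
```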

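Concatenation is a common operation. One way is the + operator. Note that its left operand must be a String, because String implements the Add trait used for operator overloading; the left operand is taken by value and the right-hand side is appended to it.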

let a = String::from("a");
let b = "b";
let _ = a + b;
// Can not use variable "a" here, its ownership has been moved

Note that a, the left operand here, can no longer be used after the operation, because its ownership has been moved.

Another way is String's push_str method; in fact, the + implementation calls this method too.
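For comparison, here is a short sketch of three ways to concatenate while keeping the original String usable afterwards. The helper join is hypothetical and exists only for this illustration.

```rust
// Hypothetical helper: concatenate without giving up either input.
fn join(a: &str, b: &str) -> String {
    let mut result = String::with_capacity(a.len() + b.len());
    result.push_str(a);
    result.push_str(b);
    result
}

fn main() {
    let a = String::from("a");
    let b = "b";

    let joined = join(&a, b); // push_str on a fresh String
    let formatted = format!("{}{}", a, b); // format! only borrows its arguments
    let added = a.clone() + b; // clone first so `a` itself is not moved

    println!("{} {} {} {}", joined, formatted, added, a);
}
```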

Appendix

The complete base64 encoder is as follows:

use std::io::{stdin, Read};

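fn main() {
    let mut buf: [u8; 300] = [0; 300];
    // Collect the whole input first: a read may return fewer bytes than the
    // buffer holds, and encoding each chunk separately would then emit
    // padding in the middle of the output.
    let mut data: Vec<u8> = Vec::new();
    loop {
        let size = stdin().read(&mut buf).unwrap();
        if size == 0 {
            break;
        }
        data.extend_from_slice(&buf[..size]);
    }
    print!("{}", encode(&data, data.len()));
}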

fn encode(bytes: &[u8], size: usize) -> String {
    let mut buf = String::new();
    // Encode every complete 3-byte group as four 6-bit values.
    for group in 0..(size / 3) {
        let i = group * 3;
        let f = bytes[i];
        let s = bytes[i + 1];
        let t = bytes[i + 2];

        buf.push_str(&cvt((f & 0xfc) >> 2));
        buf.push_str(&cvt((f & 0x03) << 4 | ((s & 0xf0) >> 4)));
        buf.push_str(&cvt((s & 0x0f) << 2 | ((t & 0xc0) >> 6)));
        buf.push_str(&cvt(t & 0x3f));
    }

    // Index of the first byte that does not belong to a complete group.
    let i = (size / 3) * 3;
    let remain = size - i;
    if remain == 1 {
        // One byte left: two characters plus two padding characters.
        let f = bytes[i];
        buf.push_str(&cvt((f & 0xfc) >> 2));
        buf.push_str(&cvt((f & 0x03) << 4 | 0));
        buf.push_str("==");
    } else if remain == 2 {
        // Two bytes left: three characters plus one padding character.
        let f = bytes[i];
        let s = bytes[i + 1];
        buf.push_str(&cvt((f & 0xfc) >> 2));
        buf.push_str(&cvt((f & 0x03) << 4 | ((s & 0xf0) >> 4)));
        buf.push_str(&cvt((s & 0x0f) << 2 | 0));
        buf.push_str("=");
    }
    buf
}

const BASE64_TABLE: [char; 64] = [
    'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S',
    'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l',
    'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '0', '1', '2', '3', '4',
    '5', '6', '7', '8', '9', '+', '/',
];

fn cvt(i: u8) -> String {
    BASE64_TABLE.get(i as usize).unwrap().to_string()
}
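An encoder like this can be checked against the test vectors in RFC 4648, section 10 (for example "foobar" encodes to "Zm9vYmFy"). The sketch below is a separate, more compact variant built on the slice method chunks, not the listing above; ALPHABET and encode_chunks are names made up for this illustration.

```rust
// Standard base64 alphabet: index 0..=63 maps to one ASCII character.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical variant of the encoder built on slice chunks.
fn encode_chunks(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Missing bytes in the last chunk are treated as zero.
        let f = chunk[0];
        let s = *chunk.get(1).unwrap_or(&0);
        let t = *chunk.get(2).unwrap_or(&0);
        let indexes = [
            f >> 2,
            (f & 0x03) << 4 | s >> 4,
            (s & 0x0f) << 2 | t >> 6,
            t & 0x3f,
        ];
        for (pos, &index) in indexes.iter().enumerate() {
            // A chunk of n bytes yields n + 1 characters; the rest is padding.
            if pos <= chunk.len() {
                out.push(ALPHABET[index as usize] as char);
            } else {
                out.push('=');
            }
        }
    }
    out
}

fn main() {
    // Inputs taken from the RFC 4648 test vectors.
    for input in ["", "f", "fo", "foo", "foob", "fooba", "foobar"] {
        println!("{:?} -> {:?}", input, encode_chunks(input.as_bytes()));
    }
}
```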
