Representing and Manipulating Information, and the CS:APP 15-213 datalab

Posted by 小胖西瓜 on 2019-06-04

Original URL: https://www.cnblogs.com/shuqin/p/10975730.html


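Representing and manipulating information

On a general-purpose computer the byte is the smallest addressable unit of memory; programs do not access individual bits of memory directly.

Addressing and byte ordering

Big endian: the most significant byte of a value is stored at the lowest address, which matches the order in which people read numbers.
Little endian: the least significant byte is stored at the lowest address.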
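
As a rough illustration of the two byte orders, the sketch below inspects how the host stores a 32-bit word and also writes a value out in big-endian order explicitly. The names is_little_endian and store_big_endian are made up for this example and are not part of the lab.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 otherwise.
 * (is_little_endian is a hypothetical helper for this sketch.) */
int is_little_endian(void) {
    uint32_t word = 0x01020304u;
    unsigned char bytes[sizeof word];
    memcpy(bytes, &word, sizeof word);   /* copy out the in-memory byte order */
    return bytes[0] == 0x04;             /* is the low byte at the lowest address? */
}

/* Store a 32-bit value in big-endian order regardless of the host. */
void store_big_endian(uint32_t value, unsigned char out[4]) {
    out[0] = (unsigned char)(value >> 24);  /* most significant byte first */
    out[1] = (unsigned char)(value >> 16);
    out[2] = (unsigned char)(value >> 8);
    out[3] = (unsigned char)value;
}
```

Writing bytes out with shifts, as store_big_endian does, gives the same result on any host, which is why it is a common way to produce a fixed byte order.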

Boolean algebra

1 TRUE
0 FALSE
~ NOT
& AND
| OR
^ EXCLUSIVE-OR (XOR)
- 1 ^ 0 = 1
- 1 ^ 1 = 0
- 0 ^ 0 = 0
- 0 ^ 1 = 1

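IEEE 754 floating point

$V = (-1)^s \times M \times 2^E$

Sign s: s = 1 for a negative value, s = 0 for a nonnegative one.
Significand M: a binary fraction in the range $1 \sim 2 - \varepsilon$ (normalized) or $0 \sim 1 - \varepsilon$ (denormalized).
Exponent E: weights the value by a power of two, $2^E$.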

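The bits of a floating-point number are divided into three fields that encode these values:

A single sign bit s directly encodes the sign s (1 bit).
A k-bit exponent field $exp = e_{k-1} \cdots e_1 e_0$ encodes the exponent E; k = 8 (single precision), k = 11 (double precision).
An n-bit fraction field $frac = f_{n-1} \cdots f_1 f_0$ encodes the significand M, and the encoded value also depends on whether the exponent field equals 0; n = 23 (single precision), n = 52 (double precision).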

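The value of a floating-point number:

e is the unsigned integer whose bit representation is $e_{k-1} \cdots e_1 e_0$.
The fraction field frac is interpreted as the fractional value $f$, where $0 \le f < 1$, with binary representation $0.f_{n-1} \cdots f_1 f_0$.
Bias is an offset equal to $2^{k-1} - 1$ (127 for single precision, 1023 for double precision).
Normalized $(exp \ne 0, exp \ne 2^{k}-1)$: the most common case
- exponent value $E = exp - Bias$
- significand $M = 1 + f$
Denormalized $(exp = 0)$: provides a way to represent 0 and values very close to 0
- exponent value $E = 1 - Bias$
- significand $M = f$
Special values $(exp = 2^{k}-1)$: infinity when frac = 0, NaN otherwise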
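
To make the field layout concrete, here is a minimal sketch that splits a single-precision value into its three fields and classifies it. It assumes that float is an IEEE 754 32-bit type on the host; float_fields, split_float, classify_float and unbiased_exponent are names invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type for this sketch. */
typedef struct {
    uint32_t sign;   /* s: 1 bit */
    uint32_t exp;    /* exp field: k = 8 bits */
    uint32_t frac;   /* frac field: n = 23 bits */
} float_fields;

float_fields split_float(float value) {
    uint32_t bits;
    float_fields f;
    memcpy(&bits, &value, sizeof bits);  /* reinterpret the bits, no conversion */
    f.sign = bits >> 31;
    f.exp  = (bits >> 23) & 0xFFu;
    f.frac = bits & 0x7FFFFFu;
    return f;
}

/* 0 = normalized, 1 = denormalized, 2 = infinity, 3 = NaN */
int classify_float(float value) {
    float_fields f = split_float(value);
    if (f.exp == 0xFFu)
        return f.frac == 0 ? 2 : 3;
    return f.exp == 0 ? 1 : 0;
}

/* E for finite values: exp - Bias when normalized, 1 - Bias when denormalized.
 * The result is not meaningful for infinity or NaN. */
int unbiased_exponent(float value) {
    float_fields f = split_float(value);
    const int bias = 127;                /* 2^(k-1) - 1 with k = 8 */
    return f.exp == 0 ? 1 - bias : (int)f.exp - bias;
}
```

For example, -0.75 is $-1.1_2 \times 2^{-1}$, so its fields are sign 1, exp 126 and frac 0x400000.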

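Rounding
The representation limits both the range and the precision of floating-point numbers.
Round-to-even is the default rounding mode: a value is rounded up or down so that the least significant digit of the result (the retained bit) is even (0).
Round-to-even only comes into play when the value is exactly halfway between two possible results; anything more than half (> 0.5) is simply rounded up.
Only the round-up cases need handling; rounding down needs no action, because the discarded bits are just dropped and do not affect the result.
For a significand $1.BBGRXXX$: G is the guard bit (the lowest retained bit), R is the round bit (the first discarded bit), and the sticky bit is the OR of the remaining X bits.

Round = 1, Sticky = 1 -> more than half (> 0.5), round up
Guard = 1, Round = 1, Sticky = 0 -> exactly half with an odd retained bit, round up to even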
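
The guard/round/sticky rule can be sketched on plain integers. The function below drops the low bits of an unsigned value and rounds to nearest even; round_to_even_shift and its parameter names are assumptions made for this illustration.

```c
#include <stdint.h>

/* Drop the low `drop` bits of `value` (1 <= drop <= 31), rounding to nearest even. */
uint32_t round_to_even_shift(uint32_t value, unsigned drop) {
    uint32_t kept   = value >> drop;                             /* bits that survive */
    uint32_t guard  = kept & 1u;                                 /* lowest kept bit */
    uint32_t round  = (value >> (drop - 1)) & 1u;                /* first dropped bit */
    uint32_t sticky = (value & ((1u << (drop - 1)) - 1u)) != 0;  /* OR of the rest */

    /* Round up when above half, or exactly half with an odd kept bit. */
    if (round && (sticky || guard))
        kept += 1u;
    return kept;
}
```

Dropping two bits from 22 (binary 10110, i.e. 5.5) gives 6, while 18 (binary 10010, i.e. 4.5) gives 4: both halfway cases land on the even neighbor.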

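The lab

1. Compute the XOR of two numbers using only the ~ and & operators
De Morgan's law: $\neg(p \lor q) = \neg p \land \neg q$
XOR: $p \oplus q = (\neg p \land q) \lor (p \land \neg q)$
So it is enough to expand the OR with De Morgan's law.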
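
Spelled out, the expansion uses a double negation followed by De Morgan's law to remove the OR:

$$
\begin{aligned}
p \oplus q &= (\neg p \land q) \lor (p \land \neg q) \\
           &= \neg\neg\big((\neg p \land q) \lor (p \land \neg q)\big) \\
           &= \neg\big(\neg(\neg p \land q) \land \neg(p \land \neg q)\big)
\end{aligned}
$$

The last line contains only $\neg$ and $\land$, which map to ~ and & in C.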

/*
 * bitXor - x^y using only ~ and &
 *   Example: bitXor(4, 5) = 1
 *   Legal ops: ~ &
 *   Max ops: 14
 *   Rating: 1
 */
int bitXor(int x, int y) {
    return ~(~(~x & y) & ~(x & ~y));
}

2. Minimum two's complement integer, legal ops: ! ~ & ^ | + << >>
$-2^{31}$ (0x80000000)

/*
 * tmin - return minimum two's complement integer
 *   Legal ops: ! ~ & ^ | + << >>
 *   Max ops: 4
 *   Rating: 1
 */
int tmin(void) {
    return 1 << 31;
}

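3. Check whether x is the largest two's complement integer, legal ops: ! ~ & ^ | +
Use the fact that INT_MAX + INT_MAX + 2 wraps around to 0, and exclude 0xFFFFFFFF (-1), which gives 0 as well. Note that the sum cannot be formed by adding x to itself directly; it has to be written as x + 1 + x + 1.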

/*
 * isTmax - returns 1 if x is the maximum, two's complement number,
 *     and 0 otherwise
 *   Legal ops: ! ~ & ^ | +
 *   Max ops: 10
 *   Rating: 2
 */
int isTmax(int x) {
    return !(x + 1 + x + 1) & !!(x + 1);
}

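4. Check whether all odd-numbered bits are 1, legal ops: ! ~ & ^ | + << >>
Mask off the even-numbered bits to keep only the odd-numbered ones, then XOR the result with the odd-bit mask 0xAAAAAAAA. If every odd bit is set the XOR is 0, so applying ! gives the answer.
So the first step is to build 0xAAAAAAAA.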

/*
 * allOddBits - return 1 if all odd-numbered bits in word set to 1
 *   Examples allOddBits(0xFFFFFFFD) = 0, allOddBits(0xAAAAAAAA) = 1
 *   Legal ops: ! ~ & ^ | + << >>
 *   Max ops: 12
 *   Rating: 2
 */
int allOddBits(int x) {
    int bits0_15 = (0xAA << 8) + 0xAA;
    int bits0_23 = (bits0_15 << 8) + 0xAA;
    int bits0_31 = (bits0_23 << 8) + 0xAA;

    return !((bits0_31 & x) ^ bits0_31);
}

5. Negation, legal ops: ! ~ & ^ | + << >>

/*
 * negate - return -x
 *   Example: negate(1) = -1.
 *   Legal ops: ! ~ & ^ | + << >>
 *   Max ops: 5
 *   Rating: 2
 */
int negate(int x) { return ~x + 1; }

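6. Check whether x is an ASCII digit, legal ops: ! ~ & ^ | + << >>

Bits 6 to 31 must all be 0.
Bits 4 and 5 must both be 1.
For the low four bits (bits 0 to 3), add 6 and check whether there is a carry into bit 4; a carry means the nibble is greater than 9.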

/*
 * isAsciiDigit - return 1 if 0x30 <= x <= 0x39 (ASCII codes for characters '0'
 * to '9') Example: isAsciiDigit(0x35) = 1. isAsciiDigit(0x3a) = 0.
 *            isAsciiDigit(0x05) = 0.
 *   Legal ops: ! ~ & ^ | + << >>
 *   Max ops: 15
 *   Rating: 3
 */
int isAsciiDigit(int x) {
    int bit6_31 = !((x >> 6) & (~0));
    int bit_5 = (x & 0x20) >> 5;
    int bit_4 = (x & 0x10) >> 4;
    int bits0_3 = !(((x & 0xF) + 6) & 0x10);
    return bits0_3 & bit_4 & bit_5 & bit6_31;
}
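
The add-6 step is the least obvious part, so here it is in isolation. This is only a sketch of that one test; low_nibble_is_decimal is a name invented for the example.

```c
/* 1 when the low nibble of x is 0..9, using the carry into bit 4 after adding 6. */
int low_nibble_is_decimal(int x) {
    int nibble = x & 0xF;
    /* 0..9 plus 6 gives 6..15 (bit 4 clear); 10..15 plus 6 gives 16..21 (bit 4 set). */
    return !((nibble + 6) & 0x10);
}
```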

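7. Conditional (the ternary operator), legal ops: ! ~ & ^ | + << >>
Idea: since X & 0xFFFFFFFF = X and X & 0x0 = 0, AND the two candidates with 0xFFFFFFFF and 0x0 respectively, then add the results.

All that is left is to decide which candidate gets 0xFFFFFFFF and which gets 0x0; note that each mask is the ~ of the other.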

/*
 * conditional - same as x ? y : z
 *   Example: conditional(2,4,5) = 4
 *   Legal ops: ! ~ & ^ | + << >>
 *   Max ops: 16
 *   Rating: 3
 */
int conditional(int x, int y, int z) {
  int flag = (!!x + ~0);
  return (z & flag) + (y & ~flag);
}
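
A slightly different way to build the mask, shown here only as a variation on the same idea, is to negate !!x so that the mask is all ones exactly when the condition holds. select_mask and select_value are hypothetical names for this sketch.

```c
/* All-ones mask when cond is nonzero, all-zeros mask otherwise. */
int select_mask(int cond) {
    return ~(!!cond) + 1;   /* !!cond is 0 or 1; negating gives 0 or -1 (0xFFFFFFFF) */
}

/* Branch-free equivalent of cond ? a : b. */
int select_value(int cond, int a, int b) {
    int mask = select_mask(cond);
    return (a & mask) | (b & ~mask);   /* exactly one side survives the AND */
}
```

Because one of the two ANDed terms is always 0, combining them with | or with + gives the same result.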

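8. Less than or equal, legal ops: ! ~ & ^ | + << >>
Idea: test for equality first; when the signs match, subtract and check the sign of the difference (x - y cannot overflow in that case); when the signs differ, x <= y exactly when x is negative.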

/*
 * isLessOrEqual - if x <= y  then return 1, else return 0
 *   Example: isLessOrEqual(4,5) = 1.
 *   Legal ops: ! ~ & ^ | + << >>
 *   Max ops: 24
 *   Rating: 3
 */
int isLessOrEqual(int x, int y) { 
    int equal = !(x ^ y);             // x == y
    int same_sign = !((x ^ y) >> 31); // sign
    int x_reduce_y = ~y + 1 + x;
    return equal | (same_sign & (x_reduce_y >> 31)) | ((!same_sign) & (x >> 31) & 1);
}
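
The reason for treating the sign cases separately is that the difference can overflow when the signs differ. The sketch below, which freely uses ordinary comparison operators because it is only an illustration and not a lab solution, contrasts a naive sign-of-difference test with the three-case version. All function names are made up for the example.

```c
#include <stdint.h>

/* Sign bit of the wrapped 32-bit difference x - y, computed in unsigned
 * arithmetic so that the wraparound is well defined. */
int diff_sign_bit(int32_t x, int32_t y) {
    uint32_t diff = (uint32_t)x - (uint32_t)y;
    return (int)(diff >> 31);
}

/* Naive test: "x < y iff x - y is negative". Wrong when x - y overflows. */
int naive_less(int32_t x, int32_t y) {
    return diff_sign_bit(x, y);
}

/* Sign-aware test following the three cases in the text. */
int less_or_equal(int32_t x, int32_t y) {
    int x_neg = x < 0;
    int y_neg = y < 0;
    if (x == y)
        return 1;
    if (x_neg != y_neg)
        return x_neg;            /* signs differ: the negative one is smaller */
    return diff_sign_bit(x, y);  /* same sign: x - y cannot overflow */
}
```

With x = INT32_MIN and y = 1 the wrapped difference is 0x7FFFFFFF, so the naive test reports "not less" even though x is far smaller than y.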

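9. Logical NOT, legal ops: ~ & ^ | + << >>
Idea: the key is to look at the sign bit.

Consider whether to use bitwise complement or negation. Complementing flips the sign bit of every number; negation leaves the sign bit unchanged only for 0 and 0x80000000, and those two have opposite sign bits, so negation is the tool to use.
ANDing a number directly with its negation keeps the sign bit set only for 0x80000000, so that does not single out 0.
Instead, complement both before the AND, i.e. compute ~x & ~(-x). For any number whose sign differs from that of its negation, the AND has sign bit 0. For 0x80000000 both ~x and ~(-x) have sign bit 0, so the result is still 0. Only for 0 do both complements have sign bit 1, so the AND keeps sign bit 1.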

/*
 * logicalNeg - implement the ! operator, using all of
 *              the legal operators except !
 *   Examples: logicalNeg(3) = 0, logicalNeg(0) = 1
 *   Legal ops: ~ & ^ | + << >>
 *   Max ops: 12
 *   Rating: 4
 */
int logicalNeg(int x) {
    return ((~x & ~(~x + 1)) >> 31) & 0x1;
}
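
To check the case analysis, the following sketch evaluates the sign bit of ~x & ~(-x) on unsigned values, where negating 0x80000000 is well defined. sign_of_complement_and is a name invented for this example.

```c
#include <stdint.h>

/* Sign bit of ~x & ~(-x); 1 only when x is 0. */
unsigned sign_of_complement_and(uint32_t x) {
    uint32_t neg = ~x + 1u;            /* two's complement negation */
    return ((~x) & (~neg)) >> 31;
}
```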

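10. Compute the minimum number of bits needed to represent x in two's complement, legal ops: ! ~ & ^ | + << >>
Idea: XOR each bit with its neighbor, then find the position of the highest 1 bit in the result.
(~(bits16 << 3) + 1) + (((bits16 ^ 1) & 0x1) << 3): bits* is the outcome of the previous shift test, and it decides whether the shift offset is increased or decreased.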

/* howManyBits - return the minimum number of bits required to represent x in
 *             two's complement
 *  Examples: howManyBits(12) = 5
 *            howManyBits(298) = 10
 *            howManyBits(-5) = 4
 *            howManyBits(0)  = 1
 *            howManyBits(-1) = 1
 *            howManyBits(0x80000000) = 32
 *  Legal ops: ! ~ & ^ | + << >>
 *  Max ops: 90
 *  Rating: 4
 */
int howManyBits(int x) {
    // all declarations must come before any assignment.
    int bits16, bits8, bits4, bits2, bits1;
    int shift16, shift8, shift4, shift2, shift1;

    int shift_off = 16; // first shift offset
    x ^= x << 1;        // find the highest 1-bit after XOR adjacent bits

    bits16 = !(x >> shift_off);
    shift16 = bits16 << 4;

    // binary search.
    // if x >> shift_off was nonzero, the highest 1-bit is at or above
    // shift_off, so move the offset up by half the previous step;
    // otherwise move it down by half.
    shift_off = shift_off + (~(bits16 << 3) + 1) + (((bits16 ^ 1) & 0x1) << 3);
    bits8 = (!(x >> shift_off));
    shift8 = bits8 << 3;

    shift_off = shift_off + (~(bits8 << 2) + 1) + (((bits8 ^ 1) & 0x1) << 2);
    bits4 = (!(x >> shift_off));
    shift4 = bits4 << 2;

    shift_off = shift_off + (~(bits4 << 1) + 1) + (((bits4 ^ 1) & 0x1) << 1);
    bits2 = (!(x >> shift_off));
    shift2 = bits2 << 1;

    shift_off = shift_off + (~(bits2) + 1) + ((bits2 ^ 1) & 0x1);
    bits1 = (!(x >> shift_off));
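    shift1 = bits1;

    // each shiftN equals N when the highest 1-bit lies below the tested
    // offset, so the answer is 32 minus their sum (written without -).
    return 32 + (~(shift16 + shift8 + shift4 + shift2 + shift1) + 1);
}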
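
For cross-checking a bit-twiddling solution, a plain loop is handy. The version below is a reference sketch that ignores the lab's operator restrictions; how_many_bits_reference is a hypothetical name.

```c
#include <stdint.h>

/* Smallest n such that x fits in an n-bit two's complement integer. */
int how_many_bits_reference(int32_t x) {
    /* Fold negatives onto non-negatives: ~x needs as many bits as x does. */
    uint32_t magnitude = (uint32_t)(x < 0 ? ~x : x);
    int bits = 1;                 /* one bit for the sign */
    while (magnitude != 0) {      /* one more bit per remaining value bit */
        bits++;
        magnitude >>= 1;
    }
    return bits;
}
```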

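11. Compute 2 * f for a floating-point number f and return the bit-level representation of the result; operators are unrestricted
Idea: the value of the significand depends on both frac and exp, so the cases have to be handled separately.

Normalized numbers: add 1 to exp, but check whether exp becomes 255 after the increment (the result is then infinity, so frac is cleared).
Other cases:
- exp = 255: return the argument unchanged
- exp = 0, frac = 0: return 0, since the value is simply 0
- exp = 0, frac != 0: shift frac left by one bit (because of how the denormalized significand is defined), and check whether the shift overflows out of the frac field (bits 0-22)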

/*
 * float_twice - Return bit-level equivalent of expression 2*f for
 *   floating point argument f.
 *   Both the argument and result are passed as unsigned int's, but
 *   they are to be interpreted as the bit-level representation of
 *   single-precision floating point values.
 *   When argument is NaN, return argument
 *   Legal ops: Any integer/unsigned operations incl. ||, &&. also if, while
 *   Max ops: 30
 *   Rating: 4
 */
unsigned float_twice(unsigned uf) {
  int exp = 0x7f800000 & uf;
  int frac = 0x007FFFFF & uf;
  int sign = 0x80000000 & uf;
  int bias = (exp >> 23) - 127;

  if (uf == 0x0)
    return 0;
  if (bias == 128) // exp == 255: NaN and infinity are returned unchanged
    return uf;

  // frac depends on exp, so exp could not add 1 alone.
  if (exp == 0) { // (exp + frac) << 1
    frac = (frac << 1) & 0x007FFFFF;
    if (uf & 0x00400000)
      exp = 0x00800000;
  } else {
    exp = (exp + 0x00800000) & 0x7F800000;
    if (exp == 0x7F800000)
      frac = 0;
  }
  return sign | exp | frac;
}
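
One way to spot-check a bit-level float_twice is to let the hardware do the multiplication and compare bit patterns. The sketch below assumes float is an IEEE 754 32-bit type and that denormals are not flushed to zero; twice_reference is a name invented for this example. NaN inputs are better compared by class than by exact bits, since the hardware may alter a NaN payload.

```c
#include <stdint.h>
#include <string.h>

/* Bit pattern of 2 * f as computed by the machine's own float multiply. */
uint32_t twice_reference(uint32_t uf) {
    float f;
    memcpy(&f, &uf, sizeof f);   /* reinterpret the bits as a float */
    f = 2.0f * f;
    memcpy(&uf, &f, sizeof uf);  /* and back to the raw bit pattern */
    return uf;
}
```

Useful probes are the boundaries between cases: the largest denormalized fraction bit (0x00400000, which becomes the normalized 0x00800000) and the largest exponent (0x7F000000, which becomes infinity).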

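12. Convert an integer to a float and return the bit-level representation; operators are unrestricted
Idea: the key observation is that the highest 1 bit of the absolute value corresponds to the implicit leading 1 of the floating-point significand, and the 23 bits after it go into the frac field.

Take the absolute value and work on the nonnegative number from then on.
Find n, the position of the highest 1 bit of the absolute value.
In the bit pattern, converting integer i to float f places bits n-1 .. 0 of the integer into the frac field; a normalized number has an implicit leading 1, which stands in for the highest significant 1 bit.
Precision is limited: once the value is left-aligned in a 32-bit word, the first 23 bits after the leading 1 become the fraction, and the remaining 9 bits decide whether rounding is needed.
Set exp = 127 + n.
The sign bit is carried over unchanged.
Handle 0 and the other special cases separately.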

/*
 * float_i2f - Return bit-level equivalent of expression (float) x
 *   Result is returned as unsigned int, but
 *   it is to be interpreted as the bit-level representation of a
 *   single-precision floating point values.
 *   Legal ops: Any integer/unsigned operations incl. ||, &&. also if, while
 *   Max ops: 30
 *   Rating: 4
 */
unsigned float_i2f(int x) {
  unsigned abs_x = x;
  unsigned sign = x & 0x80000000;
  int flag = 0;
  int n = 30;

  if (x == 0)
    return x;
  else if (x == 0x80000000)
    return 0xcf000000;

  if (sign)
    abs_x = -x;

  while (!(abs_x & (1 << n)))
    n--;
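  // drop the leading 1; done in two steps so the shift count stays
  // below 32 even when n == 0
  abs_x = (abs_x << (31 - n)) << 1;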

  if ((abs_x & 0x01ff) > 0x0100)
    flag = 1;
  else if ((abs_x & 0x03ff) == 0x0300)
    flag = 1;
  else
    flag = 0;

  return sign + ((n << 23) + 0x3F800000) + (abs_x >> 9) + flag;
}
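
The rounding branch is easiest to validate against the compiler's own conversion. This sketch assumes IEEE 754 single precision and the default round-to-nearest-even mode; i2f_reference is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Bit pattern of (float)x as produced by the compiler's own conversion. */
uint32_t i2f_reference(int32_t x) {
    float f = (float)x;
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* reinterpret without converting */
    return bits;
}
```

Values just above $2^{24}$ exercise the halfway cases: $2^{24} + 1$ rounds down to $2^{24}$ (0x4B800000), while $2^{24} + 3$ rounds up to $2^{24} + 4$ (0x4B800002).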

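13. Convert a float to an integer and return the bit-level representation of the integer; operators are unrestricted
Idea: float_i2f() above lays the groundwork.

The work is mostly about handling precision. For the unbiased exponent e:
- less than 0: the magnitude is below 1, so the result is 0
- 31 or more: outside the range of int
- less than 23: shift the significand right by 23 - e, which discards the fractional bits
- from 23 to 30: shift the significand left by e - 23, and no bits are lost
The sign of a floating-point number is determined by the sign bit alone, so the negation can be done as the last step.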

/*
 * float_f2i - Return bit-level equivalent of expression (int) f
 *   for floating point argument f.
 *   Argument is passed as unsigned int, but
 *   it is to be interpreted as the bit-level representation of a
 *   single-precision floating point value.
 *   Anything out of range (including NaN and infinity) should return
 *   0x80000000u.
 *   Legal ops: Any integer/unsigned operations incl. ||, &&. also if, while
 *   Max ops: 30
 *   Rating: 4
 */
int float_f2i(unsigned uf) {
    unsigned sign = uf & 0x80000000;
    unsigned exp = uf & 0x7F800000;
    unsigned frac = uf & 0x007FFFFF;
    int m = 0x00800000 + frac;       // significand with the implicit leading 1
    int e = (int)(exp >> 23) - 127;  // unbiased exponent

    if (exp == 0)                    // zero and denormalized values truncate to 0
        return 0;

    if (e < 0)                       // |f| < 1
        return 0;
    else if (e > 30)                 // out of range, including NaN and infinity
        return 0x80000000;
    else if (e < 23)
        m >>= (23 - e);
    else
        m <<= (e - 23);

    if (sign)
        m = -m;
    return m;
}
