Converting Python Byte Strings to Hex Strings (Summary of a Long-Standing Issue)

Posted by 菊次郎的秋天 on 2024-07-30
  • Date: 2024.07.30 Author: Yuan

  •  Problem overview 

  This problem came up a long time ago, but it is worth recording and clearing up!

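  In a project I worked on in January 2022, one requirement was to read the ambient temperature with an SHT40-AD1B-R2 chip. In practice this meant communicating with the sensor over I2C and reading back the temperature and humidity data. Working on the host side, the problem I had to solve was converting the raw data read back from the sensor into a decimal value in degrees Celsius. It took some head-scratching at the time, and I eventually found a "solution".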

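  In June 2022, while reviewing Python fundamentals, it suddenly dawned on me that none of that complexity was needed: a built-in Python method does the conversion directly. I instantly felt “foolish but hardworking”!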

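  It is now July 2024, two years later, and I suddenly remembered that this still needed writing up. Back then I did not understand byte strings well and my Python basics were shaky, so I took a detour. Checking again now, I found that the solution from two years ago actually has a bug, so let's fix that two-year-old bug!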

  •  First encounter with byte strings 
"""
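When I first saw data read back as `b'fX'`, `b'f4'`, or `b'd\xac\xd6\xa5\x9f\x1e'`, I was baffled,
so I asked the customer's software engineer, who walked me through an example.
With that example and guidance in hand, I set off toward the goal:
treat each pair of hex digits that follows \x as a hex value, and convert every character not introduced by \x to its ASCII code!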
"""

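# Yuan: The data returned when reading the temperature over I2C looks a bit odd: b'f4', b'fX', b'f`5\x95\xA6\xE1'
#  Kim: Which model of temperature sensor is it?

# Yuan: SHT40-AD1B-R2
#  Kim: b'f4' works out to 24.866483558 °C
#       I think that is normal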
#       b'f4' = 0x66, 0x34
#       0x6634 = 26164
#       ((26164/65535)*175) - 45 = 24.866483558

#       b'f'' = 0x66, 0x27
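#       the ASCII code for ' should be 0x27

# Yuan: So the characters are converted to hex via their ASCII codes, right?
#  Kim: Yes, because it is a byte array
#       unless you see \x
#       everything else has to be converted from ASCII

# Yuan: OK, and everything starting with \x is a two-digit hex value, right?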
#  Kim: Yes
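
Kim's arithmetic can be checked in a few lines. This is a minimal sketch rather than the project's code: the variable names are illustrative, and the formula is the one quoted in the conversation above. It also shows that indexing a bytes object already yields integers, so `f` and `4` never need to be looked up in an ASCII table by hand.

```python
# b'f4' is just two bytes; Python displays them as characters
# because both values are printable ASCII.
raw = b'f4'

# Indexing a bytes object returns integers, not characters.
high, low = raw[0], raw[1]          # 0x66, 0x34

# Combine the two bytes, most significant byte first.
value = (high << 8) | low           # 0x6634 == 26164

# Conversion formula quoted by Kim above.
temperature = (value / 65535) * 175 - 45

print(hex(high), hex(low), value, round(temperature, 2))
# prints: 0x66 0x34 26164 24.87
```
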
  •  The “clumsy” solution 
#!/usr/bin/env python3

"""
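Idea:
Record each “\x followed by two characters” part directly as a hex value, and convert every single character that is not part of such an escape to its ASCII code.
So I had the bright idea of borrowing a few Greek letters from UTF-8 (they are not in the ASCII table, which avoids collisions) and first replacing each known “\x followed by two characters” hex substring with a Greek letter.
Then I walk through the substituted string and convert every non-Greek character from its ASCII code to the corresponding hex form.
Finally, the two kinds of hex values are merged into one pure hex string, which is converted to a decimal value for the temperature calculation that follows.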
"""

import re

temp_list = [
        b'dW',
        b'd^',
        b'dB',
        b'd>',
        b'dP',
        b'dD',
        b'dS',
        b'dQ',
        b'dP',
        b'dO',
        b'dK',
        b'',
        b'dU',
        b'df',
        b'd_',
        b'dd(\xa6\x99\x95',
        b'd\xa4o\xa5\xc2,',
        b'd\xb0\xe8\xa5\xddA',
        b'd\xa3\xf8\xa5\xaa.',
        b'd\xb0\xe8\xa5\xac\x88',
        b'd\xb8Q\xa5\xb3\xe5',
        b'd\xac\xd6\xa5\x8b\x99',
        b'd\xaap\xa5\xa41',
        b'd\xae\xb4\xa5\xa7b',
        b'd\xa7<\xa5\x93c',
        b'd\xa5^\xa5\x98\x89',
        b'd\xac\xd6\xa5\x9f\x1e',
        b'd\xb9`\xa5\x9d|',
        b'd\xa7<\xa5\xa3\xa6',
        b'd\xb1\xd9\xa5\x9a\xeb',
        b'd\xb8Q\xa5\x94\xf4',
        b'd\xb9`\xa5\x89\xfb',
        b'd\xb8Q\xa5xH',
        b'',
        b'd\xb3\xbb\xa5\x91\x01',
        b'd\xb0\xe8\xa5\x96\x96',
        b'd\xba3\xa5\x7f\xdf',
        b'd\xb2\x8a\xa5xH',
        b'd\xaap\xa5s\xa2',
        b'd\xabA\xa5s\xa2',
        b'd\xba3\xa5\x81B',
        b'd\xb8Q\xa5n\xad',
        b'd\xb2\x8a\xa5yy',
        b'd\xac\xd6\xa5h\x0b',
        b'd\xb6N\xa5n\xad',
        b'd\xabA\xa5wf',
        b'd\xad\xe7\xa5l\xcf',
        b'd\xbc\x95\xa5i:',
        b'd\xb3\xbb\xa5];',
        b'd\xb1\xd9\xa5eG',
        b'd\xbb\x02\xa5s\xa2',
        b'd\xac\xd6\xa5p\xf1',
        b'',
        b'd\xad\xe7\xa5\\\n',
        b'\n',
        b'd\n',
        b'd\xbc\x95\xa5O\x1a',
        b'd\xb1\xd9\xa5LI']

def get_temperature(_data):
    if _data:
        data1 = str(_data).lstrip("b").strip("'")

        reg = r'\\x([0-9a-f]{2})'
        val_ret = re.findall(reg, data1)

        unichr = 'δλξπσω'
        unidict = {}
        data2 = str(data1)
        if val_ret:
            for i, ret in enumerate(val_ret):
                unidict[unichr[i]] = '0x{}'.format(ret)
                data2 = data2.replace('\\x{}'.format(ret), unichr[i])

        hex_str = []
        for da in data2:
            if da in unidict.keys():
                hex_str.append(unidict[da])
            else:
                hex_str.append(hex(ord(da)))

        data_pool = hex_str
        value1 = int(''.join(data_pool[:2]).replace('0x', ''), 16)
        degc = round(((value1 / 65535) * 175) - 45, 2)
        # prh = round(((value2 / 65535) * 125) - 6, 2)

        # if prh > 100:
        #     prh = 100
        # if prh < 0:
        #     prh = 0

        # unit = chr(8451)
        unit = "°C"

        if degc > 40 or degc < 10:
            result = 'degc: {} {} ({}, {})'.format(degc, unit, _data, val_ret)
        else:
            result = 'degc: {} {}'.format(degc, unit)
        # print('DEG: {} {}'.format(degc, unit), ', PRH: {} {}'.format(prh, '%'))

        # print("{} ==> {} ==> Hex: {} ==> DEG: {} {}".format(_data, val_ret, data_pool, degc, unit))
        print(f"{str(_data).ljust(25)} ==> hex: {repr(val_ret).ljust(30)} ==> full_hex: {repr(data_pool).ljust(64)} ==> DEG: {degc} {unit}")
    else:
        result = '-- no data --'

    return result

if __name__ == '__main__':

    for i, temp in enumerate(temp_list):
        # print('{}: {}    {}'.format(str(i).zfill(2), get_temperature(temp), temp))
        get_temperature(temp)

"""
Output:
b'dW'                     ==> hex: []                             ==> full_hex: ['0x64', '0x57']                                                 ==> DEG: 23.59 °C
b'd^'                     ==> hex: []                             ==> full_hex: ['0x64', '0x5e']                                                 ==> DEG: 23.61 °C
b'dB'                     ==> hex: []                             ==> full_hex: ['0x64', '0x42']                                                 ==> DEG: 23.54 °C
b'd>'                     ==> hex: []                             ==> full_hex: ['0x64', '0x3e']                                                 ==> DEG: 23.53 °C
b'dP'                     ==> hex: []                             ==> full_hex: ['0x64', '0x50']                                                 ==> DEG: 23.57 °C
b'dD'                     ==> hex: []                             ==> full_hex: ['0x64', '0x44']                                                 ==> DEG: 23.54 °C
b'dS'                     ==> hex: []                             ==> full_hex: ['0x64', '0x53']                                                 ==> DEG: 23.58 °C
b'dQ'                     ==> hex: []                             ==> full_hex: ['0x64', '0x51']                                                 ==> DEG: 23.58 °C
b'dP'                     ==> hex: []                             ==> full_hex: ['0x64', '0x50']                                                 ==> DEG: 23.57 °C
b'dO'                     ==> hex: []                             ==> full_hex: ['0x64', '0x4f']                                                 ==> DEG: 23.57 °C
b'dK'                     ==> hex: []                             ==> full_hex: ['0x64', '0x4b']                                                 ==> DEG: 23.56 °C
b'dU'                     ==> hex: []                             ==> full_hex: ['0x64', '0x55']                                                 ==> DEG: 23.59 °C
b'df'                     ==> hex: []                             ==> full_hex: ['0x64', '0x66']                                                 ==> DEG: 23.63 °C
b'd_'                     ==> hex: []                             ==> full_hex: ['0x64', '0x5f']                                                 ==> DEG: 23.61 °C
b'dd(\xa6\x99\x95'        ==> hex: ['a6', '99', '95']             ==> full_hex: ['0x64', '0x64', '0x28', '0xa6', '0x99', '0x95']                 ==> DEG: 23.63 °C
b'd\xa4o\xa5\xc2,'        ==> hex: ['a4', 'a5', 'c2']             ==> full_hex: ['0x64', '0xa4', '0x6f', '0xa5', '0xc2', '0x2c']                 ==> DEG: 23.8 °C
b'd\xb0\xe8\xa5\xddA'     ==> hex: ['b0', 'e8', 'a5', 'dd']       ==> full_hex: ['0x64', '0xb0', '0xe8', '0xa5', '0xdd', '0x41']                 ==> DEG: 23.83 °C
b'd\xa3\xf8\xa5\xaa.'     ==> hex: ['a3', 'f8', 'a5', 'aa']       ==> full_hex: ['0x64', '0xa3', '0xf8', '0xa5', '0xaa', '0x2e']                 ==> DEG: 23.8 °C
b'd\xb0\xe8\xa5\xac\x88'  ==> hex: ['b0', 'e8', 'a5', 'ac', '88'] ==> full_hex: ['0x64', '0xb0', '0xe8', '0xa5', '0xac', '0x88']                 ==> DEG: 23.83 °C
b'd\xb8Q\xa5\xb3\xe5'     ==> hex: ['b8', 'a5', 'b3', 'e5']       ==> full_hex: ['0x64', '0xb8', '0x51', '0xa5', '0xb3', '0xe5']                 ==> DEG: 23.85 °C
b'd\xac\xd6\xa5\x8b\x99'  ==> hex: ['ac', 'd6', 'a5', '8b', '99'] ==> full_hex: ['0x64', '0xac', '0xd6', '0xa5', '0x8b', '0x99']                 ==> DEG: 23.82 °C
b'd\xaap\xa5\xa41'        ==> hex: ['aa', 'a5', 'a4']             ==> full_hex: ['0x64', '0xaa', '0x70', '0xa5', '0xa4', '0x31']                 ==> DEG: 23.81 °C
b'd\xae\xb4\xa5\xa7b'     ==> hex: ['ae', 'b4', 'a5', 'a7']       ==> full_hex: ['0x64', '0xae', '0xb4', '0xa5', '0xa7', '0x62']                 ==> DEG: 23.83 °C
b'd\xa7<\xa5\x93c'        ==> hex: ['a7', 'a5', '93']             ==> full_hex: ['0x64', '0xa7', '0x3c', '0xa5', '0x93', '0x63']                 ==> DEG: 23.81 °C
b'd\xa5^\xa5\x98\x89'     ==> hex: ['a5', 'a5', '98', '89']       ==> full_hex: ['0x64', '0xa5', '0x5e', '0xa5', '0x98', '0x89']                 ==> DEG: 23.8 °C
b'd\xac\xd6\xa5\x9f\x1e'  ==> hex: ['ac', 'd6', 'a5', '9f', '1e'] ==> full_hex: ['0x64', '0xac', '0xd6', '0xa5', '0x9f', '0x1e']                 ==> DEG: 23.82 °C
b'd\xb9`\xa5\x9d|'        ==> hex: ['b9', 'a5', '9d']             ==> full_hex: ['0x64', '0xb9', '0x60', '0xa5', '0x9d', '0x7c']                 ==> DEG: 23.85 °C
b'd\xa7<\xa5\xa3\xa6'     ==> hex: ['a7', 'a5', 'a3', 'a6']       ==> full_hex: ['0x64', '0xa7', '0x3c', '0xa5', '0xa3', '0xa6']                 ==> DEG: 23.81 °C
b'd\xb1\xd9\xa5\x9a\xeb'  ==> hex: ['b1', 'd9', 'a5', '9a', 'eb'] ==> full_hex: ['0x64', '0xb1', '0xd9', '0xa5', '0x9a', '0xeb']                 ==> DEG: 23.83 °C
b'd\xb8Q\xa5\x94\xf4'     ==> hex: ['b8', 'a5', '94', 'f4']       ==> full_hex: ['0x64', '0xb8', '0x51', '0xa5', '0x94', '0xf4']                 ==> DEG: 23.85 °C
b'd\xb9`\xa5\x89\xfb'     ==> hex: ['b9', 'a5', '89', 'fb']       ==> full_hex: ['0x64', '0xb9', '0x60', '0xa5', '0x89', '0xfb']                 ==> DEG: 23.85 °C
b'd\xb8Q\xa5xH'           ==> hex: ['b8', 'a5']                   ==> full_hex: ['0x64', '0xb8', '0x51', '0xa5', '0x78', '0x48']                 ==> DEG: 23.85 °C
b'd\xb3\xbb\xa5\x91\x01'  ==> hex: ['b3', 'bb', 'a5', '91', '01'] ==> full_hex: ['0x64', '0xb3', '0xbb', '0xa5', '0x91', '0x01']                 ==> DEG: 23.84 °C
b'd\xb0\xe8\xa5\x96\x96'  ==> hex: ['b0', 'e8', 'a5', '96', '96'] ==> full_hex: ['0x64', '0xb0', '0xe8', '0xa5', '0x96', '0x96']                 ==> DEG: 23.83 °C
b'd\xba3\xa5\x7f\xdf'     ==> hex: ['ba', 'a5', '7f', 'df']       ==> full_hex: ['0x64', '0xba', '0x33', '0xa5', '0x7f', '0xdf']                 ==> DEG: 23.86 °C
b'd\xb2\x8a\xa5xH'        ==> hex: ['b2', '8a', 'a5']             ==> full_hex: ['0x64', '0xb2', '0x8a', '0xa5', '0x78', '0x48']                 ==> DEG: 23.84 °C
b'd\xaap\xa5s\xa2'        ==> hex: ['aa', 'a5', 'a2']             ==> full_hex: ['0x64', '0xaa', '0x70', '0xa5', '0x73', '0xa2']                 ==> DEG: 23.81 °C
b'd\xabA\xa5s\xa2'        ==> hex: ['ab', 'a5', 'a2']             ==> full_hex: ['0x64', '0xab', '0x41', '0xa5', '0x73', '0xa2']                 ==> DEG: 23.82 °C
b'd\xba3\xa5\x81B'        ==> hex: ['ba', 'a5', '81']             ==> full_hex: ['0x64', '0xba', '0x33', '0xa5', '0x81', '0x42']                 ==> DEG: 23.86 °C
b'd\xb8Q\xa5n\xad'        ==> hex: ['b8', 'a5', 'ad']             ==> full_hex: ['0x64', '0xb8', '0x51', '0xa5', '0x6e', '0xad']                 ==> DEG: 23.85 °C
b'd\xb2\x8a\xa5yy'        ==> hex: ['b2', '8a', 'a5']             ==> full_hex: ['0x64', '0xb2', '0x8a', '0xa5', '0x79', '0x79']                 ==> DEG: 23.84 °C
b'd\xac\xd6\xa5h\x0b'     ==> hex: ['ac', 'd6', 'a5', '0b']       ==> full_hex: ['0x64', '0xac', '0xd6', '0xa5', '0x68', '0x0b']                 ==> DEG: 23.82 °C
b'd\xb6N\xa5n\xad'        ==> hex: ['b6', 'a5', 'ad']             ==> full_hex: ['0x64', '0xb6', '0x4e', '0xa5', '0x6e', '0xad']                 ==> DEG: 23.85 °C
b'd\xabA\xa5wf'           ==> hex: ['ab', 'a5']                   ==> full_hex: ['0x64', '0xab', '0x41', '0xa5', '0x77', '0x66']                 ==> DEG: 23.82 °C
b'd\xad\xe7\xa5l\xcf'     ==> hex: ['ad', 'e7', 'a5', 'cf']       ==> full_hex: ['0x64', '0xad', '0xe7', '0xa5', '0x6c', '0xcf']                 ==> DEG: 23.82 °C
b'd\xbc\x95\xa5i:'        ==> hex: ['bc', '95', 'a5']             ==> full_hex: ['0x64', '0xbc', '0x95', '0xa5', '0x69', '0x3a']                 ==> DEG: 23.86 °C
b'd\xb3\xbb\xa5];'        ==> hex: ['b3', 'bb', 'a5']             ==> full_hex: ['0x64', '0xb3', '0xbb', '0xa5', '0x5d', '0x3b']                 ==> DEG: 23.84 °C
b'd\xb1\xd9\xa5eG'        ==> hex: ['b1', 'd9', 'a5']             ==> full_hex: ['0x64', '0xb1', '0xd9', '0xa5', '0x65', '0x47']                 ==> DEG: 23.83 °C
b'd\xbb\x02\xa5s\xa2'     ==> hex: ['bb', '02', 'a5', 'a2']       ==> full_hex: ['0x64', '0xbb', '0x02', '0xa5', '0x73', '0xa2']                 ==> DEG: 23.86 °C
b'd\xac\xd6\xa5p\xf1'     ==> hex: ['ac', 'd6', 'a5', 'f1']       ==> full_hex: ['0x64', '0xac', '0xd6', '0xa5', '0x70', '0xf1']                 ==> DEG: 23.82 °C
b'd\xad\xe7\xa5\\\n'      ==> hex: ['ad', 'e7', 'a5']             ==> full_hex: ['0x64', '0xad', '0xe7', '0xa5', '0x5c', '0x5c', '0x5c', '0x6e'] ==> DEG: 23.82 °C
b'\n'                     ==> hex: []                             ==> full_hex: ['0x5c', '0x6e']                                                 ==> DEG: 18.19 °C
b'd\n'                    ==> hex: []                             ==> full_hex: ['0x64', '0x5c', '0x6e']                                         ==> DEG: 23.61 °C
b'd\xbc\x95\xa5O\x1a'     ==> hex: ['bc', '95', 'a5', '1a']       ==> full_hex: ['0x64', '0xbc', '0x95', '0xa5', '0x4f', '0x1a']                 ==> DEG: 23.86 °C
b'd\xb1\xd9\xa5LI'        ==> hex: ['b1', 'd9', 'a5']             ==> full_hex: ['0x64', '0xb1', '0xd9', '0xa5', '0x4c', '0x49']                 ==> DEG: 23.83 °C
"""
  •  The built-in method I discovered too late 
# Raw byte data
byte_data = b'd\xad\xe7\xa5\\\n'

# Convert to a hex string
hex_data = byte_data.hex()

# Print the result
print(hex_data)

"""
Output:
64ade7a55c0a
"""

# To format the hex string as two digits per byte, separated by spaces, the following code can be used

# Raw byte data
byte_data = b'd\xad\xe7\xa5\\\n'

# Convert to a hex string and separate the bytes
hex_data = byte_data.hex()
formatted_hex = ' '.join([hex_data[i:i+2] for i in range(0, len(hex_data), 2)])

# Print the result
print(formatted_hex)
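
On Python 3.8 and later, `bytes.hex()` also accepts a separator argument, which produces the same spaced output without any slicing. A short sketch, assuming a 3.8+ interpreter:

```python
byte_data = b'd\xad\xe7\xa5\\\n'

# Slice-and-join version, equivalent to the code above
hex_data = byte_data.hex()
joined = ' '.join(hex_data[i:i + 2] for i in range(0, len(hex_data), 2))

# Built-in separator argument (Python 3.8+)
spaced = byte_data.hex(' ')

print(spaced)            # 64 ad e7 a5 5c 0a
print(joined == spaced)  # True
```
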

  •  Comparing the results of the two approaches 
# My approach
import re

def byte_to_hex(_byte_data):

    # Strip the leading b and the surrounding quotes, leaving a plain string
    data1 = str(_byte_data).lstrip("b").strip("'")

    # Match each \x followed by two hex digits
    reg = r'\\x([0-9a-f]{2})'
    val_ret = re.findall(reg, data1)

    # Replace each \x escape with a Greek letter
    unichr = 'δλξπσω'
    unidict = {}
    data2 = str(data1)
    if val_ret:
        for i, ret in enumerate(val_ret):
            unidict[unichr[i]] = '0x{}'.format(ret)
            data2 = data2.replace('\\x{}'.format(ret), unichr[i])

    # Walk the new string containing the Greek placeholders and convert it to hex
    hex_chr_list = []
    for da in data2:
        if da in unidict.keys():
            hex_chr_list.append(unidict[da])
        else:
            hex_chr_list.append(hex(ord(da)))
    hex_str = ''.join(hex_chr_list).replace('0x', '')
    return hex_str

# New approach
def byte_to_hex_new(_byte_data):
    return _byte_data.hex()
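
With the built-in conversion available, the whole temperature calculation shrinks to a few lines. The sketch below is one possible rewrite, not the project's code: `decode_temperature` is a hypothetical name, and it assumes, as the old `get_temperature` does, that the first two bytes hold the raw temperature value with the high byte first.

```python
def decode_temperature(raw):
    """Return the temperature in °C from the first two bytes, or None if the read is too short."""
    if len(raw) < 2:
        return None                            # empty or truncated read
    value = int.from_bytes(raw[:2], 'big')     # same result as int(raw[:2].hex(), 16)
    return round((value / 65535) * 175 - 45, 2)


for sample in (b'dW', b'd\xac\xd6\xa5\x9f\x1e', b'd\n', b'\n', b''):
    print(sample, '==>', decode_temperature(sample))
```

Here `b'd\n'` is decoded as 0x640a rather than 0x645c, and the one-byte read `b'\n'` is reported as too short instead of being turned into a temperature.
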

  •  Final conclusion 

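The old method gives the right answer for most of the data, but Python's built-in method is simpler and more accurate, and its results agree with the ASCII table!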

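# For most inputs, the old and new approaches produce the same conversion and the same calculated temperature.
# In the current data set, 3 records convert differently. The difference comes from how \n is handled, for example in data such as \\\n.
# According to the ASCII table, \n is a single character (line feed, 0x0a), so it must be converted as a whole to its hex form, rather than converting '\' and 'n' separately as the old approach does.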
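
The difference is easy to reproduce. The sketch below (variable names are illustrative) contrasts the text that `str()` produces for a bytes object, which is what the old approach parsed, with the bytes themselves:

```python
raw = b'd\n'

# The object holds two bytes: 0x64 ('d') and 0x0a (line feed).
print(len(raw), list(raw))       # 2 [100, 10]

# str() returns the printable representation, in which the line feed is
# written out as two characters: a backslash followed by 'n'.
text = str(raw)
print(len(text), text)           # 6 b'd\n'

# Converting that text character by character, as the old approach did:
old_view = [hex(ord(ch)) for ch in text[2:-1]]
print(old_view)                  # ['0x64', '0x5c', '0x6e']

# Converting the bytes directly:
print(raw.hex())                 # 640a
```

The same applies to any byte that Python displays with a short escape, such as \t, \r, or \\, so a parser that only recognises \x escapes will misread those as well.
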


-- the end --
