Comparing the differences between two files with pandas

Posted by w_xjlxm on 2020-10-23

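Read the files with pandas, then compare them to find the differences.

Straight to the code.

This article uses the datacompy library, which can be installed directly with pip install datacompy.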

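import pandas as pd
import datacompy


if __name__ == "__main__":
    # First file: skip the first line and name the three columns explicitly.
    a = pd.read_csv('./sot2/ICX01.SOT2', skiprows=1, names=["X", "Y", "BIN"])
    # Keep only rows whose BIN is non-zero, then drop BIN so only X and Y remain.
    # drop() returns a new frame, which avoids deleting a column from a slice.
    aa = a[a['BIN'] != 0].drop(columns='BIN')
    # aa.to_csv('./aa.csv', index=False)

    # Second file: same layout, same filtering.
    b = pd.read_csv('./sot2user/ICX.SOT2', skiprows=1, names=["X", "Y", "BIN"])
    bb = b[b['BIN'] != 0].drop(columns='BIN')
    # bb.to_csv('./bb.csv', index=False)

    # Earlier attempts with plain pandas, kept for reference:
    # c = a[a != b]
    # c = c.drop_duplicates(['X', 'Y', "BIN"])
    # c.to_csv('./sot.SOT2', index=False)
    # c.to_excel('./sot.xlsx', index=False)
    # print(a.equals(b))
    # print(a.merge(b))

    # Match rows on (X, Y) and report the coordinates found in only one file.
    compare = datacompy.Compare(bb, aa, join_columns=["X", "Y"])
    print(compare.matches())
    print(compare.report())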

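The datacompy library has to be installed for this comparison. pandas has its own ways of comparing frames, but their output is not concise.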
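For contrast, here is a minimal sketch of a pandas-only comparison. It assumes two small in-memory frames with made-up values standing in for the filtered SOT2 data, and uses an outer merge with indicator=True to tag each row by its origin. This is one possible approach, not what the script above does.

```python
import pandas as pd

# Hypothetical stand-ins for the two filtered coordinate tables.
aa = pd.DataFrame({"X": [1, 2, 3], "Y": [10, 20, 30]})
bb = pd.DataFrame({"X": [1, 2, 4], "Y": [10, 20, 40]})

# An outer merge with indicator=True adds a "_merge" column that says
# whether each row came from the left frame, the right frame, or both.
merged = bb.merge(aa, on=["X", "Y"], how="outer", indicator=True)

# Split out the rows that exist on one side only.
only_in_bb = merged.loc[merged["_merge"] == "left_only", ["X", "Y"]]
only_in_aa = merged.loc[merged["_merge"] == "right_only", ["X", "Y"]]

print(only_in_bb)
print(only_in_aa)
```

This gets the answer, but the filtering and the summary have to be written by hand, which is the part datacompy takes care of.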

compare = datacompy.Compare(bb, aa, join_columns=["X","Y"])

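This line runs the comparison with datacompy. bb and aa are the DataFrames read by pandas, and ["X", "Y"] are the join columns: rows from the two frames are matched on these columns, and any remaining shared columns are compared value by value.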

print(compare.matches())

This prints the boolean result of the comparison: True if the two DataFrames match, False otherwise.

print(compare.report())

This prints the detailed information about the differences.
