Data Structures and Algorithm Analysis (Hashing)

Posted by myxs on 2017-04-23

Original URL: http://www.ituring.com.cn/article/274866

Tags: Data-Structures, Algorithms

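Once a hash function has been chosen, the remaining problem in hashing is how to resolve collisions. There are two simple approaches: separate chaining and open addressing.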

1. Separate chaining

The core idea: all elements that hash to the same value are kept in one linked list.

Figure: a separate chaining hash table.

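Find and Insert are the key operations of a hash table.

Find: compute the hash value with the hash function to locate the corresponding linked list; what remains is an ordinary list traversal.

Insert: call the Find routine first and use its result to decide whether the key needs to be inserted. If it does, the new node goes in the first position after the header node.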

/* Map a key to a bucket index. Keys are assumed non-negative:
   in C, a negative key % size yields a negative index. */
int hash(int key, int size){
    return key%size;
}


Hash Init(int Size){
    Hash H;
    H = malloc(sizeof(struct HashTable));
    H->Size = Size;
    /* H->List is an array of Size pointers to header nodes */
    H->List = malloc(sizeof(Position)*H->Size);
    for (int i = 0; i < H->Size; ++i) {
        /* each bucket gets its own header node */
        H->List[i] = malloc(sizeof(struct ListNode));
        H->List[i]->Next = NULL;
    }
    return H;
}


Position Find(int key, Hash H){
    Position P;
    Position L;
    L = H->List[hash(key, H->Size)];
    P = L->Next;
    while(P && P->Elem != key){
        P = P->Next;
    }
    return P;
}


void Insert(int key, Hash H){
    Position P,NewCell;
    Position L;
    L = H->List[hash(key,H->Size)];
    P = Find(key,H);
    if (P == NULL){
        NewCell = malloc(sizeof(struct ListNode));
        NewCell->Elem = key;
        NewCell->Next = L->Next;
        L->Next = NewCell;
    }
}
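
The routines above depend on type declarations that the post never shows. The following is a minimal sketch of one set of declarations under which they compile, together with compact versions of the same three routines. The struct layouts are inferred from how the fields are used, and `hash_index` and `Destroy` are hypothetical helpers added here for illustration, not part of the original code.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed declarations: the post uses these names but never shows them. */
struct ListNode {
    int Elem;
    struct ListNode *Next;
};
typedef struct ListNode *Position;

struct HashTable {
    int Size;
    Position *List;   /* array of Size pointers to header nodes */
};
typedef struct HashTable *Hash;

/* Hypothetical helper: maps any int key, negatives included, into [0, size). */
static int hash_index(int key, int size) {
    int r = key % size;
    return r < 0 ? r + size : r;
}

Hash Init(int Size) {
    Hash H = malloc(sizeof *H);
    assert(H != NULL);
    H->Size = Size;
    H->List = malloc(sizeof *H->List * Size);    /* Size pointers, not Size nodes */
    assert(H->List != NULL);
    for (int i = 0; i < Size; ++i) {
        H->List[i] = malloc(sizeof **H->List);   /* one header node per bucket */
        assert(H->List[i] != NULL);
        H->List[i]->Next = NULL;
    }
    return H;
}

Position Find(int key, Hash H) {
    Position P = H->List[hash_index(key, H->Size)]->Next;
    while (P != NULL && P->Elem != key)
        P = P->Next;
    return P;
}

void Insert(int key, Hash H) {
    if (Find(key, H) != NULL)
        return;                                  /* key already present */
    Position L = H->List[hash_index(key, H->Size)];
    Position NewCell = malloc(sizeof *NewCell);
    assert(NewCell != NULL);
    NewCell->Elem = key;
    NewCell->Next = L->Next;                     /* link in right after the header */
    L->Next = NewCell;
}

/* Hypothetical cleanup helper: frees every node, header nodes included. */
void Destroy(Hash H) {
    for (int i = 0; i < H->Size; ++i) {
        Position P = H->List[i];
        while (P != NULL) {
            Position Next = P->Next;
            free(P);
            P = Next;
        }
    }
    free(H->List);
    free(H);
}
```

With a table of size 7, the keys 10, 3 and -4 all land in bucket 3, so inserting them in that order leaves -4 at the front of that bucket's list.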

When writing Init, the pointer array has to be allocated first, and then each list's header node has to be allocated separately:

H->List = malloc(sizeof(Position)*H->Size);
for (int i = 0; i < H->Size; ++i) {
    H->List[i] = malloc(sizeof(struct ListNode));
    H->List[i]->Next = NULL;
}

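The problem lay in how I read the first line. I assumed it had already allocated the nodes themselves, so I wrote the following code, in which every H->List[i] is still an uninitialized pointer:

H->List = malloc(sizeof(struct ListNode)*H->Size);
for (int i = 0; i < H->Size; ++i) {
    H->List[i]->Next = NULL;   /* bug: dereferences an uninitialized pointer */
}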

Single-stepping in the debugger stopped with SIGSEGV (Segmentation fault).

I found the following answer on Stack Overflow:

It means your program has tried to access memory that doesn't belong to it. Basically, you have a pointer that contains an invalid value somewhere in your code - a common source of this error is dereferencing a NULL pointer.

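I am recording this as a pitfall I fell into myself. I had never really understood arrays of pointers, and this time I finally understand them a little better.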
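
To make the distinction concrete, here is a small sketch contrasting the two possible layouts for the bucket array. The function names `make_pointer_array` and `make_node_array` are hypothetical and serve only this illustration. The second layout is not what the post uses, but it shows when a single `malloc` would have been enough.

```c
#include <assert.h>
#include <stdlib.h>

struct ListNode {
    int Elem;
    struct ListNode *Next;
};

/* Layout A: an array of pointers. Every slot needs its own header node. */
struct ListNode **make_pointer_array(int size) {
    struct ListNode **list = malloc(sizeof *list * size);
    assert(list != NULL);
    for (int i = 0; i < size; ++i) {
        list[i] = malloc(sizeof *list[i]);   /* without this, list[i] is garbage */
        assert(list[i] != NULL);
        list[i]->Next = NULL;
    }
    return list;
}

/* Layout B: an array of header nodes stored inline. One allocation suffices. */
struct ListNode *make_node_array(int size) {
    struct ListNode *list = malloc(sizeof *list * size);
    assert(list != NULL);
    for (int i = 0; i < size; ++i)
        list[i].Next = NULL;                 /* '.' rather than '->' */
    return list;
}
```

In layout A the elements are accessed with `list[i]->Next`; in layout B they are accessed with `list[i].Next`. Mixing the allocation of one layout with the access syntax of the other is the kind of mismatch that produced the segmentation fault.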

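2. Open addressing

Unlike separate chaining, which uses linked lists, open addressing relies on a function F(i) to pick another empty cell when a collision occurs: the i-th probe examines cell (Hash(X) + F(i)) mod TableSize.

Linear probing: F(i) = i
Quadratic probing: F(i) = i^2
Double hashing: F(i) = i * hash2(X) (not to be confused with rehashing, which changes the size of the hash table)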
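
The three strategies can be compared by computing probe positions directly from their definitions. The sketch below does so; the function name `probe`, the choice hash2(X) = R - (X mod R), and the value R = 7 are assumptions made for illustration, since the post does not define hash2.

```c
#include <assert.h>

/* Probe strategies named in the text. */
enum Strategy { LINEAR, QUADRATIC, DOUBLE_HASH };

/* Assumed second hash function: R - (key mod R), with R a prime smaller
   than the table size. For key >= 0 it never returns 0. */
static int hash2(int key, int r) {
    return r - (key % r);
}

/* Index of the i-th probe: (hash(key) + F(i)) mod size, with hash(key) = key mod size. */
int probe(int key, int i, int size, enum Strategy s) {
    int f;
    assert(key >= 0 && i >= 0 && size > 7);
    switch (s) {
    case LINEAR:    f = i;                 break;
    case QUADRATIC: f = i * i;             break;
    default:        f = i * hash2(key, 7); break;   /* R = 7 is an assumed prime */
    }
    return (key % size + f) % size;
}
```

For key 89 in a table of size 10, the home cell is 9. Linear probing then tries cells 0 and 1, quadratic probing tries 0, 3 and 8, and double hashing (with hash2(89) = 2) tries 1 and 3.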

Initialization

HashTable Init(int Size){
    HashTable H;
    H = malloc(sizeof(struct HashTbl));
    H->Size = Size;
    H->Array = malloc(sizeof(Cell) * H->Size);
    for (int i = 0; i < H->Size; ++i) {
        H->Array[i].Info = Empty;
    }
    return H;
}

Finding the cell that holds the key, or the empty cell where it would go

int Find(int key, HashTable H){
    int CurrentPos;
    int CollisionNum = 0;
    CurrentPos = Hash(key, H->Size);
    /* test Info first: the Elem of an Empty cell was never initialized */
    while(H->Array[CurrentPos].Info != Empty && H->Array[CurrentPos].Elem != key){
        CollisionNum++;
//      CurrentPos += 1;                         // linear probing: F(i) - F(i-1) = 1
        CurrentPos += 2*CollisionNum - 1;        // quadratic probing: F(i) - F(i-1) = 2i - 1
//      CurrentPos += hash2(key,H->Size);        // double hashing: F(i) - F(i-1) = hash2(key)
        if (CurrentPos>=H->Size)
            CurrentPos -= H->Size;
    }
    return CurrentPos;
}

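One point worth noting is the fast implementation of quadratic probing: since F(i) = F(i-1) + 2 * i - 1, the next cell to probe is found with a multiplication by 2 and a decrement, with no squaring.

The book also carries a warning: never change the order of the tests in while(H->Array[CurrentPos].Info != Empty && H->Array[CurrentPos].Elem != key). My own guess was that this is an optimization. A more likely reason is that && short-circuits, so Elem is read only once the cell is known to be non-empty, and the Elem of an Empty cell was never initialized.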
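
The incremental form of quadratic probing follows directly from the definition F(i) = i^2, as this short derivation shows:

```latex
\begin{aligned}
F(i) &= i^2, \qquad F(0) = 0 \\
F(i) - F(i-1) &= i^2 - (i-1)^2 = 2i - 1 \\
F(i) &= F(i-1) + 2i - 1
\end{aligned}
```

For example, the offsets 1, 4, 9 from the home cell are reached by successive increments of 1, 3 and 5.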

Insertion

void Insert(int key, HashTable H){
    int Pos;   /* named Pos so it cannot clash with a Position type, as used in section 1 */
    Pos = Find(key,H);
    if (H->Array[Pos].Info != Legitimate){
        H->Array[Pos].Elem = key;
        H->Array[Pos].Info = Legitimate;
    }
}
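
The open addressing routines above also rely on declarations that are not shown. The sketch below is one self-contained version using quadratic probing. The `Cell`, `KindOfEntry` and `HashTbl` declarations are assumptions inferred from the field names in the code, and the key is hashed with a plain `key % Size`.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed declarations: the post uses these names without showing them. */
enum KindOfEntry { Legitimate, Empty, Deleted };

typedef struct {
    int Elem;
    enum KindOfEntry Info;
} Cell;

struct HashTbl {
    int Size;      /* should be prime for quadratic probing */
    Cell *Array;
};
typedef struct HashTbl *HashTable;

HashTable Init(int Size) {
    HashTable H = malloc(sizeof *H);
    assert(H != NULL);
    H->Size = Size;
    H->Array = malloc(sizeof *H->Array * Size);
    assert(H->Array != NULL);
    for (int i = 0; i < Size; ++i)
        H->Array[i].Info = Empty;
    return H;
}

/* Returns the cell holding key, or the empty cell where key would go.
   Assumes key >= 0 and a table that stays less than half full. */
int Find(int key, HashTable H) {
    int Pos = key % H->Size;
    int Collisions = 0;
    while (H->Array[Pos].Info != Empty && H->Array[Pos].Elem != key) {
        Pos += 2 * ++Collisions - 1;   /* quadratic probing, computed incrementally */
        if (Pos >= H->Size)
            Pos -= H->Size;
    }
    return Pos;
}

void Insert(int key, HashTable H) {
    int Pos = Find(key, H);
    if (H->Array[Pos].Info != Legitimate) {
        H->Array[Pos].Elem = key;
        H->Array[Pos].Info = Legitimate;
    }
}
```

In a table of size 11, the keys 22, 33 and 44 all hash to cell 0. Inserted in that order, they end up in cells 0, 1 and 4, which are the home cell plus the offsets 0, 1 and 4.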

Summary: linear probing is prone to primary clustering, and quadratic probing is prone to secondary clustering.

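3. Polynomial multiplication with a hash table

Multiplying every pair of terms from the two linked lists takes O(MN) time, and each product is inserted into the hash table in O(1) expected time.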

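/* find is presumably keyed on the exponent (key.Exp) alone,
   so that terms with equal exponents land in the same cell */
HashTable insert(ElementType key, HashTable h){
    int Pos;
    Pos = find(key,h);
    if(h->Cells[Pos].info == Legitimate){
        h->Cells[Pos].element.Cof += key.Cof;   /* merge like terms */
    }else if(h->Cells[Pos].info == Empty){
        h->Cells[Pos].element = key;
        h->Cells[Pos].info = Legitimate;
        h->currentNum ++;
    }
    return h;   /* the declared return type requires a value */
}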

Multiplying the polynomials

for (Position_list P1 = Poly1->Next; P1 != NULL; P1 = P1->Next) {
    for (Position_list P2 = Poly2->Next; P2 != NULL; P2 = P2->Next) {
        T.Exp = P1->Element.Exp + P2->Element.Exp;
        T.Cof = P1->Element.Cof * P2->Element.Cof;
        insert(T, H);
    }
}
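
The approach above can be sketched as a self-contained program fragment. To keep it short, this version takes the polynomials as arrays rather than linked lists and uses linear probing; both are simplifications made here, not the post's choices. The names `Table`, `multiply` and `coefficient` are hypothetical, and exponents are assumed to be non-negative.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int Cof; int Exp; } Term;   /* field names follow the post */

enum Info { Empty, Legitimate };
typedef struct { Term element; enum Info info; } Cell;

typedef struct {
    int size;      /* kept larger than the number of distinct exponents */
    Cell *Cells;
} Table;

/* Linear probing on the exponent only, so equal exponents share one cell. */
static int find(int exp, const Table *h) {
    int pos = exp % h->size;
    while (h->Cells[pos].info != Empty && h->Cells[pos].element.Exp != exp)
        pos = (pos + 1) % h->size;
    return pos;
}

static void insert(Term t, Table *h) {
    int pos = find(t.Exp, h);
    if (h->Cells[pos].info == Legitimate) {
        h->Cells[pos].element.Cof += t.Cof;   /* merge like terms */
    } else {
        h->Cells[pos].element = t;
        h->Cells[pos].info = Legitimate;
    }
}

/* Multiplies a (m terms) by b (n terms). The caller frees result.Cells. */
Table multiply(const Term *a, int m, const Term *b, int n) {
    Table h;
    h.size = 2 * m * n + 1;   /* at most m*n distinct exponents: load stays below 1/2 */
    h.Cells = calloc((size_t)h.size, sizeof *h.Cells);   /* zeroed cells read as Empty */
    assert(h.Cells != NULL);
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < n; ++j) {
            Term t = { a[i].Cof * b[j].Cof, a[i].Exp + b[j].Exp };
            insert(t, &h);
        }
    return h;
}

/* Coefficient of x^exp in the product, or 0 if no such term was stored. */
int coefficient(int exp, const Table *h) {
    int pos = find(exp, h);
    return h->Cells[pos].info == Legitimate ? h->Cells[pos].element.Cof : 0;
}
```

Multiplying (x + 1) by (x - 1) stores the terms x^2, x and the constant, with the two x products merged into one cell. A merged term whose coefficient cancels to zero stays in the table, so a caller that prints the result would want to skip zero coefficients.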

The idea is clear, the code is concise, and the performance is fairly good. I looked up the reference solution, which says:

    Another method of implementing these operations is to use a search tree instead of a hash table; a balanced tree is required because elements are inserted in the tree with too much order. A splay tree might be particularly well suited for this type of a problem because it does well with sequential accesses. Comparing the different ways of solving the problem is a good programming assignment.

I do not understand how that would be done, and I would welcome pointers from anyone who does.

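4. Improving the word puzzle program with a hash table

This is the word puzzle problem from the first page of Chapter 1. Reading the list of W words into a hash table takes O(W). Then each quadruple (row, column, direction, number of characters) is tested to see whether it spells a word. A hash table lookup is O(1), there are only 8 directions, and the maximum word length is a constant, so the search takes O(R·C) for a grid of R rows and C columns.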

// directions in clockwise order, starting from "right"
int x[8] = { 0,1,1,1,0,-1,-1,-1 };
int y[8] = { 1,1,0,-1,-1,-1,0,1 };

Key code

for (int row = 0; row < n; row++) {
    for (int col = 0; col < n; col++) {
        for (int d = 0; d < 8; d++) {
            string s;
            int rr = row;
            int cc = col;
            for (int count = 1; count <= n; count++) {
                // stop once the walk leaves the n x n grid
                if (rr < 0 || rr >= n || cc < 0 || cc >= n)
                    break;
                s += table[rr][cc];
                rr += x[d];
                cc += y[d];
                Position pos = find(s.c_str(), h);
                if(isLegitimate(pos,h))
                    cout<<s.c_str()<<endl;
            }
        }
    }
}

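Improved program: by storing each word W together with all of W's prefixes in the hash table, a lookup can stop early as soon as it finds that the current prefix is not in the table.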

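for (int count = 1; count <= n; count++) { // word length
    // stop once the walk leaves the n x n grid
    if (rr < 0 || rr >= n || cc < 0 || cc >= n)
        break;
    s += table[rr][cc];
    rr += x[d];
    cc += y[d];
    Position pos = find(s.c_str(), s.length(), h);
    if (isLegitimate(pos, h) && retrive(pos, h).info == word)
        printf("%s\n", s.c_str());
    else if (!isLegitimate(pos, h)) { // not even a prefix
        break; // give up on this direction
    }
}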
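
As a rough illustration of the prefix idea, here is a self-contained sketch in C. It is not the post's implementation: the fixed-size table, the string hash, and the names `add_word`, `count_words`, `TABLE_SIZE` and `MAX_LEN` are all assumptions chosen to keep the example short, and it counts matches instead of printing them.

```c
#include <assert.h>
#include <string.h>

#define TABLE_SIZE 211   /* assumed prime, ample for a small word list */
#define MAX_LEN 16       /* assumed maximum word length, terminator included */

enum Kind { EMPTY, PREFIX, WORD };
struct Entry { char key[MAX_LEN]; enum Kind kind; };
static struct Entry table[TABLE_SIZE];   /* zero-initialized, so every slot is EMPTY */

static unsigned hash_str(const char *s, size_t len) {
    unsigned h = 0;
    for (size_t i = 0; i < len; ++i)
        h = h * 31 + (unsigned char)s[i];
    return h % TABLE_SIZE;
}

/* Linear probing: returns the entry holding s[0..len) or the empty slot for it. */
static struct Entry *find(const char *s, size_t len) {
    unsigned pos = hash_str(s, len);
    while (table[pos].kind != EMPTY &&
           !(strlen(table[pos].key) == len && memcmp(table[pos].key, s, len) == 0))
        pos = (pos + 1) % TABLE_SIZE;
    return &table[pos];
}

/* Stores word and every proper prefix of it. A full word outranks a prefix. */
void add_word(const char *word) {
    size_t n = strlen(word);
    assert(n > 0 && n < MAX_LEN);
    for (size_t len = 1; len <= n; ++len) {
        struct Entry *e = find(word, len);
        if (e->kind == EMPTY) {
            memcpy(e->key, word, len);
            e->key[len] = '\0';
            e->kind = PREFIX;
        }
        if (len == n)
            e->kind = WORD;
    }
}

/* Counts word occurrences in an n-by-n grid stored row by row, abandoning a
   direction at the first string that is not a stored prefix. */
int count_words(int n, const char *grid) {
    static const int dr[8] = { 0, 1, 1, 1, 0, -1, -1, -1 };
    static const int dc[8] = { 1, 1, 0, -1, -1, -1, 0, 1 };
    int found = 0;
    for (int row = 0; row < n; ++row)
        for (int col = 0; col < n; ++col)
            for (int d = 0; d < 8; ++d) {
                char s[MAX_LEN];
                int rr = row, cc = col;
                for (size_t len = 0; len + 1 < MAX_LEN; ) {
                    if (rr < 0 || rr >= n || cc < 0 || cc >= n)
                        break;                 /* walked off the grid */
                    s[len++] = grid[rr * n + cc];
                    enum Kind k = find(s, len)->kind;
                    if (k == EMPTY)
                        break;                 /* no word starts this way */
                    if (k == WORD)
                        ++found;
                    rr += dr[d];
                    cc += dc[d];
                }
            }
    return found;
}
```

A typical use is to call `add_word` once per dictionary entry and then `count_words` on the grid. One limitation of this sketch is that a single-letter word would be counted once per direction, because each of the 8 walks from a cell begins with the same one-letter string.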

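Summary: the principle behind hash tables is simple, but they are extremely useful. Examples include the symbol table that records variables in a compiler front end, and the dictionary in a spell checker.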
