[Completed] CMU Database Systems (15-445) Lab 2: B+ Tree Index Implementation (Part 3)

Posted by 周小倫 on 2021-01-27

Original article: https://www.cnblogs.com/JayL-zxl/p/14333395.html

4. Implementing Index_Iterator

This part implements the iterator operations, such as begin, end, and isEnd.

Below is the constructor for IndexIterator:

template <typename KeyType, typename ValueType, typename KeyComparator>
IndexIterator<KeyType, ValueType, KeyComparator>::
IndexIterator(BPlusTreeLeafPage<KeyType, ValueType, KeyComparator> *leaf,
              int index_, BufferPoolManager *buff_pool_manager):
    leaf_(leaf), index_(index_), buff_pool_manager_(buff_pool_manager) {}

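1. First, the implementation of the begin function

Use the key to find the leaf node that should contain it
The index of that key within the leaf is the begin position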

INDEX_TEMPLATE_ARGUMENTS
INDEXITERATOR_TYPE BPLUSTREE_TYPE::Begin(const KeyType &key) {
  auto leaf = reinterpret_cast<BPlusTreeLeafPage<KeyType, ValueType,KeyComparator> *>(FindLeafPage(key, false));
  int index = 0;
  if (leaf != nullptr) {
    index = leaf->KeyIndex(key, comparator_);
  }
  return IndexIterator<KeyType, ValueType, KeyComparator>(leaf, index, buffer_pool_manager_);
}
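
KeyIndex itself is not shown in this post. The following is a minimal sketch, assuming it returns the index of the first slot whose key is not less than the target (a lower bound), which is what Begin(key) needs. The name KeyIndexSketch, the vector-of-pairs layout, the int keys, and the CompareInts comparator are illustrative assumptions, not the lab's actual leaf page.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a leaf page's sorted (key, value) array.
using Slot = std::pair<int, int>;

// Returns the index of the first slot whose key is >= `key`,
// or slots.size() if every key is smaller (lower-bound binary search).
template <typename Comparator>
int KeyIndexSketch(const std::vector<Slot> &slots, int key, Comparator cmp) {
  int lo = 0;
  int hi = static_cast<int>(slots.size());
  while (lo < hi) {
    int mid = lo + (hi - lo) / 2;
    if (cmp(slots[mid].first, key) < 0) {
      lo = mid + 1;  // slots[mid] is too small; the answer lies to the right
    } else {
      hi = mid;  // slots[mid] may be the answer; keep it in range
    }
  }
  return lo;
}

// Three-way comparator, assumed to mirror the shape of the lab's KeyComparator.
inline int CompareInts(int a, int b) { return (a > b) - (a < b); }
```

With a lower bound, a key that is absent still yields a usable start position: the iterator begins at the first key greater than the one requested.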

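2. Implementing the end function

Find the leftmost leaf node
Then walk forward until nextPageId is INVALID_PAGE_ID (-1)
Note that != and == need to be overloaded here

The end function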

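INDEX_TEMPLATE_ARGUMENTS
INDEXITERATOR_TYPE BPLUSTREE_TYPE::end() {
  KeyType key{};
  // Second argument true: start from the first leaf in the chain
  auto leaf = reinterpret_cast<BPlusTreeLeafPage<KeyType, ValueType, KeyComparator> *>(FindLeafPage(key, true));
  // Initialized so the unpin below is skipped when the first leaf is also the last
  page_id_t new_page = INVALID_PAGE_ID;
  while (leaf->GetNextPageId() != INVALID_PAGE_ID) {
    new_page = leaf->GetNextPageId();
    // FetchPage returns a Page *; the leaf contents live in its data buffer
    leaf = reinterpret_cast<BPlusTreeLeafPage<KeyType, ValueType, KeyComparator> *>(
        buffer_pool_manager_->FetchPage(new_page)->GetData());
  }
  if (new_page != INVALID_PAGE_ID) {
    buffer_pool_manager_->UnpinPage(new_page, false);
  }
  return IndexIterator<KeyType, ValueType, KeyComparator>(leaf, leaf->GetSize(), buffer_pool_manager_);
}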

The == and != functions

bool operator==(const IndexIterator &itr) const {
  return this->index_==itr.index_&&this->leaf_==itr.leaf_;
}

bool operator!=(const IndexIterator &itr) const {
  return !this->operator==(itr);
}

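3. Overloading ++ and * (the dereference operator)

Overloading ++

Increment index, and when the current leaf is exhausted, follow nextPageId to the next leaf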

template <typename KeyType, typename ValueType, typename KeyComparator>
IndexIterator<KeyType, ValueType, KeyComparator> &IndexIterator<KeyType, ValueType, KeyComparator>::
operator++() {
  ++index_;
  if (index_ == leaf_->GetSize() && leaf_->GetNextPageId() != INVALID_PAGE_ID) {
    // The current leaf is exhausted and has a right sibling: move to it
    page_id_t next_page_id = leaf_->GetNextPageId();

    auto *page = buff_pool_manager_->FetchPage(next_page_id);
    if (page == nullptr) {
      throw Exception("all pages are pinned while IndexIterator(operator++)");
    }
    // first acquire next page, then release previous page
    page->RLatch();

    // Re-fetch the old leaf to reach its Page object and drop its read latch.
    // That fetch adds a pin, so unpin twice: once for it and (presumably)
    // once for the pin the iterator was already holding.
    buff_pool_manager_->FetchPage(leaf_->GetPageId())->RUnlatch();
    buff_pool_manager_->UnpinPage(leaf_->GetPageId(), false);
    buff_pool_manager_->UnpinPage(leaf_->GetPageId(), false);

    auto next_leaf = reinterpret_cast<BPlusTreeLeafPage<KeyType, ValueType, KeyComparator> *>(page->GetData());
    assert(next_leaf->IsLeafPage());
    index_ = 0;
    leaf_ = next_leaf;
  }
  return *this;
}

Overloading *

Simply return array[index]

template <typename KeyType, typename ValueType, typename KeyComparator>
const MappingType &IndexIterator<KeyType, ValueType, KeyComparator>::
operator*() {
  if (isEnd()) {
    throw "IndexIterator: out of range";
  }
  return leaf_->GetItem(index_);
}
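
To see how begin, end, ++, * and != fit together, here is a self-contained toy version that replaces pages and the buffer pool with plain structs and pointers. ToyLeaf, ToyLeafIterator, and ScanAll are hypothetical names used only for illustration, and the sketch assumes every leaf in the chain holds at least one key.

```cpp
#include <cassert>
#include <vector>

// Toy leaf: a sorted run of keys plus a pointer to the right sibling.
// (The real leaf page stores a next page id and goes through the buffer pool.)
struct ToyLeaf {
  std::vector<int> keys;
  ToyLeaf *next = nullptr;
};

class ToyLeafIterator {
 public:
  ToyLeafIterator(ToyLeaf *leaf, int index) : leaf_(leaf), index_(index) {}

  int operator*() const { return leaf_->keys[index_]; }

  ToyLeafIterator &operator++() {
    ++index_;
    // Hop to the sibling only when this leaf is exhausted and a sibling exists.
    // On the last leaf the iterator stops at index == size, the end position.
    if (index_ == static_cast<int>(leaf_->keys.size()) && leaf_->next != nullptr) {
      leaf_ = leaf_->next;
      index_ = 0;
    }
    return *this;
  }

  bool operator==(const ToyLeafIterator &other) const {
    return leaf_ == other.leaf_ && index_ == other.index_;
  }
  bool operator!=(const ToyLeafIterator &other) const { return !(*this == other); }

 private:
  ToyLeaf *leaf_;
  int index_;
};

// Walks from the first key of `first` to the end of the last leaf.
inline std::vector<int> ScanAll(ToyLeaf *first) {
  ToyLeaf *last = first;
  while (last->next != nullptr) last = last->next;
  ToyLeafIterator end(last, static_cast<int>(last->keys.size()));
  std::vector<int> out;
  for (ToyLeafIterator it(first, 0); it != end; ++it) out.push_back(*it);
  return out;
}
```

The end position is (last leaf, size of last leaf), which is why == has to compare both the leaf pointer and the index.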

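5. Implementing the concurrency mechanism

0. First, a review of the read-write latch mechanism

Multiple threads can share a latch for reading, whereas writes must be mutually exclusive
The maximum reader count (MaxReader) exists to keep waiting writers from starving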

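First, consider what goes wrong with multiple threads when there is no latching

Thread T1 wants to delete 44.
Thread T2 wants to look up 41.

Suppose T2 has reached position D when execution switches to T1
T1 then redistributes keys, moving 41 over to node I
When T1 finishes and control returns to T2, T2 resumes its search for 41 in the original location and fails to find it

The result is an incorrect lookup.

This is why read-write latches are needed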

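For the find operation

Because find is read-only, we can release the latch on the previous node as soon as we have reached the next one

The remaining steps all work the same way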
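
The read-only descent can be sketched with std::shared_mutex (C++17) standing in for the page latch. This is an illustrative toy under stated assumptions, not the lab's FindLeafPage: ToyNode and CrabbingFind are hypothetical names, nodes are plain heap-free structs, and the separator convention (keys greater than or equal to a separator go right) is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <shared_mutex>
#include <vector>

// Toy tree node: internal nodes route by separator keys, leaves hold keys.
struct ToyNode {
  std::shared_mutex latch;
  std::vector<int> keys;
  std::vector<ToyNode *> children;  // empty for a leaf; keys.size() + 1 otherwise
  bool IsLeaf() const { return children.empty(); }
};

// Latch-crabbing lookup: latch the child first, then release the parent.
inline bool CrabbingFind(ToyNode *root, int key) {
  ToyNode *node = root;
  node->latch.lock_shared();
  while (!node->IsLeaf()) {
    // Pick the child whose key range covers `key`.
    std::size_t i = 0;
    while (i < node->keys.size() && key >= node->keys[i]) ++i;
    ToyNode *child = node->children[i];
    child->latch.lock_shared();   // acquire the next latch...
    node->latch.unlock_shared();  // ...before dropping the previous one
    node = child;
  }
  bool found = false;
  for (int k : node->keys) found = found || (k == key);
  node->latch.unlock_shared();
  return found;
}
```

At any moment the reader holds at most two latches (parent and child), so a writer can never restructure the node the reader is about to enter.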

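Delete is different

because it has to write

Here we cannot release the latch on node A, because our delete might end up merging the root node.

When we reach D, we find that deleting 38 from D will not require a merge, so the write latches on A and B can now be released safely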

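For the insert operation

Here we can safely release the latch on A: B still has free space, so our insert cannot affect A

When we reach D, we find that D is already full. So we do not release B's latch at this point, because we will be writing to B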
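
The rule behind both walkthroughs is the "safe node" test: once a node is known to absorb the change without splitting or merging, every latch held above it can be dropped. The sketch below models that rule on a root-to-leaf path. It assumes a node splits only when an insert would push it past max_size and merges only when a delete would drop it below min_size; the lab's exact thresholds may differ. OpKind, NodeInfo, IsSafe, and LatchesHeldAtLeaf are hypothetical names.

```cpp
#include <cassert>
#include <vector>

enum class OpKind { kFind, kInsert, kDelete };

// Size bookkeeping for one node on the root-to-leaf path.
struct NodeInfo {
  int size;
  int min_size;
  int max_size;
};

// A node is safe if the operation cannot force a split or merge in it.
inline bool IsSafe(OpKind op, const NodeInfo &n) {
  switch (op) {
    case OpKind::kFind:
      return true;  // reads never change structure
    case OpKind::kInsert:
      return n.size < n.max_size;  // room for one more entry
    case OpKind::kDelete:
      return n.size > n.min_size;  // can lose one entry without underflow
  }
  return false;
}

// Walks the path top-down and returns how many latches are still held
// once the leaf has been latched.
inline int LatchesHeldAtLeaf(OpKind op, const std::vector<NodeInfo> &path) {
  int held = 0;
  for (const NodeInfo &n : path) {
    ++held;  // latch this node
    if (IsSafe(op, n)) {
      held = 1;  // safe: release every ancestor, keep only this node
    }
  }
  return held;
}
```

For the insert walkthrough above (A and B have room, D is full) the function reports two latches held at the leaf, B and D, which matches the description.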

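The algorithm above is correct, but it has a bottleneck. Only one thread at a time can hold a write latch, and every insert and delete has to take a write latch on the root. So when many threads issue inserts or deletes, performance drops sharply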

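This is where optimistic latching comes in

Optimistically assume that most operations need neither a merge nor a split. On the way down we therefore take read latches instead of write latches. Only the leaf node gets a write latch

Read latches are taken all the way from top to bottom and released step by step
A write latch is taken only at the leaf node that needs to be modified. This delete is safe, so the operation finishes right there

If at the last step we find the leaf is not safe, we have to re-execute the operation the way we did before optimistic latching was introduced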
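
A minimal sketch of the optimistic path for a delete, on a toy two-level tree with one root latch above a single leaf. It only shows the decision point: read latch above, write latch on the leaf, and a signal to retry pessimistically when the leaf would underflow. ToyLeafPage, ToyTree, Outcome, and OptimisticRemove are hypothetical names, and std::shared_mutex stands in for the page latch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <shared_mutex>
#include <vector>

struct ToyLeafPage {
  std::shared_mutex latch;
  std::vector<int> keys;
  std::size_t min_size = 0;
};

// Two-level toy tree: one root latch above a single leaf.
struct ToyTree {
  std::shared_mutex root_latch;
  ToyLeafPage leaf;
};

enum class Outcome { kDone, kRetryPessimistic };

inline Outcome OptimisticRemove(ToyTree &tree, int key) {
  tree.root_latch.lock_shared();    // read latch on the way down
  tree.leaf.latch.lock();           // write latch only at the leaf
  tree.root_latch.unlock_shared();  // parent released once the leaf is held

  // The leaf is safe if it stays at or above min_size after losing one key.
  bool safe = tree.leaf.keys.size() > tree.leaf.min_size;
  if (!safe) {
    tree.leaf.latch.unlock();
    return Outcome::kRetryPessimistic;  // caller restarts with write latches
  }
  auto &keys = tree.leaf.keys;
  keys.erase(std::remove(keys.begin(), keys.end(), key), keys.end());
  tree.leaf.latch.unlock();
  return Outcome::kDone;
}
```

The retry costs a second descent, so the scheme pays off only when unsafe leaves are rare, which is exactly the optimistic assumption.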

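A brief introduction to the B-Link Tree

Delaying updates to the parent node

A marker is placed here to record that an update is needed at this spot but has not been carried out yet

Other operations still run correctly in this state, for example a lookup of 31

Now we execute insert 33

When execution reaches node C, a thread now holds the write latch on it, so the marked pending update has to be applied at this point. 33 is inserted afterwards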

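One last note, on scan operations

Thread 1 holds a write latch on node C
Thread 2 has finished scanning node B and wants a read latch on node C

This causes a problem, because thread 2 cannot obtain the read latch

There are several ways to resolve it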

Wait until T1's write completes
Restart T2
Have T2 seize the latch outright, stopping the thread that holds it.

Note that a latch here is not the same thing as a lock

1. Implementing the helper function `UnlockUnpinPages`

For a read operation, release the read latch
Otherwise release the write latch

INDEX_TEMPLATE_ARGUMENTS
void BPLUSTREE_TYPE::
UnlockUnpinPages(Operation op, Transaction *transaction) {
  if (transaction == nullptr) {
    return;
  }

  for (auto page:*transaction->GetPageSet()) {
    if (op == Operation::READ) {
      page->RUnlatch();
      buffer_pool_manager_->UnpinPage(page->GetPageId(), false);
    } else {
      page->WUnlatch();
      buffer_pool_manager_->UnpinPage(page->GetPageId(), true);
    }
  }
  transaction->GetPageSet()->clear();

  for (const auto &page_id: *transaction->GetDeletedPageSet()) {
    buffer_pool_manager_->DeletePage(page_id);
  }
  transaction->GetDeletedPageSet()->clear();

  // if root is locked, unlock it
  node_mutex_.unlock();
}

The four built-in latch and unlatch operations

/** Acquire the page write latch. */
inline void WLatch() { rwlatch_.WLock(); }

/** Release the page write latch. */
inline void WUnlatch() { rwlatch_.WUnlock(); }

/** Acquire the page read latch. */
inline void RLatch() { rwlatch_.RLock(); }

/** Release the page read latch. */
inline void RUnlatch() { rwlatch_.RUnlock(); }

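The rwlatch_ here is a hand-written reader-writer latch class. Let's take a look at it

I'm not yet comfortable with C++ concurrent programming, so this is only a quick look. I'll add more once I've studied it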

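The WLock function
1. First acquire the mutex
2. The flag writer_entered_ records whether a writer is present
3. If a writer has already entered, the current one has to wait (this thread blocks)
4. If other threads are still reading, it must keep blocking (you cannot write while others read)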
```
void WLock() {
  std::unique_lock<mutex_t> latch(mutex_);
  while (writer_entered_) {
    reader_.wait(latch);
  }
  writer_entered_ = true;
  while (reader_count_ > 0) {
    writer_.wait(latch);
  }
}
```

The WUnlock function

Set the writer flag to false
Then notify all waiting threads

void WUnlock() {
  std::lock_guard<mutex_t> guard(mutex_);
  writer_entered_ = false;
  reader_.notify_all();
}

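The RLock function

Block if someone is writing or the maximum number of readers has been reached
Otherwise just increment the reader count

Multiple threads are allowed to read together, so this causes no errors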

void RLock() {
  std::unique_lock<mutex_t> latch(mutex_);
  while (writer_entered_ || reader_count_ == MAX_READERS) {
    reader_.wait(latch);
  }
  reader_count_++;
}

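The RUnlock function

Decrement the count
If a writer is waiting and no readers remain, wake that writer
If the count was at the maximum before the decrement, releasing the latch frees a reader slot, so wake a waiting reader.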

void RUnlock() {
  std::lock_guard<mutex_t> guard(mutex_);
  reader_count_--;
  if (writer_entered_) {
    if (reader_count_ == 0) {
      writer_.notify_one();
    }
  } else {
    if (reader_count_ == MAX_READERS - 1) {
      reader_.notify_one();
    }
  }
}
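
Putting the four functions into one class makes it possible to try the latch out. In the sketch below the function bodies follow the snippets above, while the member declarations, the MAX_READERS value, the class name ReaderWriterLatchSketch, and the ParallelIncrement demo are assumptions added so the example compiles on its own.

```cpp
#include <cassert>
#include <climits>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Member declarations are assumptions chosen to match the names used above.
class ReaderWriterLatchSketch {
  using mutex_t = std::mutex;
  using cond_t = std::condition_variable;
  static constexpr unsigned MAX_READERS = UINT_MAX;

 public:
  void WLock() {
    std::unique_lock<mutex_t> latch(mutex_);
    while (writer_entered_) reader_.wait(latch);    // one writer at a time
    writer_entered_ = true;
    while (reader_count_ > 0) writer_.wait(latch);  // drain active readers
  }
  void WUnlock() {
    std::lock_guard<mutex_t> guard(mutex_);
    writer_entered_ = false;
    reader_.notify_all();
  }
  void RLock() {
    std::unique_lock<mutex_t> latch(mutex_);
    while (writer_entered_ || reader_count_ == MAX_READERS) reader_.wait(latch);
    reader_count_++;
  }
  void RUnlock() {
    std::lock_guard<mutex_t> guard(mutex_);
    reader_count_--;
    if (writer_entered_) {
      if (reader_count_ == 0) writer_.notify_one();
    } else if (reader_count_ == MAX_READERS - 1) {
      reader_.notify_one();
    }
  }

 private:
  mutex_t mutex_;
  cond_t writer_;
  cond_t reader_;
  unsigned reader_count_ = 0;
  bool writer_entered_ = false;
};

// Hypothetical demo: several threads bump a shared counter under the write latch.
inline int ParallelIncrement(int num_threads, int per_thread) {
  ReaderWriterLatchSketch latch;
  int counter = 0;
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([&latch, &counter, per_thread] {
      for (int i = 0; i < per_thread; ++i) {
        latch.WLock();
        ++counter;
        latch.WUnlock();
      }
    });
  }
  for (auto &w : workers) w.join();
  latch.RLock();  // a reader only needs the shared side of the latch
  int result = counter;
  latch.RUnlock();
  return result;
}
```

Note that a second writer waits on reader_ rather than writer_: writer_ is reserved for the one writer that has already set writer_entered_ and is waiting for readers to drain.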

6. Summary

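Well, I've finally stumbled my way through Lab 2. The database labs will probably pause here for a while. In the meantime I need to catch up on deep learning (I do have to graduate, after all) and push through the work my advisor has assigned, while also studying C++ concurrent programming and watching 侯捷's courses.

Finally, here is the GitHub link: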
https://github.com/JayL-zxl/CMU15-445Lab
