OpenMP Sections Construct: Implementation Principles and Source Code Analysis

Posted by 一無是處的研究僧 on 2023-02-16

Preface

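This article explains how the sections construct in OpenMP is implemented and analyzes the runtime library functions it calls. If you have already read the earlier analysis of the scheduling modes of the for construct, this article will be very easy to follow.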

Analysis from the Compiler's Perspective

In this section we look at how the compiler handles the sections construct, using the following sections construct as an example.

#pragma omp sections
{
  #pragma omp section
  stmt1;
  #pragma omp section
  stmt2;
  #pragma omp section
  stmt3;
}

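The compiler transforms the code above into the form below. GOMP_sections_start and GOMP_sections_next are both thread-safe, and each returns a number identifying an omp section block. The argument of GOMP_sections_start is the number of omp section blocks, and its return value tells the calling thread which section block to execute. The difference between the two functions is that GOMP_sections_start also performs some data initialization. When either function returns 0, no section is left to hand out, and the thread leaves the for loop.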

for (i = GOMP_sections_start (3); i != 0; i = GOMP_sections_next ())
  switch (i)
    {
    case 1:
      stmt1;
      break;
    case 2:
      stmt2;
      break;
    case 3:
      stmt3;
      break;
    }
GOMP_barrier ();
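
To make the return-value contract of the loop above concrete, here is a minimal single-threaded sketch. The names mock_sections_start, mock_sections_next, and run_lowered_loop are hypothetical stand-ins invented for this illustration; they only imitate the behavior described above (section numbers start at 1, and 0 ends the loop) and are not libgomp code.

```c
/* Hypothetical single-threaded stand-ins for the runtime calls. They only
   mimic the return-value contract described above; this is not libgomp. */
static unsigned mock_next;   /* next section number to hand out */
static unsigned mock_end;    /* one past the last section number */

static unsigned mock_sections_next(void)
{
    if (mock_next >= mock_end)
        return 0;            /* 0 means every section has been handed out */
    return mock_next++;
}

static unsigned mock_sections_start(unsigned count)
{
    mock_next = 1;           /* numbering starts at 1 because 0 is the exit value */
    mock_end = count + 1;
    return mock_sections_next();
}

/* Runs the lowered loop for three sections and records the order in which
   the case labels are reached. Returns the number of sections executed. */
static int run_lowered_loop(unsigned order[3])
{
    int executed = 0;
    unsigned i;

    for (i = mock_sections_start(3); i != 0; i = mock_sections_next())
        switch (i) {
        case 1: order[executed++] = 1; break;   /* stands for stmt1 */
        case 2: order[executed++] = 2; break;   /* stands for stmt2 */
        case 3: order[executed++] = 3; break;   /* stands for stmt3 */
        }
    return executed;
}
```

With a single caller the sections are reached in the order 1, 2, 3. With several threads, each thread would run the same loop and the section numbers would be split among them in whatever order the threads ask for work.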

Analysis of the Runtime Library Functions

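In fact, both GOMP_sections_start and GOMP_sections_next call gomp_iter_dynamic_next, a function analyzed in an earlier article. This function lets threads use atomic instructions to compete for chunks of work, which matches the semantics that sections needs. The only difference is that the chunk size for sections is always 1, because a thread executes only one section block at a time.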

unsigned
GOMP_sections_start (unsigned count)
{
  // The argument count is the total number of section blocks
  // Get the data associated with the current thread
  struct gomp_thread *thr = gomp_thread ();
  long s, e, ret;
  // Initialize the work-share data
  // Set the chunk size to 1
  // Chunk numbering starts at 1: in the lowered loop shown earlier, 0 means "exit the loop", so 0 cannot be the first index
  if (gomp_work_share_start (false))
    {
      // count is passed in so that the end bound of chunk allocation can be set; the source is shown below
      gomp_sections_init (thr->ts.work_share, count);
      gomp_work_share_init_done ();
    }
  // gomp_iter_dynamic_next returns true if the thread obtained a section to run, false otherwise
  // s and e are the start and end of the chunk; for sections every chunk has size 1,
  // which makes sense because one section block is executed at a time
  if (gomp_iter_dynamic_next (&s, &e))
    ret = s;
  else
    ret = 0;
  return ret;
}

// Below is part of the code of gomp_sections_init
static inline void
gomp_sections_init (struct gomp_work_share *ws, unsigned count)
{
  ws->sched = GFS_DYNAMIC;
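  ws->chunk_size = 1; // set the chunk size to 1
  ws->end = count + 1L; // exclusive end bound: the count sections are numbered 1..count
  ws->incr = 1; // advance by one each time
  ws->next = 1; // start handing out chunks at 1, because 0 means "exit the loop" (see the compiler section)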
}

unsigned
GOMP_sections_next (void)
{
  // This function is straightforward: claim one chunk, which grants the right to run the corresponding section
  long s, e, ret;
  if (gomp_iter_dynamic_next (&s, &e))
    ret = s;
  else
    ret = 0;
  return ret;
}

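// The function below was analyzed in several earlier articles, so it is not analyzed again here
// In short: it claims a chunk atomically (a fetch-and-add fast path when ws->mode is set, otherwise a CAS retry loop)
// and keeps trying until it succeeds or no chunk is left to hand out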
bool
gomp_iter_dynamic_next (long *pstart, long *pend)
{
  struct gomp_thread *thr = gomp_thread ();
  struct gomp_work_share *ws = thr->ts.work_share;
  long start, end, nend, chunk, incr;

  end = ws->end;
  incr = ws->incr;
  chunk = ws->chunk_size;

  if (__builtin_expect (ws->mode, 1))
    {
      long tmp = __sync_fetch_and_add (&ws->next, chunk);
      if (incr > 0)
	{
	  if (tmp >= end)
	    return false;
	  nend = tmp + chunk;
	  if (nend > end)
	    nend = end;
	  *pstart = tmp;
	  *pend = nend;
	  return true;
	}
      else
	{
	  if (tmp <= end)
	    return false;
	  nend = tmp + chunk;
	  if (nend < end)
	    nend = end;
	  *pstart = tmp;
	  *pend = nend;
	  return true;
	}
    }

  start = ws->next;
  while (1)
    {
      long left = end - start;
      long tmp;

      if (start == end)
	return false;

      if (incr < 0)
	{
	  if (chunk < left)
	    chunk = left;
	}
      else
	{
	  if (chunk > left)
	    chunk = left;
	}
      nend = start + chunk;

      tmp = __sync_val_compare_and_swap (&ws->next, start, nend);
      if (__builtin_expect (tmp == start, 1))
	break;

      start = tmp;
    }

  *pstart = start;
  *pend = nend;
  return true;
}
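
The CAS retry loop above can be reduced to a few lines once the chunk size is fixed at 1. The following is a minimal sketch under that assumption, written with C11 atomics instead of the __sync builtins. The struct mini_work_share and the functions mini_sections_init and mini_sections_claim are hypothetical names made up for this illustration; they are not the real struct gomp_work_share or any libgomp API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down work-share record: only the fields needed to
   hand out section numbers. Not the real struct gomp_work_share. */
struct mini_work_share {
    atomic_long next;   /* next section number that is still unclaimed */
    long end;           /* one past the last section number */
};

/* Uses the same values as gomp_sections_init above: numbering starts at 1
   and the exclusive end bound is count + 1. */
static void mini_sections_init(struct mini_work_share *ws, unsigned count)
{
    atomic_init(&ws->next, 1);
    ws->end = (long)count + 1;
}

/* CAS loop with a fixed chunk size of 1. Returns the claimed section
   number, or 0 once every section has been claimed. */
static unsigned mini_sections_claim(struct mini_work_share *ws)
{
    long start = atomic_load(&ws->next);

    while (start < ws->end) {
        /* Try to advance next from start to start + 1. On failure the call
           writes the current value of next into start, so the loop retries
           with up-to-date data. */
        if (atomic_compare_exchange_weak(&ws->next, &start, start + 1))
            return (unsigned)start;
    }
    return 0;
}
```

Because the only shared update is the compare-and-exchange on next, two threads can never receive the same section number: whichever thread loses the race sees the updated value and tries for the following section instead.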

Summary

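This article covered how sections is implemented in OpenMP and analyzed the related runtime library functions. The key point is how the compiler lowers the sections directive. The runtime functions are the same ones used by the dynamic scheduling mode of the for loop, except that the chunk size is set to 1, chunk numbering starts at 1, and the exclusive end bound is the number of section blocks plus 1. Threads then use atomic operations such as CAS to claim sections one at a time until every section has been executed.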

More articles are collected in the project: https://github.com/Chang-LeHung/CSCore

Follow the official account 一無是處的研究僧 for more computer science content (Java, Python, computer systems fundamentals, algorithms and data structures).
