PostgreSQL利用編譯器extension支援int128，提升聚合效能

德哥發表於2016-12-21

原文網址 : https://flycode.co/archives/112858

SQL編譯

背景

PostgreSQL 9.4以及以前的版本，在INT，INT2，INT8的聚合計算中，為了保證資料不會溢位，中間結果使用numeric來儲存。

numeric是PostgreSQL自己實現的一種數值型別，可以儲存非常大的數值(估計是做科學計算的需求)，但是犧牲了一定的效能。

為了提高聚合，特別是大資料量的聚合時的效能，社群借用了編譯器支援的int128型別，作為資料庫int, int2, int8的中間計算結果，從而提升計算效能。

編譯器相關的解釋

gcc,clang,icc都支援int128

1. gcc

6.8 128-bit Integers

As an extension the integer scalar type __int128 is supported for targets which have an integer mode wide enough to hold 128 bits. 

Simply write __int128 for a signed 128-bit integer, or unsigned __int128 for an unsigned 128-bit integer. 

There is no support in GCC for expressing an integer constant of type __int128 for targets with long long integer less than 128 bits wide.

2. icc

From what I can tell, at least icc 13.0.1+ support __int128_t and __uint128_t. Courtesy of Matt Godbolt`s Compiler Explorer:

__int128_t ai (__int128_t x, __int128_t y) {
  return x + y;
}

__int128_t mi (__int128_t x, __int128_t y) {
  return x * y;
}

__int128_t di (__int128_t x, __int128_t y) {
  return x / y;
}

__int128_t ri (__int128_t x, __int128_t y) {
  return x % y;
}
compiles to:

L__routine_start_ai_0:
ai:
        add       rdi, rdx                                      #2.14
        mov       rax, rdi                                      #2.14
        adc       rsi, rcx                                      #2.14
        mov       rdx, rsi                                      #2.14
        ret                                                     #2.14
L__routine_start_mi_1:
mi:
        mov       rax, rdi                                      #6.14
        imul      rsi, rdx                                      #6.14
        imul      rcx, rdi                                      #6.14
        mul       rdx                                           #6.14
        add       rsi, rcx                                      #6.14
        add       rdx, rsi                                      #6.14
        ret                                                     #6.14
L__routine_start_di_2:
di:
        push      rsi                                           #9.44
        call      __divti3                                      #10.14
        pop       rcx                                           #10.14
        ret                                                     #10.14
L__routine_start_ri_3:
ri:
        push      rsi                                           #13.44
        call      __modti3                                      #14.14
        pop       rcx                                           #14.14
        ret                                                     #14.14
with icc 13.0.1 (http://goo.gl/UnxEFt).

PostgreSQL int128支援

編譯時根據編譯器的特性自動判斷是否使用int128特性.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=8122e1437e332e156d971a0274879b0ee76e488a

Add, optional, support for 128bit integers.

We will, for the foreseeable future, not expose 128 bit datatypes to
SQL. But being able to use 128bit math will allow us, in a later patch,
to use 128bit accumulators for some aggregates; leading to noticeable
speedups over using numeric.

So far we only detect a gcc/clang extension that supports 128bit math,
but no 128bit literals, and no *printf support. We might want to expand
this in the future to further compilers; if there are any that that
provide similar support.


config/c-compiler.m4        diff | blob | blame | history
configure       diff | blob | blame | history
configure.in        diff | blob | blame | history
src/include/c.h     diff | blob | blame | history
src/include/pg_config.h.in      diff | blob | blame | history
src/include/pg_config.h.win32       diff | blob | blame | history

效能提升測試

There was recently talk about if we should start using 128-bit integers
(where available) to speed up the aggregate functions over integers
which uses numeric for their internal state. So I hacked together a
patch for this to see what the performance gain would be.

Previous thread:
http://www.postgresql.org/message-id/20141017182500.GF2075@alap3.anarazel.de

What the patch does is switching from using numerics in the aggregate
state to int128 and then convert the type from the 128-bit integer in
the final function.

The functions where we can make use of int128 states are:

- sum(int8)
- avg(int8)
- var_*(int2)
- var_*(int4)
- stdev_*(int2)
- stdev_*(int4)

The initial benchmark results look very promising. When summing 10
million int8 I get a speedup of ~2.5x and similarly for var_samp() on 10
million int4 I see a speed up of ~3.7x. To me this indicates that it is
worth the extra code. What do you say? Is this worth implementing?

The current patch still requires work. I have not written the detection
of int128 support yet, and the patch needs code cleanup (for example: I
used an int16_ prefix on the added functions, suggestions for better
names are welcome). I also need to decide on what estimate to use for
the size of that state.

The patch should work and pass make check on platforms where __int128_t
is supported.

The simple benchmarks:

CREATE TABLE test_int8 AS SELECT x::int8 FROM generate_series(1, 
10000000) x;

Before:

# SELECT sum(x) FROM test_int8;
       sum
----------------
  50000005000000
(1 row)

Time: 2521.217 ms

After:

# SELECT sum(x) FROM test_int8;
       sum
----------------
  50000005000000
(1 row)

Time: 1022.811 ms

CREATE TABLE test_int4 AS SELECT x::int4 FROM generate_series(1, 
10000000) x;

Before:

# SELECT var_samp(x) FROM test_int4;
       var_samp
--------------------
  8333334166666.6667
(1 row)

Time: 3808.546 ms

After:

# SELECT var_samp(x) FROM test_int4;
       var_samp
--------------------
  8333334166666.6667
(1 row)

Time: 1033.243 ms

參考

1. https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=8122e1437e332e156d971a0274879b0ee76e488a

2. https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=959277a4f579da5243968c750069570a58e92b38

3. https://www.postgresql.org/message-id/flat/544BB5F1.50709%40proxel.se#544BB5F1.50709@proxel.se

4. http://stackoverflow.com/questions/16365840/128-bit-integers-supporting-and-in-the-intel-c-compiler

5. https://gcc.gnu.org/onlinedocs/gcc/_005f_005fint128.html

PostgreSQL編寫自定義extension
2021-12-12
SQL
PHP 8.0重大版本更新正式釋出：支援JIT編譯器，效能提升高達3倍
2020-11-30
PHP編譯
PHP編譯器BPC 6.0釋出,支援namespace,支援closure,成功編譯 workerman
2022-11-24
PHP編譯namespace
PostgreSQL DBA(133) - Extension(postgresql_anonymizer)
2019-11-19
SQL
windows vs 編譯postgresql
2018-12-30
Windows編譯SQL
PostgreSQL DBA(150) - Extension(pgmetrics)
2019-12-19
SQL
PostgreSQL DBA(184) - Extension(hypoPG)
2021-06-17
SQL
6個提升PostgreSQL效能的小技巧
2023-10-11
SQL
關於支援OPenACC的編譯器說明
2018-03-06
編譯
藉助 Valve 的新編譯器，Linux 遊戲在 AMD GPU 中獲得了效能提升
2019-07-10
編譯Linux遊戲GPU
狂攬兩千星，速度百倍提升，高效能Python編譯器Codon開源
2022-12-12
Python編譯
PostgreSQL 12 正式釋出：全面的效能提升
2019-10-11
SQL
[譯] 使用 PhpFastCache 提升網站效能
2018-07-22
PHPAST網站
Java動態編譯優化——提升編譯速度（N倍）
2018-12-06
Java編譯優化
PostgreSQL DBA(83) - Extension(pg_buffercache)
2019-08-21
SQL
PostgreSQL DBA(84) - Extension(pg_prewarm)
2019-08-21
SQL
PostgreSQL DBA(72) - Extension(pgplsql_check)
2019-08-09
SQL
PostgreSQL DBA(162) - Extension(pg_catcheck)
2020-02-11
SQL
PostgreSQL DBA(134) - Extension(auto_explain)
2019-11-20
SQLAI
PostgreSQL DBA(130) - Extension(pgsql-gzip)
2019-11-18
SQL
PostgreSQL DBA(63) - Extension(pg_qualstats)
2019-07-26
SQL
PostgreSQL DBA(163) - Extension(pg_cron)
2020-02-11
SQL
webpack編譯速度提升之DllPlugin
2018-07-08
Web編譯Plugin
利用DDP技術提升Tungsten Fabric vRouter效能
2020-09-09
VR
淺談彙編器、編譯器和直譯器
2019-06-26
編譯
[譯]現代瀏覽器是如何提升效能的：網路層
2019-02-26
瀏覽器
QT支援https及編譯OpenSSL
2020-12-26
QTHTTP編譯
PostgreSQL DBA(129) - Extension(pg_variables).md
2019-11-18
SQL
PostgreSQL DBA(63) - Extension(pg_stat_statements)
2019-07-26
SQL
Ruby 將引入新 JIT 編譯器：YJIT，平均速度提升 23%
2021-11-02
編譯
vue編譯器
2024-04-11
Vue編譯
CUDAFORTRAN編譯器
2018-03-06
編譯
Micronaut使用提前編譯支援Spring Boot
2019-01-31
編譯Spring Boot
Ubuntu18.04編譯ZLMediakit支援webrtc
2023-02-18
Ubuntu編譯Web
Go編譯器簡介【譯】
2019-02-16
Go編譯
什麼是聚合路由器？聚合路由器支援的鏈路型別有哪些？
2022-01-04
路由器型別
Go 高效能系列教程之三：編譯器優化
2021-06-01
Go編譯優化
深入瞭解Java JIT編譯器：原理與效能最佳化
2023-03-16
Java編譯
程式碼線上編譯器（上）- 編輯及編譯
2018-10-30
編譯

PostgreSQL利用編譯器extension支援int128，提升聚合效能

標籤

背景

編譯器相關的解釋

PostgreSQL int128支援

效能提升測試

參考

相關文章