元旦技術大禮包-2017金秋將要釋出的PostgreSQL10.0已裝備了哪些核武器?
標籤
PostgreSQL , 10.0 , 金秋 , 元旦 , 大禮包 , commitfest
背景
早上送給大家的新年大禮包,一年一個大版本是PostgreSQL社群的傳統,雖然釋出時間通常為秋天,還有一段時間,但是已經迫不及待地想看看2017金秋將要釋出的10.0版本已經裝備了哪些核武器。
放心,還會有一波又一波的feature和增強搭上開往2017金秋的列車,本文提到的可能只是其中的某一節車廂沃,PGer是不是開始有一點振奮人心的感覺啦。
1. 平行計算專屬動態共享記憶體區,(加速索引掃描外部表並行的支援步伐)
Provide a DSA area for all parallel queries.
This will allow future parallel query code to dynamically allocate
storage shared by all participants.
Thomas Munro, with assorted changes by me.
Parallel tuplesort (for parallel B-Tree index creation)
https://commitfest.postgresql.org/12/690/
Parallel bitmap heap scan
https://commitfest.postgresql.org/12/812/
Parallel Index Scans
https://commitfest.postgresql.org/12/849/
Parallel Merge Join
https://commitfest.postgresql.org/12/918/
Parallel Append implementation
https://commitfest.postgresql.org/12/929/
parallelize queries containing subplans
https://commitfest.postgresql.org/12/941/
2. 多副本新增 “任意節點、順序” 兩種模式自由選擇
Support quorum-based synchronous replication.
This feature is also known as "quorum commit" especially in discussion
on pgsql-hackers.
This commit adds the following new syntaxes into synchronous_standby_names
GUC. By using FIRST and ANY keywords, users can specify the method to
choose synchronous standbys from the listed servers.
FIRST num_sync (standby_name [, ...])
ANY num_sync (standby_name [, ...])
The keyword FIRST specifies a priority-based synchronous replication
which was available also in 9.6 or before. This method makes transaction
commits wait until their WAL records are replicated to num_sync
synchronous standbys chosen based on their priorities.
The keyword ANY specifies a quorum-based synchronous replication
and makes transaction commits wait until their WAL records are
replicated to *at least* num_sync listed standbys. In this method,
the values of sync_state.pg_stat_replication for the listed standbys
are reported as "quorum". The priority is still assigned to each standby,
but not used in this method.
The existing syntaxes having neither FIRST nor ANY keyword are still
supported. They are the same as new syntax with FIRST keyword, i.e.,
a priorirty-based synchronous replication.
Author: Masahiko Sawada
Reviewed-By: Michael Paquier, Amit Kapila and me
Discussion: <CAD21AoAACi9NeC_ecm+Vahm+MMA6nYh=Kqs3KB3np+MBOS_gZg@mail.gmail.com>
Many thanks to the various individuals who were involved in
discussing and developing this feature.
3. 新增會話級臨時replication slots支援
(原來slot是為長時間使用設計,如STANDBY。 對於一些測試、針對性複製場景,不再需要擔心忘記刪除slot帶來的問題了)
Add support for temporary replication slots
This allows creating temporary replication slots that are removed
automatically at the end of the session or on error.
From: Petr Jelinek <petr.jelinek@2ndquadrant.com>
4. 認證協議會更加安全(SCRAM)
Refactor the code for verifying user`s password.
Split md5_crypt_verify() into three functions:
* get_role_password() to fetch user`s password from pg_authid, and check
its expiration.
* md5_crypt_verify() to check an MD5 authentication challenge
* plain_crypt_verify() to check a plaintext password.
get_role_password() will be needed as a separate function by the upcoming
SCRAM authentication patch set. Most of the remaining functionality in
md5_crypt_verify() was different for MD5 and plaintext authentication, so
split that for readability.
While we`re at it, simplify the *_crypt_verify functions by using
stack-allocated buffers to hold the temporary MD5 hashes, instead of
pallocing.
Reviewed by Michael Paquier.
Turn password_encryption GUC into an enum.
This makes the parameter easier to extend, to support other password-based
authentication protocols than MD5. (SCRAM is being worked on.)
The GUC still accepts on/off as aliases for "md5" and "plain", although
we may want to remove those once we actually add support for another
password hash type.
Michael Paquier, reviewed by David Steele, with some further edits by me.
Discussion: <CAB7nPqSMXU35g=W9X74HVeQp0uvgJxvYOuA4A-A3M+0wfEBv-w@mail.gmail.com>
https://postgrespro.com/docs/postgresproee/9.6/auth-methods.html#auth-password
21.3.2. Password Authentication
The password-based authentication methods are md5, scram, and password. These methods operate similarly except for the way that the password is sent across the connection, namely MD5-hashed, SCRAM-SHA-256 and clear-text respectively.
If you are at all concerned about password “sniffing” attacks then md5 is preferred. Plain password should always be avoided if possible. However, md5 cannot be used with the db_user_namespace feature. If the connection is protected by SSL encryption then password can be used safely (though SSL certificate authentication might be a better choice if one is depending on using SSL).
scram has more advantages than md5 as it protects from cases where the hashed password is taken directly from pg_authid in which case a connection using only the stolen hash is possible without knowing the password behind it. It protects as well from password interception and data sniffing where the password data could be directly obtained from the network as well as man-in-the-middle (MITM) attacks. So it is strongly encouraged to use it over md5 for password-based deployments.
Postgres Pro Enterprise database passwords are separate from operating system user passwords. The password for each database user is stored in the pg_authid system catalog. Passwords can be managed with the SQL commands CREATE USER and ALTER ROLE, e.g., CREATE USER foo WITH PASSWORD `secret`. If no password has been set up for a user, the stored password is null and password authentication will always fail for that user.
5. Support for SCRAM-SHA-256
https://commitfest.postgresql.org/12/829/
6. 分割槽、多級分割槽支援
此前,我們可以使用pg_pathman外掛來實現高效能多級分割槽
《PostgreSQL 9.5+ 高效分割槽表實現 – pg_pathman》
《PostgreSQL 9.6 sharding based on FDW & pg_pathman》
Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can`t be mixed. Partitions
can`t have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn`t yet supported for partitions which are foreign
tables, and it doesn`t handle updates that cross partition boundaries.
Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.
Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.
Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
Discussion: https://www.postgresql.org/message-id/3029e460-d47c-710e-507e-d8ba759d7cbb@iki.fi
7. 聚合運算減少context切換
Perform one only projection to compute agg arguments.
Previously we did a ExecProject() for each individual aggregate
argument. That turned out to be a performance bottleneck in queries with
multiple aggregates.
Doing all the argument computations in one ExecProject() is quite a bit
cheaper because ExecProject`s fastpath can do the work at once in a
relatively tight loop, and because it can get all the required columns
with a single slot_getsomeattr and save some other redundant setup
costs.
Author: Andres Freund
Reviewed-By: Heikki Linnakangas
Discussion: https://postgr.es/m/20161103110721.h5i5t5saxfk5eeik@alap3.anarazel.de
8. hash index增強
Improve hash index bucket split behavior.
Previously, the right to split a bucket was represented by a
heavyweight lock on the page number of the primary bucket page.
Unfortunately, this meant that every scan needed to take a heavyweight
lock on that bucket also, which was bad for concurrency. Instead, use
a cleanup lock on the primary bucket page to indicate the right to
begin a split, so that scans only need to retain a pin on that page,
which is they would have to acquire anyway, and which is also much
cheaper.
In addition to reducing the locking cost, this also avoids locking out
scans and inserts for the entire lifetime of the split: while the new
bucket is being populated with copies of the appropriate tuples from
the old bucket, scans and inserts can happen in parallel. There are
minor concurrency improvements for vacuum operations as well, though
the situation there is still far from ideal.
This patch also removes the unworldly assumption that a split will
never be interrupted. With the new code, a split is done in a series
of small steps and the system can pick up where it left off if it is
interrupted prior to completion. While this patch does not itself add
write-ahead logging for hash indexes, it is clearly a necessary first
step, since one of the things that could interrupt a split is the
removal of electrical power from the machine performing it.
Amit Kapila. I wrote the original design on which this patch is
based, and did a good bit of work on the comments and README through
multiple rounds of review, but all of the code is Amit`s. Also
reviewed by Jesper Pedersen, Jeff Janes, and others.
Discussion: http://postgr.es/m/CAA4eK1LfzcZYxLoXS874Ad0+S-ZM60U9bwcyiUZx9mHZ-KCWhw@mail.gmail.com
Improve handling of dead tuples in hash indexes.
When squeezing a bucket during vacuum, it`s not necessary to retain
any tuples already marked as dead, so ignore them when deciding which
tuples must be moved in order to empty a bucket page. Similarly, when
splitting a bucket, relocating dead tuples to the new bucket is a
waste of effort; instead, just ignore them.
Amit Kapila, reviewed by me. Testing help provided by Ashutosh
Sharma.
9. 支援程式級條件變數,簡化sleepwakeup設計
Support condition variables.
Condition variables provide a flexible way to sleep until a
cooperating process causes an arbitrary condition to become true. In
simple cases, this can be accomplished with a WaitLatch/ResetLatch
loop; the cooperating process can call SetLatch after performing work
that might cause the condition to be satisfied, and the waiting
process can recheck the condition each time. However, if the process
performing the work doesn`t have an easy way to identify which
processes might be waiting, this doesn`t work, because it can`t
identify which latches to set. Condition variables solve that problem
by internally maintaining a list of waiters; a process that may have
caused some waiter`s condition to be satisfied must "signal" or
"broadcast" on the condition variable.
Robert Haas and Thomas Munro
10. 支援聚合運算下推至sharding節點(postgres_fdw增強)
postgres_fdw: Push down aggregates to remote servers.
Now that the upper planner uses paths, and now that we have proper hooks
to inject paths into the upper planning process, it`s possible for
foreign data wrappers to arrange to push aggregates to the remote side
instead of fetching all of the rows and aggregating them locally. This
figures to be a massive win for performance, so teach postgres_fdw to
do it.
Jeevan Chalke and Ashutosh Bapat. Reviewed by Ashutosh Bapat with
additional testing by Prabhat Sahu. Various mostly cosmetic changes
by me.
Push down more full joins in postgres_fdw
https://commitfest.postgresql.org/12/727/
11. 支援流式備份時,同時備份資料檔案與REDO檔案
Allow pg_basebackup to stream transaction log in tar mode
This will write the received transaction log into a file called
pg_wal.tar(.gz) next to the other tarfiles instead of writing it to
base.tar. When using fetch mode, the transaction log is still written to
base.tar like before, and when used against a pre-10 server, the file
is named pg_xlog.tar.
To do this, implement a new concept of a "walmethod", which is
responsible for writing the WAL. Two implementations exist, one that
writes to a plain directory (which is also used by pg_receivexlog) and
one that writes to a tar file with optional compression.
Reviewed by Michael Paquier
12. 分散式事務
https://commitfest.postgresql.org/12/853/
13. Twophase transactions on slave, take 2
https://commitfest.postgresql.org/12/915/
14. Scan key push down to heap
https://commitfest.postgresql.org/12/850/
15. 間接索引
indirect indexes
https://commitfest.postgresql.org/12/874/
16. Logical Replication
https://commitfest.postgresql.org/12/836/
實際上9.4開始就已經支援了,通過外部外掛+邏輯流複製來實現,但是沒有整合到核心中
值得期待的10.0,一起等風來 – 2017金秋
參考
1. https://commitfest.postgresql.org/
3. https://git.postgresql.org/gitweb/?p=postgresql.git;a=summary
相關文章
- LG將釋出AI人工智慧技術 CES2017亮相AI人工智慧
- 技術運維要從哪些方面出發?運維
- 8月即將釋出的手機有哪些 八月要釋出的新手機彙總
- 迎元旦,慶surging 1.0釋出
- AI 金秋將至AI
- Docker 1.12.0將要釋出的新功能Docker
- 2017程式設計趨勢預測:哪些技術大熱?哪些技術遇冷?程式設計
- 阿里攜手中科大,2017年雲棲大會將有“量子技術”重磅釋出!阿里
- 0基礎學Web前端要掌握哪些技術?Web前端
- 英特爾將釋出“超級晶片技術” 效能提高几倍晶片
- Warning!阿里的技術小姐姐要“發電”了阿里
- 2017年10月即將釋出的新手機盤點 十月新手機有哪些?
- Java要學哪些技術 鄭州Java培訓內容有哪些Java
- RHEL 7.0已釋出CentOS 7即將到來CentOS
- 尤雨溪的 Live 說了哪些技術名詞?
- 學習運維技術要掌握哪些知識點?運維
- 微軟麻將 AI 論文釋出,首次公開技術細節微軟AI
- vivo NEX將加入人工智慧技術 6月12日釋出人工智慧
- 新品釋出|可重構技術伺服器網路卡來了!伺服器
- 網路電話:新技術、新裝備 (轉)
- 灰度釋出的一種技術實踐
- Facebook工程釋出技術的幕後故事
- 1024特輯——來自美團技術大佬的成長大禮包
- OTN 釋出的技術文章的連結地址
- 國家資訊化戰略綱要釋出核心技術突圍成關鍵
- 開源介面管理平臺YApi v1.4.3 元旦釋出API
- 魅藍5C或將釋出 魅族千元入門機要來了
- 甲骨文要改變 Java 釋出頻率,將每半年釋出一個版本Java
- 對alpha釋出的總結技術隨筆
- 華米智慧手錶即將釋出! 都有哪些賣點?
- 火山引擎聯合IDC釋出雲原生白皮書:50%企業已將雲原生技術應用到生產環境
- 釋出會全程採用AR技術 金立M5 Plus將發
- 錘子春季新品釋出會將於5月9日舉行 堅果Pro要來了
- IFA要來了 三星新一代Gear S3智慧手錶將釋出S3
- 大資料要學習哪些技術呢?大資料技術的分類與選擇路線大資料
- 掘金技術徵文活動釋出一週,十三篇優秀的徵文出爐 | 掘金技術徵文
- 技術面試已死面試
- Motorola Solutions:零售商需要的店內技術裝備