Tencent Cloud TDSQL for PostgreSQL - Best Practices | Optimizing SQL Statements

Posted by Tencent Cloud Database on 2021-08-11

Original URL: https://www.cnblogs.com/tencentdb/p/15130105.html

Check whether the query filters on the distribution key
postgres=# explain select * from tbase_1 where f1=1;
QUERY PLAN
--------------------------------------------------------------------------------
Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001, dn002
-> Gather (cost=1000.00..7827.20 rows=1 width=14)
Workers Planned: 2
-> Parallel Seq Scan on tbase_1 (cost=0.00..6827.10 rows=1 width=14)
Filter: (f1 = 1)
(6 rows)
postgres=# explain select * from tbase_1 where f2=1;
QUERY PLAN
--------------------------------------------------------------------------------
Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001
-> Gather (cost=1000.00..7827.20 rows=1 width=14)
Workers Planned: 2
-> Parallel Seq Scan on tbase_1 (cost=0.00..6827.10 rows=1 width=14)
Filter: (f2 = 1)
(6 rows)
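As shown above, the first query does not filter on the distribution key, so it has to be sent to every node. The slowest node then determines the overall response time, which means all nodes must deliver consistent performance. The second query filters on the distribution key and is routed to a single node (dn001). When designing queries, include the distribution key whenever possible.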
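The routing check above can be automated by reading the Node/s line of the plan text. The following is a minimal Python sketch and not part of TDSQL: the helper names target_nodes and is_single_node are made up for illustration, and it assumes the plan has already been captured as text (for example from psql) in the format shown above.

```python
import re
from typing import List

# Hypothetical helpers (not a TDSQL API): they only inspect EXPLAIN text
# that was captured elsewhere, e.g. copied from a psql session.
NODE_LINE = re.compile(r"^\s*Node/s:\s*(.+)$", re.MULTILINE)


def target_nodes(plan_text: str) -> List[str]:
    """Return the datanode names listed on the plan's 'Node/s:' line."""
    match = NODE_LINE.search(plan_text)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def is_single_node(plan_text: str) -> bool:
    """True when the query is routed to exactly one datanode."""
    return len(target_nodes(plan_text)) == 1


# Plan excerpts copied from the two queries above.
PLAN_F1 = """Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001, dn002
-> Gather (cost=1000.00..7827.20 rows=1 width=14)"""

PLAN_F2 = """Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001
-> Gather (cost=1000.00..7827.20 rows=1 width=14)"""

if __name__ == "__main__":
    for label, plan in (("f1=1", PLAN_F1), ("f2=1", PLAN_F2)):
        print(label, target_nodes(plan), "single node:", is_single_node(plan))
```

A check like this could run in a test suite over the plans of an application's hot queries, flagging any that fan out to more than one node.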

Check whether an index is used
postgres=# create index tbase_2_f2_idx on tbase_2(f2);
CREATE INDEX
postgres=# explain select * from tbase_2 where f2=1;
QUERY PLAN
-------------------------------------------------------------------------------------
Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001, dn002
-> Index Scan using tbase_2_f2_idx on tbase_2 (cost=0.42..4.44 rows=1 width=14)
Index Cond: (f2 = 1)
(4 rows)
postgres=# explain select * from tbase_2 where f3='1';
QUERY PLAN
--------------------------------------------------------------------------------
Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001, dn002
-> Gather (cost=1000.00..7827.20 rows=1 width=14)
Workers Planned: 2
-> Parallel Seq Scan on tbase_2 (cost=0.00..6827.10 rows=1 width=14)
Filter: (f3 = '1'::text)
(6 rows)
postgres=#
The first query uses the index and the second does not. An index usually speeds up queries, but it also adds overhead to updates.
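The same distinction can be read off the plan text programmatically. This is a small Python sketch under the assumption that the plan uses the standard PostgreSQL node names seen above; uses_index and seq_scanned_tables are illustrative names, not an existing API.

```python
import re
from typing import List

# Node names as they appear in PostgreSQL-style EXPLAIN output.
INDEX_NODE = re.compile(r"\b(?:Bitmap Index Scan|Index Only Scan|Index Scan)\b")
SEQ_NODE = re.compile(r"\bSeq Scan on (\w+)")


def uses_index(plan_text: str) -> bool:
    """True if any index-based scan node appears in the plan."""
    return INDEX_NODE.search(plan_text) is not None


def seq_scanned_tables(plan_text: str) -> List[str]:
    """Tables read by a (parallel or serial) sequential scan."""
    return SEQ_NODE.findall(plan_text)


# Plan excerpts copied from the two queries above.
PLAN_F2 = """Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001, dn002
-> Index Scan using tbase_2_f2_idx on tbase_2 (cost=0.42..4.44 rows=1 width=14)
Index Cond: (f2 = 1)"""

PLAN_F3 = """Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001, dn002
-> Gather (cost=1000.00..7827.20 rows=1 width=14)
Workers Planned: 2
-> Parallel Seq Scan on tbase_2 (cost=0.00..6827.10 rows=1 width=14)
Filter: (f3 = '1'::text)"""

if __name__ == "__main__":
    print("f2 uses index:", uses_index(PLAN_F2))
    print("f3 seq scans:", seq_scanned_tables(PLAN_F3))
```

A sequential scan is not always a problem (small tables, low-selectivity filters), so the output is a prompt to look at the query rather than a verdict.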

Check whether the join is on the distribution key
postgres=# explain select tbase_1.* from tbase_1,tbase_2 where tbase_1.f1=tbase_2.f1 ;
QUERY PLAN
------------------------------------------------------------------------------------------------
Remote Subquery Scan on all (dn001,dn002) (cost=29.80..186.32 rows=3872 width=40)
-> Hash Join (cost=29.80..186.32 rows=3872 width=40)
Hash Cond: (tbase_1.f1 = tbase_2.f1)
-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..158.40 rows=880 width=40)
Distribute results by S: f1
-> Seq Scan on tbase_1 (cost=0.00..18.80 rows=880 width=40)
-> Hash (cost=18.80..18.80 rows=880 width=4)
-> Seq Scan on tbase_2 (cost=0.00..18.80 rows=880 width=4)
(8 rows)
postgres=# explain select tbase_1.* from tbase_1,tbase_2 where tbase_1.f2=tbase_2.f1 ;
QUERY PLAN
---------------------------------------------------------------------------------
Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001, dn002
-> Hash Join (cost=18904.69..46257.08 rows=500564 width=14)
Hash Cond: (tbase_1.f2 = tbase_2.f1)
-> Seq Scan on tbase_1 (cost=0.00..9225.64 rows=500564 width=14)
-> Hash (cost=9225.64..9225.64 rows=500564 width=4)
-> Seq Scan on tbase_2 (cost=0.00..9225.64 rows=500564 width=4)
(7 rows)
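The first query requires data redistribution (the "Distribute results by S: f1" step in the plan) and the second does not. Joins on the distribution key perform better.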
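Redistribution shows up in the plan as a "Distribute results by" line, so it can be detected with a simple text match. The Python sketch below assumes that line format is stable across the plans being inspected; redistribution_columns is a hypothetical helper name.

```python
import re
from typing import List

# Matches lines such as "Distribute results by S: f1" and captures the column.
REDISTRIBUTE = re.compile(r"Distribute results by \w+:\s*(\w+)")


def redistribution_columns(plan_text: str) -> List[str]:
    """Columns on which rows are redistributed; empty if no redistribution."""
    return REDISTRIBUTE.findall(plan_text)


# Plan excerpts copied from the two join queries above.
PLAN_REDISTRIBUTED = """Remote Subquery Scan on all (dn001,dn002) (cost=29.80..186.32 rows=3872 width=40)
-> Hash Join (cost=29.80..186.32 rows=3872 width=40)
Hash Cond: (tbase_1.f1 = tbase_2.f1)
-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..158.40 rows=880 width=40)
Distribute results by S: f1
-> Seq Scan on tbase_1 (cost=0.00..18.80 rows=880 width=40)
-> Hash (cost=18.80..18.80 rows=880 width=4)
-> Seq Scan on tbase_2 (cost=0.00..18.80 rows=880 width=4)"""

PLAN_LOCAL = """Remote Fast Query Execution (cost=0.00..0.00 rows=0 width=0)
Node/s: dn001, dn002
-> Hash Join (cost=18904.69..46257.08 rows=500564 width=14)
Hash Cond: (tbase_1.f2 = tbase_2.f1)
-> Seq Scan on tbase_1 (cost=0.00..9225.64 rows=500564 width=14)
-> Hash (cost=9225.64..9225.64 rows=500564 width=4)
-> Seq Scan on tbase_2 (cost=0.00..9225.64 rows=500564 width=4)"""

if __name__ == "__main__":
    print("first join redistributes on:", redistribution_columns(PLAN_REDISTRIBUTED))
    print("second join redistributes on:", redistribution_columns(PLAN_LOCAL))
```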

Check which node performs the join
postgres=# explain select tbase_1.* from tbase_1,tbase_2 where tbase_1.f1=tbase_2.f1 ;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Hash Join (cost=29.80..186.32 rows=3872 width=40)
Hash Cond: (tbase_1.f1 = tbase_2.f1)
-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..158.40 rows=880 width=40)
-> Seq Scan on tbase_1 (cost=0.00..18.80 rows=880 width=40)
-> Hash (cost=126.72..126.72 rows=880 width=4)
-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..126.72 rows=880 width=4)
-> Seq Scan on tbase_2 (cost=0.00..18.80 rows=880 width=4)
(7 rows)
postgres=# set prefer_olap to on;
SET
postgres=# explain select tbase_1.* from tbase_1,tbase_2 where tbase_1.f1=tbase_2.f1 ;
QUERY PLAN
------------------------------------------------------------------------------------------------
Remote Subquery Scan on all (dn001,dn002) (cost=29.80..186.32 rows=3872 width=40)
-> Hash Join (cost=29.80..186.32 rows=3872 width=40)
Hash Cond: (tbase_1.f1 = tbase_2.f1)
-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..158.40 rows=880 width=40)
Distribute results by S: f1
-> Seq Scan on tbase_1 (cost=0.00..18.80 rows=880 width=40)
-> Hash (cost=18.80..18.80 rows=880 width=4)
-> Seq Scan on tbase_2 (cost=0.00..18.80 rows=880 width=4)
(8 rows)
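The first join runs on the CN (coordinator node). With prefer_olap set to on, the second is redistributed across the DNs (datanodes) and joined there. For OLTP workloads, joins over small amounts of data generally perform better on the CN.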
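In the two plans above, the difference is whether the join node sits above or below the first "Remote ..." node. The Python sketch below turns that observation into a rough heuristic. It is an assumption based only on the plans in this article, works on flat text without indentation, and reports only the first join it meets; join_location is a made-up name.

```python
import re

JOIN_NODE = re.compile(r"\b(?:Hash Join|Merge Join|Nested Loop)\b")
REMOTE_NODE = re.compile(r"\bRemote (?:Subquery Scan|Fast Query Execution)\b")


def join_location(plan_text: str) -> str:
    """Heuristic: where does the first join in the plan run?

    If a remote node appears before the join, the join is below it and is
    treated as running on the datanodes; otherwise on the coordinator.
    """
    seen_remote = False
    for line in plan_text.splitlines():
        if JOIN_NODE.search(line):
            return "datanode" if seen_remote else "coordinator"
        if REMOTE_NODE.search(line):
            seen_remote = True
    return "no join found"


# Plan excerpts copied from the two queries above.
PLAN_BEFORE = """Hash Join (cost=29.80..186.32 rows=3872 width=40)
Hash Cond: (tbase_1.f1 = tbase_2.f1)
-> Remote Subquery Scan on all (dn001,dn002) (cost=100.00..158.40 rows=880 width=40)
-> Seq Scan on tbase_1 (cost=0.00..18.80 rows=880 width=40)"""

PLAN_AFTER = """Remote Subquery Scan on all (dn001,dn002) (cost=29.80..186.32 rows=3872 width=40)
-> Hash Join (cost=29.80..186.32 rows=3872 width=40)
Hash Cond: (tbase_1.f1 = tbase_2.f1)"""

if __name__ == "__main__":
    print("before prefer_olap:", join_location(PLAN_BEFORE))
    print("after prefer_olap:", join_location(PLAN_AFTER))
```

For plans with several joins, a proper tree parse (for example from a structured EXPLAIN format, if the deployment supports one) would be more reliable than this line scan.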

Check the number of parallel workers
postgres=# explain select count(1) from tbase_1;
QUERY PLAN
---------------------------------------------------------------------------------------
Finalize Aggregate (cost=118.81..118.83 rows=1 width=8)
-> Remote Subquery Scan on all (dn001,dn002) (cost=118.80..118.81 rows=1 width=0)
-> Partial Aggregate (cost=18.80..18.81 rows=1 width=8)
-> Seq Scan on tbase_1 (cost=0.00..18.80 rows=880 width=0)
(4 rows)
postgres=# analyze tbase_1;
ANALYZE
postgres=# explain select count(1) from tbase_1;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Parallel Finalize Aggregate (cost=14728.45..14728.46 rows=1 width=8)
-> Parallel Remote Subquery Scan on all (dn001,dn002) (cost=14728.33..14728.45 rows=1 width=0)
-> Gather (cost=14628.33..14628.44 rows=1 width=8)
Workers Planned: 2
-> Partial Aggregate (cost=13628.33..13628.34 rows=1 width=8)
-> Parallel Seq Scan on tbase_1 (cost=0.00..12586.67 rows=416667 width=0)
(6 rows)
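The first query does not run in parallel. After analyze, the planner sees an up-to-date row estimate (rows=880 before, rows=416667 after) and the second query correctly uses a parallel plan. Run analyze after large data updates.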
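The worker count can be pulled from the "Workers Planned" line. A minimal Python sketch, with workers_planned as an illustrative helper name and the plan text assumed to be in the format shown above:

```python
import re

WORKERS = re.compile(r"Workers Planned:\s*(\d+)")


def workers_planned(plan_text: str) -> int:
    """Largest 'Workers Planned' value in the plan, or 0 for a serial plan."""
    counts = [int(n) for n in WORKERS.findall(plan_text)]
    return max(counts, default=0)


# Plan excerpts copied from the two count(1) queries above.
PLAN_BEFORE_ANALYZE = """Finalize Aggregate (cost=118.81..118.83 rows=1 width=8)
-> Remote Subquery Scan on all (dn001,dn002) (cost=118.80..118.81 rows=1 width=0)
-> Partial Aggregate (cost=18.80..18.81 rows=1 width=8)
-> Seq Scan on tbase_1 (cost=0.00..18.80 rows=880 width=0)"""

PLAN_AFTER_ANALYZE = """Parallel Finalize Aggregate (cost=14728.45..14728.46 rows=1 width=8)
-> Parallel Remote Subquery Scan on all (dn001,dn002) (cost=14728.33..14728.45 rows=1 width=0)
-> Gather (cost=14628.33..14628.44 rows=1 width=8)
Workers Planned: 2
-> Partial Aggregate (cost=13628.33..13628.34 rows=1 width=8)
-> Parallel Seq Scan on tbase_1 (cost=0.00..12586.67 rows=416667 width=0)"""

if __name__ == "__main__":
    print("before analyze:", workers_planned(PLAN_BEFORE_ANALYZE))
    print("after analyze:", workers_planned(PLAN_AFTER_ANALYZE))
```

A result of 0 on a large table is a hint that statistics may be stale and analyze is worth running.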

Check whether the execution plans on each node are consistent
./tbase_run_sql_dn_master.sh "explain select * from tbase_2 where f2=1"
dn006 --- psql -h 172.16.0.13 -p 11227 -d postgres -U tbase -c "explain select * from tbase_2 where f2=1"
QUERY PLAN
-----------------------------------------------------------------------------
Bitmap Heap Scan on tbase_2 (cost=2.18..7.70 rows=4 width=40)
Recheck Cond: (f2 = 1)
-> Bitmap Index Scan on tbase_2_f2_idx (cost=0.00..2.18 rows=4 width=0)
Index Cond: (f2 = 1)
(4 rows)
dn002 --- psql -h 172.16.0.42 -p 11012 -d postgres -U tbase -c "explain select * from tbase_2 where f2=1"
QUERY PLAN
-------------------------------------------------------------------------------
Index Scan using tbase_2_f2_idx on tbase_2 (cost=0.42..4.44 rows=1 width=14)
Index Cond: (f2 = 1)
(2 rows)
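The two DNs produce different execution plans. The most likely causes are data skew or a plan type being disabled on one of the nodes.
If possible, the DBA can schedule a database-wide analyze and vacuum during idle periods.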
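Comparing plans by eye gets tedious with many datanodes. One possible approach, sketched below in Python, is to strip the cost annotations and group nodes by the remaining plan lines. The function names plan_shape and group_nodes_by_shape are hypothetical, and the input is assumed to be the per-node EXPLAIN text collected by a script such as the one above.

```python
import re
from typing import Dict, List, Tuple

# Cost annotations differ between nodes even when the plan is the same.
COST = re.compile(r"\s*\(cost=[^)]*\)")


def plan_shape(plan_text: str) -> Tuple[str, ...]:
    """Plan lines with cost annotations and '->' markers removed.

    Note: this compares the sequence of plan lines, not the tree nesting.
    """
    shape = []
    for line in plan_text.splitlines():
        line = COST.sub("", line).strip()
        if line.startswith("->"):
            line = line[2:].strip()
        if line:
            shape.append(line)
    return tuple(shape)


def group_nodes_by_shape(plans: Dict[str, str]) -> Dict[Tuple[str, ...], List[str]]:
    """Map each distinct plan shape to the nodes that produced it."""
    groups: Dict[Tuple[str, ...], List[str]] = {}
    for node, plan in plans.items():
        groups.setdefault(plan_shape(plan), []).append(node)
    return groups


# Per-node plans copied from the output above.
PLANS = {
    "dn006": """Bitmap Heap Scan on tbase_2 (cost=2.18..7.70 rows=4 width=40)
Recheck Cond: (f2 = 1)
-> Bitmap Index Scan on tbase_2_f2_idx (cost=0.00..2.18 rows=4 width=0)
Index Cond: (f2 = 1)""",
    "dn002": """Index Scan using tbase_2_f2_idx on tbase_2 (cost=0.42..4.44 rows=1 width=14)
Index Cond: (f2 = 1)""",
}

if __name__ == "__main__":
    for shape, nodes in group_nodes_by_shape(PLANS).items():
        print(nodes, "->", shape[0])
```

More than one group in the result means the nodes disagree on the plan, which is the situation described above.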
