Greenplum型別一致性使用規範-索引條件、JOIN的型別一致性限制
標籤
PostgreSQL , Greenplum , 型別一致 , 索引 , inner轉換 , join
背景
在查詢時,有很多使用者會犯渾,發現建立了索引,但是查詢偏偏不走索引。
怎麼不走索引啊?
這裡做容易混淆的是型別的差異,例如欄位型別為字串,但是輸入的是INT型別,這樣就可能不走索引。(除非建立了自動的CAST,自動進行型別轉換)
查詢的輸入型別與索引的型別一致是非常有必要的。
例子
1、建表(使用變長、定長字串型別),寫入測試資料
create table tbl3(id int, c1 text, c2 varchar(64), c3 char(64));
postgres=# insert into tbl3 select id, md5, md5, md5 from (select id, md5(random()::text) as md5 from generate_series(1,1000000) t(id)) t;
INSERT 0 1000000
2、建立索引
postgres=# create index idx_tbl3_1 on tbl3 (c1);
CREATE INDEX
postgres=# create index idx_tbl3_2 on tbl3 (c2);
CREATE INDEX
postgres=# create index idx_tbl3_3 on tbl3 (c3);
CREATE INDEX
3、結構如下
postgres=# d+ tbl3
Table "public.tbl3"
Column | Type | Modifiers | Storage | Description
--------+-----------------------+-----------+----------+-------------
id | integer | | plain |
c1 | text | | extended |
c2 | character varying(64) | | extended |
c3 | character(64) | | extended |
Indexes:
"idx_tbl3_1" btree (c1)
"idx_tbl3_2" btree (c2)
"idx_tbl3_3" btree (c3)
Has OIDs: no
Distributed by: (id)
4、走索引,強制轉換為bpchar。
postgres=# explain analyze select * from tbl3 where c3=`abc`;
QUERY PLAN
------------------------------------------------------------------------------------
Gather Motion 3:1 (slice1; segments: 3) (cost=0.00..201.23 rows=1 width=135)
Rows out: 0 rows at destination with 3.081 ms to end, start offset by 0.294 ms.
-> Index Scan using idx_tbl3_3 on tbl3 (cost=0.00..201.23 rows=1 width=135)
Index Cond: c3 = `abc`::bpchar
Rows out: 0 rows (seg0) with 0.138 ms to end, start offset by 3.211 ms.
Slice statistics:
(slice0) Executor memory: 147K bytes.
(slice1) Executor memory: 145K bytes avg x 3 workers, 145K bytes max (seg0).
Statement statistics:
Memory used: 128000K bytes
Total runtime: 3.484 ms
(11 rows)
5、不走索引,因為輸入的為text變長,與char定長不一致。
postgres=# explain analyze select * from tbl3 where c3=`abc`::text;
QUERY PLAN
-------------------------------------------------------------------------------------
Gather Motion 3:1 (slice1; segments: 3) (cost=0.00..20032.60 rows=1001 width=135)
Rows out: 0 rows at destination with 305 ms to end, start offset by 0.335 ms.
-> Seq Scan on tbl3 (cost=0.00..20032.60 rows=334 width=135)
Filter: c3::text = `abc`::text
Rows out: 0 rows (seg0) with 302 ms to end, start offset by 4.023 ms.
Slice statistics:
(slice0) Executor memory: 147K bytes.
(slice1) Executor memory: 195K bytes avg x 3 workers, 195K bytes max (seg0).
Statement statistics:
Memory used: 128000K bytes
Total runtime: 305.985 ms
(11 rows)
6、走索引,輸入的型別為變長型別,欄位也是變長型別,與TEXT相容。
postgres=# explain analyze select * from tbl3 where c2=`abc`::text;
QUERY PLAN
------------------------------------------------------------------------------------
Gather Motion 3:1 (slice1; segments: 3) (cost=0.00..200.86 rows=1 width=135)
Rows out: 0 rows at destination with 2.744 ms to end, start offset by 0.275 ms.
-> Index Scan using idx_tbl3_2 on tbl3 (cost=0.00..200.86 rows=1 width=135)
Index Cond: c2::text = `abc`::text
Rows out: 0 rows (seg0) with 0.031 ms to end, start offset by 2.868 ms.
Slice statistics:
(slice0) Executor memory: 147K bytes.
(slice1) Executor memory: 145K bytes avg x 3 workers, 145K bytes max (seg0).
Statement statistics:
Memory used: 128000K bytes
Total runtime: 3.120 ms
(11 rows)
7、不走索引,輸入的型別為定長型別,但是欄位為變長型別。
postgres=# explain analyze select * from tbl3 where c2=`abc`::bpchar;
QUERY PLAN
-----------------------------------------------------------------------------------
Gather Motion 3:1 (slice1; segments: 3) (cost=0.00..17531.66 rows=1 width=135)
Rows out: 0 rows at destination with 60 ms to end, start offset by 0.276 ms.
-> Seq Scan on tbl3 (cost=0.00..17531.66 rows=1 width=135)
Filter: c2::bpchar = `abc`::bpchar
Rows out: 0 rows (seg0) with 57 ms to end, start offset by 2.864 ms.
Slice statistics:
(slice0) Executor memory: 147K bytes.
(slice1) Executor memory: 131K bytes avg x 3 workers, 131K bytes max (seg0).
Statement statistics:
Memory used: 128000K bytes
Total runtime: 60.320 ms
(11 rows)
8、內表不走索引,因為JOIN欄位型別不匹配。
postgres=# explain select count(*) from tbl3 t1 join tbl3 t2 on (t1.c1=t2.c3) and t1.c1=`abc`;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Aggregate (cost=25235.81..25235.82 rows=1 width=8)
-> Gather Motion 3:1 (slice2; segments: 3) (cost=25235.74..25235.79 rows=1 width=8)
-> Aggregate (cost=25235.74..25235.75 rows=1 width=8)
-> Nested Loop (cost=0.00..25235.66 rows=11 width=0)
-> Broadcast Motion 3:3 (slice1; segments: 3) (cost=0.00..200.88 rows=1 width=33)
-> Index Scan using idx_tbl3_1 on tbl3 t1 (cost=0.00..200.86 rows=1 width=33)
Index Cond: c1 = `abc`::text AND `abc`::text = c1
-> Seq Scan on tbl3 t2 (cost=0.00..25034.46 rows=11 width=65)
Filter: `abc`::text = c3::text AND c3::text = `abc`::text
(9 rows)
9、內表走索引,因為JOIN欄位型別都是變長,匹配。
postgres=# explain select count(*) from tbl3 t1 join tbl3 t2 on (t1.c1=t2.c2) and t1.c1=`abc`;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Aggregate (cost=401.86..401.87 rows=1 width=8)
-> Gather Motion 3:1 (slice2; segments: 3) (cost=401.80..401.85 rows=1 width=8)
-> Aggregate (cost=401.80..401.81 rows=1 width=8)
-> Nested Loop (cost=200.88..401.79 rows=2 width=0)
-> Index Scan using idx_tbl3_1 on tbl3 t1 (cost=0.00..200.86 rows=1 width=33)
Index Cond: c1 = `abc`::text AND `abc`::text = c1
-> Materialize (cost=200.88..200.89 rows=1 width=33)
-> Broadcast Motion 3:3 (slice1; segments: 3) (cost=0.00..200.88 rows=1 width=33)
-> Index Scan using idx_tbl3_2 on tbl3 t2 (cost=0.00..200.86 rows=1 width=33)
Index Cond: `abc`::text = c2::text AND c2::text = `abc`::text
(10 rows)
分析
實際上就是要求索引的表示式與條件的表示式一致。
例子,使用表示式,強制轉換為變長型別。
postgres=# create index idx_tbl3_4 on tbl3 ((c3::text));
CREATE INDEX
查詢時,只要表示式的型別與條件型別匹配即可,走索引,同時會使用Implicit轉換。
postgres=# explain analyze select * from tbl3 where (c3::text)=`abc`;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Gather Motion 3:1 (slice1; segments: 3) (cost=108.60..52793.25 rows=1001 width=135)
Rows out: 0 rows at destination with 3.796 ms to end, start offset by 0.309 ms.
-> Bitmap Heap Scan on tbl3 (cost=108.60..52793.25 rows=334 width=135)
Recheck Cond: c3::text = `abc`::text
Rows out: 0 rows (seg0) with 0.163 ms to end, start offset by 3.095 ms.
-> Bitmap Index Scan on idx_tbl3_4 (cost=0.00..108.35 rows=334 width=0)
Index Cond: c3::text = `abc`::text
Bitmaps out: Avg 1.0 x 3 workers. Max 1 (seg0) with 0.135 ms to end, start offset by 3.119 ms.
Slice statistics:
(slice0) Executor memory: 155K bytes.
(slice1) Executor memory: 321K bytes avg x 3 workers, 321K bytes max (seg0).
Statement statistics:
Memory used: 128000K bytes
Settings: enable_seqscan=off
Total runtime: 4.219 ms
(15 rows)
postgres=# explain analyze select * from tbl3 where (c3::text) = `123`::int;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Gather Motion 3:1 (slice1; segments: 3) (cost=108.60..52793.25 rows=1001 width=135)
Rows out: 0 rows at destination with 2.926 ms to end, start offset by 0.298 ms.
-> Bitmap Heap Scan on tbl3 (cost=108.60..52793.25 rows=334 width=135)
Recheck Cond: c3::text = `123`::text
Rows out: 0 rows (seg0) with 0.110 ms to end, start offset by 3.057 ms.
-> Bitmap Index Scan on idx_tbl3_4 (cost=0.00..108.35 rows=334 width=0)
Index Cond: c3::text = `123`::text
Bitmaps out: Avg 1.0 x 3 workers. Max 1 (seg0) with 0.064 ms to end, start offset by 3.095 ms.
Slice statistics:
(slice0) Executor memory: 155K bytes.
(slice1) Executor memory: 289K bytes avg x 3 workers, 289K bytes max (seg0).
Statement statistics:
Memory used: 128000K bytes
Settings: enable_seqscan=off
Total runtime: 3.327 ms
(15 rows)
postgres=# dC
List of casts
Source type | Target type | Function | Implicit?
-----------------------------+-----------------------------+--------------------+---------------
abstime | date | date | in assignment
abstime | integer | (binary coercible) | no
abstime | timestamp without time zone | timestamp | yes
abstime | timestamp with time zone | timestamptz | yes
abstime | time without time zone | time | in assignment
bigint | bit | bit | no
bigint | character | text | in assignment
bigint | character varying | text | in assignment
bigint | double precision | float8 | yes
bigint | integer | int4 | in assignment
bigint | numeric | numeric | yes
bigint | oid | oid | yes
bigint | real | float4 | yes
...............
小結
要讓索引起作用,最起碼應該關注一下幾點。
1、注意變長字串和定長字串,屬於兩種資料型別,不要混淆。
2、JOIN時操作符兩邊的欄位型別一致。
3、查詢時,WHERE條件與索引表示式一致。
4、查詢時,WHERE條件與索引欄位的型別一致。
5、查詢時,WHERE條件的collate與索引的collate一致。(本地化相關)
6、操作符本身支援對應的索引訪問方法。例如大多數的=操作符,支援b-tree的索引訪問方法。
7、PG是CBO優化,因此索引訪問的成本更低時,才會選擇索引掃描。(設定了特定的優化器開關,或者使用了HINT除外)。
參考
相關文章
- Greenplum索引設計的規範索引
- typescript type 分配條件型別TypeScript型別
- SQL | JOIN 型別使用介紹SQL型別
- MySQL索引的型別MySql索引型別
- TypeScript 索引型別TypeScript索引型別
- TypeScript 官方手冊翻譯計劃【九】:型別操控-條件型別TypeScript型別
- 使用條件型別實現TypeScript中的函式過載型別TypeScript函式
- TypeScript 條件型別精讀與實踐TypeScript型別
- Git提交規範中,常見的commit型別GitMIT型別
- 從規範看ECMAScript(二):資料型別資料型別
- 實戰Elasticsearch6的join型別Elasticsearch型別
- 32. 基本資料型別、約束條件資料型別
- Golang的值型別和引用型別的範圍、儲存區域、區別Golang型別
- Oracle中left join中右表的限制條件Oracle
- react 報錯 元素隱式具有 "any" 型別,因為型別為 "string" 的表示式不能用於索引型別 "{}"。 在型別 "{}" 上找不到具有型別為 "string" 的引數的索引簽名。React型別索引
- 資料型別範圍資料型別
- 如何理解Linux系統中的資料一致性和IO型別?Linux型別
- Java基本資料型別總結、型別轉換、常量的宣告規範,final關鍵字的用法Java資料型別
- postgreSQL 索引(二)型別介紹SQL索引型別
- <input type="file"> 限制檔案型別型別
- mysql索引型別Normal,Unique,Full Text區別以及索引方法Btree,Hash的區別MySql索引型別ORM
- 基本資料型別與API引用型別的使用資料型別API
- 淺談程式語言型別的強型別,弱型別,動態型別,靜態型別型別
- MySQL的索引型別和實現原理MySql索引型別
- MySQL 索引的型別——《高效能MySQL》MySql索引型別
- 掌握4種SQL索引型別,剖析索引原理SQL索引型別
- 值型別與引用型別的區別型別
- TypeScript 基本型別和泛型的使用TypeScript型別泛型
- mysql索引型別:FULLTEXT、NORMAL、SPATIAL、UNIQUEMySql索引型別ORM
- 【Postgresql】索引型別(btree、hash、GIST、GIN)SQL索引型別
- 關於sqlserver字元型別查詢條件區分大小寫SQLServer字元型別
- 申請英國公司需要什麼條件?都有哪些型別?型別
- SAP MM 自定義條件型別出現在採購資訊記錄的'條件'介面裡 ?型別
- TS資料型別:型別別名/聯合型別/字面量型別/型別推論等綱要資料型別
- Redis Stream型別的使用Redis型別
- MySQL 的索引型別及如何建立維護MySql索引型別
- TypeScript 官方手冊翻譯計劃【八】:型別操控-按索引訪問的型別TypeScript型別索引
- Java的基本型別和引用型別Java型別
- c#中值型別和引用型別的區別C#型別