PostgreSQL 10.0 preview performance enhancement - partitioned table performance enhancement (plan-stage speedup)

Posted by 德哥 on 2017-03-14

Tags

PostgreSQL , 10.0 , partitioned table , child table , metadata lookup performance enhancement


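Background

PostgreSQL 10.0 speeds up the lookup of child tables for partitioned tables, which improves performance for queries against partitioned tables that have a very large number of child tables.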

Performance analysis

get_tabstat_entry and find_all_inheritors are the main bottlenecks.

Hello.  

I decided to figure out whether current implementation of declarative  
partitioning has any bottlenecks when there is a lot of partitions. Here  
is what I did [1].  


-- init schema  

\timing on  

CREATE TABLE part_test (pk int not null, k int, v varchar(128)) PARTITION BY RANGE(pk);  

do $$  
declare  
    i integer;  
begin  
    for i in 1 .. 10000  
    loop  
        raise notice 'i = %', i;  
        execute ('CREATE TABLE part_test_' || i ||  
                 ' PARTITION OF part_test FOR VALUES FROM (' ||  
                 (1 + (i-1)*1000) || ') to (' || ( (i * 1000) + 1) || ');'  
                );  
    end loop;  
end $$;  

-- fill tables with some data  

do $$  
declare  
    i integer;  
begin  
    for i in 1 .. 100*1000  
    loop  
        raise notice 'i = %', i;  
        execute ('insert into part_test values ( ceil(random()*(10000-1)*1000), ceil(random()*10000*1000), '''' || ceil(random()*10000*1000) );');  
    end loop;  
end $$;  


Then:  


# 2580 is some pk that exists  
echo 'select * from part_test where pk = 2580;' > t.sql  
pgbench -j 7 -c 7 -f t.sql -P 1 -T 300 eax  


`perf top` showed two bottlenecks [2]. A stacktrace for the first one  
looks like this [3]:  


0x00000000007a42e2 in get_tabstat_entry (rel_id=25696, isshared=0 '\000') at pgstat.c:1689  
1689                if (entry->t_id == rel_id)  
#0  0x00000000007a42e2 in get_tabstat_entry (rel_id=25696, isshared=0 '\000') at pgstat.c:1689  
#1  0x00000000007a4275 in pgstat_initstats (rel=0x7f4af3fd41f8) at pgstat.c:1666  
#2  0x00000000004c7090 in relation_open (relationId=25696, lockmode=0) at heapam.c:1137  
#3  0x00000000004c72c9 in heap_open (relationId=25696, lockmode=0) at heapam.c:1291  
(skipped)  


And here is a stacktrace for the second bottleneck [4]:  


0x0000000000584fb1 in find_all_inheritors (parentrelId=16393, lockmode=1, numparents=0x0) at pg_inherits.c:199  
199             forboth(lo, rels_list, li, rel_numparents)  
#0  0x0000000000584fb1 in find_all_inheritors (parentrelId=16393, lockmode=1, numparents=0x0) at pg_inherits.c:199  
#1  0x000000000077fc9f in expand_inherited_rtentry (root=0x1badcb8, rte=0x1b630b8, rti=1) at prepunion.c:1408  
#2  0x000000000077fb67 in expand_inherited_tables (root=0x1badcb8) at prepunion.c:1335  
#3  0x0000000000767526 in subquery_planner (glob=0x1b63cc0, parse=0x1b62fa0, parent_root=0x0, hasRecursion=0 '\000', tuple_fraction=0) at planner.c:568  
(skipped)  


The first one could be easily fixed by introducing a hash table  
(rel_id -> pgStatList entry). Perhaps hash table should be used only  
after some threshold. Unless there are any objections I will send a  
corresponding patch shortly.  

I haven't explored the second bottleneck closely yet but at first glance  
it doesn't look much more complicated.  

Please don't hesitate to share your thoughts regarding this matter.  

[1] http://afiskon.ru/s/e3/5f47af9102_benchmark.txt  
[2] http://afiskon.ru/s/00/2008c4ae66_temp.png  
[3] http://afiskon.ru/s/23/650f0afc89_stack.txt  
[4] http://afiskon.ru/s/03/a7e685a4db_stack2.txt  

--   
Best regards,  
Aleksander Alekseev  

For the discussion of this patch, see the mailing list thread; the URL is at the end of this article.
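To make the first bottleneck more concrete, here is a minimal C sketch of the idea described in the message: replace a linear scan over per-table statistics entries with a hash map keyed by rel_id. This is not the code from pgstat.c or from the patch. `StatEntry`, `StatTable`, `lookup_linear`, `get_entry_hashed`, `hash_oid` and the two size constants are hypothetical names invented for illustration, and the open-addressing scheme is just one simple way to build such a map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-table statistics entry (not PostgreSQL's struct). */
typedef struct StatEntry {
    uint32_t rel_id;    /* table OID */
    long     n_scans;   /* some counter kept per table */
} StatEntry;

#define MAX_ENTRIES 16384   /* assumed capacity, enough for 10000 partitions */
#define MAP_SLOTS   32768   /* power of two and larger than MAX_ENTRIES */

/* A StatTable must be zero-initialized before use (slot value 0 means "empty"). */
typedef struct StatTable {
    StatEntry entries[MAX_ENTRIES];
    size_t    n_entries;
    int32_t   slots[MAP_SLOTS];   /* index into entries plus one, or 0 if empty */
} StatTable;

/* Linear scan: every lookup walks the entries until rel_id matches, O(n). */
StatEntry *lookup_linear(StatTable *t, uint32_t rel_id)
{
    for (size_t i = 0; i < t->n_entries; i++)
        if (t->entries[i].rel_id == rel_id)
            return &t->entries[i];
    return NULL;
}

/* Simple integer mixer so that consecutive OIDs spread over the slots. */
uint32_t hash_oid(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x45d9f3bU;
    x ^= x >> 16;
    return x;
}

/* Find the entry for rel_id, creating it if absent; expected O(1) per call. */
StatEntry *get_entry_hashed(StatTable *t, uint32_t rel_id)
{
    uint32_t slot = hash_oid(rel_id) & (MAP_SLOTS - 1);

    /* Linear probing; terminates because MAP_SLOTS > MAX_ENTRIES. */
    while (t->slots[slot] != 0) {
        StatEntry *e = &t->entries[t->slots[slot] - 1];
        if (e->rel_id == rel_id)
            return e;
        slot = (slot + 1) & (MAP_SLOTS - 1);
    }

    assert(t->n_entries < MAX_ENTRIES);
    StatEntry *e = &t->entries[t->n_entries++];
    e->rel_id = rel_id;
    e->n_scans = 0;
    t->slots[slot] = (int32_t) t->n_entries;   /* store index + 1 */
    return e;
}
```

With 10000 partitions, each call to `lookup_linear` may compare against thousands of entries, while `get_entry_hashed` normally touches only a few slots. For a small number of tables the linear scan is likely cheap enough, which is why the message suggests switching to a hash table only above some threshold.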

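The PostgreSQL community works very rigorously: a patch may be discussed on the mailing list for months or even years and revised repeatedly based on feedback, so by the time it is merged into master it is already very mature. This is one reason PostgreSQL is so well known for its stability.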

References

https://commitfest.postgresql.org/13/1058/

https://www.postgresql.org/message-id/flat/20170228142509.GA19777@e733.localdomain#20170228142509.GA19777@e733.localdomain

