PostgreSQL實時健康監控大屏-低頻指標
標籤
PostgreSQL , 大屏指標 , qps , long query , locks , active , idle in transaction , long idle in transaction , 2PC
背景
最關鍵的一些資料庫健康指標,趨勢監測。
1 資料年齡
年齡不能超過20億(大概值),建議當達到15億時,應儘快安排freeze。
postgres=# select datname,age(datfrozenxid) from pg_database order by 2 desc;
datname | age
-----------+-----------
postgres | 172127964
template1 | 172127964
template0 | 172127964
test | 172127964
(4 rows)
2 年齡大於N的物件佔用空間數
年齡大於12億的資料庫,佔用的空間數
postgres=# select pg_size_pretty(sum(pg_database_size(oid))) from pg_database where age(datfrozenxid)>1200000000;
pg_size_pretty
----------------
(1 row)
3 freeze 風暴預測
《PostgreSQL Freeze 風暴預測續 – 珍藏級SQL》
4 表膨脹
https://raw.githubusercontent.com/digoal/pgsql_admin_script/master/generate_report.sh
for db in `psql --pset=pager=off -t -A -q -c `select datname from pg_database where datname not in ($$template0$$, $$template1$$)``
do
psql -d $db --pset=pager=off -q -x -c `SELECT
current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta,
ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,
CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages,
CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,
CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint || $$ bytes$$ END AS wastedsize,
iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta,
ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat,
CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages,
CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes,
CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint || $$ bytes$$ END AS wastedisize,
CASE WHEN relpages < otta THEN
CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END
ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint)
ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END
END AS totalwastedbytes
FROM (
SELECT
nn.nspname AS schemaname,
cc.relname AS tablename,
COALESCE(cc.reltuples,0) AS reltuples,
COALESCE(cc.relpages,0) AS relpages,
COALESCE(bs,0) AS bs,
COALESCE(CEIL((cc.reltuples*((datahdr+ma-
(CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta,
COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols
FROM
pg_class cc
JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$
LEFT JOIN
(
SELECT
ma,bs,foo.nspname,foo.relname,
(datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
(maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
FROM (
SELECT
ns.nspname, tbl.relname, hdr, ma, bs,
SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth,
MAX(coalesce(null_frac,0)) AS maxfracsum,
hdr+(
SELECT 1+count(*)/8
FROM pg_stats s2
WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname
) AS nullhdr
FROM pg_attribute att
JOIN pg_class tbl ON att.attrelid = tbl.oid
JOIN pg_namespace ns ON ns.oid = tbl.relnamespace
LEFT JOIN pg_stats s ON s.schemaname=ns.nspname
AND s.tablename = tbl.relname
AND s.inherited=false
AND s.attname=att.attname,
(
SELECT
(SELECT current_setting($$block_size$$)::numeric) AS bs,
CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$)
IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr,
CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma
FROM (SELECT version() AS v) AS foo
) AS constants
WHERE att.attnum > 0 AND tbl.relkind=$$r$$
GROUP BY 1,2,3,4,5
) AS foo
) AS rs
ON cc.relname = rs.relname AND nn.nspname = rs.nspname
LEFT JOIN pg_index i ON indrelid = cc.oid
LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
) AS sml order by wastedbytes desc limit 5`
done
5 索引膨脹
https://raw.githubusercontent.com/digoal/pgsql_admin_script/master/generate_report.sh
for db in `psql --pset=pager=off -t -A -q -c `select datname from pg_database where datname not in ($$template0$$, $$template1$$)``
do
psql -d $db --pset=pager=off -q -x -c `SELECT
current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta,
ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,
CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages,
CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,
CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint || $$ bytes$$ END AS wastedsize,
iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta,
ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat,
CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages,
CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes,
CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint || $$ bytes$$ END AS wastedisize,
CASE WHEN relpages < otta THEN
CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END
ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint)
ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END
END AS totalwastedbytes
FROM (
SELECT
nn.nspname AS schemaname,
cc.relname AS tablename,
COALESCE(cc.reltuples,0) AS reltuples,
COALESCE(cc.relpages,0) AS relpages,
COALESCE(bs,0) AS bs,
COALESCE(CEIL((cc.reltuples*((datahdr+ma-
(CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta,
COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols
FROM
pg_class cc
JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$
LEFT JOIN
(
SELECT
ma,bs,foo.nspname,foo.relname,
(datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
(maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
FROM (
SELECT
ns.nspname, tbl.relname, hdr, ma, bs,
SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth,
MAX(coalesce(null_frac,0)) AS maxfracsum,
hdr+(
SELECT 1+count(*)/8
FROM pg_stats s2
WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname
) AS nullhdr
FROM pg_attribute att
JOIN pg_class tbl ON att.attrelid = tbl.oid
JOIN pg_namespace ns ON ns.oid = tbl.relnamespace
LEFT JOIN pg_stats s ON s.schemaname=ns.nspname
AND s.tablename = tbl.relname
AND s.inherited=false
AND s.attname=att.attname,
(
SELECT
(SELECT current_setting($$block_size$$)::numeric) AS bs,
CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$)
IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr,
CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma
FROM (SELECT version() AS v) AS foo
) AS constants
WHERE att.attnum > 0 AND tbl.relkind=$$r$$
GROUP BY 1,2,3,4,5
) AS foo
) AS rs
ON cc.relname = rs.relname AND nn.nspname = rs.nspname
LEFT JOIN pg_index i ON indrelid = cc.oid
LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
) AS sml order by wastedibytes desc limit 5`
done
6 TOP SQL
《PostgreSQL 如何查詢TOP SQL (例如IO消耗最高的SQL) – 珍藏級》
7 未使用索引
for db in `psql --pset=pager=off -t -A -q -c `select datname from pg_database where datname not in ($$template0$$, $$template1$$)``
do
psql -d $db --pset=pager=off -q -x -c `
select current_database(),* from pg_stat_all_indexes where idx_scan=0 or idx_tup_read=0 or idx_tup_fetch=0;
`
done
8 未使用(查詢)表
for db in `psql --pset=pager=off -t -A -q -c `select datname from pg_database where datname not in ($$template0$$, $$template1$$)``
do
psql -d $db --pset=pager=off -q -x -c `
select current_database(),* from pg_stat_all_tables where seq_scan=0 and idx_scan=0;
`
done
9 熱表
for db in `psql --pset=pager=off -t -A -q -c `select datname from pg_database where datname not in ($$template0$$, $$template1$$)``
do
psql -d $db --pset=pager=off -q -x -c `
select current_database(),* from pg_stat_all_tables order by seq_scan+idx_scan desc limit 10;
`
done
10 冷表
針對性刪除表,或者使用OSS冷儲存。
for db in `psql --pset=pager=off -t -A -q -c `select datname from pg_database where datname not in ($$template0$$, $$template1$$)``
do
psql -d $db --pset=pager=off -q -x -c `
select current_database(),* from pg_stat_all_tables order by seq_scan+idx_scan limit 10;
`
done
11 熱索引
for db in `psql --pset=pager=off -t -A -q -c `select datname from pg_database where datname not in ($$template0$$, $$template1$$)``
do
psql -d $db --pset=pager=off -q -x -c `
select current_database(),* from pg_stat_all_indexes order by idx_scan desc limit 10;
`
done
12 冷索引
針對性刪除索引。
for db in `psql --pset=pager=off -t -A -q -c `select datname from pg_database where datname not in ($$template0$$, $$template1$$)``
do
psql -d $db --pset=pager=off -q -x -c `
select current_database(),* from pg_stat_all_indexes order by idx_scan limit 10;
`
done
13 全表掃描次數TOP物件
針對性建立索引。
for db in `psql --pset=pager=off -t -A -q -c `select datname from pg_database where datname not in ($$template0$$, $$template1$$)``
do
psql -d $db --pset=pager=off -q -x -c `
select current_database(),* from pg_stat_all_tables order by seq_scan desc limit 10;
`
done
14 全表掃描返回記錄數TOP物件
針對性建立索引。
for db in `psql --pset=pager=off -t -A -q -c `select datname from pg_database where datname not in ($$template0$$, $$template1$$)``
do
psql -d $db --pset=pager=off -q -x -c `
select current_database(),* from pg_stat_all_tables order by seq_tup_read desc limit 10;
`
done
15 監控統計資訊不準的物件(表、物化檢視)
背景知識
1、建立測試表
postgres=# create table a(id int);
CREATE TABLE
2、禁止收集統計資訊
postgres=# alter table a set (autovacuum_enabled =off);
ALTER TABLE
3、寫入大量資料
postgres=# insert into a select generate_series(1,10000000);
INSERT 0 10000000
4、檢視統計資訊項,目前佔用多少空間(pages)
postgres=# select relpages from pg_class where relname=`a`;
relpages
----------
0
(1 row)
5、檢視真實空間佔用
postgres=# select pg_relation_size(`a`);
pg_relation_size
------------------
362479616
(1 row)
6、真實空間佔用,轉換為PAGES
postgres=# select pg_relation_size(`a`)/current_setting(`block_size`)::float8;
?column?
----------
44248
(1 row)
7、收集統計資訊,檢視統計資訊項,目前佔用多少空間(pages)
postgres=# analyze a;
ANALYZE
postgres=# select relpages from pg_class where relname=`a`;
relpages
----------
44248
(1 row)
8、根據以上原理,可以設計評估統計資訊不準確的表 SQL如下:
select oid::regclass as table, relpages, pg_relation_size(oid)/current_setting(`block_size`)::float8 as real_pages from pg_class
where relkind in (`r`, `m`) -- 表和物化檢視
and pg_relation_size(oid) > 1048576 -- 大於1MB
and (pg_relation_size(oid)/current_setting(`block_size`)::float8 - relpages)/(pg_relation_size(oid)/current_setting(`block_size`)::float8) > 0.2; -- 大於 20% 偏差
table | relpages | real_pages
-------+----------+------------
a | 0 | 44248
(1 row)
核心增加功能
1、讀寫磁碟吞吐快照區間統計,區分索引,表,垃圾回收,FREEZE,AUTOANALYZE。分類統計。
2、鎖等待時長快照區間統計,區分鎖粒度,下鑽到物件。
注意
stat的資訊在使用pg_stat_reset()後會清零。請注意。
參考
https://raw.githubusercontent.com/digoal/pgsql_admin_script/master/generate_report.sh
《PostgreSQL Freeze 風暴預測續 – 珍藏級SQL》
相關文章
- PostgreSQL實時健康監控大屏-高頻指標(伺服器)SQL指標伺服器
- 微服務:指標和健康監控微服務指標
- 輕鬆搞定實時分析及監控大屏
- mongodb 監控指標MongoDB指標
- EMQ 監控指標MQ指標
- Orabbix監控指標指標
- Java程式監控指標Java指標
- 監控innodb status指標指標
- mysql效能監控指標MySql指標
- 淺談Orabbix監控指標指標
- 通過micrometer實時監控執行緒池的各項指標執行緒指標
- K8S Canal基於Prometheus進行實時指標監控K8SPrometheus指標
- PostgreSQL業務資料質量實時監控實踐SQL
- 系統監控&JVM監控指標資料查詢JVM指標
- OpenTelemetry 實戰:從零實現應用指標監控指標
- 運維監控指標彙總運維指標
- MYSQL和SQLServer效能監控指標MySqlServer指標
- 如何使用nload實時監控網路頻寬
- 全球疫情實時監控——約翰斯·霍普金斯大學資料大屏實現方案
- 實戰| 配置DataDog監控Apache Hudi應用指標Apache指標
- C 指標有害健康指標
- beta版 tomcat 應用監控指標Tomcat指標
- 基於 prometheus 的微服務指標監控Prometheus微服務指標
- 04、MySQL Case-MySQL常用監控指標MySql指標
- 分散式架構的監控與指標分散式架構指標
- nagios監控例項 -- PostgreSQL監控iOSSQL
- 程式實時監控
- 徒手教你製作運維監控大屏運維
- 簡單4步,利用Prometheus Operator實現自定義指標監控Prometheus指標
- 如何高效利用 Grafana 監控分析 TiDB 指標GrafanaTiDB指標
- MySQL 效能監控4大指標——第一部分MySql指標
- MySQL效能監控4大指標——第一部分MySql指標
- 資料庫效能大揭秘:玩轉MySQL監控指標狀態變數資料庫MySql指標變數
- 實時檔案監控
- iOS實時卡頓監控iOS
- Spring Boot 揭祕與實戰(九) 應用監控篇 - HTTP 健康監控Spring BootHTTP
- 使用Prometheus監控Linux系統各項指標PrometheusLinux指標
- 【系統設計】指標監控和告警系統指標