GPDB43 Administrator Guide--第九章 管理greenplum系統

panpong發表於2015-09-24

第九章 管理greenplum系統

一、監控資料庫活動與效能

使用工具greenplum command center

二、監控系統狀態

(一)開啟系統告警與通知

可以配置greenplum資料庫系統觸發snmp去告警並email通知系統管理員,當特定的資料庫事件發生時;這些事件包括:

  • All PANIC-level error conditions
  • All FATAL-level error conditions
  • ERROR-level conditions that are "internal errors" (for example, SIGSEGV errors)
  • Database system shutdown and restart
  • Segment failure and recovery
  • Standby master out-of-sync conditions
  • Master host manual shutdown or other software problem (in certain failure scenarios, Greenplum Database cannot send an alert or notification)

1. SNMP配置

a.準備,確認OS是否安裝snmp/usr/sbin/snmpd/etc/snmp/snmpd.conf

b.安裝之後:設定自啟動# /sbin/chkconfig snmpd on

c.測試執行情況

# snmpwalk -v 1 -c community_name localhost .1.3.6.1.2.1.1.1.0

c.設定snmp通知

i)master主機上配置引數,使用gpconfig工具

gp_snmp_community:snmp community名稱

gp_snmp_monitor_address:hostname:port,多個地址用逗號隔開

gp_snmp_use_inform_or_trap:trap inform

例子:

$ gpconfig -c gp_snmp_community -v public --masteronly

$ gpconfig -c gp_snmp_monitor_address -v mdw:162 --masteronly

$ gpconfig -c gp_snmp_use_inform_or_trap -v trap --masteronly

ii)測試snmp通知

# /usr/sbin/snmptrapd -m ALL -Lf ~/filename.log

-Lf:trap寫入日誌檔案

-Le:trap標準輸出

-m all:載入所有可用的MIB(Management Information Bases)

 

2.開啟EMAIL通知

a.開啟$MASTER_DATA_DIRECTORY/postgresql.conf檔案

b.修改EMAIL ALERTS段資訊,例如:

gp_email_smtp_server='smtp.company.com:25'

gp_email_smtp_userid='gpadmin@company.com'

gp_email_smtp_password='mypassword'

gp_email_from='Greenplum Database <gpadmin@company.com>'

gp_email_to='dba@company.com;John Smith <jsmith@company.com>'

也可以使用外網,公用的SMTP伺服器,例如gmail

gp_email_smtp_server='smtp.gmail.com:25'

#gp_email_smtp_userid=''

#gp_email_smtp_password=''

gp_email_from='gpadmin@company.com'

gp_email_to='test_account@gmail.com'

c.儲存並關閉postgresql.conf

d.過載GP資料庫postgresql.conf

$ gpstop -u

3.測試EMAIL通知

$ ping my_email_server

$ psql template1

=# SELECT gp_elog('Test GPDB Email',true); gp_elog

 

(二)檢查系統狀態

$ gpstate       #segment例項簡潔資訊

$ gpstate -s  #GP詳細資訊

$ gpstate -m #mirror資訊

$ gpstate -c  #primary與mirror對應資訊

$ gpstate -f  #standby master狀態

 

(三)檢查磁碟空間

·         查詢磁碟空閒空間

=# SELECT * FROM gp_toolkit.gp_disk_free ORDER BY dfsegment;

·         查詢資料庫佔用空間

=> SELECT * FROM gp_toolkit.gp_size_of_database ORDER BY soddatname;

·         查詢表佔用空間

=> SELECT relname AS name, sotdsize AS size, sotdtoastsize AS toast, sotdadditionalsize AS other

FROM gp_size_of_table_disk as sotd, pg_class

WHERE sotd.sotdoid=pg_class.oid ORDER BY relname;

·         查詢索引佔用空間

=> SELECT soisize, relname as indexname

FROM pg_class, gp_size_of_index

WHERE pg_class.oid=gp_size_of_index.soioid

AND pg_class.relkind='i';

 

(四)檢查資料分佈

1.查詢表的分佈鍵

=# \d+ sales

Table "retail.sales"

Column | Type | Modifiers | Description

-------------+--------------+-----------+-------------

sale_id    | integer       |               |

amt          | float            |               |

date         | date            |               |

Has OIDs: no

Distributed by: (sale_id)  

2.查詢資料的分佈情況

=# SELECT gp_segment_id, count(*)  FROM table_name GROUP BY gp_segment_id;

3.檢查查詢程式的傾斜

=# SELECT gp_segment_id, count(*) FROM table_name

WHERE column='value' GROUP BY gp_segment_id;

·         極度傾斜警告

hash連線查詢時如果傾斜嚴重的話,則警告如下:

Extreme skew in the innerside of Hashjoin

參考如下步驟避免傾斜:

a.保證所有事實表已經分析過

b.驗證所有需要用到的臨時表也被分析

c.檢視EXPLAIN ANALYZE的查詢計劃,並查詢以下內容:

?如果多列過濾器掃描與估算值相比產生了更多的行,則設定在gp_selectivity_damping_factor伺服器配置引數為2或更高,並重新測試查詢。

?如果連線一個相對較小(小於5000行)的事實表時發生傾斜,則設定gp_segments_for_planner伺服器配置引數為1,並重新測試查詢。

d. 檢查查詢的過濾器是否應用在基表的分佈鍵上,如果過濾器與分佈鍵是相同的,則考慮不同的分佈鍵重新分佈基表資料

e.檢查連線鍵的基數。如果他們有低基數,嘗試用不同的連線列重寫查詢或附加過濾器,以減少行數。這些變化可能更改查詢語義。

(五)檢視資料庫物件的後設資料資訊

·         查詢最後一次執行的操作

=> SELECT schemaname as schema, objname as table, usename as role, actionname as action, subtype as type, statime as time 

FROM pg_stat_operations  WHERE objname='cust'; 

schema | table | role | action | type | time

--------+-------+------+---------+-------+--------------------------

sales | cust | main | CREATE | TABLE | 2010-02-09 18:10:07.867977-08

sales | cust | main | VACUUM | | 2010-02-10 13:32:39.068219-08

sales | cust | main | ANALYZE | | 2010-02-25 16:07:01.157168-08

(3 rows)

·         查詢物件定義:

=> \d+ mytable

(六)檢視session記憶體使用情況

·         建立session_level_memory_consumption檢視

$ psql –d testdb –f $GPHOME/share/postgresql/contrib/gp_session_state.sql

 

·         session_level_memory_consumption檢視介紹

is_runaway列指標是否是runaway session,而控制什麼時候runaway的引數是runaway_detector_activation_percent

column

type

column

datname

name

Name of the database that the session is connected to.

sess_id

integer

Session ID.

usename

name

Name of the session user.

current_query

text

Current SQL query that the session is running.

segid

integer

Current SQL query that the session is running.

vmem_mb

integer

Total vmem memory usage for the session in MB.

is_runaway

boolean

Session is marked as runaway on the segment.

qe_count

integer

Number of query processes for the session.

active_qe_count

integer

Number of active query processes for the session.

dirty_qe_count

integer

Number of query processes that have not yet released their memory. The value is -1 for sessions that are not running.

runaway_vmem_mb

integer

Amount of vmem memory that the session was consuming when it was marked as a runaway session.

runaway_command_cnt

integer

Command count for the session when it was marked as a runaway session.

 

(七)檢視query workfile使用情況

檢視資訊用於指定引數:gp_workfile_limit_per_querygp_workfile_limit_per_segment.

schema gp_toolkit中相關檢視:

gp_workfile_entries:

gp_workfile_usage_per_query:

gp_workfile_usage_per_segment:

 

三、檢視資料庫伺服器日誌檔案

日常日誌檔案在pg_log目錄,master和每一個segment主機中都有;日誌格式通常是csv

查詢日誌檔案中內容:

$ gplogfilter -n 3

$ gpssh -f seg_host_file

=> source /usr/local/greenplum-db/greenplum_path.sh

=> gplogfilter -n 3 /gpdata/gp*/pg_log/gpdb*.log

 

四、使用gp_toolkit

=> ALTER ROLE myrole SET search_path TO myschema,gp_toolkit;


五、greenplum資料庫的SNMP OIDERROR code

(一).Greenplum Database SNMP OIDs 

This is the Greenplum Database OID hierarchy structure:

    • iso(1) 
    • identified-organization(3) 
    • dod(6) 
    • internet(1) 
    • private(4) 
    • enterprises(1) 
    • gpdbMIB(31327) 
    • gpdbObjects(1) 
    • gpdbAlertMsg(1) 
  • gpdbAlertMsg 

1.3.6.1.4.1.31327.1.1: STRING: alert message text

  • gpdbAlertSeverity 
    • 1.3.6.1.4.1.31327.1.2: INTEGER: severity level 

                gpdbAlertSeverity can have one of the following values: 

 

    • gpdbSevUnknown(0) 
    • gpdbSevOk(1) 
    • gpdbSevWarning(2) 
    • gpdbSevError(3) 
    • gpdbSevFatal(4) 
    • gpdbSevPanic(5) 
    • gpdbSevSystemDegraded(6) 
    • gpdbSevSystemDown(7) 
  • gpdbAlertSqlstate 

1.3.6.1.4.1.31327.1.3: STRING: SQL standard error codes

        For a list of codes, see SQL Standard Error Codes.

  • gpdbAlertDetail 

1.3.6.1.4.1.31327.1.4: STRING: detailed alert message text

  • gpdbAlertSqlStmt 

1.3.6.1.4.1.31327.1.5: STRING: SQL statement generating this alert if  applicable 

  • gpdbAlertSystemName 

1.3.6.1.4.1.31327.1.6: STRING: hostname


(二)SQL標準ERROR code

見下表:

 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/16976507/viewspace-1807806/,如需轉載,請註明出處,否則將追究法律責任。

相關文章