Greenplum: gpinitsystem database initialization errors and fixes

Posted by 你好我是李白 on 2020-06-19

Resolving initialization errors

Error: Unable to resolve mdw on this host

Symptom

[gpadmin@sdw1 ~]$ gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem
20200618:15:33:10:010535 gpinitsystem:sdw1:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20200618:15:33:10:010535 gpinitsystem:sdw1:gpadmin-[INFO]:-Reading Greenplum configuration file gpconfigs/gpinitsystem_config
20200618:15:33:10:010535 gpinitsystem:sdw1:gpadmin-[INFO]:-Locale has not been set in gpconfigs/gpinitsystem_config, will set to default value
20200618:15:33:10:010535 gpinitsystem:sdw1:gpadmin-[INFO]:-Locale set to en_US.utf8
20200618:15:33:10:010535 gpinitsystem:sdw1:gpadmin-[WARN]:-Master hostname mdw does not match hostname output
20200618:15:33:10:010535 gpinitsystem:sdw1:gpadmin-[INFO]:-Checking to see if mdw can be resolved on this host
20200618:15:33:11:010535 gpinitsystem:sdw1:gpadmin-[FATAL]:-Master hostname in configuration file is mdw
20200618:15:33:11:010535 gpinitsystem:sdw1:gpadmin-[FATAL]:-Operating system command returns sdw1
20200618:15:33:11:010535 gpinitsystem:sdw1:gpadmin-[FATAL]:-Unable to resolve mdw on this host
20200618:15:33:11:010535 gpinitsystem:sdw1:gpadmin-[FATAL]:-Master hostname in gpinitsystem configuration file must be mdw Script Exiting!


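Cause

gpinitsystem was run on sdw1, a segment host. The hostfile passed with -h must not contain the master node's hostname, so the master's hostname cannot be resolved through the hostfile. As the log shows, gpinitsystem compares the master hostname in the config file (mdw) with the local hostname (sdw1) and exits when they differ. Greenplum therefore has to be initialized from the master node.

Fix

Run gpinitsystem on the mdw node.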
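
Before launching gpinitsystem, it can help to confirm that the current machine is the one named in MASTER_HOSTNAME. The following is a minimal sketch, not part of Greenplum: the function names master_hostname_from_config and is_master_host are hypothetical, and it assumes the config file contains a plain MASTER_HOSTNAME=value line.

```shell
# Print the value of MASTER_HOSTNAME from a gpinitsystem config file.
# Assumes a plain KEY=value line; quotes and trailing comments are dropped.
master_hostname_from_config() {
    sed -n 's/^[[:space:]]*MASTER_HOSTNAME=\([^[:space:]#]*\).*/\1/p' "$1" |
        tail -n 1 | tr -d "\"'"
}

# Succeed only when this machine's hostname equals MASTER_HOSTNAME.
# The optional second argument (hypothetical, for testing) overrides `hostname`.
is_master_host() {
    expected=$(master_hostname_from_config "$1")
    actual=${2:-$(hostname)}
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Usage (config path taken from the transcript above):
#   is_master_host gpconfigs/gpinitsystem_config || echo "run gpinitsystem on the master host"
```

On sdw1 this check would fail, because the config names mdw; on mdw it should pass.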

Error: Unknown host sdw1: ping: sdw1

Symptom

[gpadmin@mdw ~]$ gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem
20200618:15:35:11:007212 gpinitsystem:mdw:gpadmin-[INFO]:-Checking configuration parameters, please wait...
 
...
 
> y
20200618:15:36:27:007212 gpinitsystem:mdw:gpadmin-[INFO]:-Building the Master instance database, please wait...
20200618:15:36:33:007212 gpinitsystem:mdw:gpadmin-[INFO]:-Starting the Master in admin mode
20200618:15:36:34:007212 gpinitsystem:mdw:gpadmin-[FATAL]:-Unknown host sdw1: ping: sdw1: Name or service not known
ping: sdw1: Name or service not known Script Exiting!
20200618:15:36:34:007212 gpinitsystem:mdw:gpadmin-[WARN]:-Script has left Greenplum Database in an incomplete state
20200618:15:36:34:007212 gpinitsystem:mdw:gpadmin-[WARN]:-Run command bash /home/gpadmin/gpAdminLogs/backout_gpinitsystem_gpadmin_20200618_153511 to remove these changes
20200618:15:36:34:007212 gpinitsystem:mdw:gpadmin-[INFO]:-Start Function BACKOUT_COMMAND
20200618:15:36:34:007212 gpinitsystem:mdw:gpadmin-[INFO]:-End Function BACKOUT_COMMAND


Cause

The real hostnames of both segment machines must be mapped in /etc/hosts. Here the master could not resolve sdw1.

Fix

Add the real hostnames of the segment hosts to /etc/hosts.
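
A quick way to see which segment hostnames still lack an /etc/hosts entry is to compare the hostfile against it. This is a sketch under assumptions: missing_from_hosts is a hypothetical helper, it expects one hostname per line in the hostfile, and it only inspects the hosts file, so a name that resolves through DNS alone would still be reported as missing.

```shell
# List hostnames from a hostfile that have no entry in a hosts file.
# $1: file with one hostname per line (e.g. hostfile_gpinitsystem)
# $2: hosts file to check, defaults to /etc/hosts
missing_from_hosts() {
    hosts_file=${2:-/etc/hosts}
    while read -r name; do
        [ -z "$name" ] && continue
        # Strip comments, then look for the name in any column after the address.
        if ! sed 's/#.*//' "$hosts_file" |
            awk -v h="$name" '{ for (i = 2; i <= NF; i++) if ($i == h) found = 1 } END { exit !found }'
        then
            echo "$name"
        fi
    done < "$1"
}

# Usage (hostfile path taken from the transcript above):
#   missing_from_hosts gpconfigs/hostfile_gpinitsystem
```

Any name it prints needs a line of the form "address hostname" in /etc/hosts; the addresses depend on your network and are not given in this article.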

Error: No segment started for content: 0

Symptom

Error output from the initialization command:

20200618:15:58:27:014010 gpstart:mdw:gpadmin-[INFO]:-Starting gpstart with args: -a -l /home/gpadmin/gpAdminLogs -d /data/master/gpseg-1
20200618:15:58:27:014010 gpstart:mdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20200618:15:58:27:014010 gpstart:mdw:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 6.8.0 build commit:d9b16e3438fc6e01e6083cd82cf76ba99c1b50b5'
20200618:15:58:27:014010 gpstart:mdw:gpadmin-[INFO]:-Greenplum Catalog Version: '301908232'
20200618:15:58:27:014010 gpstart:mdw:gpadmin-[INFO]:-Starting Master instance in admin mode
20200618:15:58:28:014010 gpstart:mdw:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20200618:15:58:28:014010 gpstart:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20200618:15:58:28:014010 gpstart:mdw:gpadmin-[INFO]:-Setting new master era
20200618:15:58:28:014010 gpstart:mdw:gpadmin-[INFO]:-Master Started...
20200618:15:58:28:014010 gpstart:mdw:gpadmin-[INFO]:-Shutting down master
20200618:15:58:28:014010 gpstart:mdw:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-Process results...
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[ERROR]:-No segment started for content: 0.
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-dumping success segments: ['sdw1:/data2/primary/gpseg1:content=1:dbid=3:role=p:preferred_role=p:mode=n:status=u', 'sdw2:/data2/primary/gpseg3:content=3:dbid=5:role=p:preferred_role=p:mode=n:status=u']
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-DBID:2  FAILED  host:'sdw1' datadir:'/data1/primary/gpseg0' with reason:'PG_CTL failed.'
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-DBID:4  FAILED  host:'sdw2' datadir:'/data1/primary/gpseg2' with reason:'PG_CTL failed.'
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
 
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-   Successful segment starts                                            = 2
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[WARNING]:-Failed segment starts                                                = 2   <<<<<<<<
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-   Skipped segment starts (segments are marked down in configuration)   = 0
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-Successfully started 2 of 4 segment instances <<<<<<<<
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[WARNING]:-Segment instance startup failures reported
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[WARNING]:-Failed start 2 of 4 segment instances <<<<<<<<
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[WARNING]:-Review /home/gpadmin/gpAdminLogs/gpstart_20200618.log
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-----------------------------------------------------
20200618:15:58:34:014010 gpstart:mdw:gpadmin-[INFO]:-Commencing parallel segment instance shutdown, please wait...
20200618:15:58:36:014010 gpstart:mdw:gpadmin-[ERROR]:-gpstart error: Do not have enough valid segments to start the array.

Error messages in gpinitsystem_20200618.log:

20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:
20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:-Failed to start Greenplum instance; review gpstart output to
20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:- determine why gpstart failed and reinitialize cluster after resolving
20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:- issues.  Not all initialization tasks have completed so the cluster
20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:- should not be used.
20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:-gpinitsystem will now try to stop the cluster
20200618:16:21:05:015525 gpinitsystem:mdw:gpadmin-[WARN]:
20200618:16:21:06:015525 gpinitsystem:mdw:gpadmin-[INFO]:-Start Function ERROR_EXIT
20200618:16:21:06:015525 gpinitsystem:mdw:gpadmin-[WARN]:-Failed to stop new Greenplum instance Script Exiting!


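Diagnosis

gpstart -m -d /data/master/gpseg-1   # start only the master
gpstop -a -M fast                    # -a suppresses the y/n confirmation prompt; -M smart/fast/immediate roughly corresponds to Oracle shutdown normal/immediate/abort
gpstart -a -v                        # -v prints a verbose startup log


Scroll up in the verbose output to find the start command that failed on the node: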

  stderr=''
20200618:16:36:06:016487 gpsegstart.py_sdw2:gpadmin:sdw2:gpadmin-[DEBUG]:-[worker1] finished cmd: Starting seg at dir /data1/primary/gpseg2 
cmdStr='env GPSESSID=0000000000 GPERA=8a0d21cca0b8bbb8_200618163604 
$GPHOME/bin/pg_ctl -D /data1/primary/gpseg2 -l /data1/primary/gpseg2/pg_log/startup.log -w -t 600 -o " -p 6000 " start 2>&1'  
had result: cmd had rc=1 completed=True halted=False
  stdout='waiting for server to start.... stopped waiting
pg_ctl: could not start server


 

On that node, open the startup log named in the failed command, /data1/primary/gpseg2/pg_log/startup.log:

2020-06-18 16:36:05.835068 CST,,,p16504,th646150272,,,,0,,,seg2,,,,,"LOG","00000","registering background worker ""sweeper process""",,,,,,,,"RegisterBackgroundWorker","bgworker.c",774,
2020-06-18 16:36:05.835486 CST,,,p16504,th646150272,,,,0,,,seg2,,,,,"LOG","XX000","could not bind IPv4 socket: Address already in use",,"Is another postmaster already running on port 6000? If not, wait a few seconds and retry.",,,,,,"StreamServerPort","pqcomm.c",503,
2020-06-18 16:36:05.835741 CST,,,p16504,th646150272,,,,0,,,seg2,,,,,"LOG","XX000","could not bind IPv6 socket: Address already in use",,"Is another postmaster already running on port 6000? If not, wait a few seconds and retry.",,,,,,"StreamServerPort","pqcomm.c",503,
2020-06-18 16:36:05.836023 CST,,,p16504,th646150272,,,,0,,,seg2,,,,,"WARNING","01000","could not create listen socket for ""*""",,,,,,,,"PostmasterMain","postmaster.c",1202,
2020-06-18 16:36:05.836162 CST,,,p16504,th646150272,,,,0,,,seg2,,,,,"FATAL","XX000","could not create any TCP/IP sockets",,,,,,,,"PostmasterMain","postmaster.c",1207,1    0xbe84ec postgres errstart (elog.c:557)
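
Rather than reading the whole startup log, the relevant lines can be filtered out. A minimal sketch, assuming the log format shown in the excerpt above; startup_log_problems is a hypothetical helper name.

```shell
# Print FATAL entries and "Address already in use" lines from a segment startup log.
startup_log_problems() {
    grep -E '"FATAL"|Address already in use' "$1"
}

# Usage (log path taken from the failed pg_ctl command above):
#   startup_log_problems /data1/primary/gpseg2/pg_log/startup.log
```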


Checking shows that the graphical interface (the X server) is occupying port 6000, so the segment fails to start.

[root@sdw2 ~]# netstat -anp|grep 6000
tcp        0      0 0.0.0.0:6000            0.0.0.0:*               LISTEN      3969/X              
tcp6       0      0 :::6000                 :::*                    LISTEN      3969/X              
[root@sdw2 ~]#
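
The same check can be made for every port the primaries would use before initializing. This is a sketch under assumptions: port_conflicts is a hypothetical helper, and it assumes that primary segment ports on a host start at PORT_BASE and increase by one per instance, which matches port 6000 for gpseg2 in the log above. The usage line assumes `ss` from iproute2 is installed.

```shell
# Read listening port numbers on stdin (one per line) and print those that
# fall in the range BASE .. BASE+COUNT-1.
# $1: first port to check (e.g. the intended PORT_BASE)
# $2: number of consecutive ports (e.g. primary segments per host)
port_conflicts() {
    awk -v base="$1" -v count="$2" '$1 >= base && $1 < base + count { print $1 }' | sort -n -u
}

# Usage on each segment host, checking 2 ports starting at 6000:
#   ss -ltn | awk 'NR > 1 { n = split($4, a, ":"); print a[n] }' | port_conflicts 6000 2
```

An empty result means none of the listed ports is currently in the checked range.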


Fix

Edit the gpinitsystem configuration file and move the primary and mirror segment port bases away from 6000: PORT_BASE=6500, MIRROR_PORT_BASE=7500
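
The edit can be made by hand or scripted. The following is a minimal sketch: set_conf_value is a hypothetical helper, and it assumes simple KEY=value lines and values that contain no `|` or `&` characters.

```shell
# Set KEY=VALUE in a config file: replace an existing uncommented KEY= line,
# or append one if none exists. The new content is built in a temporary file.
set_conf_value() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    if grep -q "^[[:space:]]*$key=" "$file"; then
        sed "s|^[[:space:]]*$key=.*|$key=$value|" "$file" > "$tmp"
    else
        cat "$file" > "$tmp"
        echo "$key=$value" >> "$tmp"
    fi
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Usage (config path taken from the first transcript in this article):
#   set_conf_value gpconfigs/gpinitsystem_config PORT_BASE 6500
#   set_conf_value gpconfigs/gpinitsystem_config MIRROR_PORT_BASE 7500
```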

Then rerun the initialization.

Error: Inconsistency between number of multi-home hostnames and number of segments per host

Symptom

[gpadmin@mdw gpconfigs]$ gpinitsystem -c gpinitsystem_config -h hostfile_gpinitsystem -s mdw -S /data/standby/
20200619:21:15:17:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20200619:21:15:17:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Reading Greenplum configuration file gpinitsystem_config
20200619:21:15:17:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Locale has not been set in gpinitsystem_config, will set to default value
20200619:21:15:17:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Locale set to en_US.utf8
20200619:21:15:17:001508 gpinitsystem:mdw:gpadmin-[INFO]:-MASTER_MAX_CONNECT not set, will set to default value 250
20200619:21:15:18:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Checking configuration parameters, Completed
20200619:21:15:18:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Commencing multi-home checks, please wait...
....
20200619:21:15:19:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Configuring build for multi-home array
20200619:21:15:19:001508 gpinitsystem:mdw:gpadmin-[FATAL]:-Inconsistency between number of multi-home hostnames and number of segments per host
20200619:21:15:19:001508 gpinitsystem:mdw:gpadmin-[INFO]:-Have 3 data directories and 2 multi-home hostnames for each host
20200619:21:15:19:001508 gpinitsystem:mdw:gpadmin-[INFO]:-For multi-home configuration, number of segment instance data directories per host must be multiple of
20200619:21:15:19:001508 gpinitsystem:mdw:gpadmin-[INFO]:-the number of multi-home hostnames within the GPDB array
20200619:21:15:19:001508 gpinitsystem:mdw:gpadmin-[FATAL]:-Unable to continue Script Exiting!


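Cause

The hostfile lists only two hostnames (segment interfaces) per host, while DATA_DIRECTORY in the config file specifies three segment instances per host. Three is not a multiple of two, so the instances cannot be balanced across the interfaces and gpinitsystem exits.

Fix

Change DATA_DIRECTORY to four segment instances per segment host, or change the hostfile so that it lists three interfaces per host.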
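
The rule quoted in the log (data directories per host must be a multiple of the number of multi-home hostnames) is simple arithmetic. A minimal sketch follows; multihome_ok is a hypothetical helper, and both counts are supplied by hand from the config file and the hostfile.

```shell
# Succeed when the number of segment data directories per host is a
# multiple of the number of hostnames (interfaces) listed per host.
# $1: data directories per host, $2: multi-home hostnames per host
multihome_ok() {
    dirs=$1 ifaces=$2
    [ "$ifaces" -gt 0 ] && [ $((dirs % ifaces)) -eq 0 ]
}

# Usage with the numbers from the log above (3 directories, 2 hostnames):
#   multihome_ok 3 2 || echo "adjust DATA_DIRECTORY or the hostfile"
```

With 3 and 2 the check fails, as in the log; 4 and 2, or 3 and 3, both pass, which corresponds to the two fixes described above.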

Errors when adding a Standby Master

The directory given to gpinitstandby -S already exists

[gpadmin@mdw gpconfigs]$ gpinitstandby -s mdw -S /data/standby/ -P 5433
20200619:21:27:37:011844 gpinitstandby:mdw:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
20200619:21:27:38:011844 gpinitstandby:mdw:gpadmin-[INFO]:-Checking for data directory /data/standby/ on mdw
20200619:21:27:38:011844 gpinitstandby:mdw:gpadmin-[ERROR]:-Data directory already exists on host mdw
20200619:21:27:38:011844 gpinitstandby:mdw:gpadmin-[ERROR]:-If you want to initialize a new standby on the same host as the master (not recommended), use -S and -P to specify a new data directory and port
20200619:21:27:38:011844 gpinitstandby:mdw:gpadmin-[ERROR]:-Failed to create standby
20200619:21:27:38:011844 gpinitstandby:mdw:gpadmin-[ERROR]:-Error initializing standby master: master data directory exists


Fix

Check the directory. If it already exists, choose a different one or delete it; gpinitstandby creates the directory itself.
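
This check is easy to script before calling gpinitstandby. A minimal sketch, to be run on the host that will hold the standby; standby_dir_free is a hypothetical helper name.

```shell
# Succeed when the path does not exist yet, so gpinitstandby can create it.
standby_dir_free() {
    [ ! -e "$1" ]
}

# Usage (directory taken from the transcript above):
#   standby_dir_free /data/standby || echo "/data/standby already exists"
```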

A Standby Master instance on the same machine conflicts with the Master instance when it uses the default port

[gpadmin@mdw data]$ gpinitstandby -s mdw -S /data/standby/
20200619:21:29:03:012052 gpinitstandby:mdw:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
20200619:21:29:03:012052 gpinitstandby:mdw:gpadmin-[INFO]:-Checking for data directory /data/standby/ on mdw
20200619:21:29:04:012052 gpinitstandby:mdw:gpadmin-[ERROR]:-Failed to create standby
20200619:21:29:04:012052 gpinitstandby:mdw:gpadmin-[ERROR]:-Error initializing standby master: cannot create standby on the same host and port


Fix

Use gpinitstandby -P to specify a port different from the Master's.

Rerun after fixing the problems

[gpadmin@mdw data]$ gpinitstandby -s mdw -S /data/standby/ -P 5532
20200619:21:29:23:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
20200619:21:29:23:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Checking for data directory /data/standby/ on mdw
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master initialization parameters
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master hostname               = mdw
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master data directory         = /data/master/gpseg-1
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master port                   = 5432
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master hostname       = mdw
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master port           = 5532
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master data directory = /data/standby/
20200619:21:29:24:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum update system catalog         = On
Do you want to continue with standby master initialization? Yy|Nn (default=N):
> y
20200619:21:29:27:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Syncing Greenplum Database extensions to standby
20200619:21:29:28:012192 gpinitstandby:mdw:gpadmin-[INFO]:-The packages on mdw are consistent.
20200619:21:29:28:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Adding standby master to catalog...
20200619:21:29:28:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Database catalog updated successfully.
20200619:21:29:28:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Updating pg_hba.conf file...
20200619:21:29:51:012192 gpinitstandby:mdw:gpadmin-[INFO]:-pg_hba.conf files updated successfully.
20200619:21:29:53:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Starting standby master
20200619:21:29:53:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Checking if standby master is running on host: mdw  in directory: /data/standby/
20200619:21:29:58:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Cleaning up pg_hba.conf backup files...
20200619:21:30:07:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Backup files of pg_hba.conf cleaned up successfully.
20200619:21:30:07:012192 gpinitstandby:mdw:gpadmin-[INFO]:-Successfully created standby master on mdw
[gpadmin@mdw data]$


Source: "ITPUB部落格" (ITPUB blog), link: http://blog.itpub.net/31439444/viewspace-2699576/. If you repost this article, please credit the source; otherwise legal action may be pursued.
