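ORA-12547: TNS:lost contact when applying a quarterly patch to 12c RAC
1. Environment:
Operating system: RHEL 7.4
Database version: 12.2.0.1
Patches already applied: the October 2019 quarterly patch
For confidentiality, names and IP addresses have been changed.
-------------- 2021-07-19: hit the same problem again when applying a 19c quarterly patch
2. Problem description
While applying the quarterly patch to GI, I ran opatchauto apply on node 1. CRS shut down normally, but when opatchauto tried to bring CRS back up it hit ORA-12547: TNS:lost contact, and the patch run failed.
Details: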
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2672: Attempting to start 'ora.mdnsd' on 'db1'
CRS-2672: Attempting to start 'ora.evmd' on 'db1'
CRS-2676: Start of 'ora.mdnsd' on 'db1' succeeded
CRS-2676: Start of 'ora.evmd' on 'db1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'db1'
CRS-2676: Start of 'ora.gpnpd' on 'db1' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'db1'
CRS-2676: Start of 'ora.gipcd' on 'db1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'db1'
CRS-2676: Start of 'ora.cssdmonitor' on 'db1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'db1'
CRS-2672: Attempting to start 'ora.diskmon' on 'db1'
CRS-2676: Start of 'ora.diskmon' on 'db1' succeeded
CRS-2676: Start of 'ora.cssd' on 'db1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'db1'
CRS-2672: Attempting to start 'ora.ctssd' on 'db1'
CRS-2676: Start of 'ora.ctssd' on 'db1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'db1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'db1'
CRS-2676: Start of 'ora.asm' on 'db1' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'db1'
ORA-12547: TNS:lost contact
ORA-12547: TNS:lost contact
ORA-15077: could not locate ASM instance serving a required diskgroup
CRS-2883: Resource 'ora.storage' failed during Clusterware stack start.
CRS-4406: Oracle High Availability Services synchronous start failed.
CRS-4000: Command Start failed, or completed with errors.
2020/06/10 16:46:12 CLSRSC-117: Failed to start Oracle Clusterware stack
After fixing the cause of failure Run opatchauto resume
]
OPATCHAUTO-68061: The orchestration engine failed.
OPATCHAUTO-68061: The orchestration engine failed with return code 1
OPATCHAUTO-68061: Check the log for more details.
OPatchAuto failed.
OPatchauto session completed at Wed Jun 10 16:46:13 2020
Time taken to complete the session 17 minutes, 25 seconds
opatchauto failed with error code 42
Running crsctl check crs to inspect the cluster services showed that Cluster Ready Services and Event Manager were not healthy. The log /u01/app/grid/diag/crs/db1/crs/trace/alert.log contained the following:
2020-06-10 16:36:08.150 [ORAROOTAGENT(679524)]CRS-5019: All OCR locations are on ASM disk groups [OCR], and none of these disk groups are mounted. Details are at "(:CLSN00140:)" in "/u01/app/grid/diag/crs/db1/crs/trace/ohasd_orarootagent_root.trc".
2020-06-10 16:45:58.112 [ORAROOTAGENT(679524)]CRS-5818: Aborted command 'start' for resource 'ora.storage'. Details at (:CRSAGF00113:) {0:5:3} in /u01/app/grid/diag/crs/db1/crs/trace/ohasd_orarootagent_root.trc.
2020-06-10 16:46:00.427 [ORAROOTAGENT(679524)]CRS-5017: The resource action "ora.storage start" encountered the following error:
2020-06-10 16:46:00.427+Storage agent start action aborted. For details refer to "(:CLSN00107:)" in "/u01/app/grid/diag/crs/db1/crs/trace/ohasd_orarootagent_root.trc".
2020-06-10 16:46:00.429 [OHASD(679429)]CRS-2757: Command 'Start' timed out waiting for response from the resource 'ora.storage'. Details at (:CRSPE00221:) {0:5:3} in /u01/app/grid/diag/crs/db1/crs/trace/ohasd.trc.
2020-06-10 16:47:12.689 [OSYSMOND(684959)]CRS-8500: Oracle Clusterware OSYSMOND process is starting with operating system process ID 684959
The log /u01/app/grid/diag/crs/db1/crs/trace/ohasd_orarootagent_root.trc shows:
2020-06-09 16:34:08.611 :CLSDYNAM:3440035584: [ora.storage]{0:1:251} [check] StorageAgent::parsekgforetcodes retcode = 0, kgfoCheckMount(OCR), flag 4
2020-06-09 16:34:08.611 :CLSDYNAM:3440035584: [ora.storage]{0:1:251} [check] StorageAgent::check kgfo returncode 0
2020-06-09 16:34:11.243 :CLSDYNAM:3429529344: [ ora.crf]{0:5:3} [check] Check return = 0, state detail = NULL
2020-06-09 16:34:15.926 :CLSDYNAM:2525263616: [ora.ctssd]{0:5:3} [check] translateReturnCodes, return = 0, state detail = OBSERVERCheckcb data [0x7f2369c293d0]: mode[0xee] offset[0 ms].
2020-06-09 16:34:19.107 :CLSDYNAM:3425326848: [ora.drivers.acfs]{0:0:2} [check] Arg Value = check
2020-06-09 16:34:19.107 :CLSDYNAM:3425326848: [ora.drivers.acfs]{0:0:2} [check] Utils::getOracleHomeAttrib getEnvVar oracle_home:/u01/app/12.2.0/grid
2020-06-09 16:34:19.107 :CLSDYNAM:3425326848: [ora.drivers.acfs]{0:0:2} [check] Utils::getOracleHomeAttrib oracle_home:/u01/app/12.2.0/grid
2020-06-09 16:34:19.107 :CLSDYNAM:3425326848: [ora.drivers.acfs]{0:0:2} [check] Utils::execCmd 1 USR_ORA_ENV: oracleHome:/u01/app/12.2.0/grid CrsHome:/u01/app/12.2.0/grid
2020-06-09 16:34:19.107 :CLSDYNAM:3425326848: [ora.drivers.acfs]{0:0:2} [check] Adding Environment Variables ORA_DAEMON_LOGGING_LEVELS=GIPCLIB=2,GIPC=2,GIPCGEN=2,GIPCTRAC=2,GIPCWAIT=2,GIPCXCPT=2,GIPCOSD=2,GIPCBASE=2,GIPCCLSA=2,GIPCCLSC=2,GIPCEXMP=2,GIPCGMOD=2,GIPCHEAD=2,GIPCMUX=2,GIPCNET=2,GIPCNULL=2,GIPCPKT=2,GIPCSMEM=2,GIPCHAUP=2,GIPCHALO=2,GIPCHTHR=2,GIPCHGEN=2,GIPCHLCK=2,GIPCHDEM=2,GIPCHWRK=2,GIPCTLS=2,GIPCHGNS=2
2020-06-09 16:34:19.107 :CLSDYNAM:3425326848: [ora.drivers.acfs]{0:0:2} [check] Adding Environment Variables ORA_DAEMON_TRACE_FILE_OPTIONS=filesize=26214400,numsegments=10
2020-06-09 16:34:19.107 :CLSDYNAM:3425326848: [ora.drivers.acfs]{0:0:2} [check] Adding Environment Variables _ORA_AGENT_ACTION=TRUE
2020-06-09 16:34:19.107 :CLSDYNAM:3425326848: [ora.drivers.acfs]{0:0:2} [check] Adding Environment Variables __IS_HASD_AGENT=
2020-06-09 16:34:19.107 :CLSDYNAM:3425326848: [ora.drivers.acfs]{0:0:2} [check] Utils:execCmd action = 3 flags = 6 ohome = (null) cmdname = acfsload.
2020-06-09 16:34:19.108 :CLSDYNAM:3425326848: [ora.drivers.acfs]{0:0:2} [check] Utils::getOracleHomeAttrib getEnvVar oracle_home:/u01/app/12.2.0/grid
2020-06-09 16:34:19.108 :CLSDYNAM:3425326848: [ora.drivers.acfs]{0:0:2} [check] Utils::getOracleHomeAttrib oracle_home:/u01/app/12.2.0/grid
2020-06-09 16:34:19.219 :CLSDYNAM:3425326848: [ora.drivers.acfs]{0:0:2} [check] execCmd ret = 0
2020-06-09 16:34:20.911 :CLSDYNAM:3427428096: [ora.driver.afd]{0:0:2} [check] Utils:execCmd action = 3 flags = 38 ohome = (null) cmdname = afddriverstate.
2020-06-09 16:34:20.911 :CLSDYNAM:3427428096: [ora.driver.afd]{0:0:2} [check] Utils::getOracleHomeAttrib getEnvVar oracle_home:/u01/app/12.2.0/grid
2020-06-09 16:34:20.911 :CLSDYNAM:3427428096: [ora.driver.afd]{0:0:2} [check] Utils::getOracleHomeAttrib oracle_home:/u01/app/12.2.0/grid
2020-06-09 16:34:21.121 :CLSDYNAM:3427428096: [ora.driver.afd]{0:0:2} [check] execCmd ret = 0
2020-06-09 16:34:22.763 : USRTHRD:2548033280: {0:5:3} Check: 0-1
2020-06-09 16:34:26.337 : AGFW:3435833088: {0:0:2} Agent received the message: AGENT_HB[Engine] ID 12293:2086505
2020-06-09 16:34:32.281 : CLSDMC:3425326848: command 0 failed with status 16843265
2020-06-09 16:34:32.282 :CLSDYNAM:3425326848: [ora.crsd]{0:1:52} [check] DaemonAgent::check returned 0
2020-06-09 16:34:32.282 :CLSDYNAM:3425326848: [ora.crsd]{0:1:52} [check] DaemonAgent::check checkErrorCode=16843265, pestate=512,perole=65536, pemode=1
2020-06-09 16:34:38.611 :CLSDYNAM:3427428096: [ora.storage]{0:1:251} [check] StorageAgent::check NODEROLE_HUB getOCRdetails
2020-06-09 16:34:38.686 :CLSDYNAM:3427428096: [ora.storage]{0:1:251} [check] StorageAgent::parsekgforetcodes retcode = 0, kgfoCheckMount(OCR), flag 2
2020-06-09 16:34:38.686 :CLSDYNAM:3427428096: [ora.storage]{0:1:251} [check] StorageAgent::check kgfo returncode 0
2020-06-09 16:34:41.239 :CLSDYNAM:3444238080: [ ora.crf]{0:5:3} [check] Check return = 0, state detail = NULL
2020-06-09 16:34:45.921 :CLSDYNAM:3444238080: [ora.ctssd]{0:5:3} [check] translateReturnCodes, return = 0, state detail = OBSERVERCheckcb data [0x7f23baa99360]: mode[0xee] offset[0 ms].
2020-06-09 16:34:52.763 : USRTHRD:2548033280: {0:5:3} Check: 0-1
2020-06-09 16:34:56.341 : AGFW:3435833088: {0:0:2} Agent received the message: AGENT_HB[Engine] ID 12293:2086522
2020-06-09 16:35:02.285 : CLSDMC:2525263616: command 0 failed with status 16843265
2020-06-09 16:35:02.285 :CLSDYNAM:2525263616: [ora.crsd]{0:1:52} [check] DaemonAgent::check returned 0
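3. Fault analysis
The alert log shows that CRS failed to start because it could not find the OCR disk group mounted.
Starting the ASM instance by hand allowed CRS to start: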
sqlplus / as sysasm
startup
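After stopping CRS and running crsctl start crs again, the ASM instance still did not come up automatically; it could only be started with a manual startup.
The log /u01/app/grid/diag/crs/db1/crs/trace/crsd.trc contained the following: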
2020-06-10 16:48:24.867*:kgfn.c@1370: kgfnFindLocalNode: found no members
2020-06-10 16:48:24.867 : OCRRAW:633798784: kgfnFindLocalNode: not ok
2020-06-10 16:48:24.867*:kgfn.c@1422: kgfnFindLocalNode: not ok
2020-06-10 16:48:24.867 : OCRRAW:633798784: kgfnTgtInit: local node not found, free kgfnpds
2020-06-10 16:48:24.867*:kgfn.c@2208: kgfnTgtInit: not found
2020-06-10 16:48:24.867 : OCRRAW:633798784: kgfnGetBeqData failed init target; inst=(null) flags=0x6000
2020-06-10 16:48:24.867*:kgfn.c@5791: kgfnGetBeqData: kgfnTgtInit failed, inst=NULL flags=0x6000
2020-06-10 16:48:24.867 : CLSNS:633798784: clsns_SetTraceLevel:trace level set to 1.
2020-06-10 16:48:24.900 : OCRRAW:633798784: 8154 Error 4 querying length of attr ASM_DISCOVERY_ADDRESS
2020-06-10 16:48:24.904 : OCRRAW:633798784: 8154 Error 4 querying length of attr ASM_STATIC_DISCOVERY_ADDRESS
2020-06-10 16:48:24.940 : OCRRAW:633798784: 7872 Error 4 opening dom root in 0x5fb60d0
2020-06-10 16:48:24.958 : OCRRAW:633798784: kgfnConnect2: kgfnGetBeqData failed
2020-06-10 16:48:24.958*:kgfn.c@5012: kgfnConnect2: kgfnGetBeqData failed
2020-06-10 16:48:24.992 : OCRRAW:633798784: kgfnConnect2Int: cstr=(DESCRIPTION=(TCP_USER_TIMEOUT=1)(TRANSPORT_CONNECT_TIMEOUT=60)(EXPIRE_TIME=1)(ADDRESS_LIST=(LOAD_BALANCE=ON)(ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.43)(PORT=1526)))(CONNECT_DATA=(SERVICE_NAME=+ASM)))
2020-06-10 16:48:24.992*:kgfn.c@6788: kgfnConnect2Int: cstr=(DESCRIPTION=(TCP_USER_TIMEOUT=1)(TRANSPORT_CONNECT_TIMEOUT=60)(EXPIRE_TIME=1)(ADDRESS_LIST=(LOAD_BALANCE=ON)(ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.43)(PORT=1526)))(CONNECT_DATA=(SERVICE_NAME=+ASM)))
2020-06-10 16:48:24.993 : OCRRAW:633798784: kgfnConnect2Int: ServerAttach
2020-06-10 16:48:24.993*:kgfn.c@6811: kgfnConnect2Int: OCIServerAttach failed
Failed to connect to ASM 1 0 0 (nil) 0
2020-06-10 16:48:25.994 : OCRRAW:633798784: kgfnRecordErr 12547 OCI error: ORA-12547: TNS:lost contact
2020-06-10 16:48:25.994*:kgfn.c@1740: kgfnRecordErrPriv: 12547 error=ORA-12547: TNS:lost contact
2020-06-10 16:48:25.995 : OCRRAW:633798784: kgfnConnect2: failed to connect
2020-06-10 16:48:25.995*:kgfn.c@5333: kgfnConnect2: failed to connect
2020-06-10 16:48:25.995 : OCRRAW:633798784: kgfnConnect2Retry: failed to connect connect after 2 attempts, 331s elapsed
2020-06-10 16:48:25.995 : OCRRAW:633798784: kgfo_kge2slos error stack at kgfoAl06: ORA-12547: TNS:lost contact
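In a long crsd.trc the connect descriptor is easy to miss, so it can help to pull the target addresses out mechanically. The following is a minimal sketch, not an Oracle tool: the function name connect_targets and the regular expression are my own, and it assumes the descriptor is written in the (HOST=...)(PORT=...) form seen in the kgfnConnect2Int lines above.

```python
import re

# Matches the "(HOST=...)(PORT=...)" pair inside a TNS connect descriptor.
# Assumption: HOST is immediately followed by PORT, as in the trace above.
ADDRESS_RE = re.compile(r"\(HOST=([^)]+)\)\(PORT=(\d+)\)", re.IGNORECASE)


def connect_targets(trace_text):
    """Return distinct (host, port) pairs found in the text, in first-seen order."""
    seen = []
    for host, port in ADDRESS_RE.findall(trace_text):
        target = (host, int(port))
        if target not in seen:
            seen.append(target)
    return seen


if __name__ == "__main__":
    line = (
        "kgfnConnect2Int: cstr=(DESCRIPTION=(TCP_USER_TIMEOUT=1)"
        "(TRANSPORT_CONNECT_TIMEOUT=60)(EXPIRE_TIME=1)"
        "(ADDRESS_LIST=(LOAD_BALANCE=ON)"
        "(ADDRESS=(PROTOCOL=tcp)(HOST=10.10.10.43)(PORT=1526)))"
        "(CONNECT_DATA=(SERVICE_NAME=+ASM)))"
    )
    print(connect_targets(line))
```

To run it against the real file, pass the contents of crsd.trc (for example open(path).read()) instead of the sample line.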
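ORA-12547: TNS:lost contact appears in this log file. It is raised because the connection to port 1526 on the private IP 10.10.10.43 failed.
4. Root cause
Port 1526 is served by the ASMNET1LSNR_ASM listener. In 12c RAC, starting with Flex ASM, crsd.bin uses the ASMNET1LSNR_ASM listener for remote connections, which is why it connects to port 1526 on 10.10.10.43.
So why did the connection to port 1526 on node 2 fail?
Checking sqlnet.ora on node 2 showed that a whitelist was configured, but node 1's private IP had not been added to it.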
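A quick way to see this from node 1 is to open a bare TCP connection to the listener port and watch what the other side does. The sketch below is only an illustration: probe_listener is a hypothetical helper, and it rests on my assumption that a listener with valid-node checking accepts the TCP connection and then drops it immediately for hosts that are not on the list. It speaks no TNS protocol, so "open" only means the peer did not hang up within the timeout.

```python
import socket


def probe_listener(host, port, timeout=3.0):
    """Classify how a TCP endpoint reacts to a bare connection attempt.

    Returns one of:
      "unreachable"    - connect failed (refused, timed out, no route)
      "closed_by_peer" - connect succeeded but the peer hung up at once
      "open"           - connect succeeded and the peer kept the socket open
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "unreachable"
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(1)
        except socket.timeout:
            # Nothing arrived and nobody hung up: the peer is waiting for us.
            return "open"
        except OSError:
            # For example a connection reset sent by the peer.
            return "closed_by_peer"
        # An empty read means the peer closed the connection cleanly.
        return "closed_by_peer" if data == b"" else "open"


if __name__ == "__main__":
    # Address and port taken from the crsd.trc excerpt in this post.
    print(probe_listener("10.10.10.43", 1526))
```

If the result is "closed_by_peer" from node 1 but "open" from a host that is on the whitelist, sqlnet.ora on the target node is the first place to look.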
5. Solution
1) Modify the file sqlnet.ora
vi $GRID_HOME/network/admin/sqlnet.ora
2) Add the IPs that are used for the private interconnect.
e.g.
TCP.VALIDNODE_CHECKING = YES
TCP.INVITED_NODES=(node1.localhost, node2.localhost,node1-priv.localhost,node2-priv.localhost, node1-vip.localhost, node2-vip.localhost, application server VIPS)
3) Restart CRS on the affected node.
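Before restarting CRS it is worth confirming that every cluster address really made it into the list. Here is a minimal sketch of such a check; invited_nodes and missing_nodes are hypothetical helper names, and the comparison is a plain case-insensitive string match, so a host listed by name will not match the same host given as an IP.

```python
import re


def invited_nodes(sqlnet_text):
    """Return the entries of TCP.INVITED_NODES, or [] if the parameter is absent."""
    # Drop comment lines first; the value itself may wrap across lines.
    body = "\n".join(
        line for line in sqlnet_text.splitlines()
        if not line.lstrip().startswith("#")
    )
    match = re.search(
        r"TCP\.INVITED_NODES\s*=\s*\(([^)]*)\)", body, re.IGNORECASE
    )
    if match is None:
        return []
    return [item.strip() for item in match.group(1).split(",") if item.strip()]


def missing_nodes(sqlnet_text, required):
    """Return the required entries that are not listed (case-insensitive)."""
    listed = {name.lower() for name in invited_nodes(sqlnet_text)}
    return [name for name in required if name.lower() not in listed]


if __name__ == "__main__":
    # Sample content only; the host names follow the example in step 2.
    sample = """
    TCP.VALIDNODE_CHECKING = YES
    TCP.INVITED_NODES=(node1.localhost, node2.localhost,
                       node2-priv.localhost)
    """
    print(missing_nodes(sample, ["node1-priv.localhost", "node2-priv.localhost"]))
```

In practice the text would come from $GRID_HOME/network/admin/sqlnet.ora on each node, and the required list would hold the public, private and VIP names or addresses of all nodes.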
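6. Summary
Before patching, it is best to archive and back up the ORACLE_HOME directories of both the oracle and grid users, so that you can roll back cleanly if the patch fails.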
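As an illustration of the backup step, here is a minimal sketch that archives one home directory into a timestamped tar.gz. backup_home is a hypothetical helper, not part of OPatch. I am assuming it is run as root with CRS and the databases stopped, since the homes contain root-owned and setuid files; a plain tar command run as root does the same job.

```python
import os
import tarfile
import time


def backup_home(home_dir, dest_dir, stamp=None):
    """Archive home_dir into dest_dir as <basename>_<stamp>.tar.gz and return the path."""
    stamp = stamp or time.strftime("%Y%m%d_%H%M%S")
    name = os.path.basename(os.path.normpath(home_dir))
    archive = os.path.join(dest_dir, "{}_{}.tar.gz".format(name, stamp))
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the home's own name.
        tar.add(home_dir, arcname=name)
    return archive


if __name__ == "__main__":
    # /u01/app/12.2.0/grid is the grid home seen in the trace above;
    # /backup is a placeholder destination that must already exist.
    print(backup_home("/u01/app/12.2.0/grid", "/backup"))
```

The archive should be written to a filesystem outside the home being backed up, with enough free space for the compressed copy.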
Source: "ITPUB Blog", link: http://blog.itpub.net/25462274/viewspace-2781993/. If you repost this article, please credit the source; otherwise legal action may be pursued.