OMS to Agent Comm Fails if IP Address of the Grid Ag Machine is Changed-605009.1

Posted by rongshiyuan on 2013-06-22
Communication: OMS to Agent Communication Fails if IP Address of the Grid Agent Machine is Changed [ID 605009.1]
 

In this Document
Symptoms
Changes
Cause
Solution
References


Applies to:

Enterprise Manager Base Platform - Version: 10.2.0.1 to 10.2.0.5 - Release: 10.2 to 10.2
Information in this document applies to any platform.

Symptoms

The OMS cannot communicate with the Agent. Any action involving communication from the OMS to the Agent fails, although Agent to OMS communication is working fine.

- In Grid Control, the Setup -> Agents -> Agent Home page displays the following:

"Communication from the Oracle Management Service host to the agent host failed. Refer to help for details. Connection timed out. "

- In 10.2.0.5 Grid Console version, the Setup -> Agents -> Agent homepage will show:

Communication between the Oracle Management Service host to the Agent host is unavailable. Any functions or displayed information requiring this communication link will be unavailable. For example: deleting/configuring/adding targets, uploading metric data, or displaying Agent home page information such as Agent to Management Service Response Time (ms).

- The "Upload Metric Data" button is not available.

- Setting preferred credentials, submitting jobs against the agent host or trying to configure database targets fails.

- The following error message is seen either in the UI when trying to configure database targets on this host or in the OMS trace files.

Error : oracle.sysman.emSDK.emd.comm.CommException: Connection refused

- Agent uploads are working fine.
Name resolution and connectivity work fine both ways.
There are no firewalls or proxies between the OMS and agent.

- Trying the 'Agent Resynchronisation' option in the 10.2.0.5 Grid console returns:

Agent Operation completed with errors. For those targets that could not be saved, please go to the target's monitoring configuration page to save them. All other targets have been saved successfully. Agent has not been unblocked.

Error communicating with the agent. Exception message - oracle.sysman.emSDK.emd.comm.CommException: java.net.NoRouteToHostException: No route to host

And the OMS -> Agent communication still does not work.

- Clicking on the Targets -> Hosts -> hostname of this machine hangs for a long time and results in the following error seen in the /sysman/log/emoms.trc file:

2010-07-22 07:13:29,569 [EMUI_07_09_44_/console/monitoring/hostOverview$ctxType=Hosts$selTab=0$target=agentmachine.domain$type=host] ERROR host.HostOverviewDataObject getLogonInfo.2616 - java.net.ConnectException: Connection timed out
oracle.sysman.emSDK.emd.comm.CommException: java.net.ConnectException: Connection timed out



Changes

The IP address of the Grid Agent machine was changed, or the machine is configured to get its IP address from a DHCP server and was recently rebooted, resulting in a change of IP address.

Cause

Even if the nslookup command / hosts file on the OMS machine correctly resolves the Agent hostname to the new IP address, the Java runtime in the OMS has cached the old IP address.

This was investigated in Bug 5899294: CHANGING IP ADDRESS OF THE AGENT REQUIRES RESTART OF THE OMS

Reference :
http://java.sun.com/j2se/1.4.2/docs/api/java/net/InetAddress.html
Topic : InetAddress Caching
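
The caching described under the InetAddress topic referenced above is controlled by the Java security property networkaddress.cache.ttl. As a minimal sketch (the class name CacheTtlCheck and the describe helper are hypothetical and not part of Enterprise Manager), the following program prints the policy configured for the JVM that runs it. It only says something about the OMS if it is run with the same JDK, and therefore the same java.security file, as the OMS.

```java
import java.security.Security;

// Hypothetical helper, for illustration only.
class CacheTtlCheck {

    // Translates a networkaddress.cache.ttl value into a readable policy.
    static String describe(String ttl) {
        if (ttl == null) {
            // Property absent from java.security: the JVM default applies.
            return "not set (JVM default applies)";
        }
        int seconds = Integer.parseInt(ttl.trim());
        if (seconds < 0) {
            return "cache forever";
        }
        if (seconds == 0) {
            return "do not cache";
        }
        return "cache for " + seconds + " seconds";
    }

    public static void main(String[] args) {
        // Reads the security property as seen by this JVM.
        String ttl = Security.getProperty("networkaddress.cache.ttl");
        System.out.println("networkaddress.cache.ttl: " + describe(ttl));
    }
}
```

A result of "cache forever" would match the symptom in this note: a successful lookup of the Agent hostname is kept until the process is restarted.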

Solution

There are two possible solutions :

1) Set a finite TTL value (controlled through the Java security properties) for positive host name resolution caching

1. Take a backup of the /jdk/jre/lib/security/java.security file.
2. Change

networkaddress.cache.ttl=-1

To

networkaddress.cache.ttl=180

The default value of '-1' means that Java caches all successful hostname lookups indefinitely, until the process (here, the OMS) is restarted.
With a value of 180, Java caches any successful hostname lookup for 180 seconds only.
This value can be tuned according to the setup and the number of machines monitored, because it affects the number of host name resolution calls made by the OMS.

If the reverse lookup of the old IP address of the Agent machine fails, then the old cached data is cleared by the OMS and the new IP address of the Agent machine is saved in the cache. This parameter will not be effective if the OMS is still able to resolve the old IP address of the Agent machine to the same hostname as before the IP address change.

In such a case, it is necessary to ensure that the DNS tables / hosts file is correctly populated with the new IP address of the Agent machine or follow the solution below.
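
Steps 1 and 2 of this solution can also be scripted. The sketch below is one possible way to do it, not an Oracle-supplied tool: the class name CacheTtlEditor, the withTtl helper and the ".bak" backup suffix are assumptions made for illustration, and the code assumes a standalone Java 8 or later JDK rather than the JDK bundled with the OMS. The path to java.security is passed as an argument because it depends on the installation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical utility, for illustration only.
class CacheTtlEditor {
    static final String KEY = "networkaddress.cache.ttl";

    // Returns a copy of the lines with every active (uncommented) TTL entry
    // set to ttlSeconds; appends an entry if no active one exists.
    static List<String> withTtl(List<String> lines, int ttlSeconds) {
        List<String> out = new ArrayList<>();
        boolean replaced = false;
        for (String line : lines) {
            if (line.matches("\\s*networkaddress\\.cache\\.ttl\\s*=.*")) {
                out.add(KEY + "=" + ttlSeconds);
                replaced = true;
            } else {
                out.add(line); // comments and other properties stay untouched
            }
        }
        if (!replaced) {
            out.add(KEY + "=" + ttlSeconds);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java CacheTtlEditor <path to java.security>");
            return;
        }
        Path file = Paths.get(args[0]);
        // Step 1: take a backup next to the original file.
        Files.copy(file, Paths.get(args[0] + ".bak"), StandardCopyOption.REPLACE_EXISTING);
        // Step 2: write the file back with the TTL set to 180 seconds.
        Files.write(file, withTtl(Files.readAllLines(file), 180));
    }
}
```

Security properties are normally read when the JVM starts, so the edited value would presumably only apply to the OMS after its next restart.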

2) Restart the Grid Control Management Server(s):

OMS_HOME/opmn/bin/opmnctl stopall
OMS_HOME/opmn/bin/opmnctl startall

The cached information is refreshed after the OMS restart.
This is described in the Enterprise Manager Installation Guide.

References

BUG:5899294 - CHANGING IP ADDRESS OF THE AGENT REQUIRES RESTART OF THE OMS
http://docs.oracle.com/cd/B16240_01/doc/install.102/e10953/post-install_config_tasks.htm#insertedID8

From the ITPUB blog, link: http://blog.itpub.net/17252115/viewspace-764585/. If you repost this article, please cite the source; otherwise legal action may be taken.