Workload Management (WLM)

This is an overview of the options available for WLM in WebSphere Application Server v6, along with some common WLM issues and their resolutions.

There are two types of Workload Management (WLM) in WebSphere Application Server:
- Web Server Plug-In WLM
- Enterprise Java Bean (EJB) WLM

=> EJB WLM balances WLM-enabled RMI/IIOP requests between clients and clusters

The load balancing occurs for the following:
* JNDI Lookups
* EJB creates
* EJB business methods
* EJB removes

=> Types of clients that support EJB WLM

* Application servers hosting the client application
* WebSphere-managed application clients
* Standalone Java clients

=> WLM Routing

How does it happen?
* Routing is based on weights associated with cluster members
* Round Robin routing when weights are equal
* Weights can be modified to send more requests to a particular cluster member or members

Note: If the client is on the same physical box as the cluster, the “Prefer Local” setting will ensure that all requests from the client go to the local cluster member.
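
The routing rules above can be pictured with a small simulation. This is only a simplified model written for illustration, not IBM's actual routing code: the class name WeightedRouter, its parameters, and the "decrement the weight, then reset when all weights are used up" behaviour are assumptions chosen to match the description in this section.

```python
from collections import Counter


class WeightedRouter:
    """Simplified model of weight-based routing (illustrative, not IBM's code)."""

    def __init__(self, weights, local_member=None, prefer_local=False):
        if not any(w > 0 for w in weights.values()):
            raise ValueError("at least one cluster member needs a positive weight")
        self.configured = dict(weights)   # member name -> configured weight
        self.current = dict(weights)      # working weights, decremented per request
        self.local_member = local_member  # member on the same host as the client
        self.prefer_local = prefer_local
        self._order = list(weights)
        self._next = 0

    def route(self):
        # Prefer Local: every request goes to the member on the client's host.
        if self.prefer_local and self.local_member in self.configured:
            return self.local_member
        # Once every working weight is used up, reset to the configured values.
        if all(w <= 0 for w in self.current.values()):
            self.current = dict(self.configured)
        # Walk the members in round-robin order, skipping exhausted ones.
        while True:
            member = self._order[self._next % len(self._order)]
            self._next += 1
            if self.current[member] > 0:
                self.current[member] -= 1
                return member


if __name__ == "__main__":
    equal = WeightedRouter({"member1": 2, "member2": 2})
    print(Counter(equal.route() for _ in range(8)))

    skewed = WeightedRouter({"member1": 3, "member2": 1})
    print(Counter(skewed.route() for _ in range(8)))
```

With equal weights the model alternates between the members; raising one member's weight gives it a proportionally larger share of the requests.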

=> Commonly encountered errors

* EJB requests not load balanced between cluster members
• Uneven routing
• No requests sent to a particular member
* NO_IMPLEMENT errors
• No Available Target
• No Cluster Data Available
• Forward Limit Reached/Retry Limit Reached

=> Basic Troubleshooting

1. Gather ORB/WLM tracing on the client where the routing problem occurs (ORBRas=all:WLM*=all)
2. Find all of the three-parameter getConnection() lines
3. Determine whether the host/port for similar operations rotates between the cluster members
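
Steps 2 and 3 can be partly automated with a short script that tallies how often each host/port shows up on getConnection() lines. The sketch below is only a starting point: the "host=... port=..." layout matched by the regular expression is a placeholder assumption, not the real trace format, so adjust the pattern to whatever your trace lines actually contain. The function name tally_targets is likewise made up for this example.

```python
import re
from collections import Counter

# ASSUMPTION: placeholder pattern, not the real ORB trace layout.
# Adapt it to the host/port fields as they appear in your own trace.
HOST_PORT = re.compile(r"host=(?P<host>[\w.\-]+).*?port=(?P<port>\d+)")


def tally_targets(lines):
    """Count (host, port) pairs seen on lines that mention getConnection()."""
    counts = Counter()
    for line in lines:
        if "getConnection(" not in line:
            continue  # only connection lookups are of interest
        match = HOST_PORT.search(line)
        if match:
            counts[(match.group("host"), int(match.group("port")))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python tally_targets.py trace.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as trace:
        for (host, port), hits in tally_targets(trace).most_common():
            print(f"{host}:{port}  {hits}")
```

If one host/port pair dominates the tally while the configured weights are equal, the requests are not rotating between the cluster members.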

=> WLM trace points

* Keyword “Unexpected”
• Trace point to indicate things that should not be happening
* popServerForInvocation()
• The selected cluster member target is printed at trace exit; use it to confirm WLM routing
* setObservedWeight()
• Used to decrement the weight of a cluster member during request routing. Check that the maximum observed weight is one less (n-1) than the configured value n for the cluster member

=> Further troubleshooting

* Check the configured Weights for each cluster member under Servers > Clusters > NAME > Cluster Members
* Check for a static routing table in the cluster config directory on the target EJB cell: /config/cells/cell_name/clusters/cluster_name/cluster_name.wsrttbl
* Check the Prefer Local setting for the cluster if the client and cluster member receiving all the requests are on the same host
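
The static routing table check can be scripted. This is a minimal sketch, assuming the /config/... path above is relative to a root directory that you supply yourself; the parameter name profile_root and both function names are assumptions made for this example.

```python
from pathlib import Path


def static_routing_table(profile_root, cell_name, cluster_name):
    """Build the expected path of the static routing table file."""
    return (
        Path(profile_root) / "config" / "cells" / cell_name
        / "clusters" / cluster_name / f"{cluster_name}.wsrttbl"
    )


def has_static_routing_table(profile_root, cell_name, cluster_name):
    """Return True if a static routing table file exists for the cluster."""
    return static_routing_table(profile_root, cell_name, cluster_name).is_file()
```

Call has_static_routing_table() with your own root directory, cell name, and cluster name; a True result means a .wsrttbl file is present in the cluster config directory.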

=> HA Manager

* WLM requires the HA Manager service to be enabled. WLM uses information from the HA Manager Bulletin Board
* Check for DCS view exceptions or warnings in the SystemOut.log
* Check Core Group configuration
* EJB client, EJB cluster, and cluster Nodeagents should be in the same core group

=> ORB Problems

* The base ORB is part of the SDK and is the layer beneath WLM
• Try the latest SDK cumulative fixpack for your WebSphere version to rule out known defects
• Use ORB/WLM traces to determine whether an underlying ORB exception is leading to the WLM problem

=> WLM Exceptions (some examples)

CORBA.NO_IMPLEMENT: No Cluster Data
* Where: Occurs on a Nodeagent
* Reason: Indicates that the process does not have WLM data for the target cluster and is unable to gather any
* Solution: Check HA Manager configuration to ensure Nodeagent is in the same core group. If HA config is correct, gather ORB/WLM trace on both client and Nodeagent

CORBA.NO_IMPLEMENT: No Available Target
* Where: Occurs on Nodeagents and clients
* Reason: WLM attempted to route a request, but based on the current cluster data, no target was able to service that request
* Solution: Check for known APARs at your WebSphere version (this most often resolves the problem). Gather ORB/WLM traces on the client and Nodeagent if no solution is found

CORBA.NO_IMPLEMENT: Forward Limit Reached and CORBA.NO_IMPLEMENT: Retry Limit Reached
* Where: Occurs on a Nodeagent (Forward Limit) or a client (Retry Limit)
* Reason: Error thrown when 10 consecutive errors occur while trying to route a request
* Solution: Check for known APARs at your WebSphere version. Gather ORB/WLM traces on client and Nodeagent to determine what underlying exception is leading to the retries

CORBA.TRANSIENT: SIGNAL_RETRY
* Where: Occurs on a client
* Reason: The ORB sends a request to the target that WLM chooses and never receives a reply indicating whether the request succeeded. Usually caused by a CORBA.NO_RESPONSE exception
* Solution: Gather ORB/WLM trace on the client to determine where request was sent. Next, a trace should be gathered on the target server to determine why request did not complete. The client should retry the request if this exception is caught.
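
The retry advice in the last point can be pictured as a bounded retry loop. The sketch below shows the pattern only: TransientError is a stand-in class defined here for illustration, not a real ORB exception type, and call_with_retry, max_attempts and delay_seconds are assumed names. Because the client cannot tell whether the original request completed, a loop like this is safest for requests that can be repeated without side effects.

```python
import time


class TransientError(Exception):
    """Stand-in for a CORBA TRANSIENT exception (hypothetical, for illustration)."""


def call_with_retry(request, max_attempts=3, delay_seconds=0.0):
    """Call request() and retry on TransientError, up to max_attempts tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the failure
            time.sleep(delay_seconds)  # optional pause before the next try
```

Pass the remote call as a zero-argument callable, for example call_with_retry(lambda: do_lookup()), where do_lookup is whatever function in your client issues the request.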
