Thread Dump and Java Application Diagnostics (Repost)

Posted by developerguy on 2015-03-31

 

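Thread Dump and Java Application Diagnostics
A thread dump is a very useful tool for diagnosing problems in Java applications. Every Java virtual machine can produce, on demand, a thread dump that shows the state of all threads at a single point in time. The output format differs slightly between JVMs, but every thread dump lists the threads together with each thread's run state, identifier, and call stack; the call stack includes the fully qualified class name, the method being executed and, where available, the source line number.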
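To make the contents of a thread dump concrete, here is a minimal sketch that prints the same kind of information (thread name, identifier, priority, state, and call stack) from inside a running JVM using the standard Thread.getAllStackTraces() call. The class and method names (ThreadDumpSketch, dumpAllThreads) are made up for illustration, and the output layout is only an approximation of what a JVM prints on kill -3.

```java
import java.util.Map;

class ThreadDumpSketch {
    /** Builds a plain-text dump of every live thread in the current JVM. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            // Header line: name, identifier, priority, and current state.
            out.append('"').append(t.getName()).append('"')
               .append(" id=").append(t.getId())
               .append(" prio=").append(t.getPriority())
               .append(" state=").append(t.getState())
               .append('\n');
            // One line per stack frame: class, method, file, and line number.
            for (StackTraceElement frame : e.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Unlike a signal-triggered dump, this only works from code already running inside the target JVM, so it is best read as an illustration of the data rather than a replacement for kill -3 or Ctrl-Break.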

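Characteristics of thread dumps:

- Available on every operating system
- Available on every Java application server
- Can be taken in production with little impact on system performance
- Can pinpoint a problem down to a line of application code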
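Problems that a thread dump can help diagnose include:

- Memory leaks, commonly caused by a program loading large amounts of data into a cache
- Deadlocked threads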
With Sun's JVM, a thread dump (stack information) can be produced in the following ways:

1. Solaris OS
Ctrl-\ (Control-Backslash), or
 kill -QUIT <pid>

2. HP-UX/UNIX/Linux
kill -3 <pid>
The PID can be found with:
ps -efHl | grep java

3. Windows
Press Ctrl-Break in the console (command prompt) window in which the Java process is running.

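Some Java application servers, such as WebLogic, run in a console. To make the thread dump easy to capture, redirect standard output to a file when starting WebLogic, for example with "nohup sh startWebLogic.sh > start.log &"; running "kill -3 <pid>" then writes the stack traces to start.log. Tomcat writes its thread dump to the command-line console or to the catalina.out file in its logs directory. To capture how thread states change over time, take several thread dumps in succession, 10 to 20 seconds apart.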
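The reason for taking several dumps is to see which threads stay at the same place from one dump to the next. A rough sketch of that comparison follows. It samples the current JVM with Thread.getAllStackTraces() instead of sending kill -3, and all names in it (RepeatedDumpCompare, sampleTopFrames, unchangedThreads) are hypothetical. Threads are keyed by name, so it assumes thread names are unique.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RepeatedDumpCompare {
    /** One sample: thread name -> top stack frame (assumes unique thread names). */
    static Map<String, String> sampleTopFrames() {
        Map<String, String> top = new HashMap<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            StackTraceElement[] frames = e.getValue();
            top.put(e.getKey().getName(), frames.length > 0 ? frames[0].toString() : "<no frames>");
        }
        return top;
    }

    /** Names of threads that appear in every sample with the same top frame. */
    static List<String> unchangedThreads(List<Map<String, String>> samples) {
        List<String> result = new ArrayList<>();
        if (samples.isEmpty()) {
            return result;
        }
        for (Map.Entry<String, String> e : samples.get(0).entrySet()) {
            boolean same = true;
            for (Map<String, String> sample : samples) {
                if (!e.getValue().equals(sample.get(e.getKey()))) {
                    same = false;
                    break;
                }
            }
            if (same) {
                result.add(e.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        int count = 3;                 // number of samples
        long intervalMillis = 10_000;  // 10 s apart, the low end of the 10-20 s advice
        List<Map<String, String>> samples = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                Thread.sleep(intervalMillis);
            }
            samples.add(sampleTopFrames());
        }
        System.out.println("Threads with an unchanged top frame: " + unchangedThreads(samples));
    }
}
```

A thread whose top frame never changes is only a candidate for further inspection: idle pool threads waiting for work look the same in every sample too.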

Producing a thread dump with the IBM JVM:

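When the IBM JVM runs on AIX, an out-of-memory condition by default produces a javacore file (thread and CPU information) and a heapdump file (memory information). If these files are not produced, follow the steps below:
1. Choose one cluster member and, before starting that server, set the following environment variables (they can be added to the WAS startup script):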
export IBM_HEAPDUMP=true
export IBM_HEAP_DUMP=true
export IBM_HEAPDUMP_OUTOFMEMORY=true
export IBM_HEAPDUMPDIR=<directory path>

2. Use the set command to check that the DISABLE_JAVADUMP parameter is not set, then start this cluster member.

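3. When free memory drops below 50% while the server is not under heavy load, run kill -3 <pid> to generate the javacore and heapdump files. The pid is the process ID of the WAS Java process, which can be found with ps -ef | grep java. Run the command several times, following the sequence below: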

ps -ef > psef1.txt
ps aux > psaux1.txt
vmstat 5 10 > vmstat.txt
kill -3 <app server id>
sleep 120   # wait 2 minutes
kill -3 <app server id>
sleep 120   # wait 2 minutes
kill -3 <app server id>
netstat -an > netstat2.txt
ps -ef > psef2.txt
ps aux > psaux2.txt
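Copy the txt files produced above, together with the /usr/WebSphere/AppServer/javacore* files and the heapdump files, to a local machine, then delete them from the server, because they take up a lot of file system space.
Also copy out the logs generated that day under the /usr/WebSphere/AppServer/logs/wlmserver1 (or wlmserver2) directory.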

Common states of application server web container threads in a javacore or thread dump file produced by the IBM JVM:

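Idle thread: a thread that is ready to accept requests but has no connection established with the plug-in or a client.
Keep-alive thread: a thread that is ready to accept requests and already has a connection established with the plug-in or a client.
Thread receiving a request: a thread that is currently reading the body or headers of a request.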

The following shows how each kind of thread appears in a javacore or thread dump:

Idle thread:
"Servlet.Engine.Transports : 20" (TID:0x427F190, sys_thread_t:0x15D175E8, state:R, native ID:0xBB8) prio=5
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:429)
at com.ibm.ws.util.BoundedBuffer.take(BoundedBuffer.java:161)
at com.ibm.ws.util.ThreadPool.getTask(ThreadPool.java(Compiled Code))
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java(Compiled Code))

Keep-alive thread (non-SSL mode):
"Servlet.Engine.Transports : 20" (TID:0x427F190, sys_thread_t:0x15D175E8, state:R, native ID:0xBB8) prio=5
at java.net.SocketInputStream.socketRead(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:86)
at com.ibm.ws.io.Stream.read(Stream.java)
at com.ibm.ws.io.ReadStream.readBuffer(ReadStream.java)
at com.ibm.ws.io.ReadStream.read(ReadStream.java)
at com.ibm.ws.http.HttpRequest.readRequestLine(HttpRequest.java)
at com.ibm.ws.http.HttpRequest.readRequest(HttpRequest.java)
at com.ibm.ws.http.HttpConnection.readAndHandleRequest(HttpConnection.java)
at com.ibm.ws.http.HttpConnection.run(HttpConnection.java)
at com.ibm.ws.util.CachedThread.run(ThreadPool.java)

Keep-alive thread (SSL mode):
"Servlet.Engine.Transports : 12" (TID:0x458DBA18, sys_thread_t:0x60B297C0, state:R, native ID:0x427E) prio=5
at java.net.SocketInputStream.socketRead(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java(Compiled Code))
at com.ibm.sslite.s.a(Unknown Source)(Compiled Code)
at com.ibm.sslite.s.b(Unknown Source)(Compiled Code)
at com.ibm.sslite.s.a(Unknown Source)(Compiled Code)
at com.ibm.sslite.a.read(Unknown Source)(Compiled Code)
at com.ibm.jsse.a.read(Unknown Source)(Compiled Code)
at com.ibm.ws.io.Stream.read(Stream.java(Compiled Code))
at com.ibm.ws.io.ReadStream.readBuffer(ReadStream.java(Inlined Compiled Code))
at com.ibm.ws.io.ReadStream.read(ReadStream.java(Inlined Compiled Code))
at com.ibm.ws.http.HttpRequest.readRequestLine(HttpRequest.java(Compiled Code))
at com.ibm.ws.http.HttpRequest.readRequest(HttpRequest.java(Compiled Code))
at com.ibm.ws.http.HttpConnection.readAndHandleRequest(HttpConnection)
at com.ibm.ws.http.HttpConnection.run(HttpConnection.java(Compiled Code))
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:672)

Thread receiving a request:
at java.net.SocketInputStream.socketRead(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:85)
at com.ibm.ws.io.Stream.read(Stream.java:17)
at com.ibm.ws.io.ReadStream.readBuffer(ReadStream.java:411)
at com.ibm.ws.io.ReadStream.read(ReadStream.java:110)
at com.ibm.ws.http.HttpConnection.run(HttpConnection.java:448)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:672)
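When a javacore contains many web container threads, the three patterns above can be told apart mechanically. The sketch below is a heuristic derived only from the sample traces in this article: it treats a BoundedBuffer.take frame as idle, a socket read under HttpRequest.readRequestLine as keep-alive, and any other socket read under HttpConnection as receiving a request. The class, enum, and method names are hypothetical, and other WebSphere versions may show different frames.

```java
import java.util.Arrays;
import java.util.List;

class WebContainerThreadKind {
    enum Kind { IDLE, KEEP_ALIVE, READING_REQUEST, OTHER }

    /** Classifies one thread from its stack frames (the text after "at "). */
    static Kind classify(List<String> frames) {
        boolean waitingInPool = false;
        boolean inSocketRead = false;
        boolean readingRequestLine = false;
        boolean inHttpConnection = false;
        for (String frame : frames) {
            if (frame.contains("com.ibm.ws.util.BoundedBuffer.take")) {
                waitingInPool = true;
            }
            if (frame.contains("java.net.SocketInputStream.socketRead")) {
                inSocketRead = true;
            }
            if (frame.contains("com.ibm.ws.http.HttpRequest.readRequestLine")) {
                readingRequestLine = true;
            }
            if (frame.contains("com.ibm.ws.http.HttpConnection.")) {
                inHttpConnection = true;
            }
        }
        if (waitingInPool) {
            return Kind.IDLE;             // parked in the pool, no connection yet
        }
        if (inSocketRead && readingRequestLine) {
            return Kind.KEEP_ALIVE;       // connected, waiting for the next request line
        }
        if (inSocketRead && inHttpConnection) {
            return Kind.READING_REQUEST;  // reading the request headers or body
        }
        return Kind.OTHER;
    }

    public static void main(String[] args) {
        List<String> idle = Arrays.asList(
                "java.lang.Object.wait(Native Method)",
                "com.ibm.ws.util.BoundedBuffer.take(BoundedBuffer.java:161)",
                "com.ibm.ws.util.ThreadPool.getTask(ThreadPool.java(Compiled Code))");
        System.out.println(classify(idle));
    }
}
```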

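Common thread states in the Sun JVM:

In a thread dump, the main things to look at are each thread's state and its execution stack.
Thread states generally fall into three categories:
Runnable (R): the thread is currently able to run
Waiting on monitor (CW): the thread has voluntarily called wait()
Waiting for monitor entry (MW): the thread is waiting to acquire a lock
The first and third states are usually the ones of interest:
If the CPU is busy, look at the runnable threads
If the CPU is idle, look at the threads waiting for monitor entry
A typical deadlock arises when a server-side component (such as a servlet) requests a resource served by the same WebLogic server instance: the calling thread occupies an execute thread while waiting for another thread from the same queue to serve its request.
The fix is to run that servlet in a separate execute queue.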
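The same triage can be approximated from inside a JVM by counting threads per java.lang.Thread.State. As a rough mapping (an assumption, not an exact equivalence), "runnable" corresponds to RUNNABLE, "waiting on monitor" to WAITING or TIMED_WAITING, and "waiting for monitor entry" to BLOCKED. The class and method names below are hypothetical.

```java
import java.util.EnumMap;
import java.util.Map;

class ThreadStateSummary {
    /** Counts the live threads of the current JVM by Thread.State. */
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Thread.State, Integer> counts = countByState();
        // Busy CPU: start with the RUNNABLE threads.
        System.out.println("RUNNABLE: " + counts.getOrDefault(Thread.State.RUNNABLE, 0));
        // Idle CPU: start with the threads waiting to enter a monitor.
        System.out.println("BLOCKED:  " + counts.getOrDefault(Thread.State.BLOCKED, 0));
        System.out.println("All states: " + counts);
    }
}
```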

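Below is a typical stuck thread (note the STUCK keyword); it is reported as runnable but is sitting in a socket read, waiting for a response from the database: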

"[STUCK] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=10 tid=02fe9a18 nid=35 lwp_id=7518924 runnable [440dd000..440db878]
 at java.net.SocketInputStream.socketRead0(Native Method)
 at java.net.SocketInputStream.read(SocketInputStream.java:134)
 at weblogic.jdbc.oracle.net8.OracleDataProvider.getArrayOfBytesFromSocket(Unknown Source)
 at weblogic.jdbc.oracle.net8.OracleDataProvider.readFirstPacketInBuffer(Unknown Source)
 at weblogic.jdbc.oracle.net8.OracleDataProvider.readPacket(Unknown Source)
 at weblogic.jdbc.oracle.net8.OracleDataProvider.receive(Unknown Source)
 at weblogic.jdbc.oracle.net8.OracleNet8NSPTDAPacket.sendRequest(Unknown Source)
 at weblogic.jdbc.oracle.OracleImplStatement.fetchNext(Unknown Source)
 at weblogic.jdbc.oracle.OracleImplStatement.fetchNext2(Unknown Source)
 at weblogic.jdbc.oracle.OracleImplResultset.fetchAtPosition(Unknown Source)
 at weblogic.jdbc.base.BaseImplResultSet.next(Unknown Source)
 at weblogic.jdbc.base.BaseResultSet.next(Unknown Source)
 - locked <55f25550> (a weblogic.jdbc.oracle.OracleConnection)
 at weblogic.jdbc.wrapper.ResultSet_weblogic_jdbc_base_BaseResultSet.next(Unknown Source)
 at org.hibernate.loader.Loader.doQuery(Loader.java:685)

On UNIX/Linux, the top, vmstat, or prstat commands can be used to observe system resource usage.

Mandy Chung's blog has a post on Thread Dump and Concurrency Locks, excerpted below:
Thread dumps are very useful for diagnosing synchronization related problems such as deadlocks on object monitors. Ctrl-\ on Solaris/Linux or Ctrl-Break on Windows has been a common way to get a thread dump of a running application. On Solaris or Linux, you can send a QUIT signal to the target application. The target application in both cases prints a thread dump to the standard output and also detects if there is any deadlock involving object monitors.
jstack, a new troubleshooting utility introduced in Tiger (J2SE 5.0), provides another way to obtain a thread dump of an application. Alan Bateman has a nice blog about jstack and its several improvements in Mustang (Java SE 6). Mustang jstack works like a remote Ctrl-\ or Ctrl-Break if you are on Windows.
jconsole is a JMX-compliant GUI tool which allows you to get a thread dump on the fly. The "Using JConsole to Monitor Applications" article gives you an overview of the Tiger monitoring and management functionality.
Mustang extends the thread dump, jstack, and jconsole to support java.util.concurrent.locks to improve its diagnosability. For example, the Threads tab in the Mustang jconsole now shows which synchronizer a thread is waiting to acquire when the thread is blocked to lock a ReentrantLock and also which thread is owning that lock.

In addition, it has a new “detect deadlock” button (in the bottom). When you click on the “detect deadlock” button, it will send a request to the target application to perform the deadlock detection operation. If the target application is running on Mustang, it finds deadlocks involving both object monitors as well as the java.util.concurrent.locks. If the target application is running on Tiger, it finds deadlocks involving object monitors only. Each deadlock cycle will be displayed in a separate Deadlock tab.

JDK 6 has a nice demo FullThreadDump under $JDK_HOME/demo/management/FullThreadDump where JDK_HOME is the location of your JDK 6. This demo has been included in JDK 5.0 and is updated to use the new Mustang API. It demonstrates the use of the java.lang.management API to get the thread dump and detect deadlock programmatically.
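As an illustration of programmatic deadlock detection with java.lang.management, here is a minimal sketch. It is not the JDK's FullThreadDump demo. It uses ThreadMXBean.findDeadlockedThreads() when the JVM supports java.util.concurrent lock monitoring and falls back to findMonitorDeadlockedThreads() otherwise. The class name DeadlockCheck, its helper methods, and the deliberately deadlocked demo threads are all made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockCheck {
    private static final ThreadMXBean MX = ManagementFactory.getThreadMXBean();

    /** Returns the IDs of deadlocked threads, or an empty array if there are none. */
    static long[] findDeadlocks() {
        long[] ids = MX.isSynchronizerUsageSupported()
                ? MX.findDeadlockedThreads()          // monitors and java.util.concurrent locks
                : MX.findMonitorDeadlockedThreads();  // object monitors only
        return ids == null ? new long[0] : ids;
    }

    /** Polls until a deadlock is found or the timeout expires. */
    static long[] waitForDeadlock(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        long[] ids = findDeadlocks();
        while (ids.length == 0 && System.nanoTime() < deadline) {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            ids = findDeadlocks();
        }
        return ids;
    }

    /** Formats the given threads, including the locks they hold and wait for. */
    static String describe(long[] ids) {
        if (ids.length == 0) {
            return "no deadlock detected\n";
        }
        StringBuilder sb = new StringBuilder();
        ThreadInfo[] infos = MX.getThreadInfo(ids,
                MX.isObjectMonitorUsageSupported(), MX.isSynchronizerUsageSupported());
        for (ThreadInfo info : infos) {
            if (info != null) {
                sb.append(info);
            }
        }
        return sb.toString();
    }

    /** Demo only: starts two daemon threads that take two monitors in opposite order. */
    static void startDeadlockedPair() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startDaemon("deadlock-demo-1", a, b, bothHoldFirst);
        startDaemon("deadlock-demo-2", b, a, bothHoldFirst);
    }

    private static void startDaemon(String name, Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await();  // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // never reached: the other thread holds this monitor
                }
            }
        }, name);
        t.setDaemon(true);  // lets the JVM exit even though the threads never finish
        t.start();
    }

    public static void main(String[] args) {
        startDeadlockedPair();
        System.out.print(describe(waitForDeadlock(5_000)));
    }
}
```

In a real application only findDeadlocks() and describe() would be needed, for example called from a periodic health check; the demo threads exist solely to give the detector something to find.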

 

