Debugging with Arthas in Kubernetes

Posted by hongdada on 2020-08-06

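Environment

The container running in k8s ships only a bare JRE. Posts online say the problem is a missing tools.jar, but adding it did not help, so in the end I copied a whole JDK into the container.

Running Arthas with the JRE: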

/home # /opt/jre/bin/java -jar /home/arthas-bin/arthas-boot.jar 1
[INFO] arthas-boot version: 3.3.7
[INFO] arthas home: /home/arthas-bin
[INFO] Try to attach process 1
Exception in thread "main" java.lang.IllegalArgumentException: Can not find tools.jar under java home: /opt/jre1.8.0_231, please try to start arthas-boot with full path java. Such as /opt/jdk/bin/java -jar arthas-boot.jar
        at com.taobao.arthas.boot.ProcessUtils.findJavaHome(ProcessUtils.java:222)
        at com.taobao.arthas.boot.ProcessUtils.startArthasCore(ProcessUtils.java:233)
        at com.taobao.arthas.boot.Bootstrap.main(Bootstrap.java:515)

After copying tools.jar to /opt/jre/lib/:

/opt/jre1.8.0_231/lib # /opt/jre/bin/java -jar /home/arthas-bin/arthas-boot.jar 1
[INFO] arthas-boot version: 3.3.7
[INFO] arthas home: /home/arthas-bin
[INFO] Try to attach process 1
java.lang.UnsatisfiedLinkError: no attach in java.library.path
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860)
        at java.lang.Runtime.loadLibrary0(Runtime.java:870)
        at java.lang.System.loadLibrary(System.java:1122)
        at sun.tools.attach.LinuxVirtualMachine.<clinit>(LinuxVirtualMachine.java:342)
        at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:78)
        at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:250)
        at com.taobao.arthas.core.Arthas.attachAgent(Arthas.java:86)
        at com.taobao.arthas.core.Arthas.<init>(Arthas.java:28)
        at com.taobao.arthas.core.Arthas.main(Arthas.java:124)
[ERROR] Start arthas failed, exception stack trace:
[ERROR] attach fail, targetPid: 1

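tools.jar alone is not enough: as the "no attach in java.library.path" error shows, attaching also needs the native attach library, which this JRE does not include. So in the end I copied the whole JDK into the container.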
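
The two failures above suggest a quick pre-check before copying files around. The following is only a sketch, assuming a JDK 8 style layout on Linux where attaching needs both tools.jar and libattach.so. The function name can_attach is hypothetical and not part of Arthas.

```shell
#!/bin/sh
# Sketch: report whether a Java home looks able to attach to another JVM.
# Assumption: JDK 8 on Linux, where attaching needs tools.jar plus the
# native library libattach.so somewhere under the Java home.
can_attach() {
    java_home="$1"
    [ -d "$java_home" ] || { echo "no such directory: $java_home"; return 2; }

    # Search instead of hard-coding paths, since layouts differ between builds.
    # The trailing slash makes find descend even when the home is a symlink.
    tools_jar=$(find "$java_home/" -name tools.jar 2>/dev/null | head -n 1)
    attach_lib=$(find "$java_home/" -name libattach.so 2>/dev/null | head -n 1)

    [ -n "$tools_jar" ] || { echo "missing tools.jar"; return 1; }
    [ -n "$attach_lib" ] || { echo "missing libattach.so"; return 1; }
    echo "ok"
}
```

Running can_attach /opt/jre inside the container would show which piece is missing before deciding whether a full JDK has to be copied in.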

// copy into the pod in k8s
kubectl cp arthas-bin/ test-huishi-server-7b5dd79689-tfsvz:/home -n irm-server
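
The command above only copies Arthas; the JDK has to get into the pod the same way. A minimal sketch of doing both in one step follows. It assumes the JDK directory is named jdk1.8.0_91 (the path used when starting Arthas later in this post) and sits next to arthas-bin locally. The function name copy_debug_tools and the KUBECTL override are hypothetical.

```shell
#!/bin/sh
# Sketch: copy Arthas and a full JDK into a running pod.
# KUBECTL can be overridden, for example KUBECTL=echo for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

copy_debug_tools() {
    pod="$1"
    namespace="$2"
    dest="${3:-/home}"
    # kubectl cp relies on tar being available inside the container.
    # Assumption: both directories exist in the current local directory.
    for dir in arthas-bin jdk1.8.0_91; do
        "$KUBECTL" cp "$dir/" "$pod:$dest" -n "$namespace" || return 1
    done
}
```

With KUBECTL=echo set, calling copy_debug_tools with a pod name and namespace prints the two kubectl cp invocations instead of running them, which is a cheap way to check the arguments first.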

Enter the container:

kubectl exec -ti test-huishi-server-test1-58fb7775fd-2n7bp -n irm-server -- /bin/sh

Start Arthas

/ # /home/jdk1.8.0_91/bin/java -jar /home/arthas-bin/arthas-boot.jar
[INFO] arthas-boot version: 3.3.7
[INFO] Can not find java process. Try to pass <pid> in command line.
Please select an available pid.

It does not start, so check the help:

/ # /home/jdk1.8.0_91/bin/java -jar /home/arthas-bin/arthas-boot.jar -help
[INFO] arthas-boot version: 3.3.7
Usage: arthas-boot [-h] [--target-ip <value>] [--telnet-port <value>]
       [--http-port <value>] [--session-timeout <value>] [--arthas-home <value>]
       [--use-version <value>] [--repo-mirror <value>] [--versions] [--use-http]
       [--attach-only] [-c <value>] [-f <value>] [--height <value>] [--width
       <value>] [-v] [--tunnel-server <value>] [--agent-id <value>] [--stat-url
       <value>] [--select <value>] [pid]

Bootstrap Arthas

EXAMPLES:
  java -jar arthas-boot.jar <pid>
  java -jar arthas-boot.jar --target-ip 0.0.0.0
  java -jar arthas-boot.jar --telnet-port 9999 --http-port -1
  java -jar arthas-boot.jar --tunnel-server 'ws://192.168.10.11:7777/ws'
  java -jar arthas-boot.jar --tunnel-server 'ws://192.168.10.11:7777/ws'
--agent-id bvDOe8XbTM2pQWjF4cfw
  java -jar arthas-boot.jar --stat-url 'http://192.168.10.11:8080/api/stat'
  java -jar arthas-boot.jar -c 'sysprop; thread' <pid>
  java -jar arthas-boot.jar -f batch.as <pid>
  java -jar arthas-boot.jar --use-version 3.3.7
  java -jar arthas-boot.jar --versions
  java -jar arthas-boot.jar --select arthas-demo
  java -jar arthas-boot.jar --session-timeout 3600
  java -jar arthas-boot.jar --attach-only
  java -jar arthas-boot.jar --repo-mirror aliyun --use-http
WIKI:
  https://alibaba.github.io/arthas

Start with the pid

/ # ps -ef |grep huishi
    1 root      8:25 java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XX:MaxRAMFraction=1 -Xms256M -Duser.timezone=GMT+08 -Dfile.encoding=UTF-8 -javaagent:/data/jacocoagent.jar=includes=*,output=file,append=true,destfile=/data/log/huishi-server/jacoco.exec -jar /home/huishi-server.jar
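
Picking the pid out of that listing can be scripted. The sketch below assumes the busybox-style ps layout shown above, with the PID in the first column and the command in the fourth. The function name find_java_pid is hypothetical.

```shell
#!/bin/sh
# Sketch: read `ps -ef` output on stdin and print the pid of the first
# java process whose command line matches the given pattern.
# Assumption: busybox ps columns (PID USER TIME COMMAND); the full procps
# `ps -ef` puts the PID in the second column instead.
find_java_pid() {
    awk -v pat="$1" '
        # $4 is the executable; accept "java" or any path ending in /java.
        $4 ~ /(^|\/)java$/ && $0 ~ pat { print $1; exit }
    '
}
```

Inside the container this would be used as ps -ef | find_java_pid huishi, and the result passed to arthas-boot.jar as the pid argument.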


/ # /home/jdk1.8.0_91/bin/java -jar /home/arthas-bin/arthas-boot.jar 1
[INFO] arthas-boot version: 3.3.7
[INFO] arthas home: /home/arthas-bin
[INFO] Try to attach process 1
[INFO] Attach process 1 success.
[INFO] arthas-client connect 127.0.0.1 3658
  ,---.  ,------. ,--------.,--.  ,--.  ,---.   ,---.
 /  O  \ |  .--. ''--.  .--'|  '--'  | /  O  \ '   .-'
|  .-.  ||  '--'.'   |  |   |  .--.  ||  .-.  |`.  `-.
|  | |  ||  |\  \    |  |   |  |  |  ||  | |  |.-'    |
`--' `--'`--' '--'   `--'   `--'  `--'`--' `--'`-----'


wiki      https://alibaba.github.io/arthas
tutorials https://alibaba.github.io/arthas/arthas-tutorials
version   3.3.7
pid       1
time      2020-08-06 15:48:34

Debugging

Dashboard

[arthas@1]$ dashboard
ID               NAME                                                 GROUP                              PRIORITY         STATE            %CPU              TIME             INTERRUPTED       DAEMON
207              Timer-for-arthas-dashboard-d85134f6-3f4c-4e5b-ae6f-6 system                             5                RUNNABLE         67                0:0              false             true
76               DubboResponseTimeoutScanTimer                        main                               5                TIMED_WAITING    21                0:14             false             true
40               Hashed wheel timer #1                                main                               5                TIMED_WAITING    11                0:5              false             false
24               Abandoned connection cleanup thread                  main                               5                TIMED_WAITING    0                 0:0              false             true
161              Attach Listener                                      system                             9                RUNNABLE         0                 0:0              false             true
22               ContainerBackgroundProcessor[StandardEngine[Tomcat]] main                               5                TIMED_WAITING    0                 0:0              false             true
50               Curator-ConnectionStateManager-0                     main                               5                WAITING          0                 0:0              false             true
53               Curator-Framework-0                                  main                               5                WAITING          0                 0:0              false             true
75               DestroyJavaVM                                        main                               5                RUNNABLE         0                 3:16             false             false
25               Druid-ConnectionPool-Create-1936208710               main                               5                WAITING          0                 0:0              false             true
26               Druid-ConnectionPool-Destroy-1936208710              main                               5                TIMED_WAITING    0                 0:0              false             true
43               DubboClientReconnectTimer-thread-1                   main                               5                TIMED_WAITING    0                 0:0              false             true
46               DubboClientReconnectTimer-thread-2                   main                               5                WAITING          0                 0:0              false             true
33               DubboRegistryFailedRetryTimer-thread-1               main                               5                TIMED_WAITING    0                 0:0              false             true
37               DubboSaveRegistryCache-thread-1                      main                               5                WAITING          0                 0:0              false             true
3                Finalizer                                            system                             8                WAITING          0                 0:0              false             true
89               Java2D Disposer                                      system                             10               WAITING          0                 0:0              false             true
41               New I/O boss #3                                      main                               5                RUNNABLE         0                 0:1              false             true
38               New I/O worker #1                                    main                               5                RUNNABLE         0                 0:1              false             true
39               New I/O worker #2                                    main                               5                RUNNABLE         0                 0:1              false             true
59               NioBlockingSelector.BlockPoller-1                    main                               5                RUNNABLE         0                 0:0              false             true
Memory                                       used           total          max            usage          GC
heap                                         389M           510M           1979M          19.68%         gc.copy.count                                       838
eden_space                                   104M           140M           546M           19.10%         gc.copy.time(ms)                                    14061
survivor_space                               3M             17M            68M            4.48%          gc.marksweepcompact.count                           6
tenured_gen                                  282M           351M           1365M          20.67%         gc.marksweepcompact.time(ms)                        2640
nonheap                                      267M           278M           -1             96.01%
code_cache                                   84M            84M            240M           35.03%
metaspace                                    166M           175M           -1             94.89%
compressed_class_space                       16M            18M            1024M          1.64%
direct                                       211K           211K           -              100.00%
mapped                                       0K             0K             -              NaN%
Runtime
os.name                                                                                                  Linux
os.version                                                                                               3.10.0-693.el7.x86_64
java.version                                                                                             1.8.0_231
java.home                                                                                                /opt/jre1.8.0_231
systemload.average                                                                                       0.40
processors                                                                                               1
uptime                                                                                                   8607s
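
The same snapshot can be taken without an interactive session by combining kubectl exec with the -c option listed in the help output earlier. This is a sketch only. It assumes the JDK and Arthas paths used in this post, and that dashboard accepts -n to limit the number of refreshes. The function name arthas_batch and the overridable variables are hypothetical.

```shell
#!/bin/sh
# Sketch: run Arthas commands in batch mode inside a pod and exit.
# All three variables are overridable; the defaults are the paths from this post.
KUBECTL="${KUBECTL:-kubectl}"
JAVA_BIN="${JAVA_BIN:-/home/jdk1.8.0_91/bin/java}"
ARTHAS_BOOT="${ARTHAS_BOOT:-/home/arthas-bin/arthas-boot.jar}"

arthas_batch() {
    pod="$1"
    namespace="$2"
    pid="$3"
    commands="$4"
    # -c passes a semicolon-separated command list to arthas-boot.
    "$KUBECTL" exec "$pod" -n "$namespace" -- \
        "$JAVA_BIN" -jar "$ARTHAS_BOOT" -c "$commands" "$pid"
}
```

For example, arthas_batch with the pod name, the namespace irm-server, pid 1 and the command string 'dashboard -n 1' would print a single dashboard frame; the same wrapper works for 'sysprop; thread' or 'jvm'.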

View JVM information

Reference:

Using Arthas to troubleshoot an application in Kubernetes that keeps crashing and restarting
