Monitoring Storm with monit

Posted by 技術小胖子 on 2017-11-11

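1. Installation. First download the latest monit release from the official site; older versions may lack the features used below.

       The official site is http://mmonit.com/monit/

       If no yum repository is configured, install pam-devel with rpm instead.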

       yum -y install pam-devel

       tar -zxf monit-5.12.tar.gz -C /tmp/

       cd /tmp/monit-5.12

      ./configure --prefix=/usr/local/monit --sysconfdir=/usr/local/monit/etc --without-ssl --without-pam

       make && make install

       mkdir -p /usr/local/monit/etc

       cp monitrc /usr/local/monit/etc/

       chmod 600 /usr/local/monit/etc/monitrc

       cd /bin

       ln -s /usr/local/monit/bin/monit /bin/monit

       cp /usr/local/monit/etc/monitrc /etc/monitrc
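
The chmod 600 step matters because, as far as I know, monit refuses to load a control file that group or other users can access. Below is a minimal sketch for checking the copied file. The helper name check_monitrc_mode is hypothetical, and stat -c assumes GNU coreutils.

```shell
#!/bin/bash
# Sketch: report whether a monit control file is private enough.
# check_monitrc_mode is a hypothetical helper, not part of monit.
check_monitrc_mode() {
    local file="$1" mode
    # GNU stat: %a prints the permission bits in octal, e.g. 600
    mode=$(stat -c '%a' "$file") || return 2
    # Any group or other permission bit set means the file is too open
    if (( (8#$mode & 8#077) == 0 )); then
        echo "ok: $file has mode $mode"
    else
        echo "too open: $file has mode $mode, run chmod 600 $file"
        return 1
    fi
}
```

For example, check_monitrc_mode /etc/monitrc should print an "ok" line right after the steps above.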

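2. Usage.

If you are not familiar with monit, learn its basic commands first.

         Start monit:               monit -c /etc/monitrc

         Stop monit:                monit quit

         Show status:               monit status

         Reload the configuration:  monit reload

         Start a monitored service: monit start <service name>

Other options, from the built-in help: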

# monit -h

Usage: monit [options] {arguments}

Options are as follows:

 -c file       Use this control file
 -d n          Run as a daemon once per n seconds
 -g name       Set group name for start, stop, restart, monitor and unmonitor
 -l logfile    Print log information to this file
 -p pidfile    Use this lock file in daemon mode
 -s statefile  Set the file monit should write state information to
 -I            Do not run in background (needed for run from init)
 -t            Run syntax check for the control file
 -v            Verbose mode, work noisy (diagnostic output)
 -H [filename] Print SHA1 and MD5 hashes of the file or of stdin if the
               filename is omitted; monit will exit afterwards
 -V            Print version number and patchlevel
 -h            Print this text

Optional action arguments for non-daemon mode are as follows:

 start all      - Start all services
 start name     - Only start the named service
 stop all       - Stop all services
 stop name      - Only stop the named service
 restart all    - Stop and start all services
 restart name   - Only restart the named service
 monitor all    - Enable monitoring of all services
 monitor name   - Only enable monitoring of the named service
 unmonitor all  - Disable monitoring of all services
 unmonitor name - Only disable monitoring of the named service
 reload         - Reinitialize monit
 status         - Print full status information for each service
 summary        - Print short status information for each service
 quit           - Kill monit daemon process
 validate       - Check all services and start if not running
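
The -t option listed above can guard a reload: run the syntax check first and reload only if it passes. Here is a minimal sketch. The function name safe_reload and the MONIT override variable are hypothetical conveniences; only the -t, -c and reload arguments come from the help text.

```shell
#!/bin/bash
# Sketch: reload monit only when the control file passes the syntax check.
# MONIT can point at another binary; it defaults to "monit" on the PATH.
MONIT="${MONIT:-monit}"

safe_reload() {
    local rc="${1:-/etc/monitrc}"
    # -t runs the syntax check on the control file given with -c
    if "$MONIT" -t -c "$rc"; then
        "$MONIT" -c "$rc" reload
    else
        echo "syntax check failed for $rc; not reloading" >&2
        return 1
    fi
}
```

Calling safe_reload /etc/monitrc after editing the file avoids reloading a broken configuration.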

 

===================================================================================

 

3. Monitoring Storm

       My Storm installation lives in /usr/storm.

 1. Write a startup script for Storm's nimbus daemon:

#!/bin/bash

      /usr/storm/bin/storm nimbus &

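      Save it as startstorm.sh under STORM_HOME/bin and make it executable.

      Then run the script to confirm that nimbus starts successfully.

2. Check the nimbus process. Don't be alarmed: the nimbus command line is very long.

[root@master Desktop]# ps -elf | grep nimbus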
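
Because monit may call the start script more than once, a slightly more defensive version can help. The following is only a sketch. The STORM_HOME default comes from this article's setup, while the NIMBUS_PATTERN variable and the start_nimbus function are hypothetical names; the pattern is the main class visible at the end of the nimbus command line.

```shell
#!/bin/bash
# Sketch of a more defensive startstorm.sh; adjust the defaults to your install.
STORM_HOME="${STORM_HOME:-/usr/storm}"
NIMBUS_PATTERN="${NIMBUS_PATTERN:-backtype.storm.daemon.nimbus}"

start_nimbus() {
    # pgrep -f matches the pattern against the full command line
    if pgrep -f "$NIMBUS_PATTERN" >/dev/null; then
        echo "nimbus already running"
        return 0
    fi
    # nohup plus redirection keeps the daemon alive after this shell exits
    nohup "$STORM_HOME/bin/storm" nimbus >/dev/null 2>&1 &
    echo "nimbus started"
}
```

In a real script the last line would simply call start_nimbus.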

0 S root     12355     1  5  80   0 - 764995 futex_ 01:47 ?       00:00:04 /usr/java/jdk1.8.0_31/bin/java -server -Dstorm.options= -Dstorm.home=/usr/storm -Dstorm.log.dir=/usr/storm/logs -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= -cp /usr/storm/lib/ring-devel-0.3.11.jar:/usr/storm/lib/carbonite-1.4.0.jar:/usr/storm/lib/tools.macro-0.1.0.jar:/usr/storm/lib/metrics-core-2.2.0.jar:/usr/storm/lib/tools.logging-0.2.3.jar:/usr/storm/lib/compojure-1.1.3.jar:/usr/storm/lib/chill-java-0.3.5.jar:/usr/storm/lib/asm-4.0.jar:/usr/storm/lib/ring-jetty-adapter-0.3.11.jar:/usr/storm/lib/kryo-2.21.jar:/usr/storm/lib/tools.cli-0.2.4.jar:/usr/storm/lib/slf4j-api-1.7.2.jar:/usr/storm/lib/clojure-1.5.1.jar:/usr/storm/lib/servlet-api-2.5.jar:/usr/storm/lib/zkclient-0.3.jar:/usr/storm/lib/joda-time-2.0.jar:/usr/storm/lib/minlog-1.2.jar:/usr/storm/lib/json-simple-1.1.jar:/usr/storm/lib/kafka_2.10-0.8.1.1-scaladoc.jar:/usr/storm/lib/storm-core-0.9.3.jar:/usr/storm/lib/jetty-6.1.26.jar:/usr/storm/lib/commons-io-2.4.jar:/usr/storm/lib/logback-core-1.0.13.jar:/usr/storm/lib/kafka_2.10-0.8.1.1.jar:/usr/storm/lib/objenesis-1.2.jar:/usr/storm/lib/commons-codec-1.6.jar:/usr/storm/lib/zookeeper-3.3.4.jar:/usr/storm/lib/jopt-simple-3.2.jar:/usr/storm/lib/math.numeric-tower-0.0.1.jar:/usr/storm/lib/jetty-util-6.1.26.jar:/usr/storm/lib/snakeyaml-1.11.jar:/usr/storm/lib/jline-2.11.jar:/usr/storm/lib/clj-stacktrace-0.2.2.jar:/usr/storm/lib/commons-fileupload-1.2.1.jar:/usr/storm/lib/log4j-1.2.15.jar:/usr/storm/lib/ring-servlet-0.3.11.jar:/usr/storm/lib/commons-exec-1.1.jar:/usr/storm/lib/kafka_2.10-0.8.1.1-javadoc.jar:/usr/storm/lib/kafka_2.10-0.8.1.1-sources.jar:/usr/storm/lib/clout-1.0.1.jar:/usr/storm/lib/commons-lang-2.5.jar:/usr/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/storm/lib/slf4j-api-1.7.5.jar:/usr/storm/lib/ring-core-1.1.5.jar:/usr/storm/lib/snappy-java-1.0.5.jar:/usr/storm/lib/reflectasm-1.07-shaded.jar:/usr/storm/lib/hiccup-0.3.6.jar:/usr/storm/lib/disruptor-2.10.1.jar:/usr/storm/lib/core.incubator-0.1.0.jar:/usr/storm/lib/scala-library-2.10.1.jar:/usr/storm/lib/jgrapht-core-0.9.0.jar:/usr/storm/lib/logback-classic-1.0.13.jar:/usr/storm/lib/commons-logging-1.1.3.jar:/usr/storm/lib/clj-time-0.4.1.jar:/usr/storm/conf -Xmx1024m -Dlogfile.name=nimbus.log -Dlogback.configurationFile=/usr/storm/logback/cluster.xml backtype.storm.daemon.nimbus
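
monit's matching clause takes a regular expression that is compared with the process command line, so a short, distinctive fragment such as the main class at the end of the line above may be enough. Before relying on a shorter pattern, it is worth checking that it selects exactly one process. The sketch below does that with pgrep -f; count_matches is a hypothetical helper name, and pgrep's matching may not be identical to monit's in every detail.

```shell
#!/bin/bash
# Sketch: count running processes whose full command line matches a regex.
count_matches() {
    # pgrep -f prints one PID per matching process; wc -l counts them
    pgrep -f "$1" | wc -l | tr -d ' '
}
```

With nimbus running, count_matches backtype.storm.daemon.nimbus would ideally print 1; any other number suggests the pattern is too loose or too strict.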

 

 

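3. Append the following to the end of /etc/monitrc. Here stormnb is the name of the monitored service, and the string after matching is the process command line shown by the ps command above. It differs between versions, so do not copy it verbatim. If monit is already running, start monitoring this service with monit start stormnb.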

 

check process stormnb matching "/usr/java/jdk1.8.0_31/bin/java -server -Dstorm.options= -Dstorm.home=/usr/storm -Dstorm.log.dir=/usr/storm/logs -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= -cp /usr/storm/lib/ring-devel-0.3.11.jar:/usr/storm/lib/carbonite-1.4.0.jar:/usr/storm/lib/tools.macro-0.1.0.jar:/usr/storm/lib/metrics-core-2.2.0.jar:/usr/storm/lib/tools.logging-0.2.3.jar:/usr/storm/lib/compojure-1.1.3.jar:/usr/storm/lib/chill-java-0.3.5.jar:/usr/storm/lib/asm-4.0.jar:/usr/storm/lib/ring-jetty-adapter-0.3.11.jar:/usr/storm/lib/kryo-2.21.jar:/usr/storm/lib/tools.cli-0.2.4.jar:/usr/storm/lib/slf4j-api-1.7.2.jar:/usr/storm/lib/clojure-1.5.1.jar:/usr/storm/lib/servlet-api-2.5.jar:/usr/storm/lib/zkclient-0.3.jar:/usr/storm/lib/joda-time-2.0.jar:/usr/storm/lib/minlog-1.2.jar:/usr/storm/lib/json-simple-1.1.jar:/usr/storm/lib/kafka_2.10-0.8.1.1-scaladoc.jar:/usr/storm/lib/storm-core-0.9.3.jar:/usr/storm/lib/jetty-6.1.26.jar:/usr/storm/lib/commons-io-2.4.jar:/usr/storm/lib/logback-core-1.0.13.jar:/usr/storm/lib/kafka_2.10-0.8.1.1.jar:/usr/storm/lib/objenesis-1.2.jar:/usr/storm/lib/commons-codec-1.6.jar:/usr/storm/lib/zookeeper-3.3.4.jar:/usr/storm/lib/jopt-simple-3.2.jar:/usr/storm/lib/math.numeric-tower-0.0.1.jar:/usr/storm/lib/jetty-util-6.1.26.jar:/usr/storm/lib/snakeyaml-1.11.jar:/usr/storm/lib/jline-2.11.jar:/usr/storm/lib/clj-stacktrace-0.2.2.jar:/usr/storm/lib/commons-fileupload-1.2.1.jar:/usr/storm/lib/log4j-1.2.15.jar:/usr/storm/lib/ring-servlet-0.3.11.jar:/usr/storm/lib/commons-exec-1.1.jar:/usr/storm/lib/kafka_2.10-0.8.1.1-javadoc.jar:/usr/storm/lib/kafka_2.10-0.8.1.1-sources.jar:/usr/storm/lib/clout-1.0.1.jar:/usr/storm/lib/commons-lang-2.5.jar:/usr/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/storm/lib/slf4j-api-1.7.5.jar:/usr/storm/lib/ring-core-1.1.5.jar:/usr/storm/lib/snappy-java-1.0.5.jar:/usr/storm/lib/reflectasm-1.07-shaded.jar:/usr/storm/lib/hiccup-0.3.6.jar:/usr/storm/lib/disruptor-2.10.1.jar:/usr/storm/lib/core.incubator-0.1.0.jar:/usr/storm/lib/scala-library-2.10.1.jar:/usr/storm/lib/jgrapht-core-0.9.0.jar:/usr/storm/lib/logback-classic-1.0.13.jar:/usr/storm/lib/commons-logging-1.1.3.jar:/usr/storm/lib/clj-time-0.4.1.jar:/usr/storm/conf -Xmx1024m -Dlogfile.name=nimbus.log -Dlogback.configurationFile=/usr/storm/logback/cluster.xml backtype.storm.daemon.nimbus"

    start program = "/bin/sh /usr/storm/bin/startstorm.sh" with timeout 30 seconds

    stop program = "/usr/bin/kill -9 `ps -elf | grep nimbus | grep -v grep | awk '{print $4}'`"

 

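      Note: set storm_home to your own path, and make sure the string inside the quotes after matching is identical to the process command line printed by ps -elf | grep nimbus on your machine. The last two lines tell monit how to start and how to stop nimbus.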
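
One caveat about the stop program line: as far as I know, monit executes start and stop programs directly rather than through a shell, so the backtick pipeline there may not be expanded. A hedged workaround is to move the kill into a small script, in the same spirit as the start script. The sketch below assumes a hypothetical file name stopstorm.sh and a hypothetical function stop_by_pattern. The pattern is kept inside the script, not passed on its command line, so that pgrep -f does not match the script itself.

```shell
#!/bin/bash
# Sketch of a hypothetical stopstorm.sh for use as monit's stop program.
stop_by_pattern() {
    local pattern="$1" pids
    # pgrep -f matches the full command line and never reports itself
    pids=$(pgrep -f "$pattern")
    if [ -z "$pids" ]; then
        echo "no process matches $pattern"
        return 0
    fi
    # Unquoted on purpose: one PID per word
    kill -9 $pids
    echo "killed: $(echo $pids)"
}
```

The script would end with stop_by_pattern backtype.storm.daemon.nimbus, and the monitrc entry would then read stop program = "/bin/bash /usr/storm/bin/stopstorm.sh".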

 

 [root@master storm]# monit reload 

[root@master storm]# monit status

The Monit daemon 5.12 uptime: 1m 

 

Process 'stormnb'

  status                            Running

  monitoring status                 Monitored

  pid                               4224

  parent pid                        1

  uid                               0

  effective uid                     0

  gid                               0

  uptime                            1m 

  children                          0

  memory                            109.1 MB

  memory total                      109.1 MB

  memory percent                    5.8%

  memory percent total              5.8%

  cpu percent                       0.4%

  cpu percent total                 0.4%

  data collected                    Tue, 17 Mar 2015 01:19:57

 

System 'master'

  status                            Running

  monitoring status                 Monitored

  load average                      [0.53] [0.20] [0.23]

  cpu                               62.1%us 4.2%sy 0.5%wa

  memory usage                      1.5 GB [79.7%]

  swap usage                        460.5 MB [23.2%]

  data collected                    Tue, 17 Mar 2015 01:19:57

[root@master storm]# jps

4224 nimbus

3233 Kafka

4338 Jps

3061 QuorumPeerMain

3273 Kafka

[root@master storm]# kill -9 4224

Kill nimbus and check again: right after the kill, it is already being restarted.

[root@master storm]# jps

3233 Kafka

3061 QuorumPeerMain

4360 Jps

3273 Kafka

4350 config_value

[root@master storm]# jps

3233 Kafka

3061 QuorumPeerMain

3273 Kafka

4349 nimbus

4414 Jps

[root@master storm]# monit status

The Monit daemon 5.12 uptime: 3m 

 

Process 'stormnb'

  status                            Running

  monitoring status                 Monitored

  pid                               4349

  parent pid                        1

  uid                               0

  effective uid                     0

  gid                               0

  uptime                            0m 

  children                          0

  memory                            104.8 MB

  memory total                      104.8 MB

  memory percent                    5.6%

  memory percent total              5.6%

  cpu percent                       0.4%

  cpu percent total                 0.4%

  data collected                    Tue, 17 Mar 2015 01:21:45

 

System 'master'

  status                            Running

  monitoring status                 Monitored

  load average                      [0.59] [0.28] [0.25]

  cpu                               2.6%us 0.3%sy 0.0%wa

  memory usage                      1.4 GB [74.6%]

  swap usage                        460.5 MB [23.2%]

  data collected                    Tue, 17 Mar 2015 01:21:45
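
The manual kill-and-check test above can also be scripted. The following sketch polls until a process matching a pattern shows up again or a timeout expires. The function name wait_for_process and the 60-second default are hypothetical; how long the restart really takes depends on monit's polling interval.

```shell
#!/bin/bash
# Sketch: wait until a process matching the pattern is running again.
wait_for_process() {
    local pattern="$1" timeout="${2:-60}" waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # pgrep -f matches against the full command line
        if pgrep -f "$pattern" >/dev/null; then
            echo "up after ${waited}s"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "still down after ${timeout}s" >&2
    return 1
}
```

After kill -9 on the nimbus PID, running wait_for_process backtype.storm.daemon.nimbus 120 reports whether monit brought the daemon back within two minutes.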

 

 

Monitoring supervisor and core (the Storm UI process) works the same way; just adapt the configuration above.
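
To make "adapt the configuration" concrete, here is a sketch that prints a monitrc entry for any Storm daemon. The function name monit_stanza, the service names and the script paths are hypothetical. The class names backtype.storm.daemon.supervisor and backtype.storm.ui.core are what I would expect on Storm 0.9.x, so confirm them with ps on your own machine.

```shell
#!/bin/bash
# Sketch: print a monit "check process" entry for one Storm daemon.
# Arguments: service name, matching pattern, start script, stop script.
monit_stanza() {
    local name="$1" pattern="$2" start_script="$3" stop_script="$4"
    cat <<EOF
check process $name matching "$pattern"
    start program = "/bin/sh $start_script" with timeout 30 seconds
    stop program = "/bin/sh $stop_script"
EOF
}
```

For example, monit_stanza stormsv backtype.storm.daemon.supervisor /usr/storm/bin/startsupervisor.sh /usr/storm/bin/stopsupervisor.sh >> /etc/monitrc would append an entry for the supervisor; a second call with backtype.storm.ui.core would cover the UI.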

 

 

 

      This article is reposted from yzy121403725's 51CTO blog. Original link: http://blog.51cto.com/lookingdream/1850595. To repost it, please contact the original author.

