[20200430] Monitoring server room temperature.txt

Posted by lfree on 2020-04-30

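--//An earlier requirement was to monitor the temperature inside the server room. In practice, reading the hard drive temperature gives a rough, indirect picture of the room temperature.
--//Someone else now has the same requirement, so I tried it out in a new test environment.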

# fdisk -l

Disk /dev/cciss/c0d0: 1800.2 GB, 1800280694784 bytes
255 heads, 63 sectors/track, 218871 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

           Device Boot      Start         End      Blocks   Id  System
/dev/cciss/c0d0p1   *           1        1305    10482381   83  Linux
/dev/cciss/c0d0p2            1306        7832    52428127+  83  Linux
/dev/cciss/c0d0p3            7833       11748    31455270   82  Linux swap / Solaris
/dev/cciss/c0d0p4           11749      218871  1663715497+   5  Extended
/dev/cciss/c0d0p5           11749       13799    16474626   8e  Linux LVM
/dev/cciss/c0d0p6           13800      218871  1647240808+  83  Linux

# smartctl -a /dev/cciss/c0d0 -d cciss,0
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-2.6.39-300.26.1.el5uek] (local build)
Copyright (C) 2002-11 by Bruce Allen,

Vendor:               HP
Product:              EG0600FCVBK
Revision:             HPD5
User Capacity:        600,127,266,816 bytes [600 GB]
Logical block size:   512 bytes
Logical Unit id:      0x5000c500763520db
Serial number:        S0M2Q6EW0000B4449XDF
Device type:          disk
Transport protocol:   SAS
Local Time is:        Thu Apr 30 09:21:37 2020 CST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK

Current Drive Temperature:     32 C
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Drive Trip Temperature:        60 C
Manufactured in week 19 of year 2014
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  133
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  2198
Elements in grown defect list: 0
Vendor (Seagate) cache information
  Blocks sent to initiator = 37741738
  Blocks received from initiator = 1549759033
  Blocks read from cache and sent to initiator = 2597055257
  Number of read and write commands whose size <= segment size = 3182526893
  Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 49708.35
  number of minutes until next internal SMART test = 7

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0    3196278.151           0
write:         0        0         0         0          0      29020.030           0

Non-medium error count:       36

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -       9                 - [-   -    -]
# 2  Background short  Completed                   -       2                 - [-   -    -]
# 3  Background short  Completed                   -       1                 - [-   -    -]

Long (extended) Self Test duration: 3870 seconds [64.5 minutes]

--//A simple command like the following is enough to extract the drive temperature for logging.
# smartctl -a /dev/cciss/c0d0 -d cciss,0 | grep "Current Drive Temperature" | cut -f2 -d:
     32 C
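
--//The pipeline above leaves leading spaces and the " C" suffix in the result. A minimal sketch of an alternative that prints only the number; parse_drive_temp is a hypothetical helper name, and the sample line is copied from the smartctl output above rather than read from a real drive.

```shell
# Read smartctl output on stdin and print only the numeric Celsius value
# from the "Current Drive Temperature" line.
parse_drive_temp() {
    awk -F: '/^Current Drive Temperature/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# Sample line taken from the output above, used here instead of a real drive.
sample='Current Drive Temperature:     32 C'
printf '%s\n' "$sample" | parse_drive_temp
```

--//On the real machine this would be used as: smartctl -a /dev/cciss/c0d0 -d cciss,0 | parse_drive_temp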

--//On some machine models you can also read the CPU temperature:
$ cat /sys/class/thermal/thermal_zone0/temp
8300

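--//This value is not in degrees Celsius. I treated it as degrees Fahrenheit × 100 and converted it with: ℃ = (F - 32) × 5/9
--//Note: the Linux kernel's sysfs thermal documentation defines this file in millidegrees Celsius, which would make 8300 equal to 8.3 ℃, so verify the interpretation on your own hardware.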

$ echo "scale=2;($(cat  /sys/class/thermal/thermal_zone0/temp)/100 - 32 )*5/9" | bc -l
28.33
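
--//The conversion above can be wrapped in a small function. This is only a sketch: both function names are hypothetical, and awk is swapped in for bc so that the snippet has no extra dependency. Because the kernel documentation describes this sysfs file as millidegrees Celsius, a second helper shows that reading for comparison.

```shell
# Interpretation used in this post: raw value = degrees Fahrenheit * 100.
f100_to_celsius() {
    awk -v raw="$1" 'BEGIN { printf "%.2f\n", (raw / 100 - 32) * 5 / 9 }'
}

# Interpretation from the kernel sysfs documentation: raw value = millidegrees Celsius.
milli_to_celsius() {
    awk -v raw="$1" 'BEGIN { printf "%.1f\n", raw / 1000 }'
}

f100_to_celsius 8300    # prints 28.33
milli_to_celsius 8300   # prints 8.3
```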

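--//Previously a colleague asked for the readings to be inserted into a database. The other side polled the table on a schedule and sent an SMS alert when the temperature was too high. As I recall, the SMS threshold for the drive temperature was 35 or 36 degrees.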
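
--//The threshold check itself is simple. A minimal sketch, assuming an integer drive temperature and a threshold of 35 (the post only recalls "35 or 36"); is_too_hot is a hypothetical name, and the actual SMS sending is left out because the post does not describe it.

```shell
# Assumed threshold in degrees Celsius.
threshold=35

# Succeed (exit status 0) when the integer temperature exceeds the threshold.
is_too_hot() {
    [ "$1" -gt "$threshold" ]
}

# Try one reading below and one above the threshold.
for t in 32 36; do
    if is_too_hot "$t"; then
        echo "ALERT: drive temperature ${t} C exceeds ${threshold} C"
    else
        echo "OK: drive temperature ${t} C"
    fi
done
```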

SCOTT@book> create table Temperature (t date,h_temp  number(5,2),c_temp number(5,2));
Table created.

SCOTT@book> create unique index i_Temperature_t on Temperature(t);
Index created.

#!/bin/bash
# Drive temperature in Celsius, digits only (e.g. 32).
hard_temp_value=$(/usr/sbin/smartctl -a /dev/cciss/c0d0 -d cciss,0 | grep '^Current Drive Temperature' | cut -f2 -d: | cut -f1 -d"C" | sed 's/ //g')
# CPU temperature, converted with the Fahrenheit formula shown earlier.
cpu_temp_value=$(echo "scale=2;($(cat /sys/class/thermal/thermal_zone0/temp)/100 - 32)*5/9" | bc -l)

# Debug: echo "$hard_temp_value" "$cpu_temp_value"
# The heredoc is unquoted, so the shell expands both variables before sqlplus reads it.
su - oracle -c 'sqlplus -S scott/book' <<EOF
set feedback off
set termout off
insert into Temperature values(sysdate,$hard_temp_value,$cpu_temp_value);
commit;
set feedback on
quit
EOF
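
--//One gap in the script above: if smartctl or the sysfs read fails, a variable ends up empty and the insert statement becomes malformed. A minimal sketch of a guard that could run before the sqlplus call; is_number is a hypothetical helper, and the check is deliberately loose (digits and dots only, no negative values).

```shell
# Succeed only for non-empty strings made up of digits and dots.
is_number() {
    case "$1" in
        ''|*[!0-9.]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Values the script might produce, including two failure cases.
for v in 32 28.33 "" "32 C"; do
    if is_number "$v"; then
        echo "valid: '$v'"
    else
        echo "skip:  '$v'"
    fi
done
```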

--//Then schedule the script to run once every 5 minutes so that the readings are recorded in the table.
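
--//The post does not show how the job was scheduled. One common way is a cron entry. The sketch below only builds and prints such a line; the script path is an assumption, since the post never names the file.

```shell
# Hypothetical path; adjust to wherever the script above is saved.
script_path=/usr/local/bin/record_temp.sh

# Run every 5 minutes and discard the output.
cron_line="*/5 * * * * ${script_path} >/dev/null 2>&1"
echo "$cron_line"

# To install it for the current user (left commented out on purpose):
# ( crontab -l 2>/dev/null; echo "$cron_line" ) | crontab -
```

--//The script calls su, so the entry would normally go into root's crontab.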

SCOTT@book> select * from Temperature ;
T                       H_TEMP     C_TEMP
------------------- ---------- ----------
2020-04-30 09:40:15         32      28.33
2020-04-30 09:41:32         32      28.33

--//Of course, this is no longer needed now ^_^.

Source: "ITPUB blog", link: http://blog.itpub.net/267265/viewspace-2689379/. If you repost this, please credit the source; otherwise legal liability will be pursued.
