MongoDB觸發oom-killer的簡單處理(一)
今天正在處理問題,突然收到一封郵件,提示mongo的一個副本集出現了問題。郵件內容如下:
當前狀態:PROBLEM
事件編號:11187790
發生時間:2015.12.21 13:30:01
持續時長:1h 20m
監控節點:mongo_mobile_gl_comment_sec.161
報警級別:高危(High)
事件資訊:One or more replication members need attention
事件詳情:Replication: Count of ReplicaSet Members Needing Attention: 1
DEBUG資訊:
Replication: Count of ReplicaSet Members Needing Attention (mongo_mobile_gl_comment_sec.161:repl_member_attention_state_count): 1
對於這類問題,最開始以為是網路延時,結果郵件還是持續報警,這個時候感覺應該是哪裡出了問題。連線到環境中,發現mongo程式已經消失了。
嘗試連線,提示連線失敗。
[root@mongo_mobile_gl_comment_sec ~]# mongo --port=27010
MongoDB shell version: 2.6.3
connecting to: 127.0.0.1:27010/test
2015-12-21T12:21:28.407+0530 warning: Failed to connect to 127.0.0.1:27010, reason: errno:111 Connection refused
2015-12-21T12:21:28.407+0530 Error: couldn't connect to server 127.0.0.1:27010 (127.0.0.1), connection attempt failed at src/mongo/shell/mongo.js:148
exception: connect failed
這個時候準備重啟mongoDB,這個時候提示重啟失敗。
[root@mongo_mobile_gl_comment_sec ~]# /etc/init.d/mongod start
Starting mongod: [FAILED]
這個時候只能檢視日誌了,得到的日誌資訊如下,可以看到伺服器重啟的痕跡,但是mongo重啟失敗。
less /storage/mongodb/log/mongod.log
2015-12-21T11:06:02.893+0530 [PeriodicTaskRunner] task: DBConnectionPool-cleaner took: 8ms
2015-12-21T11:06:33.976+0530 [conn1675793] command admin.$cmd command: isMaster { ismaster: 1 } keyUpdates:0 numYields:0 reslen:434 1094ms
2015-12-21T11:06:33.976+0530 [conn1675786] command local.$cmd command: getnonce { getnonce: 1 } ntoreturn:1 keyUpdates:0 numYields:0 reslen:65 30244ms
2015-12-21T11:06:33.976+0530 [conn1675795] command admin.$cmd command: isMaster { ismaster: 1 } keyUpdates:0 numYields:0 reslen:434 1096ms
2015-12-21T11:06:33.976+0530 [conn1675794] command admin.$cmd command: buildInfo { buildinfo: 1 } keyUpdates:0 numYields:0 reslen:700 1039ms
2015-12-21T11:06:33.982+0530 [conn1675789] command local.$cmd command: getnonce { getnonce: 1 } ntoreturn:1 keyUpdates:0 numYields:0 reslen:65 31111ms
2015-12-21T11:06:33.982+0530 [conn1675788] command local.$cmd command: getnonce { getnonce: 1 } ntoreturn:1 keyUpdates:0 numYields:0 reslen:65 30028ms
2015-12-21T11:06:33.982+0530 [conn1675790] command local.$cmd command: getnonce { getnonce: 1 } ntoreturn:1 keyUpdates:0 numYields:0 reslen:65 31079ms
2015-12-21T11:06:33.982+0530 [conn1675791] command local.$cmd command: getnonce { getnonce: 1 } ntoreturn:1 keyUpdates:0 numYields:0 reslen:65 31095ms
2015-12-21T11:06:34.715+0530 [conn1675792] command admin.$cmd command: isMaster { ismaster: 1 } keyUpdates:0 numYields:0 reslen:434 30769ms
2015-12-21T11:06:34.769+0530 [conn271045] query mobile_UCENTER.scoreRecord query: { $and: [ { uid: 2093542 }, { createTime: { $lte: new Date(1450675608270) } }, { createTime: { $gte: new Date(1450636200270) } } ] } planSummary: IXSCAN { uid: 1 } ntoreturn:0 ntoskip:0 nscanned:125 nscannedObjects:125 keyUpdates:0 numYields:109 locks(micros) r:275071880 nreturned:0 reslen:20 576812ms
2015-12-21T11:06:43.649+0530 [PeriodicTaskRunner] task: WriteBackManager::cleaner took: 15ms
2015-12-21T12:25:31.762+0530 ***** SERVER RESTARTED *****
2015-12-21T12:25:31.775+0530 [initandlisten] MongoDB starting : pid=11668 port=27010 dbpath=/storage/mongodb/data/db/ 64-bit host=mongo_mobile_gl_comment_sec.126
2015-12-21T12:25:31.775+0530 [initandlisten] db version v2.6.3
2015-12-21T12:25:31.775+0530 [initandlisten] git version: 255f67a66f9603c59380b2a389e386910bbb52cb
2015-12-21T12:25:31.775+0530 [initandlisten] build info: Linux build12.nj1.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
2015-12-21T12:25:31.775+0530 [initandlisten] allocator: tcmalloc
2015-12-21T12:25:31.775+0530 [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: "127.0.0.1,172.16.3.126", http: { enabled: false }, maxIncomingConnections: 5000, port: 27010 }, processManagement: { fork: true, pidFilePath: "/storage/mongodb/mongod.pid" }, replication: { oplogSizeMB: 30960, replSet: "rs_comment" }, security: { authorization: "enabled", keyFile: "/home/mongodb/key/rs_key" }, storage: { dbPath: "/storage/mongodb/data/db/", directoryPerDB: true, journal: { enabled: false } }, systemLog: { destination: "file", logAppend: true, path: "/storage/mongodb/log/mongod.log", quiet: true } }
2015-12-21T12:25:31.786+0530 [initandlisten] exception in initAndListen: 12596 old lock file, terminating
2015-12-21T12:25:31.786+0530 [initandlisten] dbexit:
2015-12-21T12:25:31.786+0530 [initandlisten] shutdown: going to close listening sockets...
2015-12-21T12:25:31.786+0530 [initandlisten] shutdown: going to flush diaglog...
2015-12-21T12:25:31.786+0530 [initandlisten] shutdown: going to close sockets...
2015-12-21T12:25:31.786+0530 [initandlisten] shutdown: waiting for fs preallocator...
2015-12-21T12:25:31.786+0530 [initandlisten] shutdown: closing all files...
2015-12-21T12:25:31.786+0530 [initandlisten] closeAllFiles() finished
2015-12-21T12:25:31.786+0530 [initandlisten] dbexit: really exiting now
對於這類問題,因為也算是菜鳥,所以有些不知所措,不過看了下日誌錯誤,提示是存在一個鎖檔案,這個看起來和weblogic啟動也有點類似,查了一下百度,也得到了印證,我就不繼續靠谷歌了。開始備份這個鎖檔案
[root@mongo_mobile_gl_comment_sec db]# mv mongod.lock mongod.lock.bak
然後繼續嘗試重啟,這次就沒有問題了。
[root@mongo_mobile_gl_comment_sec db]# /etc/init.d/mongod start
Starting mongod: [ OK ]
再次嘗試登入就可以看到是成功了。
[root@mongo_mobile_gl_comment_sec db]# mongo --port=27010
MongoDB shell version: 2.6.3
connecting to: 127.0.0.1:27010/test
當然回過頭來,為什麼伺服器會重啟,重啟之後可以看到,伺服器的負載極高。幾乎沒有任何喘氣的空隙了。
top - 15:29:37 up 364 days, 6:50, 1 user, load average: 92.52, 59.74, 34.39
Tasks: 73 total, 1 running, 72 sleeping, 0 stopped, 0 zombie
Cpu(s): 26.7%us, 7.8%sy, 0.0%ni, 0.7%id, 60.7%wa, 2.8%hi, 1.2%si, 0.2%st
Mem: 15408212k total, 15199732k used, 208480k free, 2732k buffers
Swap: 16777212k total, 9540k used, 16767672k free, 13722616k cached
使用sar得到的結果如下,可以看到idle沒有什麼空間了。
[root@mongo_mobile_gl_comment_sec db]# sar 1 3
Linux 3.10.42-52.145.amzn1.x86_64 (mongo_mobile_gl_comment_sec.126) 12/21/2015 _x86_64_ (2 CPU)
03:30:12 PM CPU %user %nice %system %iowait %steal %idle
03:30:13 PM all 46.84 0.00 8.95 43.68 0.53 0.00
03:30:14 PM all 23.16 0.00 9.47 66.32 0.00 1.05
03:30:15 PM all 18.18 0.00 5.88 75.40 0.53 0.00
Average: all 29.45 0.00 8.11 61.73 0.35 0.35
檢視系統日誌,看看奔潰前的伺服器情況。然後又看到了熟悉的oom-killer,又是out of memory,又是swap不足了。
Dec 21 10:47:51 template-mongodb-2-6-3 dhclient[1504]: bound to 172.16.3.126 -- renewal in 1415 seconds.
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.279409] mongod invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.286302] mongod cpuset=/ mems_allowed=0
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.288577] CPU: 0 PID: 11565 Comm: mongod Not tainted 3.10.42-52.145.amzn1.x86_64 #1
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.292344] Hardware name: Xen HVM domU, BIOS 4.2.amazon 12/03/2014
。。。
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.503711] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.508107] 37117 total pagecache pages
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.510198] 34175 pages in swap cache
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.512098] Swap cache stats: add 5987691, delete 5953516, find 100670521/100858969
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.515633] Free swap = 8kB
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.517084] Total swap = 16777212kB
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.554801] 3999999 pages RAM
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.556423] 82410 pages reserved
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.558234] 526732 pages shared
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.559786] 3881988 pages non-shared
。。。
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.678772] [ 8407] 0 8407 63087 900 75 0 0 php
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.682439] Out of memory: Kill process 23821 (mongod) score 982 or sacrifice child
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.686761] Killed process 23821 (mongod) total-vm:794240260kB, anon-rss:13553192kB, file-rss:0kB
可以在重啟之後看到swap的程式情況。
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11830 mongod 20 0 728g 10g 9.9g S 82.2 70.0 106:53.68 /usr/bin/mongod -f /etc/mongod.conf
326 root 20 0 0 0 0 S 2.7 0.0 4989:16 [kswapd0]
以10秒為間隔,可以看到產生了大量的讀操作。
[root@mongo_mobile_gl_comment_sec ~]# mongotop -h 127.0.0.1 --port 27010 -u admin -p'xxxxxx' 10
connected to: 127.0.0.1:27010
ns total read write 2015-12-21T10:33:56
mobile_UCENTER.scoreRecord 1986786ms 1986786ms 0ms
social.t_user 117490ms 117490ms 0ms
mobile_UCENTER.T_USER_INVITE_CODE 97416ms 97416ms 0ms
social.t_comment 94913ms 94913ms 0ms
social.t_topic 74259ms 74259ms 0ms
mobile_UCENTER.expRecord 48979ms 48979ms 0ms
social.t_praise 7286ms 7286ms 0ms
social.t_topic_sub 4003ms 4003ms 0ms
mobile_UCENTER.boughtRecord 1561ms 1561ms 0ms
使用mongostat檢視統計資訊。可以看到不同的db的負載情況,連dml的情況都能夠統計的非常詳細。
# mongostat -h 127.0.0.1 --port 27010 -u admin -p'xxxxxxx' 5
connected to: 127.0.0.1:27010
insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time
*9 955 *8 *0 0 11|0 0 726g 728g 10.1g 2371 social:0.3% 0 116|0 1|0 166k 308k 1527 rs_comment SEC 16:05:19
*20 2080 *7 *0 0 10|0 0 726g 728g 10.1g 2315 mobile_UCENTER:0.4% 0 700|0 2|0 342k 449k 1527 rs_comment SEC 16:05:25
*22 1963 *13 *0 0 14|0 0 726g 728g 10.2g 2909 mobile_UCENTER:0.2% 0 0|0 300|0 352k 426k 1527 rs_comment SEC 16:05:30
*18 1583 *5 *0 0 11|0 0 726g 728g 10.2g 2495 mobile_UCENTER:0.3% 0 282|0 10|0 275k 346k 1527 rs_comment SEC 16:05:35
後續的就是持續深化,最佳化好之後再來收尾,看看為什麼會負載突然提升,從系統層面怎麼提升了。
當前狀態:PROBLEM
事件編號:11187790
發生時間:2015.12.21 13:30:01
持續時長:1h 20m
監控節點:mongo_mobile_gl_comment_sec.161
報警級別:高危(High)
事件資訊:One or more replication members need attention
事件詳情:Replication: Count of ReplicaSet Members Needing Attention: 1
DEBUG資訊:
Replication: Count of ReplicaSet Members Needing Attention (mongo_mobile_gl_comment_sec.161:repl_member_attention_state_count): 1
對於這類問題,最開始以為是網路延時,結果郵件還是持續報警,這個時候感覺應該是哪裡出了問題。連線到環境中,發現mongo程式已經消失了。
嘗試連線,提示連線失敗。
[root@mongo_mobile_gl_comment_sec ~]# mongo --port=27010
MongoDB shell version: 2.6.3
connecting to: 127.0.0.1:27010/test
2015-12-21T12:21:28.407+0530 warning: Failed to connect to 127.0.0.1:27010, reason: errno:111 Connection refused
2015-12-21T12:21:28.407+0530 Error: couldn't connect to server 127.0.0.1:27010 (127.0.0.1), connection attempt failed at src/mongo/shell/mongo.js:148
exception: connect failed
這個時候準備重啟mongoDB,這個時候提示重啟失敗。
[root@mongo_mobile_gl_comment_sec ~]# /etc/init.d/mongod start
Starting mongod: [FAILED]
這個時候只能檢視日誌了,得到的日誌資訊如下,可以看到伺服器重啟的痕跡,但是mongo重啟失敗。
less /storage/mongodb/log/mongod.log
2015-12-21T11:06:02.893+0530 [PeriodicTaskRunner] task: DBConnectionPool-cleaner took: 8ms
2015-12-21T11:06:33.976+0530 [conn1675793] command admin.$cmd command: isMaster { ismaster: 1 } keyUpdates:0 numYields:0 reslen:434 1094ms
2015-12-21T11:06:33.976+0530 [conn1675786] command local.$cmd command: getnonce { getnonce: 1 } ntoreturn:1 keyUpdates:0 numYields:0 reslen:65 30244ms
2015-12-21T11:06:33.976+0530 [conn1675795] command admin.$cmd command: isMaster { ismaster: 1 } keyUpdates:0 numYields:0 reslen:434 1096ms
2015-12-21T11:06:33.976+0530 [conn1675794] command admin.$cmd command: buildInfo { buildinfo: 1 } keyUpdates:0 numYields:0 reslen:700 1039ms
2015-12-21T11:06:33.982+0530 [conn1675789] command local.$cmd command: getnonce { getnonce: 1 } ntoreturn:1 keyUpdates:0 numYields:0 reslen:65 31111ms
2015-12-21T11:06:33.982+0530 [conn1675788] command local.$cmd command: getnonce { getnonce: 1 } ntoreturn:1 keyUpdates:0 numYields:0 reslen:65 30028ms
2015-12-21T11:06:33.982+0530 [conn1675790] command local.$cmd command: getnonce { getnonce: 1 } ntoreturn:1 keyUpdates:0 numYields:0 reslen:65 31079ms
2015-12-21T11:06:33.982+0530 [conn1675791] command local.$cmd command: getnonce { getnonce: 1 } ntoreturn:1 keyUpdates:0 numYields:0 reslen:65 31095ms
2015-12-21T11:06:34.715+0530 [conn1675792] command admin.$cmd command: isMaster { ismaster: 1 } keyUpdates:0 numYields:0 reslen:434 30769ms
2015-12-21T11:06:34.769+0530 [conn271045] query mobile_UCENTER.scoreRecord query: { $and: [ { uid: 2093542 }, { createTime: { $lte: new Date(1450675608270) } }, { createTime: { $gte: new Date(1450636200270) } } ] } planSummary: IXSCAN { uid: 1 } ntoreturn:0 ntoskip:0 nscanned:125 nscannedObjects:125 keyUpdates:0 numYields:109 locks(micros) r:275071880 nreturned:0 reslen:20 576812ms
2015-12-21T11:06:43.649+0530 [PeriodicTaskRunner] task: WriteBackManager::cleaner took: 15ms
2015-12-21T12:25:31.762+0530 ***** SERVER RESTARTED *****
2015-12-21T12:25:31.775+0530 [initandlisten] MongoDB starting : pid=11668 port=27010 dbpath=/storage/mongodb/data/db/ 64-bit host=mongo_mobile_gl_comment_sec.126
2015-12-21T12:25:31.775+0530 [initandlisten] db version v2.6.3
2015-12-21T12:25:31.775+0530 [initandlisten] git version: 255f67a66f9603c59380b2a389e386910bbb52cb
2015-12-21T12:25:31.775+0530 [initandlisten] build info: Linux build12.nj1.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
2015-12-21T12:25:31.775+0530 [initandlisten] allocator: tcmalloc
2015-12-21T12:25:31.775+0530 [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: "127.0.0.1,172.16.3.126", http: { enabled: false }, maxIncomingConnections: 5000, port: 27010 }, processManagement: { fork: true, pidFilePath: "/storage/mongodb/mongod.pid" }, replication: { oplogSizeMB: 30960, replSet: "rs_comment" }, security: { authorization: "enabled", keyFile: "/home/mongodb/key/rs_key" }, storage: { dbPath: "/storage/mongodb/data/db/", directoryPerDB: true, journal: { enabled: false } }, systemLog: { destination: "file", logAppend: true, path: "/storage/mongodb/log/mongod.log", quiet: true } }
2015-12-21T12:25:31.786+0530 [initandlisten] exception in initAndListen: 12596 old lock file, terminating
2015-12-21T12:25:31.786+0530 [initandlisten] dbexit:
2015-12-21T12:25:31.786+0530 [initandlisten] shutdown: going to close listening sockets...
2015-12-21T12:25:31.786+0530 [initandlisten] shutdown: going to flush diaglog...
2015-12-21T12:25:31.786+0530 [initandlisten] shutdown: going to close sockets...
2015-12-21T12:25:31.786+0530 [initandlisten] shutdown: waiting for fs preallocator...
2015-12-21T12:25:31.786+0530 [initandlisten] shutdown: closing all files...
2015-12-21T12:25:31.786+0530 [initandlisten] closeAllFiles() finished
2015-12-21T12:25:31.786+0530 [initandlisten] dbexit: really exiting now
對於這類問題,因為也算是菜鳥,所以有些不知所措,不過看了下日誌錯誤,提示是存在一個鎖檔案,這個看起來和weblogic啟動也有點類似,查了一下百度,也得到了印證,我就不繼續靠谷歌了。開始備份這個鎖檔案
[root@mongo_mobile_gl_comment_sec db]# mv mongod.lock mongod.lock.bak
然後繼續嘗試重啟,這次就沒有問題了。
[root@mongo_mobile_gl_comment_sec db]# /etc/init.d/mongod start
Starting mongod: [ OK ]
再次嘗試登入就可以看到是成功了。
[root@mongo_mobile_gl_comment_sec db]# mongo --port=27010
MongoDB shell version: 2.6.3
connecting to: 127.0.0.1:27010/test
當然回過頭來,為什麼伺服器會重啟,重啟之後可以看到,伺服器的負載極高。幾乎沒有任何喘氣的空隙了。
top - 15:29:37 up 364 days, 6:50, 1 user, load average: 92.52, 59.74, 34.39
Tasks: 73 total, 1 running, 72 sleeping, 0 stopped, 0 zombie
Cpu(s): 26.7%us, 7.8%sy, 0.0%ni, 0.7%id, 60.7%wa, 2.8%hi, 1.2%si, 0.2%st
Mem: 15408212k total, 15199732k used, 208480k free, 2732k buffers
Swap: 16777212k total, 9540k used, 16767672k free, 13722616k cached
使用sar得到的結果如下,可以看到idle沒有什麼空間了。
[root@mongo_mobile_gl_comment_sec db]# sar 1 3
Linux 3.10.42-52.145.amzn1.x86_64 (mongo_mobile_gl_comment_sec.126) 12/21/2015 _x86_64_ (2 CPU)
03:30:12 PM CPU %user %nice %system %iowait %steal %idle
03:30:13 PM all 46.84 0.00 8.95 43.68 0.53 0.00
03:30:14 PM all 23.16 0.00 9.47 66.32 0.00 1.05
03:30:15 PM all 18.18 0.00 5.88 75.40 0.53 0.00
Average: all 29.45 0.00 8.11 61.73 0.35 0.35
檢視系統日誌,看看奔潰前的伺服器情況。然後又看到了熟悉的oom-killer,又是out of memory,又是swap不足了。
Dec 21 10:47:51 template-mongodb-2-6-3 dhclient[1504]: bound to 172.16.3.126 -- renewal in 1415 seconds.
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.279409] mongod invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.286302] mongod cpuset=/ mems_allowed=0
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.288577] CPU: 0 PID: 11565 Comm: mongod Not tainted 3.10.42-52.145.amzn1.x86_64 #1
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.292344] Hardware name: Xen HVM domU, BIOS 4.2.amazon 12/03/2014
。。。
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.503711] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.508107] 37117 total pagecache pages
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.510198] 34175 pages in swap cache
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.512098] Swap cache stats: add 5987691, delete 5953516, find 100670521/100858969
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.515633] Free swap = 8kB
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.517084] Total swap = 16777212kB
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.554801] 3999999 pages RAM
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.556423] 82410 pages reserved
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.558234] 526732 pages shared
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.559786] 3881988 pages non-shared
。。。
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.678772] [ 8407] 0 8407 63087 900 75 0 0 php
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.682439] Out of memory: Kill process 23821 (mongod) score 982 or sacrifice child
Dec 21 11:09:07 template-mongodb-2-6-3 kernel: [31458594.686761] Killed process 23821 (mongod) total-vm:794240260kB, anon-rss:13553192kB, file-rss:0kB
可以在重啟之後看到swap的程式情況。
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11830 mongod 20 0 728g 10g 9.9g S 82.2 70.0 106:53.68 /usr/bin/mongod -f /etc/mongod.conf
326 root 20 0 0 0 0 S 2.7 0.0 4989:16 [kswapd0]
以10秒為間隔,可以看到產生了大量的讀操作。
[root@mongo_mobile_gl_comment_sec ~]# mongotop -h 127.0.0.1 --port 27010 -u admin -p'xxxxxx' 10
connected to: 127.0.0.1:27010
ns total read write 2015-12-21T10:33:56
mobile_UCENTER.scoreRecord 1986786ms 1986786ms 0ms
social.t_user 117490ms 117490ms 0ms
mobile_UCENTER.T_USER_INVITE_CODE 97416ms 97416ms 0ms
social.t_comment 94913ms 94913ms 0ms
social.t_topic 74259ms 74259ms 0ms
mobile_UCENTER.expRecord 48979ms 48979ms 0ms
social.t_praise 7286ms 7286ms 0ms
social.t_topic_sub 4003ms 4003ms 0ms
mobile_UCENTER.boughtRecord 1561ms 1561ms 0ms
使用mongostat檢視統計資訊。可以看到不同的db的負載情況,連dml的情況都能夠統計的非常詳細。
# mongostat -h 127.0.0.1 --port 27010 -u admin -p'xxxxxxx' 5
connected to: 127.0.0.1:27010
insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time
*9 955 *8 *0 0 11|0 0 726g 728g 10.1g 2371 social:0.3% 0 116|0 1|0 166k 308k 1527 rs_comment SEC 16:05:19
*20 2080 *7 *0 0 10|0 0 726g 728g 10.1g 2315 mobile_UCENTER:0.4% 0 700|0 2|0 342k 449k 1527 rs_comment SEC 16:05:25
*22 1963 *13 *0 0 14|0 0 726g 728g 10.2g 2909 mobile_UCENTER:0.2% 0 0|0 300|0 352k 426k 1527 rs_comment SEC 16:05:30
*18 1583 *5 *0 0 11|0 0 726g 728g 10.2g 2495 mobile_UCENTER:0.3% 0 282|0 10|0 275k 346k 1527 rs_comment SEC 16:05:35
後續的就是持續深化,最佳化好之後再來收尾,看看為什麼會負載突然提升,從系統層面怎麼提升了。
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/23718752/viewspace-1876475/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- 簡單的字串處理字串
- 簡單的文字處理
- 串的簡單處理
- 簡單的全域性異常統一處理
- 處理一些編譯警告的簡單編譯
- ORACLE的簡單處理高水位Oracle
- 一個慢查詢報警的簡單處理
- MongoDB故障處理MongoDB
- SQL server觸發器簡單示例SQLServer觸發器
- 一個簡單的 PHP 時間處理擴充套件PHP套件
- 基於Opencv的簡單影像處理OpenCV
- Java的簡單理解(22)---處理流Java
- 簡單實現批處理
- 處理 EF 併發其實就這麼簡單
- 簡單建立序列和觸發器示例觸發器
- 一個簡單的MapReduce示例(多個MapReduce任務處理)
- 一個簡單易用的資料庫壞塊處理方案資料庫
- MongoDB索引的簡單理解MongoDB索引
- MongoDB 簡單上手MongoDB
- 簡單批處理內部命令簡介
- android簡單的圖形特效處理Android特效
- iOS換膚功能的簡單處理框架iOS框架
- TensorFlow進行簡單的影像處理
- 包含(處理)HTML的最簡單方法包括HTML
- 使用Lambda表示式處理簡單的業務
- 刪除前一天的備份的一個簡單批處理
- 影象處理庫GPUImage簡單使用GPUUI
- php 處理訊號簡單演示PHP
- 關於MongoDB的簡單理解(一)--基礎篇MongoDB
- 分享一下 Laravel 的中時間的簡單處理方式例項Laravel
- window下mongodb的簡單配置MongoDB
- Mongodb請求處理流程MongoDB
- Rust中錯誤處理的最簡單指南Rust
- MySQL:簡單記錄訊號處理MySql
- 實現一個簡單的基於 WebAssembly 的圖片處理應用Web
- 一個簡單的bigfile tablespace無法擴充套件的案例處理套件
- 基於gulp的一個簡單的處理多個api域名環境的開發工作流API
- jwt 多使用者認證,一次簡單處理。JWT