Using the Linux alarm signal (SIGALRM) to detect whether a process is alive
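Preface
I recently ran into a problem on a project. The program runs several processes that are independent of one another, decoupled in a way that resembles a microservice architecture. We need the system to detect bugs such as the main thread running away or getting stuck in an infinite loop. The initial design had every process send keep-alive messages to a monitor process; the monitor collected the keep-alive messages from each module and used them to decide whether a process had run away. That design needs an extra monitor module, which is hard to justify on an embedded device where memory is already limited. So we asked whether something built into Linux could meet the requirement, and the answer was SIGALRM.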
1. Signal vs. Semaphore
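Signal: a signal is a software interrupt that notifies a process that an asynchronous event has occurred. Processes can send signals to one another with the kill system call, and the kernel can also send a signal to a process in response to an internal event, to tell it that something has happened.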
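To make the "kill sends a signal" point concrete, here is a minimal Python 3 sketch in which a process signals itself. The choice of SIGUSR1 and the names on_signal and send_signal_to_self are my own assumptions for illustration; they are not part of the project described in this post.

```python
import os
import signal
import time

received = []  # signal numbers seen by the handler


def on_signal(signum, frame):
    # Python runs this in the main thread once the signal has been delivered.
    received.append(signum)


def send_signal_to_self():
    """Install a SIGUSR1 handler, signal this process, and return what arrived."""
    del received[:]
    previous = signal.signal(signal.SIGUSR1, on_signal)
    try:
        # Despite its name, kill only sends a signal; the handler decides what happens.
        os.kill(os.getpid(), signal.SIGUSR1)
        # Give the interpreter a moment to run the handler.
        deadline = time.monotonic() + 1.0
        while not received and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        signal.signal(signal.SIGUSR1, previous)  # restore the old handler
    return list(received)


if __name__ == "__main__":
    print(send_signal_to_self() == [signal.SIGUSR1])
```

The handler only records the signal number. Doing as little as possible inside a handler is a common habit, because it interrupts whatever the main thread was doing.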
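Semaphore: a semaphore is used by the operating system to synchronize access to shared resources between processes. It is created with an initial value that states how many tasks may access the protected resource at the same time. With an initial value of 1 it acts as a mutual-exclusion lock (mutex): only one task at a time may access the resource.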
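The role of the initial value can be sketched with Python 3's threading.Semaphore. This uses threads rather than processes to keep the example short, and the function name run_workers and its parameters are hypothetical.

```python
import threading
import time


def run_workers(permits, workers=6):
    """Return the highest number of workers inside the guarded section at once."""
    sem = threading.Semaphore(permits)   # initial value = allowed concurrent users
    lock = threading.Lock()              # protects the counters below
    state = {"inside": 0, "peak": 0}

    def worker():
        with sem:                        # blocks while `permits` workers are inside
            with lock:
                state["inside"] += 1
                state["peak"] = max(state["peak"], state["inside"])
            time.sleep(0.05)             # pretend to use the shared resource
            with lock:
                state["inside"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


if __name__ == "__main__":
    print(run_workers(1), run_workers(3))
```

With permits=1 the peak can never exceed 1, which is the mutex-like case; with permits=3 it can never exceed 3.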
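2. SIGALRM and a Python implementation
SIGALRM is the signal sent to a process when a timer expires. When making a blocking system call, you can set a timer for it so that the process does not wait forever.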
#include <unistd.h>
unsigned int alarm(unsigned int seconds);
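The timeout use mentioned above can be sketched in Python 3, whose signal.alarm wraps this call. This is only an illustration under my own assumptions: the names Timeout and read_with_timeout are hypothetical, and the handler has to be installed from the main thread.

```python
import os
import signal


class Timeout(Exception):
    """Raised by the SIGALRM handler to break out of a blocking call."""


def _raise_timeout(signum, frame):
    raise Timeout()


def read_with_timeout(fd, seconds):
    """Read from fd, giving up after `seconds` whole seconds. Returns None on timeout."""
    previous = signal.signal(signal.SIGALRM, _raise_timeout)
    signal.alarm(seconds)                # arm the timer
    try:
        return os.read(fd, 1024)         # blocks until data arrives or SIGALRM fires
    except Timeout:
        return None
    finally:
        signal.alarm(0)                  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)


if __name__ == "__main__":
    r, w = os.pipe()
    print(read_with_timeout(r, 1))       # nothing has been written to the pipe yet
```

Raising an exception from the handler matters here: if the handler simply returned, Python 3 would restart the interrupted read and keep waiting.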
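Once alarm() has been called successfully, the timer starts counting. When the given number of seconds has elapsed, the kernel sends SIGALRM to the process and the registered handler runs. Here is a Python example: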
# Python 2 code: it uses the thread module and print statements.
import signal, time, sys, thread, traceback


class Example:
    def __init__(self):
        self.handler_counter = 0
        self.retry_counter = 3

    def timout_handler(self, signum, frame):
        '''
        Called on SIGALRM, i.e. when the alarm was not re-armed in time.
        The timeout is counted several times before exiting, to make sure
        the main thread is really hung.
        '''
        self.handler_counter += 1
        print "call timeout_handler counter: " + str(self.handler_counter)
        if self.handler_counter == self.retry_counter:
            # In Python 2 this prints a tuple, as seen in the output.
            print("Have retry %s, exit process", self.retry_counter)
            traceback.print_stack(frame)  # print traceback
            sys.exit()

    def monitor_alive(self, threadName, delay):
        '''
        Every `delay` seconds, re-arm the alarm with a timeout of (delay + 1)
        seconds. If the alarm is not re-armed within (delay + 1) seconds, the
        kernel delivers SIGALRM and timout_handler is called.
        '''
        count = 0
        while True:
            time.sleep(delay)
            signal.alarm(delay + 1)
            print "sign_time count " + str(count)
            # the checks below add an extra sleep to simulate three timeouts
            if count == 2:
                time.sleep(delay)
            if count == 4:
                time.sleep(delay)
            if count == 6:
                time.sleep(delay)
            count += 1
            print "%s: %s" % (threadName, time.ctime(time.time()))


if __name__ == '__main__':
    example = Example()
    # register the handler
    # a signal handler can only be set in the main thread
    # https://stackoverflow.com/questions/44151888/why-only-main-thread-can-set-signal-handler-in-python
    signal.signal(signal.SIGALRM, example.timout_handler)
    thread.start_new_thread(example.monitor_alive, ("Thread-1", 2,))
    while True:
        time.sleep(2)
        print('main thread ')
Output:
sign_time count 0
Thread-1: Sat Jun 9 10:52:43 2018
main thread
sign_time count 1
Thread-1: Sat Jun 9 10:52:45 2018
main thread
sign_time count 2
main thread
Thread-1: Sat Jun 9 10:52:49 2018
main thread
call timeout_handler counter: 1
main thread
sign_time count 3
Thread-1: Sat Jun 9 10:52:51 2018
main thread
sign_time count 4
main thread
Thread-1: Sat Jun 9 10:52:55 2018
main thread
call timeout_handler counter: 2
main thread
sign_time count 5
Thread-1: Sat Jun 9 10:52:57 2018
main thread
sign_time count 6
main thread
Thread-1: Sat Jun 9 10:53:01 2018
main thread
call timeout_handler counter: 3
('Have retry %s, exit process', 3)
File "/home/odl/sereno/tests/singal.py", line 52, in <module>
time.sleep(2)
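For readers on Python 3, the same re-arming idea might look roughly like the sketch below. It is not a port of the code above: the Watchdog class, its method names, and the demo function are hypothetical, and the handler only counts missed heartbeats instead of printing a traceback and exiting.

```python
import signal
import threading
import time


class Watchdog:
    """Counts SIGALRM deliveries caused by missed heartbeats."""

    def __init__(self, timeout=1):
        self.timeout = timeout           # whole seconds, as alarm() requires
        self.missed = 0
        self._previous = None

    def _on_alarm(self, signum, frame):
        self.missed += 1

    def start(self):
        # Handlers can only be installed from the main thread.
        self._previous = signal.signal(signal.SIGALRM, self._on_alarm)
        signal.alarm(self.timeout)

    def heartbeat(self):
        # Re-arming replaces the pending alarm, so it never fires
        # while heartbeats keep arriving in time.
        signal.alarm(self.timeout)

    def stop(self):
        signal.alarm(0)                  # cancel the pending alarm
        signal.signal(signal.SIGALRM, self._previous)


def demo(beats, interval=0.2, timeout=1):
    """Send `beats` heartbeats, then go silent. Returns (missed_during, missed_after)."""
    dog = Watchdog(timeout)
    dog.start()

    def worker():
        for _ in range(beats):
            time.sleep(interval)
            dog.heartbeat()
        # the worker stops here, which simulates a hang

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    during = dog.missed
    deadline = time.monotonic() + timeout + 1.0
    while dog.missed == during and time.monotonic() < deadline:
        time.sleep(0.05)
    dog.stop()
    return during, dog.missed


if __name__ == "__main__":
    print(demo(3))
```

As in the original, the heartbeat comes from a worker thread while the handler runs in the main thread, so the design choice about which thread is being watched stays the same.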