Using Jiralert to Send Alertmanager Alerts to Jira

Posted by 東風微鳴 on 2023-01-04

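Introduction

Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integration, such as email, WeChat, or DingTalk. It also handles silencing, muting (time-based sending or not sending), and inhibition of alerts.

As an open-source alerting application designed for Prometheus, Alertmanager already provides a rich, flexible, and customizable set of alerting features.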

Jiralert

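A Prometheus Alertmanager webhook receiver for JIRA.

JIRAlert implements Alertmanager's webhook HTTP API and connects to one or more JIRA instances to create highly configurable JIRA issues. One issue is created per distinct group key (as defined by the group_by parameter in the route section of the Alertmanager configuration), but by default it is not closed when the alert is resolved. The expectation is that a human will look at the issue, take any necessary action, and then close it. If no human interaction is necessary, it probably should not have alerted in the first place. This behavior can, however, be changed by setting the auto_resolve section, which resolves the Jira issue using the configured state.

If a corresponding JIRA issue already exists but is resolved, it is reopened. A JIRA transition (set by reopen_state) must exist between the resolved state and the reopened state, otherwise reopening fails. Optionally, a "won't fix" resolution can be defined through wont_fix_resolution: JIRA issues with this resolution are not reopened by JIRAlert.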
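The create/reopen/resolve rules described above can be sketched as a small decision function. This is only an illustrative model of the prose, not Jiralert's actual code: the `ExistingIssue` type, the `decide_action` function, and the action names are hypothetical, and the handling of a resolved alert with no existing issue is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExistingIssue:
    """Hypothetical, simplified view of the Jira issue found for a group key."""
    resolved: bool
    resolution: Optional[str] = None


def decide_action(alert_status: str,
                  issue: Optional[ExistingIssue],
                  wont_fix_resolution: Optional[str] = None,
                  auto_resolve: bool = False) -> str:
    """Return an illustrative action name for one incoming alert group."""
    if issue is None:
        # No issue yet for this group key: a firing alert creates one.
        # Ignoring a resolved alert here is an assumption of this sketch.
        return "create" if alert_status == "firing" else "ignore"

    if alert_status == "resolved":
        # By default the issue is left for a human to close;
        # auto_resolve changes that.
        if auto_resolve and not issue.resolved:
            return "resolve"
        return "leave"

    # From here on the alert is firing and an issue already exists.
    if not issue.resolved:
        return "update"
    if wont_fix_resolution is not None and issue.resolution == wont_fix_resolution:
        # Issues resolved as "won't fix" are never reopened.
        return "leave"
    return "reopen"
```

For example, `decide_action("firing", ExistingIssue(resolved=True, resolution="Done"), "Won't Fix")` yields `"reopen"`, while the same call with `resolution="Won't Fix"` yields `"leave"`.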

Installing Jiralert

Installing Jiralert is fairly simple. It mainly consists of a Deployment, a Secret (holding the Jiralert configuration), and a Service. A typical example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jiralert
spec:
  selector:
    matchLabels:
      app: jiralert
  template:
    metadata:
      labels:
        app: jiralert
    spec:
      containers:
      - name: jiralert
        image: quay.io/jiralert/jiralert-linux-amd64:latest
        imagePullPolicy: IfNotPresent
        args:
        - "--config=/jiralert-config/jiralert.yml"
        - "--log.level=debug"
        - "--listen-address=:9097"
        readinessProbe:
          tcpSocket:
            port: 9097
          initialDelaySeconds: 15
          periodSeconds: 15
          timeoutSeconds: 5
        livenessProbe:
          tcpSocket:
            port: 9097
          initialDelaySeconds: 15
          periodSeconds: 15
          timeoutSeconds: 5
        ports:
        - containerPort: 9097
          name: metrics
        volumeMounts:
        - mountPath: /jiralert-config
          name: jiralert-config
          readOnly: true
      volumes:
      - name: jiralert-config
        secret:
          secretName: jiralert-config
---
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: jiralert-config
stringData:
  jiralert.tmpl: |-
    {{ define "jira.summary" }}[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .GroupLabels.SortedPairs.Values | join "," }}{{ end }}
    
    {{ define "jira.description" }}{{ range .Alerts.Firing }}Labels:
    {{ range .Labels.SortedPairs }} - {{ .Name }} = {{ .Value }}
    {{ end }}
    
    Annotations:
    {{ range .Annotations.SortedPairs }} - {{ .Name }} = {{ .Value }}
    {{ end }}
    
    Source: {{ .GeneratorURL }}
    {{ end }}
    
    CommonLabels:
    {{ range .CommonLabels.SortedPairs }} - {{ .Name }} = {{ .Value}}
    {{ end }}
    
    GroupLabels:
    {{ range .GroupLabels.SortedPairs }} - {{ .Name }} = {{ .Value}}
    {{ end }}
    {{ end }}
  jiralert.yml: |-
    # Global defaults, applied to all receivers where not explicitly overridden. Optional.
    template: jiralert.tmpl
    defaults:
      # API access fields.
      api_url: https://jira.example.com
      user: foo
      password: bar
      # The type of JIRA issue to create. Required.
      issue_type: Bug
      # Issue priority. Optional.
      priority: Major
      # Go template invocation for generating the summary. Required.
      summary: '{{ template "jira.summary" . }}'
      # Go template invocation for generating the description. Optional.
      description: '{{ template "jira.description" . }}'
      # State to transition into when reopening a closed issue. Required.
      reopen_state: "REOPENED"
      # Do not reopen issues with this resolution. Optional.
      wont_fix_resolution: "Won't Fix"
      # Amount of time after being closed that an issue should be reopened, after which, a new issue is created.
      # Optional (default: always reopen)
      # reopen_duration: 30d
    
    # Receiver definitions. At least one must be defined.
    # Receiver names must match the Alertmanager receiver names. Required.
    receivers:
    - name: 'jiralert'
      project: 'YOUR-JIRA-PROJECT'
---
apiVersion: v1
kind: Service
metadata:
  name: jiralert
spec:
  selector:
    app: jiralert
  ports:
  - port: 9097
    targetPort: 9097

The corresponding Alertmanager configuration:

...
receivers:
- name: jiralert
  webhook_configs:
  - send_resolved: true
    url: http://jiralert:9097/alert
routes:
- receiver: jiralert
  matchers:
  - severity = critical
  continue: true
...
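To check the webhook path without waiting for a real alert, one option is to post a hand-built payload to the Jiralert URL used above. The sketch below is an assumption-laden example: the field layout follows Alertmanager's version 4 webhook format as I understand it, and the alert name, annotation text, and example.com URLs are placeholders. Note that sending it would create a real issue in the configured Jira project.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_payload(receiver="jiralert", alertname="JiralertSmokeTest"):
    """Build a minimal Alertmanager-style webhook payload (placeholder values)."""
    labels = {"alertname": alertname, "severity": "critical"}
    annotations = {"summary": "manual test alert"}
    return {
        "version": "4",
        "groupKey": '{}:{alertname="%s"}' % alertname,
        "status": "firing",
        # Must match a receiver name defined in jiralert.yml.
        "receiver": receiver,
        "groupLabels": {"alertname": alertname},
        "commonLabels": labels,
        "commonAnnotations": annotations,
        "externalURL": "http://alertmanager.example.com",  # placeholder
        "alerts": [{
            "status": "firing",
            "labels": labels,
            "annotations": annotations,
            "startsAt": datetime.now(timezone.utc).isoformat(),
            "endsAt": "0001-01-01T00:00:00Z",
            "generatorURL": "http://prometheus.example.com/graph",  # placeholder
        }],
    }


def send(payload, url="http://jiralert:9097/alert"):
    """POST the payload to Jiralert and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

Calling `send(build_test_payload())` from a pod in the same namespace should exercise the same route that Alertmanager uses.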

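Notes:

  • Official jiralert image repository: https://quay.io/repository/jiralert/jiralert-linux-amd64?tab=tags
    • Official jiralert latest image: quay.io/jiralert/jiralert-linux-amd64:latest
  • jiralert.tmpl is similar to an Alertmanager template; issues sent to Jira are rendered from it.
  • jiralert.yml is the Jiralert configuration file.
    • defaults: the base configuration, applied to every receiver unless explicitly overridden.
    • receivers: multiple receivers can be defined. The Alertmanager receiver that sends to Jira must have the same name as the corresponding Jiralert receiver (in the example above, both are named jiralert).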
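Since the receiver names must match on both sides, a quick consistency check can catch typos before deployment. The helper below is a hypothetical sketch that works on plain lists of names; how you extract those names from the two configuration files is left open.

```python
def unmatched_receivers(alertmanager_receivers, jiralert_receivers):
    """Return Alertmanager receiver names with no same-named Jiralert receiver.

    alertmanager_receivers: names of the Alertmanager receivers whose
        webhook points at Jiralert.
    jiralert_receivers: names listed under `receivers` in jiralert.yml.
    """
    known = set(jiralert_receivers)
    return [name for name in alertmanager_receivers if name not in known]
```

An empty result means every Alertmanager receiver that targets Jiralert has a counterpart in jiralert.yml.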

Jiralert Configuration

A complete Jiralert configuration that has been proven in production:

# Global defaults, applied to all receivers where not explicitly overridden. Optional.
template: jiralert.tmpl
defaults:
  # API access fields.
  api_url: https://example.atlassian.net
  user: <your-account-email>
  password: '<your-account-api-token>'
  # The type of JIRA issue to create. Required.
  issue_type: Support
  # Issue priority. Optional.
  priority: High
  # Go template invocation for generating the summary. Required.
  summary: '{{ template "jira.summary" . }}'
  # Go template invocation for generating the description. Optional.
  description: '{{ template "jira.description" . }}'
  # State to transition into when reopening a closed issue. Required.
  reopen_state: "Back to in progress"
  # Do not reopen issues with this resolution. Optional.
  wont_fix_resolution: "Won't Do"
  # Amount of time after being closed that an issue should be reopened, after which, a new issue is created.
  # Optional (default: always reopen)
  reopen_duration: 30d

# Receiver definitions. At least one must be defined.
# Receiver names must match the Alertmanager receiver names. Required.
receivers:
- name: 'jiralert'
  project: <your-project-code>
  add_group_labels: true
  auto_resolve:
    state: 'Resolve this issue'

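Details:

  1. api_url: the Jira address. For Jira's SaaS offering it is https://<tenant>.atlassian.net
  2. Authentication:
    1. For Jira Cloud, only user and password can be used, where:
      1. user is your account email address;
      2. password is an API token, which you first need to create on the Atlassian account page (API Token | Atlassian account). (Note: the password you log in with will not authenticate.)
    2. For other editions, you can also authenticate with personal_access_token. Its value is the base64-encoded string of user@example.com:api_token_string. For details, see: Basic auth for REST APIs (atlassian.com)
  3. issue_type: fill in according to your Jira issue types, for example Alert, Support, Bug, New Feature, or others.
  4. priority: fill in according to your issue priorities, for example Critical, High, Medium, Low, or others.
  5. reopen_state: the transition needed to reopen a Jira issue that has been closed, for example Back to in progress. (Note: this must be the name of your custom transition, not a status.)
  6. wont_fix_resolution: issues with this resolution are not reopened, for example Won't Do or Won't Fix. Fill it in according to your own resolution definitions.
  7. reopen_duration: how long after being closed an issue can still be reopened. The default is to always reopen. With a value such as 30d, if the matching issue was closed more than 30 days ago, a new issue is created instead of reopening the old one.
  8. receivers: multiple receivers can be defined, each pointing to a different project.
  9. project: the Jira project key, typically the uppercase initials of the project's full name. For example, if the project is For Example, enter FE.
  10. add_group_labels: whether to add Alertmanager's group labels to the Jira labels. (Note: Jira label values cannot contain spaces, so do not enable this if any of your Alertmanager group label values contain spaces.)
  11. auto_resolve: a feature added in version 1.2 (the latest at the time of writing). When the alert recovers, the corresponding Jira issue can be resolved automatically.
    1. state: 'Resolve this issue'. This must also be the name of your predefined Jira transition that resolves the issue, not a status.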
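Two of the points above lend themselves to a quick script: producing the base64-encoded user:token string from item 2, and checking group label values for spaces before enabling add_group_labels (item 10). Both helpers below are hypothetical names introduced for illustration, and the credentials shown are placeholders.

```python
import base64


def encode_basic_credentials(email: str, api_token: str) -> str:
    """Return base64("email:api_token") as an ASCII string."""
    raw = f"{email}:{api_token}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def labels_with_spaces(group_labels: dict) -> dict:
    """Return the group labels whose values contain a space.

    A non-empty result suggests add_group_labels should stay disabled,
    because Jira label values cannot contain spaces.
    """
    return {name: value for name, value in group_labels.items() if " " in value}
```

For instance, `labels_with_spaces({"alertname": "High CPU", "severity": "critical"})` flags only the `alertname` label.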

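Other Troubleshooting

If you run into all sorts of strange log messages, the cause is most often a failed authentication. A typical example is this error:

The value 'XXX' does not exist for the field 'project'.

Although it mentions the project field, this error actually means the request was not authenticated correctly.

For details, see: Solved: REST error "The value 'XXX' does not exist for the... (atlassian.com)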
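One way to rule out authentication problems independently of Jiralert is to call Jira's REST API directly with the same credentials. The sketch below assumes Basic authentication with an email and API token and uses the `/rest/api/2/myself` endpoint, which returns the current user when the credentials are accepted; the function names are hypothetical.

```python
import base64
import json
import urllib.request


def build_myself_request(api_url: str, user: str, api_token: str) -> urllib.request.Request:
    """Build a GET request for the 'current user' endpoint using Basic auth."""
    token = base64.b64encode(f"{user}:{api_token}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        api_url.rstrip("/") + "/rest/api/2/myself",
        headers={"Authorization": "Basic " + token, "Accept": "application/json"},
    )


def whoami(api_url: str, user: str, api_token: str) -> dict:
    """Return the parsed JSON for the authenticated user.

    urllib raises HTTPError (for example 401) when the credentials are rejected.
    """
    request = build_myself_request(api_url, user, api_token)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)
```

If `whoami(...)` fails with the values taken from jiralert.yml, the problem lies with the credentials rather than with the project setting.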

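Another class of errors reports that an issue cannot be transitioned. This is usually due to one of the following:

  1. reopen_state or auto_resolve.state in the Jiralert configuration is not set to a correct transition name
  2. The account you are using lacks the required permission
  3. The issue's current status (for example Closed) does not allow any further transition

For details, see: I can't transition an issue in my Jira project - W... - Atlassian Community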
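To narrow down which of these causes applies, it can help to compare the configured transition names with what Jira actually offers for the issue in its current status. The sketch below assumes you have already fetched the JSON from `GET /rest/api/2/issue/<issue-key>/transitions` with the Jiralert account; the helper names are hypothetical, and exact, case-sensitive name matching is an assumption of this sketch rather than a statement about Jiralert's behavior.

```python
def available_transition_names(transitions_response: dict) -> list:
    """Extract transition names from a Jira 'transitions' API response."""
    return [t["name"] for t in transitions_response.get("transitions", [])]


def missing_transitions(transitions_response: dict, configured: list) -> list:
    """Return configured transition names that Jira does not currently offer."""
    offered = set(available_transition_names(transitions_response))
    return [name for name in configured if name not in offered]


# Example response shape, trimmed to the fields used here (values are made up).
sample_response = {
    "transitions": [
        {"id": "11", "name": "Back to in progress"},
        {"id": "21", "name": "Resolve this issue"},
    ]
}
```

An empty list from the API for that account points to a permission or workflow restriction, while a non-empty list that lacks your configured name points to a naming mismatch.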

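Final Result

As shown in the figure below:

[Figure: Jiralert result]

Jiralert can create an issue and update its Summary, Description, Resolution, and Status; when the same problem occurs again, it reopens the previous issue.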


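Among any three people walking together, one can be my teacher; shared knowledge belongs to everyone. This article was written by the 東風微鳴 tech blog, EWhisper.cn.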
