k8s and Monitoring: Deploying Grafana 6.0 on k8s

Posted by weixin_34127717 on 2019-02-28

Preface

This article introduces some of the new features in Grafana 6.0, the latest release, and shows how to deploy it to k8s.

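Introduction to Grafana 6.0

This Grafana release introduces a new way to query and display data, adds support for log data, and brings many other features.

The main highlights are: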

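  • Explore - a new query workflow for ad-hoc data exploration and troubleshooting.
  • Grafana Loki - integration with Grafana Labs' new open source log aggregation system.
  • Gauge Panel - a new standalone panel for gauges.
  • New Panel Editor UX - improves panel editing and makes it easy to switch between visualizations.
  • Google Stackdriver Datasource - out of beta and officially released.
  • Azure Monitor - the plugin has been moved from an external plugin to a core data source.
  • React Plugin - support that makes it easier to build plugins.
  • Named Colors - included in the new, improved color picker.
  • Removal of user session storage - makes Grafana easier to deploy and improves security.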

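Explore and Grafana Loki are clearly features introduced to strengthen Grafana's log display capabilities. However, Loki, a log storage and retrieval framework inspired by Prometheus, has not yet had a release at the time of writing, and its maintainers do not recommend it for production use. Loki is still a technology worth watching: it integrates deeply with k8s and can be used specifically to handle logs in k8s.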

Below is a screenshot of logs being handled in Explore:

[Image: Explore log view]

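Deploying Grafana 6.0

This section covers how to deploy Grafana 6.0 to k8s.

Because our environment is k8s hosted on AWS, pay attention to the PVC and the Service (svc): both need small changes when you port these manifests to another environment.

Below is the ConfigMap, which contains two configuration files, ldap.toml and grafana.ini. In a real enterprise environment Grafana usually has to integrate with the organization's LDAP, so ldap.toml is included.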

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: hawkeye-grafana
  name: hawkeye-grafana-cm
  namespace: sgt
data:
  ldap.toml: |-
    # To troubleshoot and get more log info enable ldap debug logging in grafana.ini
    # [log]
    # filters = ldap:debug

    [[servers]]
    # Ldap server host (specify multiple hosts space separated)
    host = "ldap.xxx.org"
    # Default port is 389 or 636 if use_ssl = true
    port = 389
    # Set to true if ldap server supports TLS
    use_ssl = false
    # Set to true if connect ldap server with STARTTLS pattern (create connection in insecure, then upgrade to secure connection with TLS)
    start_tls = false
    # set to true if you want to skip ssl cert validation
    ssl_skip_verify = false
    # set to the path to your root CA certificate or leave unset to use system defaults
    # root_ca_cert = "/path/to/certificate.crt"
    # Authentication against LDAP servers requiring client certificates
    # client_cert = "/path/to/client.crt"
    # client_key = "/path/to/client.key"

    # Search user bind dn
    bind_dn = "cn=Manager,dc=xxx,dc=com"
    # Search user bind password
    # If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
    bind_password = 'xxxxx'

    # User search filter, for example "(cn=%s)" or "(sAMAccountName=%s)" or "(uid=%s)"
    search_filter = "(cn=%s)"

    # An array of base dns to search through
    search_base_dns = ["ou=tech,cn=hawkeye,dc=xxxx,dc=com"]

    ## For Posix or LDAP setups that does not support member_of attribute you can define the below settings
    ## Please check grafana LDAP docs for examples
    # group_search_filter = "(&(objectClass=posixGroup)(memberUid=%s))"
    # group_search_base_dns = ["ou=groups,dc=grafana,dc=org"]
    # group_search_filter_user_attribute = "uid"

    # Specify names of the ldap attributes your ldap uses
    [servers.attributes]
    name = "givenName"
    surname = "sn"
    username = "cn"
    member_of = "memberOf"
    email =  "email"

    # Map ldap groups to grafana org roles
    [[servers.group_mappings]]
    group_dn = "cn=admins,dc=grafana,dc=org"
    org_role = "Admin"
    # To make user an instance admin  (Grafana Admin) uncomment line below
    # grafana_admin = true
    # The Grafana organization database id, optional, if left out the default org (id 1) will be used
    # org_id = 1

    [[servers.group_mappings]]
    group_dn = "cn=users,dc=grafana,dc=org"
    org_role = "Editor"

    [[servers.group_mappings]]
    # If you want to match all (or no ldap groups) then you can use wildcard
    group_dn = "*"
    org_role = "Viewer"

  grafana.ini: |-
    ##################### Grafana Configuration Example #####################
    #
    # Everything has defaults so you only need to uncomment things you want to
    # change

    # possible values : production, development
    ;app_mode = production

    # instance name, defaults to HOSTNAME environment variable value or hostname if HOSTNAME var is empty
    ;instance_name = ${HOSTNAME}

    #################################### Paths ####################################
    [paths]
    # Path to where grafana can store temp files, sessions, and the sqlite3 db (if that is used)
    ;data = /var/lib/grafana

    # Temporary files in `data` directory older than given duration will be removed
    ;temp_data_lifetime = 24h

    # Directory where grafana can store logs
    ;logs = /var/log/grafana

    # Directory where grafana will automatically scan and look for plugins
    ;plugins = /var/lib/grafana/plugins

    # folder that contains provisioning config files that grafana will apply on startup and while running.
    ;provisioning = conf/provisioning

    #################################### Server ####################################
    [server]
    # Protocol (http, https, socket)
    ;protocol = http

    # The ip address to bind to, empty will bind to all interfaces
    ;http_addr =

    # The http port  to use
    http_port = 3000

    # The public facing domain name used to access grafana from a browser
    ;domain = localhost

    # Redirect to correct domain if host header does not match domain
    # Prevents DNS rebinding attacks
    ;enforce_domain = false

    # The full public facing url you use in browser, used for redirects and emails
    # If you use reverse proxy and sub path specify full url (with sub path)
    ;root_url = http://localhost:3000

    # Log web requests
    ;router_logging = false

    # the path relative working path
    ;static_root_path = public

    # enable gzip
    ;enable_gzip = false

    # https certs & key file
    ;cert_file =
    ;cert_key =

    # Unix socket path
    ;socket =

    #################################### Database ####################################
    [database]
    # You can configure the database connection by specifying type, host, name, user and password
    # as separate properties or as one string using the url properties.

    # Either "mysql", "postgres" or "sqlite3", it's your choice
    ;type = sqlite3
    ;host = 127.0.0.1:3306
    ;name = grafana
    ;user = root
    # If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
    ;password =

    # Use either URL or the previous fields to configure the database
    # Example: mysql://user:secret@host:port/database
    ;url =

    # For "postgres" only, either "disable", "require" or "verify-full"
    ;ssl_mode = disable

    # For "sqlite3" only, path relative to data_path setting
    ;path = grafana.db

    # Max idle conn setting default is 2
    ;max_idle_conn = 2

    # Max conn setting default is 0 (mean not set)
    ;max_open_conn =

    # Connection Max Lifetime default is 14400 (means 14400 seconds or 4 hours)
    ;conn_max_lifetime = 14400

    # Set to true to log the sql calls and execution times.
    ;log_queries =

    #################################### Session ####################################
    [session]
    # Either "memory", "file", "redis", "mysql", "postgres", default is "file"
    ;provider = file

    # Provider config options
    # memory: not have any config yet
    # file: session dir path, is relative to grafana data_path
    # redis: config like redis server e.g. `addr=127.0.0.1:6379,pool_size=100,db=grafana`
    # mysql: go-sql-driver/mysql dsn config string, e.g. `user:password@tcp(127.0.0.1:3306)/database_name`
    # postgres: user=a password=b host=localhost port=5432 dbname=c sslmode=disable
    ;provider_config = sessions

    # Session cookie name
    ;cookie_name = grafana_sess

    # If you use session in https only, default is false
    ;cookie_secure = false

    # Session life time, default is 86400
    ;session_life_time = 86400

    #################################### Data proxy ###########################
    [dataproxy]

    # This enables data proxy logging, default is false
    ;logging = false

    #################################### Analytics ####################################
    [analytics]
    # Server reporting, sends usage counters to stats.grafana.org every 24 hours.
    # No ip addresses are being tracked, only simple counters to track
    # running instances, dashboard and error counts. It is very helpful to us.
    # Change this option to false to disable reporting.
    ;reporting_enabled = true

    # Set to false to disable all checks to https://grafana.net
    # for new versions (grafana itself and plugins), check is used
    # in some UI views to notify that grafana or plugin update exists
    # This option does not cause any auto updates, nor send any information
    # only a GET request to http://grafana.com to get latest versions
    ;check_for_updates = true

    # Google Analytics universal tracking code, only enabled if you specify an id here
    ;google_analytics_ua_id =

    #################################### Security ####################################
    [security]
    # default admin user, created on startup
    ;admin_user = admin

    # default admin password, can be changed before first start of grafana,  or in profile settings
    ;admin_password = admin

    # used for signing
    ;secret_key = xxxxx

    # Auto-login remember days
    ;login_remember_days = 7
    ;cookie_username = grafana_user
    ;cookie_remember_name = grafana_remember

    # disable gravatar profile images
    ;disable_gravatar = false

    # data source proxy whitelist (ip_or_domain:port separated by spaces)
    ;data_source_proxy_whitelist =

    # disable protection against brute force login attempts
    ;disable_brute_force_login_protection = false

    #################################### Snapshots ###########################
    [snapshots]
    # snapshot sharing options
    ;external_enabled = true
    ;external_snapshot_url = https://snapshots-origin.raintank.io
    ;external_snapshot_name = Publish to snapshot.raintank.io

    # remove expired snapshot
    ;snapshot_remove_expired = true

    #################################### Dashboards History ##################
    [dashboards]
    # Number dashboard versions to keep (per dashboard). Default: 20, Minimum: 1
    ;versions_to_keep = 20

    #################################### Users ###############################
    [users]
    # disable user signup / registration
    ;allow_sign_up = true

    # Allow non admin users to create organizations
    ;allow_org_create = true

    # Set to true to automatically assign new users to the default organization (id 1)
    ;auto_assign_org = true

    # Default role new users will be automatically assigned (if disabled above is set to true)
    ;auto_assign_org_role = Viewer

    # Background text for the user field on the login page
    ;login_hint = email or username

    # Default UI theme ("dark" or "light")
    ;default_theme = dark

    # External user management, these options affect the organization users view
    ;external_manage_link_url =
    ;external_manage_link_name =
    ;external_manage_info =

    # Viewers can edit/inspect dashboard settings in the browser. But not save the dashboard.
    ;viewers_can_edit = false

    [auth]
    # Set to true to disable (hide) the login form, useful if you use OAuth, defaults to false
    ;disable_login_form = false

    # Set to true to disable the signout link in the side menu. useful if you use auth.proxy, defaults to false
    ;disable_signout_menu = false

    # URL to redirect the user to after sign out
    ;signout_redirect_url =

    #################################### Anonymous Auth ##########################
    [auth.anonymous]
    # enable anonymous access
    ;enabled = false

    # specify organization name that should be used for unauthenticated users
    ;org_name = Main Org.

    # specify role for unauthenticated users
    ;org_role = Viewer

    #################################### Github Auth ##########################
    [auth.github]
    ;enabled = false
    ;allow_sign_up = true
    ;client_id = some_id
    ;client_secret = some_secret
    ;scopes = user:email,read:org
    ;auth_url = https://github.com/login/oauth/authorize
    ;token_url = https://github.com/login/oauth/access_token
    ;api_url = https://api.github.com/user
    ;team_ids =
    ;allowed_organizations =

    #################################### Google Auth ##########################
    [auth.google]
    ;enabled = false
    ;allow_sign_up = true
    ;client_id = some_client_id
    ;client_secret = some_client_secret
    ;scopes = https://www.googleapis.com/auth/userinfo.profile https://www.googleapis.com/auth/userinfo.email
    ;auth_url = https://accounts.google.com/o/oauth2/auth
    ;token_url = https://accounts.google.com/o/oauth2/token
    ;api_url = https://www.googleapis.com/oauth2/v1/userinfo
    ;allowed_domains =

    #################################### Generic OAuth ##########################
    [auth.generic_oauth]
    ;enabled = false
    ;name = OAuth
    ;allow_sign_up = true
    ;client_id = some_id
    ;client_secret = some_secret
    ;scopes = user:email,read:org
    ;auth_url = https://foo.bar/login/oauth/authorize
    ;token_url = https://foo.bar/login/oauth/access_token
    ;api_url = https://foo.bar/user
    ;team_ids =
    ;allowed_organizations =
    ;tls_skip_verify_insecure = false
    ;tls_client_cert =
    ;tls_client_key =
    ;tls_client_ca =

    #################################### Grafana.com Auth ####################
    [auth.grafana_com]
    ;enabled = false
    ;allow_sign_up = true
    ;client_id = some_id
    ;client_secret = some_secret
    ;scopes = user:email
    ;allowed_organizations =

    #################################### Auth Proxy ##########################
    [auth.proxy]
    ;enabled = false
    ;header_name = X-WEBAUTH-USER
    ;header_property = username
    ;auto_sign_up = true
    ;ldap_sync_ttl = 60
    ;whitelist = 192.168.1.1, 192.168.2.1
    ;headers = Email:X-User-Email, Name:X-User-Name

    #################################### Basic Auth ##########################
    [auth.basic]
    ;enabled = true

    #################################### Auth LDAP ##########################
    [auth.ldap]
    enabled = true
    ;config_file = /etc/grafana/ldap.toml
    ;allow_sign_up = true

    #################################### SMTP / Emailing ##########################
    [smtp]
    enabled = true
    host = smtp.exmail.qq.com:465
    user = noreply@xxx.com
    # If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
    password = AFxxxxxxYoQ2G
    from_address = noreply@xxxx.com
    from_name = Hawkeye
    ;cert_file =
    ;key_file =
    ;skip_verify = false
    ;from_address = admin@grafana.localhost
    ;from_name = Grafana
    # EHLO identity in SMTP dialog (defaults to instance_name)
    ;ehlo_identity = dashboard.example.com

    [emails]
    ;welcome_email_on_sign_up = false

    #################################### Logging ##########################
    [log]
    # Either "console", "file", "syslog". Default is console and  file
    # Use space to separate multiple modes, e.g. "console file"
    ;mode = console file

    # Either "debug", "info", "warn", "error", "critical", default is "info"
    ;level = info

    # optional settings to set different levels for specific loggers. Ex filters = sqlstore:debug
    ;filters =

    # For "console" mode only
    [log.console]
    ;level =

    # log line format, valid options are text, console and json
    ;format = console

    # For "file" mode only
    [log.file]
    ;level =

    # log line format, valid options are text, console and json
    ;format = text

    # This enables automated log rotate(switch of following options), default is true
    ;log_rotate = true

    # Max line number of single file, default is 1000000
    ;max_lines = 1000000

    # Max size shift of single file, default is 28 means 1 << 28, 256MB
    ;max_size_shift = 28

    # Segment log daily, default is true
    ;daily_rotate = true

    # Expired days of log file(delete after max days), default is 7
    ;max_days = 7

    [log.syslog]
    ;level =

    # log line format, valid options are text, console and json
    ;format = text

    # Syslog network type and address. This can be udp, tcp, or unix. If left blank, the default unix endpoints will be used.
    ;network =
    ;address =

    # Syslog facility. user, daemon and local0 through local7 are valid.
    ;facility =

    # Syslog tag. By default, the process' argv[0] is used.
    ;tag =

    #################################### Alerting ############################
    [alerting]
    # Disable alerting engine & UI features
    ;enabled = true
    # Makes it possible to turn off alert rule execution but alerting UI is visible
    ;execute_alerts = true

    # Default setting for new alert rules. Defaults to categorize error and timeouts as alerting. (alerting, keep_state)
    ;error_or_timeout = alerting

    # Default setting for how Grafana handles nodata or null values in alerting. (alerting, no_data, keep_state, ok)
    ;nodata_or_nullvalues = no_data

    # Alert notifications can include images, but rendering many images at the same time can overload the server
    # This limit will protect the server from render overloading and make sure notifications are sent out quickly
    ;concurrent_render_limit = 5

    #################################### Explore #############################
    [explore]
    # Enable the Explore section
    ;enabled = false

    #################################### Internal Grafana Metrics ##########################
    # Metrics available at HTTP API Url /metrics
    [metrics]
    # Disable / Enable internal metrics
    ;enabled           = true

    # Publish interval
    ;interval_seconds  = 10

    # Send internal metrics to Graphite
    [metrics.graphite]
    # Enable by setting the address setting (ex localhost:2003)
    ;address =
    ;prefix = prod.grafana.%(instance_name)s.

    #################################### Distributed tracing ############
    [tracing.jaeger]
    # Enable by setting the address sending traces to jaeger (ex localhost:6831)
    ;address = localhost:6831
    # Tag that will always be included in when creating new spans. ex (tag1:value1,tag2:value2)
    ;always_included_tag = tag1:value1
    # Type specifies the type of the sampler: const, probabilistic, rateLimiting, or remote
    ;sampler_type = const
    # jaeger samplerconfig param
    # for "const" sampler, 0 or 1 for always false/true respectively
    # for "probabilistic" sampler, a probability between 0 and 1
    # for "rateLimiting" sampler, the number of spans per second
    # for "remote" sampler, param is the same as for "probabilistic"
    # and indicates the initial sampling rate before the actual one
    # is received from the mothership
    ;sampler_param = 1

    #################################### Grafana.com integration  ##########################
    # Url used to import dashboards directly from Grafana.com
    [grafana_com]
    ;url = https://grafana.com

    #################################### External image storage ##########################
    [external_image_storage]
    # Used for uploading images to public servers so they can be included in slack/email messages.
    # you can choose between (s3, webdav, gcs, azure_blob, local)
    ;provider =

    [external_image_storage.s3]
    ;bucket =
    ;region =
    ;path =
    ;access_key =
    ;secret_key =

    [external_image_storage.webdav]
    ;url =
    ;public_url =
    ;username =
    ;password =

    [external_image_storage.gcs]
    ;key_file =
    ;bucket =
    ;path =

    [external_image_storage.azure_blob]
    ;account_name =
    ;account_key =
    ;container_name =

    [external_image_storage.local]
    # does not require any configuration

    [rendering]
    # Options to configure external image rendering server like https://github.com/grafana/grafana-image-renderer
    ;server_url =
    ;callback_url =

---

Everything marked xxx above has been modified to hide sensitive company information. Adjust these values to match your own environment.

If you do not need LDAP authentication, you can delete ldap.toml from the ConfigMap and change enabled = true to enabled = false in the following section of grafana.ini:

    #################################### Auth LDAP ##########################
    [auth.ldap]
    enabled = true
    ;config_file = /etc/grafana/ldap.toml
    ;allow_sign_up = true
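
As an alternative to editing grafana.ini, the same switch could probably be flipped with an environment variable on the grafana container in deployment.yaml, since Grafana maps settings to variables of the form GF_<SECTION>_<KEY>. A minimal sketch, assuming the container spec used in this article; only the second env entry is new:

```yaml
# Fragment of the grafana container spec in deployment.yaml.
# GF_AUTH_LDAP_ENABLED corresponds to `enabled` in the [auth.ldap] section.
env:
  - name: GF_PATHS_PROVISIONING
    value: /var/lib/grafana/provisioning
  - name: GF_AUTH_LDAP_ENABLED
    value: "false"   # quoted on purpose: k8s env values must be strings
```

With this approach ldap.toml can stay in the ConfigMap unused, which makes it easy to turn LDAP back on later.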
  

deployment.yaml is as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hawkeye-grafana
  namespace: sgt
  labels:
    app: hawkeye-grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hawkeye-grafana
  template:
    metadata:
      labels:
        app: hawkeye-grafana
    spec:
      containers:
      - image: grafana/grafana:6.0.0
        name: grafana
        imagePullPolicy: IfNotPresent
        env:
          - name: GF_PATHS_PROVISIONING
            value: /var/lib/grafana/provisioning
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
            memory: 100Mi
          requests:
            cpu: 100m
            memory: 100Mi
        readinessProbe:
          httpGet:
            path: /login
            port: 3000
          # initialDelaySeconds: 30
          # timeoutSeconds: 1
        volumeMounts:
        - name: grafana-persistent-storage
          mountPath: /var/lib/grafana/
        - name: config
          mountPath: /etc/grafana/
      initContainers:
      - name: "init-chown-data"
        image: "busybox:latest"
        imagePullPolicy: "IfNotPresent"
        command: ["chown", "-R", "472:472", "/var/lib/grafana/"]
        volumeMounts:
        - name: grafana-persistent-storage
          mountPath: /var/lib/grafana/
          subPath: ""
      volumes:
      - name: config
        configMap:
          name: hawkeye-grafana-cm
      - name: grafana-persistent-storage
        persistentVolumeClaim:
          claimName: hawkeye-grafana-claim
---

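Note the added initContainers section. It mainly solves the write-permission problem on the mounted volume by changing the ownership of /var/lib/grafana/ to 472:472, the UID and GID that the Grafana container runs as.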
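
If the volume type supports ownership management, a pod-level securityContext may achieve the same result without an init container. The fragment below is a sketch under that assumption. It is not what the manifest above uses, and whether it works depends on the storage backend:

```yaml
# Fragment of the pod template in deployment.yaml.
spec:
  template:
    spec:
      securityContext:
        runAsUser: 472   # same UID the init container above chowns the data to
        fsGroup: 472     # kubelet applies this GID to the mounted volume
```

The init container is the more portable choice because it does not rely on the volume plugin honoring fsGroup.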

service.yaml is as follows:

apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
  labels:
    app: hawkeye-grafana
  name: hawkeye-grafana
  namespace: sgt
spec:
  type: LoadBalancer
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 3000
  selector:
    app: hawkeye-grafana

---
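
The Service above relies on an AWS-specific annotation to get a network load balancer. On a cluster without a cloud load balancer, one possible substitute is a NodePort Service. This is only a sketch: the nodePort value is a hypothetical choice, and an Ingress may suit your environment better:

```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: hawkeye-grafana
  name: hawkeye-grafana
  namespace: sgt
spec:
  type: NodePort
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 3000
    nodePort: 30300   # hypothetical; must fall in the cluster's NodePort range
  selector:
    app: hawkeye-grafana
```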

pvc.yaml is as follows:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: hawkeye-grafana-claim
  namespace: sgt
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
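
The claim above names no storage class, so it depends on the cluster having a default StorageClass. When porting to a cluster without one, the class can be set explicitly. In this sketch the class name `standard` is a hypothetical placeholder, not a value from the original setup:

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: hawkeye-grafana-claim
  namespace: sgt
spec:
  storageClassName: standard   # hypothetical; list real names with `kubectl get storageclass`
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
```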

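Once everything has been applied and the service is reachable, log in with admin/admin.

[Image: Grafana UI after login]

As you can see, a new Explore icon has been added on the left side.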

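Summary

Grafana first reads its configuration from /usr/share/grafana/conf/defaults.ini and then from /etc/grafana/grafana.ini. When the same parameter appears in both, the value in /etc/grafana/grafana.ini overrides the one in /usr/share/grafana/conf/defaults.ini. Parameters passed on the command line override the same parameters in /etc/grafana/grafana.ini, and finally environment variables override the command-line values.

Below are some of the default environment variables: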

GF_PATHS_CONFIG /etc/grafana/grafana.ini
GF_PATHS_DATA /var/lib/grafana
GF_PATHS_HOME /usr/share/grafana
GF_PATHS_LOGS /var/log/grafana
GF_PATHS_PLUGINS /var/lib/grafana/plugins
GF_PATHS_PROVISIONING /etc/grafana/provisioning
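
Settings in grafana.ini can be overridden through environment variables named GF_<SECTION>_<KEY>. As a sketch, the default admin/admin password could be replaced before the first start by reading it from a Secret. The Secret name, its key, and the URL below are hypothetical placeholders, and the Secret would have to be created separately:

```yaml
# Fragment of the grafana container spec in deployment.yaml.
env:
  - name: GF_SECURITY_ADMIN_PASSWORD      # [security] admin_password
    valueFrom:
      secretKeyRef:
        name: hawkeye-grafana-admin       # hypothetical Secret
        key: admin-password               # hypothetical key
  - name: GF_SERVER_ROOT_URL              # [server] root_url
    value: "http://grafana.example.com/"  # placeholder URL
```

Keeping credentials in a Secret rather than in the ConfigMap also avoids storing them in plain text alongside the rest of the configuration.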
