Skywalking-06: OAL Basics

Posted by switchvov on 2021-08-16

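OAL Fundamentals

Introduction

OAL (Observability Analysis Language) is a language for analyzing streaming data.

Because OAL focuses on metrics for Service, Service Instance, and Endpoint, it is very easy to learn and use.

OAL uses ANTLR and Javassist to turn OAL scripts into dynamically generated class files.

Since version 6.3, the OAL engine has been embedded in the OAP server and can be regarded as oal-rt (OAL Runtime). OAL scripts live in the OAL configuration directory (/config/oal); users can edit a script and restart the server for the change to take effect. Note that OAL is still a compiled language: oal-rt generates Java code dynamically.

If you set the environment variable SW_OAL_ENGINE_DEBUG=Y, the generated class files can be found in the oal-rt directory under the working directory.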

Syntax

// Declare a metric
METRICS_NAME = from(SCOPE.(* | [FIELD][,FIELD ...])) // read data from a SCOPE
[.filter(FIELD OP [INT | STRING])] // optionally filter out part of the data
.FUNCTION([PARAM][, PARAM ...]) // aggregate the data with an aggregation function

// Disable a metric
disable(METRICS_NAME);
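
To make the shape of a metric declaration concrete, here is a toy Java sketch that splits one declaration into its parts with a regular expression. It is only an illustration: the real OAL engine parses scripts with an ANTLR grammar, and the class name OalLineSketch and its fields are made up for this example. It handles a single field and at most one filter.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy splitter for one OAL metric declaration; NOT the real ANTLR-based parser. */
final class OalLineSketch {
    // Shape handled: NAME = from(SCOPE.FIELD)[.filter(EXPR)].FUNCTION(PARAMS);
    private static final Pattern LINE = Pattern.compile(
        "\\s*(\\w+)\\s*=\\s*from\\((\\w+)\\.(\\*|\\w+)\\)" // name, scope, field
            + "(?:\\.filter\\(([^)]*)\\))?"                 // optional filter expression
            + "\\.(\\w+)\\([^)]*\\)\\s*;\\s*");             // aggregation function (params ignored)

    final String name;
    final String scope;
    final String field;
    final String filter;   // null when the declaration has no filter
    final String function;

    private OalLineSketch(Matcher m) {
        this.name = m.group(1);
        this.scope = m.group(2);
        this.field = m.group(3);
        this.filter = m.group(4);
        this.function = m.group(5);
    }

    static OalLineSketch parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a metric declaration: " + line);
        }
        return new OalLineSketch(m);
    }

    public static void main(String[] args) {
        OalLineSketch p = parse("instance_jvm_memory_heap = "
            + "from(ServiceInstanceJVMMemory.used).filter(heapStatus == true).longAvg();");
        System.out.println(p.name + ": " + p.function + " of " + p.scope + "." + p.field
            + " where " + p.filter);
    }
}
```

Reading a declaration this way shows the three moving parts: where the data comes from (scope and field), which samples are kept (filter), and how they are combined (function).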

Syntax example

oap-server/server-bootstrap/src/main/resources/oal/java-agent.oal

// Read `used` from ServiceInstanceJVMMemory, keep only samples whose heapStatus is true, and take the long average
instance_jvm_memory_heap = from(ServiceInstanceJVMMemory.used).filter(heapStatus == true).longAvg();

org.apache.skywalking.oap.server.core.source.ServiceInstanceJVMMemory

@ScopeDeclaration(id = SERVICE_INSTANCE_JVM_MEMORY, name = "ServiceInstanceJVMMemory", catalog = SERVICE_INSTANCE_CATALOG_NAME)
@ScopeDefaultColumn.VirtualColumnDefinition(fieldName = "entityId", columnName = "entity_id", isID = true, type = String.class)
public class ServiceInstanceJVMMemory extends Source {
    @Override
    public int scope() {
        return DefaultScopeDefine.SERVICE_INSTANCE_JVM_MEMORY;
    }

    @Override
    public String getEntityId() {
        return String.valueOf(id);
    }

    @Getter @Setter
    private String id;
    @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "name", requireDynamicActive = true)
    private String name;
    @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "service_name", requireDynamicActive = true)
    private String serviceName;
    @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "service_id")
    private String serviceId;
    @Getter @Setter
    private boolean heapStatus;
    @Getter @Setter
    private long init;
    @Getter @Setter
    private long max;
    @Getter @Setter
    private long used;
    @Getter @Setter
    private long committed;
}
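
To get a feel for what the instance_jvm_memory_heap declaration computes, the sketch below mimics it in plain Java. MemorySample is a hypothetical stand-in that keeps only the two fields the expression touches, and the integer division is an assumption about how a long average rounds. The real classes are generated by oal-rt and also group values by entity and time bucket, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java imitation of from(ServiceInstanceJVMMemory.used).filter(heapStatus == true).longAvg(). */
final class LongAvgSketch {
    /** Hypothetical stand-in for the Source class, reduced to the fields used here. */
    static final class MemorySample {
        final boolean heapStatus;
        final long used;

        MemorySample(boolean heapStatus, long used) {
            this.heapStatus = heapStatus;
            this.used = used;
        }
    }

    /** Averages `used` over the samples whose heapStatus is true; returns 0 if none match. */
    static long heapUsedLongAvg(List<MemorySample> samples) {
        long summation = 0;
        long count = 0;
        for (MemorySample s : samples) {
            if (s.heapStatus) {          // the .filter(heapStatus == true) step
                summation += s.used;     // the from(... .used) field
                count++;
            }
        }
        return count == 0 ? 0 : summation / count; // the .longAvg() step
    }

    public static void main(String[] args) {
        List<MemorySample> samples = Arrays.asList(
            new MemorySample(true, 100),
            new MemorySample(false, 999), // non-heap sample, filtered out
            new MemorySample(true, 300));
        System.out.println(heapUsedLongAvg(samples)); // prints 200
    }
}
```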

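Official documentation for reference: Observability Analysis Language

Analyzing how OAL works through an example

The missing class-loading metrics

The default APM/Instance page has no information about JVM classes (see the figure below), so this time we fill that gap and use the example to analyze how OAL works.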

[Figure: the default APM/Instance dashboard, without JVM class metrics]

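Skywalking-04: Extending Metric monitoring information covered how to add metrics when a Source class already exists.

This time we define everything ourselves, including the Source class and the OAL lexer and parser keywords.

Official documentation for reference: Source and Scope extension for new metrics

Deciding which metrics to add

Based on the article "An analysis of Java ManagementFactory", we settle on three metrics: the number of currently loaded classes, the number of unloaded classes, and the total number of classes loaded.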

ClassLoadingMXBean classLoadingMXBean = ManagementFactory.getClassLoadingMXBean();
// Number of classes currently loaded
int loadedClassCount = classLoadingMXBean.getLoadedClassCount();
// Number of classes unloaded so far
long unloadedClassCount = classLoadingMXBean.getUnloadedClassCount();
// Total number of classes loaded since the JVM started
long totalLoadedClassCount = classLoadingMXBean.getTotalLoadedClassCount();
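
For readers who want to try these three calls on their own JVM, here is a self-contained sketch. The class name ClassLoadingSnapshot is hypothetical. Because the total counts every class loaded since the JVM started, it should be roughly the current count plus the unloaded count, although the three reads are not atomic.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

/** Reads the three class-loading counters of the running JVM in one go. */
final class ClassLoadingSnapshot {
    final int loaded;     // classes currently loaded
    final long unloaded;  // classes unloaded since JVM start
    final long total;     // classes loaded since JVM start

    private ClassLoadingSnapshot(int loaded, long unloaded, long total) {
        this.loaded = loaded;
        this.unloaded = unloaded;
        this.total = total;
    }

    static ClassLoadingSnapshot take() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        // Read the current count first: the total only grows, so total >= loaded holds.
        int loaded = bean.getLoadedClassCount();
        long unloaded = bean.getUnloadedClassCount();
        long total = bean.getTotalLoadedClassCount();
        return new ClassLoadingSnapshot(loaded, unloaded, total);
    }

    public static void main(String[] args) {
        ClassLoadingSnapshot s = take();
        System.out.println("loaded=" + s.loaded
            + " unloaded=" + s.unloaded
            + " total=" + s.total);
    }
}
```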

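Define the agent-to-OAP-server communication class

Add the following definitions to the protocol file apm-protocol/apm-network/src/main/proto/language-agent/JVMMetric.proto.

Running mvn clean package -DskipTests=true in the apm-protocol/apm-network directory generates the corresponding Java classes. org.apache.skywalking.apm.network.language.agent.v3.Class is the class we actually work with in code.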

message Class {
  int64 loadedClassCount = 1;
  int64 unloadedClassCount = 3;
  int64 totalLoadedClassCount = 2;
}

message JVMMetric {
    int64 time = 1;
    CPU cpu = 2;
    repeated Memory memory = 3;
    repeated MemoryPool memoryPool = 4;
    repeated GC gc = 5;
    Thread thread = 6;
    // Add the Class definition to the JVM metrics
    Class clazz = 7;
}

Collect the metrics in the agent and send them to the OAP server

Collect the Class metrics

package org.apache.skywalking.apm.agent.core.jvm.clazz;

import org.apache.skywalking.apm.network.language.agent.v3.Class;

import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

public enum ClassProvider {
    /**
     * instance
     */
    INSTANCE;

    private final ClassLoadingMXBean classLoadingMXBean;

    ClassProvider() {
        this.classLoadingMXBean = ManagementFactory.getClassLoadingMXBean();
    }
	
    // Build the Class metrics
    public Class getClassMetrics() {
        int loadedClassCount = classLoadingMXBean.getLoadedClassCount();
        long unloadedClassCount = classLoadingMXBean.getUnloadedClassCount();
        long totalLoadedClassCount = classLoadingMXBean.getTotalLoadedClassCount();
        return Class.newBuilder().setLoadedClassCount(loadedClassCount)
                .setUnloadedClassCount(unloadedClassCount)
                .setTotalLoadedClassCount(totalLoadedClassCount)
                .build();
    }

}

In org.apache.skywalking.apm.agent.core.jvm.JVMService#run, set the Class metrics on the JVM metrics builder

    @Override
    public void run() {
        long currentTimeMillis = System.currentTimeMillis();
        try {
            JVMMetric.Builder jvmBuilder = JVMMetric.newBuilder();
            jvmBuilder.setTime(currentTimeMillis);
            jvmBuilder.setCpu(CPUProvider.INSTANCE.getCpuMetric());
            jvmBuilder.addAllMemory(MemoryProvider.INSTANCE.getMemoryMetricList());
            jvmBuilder.addAllMemoryPool(MemoryPoolProvider.INSTANCE.getMemoryPoolMetricsList());
            jvmBuilder.addAllGc(GCProvider.INSTANCE.getGCList());
            jvmBuilder.setThread(ThreadProvider.INSTANCE.getThreadMetrics());
            // Set the Class metrics
            jvmBuilder.setClazz(ClassProvider.INSTANCE.getClassMetrics());
            // Put the JVM metrics into a blocking queue;
            // org.apache.skywalking.apm.agent.core.jvm.JVMMetricsSender#run then sends them to the OAP server
            sender.offer(jvmBuilder.build());
        } catch (Exception e) {
            LOGGER.error(e, "Collect JVM info fail.");
        }
    }
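
The run method hands its result to a blocking queue that a separate sender drains. Below is a minimal sketch of that hand-off, assuming a bounded queue that drops the oldest entry when it is full so the collector thread never blocks. The overflow policy and the name MetricBufferSketch are assumptions made for illustration, not confirmed details of JVMMetricsSender.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical bounded buffer between a metrics collector and a sender. */
final class MetricBufferSketch<T> {
    private final LinkedBlockingQueue<T> queue;

    MetricBufferSketch(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    /** Called by the collector; never blocks. When full, drops the oldest entry and retries once. */
    void offer(T metric) {
        if (!queue.offer(metric)) {
            queue.poll();
            queue.offer(metric);
        }
    }

    /** Called by the sender; removes and returns everything buffered so far. */
    List<T> drain() {
        List<T> batch = new ArrayList<>();
        queue.drainTo(batch);
        return batch;
    }

    public static void main(String[] args) {
        MetricBufferSketch<String> buffer = new MetricBufferSketch<>(2);
        buffer.offer("m1");
        buffer.offer("m2");
        buffer.offer("m3"); // buffer is full, so "m1" is dropped
        System.out.println(buffer.drain()); // prints [m2, m3]
    }
}
```

Dropping the oldest sample favors fresh data over completeness, which is a reasonable trade-off for periodic gauges like the class counts.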

Create the Source class

public class DefaultScopeDefine {
    public static final int SERVICE_INSTANCE_JVM_CLASS = 11000;

    /** Catalog of scope, the metrics processor could use this to group all generated metrics by oal rt. */
    public static final String SERVICE_INSTANCE_CATALOG_NAME = "SERVICE_INSTANCE";
}

package org.apache.skywalking.oap.server.core.source;

import lombok.Getter;
import lombok.Setter;

import static org.apache.skywalking.oap.server.core.source.DefaultScopeDefine.SERVICE_INSTANCE_CATALOG_NAME;
import static org.apache.skywalking.oap.server.core.source.DefaultScopeDefine.SERVICE_INSTANCE_JVM_CLASS;

@ScopeDeclaration(id = SERVICE_INSTANCE_JVM_CLASS, name = "ServiceInstanceJVMClass", catalog = SERVICE_INSTANCE_CATALOG_NAME)
@ScopeDefaultColumn.VirtualColumnDefinition(fieldName = "entityId", columnName = "entity_id", isID = true, type = String.class)
public class ServiceInstanceJVMClass extends Source {
    @Override
    public int scope() {
        return SERVICE_INSTANCE_JVM_CLASS;
    }

    @Override
    public String getEntityId() {
        return String.valueOf(id);
    }

    @Getter @Setter
    private String id;
    @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "name", requireDynamicActive = true)
    private String name;
    @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "service_name", requireDynamicActive = true)
    private String serviceName;
    @Getter @Setter @ScopeDefaultColumn.DefinedByField(columnName = "service_id")
    private String serviceId;
    @Getter @Setter
    private long loadedClassCount;
    @Getter @Setter
    private long unloadedClassCount;
    @Getter @Setter
    private long totalLoadedClassCount;
}

Send the data received from the agent to SourceReceiver

Modify org.apache.skywalking.oap.server.analyzer.provider.jvm.JVMSourceDispatcher as follows

    public void sendMetric(String service, String serviceInstance, JVMMetric metrics) {
        long minuteTimeBucket = TimeBucket.getMinuteTimeBucket(metrics.getTime());

        final String serviceId = IDManager.ServiceID.buildId(service, NodeType.Normal);
        final String serviceInstanceId = IDManager.ServiceInstanceID.buildId(serviceId, serviceInstance);

        this.sendToCpuMetricProcess(
            service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getCpu());
        this.sendToMemoryMetricProcess(
            service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getMemoryList());
        this.sendToMemoryPoolMetricProcess(
            service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getMemoryPoolList());
        this.sendToGCMetricProcess(
            service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getGcList());
        this.sendToThreadMetricProcess(
                service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getThread());
        // Handle the Class metrics
        this.sendToClassMetricProcess(
                service, serviceId, serviceInstance, serviceInstanceId, minuteTimeBucket, metrics.getClazz());
    }

    private void sendToClassMetricProcess(String service,
            String serviceId,
            String serviceInstance,
            String serviceInstanceId,
            long timeBucket,
            Class clazz) {
        // Assemble the Source object
        ServiceInstanceJVMClass serviceInstanceJVMClass = new ServiceInstanceJVMClass();
        serviceInstanceJVMClass.setId(serviceInstanceId);
        serviceInstanceJVMClass.setName(serviceInstance);
        serviceInstanceJVMClass.setServiceId(serviceId);
        serviceInstanceJVMClass.setServiceName(service);
        serviceInstanceJVMClass.setLoadedClassCount(clazz.getLoadedClassCount());
        serviceInstanceJVMClass.setUnloadedClassCount(clazz.getUnloadedClassCount());
        serviceInstanceJVMClass.setTotalLoadedClassCount(clazz.getTotalLoadedClassCount());
        serviceInstanceJVMClass.setTimeBucket(timeBucket);
        // Hand the Source object to SourceReceiver for processing
        sourceReceiver.receive(serviceInstanceJVMClass);
    }
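
Every Source sent by the dispatcher carries a minute time bucket, which is what lets samples from the same minute be aggregated together. The sketch below shows one way such a bucket can be derived, assuming it is the timestamp truncated to the minute and written as a yyyyMMddHHmm number. MinuteTimeBucketSketch and its explicit ZoneId parameter are hypothetical; this is not the TimeBucket implementation.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

/** Hypothetical minute-bucket calculation, assuming a yyyyMMddHHmm numeric format. */
final class MinuteTimeBucketSketch {
    static long minuteTimeBucket(long epochMillis, ZoneId zone) {
        ZonedDateTime t = Instant.ofEpochMilli(epochMillis).atZone(zone);
        // Seconds and milliseconds are discarded, so all samples within a minute collide.
        return t.getYear() * 100_000_000L
            + t.getMonthValue() * 1_000_000L
            + t.getDayOfMonth() * 10_000L
            + t.getHour() * 100L
            + t.getMinute();
    }

    public static void main(String[] args) {
        long millis = ZonedDateTime.of(2021, 8, 16, 12, 30, 45, 0, ZoneOffset.UTC)
            .toInstant().toEpochMilli();
        System.out.println(minuteTimeBucket(millis, ZoneOffset.UTC)); // prints 202108161230
    }
}
```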

Add the Source to the OAL lexer and parser definitions

Define the Class keyword in oap-server/oal-grammar/src/main/antlr4/org/apache/skywalking/oal/rt/grammar/OALLexer.g4

// Keywords

FROM: 'from';
FILTER: 'filter';
DISABLE: 'disable';
SRC_ALL: 'All';
SRC_SERVICE: 'Service';
SRC_SERVICE_INSTANCE: 'ServiceInstance';
SRC_ENDPOINT: 'Endpoint';
SRC_SERVICE_RELATION: 'ServiceRelation';
SRC_SERVICE_INSTANCE_RELATION: 'ServiceInstanceRelation';
SRC_ENDPOINT_RELATION: 'EndpointRelation';
SRC_SERVICE_INSTANCE_JVM_CPU: 'ServiceInstanceJVMCPU';
SRC_SERVICE_INSTANCE_JVM_MEMORY: 'ServiceInstanceJVMMemory';
SRC_SERVICE_INSTANCE_JVM_MEMORY_POOL: 'ServiceInstanceJVMMemoryPool';
SRC_SERVICE_INSTANCE_JVM_GC: 'ServiceInstanceJVMGC';
SRC_SERVICE_INSTANCE_JVM_THREAD: 'ServiceInstanceJVMThread';
SRC_SERVICE_INSTANCE_JVM_CLASS: 'ServiceInstanceJVMClass'; // add the Class keyword to the OAL lexer definition
SRC_DATABASE_ACCESS: 'DatabaseAccess';
SRC_SERVICE_INSTANCE_CLR_CPU: 'ServiceInstanceCLRCPU';
SRC_SERVICE_INSTANCE_CLR_GC: 'ServiceInstanceCLRGC';
SRC_SERVICE_INSTANCE_CLR_THREAD: 'ServiceInstanceCLRThread';
SRC_ENVOY_INSTANCE_METRIC: 'EnvoyInstanceMetric';

Add the Class keyword in oap-server/oal-grammar/src/main/antlr4/org/apache/skywalking/oal/rt/grammar/OALParser.g4

source
    : SRC_ALL | SRC_SERVICE | SRC_DATABASE_ACCESS | SRC_SERVICE_INSTANCE | SRC_ENDPOINT |
      SRC_SERVICE_RELATION | SRC_SERVICE_INSTANCE_RELATION | SRC_ENDPOINT_RELATION |
      SRC_SERVICE_INSTANCE_JVM_CPU | SRC_SERVICE_INSTANCE_JVM_MEMORY | SRC_SERVICE_INSTANCE_JVM_MEMORY_POOL | 
      SRC_SERVICE_INSTANCE_JVM_GC | SRC_SERVICE_INSTANCE_JVM_THREAD | SRC_SERVICE_INSTANCE_JVM_CLASS | // add the keyword defined in the lexer to the OAL parser rule
      SRC_SERVICE_INSTANCE_CLR_CPU | SRC_SERVICE_INSTANCE_CLR_GC | SRC_SERVICE_INSTANCE_CLR_THREAD |
      SRC_ENVOY_INSTANCE_METRIC |
      SRC_BROWSER_APP_PERF | SRC_BROWSER_APP_PAGE_PERF | SRC_BROWSER_APP_SINGLE_VERSION_PERF |
      SRC_BROWSER_APP_TRAFFIC | SRC_BROWSER_APP_PAGE_TRAFFIC | SRC_BROWSER_APP_SINGLE_VERSION_TRAFFIC
    ;

Running mvn clean package -DskipTests=true in the oap-server/oal-grammar directory generates the corresponding Java classes.

Define the OAL metrics

Add the OAL definitions of the Class metrics to oap-server/server-bootstrap/src/main/resources/oal/java-agent.oal

// Number of classes currently loaded
instance_jvm_class_loaded_class_count = from(ServiceInstanceJVMClass.loadedClassCount).longAvg();
// Number of classes unloaded so far
instance_jvm_class_unloaded_class_count = from(ServiceInstanceJVMClass.unloadedClassCount).longAvg();
// Total number of classes loaded since the JVM started
instance_jvm_class_total_loaded_class_count = from(ServiceInstanceJVMClass.totalLoadedClassCount).longAvg();

Configure the UI dashboard

Import the following dashboard configuration into the APM dashboard

{
  "name": "Instance",
  "children": [{
      "width": "3",
      "title": "Service Instance Load",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "metricName": "service_instance_cpm",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartLine",
      "unit": "CPM - calls per minute"
    },
    {
      "width": 3,
      "title": "Service Instance Throughput",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "metricName": "service_instance_throughput_received,service_instance_throughput_sent",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartLine",
      "unit": "Bytes"
    },
    {
      "width": "3",
      "title": "Service Instance Successful Rate",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "metricName": "service_instance_sla",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartLine",
      "unit": "%",
      "aggregation": "/",
      "aggregationNum": "100"
    },
    {
      "width": "3",
      "title": "Service Instance Latency",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "metricName": "service_instance_resp_time",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartLine",
      "unit": "ms"
    },
    {
      "width": 3,
      "title": "JVM CPU (Java Service)",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "metricName": "instance_jvm_cpu",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartLine",
      "unit": "%",
      "aggregation": "+",
      "aggregationNum": ""
    },
    {
      "width": 3,
      "title": "JVM Memory (Java Service)",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "metricName": "instance_jvm_memory_heap, instance_jvm_memory_heap_max,instance_jvm_memory_noheap, instance_jvm_memory_noheap_max",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartLine",
      "unit": "MB",
      "aggregation": "/",
      "aggregationNum": "1048576"
    },
    {
      "width": 3,
      "title": "JVM GC Time",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "metricName": "instance_jvm_young_gc_time, instance_jvm_old_gc_time",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartLine",
      "unit": "ms"
    },
    {
      "width": 3,
      "title": "JVM GC Count",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartBar",
      "metricName": "instance_jvm_young_gc_count, instance_jvm_old_gc_count"
    },
    {
      "width": 3,
      "title": "JVM Thread Count (Java Service)",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartLine",
      "metricName": "instance_jvm_thread_live_count, instance_jvm_thread_daemon_count, instance_jvm_thread_peak_count,instance_jvm_thread_deadlocked,instance_jvm_thread_monitor_deadlocked"
    },
    {
      "width": 3,
      "title": "JVM Thread State Count (Java Service)",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "metricName": "instance_jvm_thread_new_thread_count,instance_jvm_thread_runnable_thread_count,instance_jvm_thread_blocked_thread_count,instance_jvm_thread_wait_thread_count,instance_jvm_thread_time_wait_thread_count,instance_jvm_thread_terminated_thread_count",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartBar"
    },
    {
      "width": 3,
      "title": "JVM Class Count (Java Service)",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "metricName": "instance_jvm_class_loaded_class_count,instance_jvm_class_unloaded_class_count,instance_jvm_class_total_loaded_class_count",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartArea"
    },
    {
      "width": 3,
      "title": "CLR CPU  (.NET Service)",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "metricName": "instance_clr_cpu",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartLine",
      "unit": "%"
    },
    {
      "width": 3,
      "title": "CLR GC (.NET Service)",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "metricName": "instance_clr_gen0_collect_count, instance_clr_gen1_collect_count, instance_clr_gen2_collect_count",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartBar"
    },
    {
      "width": 3,
      "title": "CLR Heap Memory (.NET Service)",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "metricName": "instance_clr_heap_memory",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartLine",
      "unit": "MB",
      "aggregation": "/",
      "aggregationNum": "1048576"
    },
    {
      "width": 3,
      "title": "CLR Thread (.NET Service)",
      "height": "250",
      "entityType": "ServiceInstance",
      "independentSelector": false,
      "metricType": "REGULAR_VALUE",
      "queryMetricType": "readMetricsValues",
      "chartType": "ChartLine",
      "metricName": "instance_clr_available_completion_port_threads,instance_clr_available_worker_threads,instance_clr_max_completion_port_threads,instance_clr_max_worker_threads"
    }
  ]
}

Verify the result

The imported dashboard now shows the Class metrics

[Figure: the Instance dashboard showing the JVM Class Count chart]
