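Notes:
Here "mariadb audit log" means the audit log written by MariaDB.
The goal is to split each log entry into tab-separated fields.
The fluentd configuration file follows directly.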
<system>
  log_level error
</system>

<source>
  @type tail
  path /data/mysql_audit/*
  limit_recently_modified 86400
  open_on_every_update true
  tag mysql_audit
  read_from_head true
  pos_file /tmp/fluentd.pos
  <parse>
    @type multiline
    format_firstline /^\d{8}/
    format1 /^(?<Date>\d{8}) (?<Hour>\d{2}):(?<Min>\d{2}):(?<Sec>\d{2}),(?<host>[^,]+),(?<user>[^,]+),(?<ip>[^,]+),(?<connid>[^,]+),(?<queryid>[^,]+),(?<action>[^,]+),(?<db>[^,]+),(?<message>.*),(?<retcode>\d+)$/
  </parse>
</source>

# Keep only QUERY events; drop the monitoring user and information_schema queries.
<filter mysql_audit>
  @type grep
  <regexp>
    key action
    pattern QUERY
  </regexp>
  <exclude>
    key user
    pattern lagou_status
  </exclude>
  <exclude>
    key db
    pattern information_schema
  </exclude>
</filter>

# Collapse every run of whitespace (including newlines in multi-line queries) into one space.
# The earlier gsub(/\s/, ' ') line was redundant: this single substitution gives the same result.
<filter mysql_audit>
  @type record_transformer
  enable_ruby
  <record>
    message ${record["message"].gsub(/\s+/, ' ')}
  </record>
</filter>

<match mysql_audit>
  #@type stdout
  @type webhdfs
  host oss-hadoop-namenode-bjc-002
  path /mysql_audit/${Date}/${host}_${Hour}
  append true
  compress gzip
  <format>
    @type csv
    fields Date,Hour,Min,Sec,host,user,ip,action,db,message,retcode
    # Tab delimiter, matching the goal of tab-separated fields.
    delimiter "\t"
  </format>
  <buffer host,Date,Hour>
    @type memory
    flush_interval 20s
  </buffer>
</match>
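To see what the configuration does to a single log entry, the pipeline can be approximated in plain Ruby. This is only a rough sketch, not how fluentd runs internally: the method name `audit_to_tsv` and the sample line are made up for illustration, the `/m` flag is this sketch's choice so that `.` spans newlines in a multi-line query, and the quoting that the csv formatter may apply to each field is ignored.

```ruby
# Same pattern as format1 in the config; /m lets "." match newlines so a
# query that continues over several lines still ends up in one record.
AUDIT_LINE = /^(?<Date>\d{8}) (?<Hour>\d{2}):(?<Min>\d{2}):(?<Sec>\d{2}),(?<host>[^,]+),(?<user>[^,]+),(?<ip>[^,]+),(?<connid>[^,]+),(?<queryid>[^,]+),(?<action>[^,]+),(?<db>[^,]+),(?<message>.*),(?<retcode>\d+)$/m

# Output columns, in the order given by "fields" in the <format> section.
FIELDS = %w[Date Hour Min Sec host user ip action db message retcode].freeze

# Hypothetical helper: returns one tab-separated line, or nil if the entry
# does not parse or is dropped by the grep rules.
def audit_to_tsv(entry)
  m = AUDIT_LINE.match(entry)
  return nil unless m

  record = m.named_captures

  # grep filter: keep QUERY events, exclude the monitoring user and
  # information_schema.
  return nil unless record["action"] =~ /QUERY/
  return nil if record["user"] =~ /lagou_status/
  return nil if record["db"] =~ /information_schema/

  # record_transformer: collapse whitespace runs into a single space.
  record["message"] = record["message"].gsub(/\s+/, " ")

  FIELDS.map { |name| record[name] }.join("\t")
end

# Made-up audit entry with a query spanning two lines.
sample = "20230115 10:32:01,db01,app_user,10.0.0.5,42,1001,QUERY,shop,'SELECT 1\nFROM dual',0"
puts audit_to_tsv(sample)
```

Because `message` is matched greedily with `.*`, commas inside the SQL text stay in `message` and only the final `,<digits>` is taken as `retcode`. Note that `connid` and `queryid` are parsed but not listed in `fields`, so they do not appear in the output.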
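fluentd uses far less memory than logstash.
Parsing the same log, logstash used 700 MB while fluentd used 35 MB.
CPU usage is about the same, though; on machines with a heavy log volume, CPU reaches 100%.
It looks like regex filtering of the logs is what costs so much CPU.
Without open_on_every_update true, td-agent keeps holding the file descriptor of every file it has opened.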