Linux performance tuning tools: common commands for perf + flame graphs

Posted by 舊街角 on 2018-05-16

This post is copied straight from my personal notes, so please bear with the rough format.

perf performance analysis:
Generating a flame graph (run steps 1-4)
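1. perf record -e cpu-clock -g -p pid (or: perf record -F 99 -g -p pid, to sample at 99 Hz)
-g tells perf record to also record call graphs (stack traces)
-e cpu-clock makes perf record sample on the cpu-clock software event, a timer-based CPU clock (not the hardware cycles event)
-p specifies the pid of the process to record
    perf report -i perf.data (optional: inspect the recording before continuing)
-i specifies the file to read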
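2. perf script -i perf.data > perf.unfold
Use the perf script tool to parse perf.data into text. Only stdout is redirected, so any warnings on stderr stay out of perf.unfold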
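3. ./stackcollapse-perf.pl perf.unfold > perf.folded
Fold the stacks in perf.unfold into one line per unique stack, followed by its sample count. Only stdout is redirected, so any warnings on stderr stay out of perf.folded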
4. ./flamegraph.pl perf.folded > perf.svg
Finally, generate the SVG image
Flame graph project: git clone https://github.com/brendangregg/FlameGraph.git
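
To make the "folding" in step 3 more concrete, here is a minimal sketch of what the folded format looks like and how it can be summarized without drawing the graph. The stack lines below are made-up sample data, not real perf output; they only follow the "frame;frame;... count" layout that stackcollapse-perf.pl writes.

```shell
# Hypothetical folded stacks (made-up data): one line per unique stack,
# frames separated by ";", followed by a space and the sample count.
folded='myapp;main;parse_request;read_line 12
myapp;main;parse_request;strlen 3
myapp;main;send_reply;write 30
myapp;main;send_reply;strlen 5'

# Sum the sample counts per leaf (on-CPU) function, largest first.
leaf_totals=$(printf '%s\n' "$folded" |
  awk '{
         count = $NF                     # last field is the sample count
         stack = $0
         sub(/ [0-9]+$/, "", stack)      # drop the count, keep the stack
         n = split(stack, frames, ";")   # frames[n] is the leaf frame
         total[frames[n]] += count
       }
       END { for (f in total) print total[f], f }' |
  sort -rn)

printf '%s\n' "$leaf_totals"
```

With a real recording, the same awk and sort stages could read perf.folded instead of the inline sample. The leaf frames are the ones that form the top edge of the flame graph.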

1. Counting events (perf stat; stat is short for statistics)
# CPU counter statistics for the specified command:
perf stat command
# Detailed CPU counter statistics (includes extras) for the specified command:
perf stat -d command
# CPU counter statistics for the specified PID, until Ctrl-C:
perf stat -p PID
# CPU counter statistics for the entire system, for 5 seconds:
perf stat -a sleep 5
# Various basic CPU statistics, system wide, for 10 seconds:
perf stat -e cycles,instructions,cache-references,cache-misses,bus-cycles -a sleep 10
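
Next to the raw counts, perf stat also prints derived ratios such as instructions per cycle and the cache miss percentage. The sketch below shows how those two ratios follow from the counters; the numbers are hypothetical, chosen only to keep the arithmetic easy to check.

```shell
# Hypothetical raw counts (made-up values, not measured output).
cycles=2000000000
instructions=3000000000
cache_references=40000000
cache_misses=2000000

# Instructions per cycle (IPC).
ipc=$(awk -v i="$instructions" -v c="$cycles" \
  'BEGIN { printf "%.2f", i / c }')

# Cache misses as a percentage of all cache references.
miss_pct=$(awk -v m="$cache_misses" -v r="$cache_references" \
  'BEGIN { printf "%.2f", 100 * m / r }')

echo "IPC: $ipc"
echo "cache miss rate: $miss_pct%"
```

A low IPC or a high miss rate is usually a hint to look at memory access patterns before looking at the instruction mix, though what counts as low or high depends on the CPU.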

2. Profiling
# Sample on-CPU functions for the specified command, at 99 Hertz:
perf record -F 99 command
# Sample on-CPU functions for the specified PID, at 99 Hertz, until Ctrl-C:
perf record -F 99 -p PID
# Sample on-CPU functions for the specified PID, at 99 Hertz, for 10 seconds:
perf record -F 99 -p PID sleep 10
# Sample CPU stack traces (via frame pointers) for the specified PID, at 99 Hertz, for 10 seconds:
perf record -F 99 -p PID -g -- sleep 10
Common options
-e: Select the PMU event.
-a: System-wide collection from all CPUs.
-p: Record events on existing process ID (comma separated list).
-A: Append to the output file to do incremental profiling (older perf releases only).
-f: Overwrite existing data file (older perf releases only).
-o: Output file name.
-g: Do call-graph (stack chain/backtrace) recording.
-C: Collect samples only on the list of CPUs provided.
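
As a rough sanity check on a recording, the number of samples can be estimated from the -F value and the duration. The sketch below assumes a single thread that stays on-CPU for the whole window, which is an assumption and not something perf guarantees; a mostly idle process yields far fewer samples. 99 Hz is commonly preferred over 100 Hz so that sampling does not run in lockstep with periodic timer activity.

```shell
freq_hz=99        # value passed to -F
duration_s=10     # length of the "sleep 10" window
busy_threads=1    # assumption: one thread, on-CPU the whole time

# Upper bound on the number of samples under the assumptions above.
max_samples=$((freq_hz * duration_s * busy_threads))
echo "at most about $max_samples samples"
```

If perf report -n shows only a handful of samples where several hundred were expected, the target was probably off-CPU (sleeping or blocked) for most of the window.
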
3. Static Tracing
# Trace new processes, until Ctrl-C:
perf record -e sched:sched_process_exec -a
# Trace all context-switches, until Ctrl-C:
perf record -e context-switches -a
# Trace context-switches via sched tracepoint, until Ctrl-C:
perf record -e sched:sched_switch -a
# Trace all context-switches with stack traces, until Ctrl-C:
perf record -e context-switches -ag
# Trace all context-switches with stack traces, for 10 seconds:
perf record -e context-switches -ag -- sleep 10

4. Dynamic Tracing
# Add a tracepoint for the kernel tcp_sendmsg() function entry ("--add" is optional):
perf probe --add tcp_sendmsg
# Remove the tcp_sendmsg() tracepoint (or use "--del"):
perf probe -d tcp_sendmsg
# Add a tracepoint for the kernel tcp_sendmsg() function return:
perf probe 'tcp_sendmsg%return'
# Show available variables for the kernel tcp_sendmsg() function (needs debuginfo):
perf probe -V tcp_sendmsg
# Show available variables for the kernel tcp_sendmsg() function, plus external vars (needs debuginfo):
perf probe -V tcp_sendmsg --externs

5. Mixed
# Sample stacks at 99 Hertz, and, context switches:
perf record -F99 -e cpu-clock -e cs -a -g
# Sample stacks to 2 levels deep, and, context switch stacks to 5 levels (needs Linux 4.8):
perf record -F99 -e cpu-clock/max-stack=2/ -e cs/max-stack=5/ -a -g

6. Reporting
# Show perf.data in an ncurses browser (TUI) if possible:
perf report
# Show perf.data with a column for sample count:
perf report -n
# Show perf.data as a text report, with data coalesced and percentages:
perf report --stdio
# Report, with stacks in folded format: one line per stack (needs Linux 4.4):
perf report --stdio -n -g folded
# List all events from perf.data:
perf script
# List all perf.data events, with data header (newer kernels; was previously default):
perf script --header
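
The perf script output is also what the flame graph scripts consume, and both FlameGraph scripts read from stdin when no file is given, so the intermediate files can be skipped. Below is a minimal sketch of that pipeline wrapped in a shell function. The function name perf_to_svg and its argument order are hypothetical, and the second argument is assumed to be a local checkout of the FlameGraph repository.

```shell
# perf_to_svg PERF_DATA FLAMEGRAPH_DIR OUT_SVG
# Hypothetical helper: turns a recorded perf.data into an SVG flame graph.
perf_to_svg() {
  perf_data=$1   # file written by perf record
  fg_dir=$2      # assumed checkout of the FlameGraph repository
  out_svg=$3     # where to write the SVG

  # Dump samples as text, fold the stacks, render the SVG.
  perf script -i "$perf_data" |
    perl "$fg_dir/stackcollapse-perf.pl" |
    perl "$fg_dir/flamegraph.pl" > "$out_svg"
}

# Example call (commented out because it needs a real perf.data):
# perf_to_svg perf.data ./FlameGraph perf.svg
```

Keeping the intermediate files is still handy when the folded stacks need to be filtered with grep before rendering.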

