SciTech-AV-Audio-DAP (Digital Audio Processing)-ffmpeg: trimming AMR recordings + format conversion + volume normalization



  1. Trimming + format conversion:
    ffmpeg -i evidence.amr -ss 01:10 -to 04:10 -f mp3 sample.mp3
    ffmpeg -i x.mp3 -ss 01:14 -to 01:27 -f mp3 sample.mp3
  2. Audio format conversion with FFmpeg
    • Converting to an 8000 Hz sample rate, mono, 8 bits per sample takes a single ffmpeg command (WAV stores 8-bit PCM as unsigned, hence pcm_u8):
      ffmpeg -i in.mp3 -acodec pcm_u8 -ac 1 -ar 8000 -vn out.wav
    • MPlayer users can use mencoder, which still goes through ffmpeg's libraries (lavc/lavf) for the WAV details:
      mencoder in.mp3 -oac lavc -lavcopts acodec=pcm_u8:o=ac=1,ar=8000 -of lavf -o out.wav
  3. Volume detection and normalization:
    • Sample the volume and collect histogram statistics:
      ffmpeg -i sample.mp3 -filter:a volumedetect -f null /dev/null 2>>input.log 1>>input.log
      grep 'volumedetect' input.log
    • Loudness normalization:
      ffmpeg -i sample.mp3 -filter:a loudnorm norm.mp3

Audio Volume Manipulation

  1. Changing volume: use FFmpeg's volume audio filter.

    • The filter takes either a linear factor or a value in decibels:

      # half of the input volume
      ffmpeg -i input.wav -filter:a "volume=0.5" output.wav
      # 150% of the input volume
      ffmpeg -i input.wav -filter:a "volume=1.5" output.wav
      # Decibel values also work. To **increase the volume by 10dB**:
      ffmpeg -i input.wav -filter:a "volume=10dB" output.wav
      # To **reduce the volume**, use a negative value:
      ffmpeg -i input.wav -filter:a "volume=-5dB" output.wav
    • Note: the volume filter only adjusts the volume. It does not set the volume. To set or otherwise normalize the volume of a stream, see the sections below.
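
The linear and decibel forms of the volume filter are related by factor = 10^(dB/20). The sketch below converts between the two so you can see, for instance, that halving the amplitude is roughly a 6 dB cut. The function names db_to_factor and factor_to_db are hypothetical helpers, not part of FFmpeg.

```shell
#!/bin/sh
# Convert a gain in decibels to a linear amplitude factor: factor = 10^(dB/20).
db_to_factor() {
    awk -v db="$1" 'BEGIN { printf "%.3f\n", exp(log(10) * db / 20) }'
}

# Convert a linear amplitude factor to decibels: dB = 20 * log10(factor).
factor_to_db() {
    awk -v f="$1" 'BEGIN { printf "%.2f\n", 20 * log(f) / log(10) }'
}

db_to_factor 10     # factor equivalent to "volume=10dB"
factor_to_db 0.5    # decibel equivalent of "volume=0.5"
```

Either form can be passed to the filter; decibels are usually easier to combine with the values that volumedetect reports.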

  2. Peak and RMS Normalization
    To normalize the volume to a given peak or RMS level,
    the file first has to be analyzed using the volumedetect filter:
    ffmpeg -i input.wav -filter:a volumedetect -f null /dev/null 2>>input.log 1>>input.log
    Then read the measured values from the log file input.log. Two sample reports:

    [Parsed_volumedetect_0 @ 0x30000a7490] n_samples: 96000
    [Parsed_volumedetect_0 @ 0x30000a7490] mean_volume: -28.3 dB
    [Parsed_volumedetect_0 @ 0x30000a7490] max_volume: -7.5 dB
    [Parsed_volumedetect_0 @ 0x30000a7490] histogram_7db: 1
    [Parsed_volumedetect_0 @ 0x30000a7490] histogram_8db: 4
    [Parsed_volumedetect_0 @ 0x30000a7490] histogram_9db: 16
    [Parsed_volumedetect_0 @ 0x30000a7490] histogram_10db: 43
    [Parsed_volumedetect_0 @ 0x30000a7490] histogram_11db: 131


    [Parsed_volumedetect_0 @ 0x30000a7480] n_samples: 7291680
    [Parsed_volumedetect_0 @ 0x30000a7480] mean_volume: -25.1 dB
    [Parsed_volumedetect_0 @ 0x30000a7480] max_volume: -3.7 dB
    [Parsed_volumedetect_0 @ 0x30000a7480] histogram_3db: 2
    [Parsed_volumedetect_0 @ 0x30000a7480] histogram_4db: 8
    [Parsed_volumedetect_0 @ 0x30000a7480] histogram_5db: 33
    [Parsed_volumedetect_0 @ 0x30000a7480] histogram_6db: 97
    [Parsed_volumedetect_0 @ 0x30000a7480] histogram_7db: 479
    [Parsed_volumedetect_0 @ 0x30000a7480] histogram_8db: 1855
    [Parsed_volumedetect_0 @ 0x30000a7480] histogram_9db: 5001

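    Then calculate the required offset (target level minus measured level) and apply it with the volume filter as shown above.
    For peak normalization to 0 dB the offset is the negated max_volume: the first report above (max_volume: -7.5 dB) calls for volume=7.5dB.
    For RMS normalization, use mean_volume instead of max_volume.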
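
The offset calculation can be scripted. The following is a minimal sketch, assuming the log contains volumedetect lines in the format shown above; peak_gain_db is a hypothetical helper name, and the optional second argument is the target peak in dB (0 by default). If the log holds several reports, the last max_volume wins.

```shell
#!/bin/sh
# Print the gain (in dB) that moves the last max_volume found in a
# volumedetect log to the target peak level (default: 0 dB).
peak_gain_db() {
    log_file=$1
    target_db=${2:-0}
    awk -v target="$target_db" '
        /max_volume:/ {
            # The measured value is the field right after "max_volume:".
            for (i = 1; i < NF; i++) if ($i == "max_volume:") max = $(i + 1)
            found = 1
        }
        END {
            if (!found) exit 1      # no volumedetect report in the log
            printf "%.1f\n", target - max
        }
    ' "$log_file"
}

# Hypothetical usage, assuming input.log was produced by the command above:
#   gain=$(peak_gain_db input.log -1) &&
#   ffmpeg -i input.wav -filter:a "volume=${gain}dB" output.wav
```

Leaving about 1 dB of headroom (a target of -1) is a common precaution against clipping after lossy encoding; that choice is an assumption of this sketch, not something the filter requires.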

  3. Loudness Normalization
    If you want to normalize the (perceived) loudness of the file, use the loudnorm filter,
    which implements the EBU R128 algorithm:
    ffmpeg -i input.wav -filter:a loudnorm output.wav
    This is recommended for most applications, as it leads to a more uniform loudness level
    than simple peak-based normalization. For best results, run the normalization in two passes:

    • extract the measured values from the first run,
    • then feed those values into a second run with linear normalization enabled.
      See the loudnorm filter documentation for more.
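
The two passes can be sketched as follows. The first pass runs loudnorm with print_format=json and writes the report to a log; the helper then reads the measured values from that report and builds the filter string for the second pass. The target values (I=-16, TP=-1.5, LRA=11) and the helper names json_value and second_pass_filter are assumptions made for illustration; the measured_I, measured_TP, measured_LRA, measured_thresh, offset and linear options are those of the loudnorm filter.

```shell
#!/bin/sh
# Illustrative targets (an assumption): -16 LUFS, -1.5 dBTP, loudness range 11.
TARGETS="I=-16:TP=-1.5:LRA=11"

# Pull one string value out of the JSON report that loudnorm prints
# with print_format=json, e.g. a line like:  "input_i" : "-27.61",
json_value() {
    sed -n "s/^[[:space:]]*\"$1\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p" | tail -n 1
}

# Build the second-pass filter string from a first-pass report on stdin.
second_pass_filter() {
    report=$(cat)
    printf 'loudnorm=%s:measured_I=%s:measured_TP=%s:measured_LRA=%s:measured_thresh=%s:offset=%s:linear=true\n' \
        "$TARGETS" \
        "$(printf '%s\n' "$report" | json_value input_i)" \
        "$(printf '%s\n' "$report" | json_value input_tp)" \
        "$(printf '%s\n' "$report" | json_value input_lra)" \
        "$(printf '%s\n' "$report" | json_value input_thresh)" \
        "$(printf '%s\n' "$report" | json_value target_offset)"
}

# Hypothetical usage:
#   ffmpeg -i input.wav -filter:a "loudnorm=$TARGETS:print_format=json" -f null - 2>pass1.log
#   ffmpeg -i input.wav -filter:a "$(second_pass_filter <pass1.log)" output.wav
```

Both passes must use the same targets, which is why they live in a single variable here.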
  4. Automation with ffmpeg-normalize
    To automate the normalization process without manually performing two passes,
    and to normalize multiple files (including video) in one go, you can use the ffmpeg-normalize Python program (pip install ffmpeg-normalize).
    It defaults to two-pass EBU R128 normalization, but peak and RMS normalization are also supported.
    For details, run ffmpeg-normalize -h or see the README file.

FFmpeg CommandLine Arguments and Parameters

FFmpeg Audio options:

-aframes number     **set the number of audio frames to output**
-aq quality         set audio quality (codec-specific)
-ar rate            **set audio sampling rate (in Hz)**
-ac channels        **set number of audio channels**
-an                 **disable audio**
-acodec codec       **force audio codec** ('copy' to copy stream)
-ab bitrate         **audio bitrate (please use -b:a)**
-af filter_graph    **set audio filters**

FFmpeg Time Duration

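FFmpeg accepts a time duration in two formats:

  • [-][<HH>:]<MM>:<SS>[.<m>...]
    HH is hours, MM is minutes (at most two digits), SS is seconds (at most two digits),
    m is the decimal fraction of a second;
  • [-]<S>+[.<m>...]
    S is a number of seconds, m is the decimal fraction of a second.

Either format may be prefixed with - to indicate a negative duration.

-13.567      **negative** 13.567 seconds
12:03:45     12 hours, 03 minutes and 45 seconds
23.189       23.189 seconds
00:01:00     60 seconds
60           60 seconds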
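
To make the two formats concrete, here is a small sketch that converts either form to a number of seconds. duration_to_seconds is a hypothetical helper, and it only covers the two forms described above (no unit suffixes).

```shell
#!/bin/sh
# Convert an FFmpeg-style duration ([-][HH:]MM:SS[.m...] or [-]S+[.m...])
# to seconds, printed with millisecond precision.
duration_to_seconds() {
    awk -v d="$1" 'BEGIN {
        sign = 1
        if (substr(d, 1, 1) == "-") { sign = -1; d = substr(d, 2) }
        n = split(d, part, ":")
        if (n < 1 || n > 3) exit 1
        total = 0
        # Each colon-separated field is worth 60 times the one after it.
        for (i = 1; i <= n; i++) total = total * 60 + part[i]
        printf "%.3f\n", sign * total
    }'
}

duration_to_seconds 12:03:45
duration_to_seconds -13.567
```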

FFmpeg Global options

These affect the whole program instead of just one file:

  • -loglevel loglevel set logging level
  • -v loglevel **set logging level**
  • -report generate a report
  • -max_alloc bytes **set maximum size of a single allocated block**
  • -y **overwrite output files**
  • -n never overwrite output files
  • -ignore_unknown Ignore unknown stream types
  • -filter_threads **number of non-complex filter threads**
  • -filter_complex_threads **number of threads for -filter_complex**
  • -stats print progress report during encoding
  • -max_error_rate maximum error rate ratio of decoding errors (0.0: no errors, 1.0: 100% errors) above which ffmpeg returns an error instead of success.
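  • -frames[:stream_specifier] framecount (output,per-stream)
    Stop writing to the stream after framecount frames.
  • -f fmt (input/output)
    • Input: the format is normally detected automatically.
    • Output: the format is guessed from the file extension (so -f fmt is only needed in special cases).
    • To list every format FFmpeg supports, run: ffmpeg -formats
  • -ss position (input/output)
    • Before -i, it is an input option: seek in the input file to position.
      • Most formats cannot seek exactly, so ffmpeg seeks to the closest seek point before position.
      • When transcoding with -accurate_seek enabled (the default), the frames between that seek point and position are decoded and discarded.
      • When doing stream copy, or when -noaccurate_seek is given, the frames between the seek point and position are preserved.
    • After -i (before an output url), it is an output option:
      ffmpeg decodes the input but discards it until the output timestamps reach position.
  • -sseof position (input)
    Like -ss, but the position is relative to the end of file (EOF): 0 is at EOF, and negative values are earlier in the file.
  • -to position (input/output)
    • Stop writing the output, or reading the input, at position (in FFmpeg time duration format).
    • -to and -t are mutually exclusive; if both are given, -t takes priority.
  • -t duration (input/output)
    • Before -i, it is an input option: limit the duration of data read from the input file.
    • After -i (before an output url), it is an output option: stop writing once the output reaches duration.
    • -to and -t are mutually exclusive; if both are given, -t takes priority.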
  • -fs limit_size (output)
    Set the file size limit, expressed in bytes.
    No further chunk of bytes is written after the limit is exceeded.
    The size of the output file is slightly more than the requested file size.
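  • -itsoffset offset (input)
    Set the input time offset; offset is in FFmpeg time duration format.
    The offset is added to the timestamps of the input file.
    A positive offset means the corresponding streams are delayed by the duration given in offset.
  • -timestamp date (output)
    Set the recording timestamp in the container (the file format that holds the output streams).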
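
Since -to names a stop position and -t names a length, a trim written as "-ss START -to END" can also be written as "-ss START -t (END - START)" when both options apply to the same file. The sketch below computes that length; to_seconds and span_duration are hypothetical helpers, and the sketch assumes non-negative positions without unit suffixes.

```shell
#!/bin/sh
# Seconds from an FFmpeg-style [HH:]MM:SS[.m] or plain-seconds position.
to_seconds() {
    awk -v d="$1" 'BEGIN {
        n = split(d, part, ":"); total = 0
        for (i = 1; i <= n; i++) total = total * 60 + part[i]
        printf "%.3f\n", total
    }'
}

# Given a start and an end position, print the equivalent -t duration.
# Fails (exit status 1) if the end is not after the start.
span_duration() {
    awk -v s="$(to_seconds "$1")" -v e="$(to_seconds "$2")" 'BEGIN {
        if (e <= s) exit 1
        printf "%.3f\n", e - s
    }'
}

# Hypothetical usage, mirroring the first trimming command in this note:
#   ffmpeg -i evidence.amr -ss 01:10 -t "$(span_duration 01:10 04:10)" -f mp3 sample.mp3
```

Note that moving -ss in front of -i changes the picture: the input is then sought first, so the two spellings should only be treated as interchangeable when -ss and -to sit on the same side of -i.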

FFmpeg Per-file main options:

-f fmt              **force format**
-c codec            **codec name**
-codec codec        codec name
-pre preset         preset name
-map_metadata outfile[,metadata]:infile[,metadata]  **set metadata information of outfile from infile**
-t duration         **record or transcode "duration" seconds of audio/video**
-to time_stop       **record or transcode stop time**
-fs limit_size      **set the limit file size in bytes**
-ss time_off        **set the start time offset**
-sseof time_off     **set the start time offset relative to EOF**
-seek_timestamp     enable/disable seeking by timestamp with -ss
-timestamp time     set the recording timestamp ('now' to set the current time)
-metadata string=string  **add metadata**
-program title=string:st=number...  add program with specified streams
-target type        specify target file type ("vcd", "svcd", "dvd", "dv" or "dv50" with optional prefixes "pal-", "ntsc-" or "film-")
-apad               audio pad
-frames number      **set the number of frames to output**
-filter filter_graph  **set stream filtergraph**
-filter_script filename  **read stream filtergraph description from a file**
-reinit_filter      **reinit filtergraph on input parameter changes**
-discard            discard
-disposition        disposition

Commonly used PCM formats in FFmpeg

 DE s16be    PCM signed 16-bit big-endian
 DE s16le    PCM signed 16-bit little-endian
 DE s24be    PCM signed 24-bit big-endian
 DE s24le    PCM signed 24-bit little-endian
 DE s32be    PCM signed 32-bit big-endian
 DE s32le    PCM signed 32-bit little-endian
 DE s8       PCM signed 8-bit
 DE f32be    PCM 32-bit floating-point big-endian
 DE f32le    PCM 32-bit floating-point little-endian
 DE f64be    PCM 64-bit floating-point big-endian
 DE f64le    PCM 64-bit floating-point little-endian
 DE mulaw    PCM mu-law
 DE u16be    PCM unsigned 16-bit big-endian
 DE u16le    PCM unsigned 16-bit little-endian
 DE u24be    PCM unsigned 24-bit big-endian
 DE u24le    PCM unsigned 24-bit little-endian
 DE u32be    PCM unsigned 32-bit big-endian
 DE u32le    PCM unsigned 32-bit little-endian
 DE u8       PCM unsigned 8-bit

FFmpeg decoding and encoding formats
