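Front-End Speech-to-Text in Practice: A Summary

While preparing a tech talk recently, I came across a speech-to-text feature I had built a while ago. It had been gathering dust in a slide deck, so I decided to write it up here and share it.

It covers the whole chain, from technology selection through solution design to the actual rollout.

Speech transcription flowchart
How to record audio in a desktop browser
How to send the audio once recording is finished
Sending audio and real-time transcription
A general-purpose recording component
Summary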

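Speech transcription flowchart

How to record audio in a desktop browser

What are AudioContext and AudioNode?
What is MediaDevices.getUserMedia()?
Why does it work on localhost but not in pre-production?
How much do you know about JavaScript's TypedArray data types?
js-audio-recorder source code analysis
Implementation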

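What is AudioContext?

The AudioContext interface represents an audio-processing graph built from audio modules linked together, each represented by an AudioNode.

An audio context controls both the creation of the nodes it contains and the execution of the audio processing or decoding. Everything happens inside a context.

Playing an audio file involves the following pieces:

ArrayBuffer: the binary data of the audio file
decodeAudioData: decodes that data into an AudioBuffer
AudioBufferSourceNode: the source node that plays the decoded buffer
connect: connects the source node to the next node in the graph
start: starts playback
AudioContext.destination: the output device (speakers)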

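What is AudioNode?

AudioNode is the base class for audio processing. Its members include context, numberOfInputs, channelCount, and connect().
AudioBufferSourceNode, mentioned above, inherits connect() from AudioNode; its start() method comes from the intermediate AudioScheduledSourceNode interface.
GainNode, used to set the volume, also inherits from AudioNode.
MediaStreamAudioSourceNode, used to feed in a microphone stream, also inherits from AudioNode. (MediaElementAudioSourceNode is the counterpart for <audio> and <video> elements.)
OscillatorNode, which generates a periodic waveform, inherits from AudioNode indirectly through AudioScheduledSourceNode. Filtering is the job of BiquadFilterNode.
PannerNode, which represents the position and behavior of an audio source signal in space, also inherits from AudioNode.
The AudioListener interface represents the position and orientation of the single person listening to the audio scene and is used for audio spatialization. It is not itself an AudioNode.
These nodes can be chained with connect(), layer by layer, much like the decorator pattern: an AudioBufferSourceNode can connect to a GainNode, which in turn connects to AudioContext.destination (the speakers), so that the GainNode adjusts the volume.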
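
The source -> gain -> destination chain described above can be sketched roughly as follows. This is a minimal illustration, not code from the original project: the function name `playWithVolume` and the `volume` parameter are made up for the example, and it assumes a browser where `decodeAudioData` returns a promise.

```javascript
// Sketch: decode an ArrayBuffer and play it through a GainNode.
// `context` is expected to be an AudioContext; `volume` is a hypothetical parameter.
async function playWithVolume(context, arrayBuffer, volume = 0.5) {
  // decodeAudioData turns the encoded bytes into an AudioBuffer of PCM samples.
  const audioBuffer = await context.decodeAudioData(arrayBuffer);

  // The source node plays the decoded buffer.
  const source = context.createBufferSource();
  source.buffer = audioBuffer;

  // The gain node scales the signal, which is how the volume is adjusted.
  const gain = context.createGain();
  gain.gain.value = volume;

  // Build the graph: source -> gain -> destination (speakers).
  source.connect(gain);
  gain.connect(context.destination);

  source.start();
  return source;
}
```

In a page this would be called with something like `playWithVolume(new AudioContext(), bytes, 0.3)`, where `bytes` is the ArrayBuffer of an audio file fetched elsewhere.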

First look: what is MediaDevices.getUserMedia()?

Keywords: MediaStream, MediaStreamTrack, audio track

Demo: https://github.com/FrankKai/n...

<button onclick="record()">Start recording</button>
<script>
function record () {
    navigator.mediaDevices.getUserMedia({
        audio: true
    }).then(mediaStream => {
        console.log(mediaStream);
    }).catch(err => {
        console.error(err);
    });
}
</script>

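A closer look: what is MediaDevices.getUserMedia()?

Keywords: MediaStream, MediaStreamTrack, audio track

The MediaStream interface represents a stream of media content.
The MediaStreamTrack interface represents a single media track within a stream.
A track here is like a track on a recording, so an audio track is simply the audio part of the stream.
Calling getUserMedia() prompts the user: "Allow this page to use your microphone?" If the user refuses, the promise rejects with "DOMException: Permission denied". If the user allows it, the promise resolves with a MediaStream made up of audio tracks, which carry the detailed information about the audio input.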

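Why does it work on localhost but not in pre-production?

Out of ideas, I asked a question on Stack Overflow:

Why navigator.mediaDevice only works fine on localhost:9090?

The answer was that it can only be tested in an HTTPS environment.

Fine, production is HTTPS, so it works there. But my localhost has no HTTPS at all, so why does it work there? What is the real reason?

I finally found the answer in the official Chromium release notes:
https://sites.google.com/a/ch...

Since Chrome 47, getUserMedia requests for video and audio are only allowed from "secure and trusted" origins, such as HTTPS pages and localhost. If the page's script is loaded from an insecure origin, the navigator object has no usable mediaDevices object, and Chrome throws an error when the code tries to use it.

To use the voice feature in the pre-production and staging environments, configure Chrome as follows:
Enter chrome://flags in the address bar
Search for: insecure origins treated as secure
Add the origin: http://foo.test.gogo.com

Production at https://foo.gogo.com works without any extra configuration.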
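
A small guard can make this failure easier to diagnose than an "undefined" error. The sketch below is only an illustration: the function name `recordingBlocker` and the reason strings are invented for this example, and the `win` parameter exists so the check can be exercised outside a browser.

```javascript
// Returns a reason string when recording is unavailable, or null when it looks usable.
// `win` defaults to globalThis (the window object in a browser).
function recordingBlocker(win = globalThis) {
  // isSecureContext is false on plain http:// origins other than localhost.
  if (win.isSecureContext === false) {
    return 'insecure-origin';
  }
  const nav = win.navigator;
  if (!nav || !nav.mediaDevices || typeof nav.mediaDevices.getUserMedia !== 'function') {
    return 'no-media-devices';
  }
  return null;
}
```

A recording button could call `recordingBlocker()` first and show a hint about HTTPS (or the chrome://flags setting above) when the result is `'insecure-origin'`.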

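How much do you know about JavaScript's TypedArray data types?

Typed array basics: TypedArray, Buffer, ArrayBuffer, View, Uint8Array, Uint16Array, Float64Array

Typed arrays are used to work with raw binary data.
The typed array architecture is split into two parts: buffers and views.
A buffer (implemented by the ArrayBuffer class) is an object representing a chunk of data. It has no format of its own, and its contents cannot be accessed directly.
To access the memory in a buffer, you need a view. A view provides a context (a data type, a starting offset, and a number of elements) that turns the data into a typed array.

https://github.com/FrankKai/F...

Typed array example

// Create a fixed-length buffer of 16 bytes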
let buffer = new ArrayBuffer(16);
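
To make the buffer/view split concrete, here is a short sketch (variable names are illustrative) in which two views look at the same 16 bytes. Writing through one view is visible through the other, because neither view copies the data.

```javascript
const sharedBuffer = new ArrayBuffer(16);

// Two views over the same 16 bytes: no data is copied.
const int32View = new Int32Array(sharedBuffer); // 4 elements of 4 bytes each
const uint8View = new Uint8Array(sharedBuffer); // 16 elements of 1 byte each

int32View[0] = 1;    // written as a 32-bit integer
uint8View[4] = 0xff; // byte 4 is the first byte of int32View[1]

console.log(int32View.length, uint8View.length); // 4 16

// The four bytes of int32View[0] sum to 1 whatever the platform's byte order.
const firstElementByteSum = uint8View[0] + uint8View[1] + uint8View[2] + uint8View[3];
console.log(firstElementByteSum); // 1

// int32View[1] is no longer 0, because one of its bytes was changed through uint8View.
console.log(int32View[1] !== 0); // true
```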

Background for processing audio data

struct someStruct {
  unsigned long id; // long 32bit
  char username[16];// char 8bit
  float amountDue;// float 32bit
};

let buffer = new ArrayBuffer(24);
// ... read the data into the buffer ...
let idView = new Uint32Array(buffer, 0, 1);
let usernameView = new Uint8Array(buffer, 4, 16);
let amountDueView = new Float32Array(buffer, 20, 1);

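Why are the offsets 0, 4, and 20?
id is 32 bits, and 32 / 8 = 4 bytes, so bytes 0 to 3 belong to idView. Each char is 8 bits, and 8 / 8 = 1 byte, so the 16 characters of username occupy bytes 4 to 19 and belong to usernameView. amountDue is 32 bits, and 32 / 8 = 4 bytes, so bytes 20 to 23 belong to amountDueView.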
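
The same arithmetic can be written as a few lines of code. This is just a worked version of the calculation above; the helper name `layout` and the `fields` table are made up for the illustration.

```javascript
// Byte sizes of the three fields in someStruct.
const fields = [
  { name: 'id', bytes: 4 },        // unsigned long, 32 bits
  { name: 'username', bytes: 16 }, // char[16], 8 bits per char
  { name: 'amountDue', bytes: 4 }, // float, 32 bits
];

// Each field starts where the previous one ended.
function layout(fieldList) {
  let offset = 0;
  return fieldList.map(({ name, bytes }) => {
    const entry = { name, offset, bytes };
    offset += bytes;
    return entry;
  });
}

const structLayout = layout(fields);
console.log(structLayout.map((f) => `${f.name}@${f.offset}`).join(' '));
// id@0 username@4 amountDue@20

// Round trip: write through one set of views, read back through another.
const structBuffer = new ArrayBuffer(24);
new Uint32Array(structBuffer, 0, 1)[0] = 42;
new Float32Array(structBuffer, 20, 1)[0] = 1.5;
const idReadBack = new Uint32Array(structBuffer, 0, 1)[0];          // 42
const amountReadBack = new Float32Array(structBuffer, 20, 1)[0];    // 1.5
```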

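Implementation and source code analysis

I. Implementation

Flow: 1. initialize the recorder 2. start recording 3. stop recording

Design: a recorder, a recorder helper, an audio builder, and an audio transcriber

II. Approaches I tried

1. A solution by a front-end developer at Renren

https://juejin.im/post/5b8bf7...

It cannot flexibly set the sample size (bit depth), sample rate, or number of channels, and it cannot output audio in multiple formats, so I dropped it.

2. js-audio-recorder

It can flexibly set the sample size, sample rate, and number of channels, can output audio in multiple formats, and offers a number of easy-to-use APIs.

GitHub: https://github.com/2fps/recorder

I had never studied audio processing, so all I could do was learn by building on the work of those who came before me.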

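Implementation and source code analysis

I. Breaking down the recording process

1. Initialize the recorder instance

initRecorderInstance() {
  // Sampling configuration
  const sampleConfig = {
    sampleBits: 16, // sample size: iFLYTEK real-time transcription expects 16 bits
    sampleRate: 16000, // sample rate: iFLYTEK real-time transcription expects 16000 Hz (16 kHz)
    numChannels: 1, // channels: iFLYTEK real-time transcription expects mono
  };
  this.recorderInstance = new Recorder(sampleConfig);
},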

2. Start recording

startRecord() {
  try {
    this.recorderInstance.start();
    // The callback keeps reporting the elapsed duration
    this.recorderInstance.onprocess = (duration) => {
      this.recorderHelper.duration = duration;
    };
  } catch (err) {
    this.$debug(err);
  }
},

3. Stop recording

stopRecord() {
  this.recorderInstance.stop();
  // Note: getWAV() returns WAV-encoded data; the Blob is only labelled 'audio/mp3' here
  this.recorder.blobObjMP3 = new Blob([this.recorderInstance.getWAV()], { type: 'audio/mp3' });
  this.recorder.blobObjPCM = this.recorderInstance.getPCMBlob();
  this.recorder.blobUrl = URL.createObjectURL(this.recorder.blobObjMP3);
  if (this.audioAutoTransfer) {
    this.$refs.audio.onloadedmetadata = () => {
      this.audioXFTransfer();
    };
  }
},

II. Design

Recorder instance: recorderInstance
- js-audio-recorder
Recorder helper: RecorderHelper
- blobUrl, blobObjPCM, blobObjMP3
- hearing, tip, duration
Editor: Editor
- transfered, tip, loading
Audio: Audio
- urlPC, urlMobile, size
Transcriber: Transfer
- text

III. Source analysis: initializing the instance (constructor)

/**
 * @param {Object} options accepts the following three parameters:
 * sampleBits: sample size, usually 8 or 16, defaults to 16
 * sampleRate: sample rate, usually 11025, 16000, 22050, 24000, 44100, or 48000, defaults to the browser's own sample rate
 * numChannels: number of channels, 1 or 2
 */
constructor(options: recorderConfig = {}) {
    // Temporary audioContext, only used to read the input sample rate
    let context = new (window.AudioContext || window.webkitAudioContext)();

    this.inputSampleRate = context.sampleRate;     // sample rate of the current input
    // Build the config, falling back to defaults for invalid values
    this.config = {
        // sample size: 8 or 16
        sampleBits: ~[8, 16].indexOf(options.sampleBits) ? options.sampleBits : 16,
        // sample rate
        sampleRate: ~[11025, 16000, 22050, 24000, 44100, 48000].indexOf(options.sampleRate) ? options.sampleRate : this.inputSampleRate,
        // number of channels: 1 or 2
        numChannels: ~[1, 2].indexOf(options.numChannels) ? options.numChannels : 1,
    };
    // Sampling parameters
    this.outputSampleRate = this.config.sampleRate;     // output sample rate
    this.oututSampleBits = this.config.sampleBits;      // output sample size: 8 or 16
    // Detect the platform's byte order
    this.littleEdian = (function() {
        var buffer = new ArrayBuffer(2);
        new DataView(buffer).setInt16(0, 256, true);
        return new Int16Array(buffer)[0] === 256;
    })();
}

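How should new DataView(buffer).setInt16(0, 256, true) be read?

The last argument controls the byte order (endianness) used when the value is stored in memory.
Passing true selects little-endian: the low-order byte is stored at the lowest address. Int16Array, by contrast, always uses the platform's endianness, so reading 256 back through it means the platform is little-endian.

In big-endian order the low-order byte goes to the highest address: for 0x12345678, 12 is stored in buf[0] and 78 (the low-order byte) in buf[3] (the highest address). This is the usual left-to-right order.
Little-endian is the opposite: for 0x12345678, 78 is stored in buf[0], the lowest address.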
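
The two byte orders are easy to see by writing 0x12345678 with a DataView and dumping the bytes. The snippet below is a small demonstration; `isLittleEndianPlatform` repeats the detection trick from the constructor under a made-up name.

```javascript
const buf = new ArrayBuffer(4);
const view = new DataView(buf);
const bytes = new Uint8Array(buf);

// Format the four bytes in address order as hex.
const hex = () => Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join(' ');

view.setUint32(0, 0x12345678, false); // big-endian: high-order byte first
const bigEndianBytes = hex();
console.log(bigEndianBytes); // 12 34 56 78

view.setUint32(0, 0x12345678, true); // little-endian: low-order byte first
const littleEndianBytes = hex();
console.log(littleEndianBytes); // 78 56 34 12

// Write 256 as little-endian (bytes 00 01), then read it back with the platform's own byte order.
function isLittleEndianPlatform() {
  const probe = new ArrayBuffer(2);
  new DataView(probe).setInt16(0, 256, true);
  return new Int16Array(probe)[0] === 256;
}
```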

III. Source analysis: initializing the instance (initRecorder)

/**
 * Initialize the recorder instance
 */
initRecorder(): void {
    if (this.context) {
        // Destroy the previous recorder instance, because it still holds a small amount of cached data
        this.destroy();
    }
    this.context = new (window.AudioContext || window.webkitAudioContext)();

    this.analyser = this.context.createAnalyser();  // analyser node for the recording
    this.analyser.fftSize = 2048;                   // FFT size used for the frequency-domain data

    // The first argument is the buffer size in samples: onaudioprocess fires once each time that many have been collected.
    // Typical values are 1024, 2048, and 4096; 4096 is the usual choice.
    // The second and third arguments are the number of input and output channels; keep them the same.
    let createScript = this.context.createScriptProcessor || this.context.createJavaScriptNode;
    this.recorder = createScript.apply(this.context, [4096, this.config.numChannels, this.config.numChannels]);

    // getUserMedia compatibility handling
    this.initUserMedia();
    // Audio capture
    this.recorder.onaudioprocess = e => {
        if (!this.isrecording || this.ispause) {
            // Nothing to do when not recording; Firefox keeps firing audioprocess after recording stops
            return;
        }
        // getChannelData returns the PCM data as a Float32Array
        if (1 === this.config.numChannels) {
            let data = e.inputBuffer.getChannelData(0);
            // Mono
            this.buffer.push(new Float32Array(data));
            this.size += data.length;
        } else {
            /*
                * Stereo handling
                * e.inputBuffer.getChannelData(0) holds the 4096 samples of the left channel, and channel 1 holds the right channel.
                * They must be interleaved as LRLRLR for playback to work, so some processing is needed here.
                */
            let lData = new Float32Array(e.inputBuffer.getChannelData(0)),
                rData = new Float32Array(e.inputBuffer.getChannelData(1)),
                // The new buffer holds the left and right channel data combined
                buffer = new ArrayBuffer(lData.byteLength + rData.byteLength),
                dData = new Float32Array(buffer),
                offset = 0;

            // Iterate over samples (length), not bytes (byteLength), so that offset and size stay correct
            for (let i = 0; i < lData.length; ++i) {
                dData[ offset ] = lData[i];
                offset++;
                dData[ offset ] = rData[i];
                offset++;
            }

            this.buffer.push(dData);
            this.size += offset;
        }
        // Accumulate the recording duration
        this.duration += 4096 / this.inputSampleRate;
        // Recording-duration callback
        this.onprocess && this.onprocess(this.duration);
    }
}
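
The stereo branch is easier to follow in isolation. The function below is a standalone sketch of LRLR interleaving (the name `interleave` is made up here, and it is not the library's own code); note that it counts samples with `length`, not bytes with `byteLength`.

```javascript
// Interleave two channels into one Float32Array laid out as L R L R ...
function interleave(left, right) {
  const frames = Math.min(left.length, right.length);
  const out = new Float32Array(frames * 2);
  for (let i = 0; i < frames; i++) {
    out[2 * i] = left[i];      // even positions: left channel
    out[2 * i + 1] = right[i]; // odd positions: right channel
  }
  return out;
}

const mixed = interleave(
  new Float32Array([0.5, 0.25]),
  new Float32Array([-0.5, -0.25]),
);
console.log(Array.from(mixed)); // [ 0.5, -0.5, 0.25, -0.25 ]
```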

III. Source analysis: starting a recording (start)

/**
 * Start recording
 *
 * @returns {void}
 * @memberof Recorder
 */
start(): void {
    if (this.isrecording) {
        // Already recording, so ignore the call
        return;
    }
    // Clear previous data
    this.clear();
    this.initRecorder();
    this.isrecording = true;

    navigator.mediaDevices.getUserMedia({
        audio: true
    }).then(stream => {
        // audioInput is the audio source node
        // stream is the audio output of an external device (such as a microphone) obtained through getUserMedia; from our side it is the input
        this.audioInput = this.context.createMediaStreamSource(stream);
    }, error => {
        // Throw an exception
        Recorder.throwError(error.name + " : " + error.message);
    }).then(() => {
        // audioInput is the sound source; connect it to the processing node recorder
        this.audioInput.connect(this.analyser);
        this.analyser.connect(this.recorder);
        // Connect the processing node recorder to the speakers
        this.recorder.connect(this.context.destination);
    });
}

III. Source analysis: stopping a recording and helper functions

/**
 * Stop recording
 *
 * @memberof Recorder
 */
stop(): void {
    this.isrecording = false;
    this.audioInput && this.audioInput.disconnect();
    this.recorder.disconnect();
}

// Recording-duration callback
this.onprocess && this.onprocess(this.duration);
/**
 * Get the WAV-encoded binary data (DataView)
 *
 * @returns {dataview}  WAV-encoded binary data
 * @memberof Recorder
 */
private getWAV() {
    let pcmTemp = this.getPCM(),
        wavTemp = Recorder.encodeWAV(pcmTemp, this.inputSampleRate, 
            this.outputSampleRate, this.config.numChannels, this.oututSampleBits, this.littleEdian);

    return wavTemp;
}
/**
 * Get the PCM data as a Blob
 *
 * @returns { blob }  PCM data as a Blob
 * @memberof Recorder
 */
getPCMBlob() {
    return new Blob([ this.getPCM() ]);
}
/**
 * Get the PCM-encoded binary data (DataView)
 *
 * @returns {dataview}  PCM binary data
 * @memberof Recorder
 */
private getPCM() {
    // Flatten the two-dimensional buffer into one dimension
    let data = this.flat();
    // Compress or expand (resample to the output sample rate)
    data = Recorder.compress(data, this.inputSampleRate, this.outputSampleRate);
    // Re-encode at the requested sample size
    return Recorder.encodePCM(data, this.oututSampleBits, this.littleEdian);
}
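
The bodies of flat() and compress() are not shown in this write-up. As a rough idea of what those two steps involve, here is a sketch under my own assumptions: `flatten` concatenates the recorded chunks, and `downsample` lowers the sample rate by simply keeping every n-th sample, with no low-pass filtering. The library's real implementation may differ.

```javascript
// Concatenate the recorded Float32Array chunks into a single Float32Array.
function flatten(chunks) {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Float32Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

// Naive downsampling: keep one sample out of every `ratio` samples.
// In this sketch the ratio is never below 1, so the data is never upsampled.
function downsample(samples, inputRate, outputRate) {
  const ratio = Math.max(inputRate / outputRate, 1);
  const length = Math.floor(samples.length / ratio);
  const out = new Float32Array(length);
  for (let i = 0; i < length; i++) {
    out[i] = samples[Math.floor(i * ratio)];
  }
  return out;
}

const flat = flatten([new Float32Array([0, 1, 2]), new Float32Array([3, 4, 5])]);
const reduced = downsample(flat, 48000, 16000); // ratio 3: keeps samples 0 and 3
console.log(Array.from(reduced)); // [ 0, 3 ]
```

Going from a 48000 Hz input to the 16000 Hz that the transcription service expects keeps one sample in three, which is why the 6-sample input above shrinks to 2 samples.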

IV. Source analysis: core algorithms (encodeWAV)

static encodeWAV(bytes: dataview, inputSampleRate: number, outputSampleRate: number, numChannels: number, oututSampleBits: number, littleEdian: boolean = true) {
    let sampleRate = Math.min(inputSampleRate, outputSampleRate),
        sampleBits = oututSampleBits,
        buffer = new ArrayBuffer(44 + bytes.byteLength),
        data = new DataView(buffer),
        channelCount = numChannels, // number of channels
        offset = 0;

    // RIFF (Resource Interchange File Format) chunk identifier
    writeString(data, offset, 'RIFF'); offset += 4;
    // Number of bytes from the next address to the end of the file, i.e. file size - 8
    data.setUint32(offset, 36 + bytes.byteLength, littleEdian); offset += 4;
    // WAVE format identifier
    writeString(data, offset, 'WAVE'); offset += 4;
    // 'fmt ' sub-chunk identifier
    writeString(data, offset, 'fmt '); offset += 4;
    // Size of the fmt sub-chunk, 0x10 = 16 for PCM
    data.setUint32(offset, 16, littleEdian); offset += 4;
    // Audio format (1 = PCM samples)
    data.setUint16(offset, 1, littleEdian); offset += 2;
    // Number of channels
    data.setUint16(offset, channelCount, littleEdian); offset += 2;
    // Sample rate: samples per second, i.e. the playback speed of each channel
    data.setUint32(offset, sampleRate, littleEdian); offset += 4;
    // Byte rate (average bytes per second): channels × sample rate × sample size / 8
    data.setUint32(offset, channelCount * sampleRate * (sampleBits / 8), littleEdian); offset += 4;
    // Block align (bytes per sample frame): channels × sample size / 8
    data.setUint16(offset, channelCount * (sampleBits / 8), littleEdian); offset += 2;
    // Bits per sample
    data.setUint16(offset, sampleBits, littleEdian); offset += 2;
    // 'data' sub-chunk identifier
    writeString(data, offset, 'data'); offset += 4;
    // Size of the sample data in bytes, i.e. total file size - 44
    data.setUint32(offset, bytes.byteLength, littleEdian); offset += 4;
    
    // Append the PCM body after the WAV header
    for (let i = 0; i < bytes.byteLength;) {
        data.setUint8(offset, bytes.getUint8(i));
        offset++;
        i++;
    }

    return data;
}
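
encodeWAV relies on a writeString helper that is not shown above. A plausible minimal version is sketched below, together with a hypothetical `parseWavHeader` that reads the 44-byte header back so the layout can be checked. Both are illustrations rather than the library's code. RIFF/WAVE fields are little-endian by specification, so the reader passes `true` to every multi-byte getter.

```javascript
// Assumed shape of the helper: write an ASCII string one byte at a time.
function writeString(view, offset, text) {
  for (let i = 0; i < text.length; i++) {
    view.setUint8(offset + i, text.charCodeAt(i));
  }
}

// Read `length` bytes starting at `offset` as an ASCII string.
function readString(view, offset, length) {
  let text = '';
  for (let i = 0; i < length; i++) {
    text += String.fromCharCode(view.getUint8(offset + i));
  }
  return text;
}

// Read back the fields of a canonical 44-byte PCM WAV header.
function parseWavHeader(view) {
  return {
    riff: readString(view, 0, 4),
    riffSize: view.getUint32(4, true),      // file size - 8
    wave: readString(view, 8, 4),
    fmt: readString(view, 12, 4),
    fmtSize: view.getUint32(16, true),      // 16 for PCM
    audioFormat: view.getUint16(20, true),  // 1 = PCM
    numChannels: view.getUint16(22, true),
    sampleRate: view.getUint32(24, true),
    byteRate: view.getUint32(28, true),     // channels × sample rate × bytes per sample
    blockAlign: view.getUint16(32, true),   // channels × bytes per sample
    bitsPerSample: view.getUint16(34, true),
    data: readString(view, 36, 4),
    dataSize: view.getUint32(40, true),     // file size - 44
  };
}
```

As a worked example, one second of 16 kHz, 16-bit mono audio has a byte rate of 1 × 16000 × 2 = 32000 bytes per second, a block align of 2, a data size of 32000 bytes, and a RIFF size field of 36 + 32000 = 32036.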

IV. Source analysis: core algorithms (encodePCM)

/**
 * Convert the samples to the encoding we need
 * 
 * @static
 * @param {float32array} bytes      PCM binary data
 * @param {number}  sampleBits      sample size in bits
 * @param {boolean} littleEdian     whether to use little-endian byte order
 * @returns {dataview}              PCM binary data
 * @memberof Recorder
 */
static encodePCM(bytes, sampleBits: number, littleEdian: boolean = true)  {
    let offset = 0,
        dataLength = bytes.length * (sampleBits / 8),
        buffer = new ArrayBuffer(dataLength),
        data = new DataView(buffer);

    // Write the sample data
    if (sampleBits === 8) {
        for (var i = 0; i < bytes.length; i++, offset++) {
            // Clamp to the range [-1, 1]
            var s = Math.max(-1, Math.min(1, bytes[i]));
            // An 8-bit sample has 2^8 = 256 levels, in the range 0-255.
            // For 8 bits, multiply negative values by 128 and positive values by 127, then shift everything up by 128 to get data in [0, 255].
            var val = s < 0 ? s * 128 : s * 127;
            val = +val + 128;
            // 8-bit PCM is unsigned, so write it as an unsigned byte (the stored byte is the same as with setInt8)
            data.setUint8(offset, val);
        }
    } else {
        for (var i = 0; i < bytes.length; i++, offset += 2) {
            var s = Math.max(-1, Math.min(1, bytes[i]));
            // A 16-bit sample has 2^16 = 65536 levels, in the range -32768 to 32767.
            // The collected data is in [-1, 1], so to convert to 16 bits, multiply negative values by 32768 and positive values by 32767 to get data in [-32768, 32767].
            data.setInt16(offset, s < 0 ? s * 0x8000 : s * 0x7FFF, littleEdian);
        }
    }

    return data;
}
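
A few concrete numbers make the 16-bit branch easier to check. The function below is a standalone sketch of the same scaling rule (the name `floatTo16BitPCM` is made up for the example); fractional results are truncated toward zero when `setInt16` stores them.

```javascript
// Convert float samples in [-1, 1] to 16-bit little-endian PCM.
function floatTo16BitPCM(samples) {
  const view = new DataView(new ArrayBuffer(samples.length * 2));
  samples.forEach((sample, i) => {
    // Clamp out-of-range input to [-1, 1] before scaling.
    const s = Math.max(-1, Math.min(1, sample));
    // Negative values scale by 32768, positive values by 32767.
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  });
  return view;
}

const pcm = floatTo16BitPCM([-1, -0.5, 0, 0.5, 1, 2]);
const decoded = Array.from({ length: 6 }, (_, i) => pcm.getInt16(i * 2, true));
console.log(decoded); // [ -32768, -16384, 0, 16383, 32767, 32767 ]
```

The last input, 2, is outside the valid range and is clamped to 1, which is why it also encodes as 32767.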

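Sending audio and real-time transcription

Where should the audio file be stored?
All about blob URLs
What does the server need to do for real-time speech transcription?
Front-end implementation

Where should the audio file be stored?

Should every recording be uploaded to Alibaba Cloud OSS as soon as it is made?

That would clearly be a waste of resources.

While editing: store it locally; it only needs to be accessible from the current browser.

When sending: store it in OSS, where it is accessible from the public internet.

How do we store it locally? As a blob URL.

How do we store it in OSS? Get a token from the CMS, upload the file to the xxx-audio bucket in OSS, and get a hash back.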

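All about blob URLs

What does a blob URL look like?

blob:http://localhost:9090/39b60422-26f4-4c67-8456-7ac3f29115ec

Blob objects are very common in front-end development. Here are a couple of use cases:

The base64 data URL produced by canvas toDataURL() can exceed the maximum length allowed for an attribute value.
The File object obtained from <input type="file" /> may need to stay local at first and only be uploaded to the server at a suitable time.

Create a blob URL: URL.createObjectURL(object)

Release a blob URL: URL.revokeObjectURL(objectURL)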
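
Because every createObjectURL call needs a matching revokeObjectURL, it can help to pair the two in one helper. The sketch below is one possible pattern, not something from the original project: `withBlobUrl` is a made-up name, and the `urlApi` parameter is injectable only so the helper can be tested without a browser (in a page it is simply `window.URL`).

```javascript
// Run `use(url)` with a temporary blob URL and always release the URL afterwards,
// even if `use` throws.
async function withBlobUrl(blob, use, urlApi = URL) {
  const url = urlApi.createObjectURL(blob);
  try {
    return await use(url);
  } finally {
    urlApi.revokeObjectURL(url);
  }
}
```

This only suits short-lived uses, such as reading the metadata of a recording. A URL that stays bound to an audio element has to be revoked later, when the element no longer needs it.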

All about blob URLs (continued)

How does the lifetime of a blob URL play out in Vue components?

All single-file components in a Vue app share one document, which is why it is called a single-page application, so components can pass blob URLs to each other directly.
When vue-router runs in hash mode, navigating between routes does not reload the whole page, so a blob URL stays valid for a long time. That means blob URLs can also be used for communication between components on different pages (as long as it is not a new tab).
Note that, for the same reason, in hash mode you need to be extra careful to release memory with URL.revokeObjectURL().

<!-- Component that emits the blob URL -->
<label for="background">Upload background</label>
<input type="file" style="display: none"
           id="background" name="background"
           accept="image/png, image/jpeg"
           @change="backgroundUpload"
>
backgroundUpload(event) {
  const fileBlobURL = window.URL.createObjectURL(event.target.files[0]);
  this.$emit('background-change', fileBlobURL);
  // this.$bus.$emit('background-change', fileBlobURL);
},

<!-- Component that receives the blob URL -->
<BackgroundUploader @background-change="backgroundChangeHandler"></BackgroundUploader>
// this.$bus.$on("background-change", backgroundChangeHandler);
backgroundChangeHandler(url) {
    // some code handle blob url...
},

Further reading on the lifetime of blob URLs in Vue components:

https://github.com/FrankKai/F...

What does the server need to do for real-time speech transcription?

It needs to provide an endpoint that accepts a File instance wrapping the audio Blob and returns the transcribed text.

this.recorder.blobObjPCM = this.recorderInstance.getPCMBlob();

transferAudioToText() {
  this.editor.loading = true;
  const formData = new FormData();
  const file = new File([this.recorder.blobObjPCM], `${+new Date()}`, { type: this.recorder.blobObjPCM.type });
  formData.append('file', file);
  apiXunFei
    .realTimeVoiceTransliterationByFile(formData)
    .then((data) => {
      this.xunfeiTransfer.text = data;
      this.editor.tip = 'Send text';
      this.editor.transfered = !this.editor.transfered;
      this.editor.loading = false;
    })
    .catch(() => {
      this.editor.loading = false;
      this.$Message.error('Speech transcription failed');
    });
},

/**
* Get the PCM data as a Blob
*
* @returns { blob }  PCM data as a Blob
* @memberof Recorder
*/
getPCMBlob() {
    return new Blob([ this.getPCM() ]);
}

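How should the server implement this?

1. Authentication

When the client opens the WebSocket connection to the server, it authenticates with a token.

2. Start and confirm

The client sends a start request, and the server confirms that the request is valid.

3. Send and recognize

The client sends audio data in a loop while continuously receiving recognition results.

4. Stop and complete

The client notifies the server that all audio data has been sent; once recognition finishes, the server notifies the client that it is complete.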
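
Seen from the sending side, steps 2 to 4 amount to a start message, a loop over audio frames, and a stop message. The sketch below shows only that shape. Everything specific in it is hypothetical: the `{ type: 'start' }` and `{ type: 'stop' }` messages, the 1280-byte frame size (40 ms of 16 kHz, 16-bit mono audio), and the function names are placeholders, not any provider's actual protocol. A real client would also wait for the server's confirmation and pace the frames.

```javascript
// Split PCM bytes into fixed-size frames; the last frame may be shorter.
function* frames(pcmBytes, frameBytes) {
  for (let offset = 0; offset < pcmBytes.byteLength; offset += frameBytes) {
    yield pcmBytes.subarray(offset, Math.min(offset + frameBytes, pcmBytes.byteLength));
  }
}

// `socket` is assumed to be an already authenticated WebSocket-like object with send().
function streamPcm(socket, pcmBytes, frameBytes = 1280) {
  socket.send(JSON.stringify({ type: 'start' })); // step 2 (hypothetical message)
  let sent = 0;
  for (const frame of frames(pcmBytes, frameBytes)) {
    socket.send(frame); // step 3: one binary frame per send
    sent++;
  }
  socket.send(JSON.stringify({ type: 'stop' })); // step 4 (hypothetical message)
  return sent; // number of audio frames sent
}
```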

Alibaba Cloud OSS provides SDKs for Java, Python, C++, iOS, Android, and other platforms.

https://help.aliyun.com/docum...

Front-end implementation

// Send the audio
async audioSender() {
  const audioValidated = await this.audioValidator();
  if (audioValidated) {
    this.audio.urlMobile = await this.transferMp3ToAmr(this.recorder.blobObjMP3);
    const audioBase64Str = await this.transferBlobFileToBase64(this.recorder.blobObjMP3);
    this.audio.urlPC = await this.uploadAudioToOSS(audioBase64Str);
    this.$emit('audio-sender', {
      audioPathMobile: this.audio.urlMobile,
      audioLength: parseInt(this.$refs.audio.duration * 1000),
      transferredText: this.xunfeiTransfer.text,
      audioPathPC: this.audio.urlPC,
    });
    this.closeSmartAudio();
  }
},

// Generate AMR-format audio that can be sent to mobile clients
transferMp3ToAmr() {
  const formData = new FormData();
  const file = new File([this.recorder.blobObjMP3], `${+new Date()}`, { type: this.recorder.blobObjMP3.type });
  formData.append('file', file);
  return new Promise((resolve, reject) => {
    apiXunFei
      .mp32amr(formData)
      .then((data) => {
        resolve(data);
      })
      .catch((err) => {
        this.$Message.error('Failed to convert mp3 to amr');
        // Reject so that the awaiting caller does not hang forever
        reject(err);
      });
  });
},

// Convert the Blob object to a Base64 string for uploading to OSS
async transferBlobFileToBase64(file) {
  return new Promise((resolve) => {
    const reader = new FileReader();
    reader.readAsDataURL(file);
    reader.onloadend = function onloaded() {
      const fileBase64 = reader.result;
      resolve(fileBase64);
    };
  });
},

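A general-purpose recording component

1. Set the sample size, sample rate, and number of channels

2. Set the audio format

3. Set the unit used for the audio size: Byte, KB, or MB

4. Customize the start and stop icons (from iView), including type and size

5. Return the audio blob, its duration, and its size

6. Set a maximum duration and a maximum size; recording stops automatically when either is reached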

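Component code walkthrough

/*
 * * Design notes:
 * * Library used: js-audio-recorder
 * * Features:
 * * 1. Set the sample size, sample rate, and number of channels
 * * 2. Set the audio format
 * * 3. Set the unit used for the audio size: Byte, KB, or MB
 * * 4. Customize the start and stop icons (from iView), including type and size
 * * 5. Return the audio blob, its duration, and its size
 * * 6. Set a maximum duration and a maximum size; recording stops automatically when either is reached
 * * Author: 高凱
 * * Date: 2019.11.7
 */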

 <template>
  <div class="audio-maker-container">
    <Icon :type="computedRecorderIcon" @click="recorderVoice" :size="iconSize" />
  </div>
</template>
 
 
 <script>
import Recorder from 'js-audio-recorder';
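/*
 * js-audio-recorder instance
 * It is created here, outside the component, so that Vue does not set up unnecessary watchers on recorderInstance, which avoids wasted work
 */
let recorderInstance = null;
/*
* Recorder helper
* Handles auxiliary tasks such as tracking the recording state, audio duration, and audio size
*/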
const recorderHelperGenerator = () => ({
  hearing: false,
  duration: 0,
  size: 0,
});
export default {
  name: 'audio-maker',
  props: {
    sampleBits: {
      type: Number,
      default: 16,
    },
    sampleRate: {
      type: Number,
    },
    numChannels: {
      type: Number,
      default: 1,
    },
    audioType: {
      type: String,
      default: 'audio/wav',
    },
    startIcon: {
      type: String,
      default: 'md-arrow-dropright-circle',
    },
    stopIcon: {
      type: String,
      default: 'md-pause',
    },
    iconSize: {
      type: Number,
      default: 30,
    },
    sizeUnit: {
      type: String,
      default: 'MB',
      validator: (unit) => ['Byte', 'KB', 'MB'].includes(unit),
    },
    maxDuration: {
      type: Number,
      default: 10 * 60,
    },
    maxSize: {
      type: Number,
      default: 1,
    },
  },
  mounted() {
    this.initRecorderInstance();
  },
  beforeDestroy() {
    recorderInstance = null;
  },
  computed: {
    computedSampleRate() {
      const audioContext = new (window.AudioContext || window.webkitAudioContext)();
      const defaultSampleRate = audioContext.sampleRate;
      return this.sampleRate ? this.sampleRate : defaultSampleRate;
    },
    computedRecorderIcon() {
      return this.recorderHelper.hearing ? this.stopIcon : this.startIcon;
    },
    computedUnitDividend() {
      const sizeUnit = this.sizeUnit;
      let unitDividend = 1024 * 1024;
      switch (sizeUnit) {
        case 'Byte':
          unitDividend = 1;
          break;
        case 'KB':
          unitDividend = 1024;
          break;
        case 'MB':
          unitDividend = 1024 * 1024;
          break;
        default:
          unitDividend = 1024 * 1024;
      }
      return unitDividend;
    },
    computedMaxSize() {
      return this.maxSize * this.computedUnitDividend;
    },
  },
  data() {
    return {
      recorderHelper: recorderHelperGenerator(),
    };
  },
  watch: {
    'recorderHelper.duration': {
      handler(duration) {
        if (duration >= this.maxDuration) {
          this.stopRecord();
        }
      },
    },
    'recorderHelper.size': {
      handler(size) {
        if (size >= this.computedMaxSize) {
          this.stopRecord();
        }
      },
    },
  },
  methods: {
    initRecorderInstance() {
      // Sampling configuration
      const sampleConfig = {
        sampleBits: this.sampleBits, // sample size
        sampleRate: this.computedSampleRate, // sample rate
        numChannels: this.numChannels, // number of channels
      };
      recorderInstance = new Recorder(sampleConfig);
    },
    recorderVoice() {
      if (!this.recorderHelper.hearing) {
        // Reset the recording state before recording
        this.reset();
        this.startRecord();
      } else {
        this.stopRecord();
      }
      this.recorderHelper.hearing = !this.recorderHelper.hearing;
    },
    startRecord() {
      try {
        recorderInstance.start();
        // The callback keeps reporting the elapsed duration
        recorderInstance.onprogress = ({ duration }) => {
          this.recorderHelper.duration = duration;
          this.$emit('on-recorder-duration-change', parseFloat(this.recorderHelper.duration.toFixed(2)));
        };
      } catch (err) {
        this.$debug(err);
      }
    },
    stopRecord() {
      recorderInstance.stop();
      const audioBlob = new Blob([recorderInstance.getWAV()], { type: this.audioType });
      this.recorderHelper.size = (audioBlob.size / this.computedUnitDividend).toFixed(2);
      this.$emit('on-recorder-finish', { blob: audioBlob, size: parseFloat(this.recorderHelper.size), unit: this.sizeUnit });
    },
    reset() {
      this.recorderHelper = recorderHelperGenerator();
    },
  },
};
</script>
 
 <style lang="scss" scoped>
.audio-maker-container {
  display: inline;
  i.ivu-icon {
    cursor: pointer;
  }
}
</style>

https://github.com/2fps/recor...

Using the component

import AudioMaker from '@/components/audioMaker';
<AudioMaker
  v-if="!recorderAudio.blobUrl"
  @on-recorder-duration-change="durationChange"
  @on-recorder-finish="recorderFinish"
  :maxDuration="audioMakerConfig.maxDuration"
  :maxSize="audioMakerConfig.maxSize"
  :sizeUnit="audioMakerConfig.sizeUnit"
></AudioMaker>

durationChange(duration) {
  this.resetRecorderAudio();
  this.recorderAudio.duration = duration;
},
recorderFinish({ blob, size, unit }) {
  this.recorderAudio.blobUrl = window.URL.createObjectURL(blob);
  this.recorderAudio.size = size;
  this.recorderAudio.unit = unit;
},
releaseBlobMemory(blobUrl) {
  window.URL.revokeObjectURL(blobUrl);
},
