Audio and Video Learning: H.264 Encoding

Posted by 聰莞 on 2019-08-06

Original article: https://juejin.im/post/5d4949bef265da03a0495ff2

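Building on the earlier articles "Audio and Video Learning: Basic Concepts" and "Audio and Video Learning: H.264 Structure and Bitstream Parsing", this article starts writing code. The capture workflow built on the AVFoundation framework is not repeated here; we start directly in the capture delegate method **captureOutput:didOutputSampleBuffer:fromConnection:** and encode the video frames from there. The overall flow has four steps:

Prepare the encoder: create the session with VTCompressionSessionCreate and set the encoder properties.
Start encoding: VTCompressionSessionEncodeFrame.
Process the data in the encode-completion callback: add the start code **"\x00\x00\x00\x01"**, add the SPS and PPS, and so on.
Finish encoding: clean up the data and release resources.

Preparing the encoder

Create the session: VTCompressionSessionCreate
Set properties with VTSessionSetProperty: real-time output, whether to produce B-frames, key frame interval, expected frame rate, bitrate, maximum bitrate, and so on
Prepare to encode: VTCompressionSessionPrepareToEncodeFrames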

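-(void)initVideoToolBox
{
    // cEncodeQueue is a serial queue
    dispatch_sync(cEncodeQueue, ^{

        frameID = 0;
        int width = 480, height = 640;

        // Create the compression session
        OSStatus status = VTCompressionSessionCreate(NULL, width, height, kCMVideoCodecType_H264, NULL, NULL, NULL, didCompressH264, (__bridge void *)(self), &cEncodeingSession);
        NSLog(@"H264:VTCompressionSessionCreate:%d",(int)status);

        if (status != noErr) {
            NSLog(@"H264:Unable to create a H264 session");
            return;
        }

        // Real-time encoding output (keeps latency low)
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
        // Baseline profile, level chosen automatically
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Baseline_AutoLevel);

        // Disable frame reordering, i.e. produce no B-frames
        // (B-frames are optional and can be left out; the Baseline profile has none anyway)
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_AllowFrameReordering, kCFBooleanFalse);

        // Key frame interval (GOP size). At a fixed bitrate, a GOP that is too small makes the picture blurry
        int frameInterval = 10;
        CFNumberRef frameIntervalRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &frameInterval);
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, frameIntervalRef);
        CFRelease(frameIntervalRef);

        // Expected frame rate; this is a hint, not the actual frame rate
        int fps = 10;
        CFNumberRef fpsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fps);
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_ExpectedFrameRate, fpsRef);
        CFRelease(fpsRef);

        // A higher bitrate gives a sharper picture but a larger file;
        // a lower bitrate can look blurry at times but is still watchable.
        // Bitrate formula: see the author's Evernote (印象筆記) notes.
        // Average bitrate, in bits per second
        int bitRate = width * height * 3 * 4 * 8;
        CFNumberRef bitRateRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRate);
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_AverageBitRate, bitRateRef);
        CFRelease(bitRateRef);

        // Hard data rate limit. DataRateLimits expects a CFArray of
        // [byte count, duration in seconds] pairs, not a single number.
        int bytesLimit = width * height * 3 * 4;
        int limitSeconds = 1;
        CFNumberRef bytesLimitRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bytesLimit);
        CFNumberRef limitSecondsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &limitSeconds);
        const void *limitValues[] = { bytesLimitRef, limitSecondsRef };
        CFArrayRef dataRateLimitsRef = CFArrayCreate(kCFAllocatorDefault, limitValues, 2, &kCFTypeArrayCallBacks);
        VTSessionSetProperty(cEncodeingSession, kVTCompressionPropertyKey_DataRateLimits, dataRateLimitsRef);
        CFRelease(dataRateLimitsRef);
        CFRelease(limitSecondsRef);
        CFRelease(bytesLimitRef);

        // Get ready to encode
        VTCompressionSessionPrepareToEncodeFrames(cEncodeingSession);

    });

}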
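To make the bitrate arithmetic in the listing above concrete, here is a minimal C sketch (plain C also compiles inside an Objective-C file). The helper names are hypothetical and are not part of VideoToolbox; the sketch only evaluates the article's own formula, and the constants 3, 4 and 8 are the author's choice rather than a general recommendation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the article's formula for the average bitrate,
 * in bits per second. 64-bit math avoids overflow at large resolutions. */
static int64_t average_bit_rate_bps(int width, int height) {
    assert(width > 0 && height > 0);
    return (int64_t)width * height * 3 * 4 * 8;
}

/* Hypothetical helper: the same budget expressed in bytes (bits / 8),
 * which is the unit the article uses for the data rate limit. */
static int64_t data_rate_limit_bytes(int width, int height) {
    return average_bit_rate_bps(width, height) / 8;
}
```

For the 480 x 640 frames used in this article, `average_bit_rate_bps(480, 640)` evaluates to 29,491,200 bits per second and `data_rate_limit_bytes(480, 640)` to 3,686,400 bytes.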

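Parameters of VTCompressionSessionCreate, which creates the encoder object:

allocator: the allocator. Pass NULL to use the default allocator.
width: the frame width in pixels.
height: the frame height in pixels.
codecType: the codec type, for example kCMVideoCodecType_H264.
encoderSpecification: selects a specific encoder. Pass NULL to let VideoToolbox choose one.
sourceImageBufferAttributes: attributes for the source pixel buffers. Pass NULL so that VideoToolbox does not create a pixel buffer pool and you supply the buffers yourself.
compressedDataAllocator: the allocator for the compressed data. Pass NULL to use the default allocator.
outputCallback: the encode callback, invoked (possibly asynchronously) once a frame passed to VTCompressionSessionEncodeFrame has been compressed. The function set here is didCompressH264.
outputCallbackRefCon: a client-defined reference value handed to the callback. self is passed here because the C callback needs to call methods on self and has no direct access to it.
compressionSessionOut: receives the new compression session.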

Starting to encode

Get the unencoded video frame: CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);
Set the frame time: CMTime presentationTimeStamp = CMTimeMake(frameID++, 1000);
Encode: call VTCompressionSessionEncodeFrame

- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    // Recording has started: pass each camera frame to the encode method
    dispatch_sync(cEncodeQueue, ^{
        [self encode:sampleBuffer];
    });
}

- (void) encode:(CMSampleBufferRef )sampleBuffer
{
  // The session is set to NULL after an encode error, so skip further frames
  if (cEncodeingSession == NULL) {
        return;
  }

  // Get the unencoded data of this frame
  CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);

  // Set the frame time
  CMTime presentationTimeStamp = CMTimeMake(frameID++, 1000);

  // Receives information about the encode operation
  VTEncodeInfoFlags flags = 0;

  // Encode the frame
  OSStatus statusCode = VTCompressionSessionEncodeFrame(cEncodeingSession, imageBuffer, presentationTimeStamp, kCMTimeInvalid, NULL, NULL, &flags);

  if (statusCode != noErr) {
        // Encoding failed
        NSLog(@"H.264:VTCompressionSessionEncodeFrame failed with %d",(int)statusCode);

        // Release resources
        VTCompressionSessionInvalidate(cEncodeingSession);
        CFRelease(cEncodeingSession);
        cEncodeingSession = NULL;
        return;
    }
}


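Parameters of the encode function VTCompressionSessionEncodeFrame:

session: the compression session.
imageBuffer: the unencoded frame.
presentationTimeStamp: the presentation timestamp of this frame. Every timestamp passed to the session must be greater than the previous one.
duration: the presentation duration of this frame. Pass kCMTimeInvalid if no duration information is available.
frameProperties: properties for this frame. Some changes made here also affect the frames encoded after it.
sourceFrameRefCon: a reference value for this frame that is passed on to the callback.
infoFlagsOut: pointer to a VTEncodeInfoFlags that receives information about the encode operation. kVTEncodeInfo_Asynchronous is set if the encode is running asynchronously; kVTEncodeInfo_FrameDropped is set if the frame was dropped. Pass NULL if you do not need this information.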
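A presentation timestamp is a rational number: value / timescale seconds. With CMTimeMake(frameID++, 1000), each frame therefore advances the clock by 1 ms, regardless of the expected frame rate of 10 set on the session. The following C sketch shows one way a timestamp value could be derived from a frame index and a nominal frame rate instead. Both helper names are hypothetical, and the sketch assumes a constant frame rate; it is an illustration, not what the article's code does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: timestamp value for frame `frameIndex` at a
 * constant nominal frame rate. The time in seconds is value / timescale.
 * Values are strictly increasing as long as timescale >= fps. */
static int64_t pts_value_for_frame(int64_t frameIndex, int32_t timescale, int32_t fps) {
    assert(frameIndex >= 0 && timescale > 0 && fps > 0);
    return frameIndex * timescale / fps;
}

/* Hypothetical helper: convert a (value, timescale) pair to seconds. */
static double pts_seconds(int64_t value, int32_t timescale) {
    assert(timescale > 0);
    return (double)value / timescale;
}
```

With a timescale of 1000 and 10 frames per second, frame 1 gets the value 100 (0.1 s) and frame 25 gets 2500 (2.5 s). The resulting value could then be passed as the first argument of CMTimeMake.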

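Processing the data after encoding

Check whether the frame is a key frame: if it is, get the SPS and PPS with CMVideoFormatDescriptionGetH264ParameterSetAtIndex, convert them to binary data, and write them to a file or upload them.
Assemble the NALU data: get the encoded H.264 stream with CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer), then walk it by offsetting dataPointer, using the start address, the length at that offset and the total length returned by OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer); Mind the byte order when reading the length fields: they are big-endian, as is usual for network transmission.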
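To tell SPS, PPS and key frame data apart once they are in the stream, the NAL unit type can be read from the low five bits of the first byte of each NAL unit. The C sketch below shows that check. The function name and the enum names are hypothetical; the numeric type values (1, 5, 6, 7, 8) are the ones defined by the H.264 standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A few nal_unit_type values from the H.264 standard. */
enum {
    NAL_TYPE_NON_IDR_SLICE = 1, /* slice of a non-key frame */
    NAL_TYPE_IDR_SLICE     = 5, /* slice of a key (IDR) frame */
    NAL_TYPE_SEI           = 6, /* supplemental enhancement information */
    NAL_TYPE_SPS           = 7, /* sequence parameter set */
    NAL_TYPE_PPS           = 8  /* picture parameter set */
};

/* Hypothetical helper: `nalu` points at the first byte of a NAL unit,
 * i.e. just after the 4-byte length prefix or start code. */
static int h264_nal_unit_type(const uint8_t *nalu, size_t length) {
    assert(nalu != NULL && length > 0);
    return nalu[0] & 0x1F; /* low 5 bits of the NAL header byte */
}
```

For example, a NAL unit starting with 0x67 is an SPS, 0x68 is a PPS, and 0x65 is an IDR slice.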

/*
    1. When hardware H.264 encoding of a frame finishes, the VTCompressionOutputCallback is invoked.
    2. The encoded CMSampleBuffer is converted into an H.264 stream that can be written to a file or sent over the network.
    3. Extract the SPS and PPS parameter sets and prefix them with start codes to form NALUs. Then extract the video data, replace each length prefix with a start code to form NALUs, and send the NALUs out.
 */
void didCompressH264(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer)
{
    NSLog(@"didCompressH264 called with status %d infoFlags %d",(int)status,(int)infoFlags);
    // Encoding reported an error
    if (status != noErr) {
        return;
    }

    // The data is not ready yet
    if (!CMSampleBufferDataIsReady(sampleBuffer)) {
        NSLog(@"didCompressH264 data is not ready");
        return;
    }

    ViewController *encoder = (__bridge ViewController *)outputCallbackRefCon;

    // Check whether this frame is a key frame:
    // a sync (key) frame does not carry the NotSync attachment
    CFArrayRef array = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true);
    CFDictionaryRef dic = CFArrayGetValueAtIndex(array, 0);
    bool keyFrame = !CFDictionaryContainsKey(dic, kCMSampleAttachmentKey_NotSync);

    // For a key frame, fetch the SPS and PPS. They must appear at the start of
    // the .h264 file; this code also rewrites them before each later key frame.
    // SPS: Sequence Parameter Set (profile, level, resolution and other sequence-wide settings)
    // PPS: Picture Parameter Set (per-picture coding settings)
    if (keyFrame) {
        // Format description: how the image is stored, which codec is used, and so on
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);

        // SPS is the parameter set at index 0
        size_t sparameterSetSize,sparameterSetCount;
        const uint8_t *sparameterSet;
        OSStatus spsStatus = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sparameterSet, &sparameterSetSize, &sparameterSetCount, 0);

        if (spsStatus == noErr) {

            // PPS is the parameter set at index 1
            size_t pparameterSetSize,pparameterSetCount;
            const uint8_t *pparameterSet;

            OSStatus ppsStatus = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pparameterSet, &pparameterSetSize, &pparameterSetCount, 0);

            // Both parameter sets were fetched: wrap them in NSData and hand them over
            if (ppsStatus == noErr)
            {
                NSData *sps = [NSData dataWithBytes:sparameterSet length:sparameterSetSize];
                NSData *pps = [NSData dataWithBytes:pparameterSet length:pparameterSetSize];

                if(encoder)
                {
                    [encoder gotSpsPps:sps pps:pps];
                }
            }
        }
    }

    // Note: the loop below assumes the block buffer is contiguous (length == totalLength)
    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length,totalLength;
    char *dataPointer;
    OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer);
    if (statusCodeRet == noErr) {
        size_t bufferOffset = 0;
        // The first 4 bytes of each returned NALU are not a 00 00 00 01 start code
        // but the NALU length in big-endian order (AVCC format)
        static const int AVCCHeaderLength = 4;

        // Loop over the NALUs. Written as an addition so that a buffer shorter
        // than the header cannot make the unsigned subtraction wrap around.
        while (bufferOffset + AVCCHeaderLength < totalLength) {

            uint32_t NALUnitLength = 0;

            // Read the length of this NALU
            memcpy(&NALUnitLength, dataPointer + bufferOffset, AVCCHeaderLength);

            // Convert from big-endian to host byte order
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);

            // Get the NALU payload
            NSData *data = [[NSData alloc]initWithBytes:(dataPointer + bufferOffset + AVCCHeaderLength) length:NALUnitLength];

            // Write the NALU to the file
            [encoder gotEncodedData:data isKeyFrame:keyFrame];

            // Move to the next NAL unit in the block buffer;
            // one callback may carry several NALUs
            bufferOffset += AVCCHeaderLength + NALUnitLength;
        }
    }
}

// Write the SPS and PPS ahead of the first frame
- (void)gotSpsPps:(NSData*)sps pps:(NSData*)pps
{
    const char bytes[] = "\x00\x00\x00\x01";

    size_t length = (sizeof bytes) - 1;    // the last byte is the terminating \0

    NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];

    [fileHandele writeData:ByteHeader];
    [fileHandele writeData:sps];
    [fileHandele writeData:ByteHeader];
    [fileHandele writeData:pps];
}

- (void)gotEncodedData:(NSData*)data isKeyFrame:(BOOL)isKeyFrame
{
    if (fileHandele != NULL) {
        // Add the 4-byte H.264 start code as a separator.
        // Generally the first data the encoder produces is the SPS and PPS.
        // In an H.264 byte stream every NAL unit is preceded by the start code 0x00000001;
        // when the decoder detects a start code, the current NAL unit has ended.
        const char bytes[] ="\x00\x00\x00\x01";
        // Length without the terminating \0
        size_t length = (sizeof bytes) - 1;

        // Start code header
        NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];
        // Write the start code
        [fileHandele writeData:ByteHeader];

        // Write the H.264 data
        [fileHandele writeData:data];
    }
}
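The conversion performed by the callback above, replacing each 4-byte big-endian length prefix with a 00 00 00 01 start code, can also be sketched in plain C without any Apple framework. The function name `avcc_to_annexb` is hypothetical. The sketch assumes 4-byte length prefixes, as in the listing, and reads the length byte by byte so that it does not depend on the host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rewrite a buffer of length-prefixed NAL units as a
 * start-code-delimited byte stream. Returns the number of bytes written
 * to `out`, or 0 if the input is malformed or `out` is too small. */
static size_t avcc_to_annexb(const uint8_t *in, size_t inLen,
                             uint8_t *out, size_t outCap) {
    static const uint8_t startCode[4] = {0x00, 0x00, 0x00, 0x01};
    size_t offset = 0, written = 0;
    assert(in != NULL && out != NULL);

    while (offset + 4 <= inLen) {
        /* Big-endian 32-bit length, assembled independently of host order */
        uint32_t naluLen = ((uint32_t)in[offset] << 24) |
                           ((uint32_t)in[offset + 1] << 16) |
                           ((uint32_t)in[offset + 2] << 8) |
                           (uint32_t)in[offset + 3];
        offset += 4;

        if (naluLen > inLen - offset) return 0;             /* truncated input */
        if (outCap - written < 4) return 0;                 /* no room for start code */
        if (outCap - written - 4 < naluLen) return 0;       /* no room for payload */

        memcpy(out + written, startCode, 4);
        memcpy(out + written + 4, in + offset, naluLen);
        written += 4 + (size_t)naluLen;
        offset += naluLen;
    }
    /* Leftover bytes that do not form a full length prefix mean bad input */
    return offset == inLen ? written : 0;
}
```

Because the length prefix and the start code are both 4 bytes long, the output has the same size as the input, so an output buffer of `inLen` bytes is sufficient.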

Finishing encoding

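-(void)endVideoToolBox
{
    // Nothing to do if the session was never created or was already released after an encode error
    if (cEncodeingSession == NULL) {
        return;
    }
    // Flush the pending frames, then invalidate and release the session
    VTCompressionSessionCompleteFrames(cEncodeingSession, kCMTimeInvalid);
    VTCompressionSessionInvalidate(cEncodeingSession);
    CFRelease(cEncodeingSession);
    cEncodeingSession = NULL;
}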
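Once the session has been closed, the written file is a sequence of start-code-delimited NAL units. As a quick sanity check, one could count the NAL units in a buffer read back from that file. The C sketch below does this by scanning for the byte pattern 00 00 01, which also matches the 4-byte form 00 00 00 01 used in this article. The function name is hypothetical; the scan relies on the fact that H.264 escapes this pattern inside NAL unit payloads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the start codes (and therefore the NAL
 * units) in a start-code-delimited H.264 byte stream. */
static size_t annexb_count_nal_units(const uint8_t *buf, size_t len) {
    size_t count = 0, i = 0;
    assert(buf != NULL);

    while (i + 3 <= len) {
        if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01) {
            count++;
            i += 3; /* skip past the start code */
        } else {
            i++;
        }
    }
    return count;
}
```

A file that begins with an SPS, a PPS and one key frame slice should report three NAL units for that prefix.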
