[MetalKit] 37 - Using ARKit with Metal

Posted by 蘋果API搬運工 on 2017-12-25

This series of articles is a full translation and study of the MetalKit content on metalkit.org.

MetalKit series table of contents


Augmented Reality provides a way of overlaying virtual content on top of the real-world view obtained from the camera. Last month at WWDC 2017 we were all excited to see Apple's new ARKit framework, a high-level API that works on A9 (or newer) devices running iOS 11. Some of the ARKit experiments are truly outstanding, like this one:

ARKit.gif

There are three distinct layers in an ARKit application:

  • Tracking - world tracking via visual-inertial odometry, with no external setup required.
  • Scene Understanding - the ability to determine attributes of the scene using plane detection, hit-testing, and light estimation.
  • Rendering - easy to integrate, since AR view templates are provided by SpriteKit and SceneKit, and it can be customized with Metal. All pre-rendering is handled by ARKit, which is also responsible for image capture using AVFoundation and CoreMotion.

In this first part of the series we will focus on Rendering in Metal; the other two layers will be discussed in the next part. In an AR app, Tracking and Scene Understanding are handled entirely by the ARKit framework, while Rendering can be handled with SpriteKit, SceneKit, or Metal:

ARKit1.png

To get started, we need an ARSession instance, which is created with an ARSessionConfiguration object. We then call the session's run() function with that configuration. The session manages an AVCaptureSession and a CMMotionManager object that run concurrently to obtain the image and motion data needed for tracking. Finally, the session outputs the current frame to an ARFrame object:

ARKit2.png

The ARSessionConfiguration object holds information about the type of tracking. The base ARSessionConfiguration class provides three degrees of freedom of tracking, while its subclass, ARWorldTrackingSessionConfiguration, provides six degrees of freedom (device position and rotation).

ARKit4.png

When a device does not support world tracking, fall back to the base configuration:

if ARWorldTrackingSessionConfiguration.isSupported { 
    configuration = ARWorldTrackingSessionConfiguration()
} else {
    configuration = ARSessionConfiguration() 
}

An ARFrame contains the captured image, tracking information, and scene information, the latter obtained through ARAnchor objects that carry real-world position and orientation and that can easily be added to, updated in, or removed from the session. Tracking is the ability to determine physical location in real time. World tracking determines both position and orientation; it works with physical distances, is relative to the starting position, and provides 3D feature points.

The last component of an ARFrame is the ARCamera object, which handles transforms (translation, rotation, scaling) and carries the tracking state and the camera intrinsics. The quality of tracking depends heavily on uninterrupted sensor data and a stable scene, and it is more accurate when the scene contains plenty of texture complexity. The tracking state has three values: Not Available (the camera only has the identity matrix), Limited (the scene lacks features or is not stable enough), and Normal (camera data is fine). Session interruptions occur when camera input is unavailable or when tracking is stopped:

func session(_ session: ARSession, cameraDidChangeTrackingState camera: ARCamera) { 
    if case .limited(let reason) = camera.trackingState {
        // Notify user of limited tracking state
    } 
}
func sessionWasInterrupted(_ session: ARSession) { 
    showOverlay()
}
func sessionInterruptionEnded(_ session: ARSession) { 
    hideOverlay()
    // Optionally restart experience
}

Rendering can be done in SceneKit using the ARSCNView delegate to add, update, or remove nodes. Similarly, rendering can be done in SpriteKit using the ARSKView delegate to map SKNodes to ARAnchor objects. Because SpriteKit is 2D, it cannot use the real-world camera position, so it projects the anchor position into the ARSKView and renders the sprite at that projected position as a billboard (plane), so the sprite always faces the camera. For Metal there is no custom AR view, so that responsibility falls on the programmer. To process the rendered image we need to:

  • Draw the background camera image (generate a texture from the pixel buffer)
  • Update the virtual camera
  • Update the lighting
  • Update the transforms of the geometry

All of this information is in the ARFrame object. There are two options for accessing the frame: polling, or using a delegate. We will use the latter. I took the template Xcode provides for ARKit with Metal and reduced it to the bare minimum so I could better understand how it works. The first thing I did was remove all the C dependencies so that bridging is no longer needed. Keeping those types and enums could be useful later for sharing them between the API code and the shaders, but for this article it is not necessary.
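Since we chose the delegate approach, it is worth noting how a frame actually arrives: ARKit calls the ARSessionDelegate method below once per frame. The method signature is the real API; the body is only an illustrative sketch.

```swift
// Hedged sketch: with the delegate approach, ARKit pushes each new
// frame to us through ARSessionDelegate instead of us polling
// session.currentFrame. The body here is illustrative only.
func session(_ session: ARSession, didUpdate frame: ARFrame) {
    // frame.camera, frame.anchors, and frame.capturedImage are all
    // available here, once per captured frame.
}
```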

Next, on to the ViewController, which will serve as the delegate for both our MTKView and the ARSession. We create a Renderer instance that works with the delegates to update the app in real time:

var session: ARSession!
var renderer: Renderer!

override func viewDidLoad() {
    super.viewDidLoad()
    session = ARSession()
    session.delegate = self
    if let view = self.view as? MTKView {
        view.device = MTLCreateSystemDefaultDevice()
        view.delegate = self
        renderer = Renderer(session: session, metalDevice: view.device!, renderDestination: view)
        renderer.drawRectResized(size: view.bounds.size)
    }
    let tapGesture = UITapGestureRecognizer(target: self, action: #selector(self.handleTap(gestureRecognize:)))
    view.addGestureRecognizer(tapGesture)
}

As you can see, we add a gesture recognizer, which we will use to add virtual content to the view. We first get the session's current frame, then create a translation to place our object in front of the camera (0.3 meters away in this case), and finally add a new anchor to the session using this transform:

@objc func handleTap(gestureRecognize: UITapGestureRecognizer) {
    if let currentFrame = session.currentFrame {
        var translation = matrix_identity_float4x4
        translation.columns.3.z = -0.3
        let transform = simd_mul(currentFrame.camera.transform, translation)
        let anchor = ARAnchor(transform: transform)
        session.add(anchor: anchor)
    }
}

We use the **viewWillAppear()** and **viewWillDisappear()** methods to start and pause the session:

override func viewWillAppear(_ animated: Bool) {
    super.viewWillAppear(animated)
    let configuration = ARWorldTrackingSessionConfiguration()
    session.run(configuration)
}

override func viewWillDisappear(_ animated: Bool) {
    super.viewWillDisappear(animated)
    session.pause()
}

All that is left are the delegate methods that respond to view updates or to session errors and interruptions:

func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
    renderer.drawRectResized(size: size)
}

func draw(in view: MTKView) {
    renderer.update()
}

func session(_ session: ARSession, didFailWithError error: Error) {}

func sessionWasInterrupted(_ session: ARSession) {}

func sessionInterruptionEnded(_ session: ARSession) {}

Let's now move on to the Renderer.swift file. The first thing to notice is a very useful protocol that gives us access to all the MTKView properties we will need later in our draw calls:

protocol RenderDestinationProvider {
    var currentRenderPassDescriptor: MTLRenderPassDescriptor? { get }
    var currentDrawable: CAMetalDrawable? { get }
    var colorPixelFormat: MTLPixelFormat { get set }
    var depthStencilPixelFormat: MTLPixelFormat { get set }
    var sampleCount: Int { get set }
}

Now you can simply extend the MTKView class (in ViewController) to make it conform to this protocol:

extension MTKView : RenderDestinationProvider {}

For a high-level view of the Renderer class, here is its pseudocode:

init() {
    setupPipeline()
    setupAssets()
}
    
func update() {
    updateBufferStates()
    updateSharedUniforms()
    updateAnchors()
    updateCapturedImageTextures()
    updateImagePlane()
    drawCapturedImage()
    drawAnchorGeometry()
}

As before, we first create the pipeline, here in the setupPipeline() function. Then, in setupAssets(), we create the model that gets loaded whenever our tap gesture is recognized. The MTKView delegate calls the update() function whenever a draw call or update is needed. Let's look at these more closely. First we have updateBufferStates(), which updates the locations we write to in our buffers for the current frame (in this case we use a ring buffer with three slots):

func updateBufferStates() {
    uniformBufferIndex = (uniformBufferIndex + 1) % maxBuffersInFlight
    sharedUniformBufferOffset = alignedSharedUniformSize * uniformBufferIndex
    anchorUniformBufferOffset = alignedInstanceUniformSize * uniformBufferIndex
    sharedUniformBufferAddress = sharedUniformBuffer.contents().advanced(by: sharedUniformBufferOffset)
    anchorUniformBufferAddress = anchorUniformBuffer.contents().advanced(by: anchorUniformBufferOffset)
}
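The setupAssets() step from the pseudocode above is not reproduced in the article. Assuming the cube geometry comes from Model I/O, a minimal sketch might look like this (the extent, segment counts, and return shape are assumptions, not the template's actual code):

```swift
// Hedged sketch of setupAssets(): build a 10 cm cube with Model I/O
// and wrap it in a MetalKit mesh for rendering. The vertex layout the
// real template uses is not shown in the article.
func setupAssets(device: MTLDevice) throws -> MTKMesh {
    let allocator = MTKMeshBufferAllocator(device: device)
    let mdlMesh = MDLMesh(boxWithExtent: vector_float3(0.1, 0.1, 0.1),
                          segments: vector_uint3(1, 1, 1),
                          inwardNormals: false,
                          geometryType: .triangles,
                          allocator: allocator)
    return try MTKMesh(mesh: mdlMesh, device: device)
}
```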

Next, in updateSharedUniforms() we update the frame's shared uniforms and set up the lighting for the scene:

func updateSharedUniforms(frame: ARFrame) {
    let uniforms = sharedUniformBufferAddress.assumingMemoryBound(to: SharedUniforms.self)
    uniforms.pointee.viewMatrix = simd_inverse(frame.camera.transform)
    uniforms.pointee.projectionMatrix = frame.camera.projectionMatrix(withViewportSize: viewportSize, orientation: .landscapeRight, zNear: 0.001, zFar: 1000)
    var ambientIntensity: Float = 1.0
    if let lightEstimate = frame.lightEstimate {
        ambientIntensity = Float(lightEstimate.ambientIntensity) / 1000.0
    }
    let ambientLightColor: vector_float3 = vector3(0.5, 0.5, 0.5)
    uniforms.pointee.ambientLightColor = ambientLightColor * ambientIntensity
    var directionalLightDirection : vector_float3 = vector3(0.0, 0.0, -1.0)
    directionalLightDirection = simd_normalize(directionalLightDirection)
    uniforms.pointee.directionalLightDirection = directionalLightDirection
    let directionalLightColor: vector_float3 = vector3(0.6, 0.6, 0.6)
    uniforms.pointee.directionalLightColor = directionalLightColor * ambientIntensity
    uniforms.pointee.materialShininess = 30
}

Next, in updateAnchors() we update the anchor uniform buffer with the transforms of the current frame's anchors:

func updateAnchors(frame: ARFrame) {
    anchorInstanceCount = min(frame.anchors.count, maxAnchorInstanceCount)
    var anchorOffset: Int = 0
    if anchorInstanceCount == maxAnchorInstanceCount {
        anchorOffset = max(frame.anchors.count - maxAnchorInstanceCount, 0)
    }
    for index in 0..<anchorInstanceCount {
        let anchor = frame.anchors[index + anchorOffset]
        var coordinateSpaceTransform = matrix_identity_float4x4
        coordinateSpaceTransform.columns.2.z = -1.0
        let modelMatrix = simd_mul(anchor.transform, coordinateSpaceTransform)
        let anchorUniforms = anchorUniformBufferAddress.assumingMemoryBound(to: InstanceUniforms.self).advanced(by: index)
        anchorUniforms.pointee.modelMatrix = modelMatrix
    }
}

Next, in updateCapturedImageTextures() we create two textures from the captured image the frame provides, one for the luma (Y) plane and one for the chroma (CbCr) plane:

func updateCapturedImageTextures(frame: ARFrame) {
    let pixelBuffer = frame.capturedImage
    if (CVPixelBufferGetPlaneCount(pixelBuffer) < 2) { return }
    capturedImageTextureY = createTexture(fromPixelBuffer: pixelBuffer, pixelFormat:.r8Unorm, planeIndex:0)!
    capturedImageTextureCbCr = createTexture(fromPixelBuffer: pixelBuffer, pixelFormat:.rg8Unorm, planeIndex:1)!
}
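The createTexture(fromPixelBuffer:pixelFormat:planeIndex:) helper called above is not reproduced in the article. A hedged sketch of it, assuming a CVMetalTextureCache (here named capturedImageTextureCache) is created once at init time:

```swift
// Hedged sketch of the texture-creation helper: wrap one plane of the
// camera pixel buffer in a Metal texture without copying, using a
// CVMetalTextureCache assumed to exist as a property.
func createTexture(fromPixelBuffer pixelBuffer: CVPixelBuffer,
                   pixelFormat: MTLPixelFormat,
                   planeIndex: Int) -> MTLTexture? {
    let width = CVPixelBufferGetWidthOfPlane(pixelBuffer, planeIndex)
    let height = CVPixelBufferGetHeightOfPlane(pixelBuffer, planeIndex)
    var texture: CVMetalTexture?
    let status = CVMetalTextureCacheCreateTextureFromImage(
        kCFAllocatorDefault, capturedImageTextureCache, pixelBuffer, nil,
        pixelFormat, width, height, planeIndex, &texture)
    guard status == kCVReturnSuccess, let texture = texture else { return nil }
    return CVMetalTextureGetTexture(texture)
}
```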

Next, in updateImagePlane() we update the texture coordinates of the image plane to fit the viewport:

func updateImagePlane(frame: ARFrame) {
    let displayToCameraTransform = frame.displayTransform(withViewportSize: viewportSize, orientation: .landscapeRight).inverted()
    let vertexData = imagePlaneVertexBuffer.contents().assumingMemoryBound(to: Float.self)
    for index in 0...3 {
        let textureCoordIndex = 4 * index + 2
        let textureCoord = CGPoint(x: CGFloat(planeVertexData[textureCoordIndex]), y: CGFloat(planeVertexData[textureCoordIndex + 1]))
        let transformedCoord = textureCoord.applying(displayToCameraTransform)
        vertexData[textureCoordIndex] = Float(transformedCoord.x)
        vertexData[textureCoordIndex + 1] = Float(transformedCoord.y)
    }
}

Next, in drawCapturedImage() we draw the camera feed into the scene:

func drawCapturedImage(renderEncoder: MTLRenderCommandEncoder) {
    guard capturedImageTextureY != nil && capturedImageTextureCbCr != nil else { return }
    renderEncoder.pushDebugGroup("DrawCapturedImage")
    renderEncoder.setCullMode(.none)
    renderEncoder.setRenderPipelineState(capturedImagePipelineState)
    renderEncoder.setDepthStencilState(capturedImageDepthState)
    renderEncoder.setVertexBuffer(imagePlaneVertexBuffer, offset: 0, index: 0)
    renderEncoder.setFragmentTexture(capturedImageTextureY, index: 1)
    renderEncoder.setFragmentTexture(capturedImageTextureCbCr, index: 2)
    renderEncoder.drawPrimitives(type: .triangleStrip, vertexStart: 0, vertexCount: 4)
    renderEncoder.popDebugGroup()
}

Finally, in drawAnchorGeometry() we draw the anchors for the virtual content we created:

func drawAnchorGeometry(renderEncoder: MTLRenderCommandEncoder) {
    guard anchorInstanceCount > 0 else { return }
    renderEncoder.pushDebugGroup("DrawAnchors")
    renderEncoder.setCullMode(.back)
    renderEncoder.setRenderPipelineState(anchorPipelineState)
    renderEncoder.setDepthStencilState(anchorDepthState)
    renderEncoder.setVertexBuffer(anchorUniformBuffer, offset: anchorUniformBufferOffset, index: 2)
    renderEncoder.setVertexBuffer(sharedUniformBuffer, offset: sharedUniformBufferOffset, index: 3)
    renderEncoder.setFragmentBuffer(sharedUniformBuffer, offset: sharedUniformBufferOffset, index: 3)
    for bufferIndex in 0..<mesh.vertexBuffers.count {
        let vertexBuffer = mesh.vertexBuffers[bufferIndex]
        renderEncoder.setVertexBuffer(vertexBuffer.buffer, offset: vertexBuffer.offset, index:bufferIndex)
    }
    for submesh in mesh.submeshes {
        renderEncoder.drawIndexedPrimitives(type: submesh.primitiveType, indexCount: submesh.indexCount, indexType: submesh.indexType, indexBuffer: submesh.indexBuffer.buffer, indexBufferOffset: submesh.indexBuffer.offset, instanceCount: anchorInstanceCount)
    }
    renderEncoder.popDebugGroup()
}

Back to the setupPipeline() function mentioned earlier. We create two render pipeline state objects: one for the captured image (the camera feed) and one for the anchors created when we place virtual objects in the scene. As expected, each state object has its own pair of vertex and fragment functions, which brings us to the last file to look at: the Shaders.metal file. In the first pair of shaders, used for the captured image, we pass the image's vertex positions and texture coordinates through the vertex shader:
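setupPipeline() itself is not listed in the article. For the captured-image pass it boils down to a standard Metal pipeline state; a hedged sketch (the function name and descriptor details beyond the shader names are assumptions):

```swift
// Hedged sketch of one half of setupPipeline(): the pipeline state for
// drawing the captured camera image. The anchor pipeline is built the
// same way, using the anchor geometry shaders instead.
func makeCapturedImagePipelineState(device: MTLDevice,
                                    view: RenderDestinationProvider) throws -> MTLRenderPipelineState {
    let library = device.makeDefaultLibrary()!
    let descriptor = MTLRenderPipelineDescriptor()
    descriptor.label = "CapturedImagePipeline"
    descriptor.vertexFunction = library.makeFunction(name: "capturedImageVertexTransform")
    descriptor.fragmentFunction = library.makeFunction(name: "capturedImageFragmentShader")
    descriptor.colorAttachments[0].pixelFormat = view.colorPixelFormat
    descriptor.depthAttachmentPixelFormat = view.depthStencilPixelFormat
    descriptor.sampleCount = view.sampleCount
    return try device.makeRenderPipelineState(descriptor: descriptor)
}
```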

vertex ImageColorInOut capturedImageVertexTransform(ImageVertex in [[stage_in]]) {
    ImageColorInOut out;
    out.position = float4(in.position, 0.0, 1.0);
    out.texCoord = in.texCoord;
    return out;
}

In the fragment shader we sample both textures to get the color at the given texture coordinate, and then return the converted RGB color:

fragment float4 capturedImageFragmentShader(ImageColorInOut in [[stage_in]],
                                            texture2d<float, access::sample> textureY [[ texture(1) ]],
                                            texture2d<float, access::sample> textureCbCr [[ texture(2) ]]) {
    constexpr sampler colorSampler(mip_filter::linear, mag_filter::linear, min_filter::linear);
    const float4x4 ycbcrToRGBTransform = float4x4(float4(+1.0000f, +1.0000f, +1.0000f, +0.0000f),
                                                  float4(+0.0000f, -0.3441f, +1.7720f, +0.0000f),
                                                  float4(+1.4020f, -0.7141f, +0.0000f, +0.0000f),
                                                  float4(-0.7010f, +0.5291f, -0.8860f, +1.0000f));
    float4 ycbcr = float4(textureY.sample(colorSampler, in.texCoord).r, textureCbCr.sample(colorSampler, in.texCoord).rg, 1.0);
    return ycbcrToRGBTransform * ycbcr;
}
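The fourth column of ycbcrToRGBTransform folds the constant chroma offsets into the matrix. Writing out the standard BT.601 conversion with Cb and Cr centered at 0.5, the constants above fall out directly:

```latex
\begin{aligned}
R &= Y + 1.402\,(C_r - 0.5) &&= Y + 1.402\,C_r - 0.701\\
G &= Y - 0.3441\,(C_b - 0.5) - 0.7141\,(C_r - 0.5) &&= Y - 0.3441\,C_b - 0.7141\,C_r + 0.5291\\
B &= Y + 1.772\,(C_b - 0.5) &&= Y + 1.772\,C_b - 0.886
\end{aligned}
```

Each right-hand side matches one row of the matrix-vector product, with the constant terms living in the matrix's fourth column and the trailing 1.0 of the ycbcr vector selecting them.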

In the second pair of shaders, used for the anchor geometry, the vertex shader computes the vertex position in clip space and outputs it for clipping and rasterization, colors each cube face differently, computes the vertex position in eye space, and finally rotates the normal into world coordinates:

vertex ColorInOut anchorGeometryVertexTransform(Vertex in [[stage_in]],
                                                constant SharedUniforms &sharedUniforms [[ buffer(3) ]],
                                                constant InstanceUniforms *instanceUniforms [[ buffer(2) ]],
                                                ushort vid [[vertex_id]],
                                                ushort iid [[instance_id]]) {
    ColorInOut out;
    float4 position = float4(in.position, 1.0);
    float4x4 modelMatrix = instanceUniforms[iid].modelMatrix;
    float4x4 modelViewMatrix = sharedUniforms.viewMatrix * modelMatrix;
    out.position = sharedUniforms.projectionMatrix * modelViewMatrix * position;
    ushort colorID = vid / 4 % 6;
    out.color = colorID == 0 ? float4(0.0, 1.0, 0.0, 1.0)  // Right face
              : colorID == 1 ? float4(1.0, 0.0, 0.0, 1.0)  // Left face
              : colorID == 2 ? float4(0.0, 0.0, 1.0, 1.0)  // Top face
              : colorID == 3 ? float4(1.0, 0.5, 0.0, 1.0)  // Bottom face
              : colorID == 4 ? float4(1.0, 1.0, 0.0, 1.0)  // Back face
              :                float4(1.0, 1.0, 1.0, 1.0); // Front face
    out.eyePosition = half3((modelViewMatrix * position).xyz);
    float4 normal = modelMatrix * float4(in.normal.x, in.normal.y, in.normal.z, 0.0f);
    out.normal = normalize(half3(normal.xyz));
    return out;
}

In the fragment shader we compute the directional light's contribution as the sum of the diffuse and specular terms, then compute the final color by multiplying the interpolated face color by the fragment's lighting value, and finally return that color using the color's alpha channel as the fragment's alpha:

fragment float4 anchorGeometryFragmentLighting(ColorInOut in [[stage_in]],
                                               constant SharedUniforms &uniforms [[ buffer(3) ]]) {
    float3 normal = float3(in.normal);
    float3 directionalContribution = float3(0);
    {
        float nDotL = saturate(dot(normal, -uniforms.directionalLightDirection));
        float3 diffuseTerm = uniforms.directionalLightColor * nDotL;
        float3 halfwayVector = normalize(-uniforms.directionalLightDirection - float3(in.eyePosition));
        float reflectionAngle = saturate(dot(normal, halfwayVector));
        float specularIntensity = saturate(powr(reflectionAngle, uniforms.materialShininess));
        float3 specularTerm = uniforms.directionalLightColor * specularIntensity;
        directionalContribution = diffuseTerm + specularTerm;
    }
    float3 ambientContribution = uniforms.ambientLightColor;
    float3 lightContributions = ambientContribution + directionalContribution;
    float3 color = in.color.rgb * lightContributions;
    return float4(color, in.color.w);
}

If you run the app, you can tap the screen to add cubes to the camera view, then move around, get closer, or circle a cube to see each of its faces in a different color, like this:

ARKit5.gif

In the next part of this series we will take a deeper look at Tracking and Scene Understanding, and see how plane detection, hit-testing, collisions, and physics can make the experience even better. The source code is posted on GitHub.

Until next time!
