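[Translation] Using ARKit with Metal (Part 1)

Posted by LeviDing on 2017-12-02

Original article: Using ARKit with Metal

Original author: Marius Horga

Translation from: 掘金翻譯計劃 (Juejin Translation Project)

Permalink: github.com/xitu/gold-m…

Translator: RichardLeeH

Proofreader: Danny1451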

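Augmented reality provides a way to render virtual content on top of real-world scenes captured through a mobile device's camera. Last month, at WWDC 2017, we were all excited to see Apple's new ARKit framework, a high-level API that runs on iOS 11 devices with an A9 processor or newer. Some of the ARKit experiments we have seen so far are already quite impressive.

An ARKit application consists of 3 distinct layers:

Tracking - uses visual-inertial odometry to track the world, with no additional setup required.
Scene Understanding - the ability to detect scene attributes using plane detection, hit-testing, and light estimation.
Rendering - easy to integrate thanks to the template AR views provided by SpriteKit and SceneKit, and customizable with Metal. All pre-rendering processing is handled by ARKit, which is also responsible for image capture using AVFoundation and CoreMotion.

In this first part of the series we will focus mainly on Rendering with Metal, and discuss the other two layers in the next part. In an AR application, Tracking and Scene Understanding are handled entirely by the ARKit framework, while Rendering is handled by SpriteKit, SceneKit, or Metal.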

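To get started, we need an ARSession instance and an ARSessionConfiguration object, and we then call run() on the session, passing it that configuration. The ARSession also relies on running AVCaptureSession and CMMotionManager objects to obtain the image and motion data used for tracking. Finally, the ARSession outputs the current frame as an ARFrame object.

The ARSessionConfiguration object contains information about the type of tracking the session will use. The base ARSessionConfiguration class provides motion tracking with 3 degrees of freedom (device orientation), while its subclass ARWorldTrackingSessionConfiguration provides 6 degrees of freedom (device position and orientation).

When a device does not support world tracking, we fall back to the base configuration: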

let configuration: ARSessionConfiguration
if ARWorldTrackingSessionConfiguration.isSupported {
    configuration = ARWorldTrackingSessionConfiguration()
} else {
    configuration = ARSessionConfiguration() 
}
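
The class names above come from the iOS 11 beta SDK. To our knowledge, the released SDK renamed them: ARSessionConfiguration became ARConfiguration, the 3-degrees-of-freedom configuration became AROrientationTrackingConfiguration, and ARWorldTrackingSessionConfiguration became ARWorldTrackingConfiguration. A minimal sketch of the same fallback with the released names follows; the function name makeConfiguration is ours, chosen only for illustration:

```swift
import ARKit

// Hypothetical helper (not part of the original project): picks the richest
// tracking configuration the device supports, using the released class names.
func makeConfiguration() -> ARConfiguration {
    if ARWorldTrackingConfiguration.isSupported {
        // 6 degrees of freedom: device position and orientation.
        return ARWorldTrackingConfiguration()
    } else {
        // 3 degrees of freedom: device orientation only.
        return AROrientationTrackingConfiguration()
    }
}
```

The result can be passed to session.run() exactly like the beta-era configuration object.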

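ARFrame contains the captured image, tracking information, and scene information obtained through ARAnchor objects. An ARAnchor object holds information about a real-world position and orientation, and can easily be added to, updated in, or removed from the session. Tracking is the ability to determine physical location in real time. World tracking, however, determines both position and orientation; it works with physical distances, is relative to the starting position, and provides 3D feature points.

The last component of ARFrame is the ARCamera object, which facilitates transforms (translation, rotation, scaling) and carries the tracking state and camera-related methods. Tracking quality relies heavily on uninterrupted sensor data and static scenes, and is more accurate in environments with complex texture. The tracking state has three values: not available (the camera only has the identity matrix), limited (the scene has insufficient features or is not static enough), and normal (the camera is populated with data). Session interruptions are caused by camera input being unavailable or by tracking being stopped: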

func session(_ session: ARSession, cameraDidChangeTrackingState camera: ARCamera) { 
    if case .limited(let reason) = camera.trackingState {
        // Notify user of limited tracking state
    } 
}
func sessionWasInterrupted(_ session: ARSession) { 
    showOverlay()
}
func sessionInterruptionEnded(_ session: ARSession) { 
    hideOverlay()
    // Optionally restart experience
}

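With SceneKit, rendering is done through the ARSCNView delegate, which adds, updates, or removes nodes. Similarly, SpriteKit uses the ARSKView delegate to map SKNodes to ARAnchor objects. Because SpriteKit is 2D, it cannot use the real-world camera position, so it projects the anchor positions into the ARSKView and renders the sprites as billboards (planes) at the projected locations, which means the sprites always face the camera. For Metal there is no customized AR view, so that responsibility falls to the programmer. To process the rendered images we need to:

Draw the background camera image (generate a texture from the pixel buffer)
Update the virtual camera
Update the lighting
Update the transforms for the geometry

All of this information is in the ARFrame object. There are two ways to access the frame: polling or using a delegate. We will briefly cover the latter. I took the ARKit template for Metal and stripped it down to a minimum so I could better understand how it works. The first thing I did was remove all the C dependencies, so bridging is no longer needed. Bridging will be useful later, because it lets types and enum constants be shared between API code and shaders, but it is not needed for the purpose of this article.

Next, we go back to ViewController, which needs to act as the delegate for both MTKView and ARSession. We create a Renderer instance that works with the delegates to update the application in real time: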

var session: ARSession!
var renderer: Renderer!

override func viewDidLoad() {
    super.viewDidLoad()
    session = ARSession()
    session.delegate = self
    if let view = self.view as? MTKView {
        view.device = MTLCreateSystemDefaultDevice()
        view.delegate = self
        renderer = Renderer(session: session, metalDevice: view.device!, renderDestination: view)
        renderer.drawRectResized(size: view.bounds.size)
    }
    let tapGesture = UITapGestureRecognizer(target: self, action: #selector(self.handleTap(gestureRecognize:)))
    view.addGestureRecognizer(tapGesture)
}

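As you can see, we also added a gesture recognizer, which we use to add virtual content to the scene. We first get the session's current frame, then create a transform that places our entity in front of the camera (0.3 meters in this case), and finally use that transform to add a new anchor to the session: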

@objc func handleTap(gestureRecognize: UITapGestureRecognizer) {
    if let currentFrame = session.currentFrame {
        var translation = matrix_identity_float4x4
        translation.columns.3.z = -0.3
        let transform = simd_mul(currentFrame.camera.transform, translation)
        let anchor = ARAnchor(transform: transform)
        session.add(anchor: anchor)
    }
}
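
To see why this places the anchor in front of the camera, here is a small sketch of the same matrix math in isolation. It assumes an Apple platform where the simd module is available, and the camera pose is a made-up value standing in for currentFrame.camera.transform:

```swift
import simd

// Translation of 0.3 m along the camera's local -Z axis (its viewing direction).
var translation = matrix_identity_float4x4
translation.columns.3.z = -0.3

// Hypothetical camera pose: no rotation, located at (2, 1, 0) in world space.
var cameraTransform = matrix_identity_float4x4
cameraTransform.columns.3 = SIMD4<Float>(2.0, 1.0, 0.0, 1.0)

// Multiplying camera * translation applies the offset in camera space first,
// then moves the result into world space.
let anchorTransform = simd_mul(cameraTransform, translation)
let position = anchorTransform.columns.3
print(position.x, position.y, position.z)
```

With an unrotated camera the anchor ends up at the camera position shifted by 0.3 m along -Z; with a rotated camera the same offset follows wherever the camera is pointing.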

We use the viewWillAppear() and viewWillDisappear() methods to start and pause the session, respectively:

override func viewWillAppear(_ animated: Bool) {
    super.viewWillAppear(animated)
    let configuration = ARWorldTrackingSessionConfiguration()
    session.run(configuration)
}

override func viewWillDisappear(_ animated: Bool) {
    super.viewWillDisappear(animated)
    session.pause()
}

All that remains is to implement the delegate methods for view updates, session errors, and interruptions:

func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
    renderer.drawRectResized(size: size)
}

func draw(in view: MTKView) {
    renderer.update()
}

func session(_ session: ARSession, didFailWithError error: Error) {}

func sessionWasInterrupted(_ session: ARSession) {}

func sessionInterruptionEnded(_ session: ARSession) {}

Open the Renderer.swift file. The first thing to notice is a very handy protocol that gives us access to the MTKView properties we need:

protocol RenderDestinationProvider {
    var currentRenderPassDescriptor: MTLRenderPassDescriptor? { get }
    var currentDrawable: CAMetalDrawable? { get }
    var colorPixelFormat: MTLPixelFormat { get set }
    var depthStencilPixelFormat: MTLPixelFormat { get set }
    var sampleCount: Int { get set }
}

Now we can extend the MTKView class (in ViewController) so that it conforms to this protocol:

extension MTKView : RenderDestinationProvider {}

Here is a high-level view of the Renderer class, in pseudocode:

init() {
    setupPipeline()
    setupAssets()
}

func update() {
    updateBufferStates()
    updateSharedUniforms()
    updateAnchors()
    updateCapturedImageTextures()
    updateImagePlane()
    drawCapturedImage()
    drawAnchorGeometry()
}

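As usual, we first set up the pipeline with the setupPipeline() function. Then, in setupAssets(), we create the model that is loaded each time we use our tap gesture. The MTKView delegate calls update() to perform the required updates and draw. Let's look at each of these in detail. First is updateBufferStates(), which updates the location in the buffers that we write to for the current frame (in this example we use a ring buffer with 3 slots):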

func updateBufferStates() {
    uniformBufferIndex = (uniformBufferIndex + 1) % maxBuffersInFlight
    sharedUniformBufferOffset = alignedSharedUniformSize * uniformBufferIndex
    anchorUniformBufferOffset = alignedInstanceUniformSize * uniformBufferIndex
    sharedUniformBufferAddress = sharedUniformBuffer.contents().advanced(by: sharedUniformBufferOffset)
    anchorUniformBufferAddress = anchorUniformBuffer.contents().advanced(by: anchorUniformBufferOffset)
}
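
The index and offset arithmetic can be tried out on its own. The sketch below uses made-up sizes; the raw struct size of 176 bytes and the 256-byte alignment are assumptions for illustration, not values taken from the project:

```swift
// Hypothetical values for illustration only.
let maxBuffersInFlight = 3
let sharedUniformSize = 176   // assumed raw size of the uniforms struct, in bytes

// Round a size up to the next multiple of `alignment` (assumed to be 256 here).
func aligned(_ size: Int, to alignment: Int = 256) -> Int {
    return (size + alignment - 1) / alignment * alignment
}

let alignedSharedUniformSize = aligned(sharedUniformSize)

// Simulate five frames and record which byte offset each one writes to.
var uniformBufferIndex = 0
var offsets: [Int] = []
for _ in 0..<5 {
    uniformBufferIndex = (uniformBufferIndex + 1) % maxBuffersInFlight
    offsets.append(alignedSharedUniformSize * uniformBufferIndex)
}
print(offsets) // [256, 512, 0, 256, 512]
```

Each frame writes to a different slot, so the CPU can fill the next slot while the GPU may still be reading the data of earlier frames.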

In updateSharedUniforms() we update the shared uniforms of the frame and set up the lighting for the scene:

func updateSharedUniforms(frame: ARFrame) {
    let uniforms = sharedUniformBufferAddress.assumingMemoryBound(to: SharedUniforms.self)
    uniforms.pointee.viewMatrix = simd_inverse(frame.camera.transform)
    uniforms.pointee.projectionMatrix = frame.camera.projectionMatrix(withViewportSize: viewportSize, orientation: .landscapeRight, zNear: 0.001, zFar: 1000)
    var ambientIntensity: Float = 1.0
    if let lightEstimate = frame.lightEstimate {
        ambientIntensity = Float(lightEstimate.ambientIntensity) / 1000.0
    }
    let ambientLightColor: vector_float3 = vector3(0.5, 0.5, 0.5)
    uniforms.pointee.ambientLightColor = ambientLightColor * ambientIntensity
    var directionalLightDirection : vector_float3 = vector3(0.0, 0.0, -1.0)
    directionalLightDirection = simd_normalize(directionalLightDirection)
    uniforms.pointee.directionalLightDirection = directionalLightDirection
    let directionalLightColor: vector_float3 = vector3(0.6, 0.6, 0.6)
    uniforms.pointee.directionalLightColor = directionalLightColor * ambientIntensity
    uniforms.pointee.materialShininess = 30
}
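
The division by 1000 turns the light estimate into a scale factor: ARKit reports ambient intensity in lumens, with a value of about 1000 representing neutral lighting. A minimal sketch of that scaling on its own, where the function name lightScale and the Double parameter are stand-ins we chose for illustration:

```swift
// Scale factor applied to the light colors; 1.0 when no estimate is available.
// `ambientIntensityLumens` stands in for lightEstimate.ambientIntensity.
func lightScale(ambientIntensityLumens: Double?) -> Float {
    guard let lumens = ambientIntensityLumens else { return 1.0 }
    return Float(lumens) / 1000.0
}

// Base ambient color from the listing above, scaled for a dim room (500 lumens).
let baseAmbient: [Float] = [0.5, 0.5, 0.5]
let dimRoom = baseAmbient.map { $0 * lightScale(ambientIntensityLumens: 500) }
print(dimRoom) // [0.25, 0.25, 0.25]
```

Because both the ambient and the directional light colors are multiplied by the same factor, the virtual content darkens and brightens together with the real scene.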

In updateAnchors() we update the anchor uniform buffer with the transforms of the current frame's anchors:

func updateAnchors(frame: ARFrame) {
    anchorInstanceCount = min(frame.anchors.count, maxAnchorInstanceCount)
    var anchorOffset: Int = 0
    if anchorInstanceCount == maxAnchorInstanceCount {
        anchorOffset = max(frame.anchors.count - maxAnchorInstanceCount, 0)
    }
    for index in 0..<anchorInstanceCount {
        let anchor = frame.anchors[index + anchorOffset]
        var coordinateSpaceTransform = matrix_identity_float4x4
        coordinateSpaceTransform.columns.2.z = -1.0
        let modelMatrix = simd_mul(anchor.transform, coordinateSpaceTransform)
        let anchorUniforms = anchorUniformBufferAddress.assumingMemoryBound(to: InstanceUniforms.self).advanced(by: index)
        anchorUniforms.pointee.modelMatrix = modelMatrix
    }
}
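
The offset logic means that once there are more anchors than instance slots, only the most recently added anchors are drawn. A small sketch of that selection using plain integers; the helper name visibleAnchorRange and the limit of 64 are hypothetical:

```swift
// Returns the range of anchor indices that would be copied into the buffer.
func visibleAnchorRange(anchorCount: Int, maxInstances: Int) -> Range<Int> {
    let instanceCount = min(anchorCount, maxInstances)
    // Skip the oldest anchors when there are more than we can draw.
    let offset = max(anchorCount - maxInstances, 0)
    return offset..<(offset + instanceCount)
}

print(visibleAnchorRange(anchorCount: 3, maxInstances: 64))  // 0..<3
print(visibleAnchorRange(anchorCount: 70, maxInstances: 64)) // 6..<70
```

With 70 anchors and 64 slots, the first 6 anchors are skipped and the newest 64 are rendered.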

In updateCapturedImageTextures() we create two textures from the captured image of the provided frame:

func updateCapturedImageTextures(frame: ARFrame) {
    let pixelBuffer = frame.capturedImage
    if (CVPixelBufferGetPlaneCount(pixelBuffer) < 2) { return }
    capturedImageTextureY = createTexture(fromPixelBuffer: pixelBuffer, pixelFormat:.r8Unorm, planeIndex:0)!
    capturedImageTextureCbCr = createTexture(fromPixelBuffer: pixelBuffer, pixelFormat:.rg8Unorm, planeIndex:1)!
}

In updateImagePlane() we update the texture coordinates of the image plane so that it keeps its aspect ratio and fills the whole view:

func updateImagePlane(frame: ARFrame) {
    let displayToCameraTransform = frame.displayTransform(withViewportSize: viewportSize, orientation: .landscapeRight).inverted()
    let vertexData = imagePlaneVertexBuffer.contents().assumingMemoryBound(to: Float.self)
    for index in 0...3 {
        let textureCoordIndex = 4 * index + 2
        let textureCoord = CGPoint(x: CGFloat(planeVertexData[textureCoordIndex]), y: CGFloat(planeVertexData[textureCoordIndex + 1]))
        let transformedCoord = textureCoord.applying(displayToCameraTransform)
        vertexData[textureCoordIndex] = Float(transformedCoord.x)
        vertexData[textureCoordIndex + 1] = Float(transformedCoord.y)
    }
}
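
The index arithmetic relies on the vertex layout: each of the 4 vertices stores a position (x, y) followed by a texture coordinate (u, v), so the texture coordinate of vertex n starts at 4 * n + 2. The sketch below reproduces the loop without CoreGraphics. The vertex values, the Affine struct, and the flip transform are assumptions standing in for planeVertexData and displayToCameraTransform:

```swift
// Assumed layout: 4 vertices, each stored as (x, y, u, v).
let planeVertexData: [Float] = [
    -1.0, -1.0,  0.0, 1.0,
     1.0, -1.0,  1.0, 1.0,
    -1.0,  1.0,  0.0, 0.0,
     1.0,  1.0,  1.0, 0.0,
]

// Hypothetical 2D affine transform with the same convention as CGAffineTransform:
// u' = a*u + c*v + tx, v' = b*u + d*v + ty.
struct Affine { var a, b, c, d, tx, ty: Float }

func apply(_ t: Affine, u: Float, v: Float) -> (Float, Float) {
    return (t.a * u + t.c * v + t.tx, t.b * u + t.d * v + t.ty)
}

// Example transform: flip the texture vertically (v' = 1 - v).
let flipV = Affine(a: 1, b: 0, c: 0, d: -1, tx: 0, ty: 1)

var vertexData = planeVertexData
for index in 0...3 {
    let textureCoordIndex = 4 * index + 2
    let (u, v) = apply(flipV,
                       u: planeVertexData[textureCoordIndex],
                       v: planeVertexData[textureCoordIndex + 1])
    vertexData[textureCoordIndex] = u       // positions at +0 and +1 stay untouched
    vertexData[textureCoordIndex + 1] = v
}
print(vertexData)
```

Only the texture coordinates change; the quad itself always covers the full clip-space range from -1 to 1.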

In drawCapturedImage() we draw the camera feed in the scene:

func drawCapturedImage(renderEncoder: MTLRenderCommandEncoder) {
    guard capturedImageTextureY != nil && capturedImageTextureCbCr != nil else { return }
    renderEncoder.pushDebugGroup("DrawCapturedImage")
    renderEncoder.setCullMode(.none)
    renderEncoder.setRenderPipelineState(capturedImagePipelineState)
    renderEncoder.setDepthStencilState(capturedImageDepthState)
    renderEncoder.setVertexBuffer(imagePlaneVertexBuffer, offset: 0, index: 0)
    renderEncoder.setFragmentTexture(capturedImageTextureY, index: 1)
    renderEncoder.setFragmentTexture(capturedImageTextureCbCr, index: 2)
    renderEncoder.drawPrimitives(type: .triangleStrip, vertexStart: 0, vertexCount: 4)
    renderEncoder.popDebugGroup()
}

Finally, in drawAnchorGeometry() we draw the anchors for the virtual content we created:

func drawAnchorGeometry(renderEncoder: MTLRenderCommandEncoder) {
    guard anchorInstanceCount > 0 else { return }
    renderEncoder.pushDebugGroup("DrawAnchors")
    renderEncoder.setCullMode(.back)
    renderEncoder.setRenderPipelineState(anchorPipelineState)
    renderEncoder.setDepthStencilState(anchorDepthState)
    renderEncoder.setVertexBuffer(anchorUniformBuffer, offset: anchorUniformBufferOffset, index: 2)
    renderEncoder.setVertexBuffer(sharedUniformBuffer, offset: sharedUniformBufferOffset, index: 3)
    renderEncoder.setFragmentBuffer(sharedUniformBuffer, offset: sharedUniformBufferOffset, index: 3)
    for bufferIndex in 0..<mesh.vertexBuffers.count {
        let vertexBuffer = mesh.vertexBuffers[bufferIndex]
        renderEncoder.setVertexBuffer(vertexBuffer.buffer, offset: vertexBuffer.offset, index:bufferIndex)
    }
    for submesh in mesh.submeshes {
        renderEncoder.drawIndexedPrimitives(type: submesh.primitiveType, indexCount: submesh.indexCount, indexType: submesh.indexType, indexBuffer: submesh.indexBuffer.buffer, indexBufferOffset: submesh.indexBuffer.offset, instanceCount: anchorInstanceCount)
    }
    renderEncoder.popDebugGroup()
}

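Let's go back to the setupPipeline() function that we mentioned briefly earlier. We create two render pipeline state objects, one for the captured image (the camera feed) and one for the anchors we create when placing virtual objects in the scene. As expected, each state object has its own pair of vertex and fragment functions, which brings us to the last file we need to look at: Shaders.metal. In the first pair of shaders, for the captured image, the vertex shader passes through the image vertex's position and texture coordinates: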

vertex ImageColorInOut capturedImageVertexTransform(ImageVertex in [[stage_in]]) {
    ImageColorInOut out;
    out.position = float4(in.position, 0.0, 1.0);
    out.texCoord = in.texCoord;
    return out;
}

In the fragment shader we sample the two textures to get the color at the given texture coordinate, and then return the converted RGB color:

fragment float4 capturedImageFragmentShader(ImageColorInOut in [[stage_in]],
                                            texture2d<float, access::sample> textureY [[ texture(1) ]],
                                            texture2d<float, access::sample> textureCbCr [[ texture(2) ]]) {
    constexpr sampler colorSampler(mip_filter::linear, mag_filter::linear, min_filter::linear);
    const float4x4 ycbcrToRGBTransform = float4x4(float4(+1.0000f, +1.0000f, +1.0000f, +0.0000f),
                                                  float4(+0.0000f, -0.3441f, +1.7720f, +0.0000f),
                                                  float4(+1.4020f, -0.7141f, +0.0000f, +0.0000f),
                                                  float4(-0.7010f, +0.5291f, -0.8860f, +1.0000f));
    float4 ycbcr = float4(textureY.sample(colorSampler, in.texCoord).r, textureCbCr.sample(colorSampler, in.texCoord).rg, 1.0);
    return ycbcrToRGBTransform * ycbcr;
}
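
The matrix in the shader is written column by column, which makes the conversion a little hard to read. Written out per channel on the CPU, the same coefficients give the sketch below; the function name ycbcrToRGB is ours, and all inputs are assumed to be normalized to the range 0 to 1:

```swift
// CPU reference for the shader's matrix multiply, expanded per channel.
// Inputs are normalized samples: y from the luma plane, cb/cr from the chroma plane.
func ycbcrToRGB(y: Float, cb: Float, cr: Float) -> (r: Float, g: Float, b: Float) {
    let r = y + 1.4020 * cr - 0.7010
    let g = y - 0.3441 * cb - 0.7141 * cr + 0.5291
    let b = y + 1.7720 * cb - 0.8860
    return (r, g, b)
}

// Neutral chroma (0.5, 0.5) leaves only the luma, so this is white and black.
let white = ycbcrToRGB(y: 1.0, cb: 0.5, cr: 0.5)
let black = ycbcrToRGB(y: 0.0, cb: 0.5, cr: 0.5)
print(white, black)
```

The constant terms in the last matrix column are what re-center the chroma channels around 0.5 before they are mixed into the RGB result.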

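In the second pair of shaders, for the anchor geometry, the vertex shader calculates the position of our vertex in clip space and outputs it for clipping and rasterization, then colors each face a different color, then calculates the position of the vertex in eye space, and finally rotates the normal into world coordinates: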

vertex ColorInOut anchorGeometryVertexTransform(Vertex in [[stage_in]],
                                                constant SharedUniforms &sharedUniforms [[ buffer(3) ]],
                                                constant InstanceUniforms *instanceUniforms [[ buffer(2) ]],
                                                ushort vid [[vertex_id]],
                                                ushort iid [[instance_id]]) {
    ColorInOut out;
    float4 position = float4(in.position, 1.0);
    float4x4 modelMatrix = instanceUniforms[iid].modelMatrix;
    float4x4 modelViewMatrix = sharedUniforms.viewMatrix * modelMatrix;
    out.position = sharedUniforms.projectionMatrix * modelViewMatrix * position;
    ushort colorID = vid / 4 % 6;
    out.color = colorID == 0 ? float4(0.0, 1.0, 0.0, 1.0)  // Right face
              : colorID == 1 ? float4(1.0, 0.0, 0.0, 1.0)  // Left face
              : colorID == 2 ? float4(0.0, 0.0, 1.0, 1.0)  // Top face
              : colorID == 3 ? float4(1.0, 0.5, 0.0, 1.0)  // Bottom face
              : colorID == 4 ? float4(1.0, 1.0, 0.0, 1.0)  // Back face
              :                float4(1.0, 1.0, 1.0, 1.0); // Front face
    out.eyePosition = half3((modelViewMatrix * position).xyz);
    float4 normal = modelMatrix * float4(in.normal.x, in.normal.y, in.normal.z, 0.0f);
    out.normal = normalize(half3(normal.xyz));
    return out;
}
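
The per-face coloring works because of how the vertex IDs map to faces. The sketch below assumes a cube mesh with 24 vertices, 4 per face, stored face by face; that layout is an assumption about the mesh, and the function name faceColorID is ours:

```swift
// Same arithmetic as the shader: every group of 4 consecutive vertices
// shares one color ID, and the IDs wrap around after 6 faces.
func faceColorID(vertexID: Int) -> Int {
    return vertexID / 4 % 6
}

let ids = (0..<24).map(faceColorID)
print(ids)
// [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5]
```

Since all 4 vertices of a face get the same color, interpolation across the face yields a flat color.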

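In the fragment shader, we calculate the contribution of the directional light as the sum of the diffuse and specular terms, then compute the final color by multiplying the sample from the color map (here, the interpolated vertex color) by the fragment's lighting value, and finally use the color we just computed together with the alpha channel of the color map as this fragment's alpha value: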

fragment float4 anchorGeometryFragmentLighting(ColorInOut in [[stage_in]],
                                               constant SharedUniforms &uniforms [[ buffer(3) ]]) {
    float3 normal = float3(in.normal);
    float3 directionalContribution = float3(0);
    {
        float nDotL = saturate(dot(normal, -uniforms.directionalLightDirection));
        float3 diffuseTerm = uniforms.directionalLightColor * nDotL;
        float3 halfwayVector = normalize(-uniforms.directionalLightDirection - float3(in.eyePosition));
        float reflectionAngle = saturate(dot(normal, halfwayVector));
        float specularIntensity = saturate(powr(reflectionAngle, uniforms.materialShininess));
        float3 specularTerm = uniforms.directionalLightColor * specularIntensity;
        directionalContribution = diffuseTerm + specularTerm;
    }
    float3 ambientContribution = uniforms.ambientLightColor;
    float3 lightContributions = ambientContribution + directionalContribution;
    float3 color = in.color.rgb * lightContributions;
    return float4(color, in.color.w);
}
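
To get a feel for the numbers, the same lighting math can be run on the CPU. This sketch assumes an Apple platform with the simd module; the function litColor and the sample inputs are ours, chosen so that the surface faces both the light and the camera head-on:

```swift
import Foundation
import simd

// Clamp to [0, 1], like saturate() in the Metal Shading Language.
func saturate(_ x: Float) -> Float { return Swift.max(0, Swift.min(1, x)) }

// CPU mirror of the fragment shader's ambient + diffuse + specular sum.
func litColor(baseColor: SIMD3<Float>, normal: SIMD3<Float>, eyePosition: SIMD3<Float>,
              lightDirection: SIMD3<Float>, lightColor: SIMD3<Float>,
              ambientColor: SIMD3<Float>, shininess: Float) -> SIMD3<Float> {
    let toLight = -lightDirection
    let nDotL = saturate(simd_dot(normal, toLight))
    let diffuse = lightColor * nDotL
    let halfway = simd_normalize(toLight - eyePosition)
    let reflectionAngle = saturate(simd_dot(normal, halfway))
    let specular = lightColor * saturate(pow(reflectionAngle, shininess))
    return baseColor * (ambientColor + diffuse + specular)
}

// Hypothetical fragment: green face, 1 m in front of the camera, facing it.
let color = litColor(baseColor: SIMD3<Float>(0, 1, 0),
                     normal: SIMD3<Float>(0, 0, 1),
                     eyePosition: SIMD3<Float>(0, 0, -1),
                     lightDirection: SIMD3<Float>(0, 0, -1),
                     lightColor: SIMD3<Float>(0.6, 0.6, 0.6),
                     ambientColor: SIMD3<Float>(0.5, 0.5, 0.5),
                     shininess: 30)
print(color)
```

In this head-on case the diffuse and specular terms both reach their maximum of 0.6, so the green channel comes out at about 1.7 before it is written to the render target.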

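If you run the app, you should be able to tap the screen to add cubes over the live camera view, and then move around or get closer to the cubes to see the different color of each face.

In the next part of the series we will look more closely at Tracking and Scene Understanding, and see how plane detection, hit-testing, collisions, and physics can make our experience even richer. The source code is posted on GitHub.

Until next time!

掘金翻譯計劃 (Juejin Translation Project) is a community that translates high-quality technical articles from the Internet, sourced from English articles shared on Juejin. It covers Android, iOS, React, front end, back end, product, design, and other fields. For more high-quality translations, follow 掘金翻譯計劃, its official Weibo account, and its Zhihu column.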
