A virtual machine (VirtualBox + Vagrant + Homestead) crash and how I debugged it

Posted by Squ1rrel on 2021-01-10

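Original environment

Windows 10 Pro

Vagrant: 2.2.12 (latest)

VirtualBox: 6.1.16 (latest)

Environment when the problem was solved:

Windows 10 LTSC 2019

Vagrant: 2.2.12 (latest)

VirtualBox: 6.1.16 (latest)

Fix: roll back the graphics driver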

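The problem

After a reboot, the virtual machine would no longer start.

The VM crashed at the kernel level, before the guest system was ever reached.

Error reported by Vagrant: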

There was an error while executing VBoxManage, a CLI used by Vagrant for controlling VirtualBox. The command and stderr is shown below.

Command: ["startvm", "966fd2e6-c0e0-4b1e-8cdd-e9c5305bed08", "--type", "headless"]

Stderr: VBoxManage.exe: error: The virtual machine 'new_default_1518243933264_77412' has terminated unexpectedly during startup with exit code 1 (0x1). More details may be available in 'C:\Users\mafei\VirtualBox VMs\new_default_1518243933264_77412\Logs\VBoxHardening.log'
VBoxManage.exe: error: Details: code E_FAIL (0x80004005), component MachineWrap, interface IMachine
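The stderr text already names the file worth reading next. As a minimal sketch (the helper name `parse_startup_failure` is hypothetical, and the sample string reproduces the message above with ASCII quotes), the VM name, exit code and hardening log path can be pulled out with a few regular expressions:

```python
import re

# The stderr message quoted in this post, with typographic quotes
# normalised to ASCII. The patterns below accept either quote style.
STDERR = (
    "VBoxManage.exe: error: The virtual machine 'new_default_1518243933264_77412' "
    "has terminated unexpectedly during startup with exit code 1 (0x1). "
    "More details may be available in "
    "'C:\\Users\\mafei\\VirtualBox VMs\\new_default_1518243933264_77412\\Logs\\VBoxHardening.log' "
    "VBoxManage.exe: error: Details: code E_FAIL (0x80004005), "
    "component MachineWrap, interface IMachine"
)

OPEN_Q = "['\u2018]"    # ASCII or left typographic quote
CLOSE_Q = "['\u2019]"   # ASCII or right typographic quote
NOT_Q = "[^'\u2018\u2019]"


def parse_startup_failure(stderr: str) -> dict:
    """Hypothetical helper: extract VM name, exit code and log path."""
    vm = re.search(rf"virtual machine {OPEN_Q}({NOT_Q}+){CLOSE_Q}", stderr)
    code = re.search(r"exit code (\d+)", stderr)
    log = re.search(rf"available in {OPEN_Q}({NOT_Q}+\.log){CLOSE_Q}", stderr)
    return {
        "vm": vm.group(1) if vm else None,
        "exit_code": int(code.group(1)) if code else None,
        "log_path": log.group(1) if log else None,
    }


info = parse_startup_failure(STDERR)
print(info["log_path"])
```

This only parses the message; it does not start the VM or read the log.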

Error reported by VirtualBox:

The virtual machine 'ubuntu' has terminated unexpectedly during startup with exit code 1 (0x1). More details may be available in 'C:\Users\Chopto\VirtualBox VMs\ubuntu\Logs\VBoxHardening.log'.

Result Code: E_FAIL (0x80004005) Component: MachineWrap Interface: IMachine {b2547866-a0a1-4391-8b86-6952d82efaa0}

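Analysis and troubleshooting

Searching Stack Overflow for the error message directly turned up nothing immediately useful.

Kernel-level errors are complex and the root cause is hard to locate, so I started by following the order in which the programs call each other.

Vagrant started VBoxManage in headless mode, and VBoxManage raised the error. My first guess was therefore that the problem lay with VirtualBox, which largely ruled out a Vagrant configuration problem (version compatibility remained a possibility).

Going through the VirtualBox log turned up several useful entries: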

NtOpenDirectoryObject failed on \Driver: 0xc0000022

Error -104 in supR3HardenedWinReSpawn! (enmWhat=5)

Error relaunching VirtualBox VM process: 5

supR3HardenedMonitor_LdrLoadDll: error opening 'C:\WINDOWS\system32\wintab32.dll': 0 (NtPath=\??\C:\WINDOWS\system32\wintab32.dll; Input=C:\WINDOWS\system32\wintab32.dll; rcNtGetDll=0x0

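This gave one important clue: VirtualBox logged an error-level message when opening wintab32.dll under system32, which suggests that a dynamic library VirtualBox needs was not being loaded properly.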
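Picking these lines out by eye is tedious in a long VBoxHardening.log. The following is a rough sketch of how the filtering could be automated. The keyword list is an assumption based only on the wording of the four lines quoted above, and the status table holds a single well-known Windows NTSTATUS value rather than a complete mapping:

```python
import re

# The four log lines quoted in this post, verbatim.
SAMPLE_LOG = r"""NtOpenDirectoryObject failed on \Driver: 0xc0000022
Error -104 in supR3HardenedWinReSpawn! (enmWhat=5)
Error relaunching VirtualBox VM process: 5
supR3HardenedMonitor_LdrLoadDll: error opening 'C:\WINDOWS\system32\wintab32.dll': 0 (NtPath=\??\C:\WINDOWS\system32\wintab32.dll; Input=C:\WINDOWS\system32\wintab32.dll; rcNtGetDll=0x0
"""

# Assumption: "error" and "failed" are enough to catch the interesting lines.
SUSPICIOUS = re.compile(r"\b(error|failed)\b", re.IGNORECASE)
DLL_NAME = re.compile(r"[A-Za-z0-9_.-]+\.dll", re.IGNORECASE)

# Illustrative only: one well-known NTSTATUS code, not an exhaustive table.
NTSTATUS = {0xC0000022: "STATUS_ACCESS_DENIED"}


def scan(text):
    """Return (line number, DLL names mentioned, line) for suspicious lines."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if SUSPICIOUS.search(line):
            dlls = sorted({name.lower() for name in DLL_NAME.findall(line)})
            hits.append((number, dlls, line.strip()))
    return hits


def decode_status(line):
    """Name the first 8-digit hex status code on the line, if it is known."""
    match = re.search(r"0x[0-9a-fA-F]{8}\b", line)
    return NTSTATUS.get(int(match.group(0), 16)) if match else None


hits = scan(SAMPLE_LOG)
for number, dlls, line in hits:
    print(number, dlls, decode_status(line))
```

To run it against a real log, the file contents can be read with `open(path, errors="replace").read()` and passed to `scan`.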

Possible causes

Registry corruption

Missing DLL

DLL pre-empted by another program

DLL corruption

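Attempts to fix it

Since registry corruption was on the list, I tried reinstalling the system, conservatively choosing Windows 10 LTSC 2019. After the reinstall everything worked again, but once I had installed a few programs and rebooted, the crash came straight back.

That largely ruled out the registry, so I turned to the DLL possibilities.

The official FAQ says that a hardening issue should not be treated as a bug; instead, think about DLL injection (This was introduced to guard against the possibility that malware running on the HOST could inject a DLL).

It then lists several possibilities: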

  • Graphics drivers (e.g. from NVidia) may inject a DLL which filters graphics function calls to the host OS, redirecting them from software to hardware accelerated versions. VirtualBox VMs use OpenGL, which is precisely the main API that most benefits from this.

  • Unofficial Windows Theme providers often use hacked versions of the Windows theme DLL (uxtheme.dll).

  • Accessibility tools designed to help partially sighted people sometimes inject themselves into standard apps in order to speak the text, force high contrast colors etc.

  • Antivirus software is invasive by its very nature, with behaviour hard to distinguish from malware.

  • Many others.

The official notes on the 0x1 error:

Error Symptom 1: Exit with error code 1

This seems to have several causes, but one of the main ones seems to be that some DLL that VirtualBox requires has failed to load. That probably indicates corruption somewhere. Sometimes you find an error related to USER32.DLL in the hardening log. This may be fixed by running “sfc /scannow” in an administrator command prompt window.

See also the “More than one thread in process” note in the previous post.

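Both documents point to drivers that get installed automatically, especially ones that affect the system!

Checking install times, I found that the GeForce Experience I had downloaded had automatically updated the graphics driver to 461.09. I rolled it back to 456.71, which solved the problem completely: the virtual machine still starts normally after further reboots.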
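A small check like the one below could flag a silent driver update before the next reboot. It is only a sketch: the two version numbers come from this post, the helper names are hypothetical, and `installed_driver_version` assumes `nvidia-smi` is on the PATH and prints one version per GPU.

```python
import subprocess

KNOWN_GOOD = "456.71"  # driver version that worked, from this post


def parse_version(text):
    """Turn '461.09' into (461, 9) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def newer_than_known_good(installed, known_good=KNOWN_GOOD):
    """True if the installed driver is newer than the last known-good one."""
    return parse_version(installed) > parse_version(known_good)


def installed_driver_version():
    """Assumption: nvidia-smi is installed and reachable on the PATH."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    # One line per GPU; the first one is enough for this check.
    return result.stdout.splitlines()[0].strip()


# Using the version from this post instead of querying the machine:
print(newer_than_known_good("461.09"))
```

A newer version is not proof of a problem; it only marks the driver as a candidate to roll back and retest.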

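My guess is that the graphics driver update affected core DLLs.

Takeaways

Patiently and carefully reading the logs always pays off.

The following links were helpful: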

stackoverflow.com/questions/393453...

stackoverflow.com/questions/522692...

stackoverflow.com/questions/605824...

forums.virtualbox.org/viewtopic.ph...

forums.virtualbox.org/viewtopic.ph...
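Note: if an older version of VirtualBox was ever installed on the machine, uninstalling it leaves driver files behind, and those old drivers have to be deleted before a newer version of VirtualBox can be installed. Following the forum's advice, I first uninstalled the software and then deleted every file whose name starts with VBox in C:\Windows\System32\drivers.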
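Before deleting anything, it helps to see which leftover files are actually there. This sketch only lists them and deletes nothing; the function name `leftover_vbox_drivers` is hypothetical, and the directory is the one named in the note above.

```python
from pathlib import Path

# Directory named in the note above.
DRIVERS_DIR = Path(r"C:\Windows\System32\drivers")


def leftover_vbox_drivers(directory=DRIVERS_DIR):
    """List file names starting with 'VBox' (case-insensitive). Read-only."""
    directory = Path(directory)
    if not directory.is_dir():
        return []  # e.g. when run on a machine without this folder
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_file() and entry.name.lower().startswith("vbox")
    )


for name in leftover_vbox_drivers():
    print(name)
```

Run it only after uninstalling VirtualBox, as the note describes; while VirtualBox is installed, the same files are its working drivers.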

This work is licensed under a CC license; reposts must credit the author and link to this article.
