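To take a screenshot of a web page from .NET 6, you can use Selenium to open the URL in a browser and capture the rendered page.
Implementation
Install the NuGet packages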
<PackageReference Include="Selenium.Chrome.WebDriver" Version="85.0.0" />
<PackageReference Include="Selenium.Support" Version="4.1.0" />
<PackageReference Include="Selenium.WebDriver" Version="4.1.0" />
With the packages installed, the following code opens the URL in headless Chrome and saves a screenshot.
// Required usings:
// using System; using System.IO; using System.Reflection;
// using OpenQA.Selenium; using OpenQA.Selenium.Chrome;
public static string PageScreenshot(string url, string uploadbasepath)
{
    ChromeDriver driver = null;
    try
    {
        ChromeOptions options = new ChromeOptions();
        options.AddArguments("headless", "disable-gpu", "no-sandbox");
        // Look for chromedriver in the directory of the executing assembly.
        driver = new ChromeDriver(Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location), options);
        driver.Navigate().GoToUrl(url);

        // Resize the window to the full page so the screenshot covers all of it.
        int width = Convert.ToInt32(driver.ExecuteScript("return document.body.scrollWidth"));
        int height = Convert.ToInt32(driver.ExecuteScript("return document.body.scrollHeight"));
        driver.Manage().Window.Size = new System.Drawing.Size(width, height);
        var screenshot = ((ITakesScreenshot)driver).GetScreenshot();

        // One folder per day; uploadbasepath is expected to end with a path separator.
        // CreateDirectory also creates missing parent directories.
        var basepath = uploadbasepath + DateTime.Now.ToString("yyyyMMdd") + "/";
        Directory.CreateDirectory(basepath);

        // SaveAsFile(path) writes PNG data, so the file gets a .png extension
        // (the original version named it .jpg).
        var path = basepath + Guid.NewGuid().ToString("N") + ".png";
        screenshot.SaveAsFile(path);
        return path;
    }
    finally
    {
        // Quit closes every window and stops the chromedriver process.
        driver?.Quit();
    }
}
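One extra step is required: copy chromedriver from bin/Release/netcoreapp3.1/chromedriver into the publish directory.
Think that is the end of it? This code does run on Windows and Linux outside a container, but things are a little different inside Docker.
Pitfalls when running in Docker
First, note that image handling on .NET Core 3.1 inside Docker requires libgdiplus to be installed.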
# Dockerfile
RUN apt-get update -y \
    && apt-get install -y --allow-unauthenticated libgdiplus \
    && apt-get clean \
    && ln -s /usr/lib/libgdiplus.so /usr/lib/gdiplus.dll
1. First pitfall
The first problem is an OpenQA.Selenium.DriverServiceNotFoundException with this message:
OpenQA.Selenium.DriverServiceNotFoundException: The file /opt/google/chrome/chrome/chromedriver does not exist. The driver can be downloaded at http://chromedriver.storage.googleapis.com/index.html
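The exception clearly means chromedriver cannot be found. As when running directly on Linux outside Docker, try copying chromedriver into the publish directory of the image by adding the following to the Dockerfile: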
# Dockerfile
RUN cp /src/xxx/Release/netcoreapp3.1/chromedriver /app/publish/
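A missing chromedriver is easier to diagnose if the application checks for the file before creating the driver. The following is a minimal sketch, not part of the original code: the class name ChromeDriverLocator and the method RequireDriverDirectory are hypothetical, and it assumes the driver file is named chromedriver (chromedriver.exe on Windows).

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class ChromeDriverLocator
{
    // Hypothetical helper: returns the directory unchanged if it contains
    // chromedriver, otherwise throws with the exact path that was checked.
    public static string RequireDriverDirectory(string directory)
    {
        // Assumption: the driver keeps its default file name.
        string fileName = RuntimeInformation.IsOSPlatform(OSPlatform.Windows)
            ? "chromedriver.exe"
            : "chromedriver";
        string fullPath = Path.Combine(directory, fileName);

        if (!File.Exists(fullPath))
        {
            throw new FileNotFoundException(
                "chromedriver was not found at " + fullPath +
                "; copy it into the publish directory.", fullPath);
        }
        return directory;
    }
}
```

It could then wrap the directory passed to the constructor, for example new ChromeDriver(ChromeDriverLocator.RequireDriverDirectory(AppContext.BaseDirectory), options).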
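2. Second pitfall
The container built this way still fails. Entering the container and running chromedriver directly shows that shared libraries (libxx.so and the like) are missing. The practical fix is to install Chrome in the image, which pulls in the libraries it depends on.
References for installing Chrome: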
https://stackoverflow.com/questions/55206172/how-to-run-dotnet-core-app-with-selenium-in-docker
https://github.com/devpabloassis/seleniumdotnetcore/blob/master/Dockerfile
Add the commands that install Chrome to the Dockerfile:
# Dockerfile
# Install Chrome
RUN apt-get update && apt-get install -y \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg \
    hicolor-icon-theme \
    libcanberra-gtk* \
    libgl1-mesa-dri \
    libgl1-mesa-glx \
    libpango1.0-0 \
    libpulse0 \
    libv4l-0 \
    fonts-symbola \
    --no-install-recommends \
    && curl -sSL https://dl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && echo "deb [arch=amd64] https://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google.list \
    && apt-get update && apt-get install -y \
    google-chrome-stable \
    --no-install-recommends \
    && apt-get purge --auto-remove -y curl \
    && rm -rf /var/lib/apt/lists/*
3. Third pitfall
Running the container with these changes produces yet another exception:
DevToolsActivePort file doesn't exist
Further searching shows that Chrome needs the extra argument disable-dev-shm-usage:
https://stackoverflow.com/questions/50642308/webdriverexception-unknown-error-devtoolsactiveport-file-doesnt-exist-while-t
Earlier runs outside Docker worked without this argument, so add an environment variable to tell the Docker and non-Docker environments apart:
# Dockerfile
ENV INDOCKER 1
// Required usings:
// using System; using System.IO; using System.Reflection;
// using OpenQA.Selenium; using OpenQA.Selenium.Chrome;
public static string PageScreenshot(string url, string uploadbasepath)
{
    ChromeDriver driver = null;
    try
    {
        // INDOCKER is set to 1 by the Dockerfile.
        var indocker = Environment.GetEnvironmentVariable("INDOCKER");
        ChromeOptions options = new ChromeOptions();
        options.AddArguments("headless", "disable-gpu", "no-sandbox");
        if (indocker == "1")
        {
            // Only needed inside the container.
            options.AddArgument("disable-dev-shm-usage");
        }
        // Look for chromedriver in the directory of the executing assembly.
        driver = new ChromeDriver(Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location), options);
        driver.Navigate().GoToUrl(url);

        // Resize the window to the full page so the screenshot covers all of it.
        int width = Convert.ToInt32(driver.ExecuteScript("return document.body.scrollWidth"));
        int height = Convert.ToInt32(driver.ExecuteScript("return document.body.scrollHeight"));
        driver.Manage().Window.Size = new System.Drawing.Size(width, height);
        var screenshot = ((ITakesScreenshot)driver).GetScreenshot();

        // One folder per day; uploadbasepath is expected to end with a path separator.
        // CreateDirectory also creates missing parent directories.
        var basepath = uploadbasepath + DateTime.Now.ToString("yyyyMMdd") + "/";
        Directory.CreateDirectory(basepath);

        // SaveAsFile(path) writes PNG data, so the file gets a .png extension
        // (the original version named it .jpg).
        var path = basepath + Guid.NewGuid().ToString("N") + ".png";
        screenshot.SaveAsFile(path);
        return path;
    }
    finally
    {
        // Quit closes every window and stops the chromedriver process.
        driver?.Quit();
    }
}
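The environment switch can also be pulled out of the screenshot method so the argument list is easy to inspect on its own. This is only a sketch: the class name ChromeArguments and its methods are hypothetical, while the INDOCKER variable and the four Chrome arguments come from the text above.

```csharp
using System;
using System.Collections.Generic;

public static class ChromeArguments
{
    // True when the Dockerfile has set INDOCKER to 1.
    public static bool IsInDocker()
    {
        return Environment.GetEnvironmentVariable("INDOCKER") == "1";
    }

    // Hypothetical helper: builds the Chrome arguments for the current
    // environment. disable-dev-shm-usage is added only inside Docker.
    public static string[] Build(bool inDocker)
    {
        var arguments = new List<string> { "headless", "disable-gpu", "no-sandbox" };
        if (inDocker)
        {
            arguments.Add("disable-dev-shm-usage");
        }
        return arguments.ToArray();
    }
}
```

The result can be passed straight to options.AddArguments(ChromeArguments.Build(ChromeArguments.IsInDocker())). If the image is based on an official .NET image, the DOTNET_RUNNING_IN_CONTAINER variable those images set may serve the same purpose as a custom INDOCKER variable; that is an alternative, not what the code above uses.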
4. Fourth pitfall
Running the container with the changes above produces another exception:
This version of ChromeDriver only supports Chrome version 99
Current browser version is 109.0.5414.74 with binary path /usr/bin/google-chrome
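Taken literally, the message says that the chromedriver copied in the first pitfall is older than the installed Chrome. Download a chromedriver that matches the installed Chrome version from the official site and put it into the image.
Download: http://chromedriver.storage.googleapis.com/index.html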
# Dockerfile
COPY ["xxx/chromedriver", "."]
RUN chmod +x chromedriver
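The mismatch is between major versions (99 versus 109 in the message above), so a small check can catch it before Selenium starts. The sketch below is an illustration only: the class ChromeVersionCheck is hypothetical, and it assumes the version strings contain a four-part number such as 109.0.5414.74, as printed by chromedriver --version and google-chrome --version.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ChromeVersionCheck
{
    // Extracts the major version from text containing a four-part
    // version number, e.g. "Google Chrome 109.0.5414.74".
    public static int MajorVersion(string versionOutput)
    {
        Match match = Regex.Match(versionOutput, @"(\d+)\.\d+\.\d+\.\d+");
        if (!match.Success)
        {
            throw new FormatException("No version number found in: " + versionOutput);
        }
        return int.Parse(match.Groups[1].Value);
    }

    // The driver and the browser are treated as compatible when their
    // major versions are equal.
    public static bool IsCompatible(string driverOutput, string browserOutput)
    {
        return MajorVersion(driverOutput) == MajorVersion(browserOutput);
    }
}
```

For example, ChromeVersionCheck.IsCompatible("ChromeDriver 99.0.4844.51", "Google Chrome 109.0.5414.74") returns false; the driver string here is an illustrative value, not taken from the original log.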
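5. Fifth pitfall
Running it again, the screenshot finally succeeds. But wait... why is the text garbled?
The Chinese text is clearly garbled, most likely because the container has no Chinese fonts, so install some. The .ttc and .ttf font files can be taken from C:\Windows\Fonts.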
# Dockerfile
RUN apt-get update
RUN apt-get install -y --no-install-recommends libgdiplus libc6-dev
RUN apt-get install -y fontconfig xfonts-utils
COPY fonts/ /usr/share/fonts/
RUN mkfontscale
RUN mkfontdir
RUN fc-cache -fv
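To confirm that the fonts were actually registered, the application can ask fontconfig which fonts cover Chinese. This is a rough sketch under stated assumptions: the class FontCheck and its methods are hypothetical, and it relies on the fc-list tool from the fontconfig package installed above being on the PATH.

```csharp
using System;
using System.Diagnostics;

public static class FontCheck
{
    // Runs "fc-list :lang=<language>" and returns its standard output.
    // Throws if fc-list is not installed.
    public static string ListFonts(string language)
    {
        var startInfo = new ProcessStartInfo("fc-list", ":lang=" + language)
        {
            RedirectStandardOutput = true,
            UseShellExecute = false,
        };
        using (Process process = Process.Start(startInfo))
        {
            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return output;
        }
    }

    // fc-list prints one line per matching font, so empty output
    // means no font for that language is registered.
    public static bool HasAnyFont(string fcListOutput)
    {
        return !string.IsNullOrWhiteSpace(fcListOutput);
    }
}
```

Calling FontCheck.HasAnyFont(FontCheck.ListFonts("zh")) at startup and logging a warning when it returns false would point to missing fonts before any garbled screenshot is produced.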
Run it once more, and it finally works.