Driving a browser with Laravel and puppeteer (using Google Translate as an example: free translation of unlimited text)

Posted by 家豬配種專家 on 2020-02-18

(This post is related to "Scraping SPA pages with Laravel and puppeteer"; the two are best read together.)

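In the post "Scraping SPA pages with Laravel and puppeteer" I gave a brief introduction to puppeteer, but puppeteer can do far more than that. Below, we use puppeteer to automatically enter text into Google Translate and return the translated result.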


Suppose we have used the method described in "Scraping SPA pages with Laravel and puppeteer" to fetch the article "SpaceX’s Crew Dragon is now in Florida to prep for its first flight with astronauts onboard" and extracted its body text:

SpaceX  has moved its Crew Dragon commercial astronaut spacecraft to Florida, the site from which it’ll launch in likels to plan. The Crew Dragon capsule is now going to undergo final testing and checkouts in Florida before its departure from Cape Canaveunch atop a Falcon 9 rocket, with NASA astronauts Bob Behnken and Doug Hurley onboard.            Behnken and Hurley will be taking a tn (ISS) courtesy of the Crew Dragon, as part of a demonstration mission codenamed “Demo-2” by SpaceX and NASA  that will serve as a key the spacecraft for regular service carrying people to and from the ISS. SpaceX’s Crew Dragon is one of two spacecraft that aim to achi alongside the Boeing Starliner CST-100 crew vehicle, which is undergoing development and testing.                                    Cight to and from the @space_station with @NASA astronauts @AstroBehnken and @Astro_Doug onboard! pic.twitter.com/nerz0Qujso                                   Boeing’s spacecraft has recently encountered some issues that could extend its testing timeline and set back itss with astronauts onboard. The Starliner encountered two potentially serious software issues during an uncrewed demonstration mission tASA and the company are determining corrective action, including safety reviews of Boeing and its software development and testing procerformed an in-flight abort test in January, the last major demonstration it needed to do before moving on to the crewed demo mission. ss, showing how the Crew Dragon would separate and distance itself from the launch craft in case of an unexpected error, in order to sa      SpaceX has been sharing details of its preparation for this final planned demo before operational commercial crew flights, tweetiraft undergoing ultrasonic testing. Currently, the Demo-2 mission is tentatively set for May 2, though that date is said to be flexibleater, depending on mission needs and remaining preparation progress.
  1. Create Translator.php

    <?php 
    namespace App\Console\Commands;
    use QL\QueryList;
    use Nesk\Puphpeteer\Puppeteer;
    trait Translator{
    ...
    public function translate(string $target_language, string $content){
    
         $puppeteer = new Puppeteer();
    
         $browser = $puppeteer->launch([
             'args' => [
                 '--no-sandbox'
             ]
         ]);
    
         $page = $browser->newPage();
    
         $page->goto(
         "https://translate.google.cn/#view=home&op=translate&sl=en&tl=$target_language",
             [
                 'timeout' => 0,
                 'read_timeout' => 0
             ]
         );
    
         // Drive the headless browser to type in the English source text
         $page->focus('textarea#source');
         $page->keyboard->type($content);
    
         // Wait for the translation result to appear
         $page->waitFor("span.tlid-translation.translation[lang=$target_language]");
         $dom = QueryList::html($page->content());
    
         // Close the browser
         $browser->close();
    
         // Extract the translation result
         $result = $dom->find("span.tlid-translation.translation[lang=$target_language]")->text();
    
         return $result;
     }
     ...
    }
  2. Call it from the earlier ShallowScrapingData.php

     use Translator;
     ...
     public function handle(){
         ...
         $this->info($this->translate('zh-CN',$raw_content));
     }
     ...

    Result:
    [Image: translation result]

Detailed explanation:

    // Drive the headless browser to type in the English source text
    $page->focus('textarea#source');
    $page->keyboard->type($content);

[Image: Google Translate]
Inspecting the Google Translate page shows that the user enters the text to be translated in textarea#source.
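
Since everything is typed into a single textarea, a very long article may exceed what the page accepts in one go. The sketch below shows one way to split the source text into sentence-aligned chunks before typing each one. It rests on assumptions: the 5,000-character default is the limit commonly shown by the web page and should be checked against the current page, and `charLength` and `splitForTranslation` are hypothetical helpers that are not part of the original trait.

```php
<?php
// Hypothetical helpers for illustration; not part of the original Translator trait.

/** Count characters (not bytes) in a UTF-8 string without requiring mbstring. */
function charLength(string $text): int
{
    return (int) preg_match_all('/./su', $text);
}

/**
 * Split text into chunks of at most $limit characters, breaking at sentence ends
 * where possible. The default of 5000 is an assumption about the page's limit.
 */
function splitForTranslation(string $content, int $limit = 5000): array
{
    if ($limit < 1) {
        throw new InvalidArgumentException('Limit must be at least 1.');
    }

    // Split after ".", "!" or "?" when followed by whitespace.
    $sentences = preg_split('/(?<=[.!?])\s+/u', trim($content), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $chunks = [];
    $current = '';
    foreach ($sentences as $sentence) {
        // Hard-split any single sentence that is itself longer than the limit.
        preg_match_all('/.{1,' . $limit . '}/su', $sentence, $matches);
        foreach ($matches[0] as $piece) {
            $candidate = $current === '' ? $piece : $current . ' ' . $piece;
            if (charLength($candidate) <= $limit) {
                $current = $candidate;   // still fits, keep accumulating
            } else {
                $chunks[] = $current;    // flush the full chunk
                $current = $piece;       // start a new one
            }
        }
    }
    if ($current !== '') {
        $chunks[] = $current;
    }

    return $chunks;
}
```

Each chunk could then be passed to translate() in turn and the results joined; whether that is fast enough in practice depends on how long each browser launch takes.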

    // Wait for the translation result to appear
    $page->waitFor("span.tlid-translation.translation[lang=$target_language]");
    $dom = QueryList::html($page->content());

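[Image: Google Translate]
The span.tlid-translation.translation[lang=zh-CN] element is rendered by JS only after Google has finished translating, which is why the code waits for that selector before reading the page content.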
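
The same selector string is used twice in the trait, once to wait for the result and once to extract it. A minimal sketch of building it in one place is shown here, so the two uses cannot drift apart; `translationSelector` is a hypothetical helper name, and the validation pattern is an assumption about what a reasonable language code looks like, not something Google documents.

```php
<?php
// Hypothetical helper for illustration; not part of the original Translator trait.

/** Build the CSS selector of the result element for a given target language. */
function translationSelector(string $targetLanguage): string
{
    // Accept only simple tags such as "ja" or "zh-CN" (assumed format), so the
    // value cannot break out of the attribute selector.
    if (!preg_match('/^[A-Za-z]{2,3}(-[A-Za-z0-9]{2,8})*\z/', $targetLanguage)) {
        throw new InvalidArgumentException("Unexpected language code: $targetLanguage");
    }

    return "span.tlid-translation.translation[lang=$targetLanguage]";
}
```

Inside translate(), the returned string would be stored in a variable and passed to both $page->waitFor(...) and $dom->find(...).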

This work is licensed under a CC license; reposts must credit the author and link to this article.
