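I recently needed to capture the danmaku (bullet comment) messages from Douyin live streams. Almost everything I found online was written in Python. That works well enough, but in the spirit of "PHP is the best language in the world" I wrote a simple script of my own for convenience. The main code follows.
First, request the live stream URL and read the ttwid from the response cookies: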
use GuzzleHttp\Client;

$client = new Client();
$response = $client->get($liveUrl, [
    'headers' => [
        'accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'User-Agent' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36',
        'cookie' => '__ac_nonce=0638733a400869171be51',
    ],
]);
// getHeader() returns one array entry per Set-Cookie line.
// This code assumes the first entry is the ttwid cookie.
$cookieString = $response->getHeader('Set-Cookie');
// Keep only the leading "ttwid=<value>" pair and drop attributes such as Path.
$cookieArray = explode(';', $cookieString[0]);
$ttwidStr = $cookieArray[0];
// Return everything after the first "=".
return substr($ttwidStr, strpos($ttwidStr, '=') + 1);
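The snippet above assumes that the first Set-Cookie line is the ttwid cookie. As a minimal sketch of a more defensive variant, the helper below searches every Set-Cookie line for a cookie by name. The function name `extractCookieValue` and the sample header lines are made up for illustration and are not part of the original script.

```php
<?php
declare(strict_types=1);

/**
 * Return the value of the named cookie from a list of Set-Cookie header
 * lines, or null when no line sets that cookie.
 *
 * @param string[] $setCookieLines e.g. the result of $response->getHeader('Set-Cookie')
 */
function extractCookieValue(array $setCookieLines, string $name): ?string
{
    foreach ($setCookieLines as $line) {
        // The leading "name=value" pair comes before attributes such as Path.
        $pair = trim(explode(';', $line, 2)[0]);
        $eq = strpos($pair, '=');
        if ($eq === false) {
            continue; // not a name=value pair, skip it
        }
        if (substr($pair, 0, $eq) === $name) {
            return substr($pair, $eq + 1);
        }
    }
    return null;
}

// Hypothetical header lines, for illustration only.
$lines = [
    'other=abc; Path=/',
    'ttwid=1%7Cexample; Path=/; Domain=douyin.com; HttpOnly',
];
var_dump(extractCookieValue($lines, 'ttwid'));
```

With Guzzle, the call would look like `extractCookieValue($response->getHeader('Set-Cookie'), 'ttwid')`, and a null result means the cookie was not issued.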
Then parse the roomId out of the HTML returned by that same request:
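$html = $response->getBody()->getContents();
// The page embeds JSON inside a string, so the quotes appear escaped: roomId\":\"<digits>\"
$pattern = '/roomId\\\\":\\\\"(\d+)\\\\"/';
// Fail loudly instead of reading an undefined index when the page layout changes.
if (preg_match($pattern, $html, $matches) !== 1) {
    throw new RuntimeException('roomId not found in page');
}
return $matches[1];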
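The escaping in that pattern is easy to misread, so here is a small self-contained sketch that runs it against a sample string. The function name `extractRoomId` and the sample fragment (including its room ID) are invented for illustration; the real page content may differ.

```php
<?php
declare(strict_types=1);

/**
 * Extract the room ID from page HTML, or return null if it is absent.
 */
function extractRoomId(string $html): ?string
{
    // In a single-quoted PHP string, '\\\\' is the two characters \\,
    // which the regex engine reads as one literal backslash.
    if (preg_match('/roomId\\\\":\\\\"(\d+)\\\\"/', $html, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Single quotes keep the backslashes, so this string literally contains \" sequences.
$sample = '{\"roomId\":\"7235000000000000000\",\"status\":2}';
var_dump(extractRoomId($sample));
var_dump(extractRoomId('<html></html>'));
```

Returning null rather than indexing `$matches[1]` directly makes a page layout change show up as an explicit "not found" case.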
Next, assemble the WebSocket URL and the request headers:
$header = [
    'cookie' => 'ttwid=' . $ttwid,
    'user-agent' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36',
];
// The room ID is inserted twice: inside internal_ext and as the room_id parameter.
$webSocketUrl = 'ws://webcast3-ws-web-lq.douyin.com/webcast/im/push/v2/?app_name=douyin_web&version_code=180800&webcast_sdk_version=1.3.0&update_version_code=1.3.0&compress=gzip&internal_ext=internal_src:dim|wss_push_room_id:' . $liveRoomId
    . '|wss_push_did:7188358506633528844|dim_log_id:20230521093022204E5B327EF20D5CDFC6|fetch_time:1684632622323|seq:1|wss_info:0-1684632622323-0-0|wrds_kvs:WebcastRoomRankMessage-1684632106402346965_WebcastRoomStatsMessage-1684632616357153318&cursor=t-1684632622323_r-1_d-1_u-1_h-1&host=https://live.douyin.com&aid=6383&live_id=1&did_rule=3&debug=false&maxCacheMessageNumber=20&endpoint=live_pc&support_wrds=1&im_path=/webcast/im/fetch/&user_unique_id=7188358506633528844&device_platform=web&cookie_enabled=true&screen_width=1440&screen_height=900&browser_language=zh&browser_platform=MacIntel&browser_name=Mozilla&browser_version=5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20Chrome/113.0.0.0%20Safari/537.36&browser_online=true&tz_name=Asia/Shanghai&identity=audience&room_id=' . $liveRoomId
    . '&heartbeatDuration=0&signature=00000000';
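A URL this long is hard to maintain as one string literal. One possible alternative, shown here only as a sketch, is to keep the parameters in an array and let `http_build_query` assemble the query string. The helper name `buildWebSocketUrl` and the placeholder room ID are made up, and only a handful of the parameters are listed. Note that `http_build_query` percent-encodes characters such as `:` and `|`, which the hand-written URL leaves as-is. I have not verified that the server accepts the encoded form or a reduced parameter set.

```php
<?php
declare(strict_types=1);

/**
 * Append a query string built from $params to $base.
 *
 * @param array<string, string> $params
 */
function buildWebSocketUrl(string $base, array $params): string
{
    // PHP_QUERY_RFC3986 encodes spaces as %20 instead of "+".
    return $base . '?' . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

$liveRoomId = '7235000000000000000'; // placeholder value for illustration

// Only a subset of the original parameters, to keep the sketch short.
$url = buildWebSocketUrl('ws://webcast3-ws-web-lq.douyin.com/webcast/im/push/v2/', [
    'app_name' => 'douyin_web',
    'version_code' => '180800',
    'compress' => 'gzip',
    'aid' => '6383',
    'live_id' => '1',
    'room_id' => $liveRoomId,
    'signature' => '00000000',
]);
echo $url, PHP_EOL;
```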
Finally, connect with Workerman's AsyncTcpConnection to receive the data:
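use Workerman\Connection\AsyncTcpConnection;

$wsClient = new AsyncTcpConnection($webSocketUrl);
// Use SSL transport so that the ws:// connection is made as wss://
$wsClient->transport = 'ssl';
// Extra headers (cookie, user-agent) sent with the WebSocket handshake
$wsClient->headers = $header;
// ParseMsg is the message parser from the GitHub repository; $conn is defined elsewhere in the script
$parseMsg = new ParseMsg($conn);
$wsClient->onMessage = [$parseMsg, 'on_message'];
$wsClient->connect();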
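The detailed parsing code and the protobuf definitions are on GitHub; have a look there if you need them.
One more important point: the danmaku messages are encoded with Google's Protocol Buffers (protobuf), so you will need some familiarity with the protobuf format.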
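To give a feel for what protobuf encoding looks like on the wire, here is a minimal sketch that decodes a base-128 varint, the building block used for field keys and integer values. It is a teaching aid only, not the parser used in this project: real code would normally rely on the google/protobuf library with classes generated from the .proto files. The function name `readVarint` and the sample bytes are chosen for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Decode one base-128 varint starting at $offset.
 *
 * @return array{0: int, 1: int} the decoded value and the offset just past it
 */
function readVarint(string $bytes, int $offset = 0): array
{
    $value = 0;
    $shift = 0;
    $length = strlen($bytes);
    while ($offset < $length) {
        $byte = ord($bytes[$offset++]);
        // The low 7 bits carry data, least significant group first.
        $value |= ($byte & 0x7F) << $shift;
        // A clear high bit marks the last byte of the varint.
        if (($byte & 0x80) === 0) {
            return [$value, $offset];
        }
        $shift += 7;
        if ($shift > 63) {
            throw new InvalidArgumentException('Varint is too long');
        }
    }
    throw new InvalidArgumentException('Truncated varint');
}

// Sample message: field 1, wire type 0 (varint), value 300.
$message = "\x08\xAC\x02";

// A field key is itself a varint: (field number << 3) | wire type.
[$key, $next] = readVarint($message);
$fieldNumber = $key >> 3;
$wireType = $key & 0x07;
[$value, $next] = readVarint($message, $next);

printf("field=%d wireType=%d value=%d\n", $fieldNumber, $wireType, $value);
```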
Here is a test endpoint: ws://47.93.122.172:4200
The message format is as follows:
{
"url":"https://live.douyin.com/619592756125"
}
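As a small sketch, the message above can be produced in PHP with `json_encode`. The function name `buildSubscribeMessage` is hypothetical. I am also assuming the endpoint expects this JSON as a text frame once the WebSocket connection is open, which the post does not spell out.

```php
<?php
declare(strict_types=1);

/**
 * Build the JSON message that names the live room to watch.
 */
function buildSubscribeMessage(string $liveUrl): string
{
    // JSON_UNESCAPED_SLASHES keeps "/" readable instead of "\/".
    return json_encode(
        ['url' => $liveUrl],
        JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR
    );
}

echo buildSubscribeMessage('https://live.douyin.com/619592756125'), PHP_EOL;
```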
This work is licensed under a CC license. Reposts must credit the author and link to this article.