Scraping NetEase Cloud Music songs with Node

Posted by 胖紙Esther on 2019-03-03

Original URL: https://flycode.co/archives/267618

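Background: my dad asked me to download several thousand songs for him to play in the car. Doing that by hand would take a long time even with batch downloads, so I decided to write a crawler to download them automatically.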

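For this small crawler project I chose Node + koa2. Initialize the project with koa2 projectName (koa-generator must be installed globally first), then enter the project directory and run npm install && npm start. The project uses superagent, cheerio, and async, plus Node's built-in fs and path modules.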

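Open the web version of NetEase Cloud Music and go to the playlist page; I chose the 華語 (Mandarin) category. Right-click and view the frame source to get the real URL, then find the HTML structure whose id is m-pl-container. This is the list of playlists to crawl. Requesting the URL directly with superagent only returns the first page of data, so async is used to crawl all the pages (with a concurrency limit of 1 in the code that follows, they are fetched one at a time).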
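The article does not show getPageUrl, so here is a minimal sketch of how the list of page URLs might be built. It assumes the pages are addressed with limit/offset query parameters; the function name buildPageUrls, the base URL, and the page size of 35 are all illustrative assumptions, not taken from the original project.

```javascript
// Hypothetical helper: build the URL of every playlist page to crawl.
// ASSUMPTION: pages are selected with limit/offset query parameters;
// the base URL and page size used below are illustrative only.
function buildPageUrls(baseUrl, category, pageCount, pageSize = 35) {
  const urls = [];
  for (let page = 0; page < pageCount; page++) {
    const url = new URL(baseUrl);
    url.searchParams.set('cat', category);
    url.searchParams.set('limit', String(pageSize));
    url.searchParams.set('offset', String(page * pageSize));
    urls.push(url.href); // non-ASCII characters are percent-encoded here
  }
  return urls;
}

const pageUrls = buildPageUrls('https://music.163.com/discover/playlist/', '華語', 3);
console.log(pageUrls);
```

Using URL and searchParams rather than string concatenation takes care of encoding the category name.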

static getPlayList(){
	// getPageUrl() returns the URL of every playlist page to crawl
	const pageUrlList = this.getPageUrl();

	return new Promise((resolve, reject) => {
		// request the pages with a concurrency limit of 1
		asy.mapLimit(pageUrlList, 1, (url, callback) => {
			this.requestPlayList(url, callback);
		}, (err, result) => {
			if(err){
				return reject(err);
			}

			resolve(result);
		})
	})
}
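To make the idea of a concurrency limit concrete, here is a small promise-based sketch of a limited map. It is a stand-in written with plain Node only, not the async library's implementation; mapWithLimit and fakeFetch are hypothetical names, and the timer simulates a network request.

```javascript
// Minimal stand-in for a concurrency-limited map (not the async library itself).
// Calls worker(item) for each item, at most `limit` at a time, and resolves
// with the results in input order.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  async function runner() {
    while (next < items.length) {
      const index = next++; // claim the next unprocessed index
      results[index] = await worker(items[index]);
    }
  }

  const runners = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}

// Fake worker standing in for an HTTP request
const fakeFetch = (url) =>
  new Promise((resolve) => setTimeout(() => resolve('page for ' + url), 10));

const demo = mapWithLimit(['a', 'b', 'c'], 2, fakeFetch);
demo.then((pages) => console.log(pages));
```

With a limit of 1 the items are processed strictly in sequence, which is how getPlayList uses mapLimit.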

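In getPlayList, asy comes from const asy = require('async'); it is named asy to keep it distinct from the async/await keywords, which are also used. requestPlayList is the request sent with superagent: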

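static requestPlayList(url, callback){
	superagent.get(url).set({
		'Connection': 'keep-alive'
	}).end((err, res) => {
		if(err){
			// log the error and return an empty list so the remaining pages are still crawled
			console.info(err);
			callback(null, []);
			return;
		}

		// parse the returned HTML and extract the playlists on this page
		const $ = cheerio.load(res.text);
		let curList = this.getCurPalyList($);
		callback(null, curList);
	})
}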

getCurPalyList extracts the information from the page; $ is passed in for the DOM operations:

static getCurPalyList($){
	let list = [];

	// every <li> under #m-pl-container is one playlist
	$('#m-pl-container li').each(function(i, elem){
		let _this = $(elem);
		list.push({
			name: _this.find('.dec a').text(),
			href: _this.find('.dec a').attr('href'),
			number: _this.find('.nb').text()
		});
	});

	return list;
}

That completes crawling the list of playlists. Next, the song lists need to be crawled:

static async getSongList(){
	const urlCollection = await playList.getPlayList();

	// flatten the per-page results into one list of absolute playlist URLs
	let urlList = [];
	for(let item of urlCollection){
		for(let subItem of item || []){   // a page that failed may have no result
			urlList.push(baseUrl + subItem.href);
		}
	}

	return new Promise((resolve, reject) => {
		// request the playlists one at a time
		asy.mapLimit(urlList, 1, (url, callback) => {
			this.requestSongList(url, callback);
		}, (err, result) => {
			if(err){
				return reject(err);
			}

			resolve(result);
		})
	})
}
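The flattening step in getSongList can also be written as a small pure function, which makes it easy to try out without any network access. This is only a sketch: toAbsoluteUrls is a hypothetical name, the sample data is made up, and the base URL is an assumption because the article does not show the value of baseUrl.

```javascript
// ASSUMPTION: each page result is an array of { name, href } objects
// (or null/undefined when that page failed), and hrefs are site-relative.
function toAbsoluteUrls(pages, baseUrl) {
  return pages
    .filter(Array.isArray)                   // skip pages that failed
    .flat()
    .filter((entry) => entry && entry.href)  // skip entries without a link
    .map((entry) => new URL(entry.href, baseUrl).href);
}

// Made-up sample data in the shape returned by getPlayList
const samplePages = [
  [{ name: 'A', href: '/playlist?id=1' }],
  null,
  [{ name: 'B', href: '/playlist?id=2' }, { name: 'C' }],
];
const playlistUrls = toAbsoluteUrls(samplePages, 'https://music.163.com');
console.log(playlistUrls);
```

Resolving each href with new URL avoids doubled or missing slashes when the base and the relative path are joined.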

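requestSongList works much like the playList version above, so it is not repeated here. Once getSongList has fetched the song lists, the songs need to be downloaded to local disk: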

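static async downloadSongList(){
	const songList = await this.getSongList();

	// build one { name, downloadUrl } entry per song
	let songUrlList = [];
	for(let item of songList){
		for(let subItem of item || []){   // a playlist that failed may have no result
			let id = subItem.url.split('=')[1];
			songUrlList.push({
				name: subItem.name,
				downloadUrl: downloadUrl + '?id=' + id + '.mp3'
			});
		}
	}

	// create the download directory if it does not exist yet
	if(!fs.existsSync(dirname)){
		fs.mkdirSync(dirname);
	}
	
	return new Promise((resolve, reject) => {
		asy.mapSeries(songUrlList, (item, callback) => {
			// wait 5 seconds before each download to throttle the requests
			setTimeout(() => {
				// start the download, then move on without waiting for it to finish
				this.requestDownload(item, callback);
				callback(null, item);
			}, 5e3);
		}, (err, result) => {
			if(err){
				return reject(err);
			}

			resolve(result);
		})
	})
}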
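Two details of the list-building step are easy to get wrong: split('=')[1] picks up any extra query parameters that follow the id, and a song name containing characters such as / cannot be used directly as a file name. The sketch below shows one possible way to handle both. The helper names are hypothetical, and the example.invalid URLs are placeholders, since the article does not give the value of downloadUrl.

```javascript
// Hypothetical helpers; the names and the sample base URL are illustrative.

// Read the id query parameter instead of splitting on '=',
// so extra parameters in the link do not end up in the id.
function getSongId(href) {
  // The base is only needed so that relative hrefs can be parsed.
  return new URL(href, 'https://example.invalid').searchParams.get('id');
}

// Replace characters that are not allowed in file names on common systems.
function safeFileName(name) {
  return name.replace(/[\\\/:*?"<>|]/g, '_').trim();
}

function toDownloadItem(song, downloadBase) {
  return {
    name: safeFileName(song.name),
    downloadUrl: downloadBase + '?id=' + getSongId(song.url) + '.mp3',
  };
}

const item = toDownloadItem(
  { name: 'AC/DC: Live?', url: '/song?id=12345&from=list' },
  'https://example.invalid/media'
);
console.log(item);
```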

requestDownload, called from downloadSongList, requests downloadUrl and saves the response to a local file:

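static requestDownload(item, callback){
	// the song is written to <dirname>/<name>.mp3
	let stream = fs.createWriteStream(path.join(dirname, item.name + '.mp3'));

	superagent.get(item.downloadUrl).set({
		'Connection': 'keep-alive'
	}).on('error', (err) => {
		console.info(err);   // request error: print it and keep going
	}).pipe(stream).on('error', (err) => {
		console.info(err);   // write error: print it and keep going
	})
}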
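The article's approach starts each download and moves on after a fixed delay. If you would rather wait until each file is fully written before starting the next one, a sketch along the following lines might work. It uses only Node's built-in modules, and in-memory streams stand in for the HTTP responses; saveStream, saveAll, and the open() property are hypothetical names, not part of the original project.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// Save a readable stream to disk and resolve only when the file is fully written.
// `source` stands in for the HTTP response stream of a download.
async function saveStream(source, filePath) {
  await pipeline(source, fs.createWriteStream(filePath));
  return filePath;
}

// Process items strictly one after another; a failure is logged and skipped.
async function saveAll(items, dir) {
  const saved = [];
  for (const item of items) {
    try {
      saved.push(await saveStream(item.open(), path.join(dir, item.name + '.mp3')));
    } catch (err) {
      console.info(err);
    }
  }
  return saved;
}

// Demo with in-memory data instead of real network requests
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'songs-'));
const demoItems = [
  { name: 'one', open: () => Readable.from([Buffer.from('abc')]) },
  { name: 'two', open: () => Readable.from([Buffer.from('defg')]) },
];
const savedFiles = saveAll(demoItems, demoDir);
savedFiles.then((files) => console.log(files));
```

pipeline rejects if either the source or the write stream fails, so a single try/catch covers both kinds of error.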

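That completes the small crawler. The project crawls the list of playlists -> the song lists -> and downloads the songs to local disk. You could also go straight to an artist's home page, change the URL passed to songList, and download that artist's popular songs directly.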
