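Description
Many thanks to teacher Nick for the guidance.
Teacher's blog: https://home.cnblogs.com/u/nickchen121/
Project links
1. Gitee (碼雲): https://gitee.com/wjup/html_to_md (because of Gitee's upload size limit, the exe there is not the latest; the latest build is on GitHub)
2. GitHub: https://github.com/a568972484/html_to_md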
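Features
- Function 1: batch-crawl all posts listed on a cnblogs blog home page, save the post dictionary as a JSON file, and convert every post to a Markdown (MD) file
- Function 2: enter the URL of a specific post to convert its content to MD and save it
- Function 3: crawl the posts under a given category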
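As an illustration of the HTML-to-MD step that all three functions share, here is a minimal sketch using only the Python standard library. It is not the code from 'core_code.py'; the names `SimpleMarkdownConverter` and `html_to_markdown` are hypothetical, and only a small subset of tags is handled.

```python
from html.parser import HTMLParser

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class SimpleMarkdownConverter(HTMLParser):
    """Convert a small subset of HTML (headings, p, a, strong/b, em/i, code, li) to Markdown."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []   # Markdown fragments, joined at the end
        self.href = None  # target of the link currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.parts.append("#" * int(tag[1]) + " ")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "code":
            self.parts.append("`")
        elif tag == "li":
            self.parts.append("- ")
        elif tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in HEADINGS or tag == "p":
            self.parts.append("\n\n")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "code":
            self.parts.append("`")
        elif tag == "li":
            self.parts.append("\n")
        elif tag == "a":
            self.parts.append("](%s)" % (self.href or ""))
            self.href = None

    def handle_data(self, data):
        self.parts.append(data)


def html_to_markdown(html):
    """Return the Markdown text for an HTML fragment."""
    converter = SimpleMarkdownConverter()
    converter.feed(html)
    converter.close()
    return "".join(converter.parts).strip()
```

For example, `html_to_markdown('<h2>Title</h2><p>Hello <strong>world</strong></p>')` returns a `## Title` heading followed by a paragraph containing `**world**`. Real post pages contain many more tags (images, tables, code blocks), which is one reason per-blog adjustments may be needed.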
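Robustness varies from blog to blog, so the program may need suitable modifications for a particular blog before it can be used.
The program deliberately uses neither multiprocessing nor multithreading, so as not to add to the load on cnblogs.
Please do not use the crawled content for commercial purposes.
The original intent is mainly to help bloggers download their already-uploaded posts to a local machine for easy editing.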
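The single-process, single-thread approach described above can be sketched roughly as follows. This is an assumption-laden illustration, not the project's code: `fetch_html`, `crawl_sequentially`, the `User-Agent` value, and the pause between requests are all hypothetical (the README only states that no parallelism is used).

```python
import time
import urllib.request


def fetch_html(url, timeout=10):
    """Download one page and return its text (assumes the page is UTF-8)."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_sequentially(urls, fetch=fetch_html, delay_seconds=1.0, sleep=time.sleep):
    """Fetch pages one at a time, pausing between consecutive requests.

    `fetch` and `sleep` are injectable so the loop can be exercised
    without touching the network.
    """
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # hypothetical pause, added to keep the load low
        pages[url] = fetch(url)
    return pages
```

Because only one request is in flight at any time, the site never sees more load than a single reader would generate; the trade-off is that crawling a blog with many posts takes proportionally longer.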
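Changelog
2019.7.20
Added a feature
Feature description: crawl the posts under a given category
Upgraded to version 5.0: added a graphical interface packaged as an exe program and improved robustness.
Just download the exe and run it.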
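Tips:
The program may be blocked by overly aggressive antivirus software; if that happens, please restore it yourself.
It contains no virus, and nothing malicious has been added.
Running Function 1 or Function 3 may make the program appear to freeze when the blog has many posts. My understanding of this is still limited and I have not found a fix, so please bear with me and do not close the program; the data will appear automatically once it finishes.
I taught myself all of this, so my understanding of some modules may be incomplete; please bear with me. If you need the extraction password for the source code archive, just send me a private message.
The core code is in 'core_code.py' and is fully commented.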
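Regarding the freezing mentioned in the tips above: one common pattern, offered here only as a possible direction and not as the project's code, is to run the long crawl in a single worker thread and let the GUI poll for the result. One worker still issues requests one at a time, so the load on cnblogs does not increase. The names `run_in_background` and `poll_result` are hypothetical, and how the polling is scheduled depends on the GUI toolkit, which this README does not name.

```python
import queue
import threading


def run_in_background(task, *args):
    """Start task(*args) in one worker thread.

    Returns a queue that will receive ("ok", result) or ("error", exception).
    """
    results = queue.Queue()

    def worker():
        try:
            results.put(("ok", task(*args)))
        except Exception as error:  # report failures instead of losing them
            results.put(("error", error))

    threading.Thread(target=worker, daemon=True).start()
    return results


def poll_result(results):
    """Non-blocking check, meant to be called from a GUI timer.

    Returns None while the task is still running.
    """
    try:
        return results.get_nowait()
    except queue.Empty:
        return None
```

The GUI would call `poll_result` periodically from its own event loop and update the window only when a non-`None` value comes back, so the interface stays responsive while the crawl runs.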
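To stress once more
This program is for learning purposes only.
Gitee name: YWY
Gitee link: https://gitee.com/pythonywy
github_id: a568972484
github_url: https://github.com/a568972484
Author's blog: 小小鹹魚ywy
Blog link: https://www.cnblogs.com/pythonywy
I hope to hear about your experience with the program so that I can keep improving it. Thank you.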
description
Function introduction
- Function 1: batch-crawl all posts listed on a cnblogs blog home page, save the post dictionary as a JSON file, and convert every post to an MD file
- Function 2: enter the URL of a specific post to convert its content to MD and save it
- Function 3: crawl the posts under a given category
Robustness varies from blog to blog, so the program may need suitable modifications for a particular blog before it can be used.
The program deliberately uses neither multiprocessing nor multithreading, so as not to add to the load on cnblogs.
Please do not use the crawled content for commercial purposes.
The original intent is to help bloggers download their uploaded posts to a local machine for easy editing.
To use the program from source, run 'run.py'.
update log
2019.7.20
Added a feature
Feature description: crawl the posts under a given category
Version 5.0: added a graphical interface packaged as an exe program and improved robustness.
Just download the exe and run it.
Tips:
The program may be blocked by overly aggressive antivirus software; if that happens, please restore it yourself.
It contains no virus, and nothing malicious has been added.
Running Function 1 or Function 3 may make the program appear to freeze when the blog has many posts. My understanding of this is still limited and I have not found a fix, so please bear with me and do not close the program; the data will appear automatically once it finishes.
I taught myself all of this, so my understanding of some modules may be incomplete; please bear with me. If you need the extraction password for the source code archive, just send me a private message.
The core code is in 'core_code.py' and is fully commented.
To stress once more
This program is for learning purposes only.
Gitee name: YWY
Gitee link: https://gitee.com/pythonywy
github_id: a568972484
github_url: https://github.com/a568972484
Author's blog: 小小鹹魚ywy
Blog link: https://www.cnblogs.com/pythonywy
I hope to hear about your experience with the program so that I can keep improving it. Thank you.