Python Full-Stack Development (武沛齊), day07: Modules

Posted by lang333333 on 2024-03-11

Original URL: https://www.cnblogs.com/fresher20240311/p/18067184

day07 Modules

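1. Review
• Module categories
- Custom modules
- Built-in modules
- Third-party modules
• Custom modules
– Splitting code into modules
crm
    utils
        - encrypt.py
        - db.py
        - message.py
    app.py
– Folders and files (extension)
.py file -> module
folder -> package
Python 2 package: must contain an __init__.py
Python 3 package: no such requirement
– Importing modules
• Where are modules looked up?
import sys

print(sys.path)
# the directory containing the script being run
# the Python installation directory
# <Python installation directory>/site-packages

Note: PyCharm automatically adds the project directory to sys.path. Do not rely on this, because it does not happen when the script runs outside PyCharm.
• How to import?
import ???
from xxxx.xxxx import ???
from xxxx.xxxx import *
from xxxx.xxxx import xxx
from xxxx.xxxx import xxx as ???
from xxxx.xxxx import xxx,xx,x
– Main file + main function
Without a guard, func() runs even when the file is only imported:
def func():
    pass

func()
With the guard, func() runs only when the file is executed directly:
def func():
    pass

if __name__ == "__main__":
    func()
When a .py file is executed directly, __name__ is "__main__".
When a .py file is imported, __name__ is the module's name, for example utils.xxx.info
• Built-in modules
– random
– hashlib
– json
– time/datetime
– os
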
2. Built-in modules
2.1 os
• Joining paths
import os

path = os.path.join("xxx", "xxxx", "xxxxx", "xxxx.txt")
• Parent directory
import os

path = os.path.dirname(".....")
• Absolute path
import os

os.path.abspath(".....")
os.path.abspath(__file__)
Directory containing the current file:
import os

base_dir = os.path.dirname(os.path.abspath(__file__))
• Check whether a path exists
import os

v1 = os.path.exists("path")
print(v1)  # True/False
Example: append to a file only if its folder exists
import os

user = input("Username: ")

file_path = os.path.join("files", "db.txt")

# Does the folder exist?
if os.path.exists(os.path.dirname(file_path)):
    with open(file_path, mode='a', encoding='utf-8') as f:
        f.write(f"{user}\n")
else:
    print("Path does not exist")
Example: read a file only if it exists
import os

file_path = os.path.join("files", 'db.txt')

if os.path.exists(file_path):
    with open(file_path, mode='r', encoding='utf-8') as f:
        data = f.read()
        print(data)
else:
    print("File does not exist")
• Create a folder (when it does not exist yet)
import os

os.makedirs("xxx/xxxx/xxxx")
Example: create the folder before writing the file
import os

user = input("Username: ")

file_path = os.path.join("files", "db.txt")

folder_path = os.path.dirname(file_path)
if not os.path.exists(folder_path):
    os.makedirs(folder_path)

with open(file_path, mode='a', encoding='utf-8') as f:
    f.write(f"{user}\n")
Example: download images into a folder named after today's date
import os.path

import requests  # pip install requests
from datetime import datetime

url_list = [
    "https://www3.autoimg.cn/newsdfs/g30/M05/3C/5E/400x300_0_autohomecar__ChxknGRTwLSAP9MuAAA6hd6ZZDY038.jpg",
    "https://www3.autoimg.cn/cubetopic/g27/M04/43/F9/400x300_0_autohomecar__ChxkmWRTWkaACfQbAADCg_L8aQM773.jpg"
]

for url in url_list:
    res = requests.get(url=url)
    # Write the image to a local file
    date_string = datetime.now().strftime("%Y-%m-%d")
    name = url.split("__")[-1]

    if not os.path.exists(date_string):
        os.makedirs(date_string)

    file_path = os.path.join(date_string, name)
    with open(file_path, mode='wb') as f:
        f.write(res.content)

• Check whether a path is a file or a folder
os.path.isdir(...)
• Delete files and folders
import os

os.remove("file path")  # delete a file
import shutil

shutil.rmtree("folder path")  # delete a folder and everything in it
• List the files and folders in a directory [one level only]
import os

name_list = os.listdir("path")
Example: list a folder and the contents of its direct subfolders
import os.path

folder_path = os.path.join("2023-05-05")

name_list = os.listdir(folder_path)
for name in name_list:
    inner_path = os.path.join(folder_path, name)
    if os.path.isdir(inner_path):
        print(os.listdir(inner_path))
    else:
        print(inner_path)
• List the files and folders in a directory [all levels]
import os

os.walk("path")
Example: print every file under a folder
import os

folder_path = os.path.join("2023-05-05")
for base_dir, folder_list, file_list in os.walk(folder_path):
    for name in file_list:
        file_path = os.path.join(base_dir, name)
        print(file_path)

Example: print only the .mp4 files
import os

folder_path = os.path.join(r"E:\EvVideo\其他影片")
for base_dir, folder_list, file_list in os.walk(folder_path):
    for name in file_list:
        if not name.endswith(".mp4"):
            continue
        file_path = os.path.join(base_dir, name)
        print(file_path)
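
The os.walk loop above can be wrapped in a reusable function. The following is a minimal sketch, not part of the original notes: find_files and build_demo_tree are hypothetical helper names, and the directory tree is created in a temporary folder purely for demonstration.

```python
import os
import tempfile


def find_files(root_dir, suffix):
    """Return the sorted paths under root_dir whose file name ends with suffix."""
    matches = []
    for base_dir, folder_list, file_list in os.walk(root_dir):
        for name in file_list:
            if name.endswith(suffix):
                matches.append(os.path.join(base_dir, name))
    return sorted(matches)


def build_demo_tree(root_dir):
    """Create a small, made-up directory tree for the demo."""
    os.makedirs(os.path.join(root_dir, "a", "b"))
    for rel in ("1.mp4", os.path.join("a", "2.txt"), os.path.join("a", "b", "3.mp4")):
        with open(os.path.join(root_dir, rel), mode="w", encoding="utf-8") as f:
            f.write("demo")


with tempfile.TemporaryDirectory() as demo_root:
    build_demo_tree(demo_root)
    # Show paths relative to the demo root so the output is stable.
    found = [os.path.relpath(p, demo_root) for p in find_files(demo_root, ".mp4")]
    print(found)
```

Returning a list instead of printing inside the loop lets the caller decide what to do with the paths (count them, copy them, delete them).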

Case: user registration

Input: username + password (MD5-hashed)
Write to a file, organized by date and time:
2023-11-11
    12-09.txt
    12-10.txt
2023-11-12
    12-09.txt
    12-10.txt
import hashlib
import os.path
from datetime import datetime

MD5_SALT = "asdfasdkfojlkjao9urpqoiwj;lkafjsdf"


def md5(data_str):
    obj = hashlib.md5(MD5_SALT.encode('utf-8'))
    obj.update(data_str.encode('utf-8'))
    return obj.hexdigest()


def gen_file_path():
    ctime = datetime.now()
    date_string = ctime.strftime("%Y-%m-%d")
    time_string = ctime.strftime("%H-%M")

    # 3.1 Create the date folder if needed
    if not os.path.exists(date_string):
        os.makedirs(date_string)
    # 3.2 Build the file path
    file_path = os.path.join(date_string, f"{time_string}.txt")
    return file_path


def run():
    # 1. Read the username and password
    user = input("Username: ")
    pwd = input("Password: ")
    md5_pwd = md5(pwd)

    # 2. Build the line
    line = f"{user},{md5_pwd}\n"

    # 3. Build the path and write
    file_path = gen_file_path()
    with open(file_path, mode='a', encoding='utf-8') as f:
        f.write(line)


if __name__ == '__main__':
    run()
Because the file name is the current hour and minute, a new file is created every minute.
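
The registration case stores lines of the form "user,md5(password)". As a hedged sketch of how such lines could later be checked at login, the snippet below reuses the same salted md5 function; check_login, the demo salt, and the in-memory list of stored lines are assumptions made for illustration and do not appear in the original notes.

```python
import hashlib

MD5_SALT = "demo-salt"  # hypothetical salt, only for this sketch


def md5(data_str):
    """Salted MD5, same structure as the registration example."""
    obj = hashlib.md5(MD5_SALT.encode("utf-8"))
    obj.update(data_str.encode("utf-8"))
    return obj.hexdigest()


def check_login(user, pwd, lines):
    """Return True if "user,md5(pwd)" appears among the stored lines."""
    expected = f"{user},{md5(pwd)}"
    return any(line.strip() == expected for line in lines)


# Stand-in for lines read from the registration files.
stored = [f"alice,{md5('123')}\n", f"bob,{md5('456')}\n"]
print(check_login("alice", "123", stored))
print(check_login("alice", "456", stored))
```

The password itself is never compared; only the hash of the typed password is compared with the stored hash.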

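2.2 sys
import sys

# 1. Where Python looks when importing a module
sys.path

# 2. sys.argv: the arguments passed in when the script is run
sys.argv

What is this useful for?
• For people with an IT background: it works like the arguments of a shell script.
• For everyone else: suppose you write a program that downloads a video.

Option 1: read the URL with input
url = input("Enter the download URL: ")
Option 2: read the URL from argv
import sys
url = sys.argv[1]
print("Downloading:", url)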
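
sys.argv[0] is the script itself and the user's arguments start at index 1, so sys.argv[1] raises IndexError when no argument is given. The sketch below guards against that; get_url and the script name download.py are hypothetical names chosen for illustration.

```python
import sys


def get_url(argv):
    """Return the URL from argv[1], or None when no argument was passed."""
    if len(argv) < 2:
        return None
    return argv[1]


url = get_url(sys.argv)
if url is None:
    print("usage: python download.py <url>")
else:
    print("Downloading:", url)
```

Under these assumptions the script would be started as `python download.py https://example.com/video.mp4`.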

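2.3 Packaging code
pip3.9 install pyinstaller

2.3.1 Multi-file packaging
pyinstaller -D client.py

2.3.2 Single-file packaging (preferred)
pyinstaller -F client.py
If your program works with files stored relative to the project directory, single-file packaging can introduce a bug.

When client.exe runs, the code bundled inside it is first unpacked into a temporary directory on the computer, so os.path.abspath(__file__) points there:

os.path.abspath(__file__)

C:\Users\ADMINI~1\AppData\Local\Temp\_MEI77322\client.py

• Absolute path built from __file__ [wrong]
import os


def run():
    base_dir = os.path.dirname(os.path.abspath(__file__))
    db_file_path = os.path.join(base_dir, "db.txt")
    with open(db_file_path, mode='r', encoding='utf-8') as f:
        print(f.read())

    user = input("Username: ")
    pwd = input("Password: ")

    line = f"{user},{pwd}"
    print(line)

    input("Press Enter to continue")


if __name__ == '__main__':
    run()

If you double-click the exe, the window flashes and closes.
Drag the exe into a cmd window and run it there to see the error.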

• Absolute path built from sys.argv [correct]
import os
import sys


def run():
    base_dir = os.path.dirname(sys.argv[0])
    db_file_path = os.path.join(base_dir, "db.txt")
    with open(db_file_path, mode='r', encoding='utf-8') as f:
        print(f.read())

    user = input("Username: ")
    pwd = input("Password: ")

    line = f"{user},{pwd}"
    print(line)

    input("Press Enter to continue")


if __name__ == '__main__':
    run()

• Relative path [correct, as long as the program is started from the directory that contains db.txt]
def run():
    with open("db.txt", mode='r', encoding='utf-8') as f:
        print(f.read())

    user = input("Username: ")
    pwd = input("Password: ")

    line = f"{user},{pwd}"
    print(line)

    input("Press Enter to continue")


if __name__ == '__main__':
    run()

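Summary
If your project needs to read files or folders, prefer absolute paths built from sys.argv[0]. The same code then works in both stages:
• Development: running the source code
• After packaging: running the exe
import os
import sys

BASE_DIR = os.path.dirname(os.path.realpath(sys.argv[0]))


def run():
    with open(os.path.join(BASE_DIR, 'db.txt'), mode='r', encoding='utf-8') as f:
        print(f.read())

    user = input("Username: ")
    pwd = input("Password: ")

    line = f"{user},{pwd}"
    print(line)

    input("Press Enter to continue")


if __name__ == '__main__':
    run()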
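
A variation on the BASE_DIR approach, offered as a sketch rather than as the course's method: PyInstaller sets the attribute sys.frozen in a packaged program, and sys.executable is then the path of the exe, so the base directory can be chosen explicitly for each stage. The function name get_base_dir is a hypothetical helper.

```python
import os
import sys


def get_base_dir():
    """Directory of the exe when packaged by PyInstaller, else of the script."""
    if getattr(sys, "frozen", False):
        # Packaged: sys.executable is the exe that the user started.
        return os.path.dirname(os.path.realpath(sys.executable))
    # Development: sys.argv[0] is the script that was run.
    return os.path.dirname(os.path.realpath(sys.argv[0]))


BASE_DIR = get_base_dir()
DB_FILE_PATH = os.path.join(BASE_DIR, "db.txt")
print(DB_FILE_PATH)
```

Either way, db.txt is expected next to the exe (or next to the script), never in the temporary unpack directory.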

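2.3 configparser
• A txt file [configuration file handling]
xxx=123
xxxx=123
• An .ini file [commonly used for project configuration; this is the format configparser is designed for] *
[server]
v1=123
v2=456

[client]
v9=111
• A .json file [commonly used for project configuration]
{
    "count": 123,
    "info": [11, 22, 33]
}
• An XML file [commonly used for project configuration]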
123
123
• A Python module, settings.py [edits made after packaging do not take effect]
COUNT = 123
INFO = 999

2.3.1 Basic operations
import configparser

# 1. Open and read the file
obj = configparser.ConfigParser()
obj.read("my.ini", encoding='utf-8')

# 2. Read the section names
v1 = obj.sections()
print(v1)  # ['server', 'client']

# 3. Key-value pairs of one section
v2 = obj.items("server")
for k, v in v2:
    print(k, v)

# 4. Value of a single key
v3 = obj.get("server", 'v3')
print(v3)  # 中國聯通

# 5. Check whether a section exists
v3 = obj.has_section("server")
print(v3)  # True

# 6. Add a section and options, then write back to the file
obj.add_section("group")
obj.set("group", 'name', "武沛齊")
obj.set("group", 'age', "19")
with open("my.ini", mode='w', encoding='utf-8') as f:
    obj.write(f)

# 7. Remove a section
obj.remove_section("group")
with open("my.ini", mode='w', encoding='utf-8') as f:
    obj.write(f)

# 8. Remove an option
obj.remove_option("server", "v2")
with open("my.ini", mode='w', encoding='utf-8') as f:
    obj.write(f)

# 9. Modify a value
obj.set("server", "v1", "999")
with open("my.ini", mode='w', encoding='utf-8') as f:
    obj.write(f)
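
configparser returns every value as a string. The sketch below shows the typed getter getint and its fallback argument; to stay self-contained it parses an in-memory string with read_string instead of my.ini, and the keys interval, notify and timeout as well as the example.com URL are placeholders invented for the demo.

```python
import configparser

INI_TEXT = """
[server]
interval = 30
notify = https://example.com/hook

[client]
v9 = 111
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

# getint converts the stored string to int.
interval = config.getint("server", "interval")
# fallback is returned when the option is missing.
timeout = config.getint("server", "timeout", fallback=10)
# items() yields (key, value) pairs; dict() collects them.
server_dict = dict(config.items("server"))
print(interval, timeout, server_dict)
```

Using getint avoids scattering int(...) conversions through the program, and fallback keeps an older my.ini usable after new options are introduced.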

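2.3.2 Application case

1. Price monitoring platform
A crawler keeps requesting a web page and checks the price. As soon as the price drops below my target, it sends a notification.
Settings needed while monitoring:
• Interval: 30
• WeCom (企業微信): group + group bot + webhook URL for sending

import configparser
import os
import sys

BASE_DIR = os.path.dirname(os.path.realpath(sys.argv[0]))
SETTINGS_PATH = os.path.join(BASE_DIR, 'my.ini')


def run():
    # 1. Load the configuration file
    # 1.1 Locate the file
    if not os.path.exists(SETTINGS_PATH):
        print("Configuration file does not exist")
        input("")
        return
    # 1.2 Read it
    config_dict = {}
    config = configparser.ConfigParser()
    config.read(SETTINGS_PATH, encoding='utf-8')

    for k, v in config.items("server"):
        config_dict[k] = v

    # 2. Act on the configuration
    print(config_dict)
    input("Feature implementation...")
    input("Press Enter to continue")


if __name__ == '__main__':
    run()
After packaging, my.ini must be placed in dist next to client.exe. Otherwise the program reports that the configuration file does not exist and exits.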

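import configparser
import os
import sys
import time
import requests

BASE_DIR = os.path.dirname(os.path.realpath(sys.argv[0]))
SETTINGS_PATH = os.path.join(BASE_DIR, 'my.ini')


def fetch_new_info():
    # Placeholder: return True once the price is below the target
    return True


def run():
    # 1. Load the configuration file
    # 1.1 Locate the file
    if not os.path.exists(SETTINGS_PATH):
        print("Configuration file does not exist")
        input("")
        return
    # 1.2 Read it
    config_dict = {}
    config = configparser.ConfigParser()
    config.read(SETTINGS_PATH, encoding='utf-8')

    for k, v in config.items("server"):
        config_dict[k] = v

    # 2. Act on the configuration
    print(config_dict)
    interval = int(config_dict["interval"])
    while True:
        status = fetch_new_info()
        if status:
            break

        time.sleep(interval)

    # 3. Notification: [WeCom] + [group] + [bot]
    # https://www.zhihu.com/question/395840381/answer/2278274881
    notify_url = config_dict['notify']

    input("Feature implementation...")
    input("Press Enter to continue")


if __name__ == '__main__':
    run()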

The same program split into functions:
import configparser
import os
import sys
import time
import requests

BASE_DIR = os.path.dirname(os.path.realpath(sys.argv[0]))
SETTINGS_PATH = os.path.join(BASE_DIR, 'my.ini')

CONFIG_DICT = None


def fetch_new_info():
    # Placeholder: return True once the price is below the target
    return True


def process():
    print(CONFIG_DICT)
    interval = int(CONFIG_DICT["interval"])
    while True:
        status = fetch_new_info()
        if status:
            break

        time.sleep(interval)


def notify():
    # https://www.zhihu.com/question/395840381/answer/2278274881
    notify_url = CONFIG_DICT['notify']


def load_config():
    # 1.1 Locate the file
    if not os.path.exists(SETTINGS_PATH):
        return False, "Configuration file does not exist"
    # 1.2 Read it
    config_dict = {}
    config = configparser.ConfigParser()
    config.read(SETTINGS_PATH, encoding='utf-8')

    for k, v in config.items("server"):
        config_dict[k] = v
    return True, config_dict


def run():
    # 1. Load the configuration file
    status, config_dict = load_config()
    if not status:
        print(config_dict)
        input("")
        return

    global CONFIG_DICT
    CONFIG_DICT = config_dict

    # 2. Act on the configuration
    process()

    # 3. Notification: [WeCom] + [group] + [bot]
    notify()

    input("Feature implementation...")
    input("Press Enter to continue")


if __name__ == '__main__':
    run()

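2.4 XML
A special format for representing and storing data.

<info>
    <name>武沛齊</name>
    <age>18</age>
</info>

The same data in JSON:
{
    "name": "武沛齊",
    "age": 18
}

Parsing XML with ElementTree:
from xml.etree import ElementTree

text = """
<data>
    <country>
        <rank>2</rank>
        <year>2023</year>
        <gdppc>141100</gdppc>
    </country>
    <country>
        <rank>69</rank>
        <year>2026</year>
        <gdppc>13600</gdppc>
    </country>
</data>
"""

root = ElementTree.XML(text)

# 1. Find the first matching child
node = root.find("country")
print(node)
print(node.tag)
print(node.attrib)

# 2. Find all matching children
node_list = root.findall("country")
for node in node_list:
    print(node.tag, node.attrib)

# 3. Keep searching downward
node_list = root.findall("country")
for node in node_list:
    # print(node.tag, node.attrib)
    # res = list(node)  # node.getchildren() was removed in Python 3.9
    # print(res)
    # rank = node.find("rank")
    # print(rank.text, rank.attrib)

    for child in node:
        print(child.tag, child.text, child.attrib)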
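
The find/findall/iteration calls above are often combined to turn each element into a dictionary. The following is a small sketch with made-up XML: the tag names, the name and updated attributes, and the values are assumptions chosen for the demo, not data from the course.

```python
from xml.etree import ElementTree

XML_TEXT = """
<data>
    <country name="A">
        <rank updated="yes">2</rank>
        <year>2023</year>
    </country>
    <country name="B">
        <rank updated="yes">69</rank>
        <year>2026</year>
    </country>
</data>
"""

root = ElementTree.XML(XML_TEXT)

rows = []
for country in root.findall("country"):
    # attrib is a dict of the element's attributes.
    row = {"name": country.attrib["name"]}
    # Iterating over an element yields its direct children.
    for child in country:
        row[child.tag] = child.text
    rows.append(row)
print(rows)
```

Note that text is always a string, so numeric fields such as rank still need int(...) before any arithmetic.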

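Case: Tencent API
from xml.etree import ElementTree

text = """
<xml>
    <ToUserName>武沛齊</ToUserName>
    <FromUserName>root</FromUserName>
    <CreateTime>1348831860</CreateTime>
    <MsgType>text</MsgType>
    <Content>吃飯了</Content>
    <MsgId>1234567890123456</MsgId>
    <MsgDataId>xxxx</MsgDataId>
    <Idx>11</Idx>
</xml>
"""

data_dict = {}
root = ElementTree.XML(text)
for node in root:
    data_dict[node.tag] = node.text
print(data_dict)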

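2.5 The re module
• Regular expressions are used to extract or validate data.
• re is a built-in Python module for working with regular expressions.

2.5.1 Regular expressions
text = "樓主太牛逼了，線上想要 s42662578@qq.com 和 xxxxx@live.com謝謝樓主，手機號也可15131255789，搞起來呀"

Requirement: extract the phone number: \d{11}
Requirement: extract the email addresses: \w+@\w+\.\w+
Extracting phone numbers:
text = "樓主太牛逼了，在15131255781線想要 s42662578@qq.com 和 xxxxx@live.com謝謝樓主，手機號也可15131255789，搞起來呀"

import re

ret = re.findall(r"\d{11}", text)
print(ret)
Extracting email addresses (re.ASCII keeps \w from matching the Chinese characters that follow):
text = "樓主太牛逼了，在15131255781線想要 s42662578@qq.com 和 xxxxx@live.com謝謝樓主，手機號也可15131255789，搞起來呀"

import re

ret = re.findall(r"\w+@\w+\.\w+", text, re.ASCII)
print(ret)

Validating input:
email = input("Email: ")

import re

ret = re.match(r"^\w+@\w+\.\w+$", email)
if ret:
    print("Valid format")
else:
    print("Invalid format")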

1. Characters
• Fixed text
import re

text = "你好wupeiqi,阿斯頓發wupeiqasd 阿士大夫能接受的wupeiqiff"

data_list = re.findall(r"wupeiqi", text)

print(data_list)  # ["wupeiqi", "wupeiqi"]
• One character from a set
import re

text = "你好wupeiqi,阿斯頓發wupeiqasd 阿士大夫能接受的wupeiqbiff"

# matches wupeiqa, wupeiqb or wupeiqi
data_list = re.findall(r"wupeiq[abi]", text)

print(data_list)  # ["wupeiqi", "wupeiqa", "wupeiqb"]
• Ranges: [a-z] [0-9]
import re

text = "你好twupeiqi,阿斯頓發wupetiqasd 阿士大夫能接受的wutpeiqbff"

data_list = re.findall(r"t[a-z]", text)

print(data_list)  # ['tw', 'ti', 'tp']
import re

text = "你好twupeiqi,阿斯頓發wupet2iqasd 阿士大夫能接受的wut1peiqbff"

data_list = re.findall(r"t[0-9]", text)

print(data_list)  # ['t2', 't1']
import re

text = "你好twupeiqi,阿斯頓發wupet2iqasd 阿士大夫能接受的wut1peiqbff"

data_list = re.findall(r"t[0-9][a-z]", text)

print(data_list)  # ['t2i', 't1p']
• \d: a digit
import re

text = "你好t11wupeiqi,阿斯頓發wupet22iqasd 阿士大夫能接受的wut8peiqbff"

data_list = re.findall(r"t\d", text)

print(data_list)  # ['t1', 't2', 't8']
import re

text = "你好t11wupeiqi,阿斯頓發wupet22iqasd 阿士大夫能接受的wut8peiqbff"

data_list = re.findall(r"t\d\d", text)

print(data_list)  # ['t11', 't22']
• \w: a letter, digit or underscore (and also Chinese characters, unless re.ASCII is set)
text = "你t好t11wupeiqi,阿斯頓發wupet21232123iqasd 阿s頓大夫能接受的wut8peiqbff"

import re

result = re.findall(r"阿\w頓", text)

print(result)  # ['阿斯頓', '阿s頓']
text = "你t好t11wupeiqi,阿斯頓發wupet21232123iqasd 阿s頓大夫能接受的wut8peiqbff"

import re

result = re.findall(r"阿\w頓", text, re.ASCII)

print(result)  # ['阿s頓']
• . : any character except a newline
text = "你t好t11wupeiqi,阿斯頓發wupet21232123iqasd 阿s頓大夫能接受的阿\n頓ut8peiqbff"

import re

result = re.findall(r"阿.頓", text)

print(result)  # ['阿斯頓', '阿s頓']

2. Quantifiers
• {n}: exactly n times
• {n,}: n or more times
• {n,m}: between n and m times
• *: zero or more times
import re

text = "你好t1wupeiqi,阿斯頓發wupetwdiqasd 阿士大夫能接受的wut18weiqbff"

data_list = re.findall(r"t\d*w", text)

print(data_list)  # ['t1w', 'tw', 't18w']
• +: one or more times
import re

text = "你好t1wupeiqi,阿斯頓發wupetwdiqasd 阿士大夫能接受的wut18weiqbff"

data_list = re.findall(r"t\d+w", text)

print(data_list)  # ['t1w', 't18w']
• ?: zero or one time
import re

text = "你好t1wupeiqi,阿斯頓發wupetwdiqasd 阿士大夫能接受的wut18weiqbff"

data_list = re.findall(r"t\d?w", text)

print(data_list)  # ['t1w', 'tw']

3. Groups
• After the regex has matched, parentheses extract parts of the match.
import re

text = "樓主太牛逼了，線上想要 442662578@qq.com和xxxxx@live.com謝謝樓主，手機號也可15131255799，搞起15131255989來呀"

result = re.findall(r"(151(312\d{5}))", text)
print(result)  # [('15131255799', '31255799'), ('15131255989', '31255989')]

import re

text = "我的身份證130449197912038879,郭智的身份之是13044919991203887X阿斯頓發士大夫"

res = re.findall(r'\d{17}[\dX]', text)
print(res)
import re

text = "我的身份證130449197912038879,郭智的身份之是13044919991203887X阿斯頓發士大夫"

res = re.findall(r'\d{6}\d{4}\d{2}\d{2}\d{3}[\dX]', text)
print(res)
import re

text = "我的身份證130449197912038879,郭智的身份之是13044919991203887X阿斯頓發士大夫"

res = re.findall(r'(\d{6}(\d{4})(\d{2})(\d{2})\d{3}[\dX])', text)
print(res)  # [('130449197912038879', '1979', '12', '03'), ('13044919991203887X', '1999', '12', '03')]
• | means "or"
import re

text = "樓主15131root太牛15131alex逼了，線上想要 442662578@qq.com和xxxxx@live.com謝謝樓主，手機號也可15131255789，搞起來呀"

data_list = re.findall(r"15131(2\d{5}|r\w+太)", text)

print(data_list)  # ['root太', '255789']

4. Start and end anchors
import re

text = "130449197912038879"

res = re.findall(r'^\d{17}[\dX]$', text)
print(res)  # ['130449197912038879']
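
Groups and anchors are often combined to validate a value and pull out its parts in one step. The sketch below uses named groups, written (?P<name>...), and re.fullmatch, which behaves like wrapping the pattern in ^ and $. The function name parse_birthday is a hypothetical helper, and the pattern checks only the shape of the number, not whether the date is a real calendar date.

```python
import re

# 6-digit region code, birth year/month/day, 3-digit sequence, check character.
ID_PATTERN = re.compile(
    r"\d{6}(?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})\d{3}[\dX]"
)


def parse_birthday(id_number):
    """Return (year, month, day) strings if the whole input matches, else None."""
    match_object = ID_PATTERN.fullmatch(id_number)
    if not match_object:
        return None
    return (
        match_object.group("year"),
        match_object.group("month"),
        match_object.group("day"),
    )


print(parse_birthday("13044919991203887X"))
print(parse_birthday("abc13044919991203887X"))
```

Named groups keep the calling code readable when a pattern has several groups, since group("year") does not depend on counting parentheses.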

Cases
• ID card numbers
import re

text = "dsf130429191912015219k13042919591219521Xkk"
data_list = re.findall(r"\d{17}[\dX]", text)
print(data_list)
• Phone numbers
import re

text = "我的手機哈是15133377892，你的手機號是1171123啊？"
data_list = re.findall(r"1[3-9]\d{9}", text)
print(data_list)
• Email addresses
import re

text = "樓主太牛逼了，線上想要 442662578@qq.com和xxxxx@live.com謝謝樓主，手機號也可15131255789，搞起來呀"
email_list = re.findall(r"\w+@\w+\.\w+", text, re.ASCII)
print(email_list)
import re

text = "樓主太牛逼了，線上想要 442662578@qq.com和xxxxx@live.com謝謝樓主，手機號也可15131255789，搞起來呀"

email_list = re.findall(r"[a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+", text)
print(email_list)

2.5.2 Functions of the re module
• re.findall: searches the whole text and returns every substring that matches the pattern.
import re

text = "我的身份證130449197912038879,郭智的身份之是13044919991203887X阿斯頓發士大夫"

res = re.findall(r'\d{17}[\dX]', text)
print(res)  # ['130449197912038879', '13044919991203887X']
• re.search: scans the whole text and returns the first match (a match object, or None).
import re

text = "我的身份證130449197912038879,郭智的身份之是13044919991203887X阿斯頓發士大夫"

match_object = re.search(r'\d{17}[\dX]', text)
print(match_object.group())  # 130449197912038879
• re.match: matches only at the start of the text and returns the match object, or None on failure.
import re

text = "130449197912038879,郭智的身份之是13044919991203887X阿斯頓發士大夫"

match_object = re.match(r'\d{17}[\dX]', text)
print(match_object)
if match_object:
    print(match_object.group())  # 130449197912038879
else:
    print("Failed")
• re.split
The plain string method, for comparison:
text = "武沛齊,123"

data_list = text.split(",")
print(data_list)  # ["武沛齊", "123"]
import re

text = "武沛齊,123"

data = re.split(r",", text)
print(data)  # ['武沛齊', '123']
import re

text = "武沛齊,123-999"

data = re.split(r"[,-]", text)
print(data)  # ['武沛齊', '123', '999']

Case: snippets

import re

price = "￥5499"

ret = re.findall(r"￥(\d+)", price)
print(ret)  # ['5499']

text = "已有2人評價"
ret = re.findall(r"已有(\d+)人評價", text)
print(ret)  # ['2']
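
re.findall returns a list of strings, so indexing [0] fails when nothing matches. A hedged alternative is to use re.search and convert the captured group only when a match exists; extract_int is a hypothetical helper name, and the third sample string is invented to show the no-match case.

```python
import re


def extract_int(pattern, text):
    """Return the first captured group as an int, or None when nothing matches."""
    match_object = re.search(pattern, text)
    if not match_object:
        return None
    return int(match_object.group(1))


print(extract_int(r"￥(\d+)", "￥5499"))
print(extract_int(r"已有(\d+)人評價", "已有2人評價"))
print(extract_int(r"已有(\d+)人評價", "暫無評價"))
```

This keeps a scraper running when one item on a page is missing its price or review count.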

3. Third-party modules
Skilled developers open-source their modules so that others can use them.

• Users: on the left
• Open-source authors: https://www.bilibili.com/video/BV17541187de/

Common commands:
pip install requests
pip uninstall requests
pip list
pip freeze > requirements.txt
pip install -r requirements.txt

Configure the pip index:
pip3.9 config set global.index-url https://pypi.douban.com/simple/
pip install ???

3.1 The requests module
Send network requests from code.
pip install requests

3.1.1 Analyze the request + implement + JSON

Request URL: https://www.zhihu.com/api/v4/comment_v5/articles/545093058/root_comment?order_by=score&limit=20&offset=

Request Method: GET

import json
import requests

res = requests.get(
    url="https://www.zhihu.com/api/v4/comment_v5/articles/545093058/root_comment?order_by=score&limit=20&offset="
)

data_dict = json.loads(res.text)
for row in data_dict['data']:
    content = row['content']
    name = row['author']['name']

    print(name, content)

3.1.2 Analyze the request + implement + JSON (with a User-Agent header)
import requests
import json

res = requests.get(
    url="https://movie.douban.com/j/search_subjects?type=movie&tag=豆瓣高分&sort=recommend&page_limit=20&page_start=20",
    headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36"
    }
)

data_dict = json.loads(res.text)
for row in data_dict['subjects']:
    print(row['title'], row['url'])

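3.1.3 Response data: JSONP format
• JSON format, the easiest to handle

res = reque...

data_dict = json.loads(res.text)
data_dict = res.json()
• JSONP format
name({"rate":"9.0","cover_x":1500,"title":"讓子彈飛"})
– Option 1: strip the wrapper
name({"rate":"9.0","cover_x":1500,"title":"讓子彈飛"})

{"rate":"9.0","cover_x":1500,"title":"讓子彈飛"}
– Option 2: eval, which compiles and runs a string as Python code. Only use it on content you trust.
def demo(arg):
    print(arg)

eval('demo({"rate":"9.0","cover_x":1500,"title":"讓子彈飛"})')

Option 1 applied to a real request:
import json

import requests

res = requests.get(
    url="http://num.10010.com/NumApp/NumberCenter/qryNum?callback=jsonp_queryMoreNums&provinceCode=11&cityCode=110&advancePayLower=0&sortType=1&goodsNet=4&searchCategory=3&qryType=02&channel=B2C&numNet=186&groupKey=53271060&judgeType=1"
)

content = res.text

# String handling. removeprefix/removesuffix (Python 3.9+) remove exact text,
# whereas str.strip treats its argument as a set of characters.
result = content.strip().removeprefix("jsonp_queryMoreNums(").removesuffix(")")
# print(result)

data_dict = json.loads(result)
print(data_dict)
Option 2 applied to the same request:
import json

import requests


def jsonp_queryMoreNums(data_dict):
    print(data_dict)


res = requests.get(
    url="http://num.10010.com/NumApp/NumberCenter/qryNum?callback=jsonp_queryMoreNums&provinceCode=11&cityCode=110&advancePayLower=0&sortType=1&goodsNet=4&searchCategory=3&qryType=02&channel=B2C&numNet=186&groupKey=53271060&judgeType=1"
)

content = res.text

# The response text is a call to jsonp_queryMoreNums(...), so eval runs the function above
eval(content)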
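
Because eval executes whatever the server sends back, a regular expression is a safer way to unwrap JSONP. This is a minimal sketch under the assumption that the response looks like callback(...), optionally followed by a semicolon; parse_jsonp is a hypothetical helper, and the sample string stands in for res.text so the example runs without a network request.

```python
import json
import re


def parse_jsonp(content):
    """Extract and decode the JSON payload from a callback(...) wrapper."""
    match_object = re.fullmatch(
        r"\s*[\w.$]+\s*\((.*)\)\s*;?\s*", content, re.DOTALL
    )
    if not match_object:
        raise ValueError("not a JSONP response")
    # group(1) is everything between the outermost parentheses.
    return json.loads(match_object.group(1))


sample = 'jsonp_queryMoreNums({"rate":"9.0","cover_x":1500,"title":"讓子彈飛"});'
data_dict = parse_jsonp(sample)
print(data_dict["title"], data_dict["cover_x"])
```

The function does not need to know the callback name in advance, so the same helper works when the callback parameter in the URL changes.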

3.1.4 HTML format
Everything you see on a website is, at its core, wrapped in HTML tags.

China Unicom

Qinghai

Guangxi Guangdong

pip install BeautifulSoup4

import requests
from bs4 import BeautifulSoup

res = requests.get(
    url="https://www.autohome.com.cn/news/"
)
res.encoding = 'gb2312'
# print(res.text)

# 1. Hand the text to BeautifulSoup
soup = BeautifulSoup(res.text, features="html.parser")

# 2. The soup object represents the whole document; find a tag by its features
tag = soup.find(name="div", attrs={"id": "auto-channel-lazyload-article"})

# 3. Keep searching downward
li_list = tag.find_all(name="li")
for node in li_list:
    h3_tag = node.find(name="h3")
    if not h3_tag:
        continue

    p_tag = node.find(name="p")
    img_tag = node.find(name='img')

    print(h3_tag.text)
    print(p_tag.text)
    print(img_tag.attrs['src'])

    print('------------------')

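Example: print each editor's name and avatar URL
import requests
from bs4 import BeautifulSoup

res = requests.get(
    url="https://www.autohome.com.cn/news/"
)
res.encoding = 'gb2312'
# print(res.text)

# 1. Hand the text to BeautifulSoup
soup = BeautifulSoup(res.text, features="html.parser")

# 2. The soup object represents the whole document; find a tag by its features
tag = soup.find(name="ul", attrs={"id": "tagInfo"})

# 3. Each element
node_list = tag.find_all(name="li")
for li_node in node_list:
    name = li_node.find(name="div", attrs={"class": "editorname"}).text
    src = li_node.find(name='img').attrs['src']
    src_url = f"https:{src}"
    print(name, src_url)

Example: download each avatar
import requests
from bs4 import BeautifulSoup

res = requests.get(
    url="https://www.autohome.com.cn/news/"
)
res.encoding = 'gb2312'
# print(res.text)

# 1. Hand the text to BeautifulSoup
soup = BeautifulSoup(res.text, features="html.parser")

# 2. The soup object represents the whole document; find a tag by its features
tag = soup.find(name="ul", attrs={"id": "tagInfo"})

# 3. Each element
node_list = tag.find_all(name="li")
for li_node in node_list:
    name = li_node.find(name="div", attrs={"class": "editorname"}).text
    src = li_node.find(name='img').attrs['src']
    src_url = f"https:{src}"
    print(name, src_url)

    # Send a request to the URL and save the image content
    response = requests.get(url=src_url)
    with open(f"{name}.jpg", mode='wb') as f:
        f.write(response.content)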

Case: China Unicom online store
import re

import requests
from bs4 import BeautifulSoup

# 1. Send the request and get the text
res = requests.get(
    url="http://s.10010.com/bj/mobile/"
)
# print(res.text)

# 2. Parse the data with bs4
soup = BeautifulSoup(res.text, features="html.parser")

# 3. Find the container by its features
tag = soup.find(name='div', attrs={'id': "goodsList"})

# 4. Find each product
li_list = tag.find_all(name='li', attrs={"class": 'goodsLi'})
for li_node in li_list:
    title = li_node.find(name='p', attrs={"class": "mobileGoodsName"}).text.strip()
    price = li_node.find(name='p', attrs={"class": "evaluation"}).text.strip()
    comment = li_node.find(name='p', attrs={"class": "evalNum"}).text.strip()

    price_num = re.findall(r"￥(\d+)", price)[0]
    comment_num = re.findall(r"已有(\d+)人評價", comment)[0]

    print(title)
    print(price, price_num)
    print(comment, comment_num)
    print("-" * 30)

Case: historical data of the 雙色球 (Double Color Ball) lottery
Page 1: HTML format
import requests
from bs4 import BeautifulSoup

res = requests.get(
    url="https://m.78500.cn/kaijiang/ssq/",
    headers={
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36"
    }
)

soup = BeautifulSoup(res.text, features="html.parser")

parent_area = soup.find(name="article", attrs={'id': "list"})

section_list = parent_area.find_all(name='section', attrs={"class": "item"})
for section in section_list:
    title = section.find(name="strong").text
    code = section.find(name="p").text

    print(title, code)

Page 2 onward: JSON format
import requests
from bs4 import BeautifulSoup

# Page 1: same request as above; keep its cookies for the later requests
res = requests.get(
    url="https://m.78500.cn/kaijiang/ssq/",
    headers={
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36"
    }
)

cookie_dict = res.cookies.get_dict()
# print(cookie_dict)

soup = BeautifulSoup(res.text, features="html.parser")

parent_area = soup.find(name="article", attrs={'id': "list"})

section_list = parent_area.find_all(name='section', attrs={"class": "item"})
for section in section_list:
    title = section.find(name="strong").text
    code = section.find(name="p").text
    print(title, code)

# Page 2 onward (the request differs from the one for the web page)
for i in range(1, 11):
    res = requests.get(
        url=f"https://m.78500.cn/kaijiang/ssq/?years=list&page={i}",
        headers={
            "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36",
            "X-Requested-With": "XMLHttpRequest",
            "Accept": "application/json"
        },
        cookies=cookie_dict
    )
    data_dict = res.json()
    for row in data_dict['list']:
        # print(row['qishu'], "".join(row['result']))
        line = f"{row['qishu']}期 {''.join(row['result'])}"
        print(line)
