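Python Basics for Beginners, Part 11: Case Study 8, Air Quality Index (AQI) Calculation #JSON and CSV formats #list sorting #file handling with the with statement #os module #web crawling #requests module #parsing web pages with BeautifulSoup #Pandas

Posted by 夏普通 on 2018-11-30

Original URL: https://blog.csdn.net/qq_34243930/article/details/84645046

(All course materials and code have been uploaded to CSDN; please download them yourself:
https://download.csdn.net/download/qq_34243930/10764180 )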

AQI Calculation 1.0

The v_1.0 code is as follows:

"""
    Author: xpt
    Function: AQI calculation
    Version: 1.0
    Date: 01/12/2018
"""


def cal_linear(iaqi_lo, iaqi_hi, bp_lo, bp_hi, c_p):
    """
        Linear interpolation between two breakpoints
    """
    iaqi = (iaqi_hi - iaqi_lo) * (c_p - bp_lo) / (bp_hi - bp_lo) + iaqi_lo
    return iaqi


def cal_pm_iaqi(pm_val):
    """
        Compute the IAQI for PM2.5
    """
    # Rules: each range ends at its own upper breakpoint, so non-integer
    # inputs such as 35.5 are interpolated in the correct segment
    if 0 <= pm_val <= 35:
        iaqi = cal_linear(0, 50, 0, 35, pm_val)
    elif 35 < pm_val <= 75:
        iaqi = cal_linear(50, 100, 35, 75, pm_val)
    elif 75 < pm_val <= 115:
        iaqi = cal_linear(100, 150, 75, 115, pm_val)
    elif 115 < pm_val <= 150:
        iaqi = cal_linear(150, 200, 115, 150, pm_val)
    elif 150 < pm_val <= 250:
        iaqi = cal_linear(200, 300, 150, 250, pm_val)
    elif 250 < pm_val <= 350:
        iaqi = cal_linear(300, 400, 250, 350, pm_val)
    else:
        iaqi = cal_linear(400, 500, 350, 500, pm_val)

    return iaqi


def cal_co_iaqi(co_val):
    """
        Compute the IAQI for CO
    """
    # Rules: same pattern as for PM2.5
    if 0 <= co_val <= 2:
        iaqi = cal_linear(0, 50, 0, 2, co_val)
    elif 2 < co_val <= 4:
        iaqi = cal_linear(50, 100, 2, 4, co_val)
    elif 4 < co_val <= 14:
        iaqi = cal_linear(100, 150, 4, 14, co_val)
    elif 14 < co_val <= 24:
        iaqi = cal_linear(150, 200, 14, 24, co_val)
    elif 24 < co_val <= 36:
        iaqi = cal_linear(200, 300, 24, 36, co_val)
    elif 36 < co_val <= 48:
        iaqi = cal_linear(300, 400, 36, 48, co_val)
    else:
        iaqi = cal_linear(400, 500, 48, 60, co_val)

    return iaqi


def cal_aqi(param_list):
    """
        Compute the AQI
    """
    pm_val = param_list[0]
    co_val = param_list[1]

    pm_iaqi = cal_pm_iaqi(pm_val)
    co_iaqi = cal_co_iaqi(co_val)
    # The AQI is the largest of the individual IAQI values
    aqi = max(pm_iaqi, co_iaqi)
    return aqi


def main():
    """
        Main function
    """
    print('Enter the following values, separated by a space')
    input_str = input('(1)PM2.5 (2)CO:')
    str_list = input_str.split()
    pm_val = float(str_list[0])
    co_val = float(str_list[1])

    param_list = []
    param_list.append(pm_val)
    param_list.append(co_val)

    # Call the AQI calculation function
    aqi_val = cal_aqi(param_list)

    print('The air quality index is:', aqi_val)


if __name__ == '__main__':
    main()
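
The piecewise-linear rule in the v_1.0 code can also be expressed as a lookup table. The following is only a sketch, assuming the same PM2.5 breakpoints as the listing above; the names PM25_BREAKPOINTS and linear_iaqi are invented for this illustration and are not part of the course code, and capping values above the last breakpoint is an assumption.

```python
# Hypothetical table-driven variant of the v_1.0 PM2.5 rules.
# Each row is (bp_lo, bp_hi, iaqi_lo, iaqi_hi), copied from the branches above.
PM25_BREAKPOINTS = [
    (0, 35, 0, 50),
    (35, 75, 50, 100),
    (75, 115, 100, 150),
    (115, 150, 150, 200),
    (150, 250, 200, 300),
    (250, 350, 300, 400),
    (350, 500, 400, 500),
]


def linear_iaqi(value, breakpoints):
    """Interpolate linearly inside the segment that contains value."""
    if value < 0:
        raise ValueError('concentration must be non-negative')
    for bp_lo, bp_hi, iaqi_lo, iaqi_hi in breakpoints:
        if value <= bp_hi:
            return (iaqi_hi - iaqi_lo) * (value - bp_lo) / (bp_hi - bp_lo) + iaqi_lo
    # Above the last breakpoint: cap at the top of the scale (an assumption).
    return breakpoints[-1][3]


print(linear_iaqi(35, PM25_BREAKPOINTS))   # 50.0
print(linear_iaqi(55, PM25_BREAKPOINTS))   # 75.0
```

With a table, adding another pollutant only means adding another list of rows rather than another chain of if/elif branches.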

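• Working with JSON data files

AQI Calculation 2.0: read a previously obtained JSON data file and write the top 5 AQI records to a file

The two JSON files used here are part of the course download:
https://download.csdn.net/download/qq_34243930/10764180

The JSON format and the json library

File handling in three steps:

1. Open the file
2. Process the file
3. Close the file
Notes:
1. For files that contain Chinese text, pay attention to the file encoding; double-click the file to open it and check the encoding shown in the bottom-right corner.
2. When writing Chinese text to a file, pass ensure_ascii=False to json.dump(), so that the characters are written as they are instead of as \uXXXX escapes.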
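
To make the effect of ensure_ascii visible, here is a small sketch using json.dumps(), the string-returning counterpart of json.dump(). The sample records are made up; the indent argument is shown as one possible way to get readable output directly from Python.

```python
import json

# Made-up sample records; the real files come from the course download.
cities = [{'city': '上海', 'aqi': 60}, {'city': '北京', 'aqi': 95}]

escaped = json.dumps(cities)                       # non-ASCII becomes \uXXXX escapes
readable = json.dumps(cities, ensure_ascii=False)  # characters are kept as they are
pretty = json.dumps(cities, ensure_ascii=False, indent=4)  # one field per line

print(escaped)
print(readable)
print(pretty)

# Both forms decode back to the same Python objects.
assert json.loads(escaped) == json.loads(readable) == cities
```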

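Writing the top 5 AQI records to a file requires sorting a list:

List sorting

sort() accepts a key function that determines what the elements are sorted by.
lambda functions: see the lambda section of https://blog.csdn.net/qq_34243930/article/details/83748085#t13
Here the list is sorted by city['aqi'].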
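
A short sketch of sorting a list of dictionaries by one field. The records are made up but have the same shape as the ones the v_2.0 code expects; sorted() and reverse=True are shown alongside sort() for comparison.

```python
# Made-up sample data in the same shape as the course JSON records.
city_list = [
    {'city': 'A', 'aqi': 80},
    {'city': 'B', 'aqi': 35},
    {'city': 'C', 'aqi': 120},
]

# In-place ascending sort by the 'aqi' field.
city_list.sort(key=lambda city: city['aqi'])
print([city['city'] for city in city_list])    # ['B', 'A', 'C']

# sorted() returns a new list; reverse=True gives descending order.
worst_first = sorted(city_list, key=lambda city: city['aqi'], reverse=True)
print([city['city'] for city in worst_first])  # ['C', 'A', 'B']
```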

The v_2.0 code is as follows:

"""
    Author: xpt
    Function: AQI calculation
    Version: 2.0
    Date: 01/12/2018
    2.0--Read a previously obtained JSON data file and write the top 5 AQI records to a file
"""
import json


def process_json_file(filepath):
    """
        Decode the JSON file
    """
    f = open(filepath, mode='r', encoding='utf-8')
    city_list = json.load(f)
    f.close()
    return city_list


def main():
    """
        Main function
    """
    filepath = input('Enter the JSON file name: ')
    city_list = process_json_file(filepath)
    # Ascending sort, so the first five entries have the lowest AQI
    city_list.sort(key=lambda city: city['aqi'])
    top5_list = city_list[:5]
    # Write the top 5 AQI cities to a new JSON file
    f = open('top5_aqi.json', mode='w', encoding='utf-8')
    json.dump(top5_list, f, ensure_ascii=False)
    f.close()


if __name__ == '__main__':
    main()

If the output is not formatted the way you want, you can reformat it with an online tool:
http://tool.oschina.net/codeformat/json

Then copy the result back into the file and save it.

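• Another common data format: CSV

AQI Calculation 3.0: read a previously obtained JSON data file and convert it to a CSV file

The CSV format

Values are separated by ASCII commas.
It is essentially a table without grid lines.

Reviewing dictionary usage:

1. Get the column names with keys(). Note that the result must be converted to a list.

2. Get the data values with values().

3. Write the rows to a file.

Note: without newline='', an extra blank line is written after every row (this shows up on Windows).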
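
The three steps above can be sketched as follows. The records and the file name aqi_demo.csv are made up, and the file is written to a temporary directory so the example does not touch the course files.

```python
import csv
import os
import tempfile

# Made-up records standing in for the decoded JSON list.
city_list = [
    {'city': 'A', 'aqi': 80},
    {'city': 'B', 'aqi': 35},
]

# First row: column names; remaining rows: values in the same order.
lines = [list(city_list[0].keys())]
for city in city_list:
    lines.append(list(city.values()))

path = os.path.join(tempfile.mkdtemp(), 'aqi_demo.csv')
# newline='' lets the csv module control the line endings itself.
with open(path, mode='w', encoding='utf-8', newline='') as f:
    csv.writer(f).writerows(lines)

# Read the raw text back to see exactly what was written.
with open(path, mode='r', encoding='utf-8', newline='') as f:
    content = f.read()
print(repr(content))
```

The csv writer ends each row with \r\n by default. If the file is opened without newline='', Windows text mode translates the \n again, which is where the extra blank lines come from.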

The v_3.0 code is as follows:

"""
    Author: xpt
    Function: AQI calculation
    Version: 3.0
    Date: 01/12/2018
    2.0--Read a previously obtained JSON data file and write the top 5 AQI records to a file
    3.0--Read a previously obtained JSON data file and convert it to a CSV file
"""
import json
import csv


def process_json_file(filepath):
    """
        Decode the JSON file
    """
    f = open(filepath, mode='r', encoding='utf-8')
    city_list = json.load(f)
    f.close()
    return city_list


def main():
    """
        Main function
    """
    filepath = input('Enter the JSON file name: ')
    city_list = process_json_file(filepath)
    city_list.sort(key=lambda city: city['aqi'])

    # Collect every row
    lines = []
    # Column names (keys)
    lines.append(list(city_list[0].keys()))
    # Values
    for city in city_list:
        lines.append(list(city.values()))
    # File operations
    f = open('aqi.csv', mode='w', encoding='utf-8', newline='')
    writer = csv.writer(f)
    for line in lines:
        writer.writerow(line)
    f.close()


if __name__ == '__main__':
    main()

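• Decide from the file extension whether a file is JSON or CSV, and process it accordingly

AQI Calculation 4.0: determine from the input file whether it is in JSON or CSV format, and process it accordingly

More on file handling: using the with statement with file objects

A with block closes the file automatically when the block exits, even if an exception is raised, so no explicit close() call is needed.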
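
A minimal comparison of the manual open/close pattern and the with statement. The file name with_demo.txt is made up and lives in a temporary directory.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')

# Manual version: close() must be called explicitly, ideally in finally.
f = open(path, mode='w', encoding='utf-8')
try:
    f.write('first line\n')
finally:
    f.close()

# with version: the file is closed when the block exits, even on an error.
with open(path, mode='a', encoding='utf-8') as f:
    f.write('second line\n')

print(f.closed)   # True
with open(path, encoding='utf-8') as f:
    print(f.read())
```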

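csv.reader() returns each record as a list

To change the output style, join the fields of each row with ','.join(row) so that the row prints as a comma-separated string.
For details, see the section on lists and list operations in https://blog.csdn.net/qq_34243930/article/details/83748085#t13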
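
A small sketch of what csv.reader() yields, using an in-memory file with made-up data instead of a file on disk.

```python
import csv
import io

# A small in-memory CSV standing in for a real file (made-up data).
sample = io.StringIO('city,aqi\r\nA,80\r\nB,35\r\n', newline='')

rows = list(csv.reader(sample))
print(rows)            # [['city', 'aqi'], ['A', '80'], ['B', '35']]

for row in rows:
    # Every field is read as a string; join them back with commas.
    print(','.join(row))
```

Note that the numbers come back as strings, so they need to be converted with int() or float() before any calculation.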

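The os module

os.path.splitext() splits a file path into its root and its extension, which is how the v_4.0 code tells the two formats apart.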
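
A brief sketch of extension-based dispatch with os.path.splitext(). The helper detect_format and the sample paths are hypothetical; lower-casing the extension is an addition that the course code does not make.

```python
import os


def detect_format(filepath):
    """Return 'json', 'csv' or None based on the file extension (hypothetical helper)."""
    _, ext = os.path.splitext(filepath)
    # lower() so that 'DATA.JSON' is also recognised (not in the course code).
    ext = ext.lower()
    if ext == '.json':
        return 'json'
    if ext == '.csv':
        return 'csv'
    return None


print(os.path.splitext('data/beijing_aqi.json'))  # ('data/beijing_aqi', '.json')
print(detect_format('aqi.CSV'))                   # csv
print(detect_format('notes.txt'))                 # None
```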

The v_4.0 code is as follows:

"""
    Author: xpt
    Function: AQI calculation
    Version: 4.0
    Date: 01/12/2018
    2.0--Read a previously obtained JSON data file and write the top 5 AQI records to a file
    3.0--Read a previously obtained JSON data file and convert it to a CSV file
    4.0--Determine from the input file whether it is JSON or CSV, and process it accordingly
"""
import json
import csv
import os


def process_json_file(filepath):
    """
        Decode the JSON file
    """
    with open(filepath, mode='r', encoding='utf-8') as f:
        city_list = json.load(f)
    print(city_list)


def process_csv_file(filepath):
    """
        Decode the CSV file
    """
    with open(filepath, mode='r', encoding='utf-8', newline='') as f:
        reader = csv.reader(f) # each row is read as a list
        for row in reader:
            print(','.join(row))


def main():
    """
        Main function
    """
    filepath = input('Enter the file name (JSON or CSV): ')
    filename, file_ext = os.path.splitext(filepath)

    if file_ext == '.json':
        process_json_file(filepath)
    elif file_ext == '.csv':
        process_csv_file(filepath)
    else:
        print('Unsupported file format')


if __name__ == '__main__':
    main()

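• Introduction to web crawling
• Fetching a city's AQI in real time

AQI Calculation 5.0: use a web crawler to fetch a city's air quality in real time

The site is: http://pm25.in/

Web crawlers

The requests module

More methods are described at: http://docs.python-requests.org/

Step 1: fetch the page content via its URL

1. Building the URL
Home page: http://pm25.in/
City page, for example Shanghai: http://pm25.in/shanghai
So the user enters the city's name in pinyin, which is appended to the base URL.
2. Requesting the page with requests
requests.get(url, timeout=30) gives up if the server has not responded within 30 seconds.
3. r.text holds the body of the HTTP response as a string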

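Step 2: process the fetched page content

1. Inspect the element in the browser

2. Open the page source and copy the markup from there

The copied string must match the page source character for character, including whitespace, because str.find() only finds exact matches.

Note: multi-line strings are written with triple quotes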
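
The string-search approach can be tried offline on a small fragment. The HTML below is made up to imitate the structure the v_5.0 code looks for; reading up to the next tag instead of a fixed number of characters is a variation, not the course's exact method.

```python
# Made-up fragment imitating the structure the v_5.0 code searches for.
html = '''<div class="span12 data">
  <div class="span1">
    <div class="value">
      128
    </div>
  </div>
</div>'''

marker = '<div class="value">'
index = html.find(marker)
if index == -1:
    aqi_value = None          # marker not present in the page
else:
    begin_index = index + len(marker)
    # Read up to the next tag instead of assuming a fixed width.
    end_index = html.find('<', begin_index)
    aqi_value = html[begin_index:end_index].strip()

print(aqi_value)   # 128
```

str.find() returns -1 when the marker is missing, so checking for that case avoids silently slicing from the wrong position.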

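The v_5.0 code is as follows:

"""
    Author: xpt
    Function: AQI calculation
    Version: 5.0
    Date: 02/12/2018
    2.0--Read a previously obtained JSON data file and write the top 5 AQI records to a file
    3.0--Read a previously obtained JSON data file and convert it to a CSV file
    4.0--Determine from the input file whether it is JSON or CSV, and process it accordingly
    5.0--Use a web crawler to fetch a city's air quality in real time
"""
import requests


def get_html_text(url):
    """
        Return the text of the page at url
    """
    r = requests.get(url, timeout=30)
    # print(r.status_code)
    return r.text


def main():
    """
        Main function
    """
    city_pinyin = input('Enter the city name in pinyin: ')
    url = 'http://pm25.in/' + city_pinyin
    url_text = get_html_text(url)

    aqi_div = ''' <div class="span12 data">
        <div class="span1">
          <div class="value">
            '''
    index = url_text.find(aqi_div)
    if index == -1:
        print('AQI value not found in the page')
        return
    begin_index = index + len(aqi_div)
    # Read up to the next tag rather than a fixed 2 characters,
    # so that 3-digit values are not truncated
    end_index = url_text.find('<', begin_index)
    aqi_value = url_text[begin_index:end_index].strip()
    print('The air quality index is:', aqi_value)


if __name__ == '__main__':
    main()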

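The v_5.0 code is still fairly complicated and not very efficient.

• Is there a more efficient library for processing and parsing HTML?
• beautifulsoup4

AQI Calculation 6.0: parse and process HTML efficiently with beautifulsoup4

Web page parsing

Parsing web pages with BeautifulSoup

Step 1: create a BeautifulSoup object

Step 2: find nodes

1. find() returns a single node
2. .text returns the content of that node
3. The content includes surrounding whitespace; strip() removes it from both ends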
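
The two steps can be tried offline on a small fragment. This is a sketch under assumptions: the HTML is made up (only the class names follow the v_6.0 code), and the built-in 'html.parser' is used here instead of 'lxml' so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Made-up fragment; the class names follow the v_6.0 code.
html = '''
<div class="span1">
  <div class="value"> 57 </div>
  <div class="caption"> AQI </div>
</div>
<div class="span1">
  <div class="value"> 31 </div>
  <div class="caption"> PM2.5/1h </div>
</div>
'''

# Step 1: create the BeautifulSoup object.
soup = BeautifulSoup(html, 'html.parser')

# Step 2: find the nodes, read their text, and strip the whitespace.
pairs = []
for div in soup.find_all('div', {'class': 'span1'}):
    caption = div.find('div', {'class': 'caption'}).text.strip()
    value = div.find('div', {'class': 'value'}).text.strip()
    pairs.append((caption, value))

print(pairs)   # [('AQI', '57'), ('PM2.5/1h', '31')]
```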

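The Python string strip() method

Description: strip() removes the specified characters (whitespace, including newlines, by default) from the beginning and end of a string.
Note: it only removes characters at the beginning or end, never in the middle.
Syntax: str.strip([chars])
Parameter: chars, the set of characters to remove from both ends.
Return value: a new string with the specified characters removed from both ends.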
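
A few quick calls showing the behaviour described above; the sample strings are arbitrary.

```python
text = '  \n 128 \n  '
print(repr(text.strip()))          # '128'

# With an argument, every leading/trailing character found in chars is removed.
print('xxhixyx'.strip('xy'))       # hi

# Characters in the middle are never touched.
print(repr(' a b '.strip()))       # 'a b'

# lstrip() and rstrip() work on one side only.
print(repr('  left'.lstrip()), repr('right  '.rstrip()))
```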

The v_6.0 code is as follows:

"""
    Author: xpt
    Function: AQI calculation
    Version: 6.0
    Date: 02/12/2018
    2.0--Read a previously obtained JSON data file and write the top 5 AQI records to a file
    3.0--Read a previously obtained JSON data file and convert it to a CSV file
    4.0--Determine from the input file whether it is JSON or CSV, and process it accordingly
    5.0--Use a web crawler to fetch a city's air quality in real time
    6.0--Parse and process HTML efficiently with beautifulsoup4
"""
import requests
from bs4 import BeautifulSoup


def get_city_value(city_pinyin):
    """
        Get the indices for one city
    """
    url = 'http://pm25.in/' + city_pinyin
    r = requests.get(url, timeout=30)
    soup = BeautifulSoup(r.text, 'lxml')
    div_list = soup.find_all('div', {'class': 'span1'})

    city_value_list = []
    for i in range(8):
        div_content = div_list[i]
        caption = div_content.find('div', {'class': 'caption'}).text.strip()
        value = div_content.find('div', {'class': 'value'}).text.strip()
        # Put each (caption, value) pair into the list as a tuple
        city_value_list.append((caption, value))
    return city_value_list


def main():
    """
        Main function
    """
    city_pinyin = input('Enter the city name in pinyin: ')
    city_value = get_city_value(city_pinyin)
    print(city_value)


if __name__ == '__main__':
    main()

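• Fetching the AQI of all cities

AQI Calculation 7.0: use beautifulsoup4 to fetch the air quality of all cities

Step 1: get the list of all cities and their URLs

1. Find the node
Note: the page has two div elements with class bottom, and the city list is in the second one, at index [1] (the first is at index [0]):
city_div = soup.find_all('div', {'class': 'bottom'})[1]
2. Continue searching inside the selected node

The city_list error in the original screenshot occurred because I forgot to initialize the list with city_list = [].

Step 2: fetch each city's air quality from its URL (the 6.0 program)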

The v_7.0 code is as follows:

"""
    Author: xpt
    Function: AQI calculation
    Version: 7.0
    Date: 03/12/2018
    2.0--Read a previously obtained JSON data file and write the top 5 AQI records to a file
    3.0--Read a previously obtained JSON data file and convert it to a CSV file
    4.0--Determine from the input file whether it is JSON or CSV, and process it accordingly
    5.0--Use a web crawler to fetch a city's air quality in real time
    6.0--Parse and process HTML efficiently with beautifulsoup4
    7.0--Use beautifulsoup4 to fetch the air quality of all cities
"""
import requests
from bs4 import BeautifulSoup


def get_city_value(city_pinyin):
    """
        Get the indices for one city
    """
    url = 'http://pm25.in/' + city_pinyin
    r = requests.get(url, timeout=30)
    soup = BeautifulSoup(r.text, 'lxml')
    div_list = soup.find_all('div', {'class': 'span1'})

    city_value_list = []
    for i in range(8):
        div_content = div_list[i]
        caption = div_content.find('div', {'class': 'caption'}).text.strip()
        value = div_content.find('div', {'class': 'value'}).text.strip()
        # Put each (caption, value) pair into the list as a tuple
        city_value_list.append((caption, value))
    return city_value_list


def get_all_cities():
    """
        Get all cities
    """
    url = 'http://pm25.in/'
    city_list = []
    r = requests.get(url, timeout=30)
    soup = BeautifulSoup(r.text, 'lxml')

    # The second div with class 'bottom' holds the city links
    city_div = soup.find_all('div', {'class': 'bottom'})[1]
    city_link_div = city_div.find_all('a')

    for city_link in city_link_div:
        city_name = city_link.text
        # href looks like '/shanghai'; drop the leading slash
        city_pinyin = city_link['href'][1:]
        city_list.append((city_name, city_pinyin))
    return city_list


def main():
    """
        Main function
    """
    city_list = get_all_cities()
    for city in city_list:
        city_name = city[0]
        city_pinyin = city[1]
        city_value = get_city_value(city_pinyin)
        print(city_name, city_value)


if __name__ == '__main__':
    main()

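• Fetching the AQI of all cities and saving it as a CSV data file

AQI Calculation 8.0: save the air quality of all cities to a CSV data file

Step 1: file operations

Step 2: progress display

Because there are many cities in China, it is hard to tell how far the program has got while it runs, so a progress message is added.
Printing a message for every record is redundant. To print one only every ten records, test the counter, for example:
i % 10 == 0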
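
A minimal sketch of printing progress every ten records. The work list is made up and the processing step is left as a placeholder comment.

```python
# Made-up work list standing in for the list of cities.
items = ['city_{}'.format(n) for n in range(25)]

messages = []
for i, item in enumerate(items):
    # ... process item here ...
    # enumerate() starts at 0, so i + 1 is the number of items handled so far.
    if (i + 1) % 10 == 0:
        msg = 'Processed {} of {} records'.format(i + 1, len(items))
        messages.append(msg)
        print(msg)
```

Testing (i + 1) % 10 rather than i % 10 means the first message appears after the tenth record instead of before the first one.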

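Step 3: modify the functions as needed

Only the values are written to the CSV rows, so get_city_value() is changed to return the values without their captions.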

The v_8.0 code is as follows:

"""
    Author: xpt
    Function: AQI calculation
    Version: 8.0
    Date: 03/12/2018
    2.0--Read a previously obtained JSON data file and write the top 5 AQI records to a file
    3.0--Read a previously obtained JSON data file and convert it to a CSV file
    4.0--Determine from the input file whether it is JSON or CSV, and process it accordingly
    5.0--Use a web crawler to fetch a city's air quality in real time
    6.0--Parse and process HTML efficiently with beautifulsoup4
    7.0--Use beautifulsoup4 to fetch the air quality of all cities
    8.0--Save the air quality of all cities to a CSV data file
"""
import requests
from bs4 import BeautifulSoup
import csv


def get_city_value(city_pinyin):
    """
        Get the indices for one city
    """
    url = 'http://pm25.in/' + city_pinyin
    r = requests.get(url, timeout=30)
    soup = BeautifulSoup(r.text, 'lxml')
    div_list = soup.find_all('div', {'class': 'span1'})

    city_value_list = []
    for i in range(8):
        div_content = div_list[i]
        caption = div_content.find('div', {'class': 'caption'}).text.strip()
        value = div_content.find('div', {'class': 'value'}).text.strip()
        # Previously: put each (caption, value) pair into the list as a tuple
        # city_value_list.append((caption, value))
        # Now only the value is kept; the captions go into the CSV header
        city_value_list.append(value)
    return city_value_list


def get_all_cities():
    """
        Get all cities
    """
    url = 'http://pm25.in/'
    city_list = []
    r = requests.get(url, timeout=30)
    soup = BeautifulSoup(r.text, 'lxml')

    city_div = soup.find_all('div', {'class': 'bottom'})[1]
    city_link_div = city_div.find_all('a')

    for city_link in city_link_div:
        city_name = city_link.text
        city_pinyin = city_link['href'][1:]
        city_list.append((city_name, city_pinyin))
    return city_list


def main():
    """
        Main function
    """
    city_list = get_all_cities()
    # Column names for the header row
    header = ['City', 'AQI', 'PM2.5/1h', 'PM10/1h', 'CO/1h', 'NO2/1h', 'O3/1h', 'O3/8h', 'SO2/1h']

    with open('china_city_aqi.csv', 'w', encoding='utf-8', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for i, city in enumerate(city_list):
            if (i+1) % 10 == 0:
                print('Processed {} records ({} in total).'.format(i+1, len(city_list)))
            city_name = city[0]
            city_pinyin = city[1]
            city_value = get_city_value(city_pinyin)
            row = [city_name] + city_value # wrap city_name in a list so it can be joined with +
            writer.writerow(row)


if __name__ == '__main__':
    main()

Progress is printed while the program runs, and the results are saved in china_city_aqi.csv.

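• Processing and analysing data with Pandas

AQI Calculation 9.0: data processing and analysis with Pandas

What is Pandas

Structured data: data with a defined structure, such as CSV and JSON
Unstructured data: data without a fixed format, such as images and video
Data cleaning: for example, handling records with missing values

Pandas data structures: Series

Pandas data structures: DataFrame

Pandas data structures: indexing

Column indexing: df_obj['label']

Selecting several (non-contiguous) columns: df_obj[['label1', 'label2']]

Pandas data structures: sorting

sort_values() sorts by value in ascending order by default; for descending order, add ascending=False after the by argument.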
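
A short sketch of column indexing and sorting on a tiny DataFrame. The values are made up; only the column names follow the CSV header used in the v_8.0 code.

```python
import pandas as pd

# Made-up values; the column names follow the CSV header used in v_8.0.
df_obj = pd.DataFrame({
    'City': ['A', 'B', 'C'],
    'AQI': [80, 35, 120],
    'PM2.5/1h': [55, 20, 90],
})

aqi = df_obj['AQI']                    # one label -> Series
subset = df_obj[['City', 'AQI']]       # list of labels -> DataFrame

best_first = df_obj.sort_values(by='AQI')                     # ascending
worst_first = df_obj.sort_values(by='AQI', ascending=False)   # descending

print(type(aqi).__name__, type(subset).__name__)
print(best_first['City'].tolist())    # ['B', 'A', 'C']
print(worst_first['City'].tolist())   # ['C', 'A', 'B']
```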

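Pandas statistics and descriptions

The Pandas method for filling in missing data is fillna(), which is available on both DataFrame and Series objects (there is no top-level pandas.fillna() function).

pandas can read a CSV file directly: aqi_data = pd.read_csv('china_city_aqi.csv')
pandas can also save data directly to a CSV file: top10_citys.to_csv('top10_aqi_city.csv', index=False)
Here index=False means that the index is not written to the file.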
The v_9.0 code is as follows:

"""
    Author: xpt
    Function: AQI calculation
    Version: 9.0
    Date: 04/12/2018
    2.0--Read a previously obtained JSON data file and write the top 5 AQI records to a file
    3.0--Read a previously obtained JSON data file and convert it to a CSV file
    4.0--Determine from the input file whether it is JSON or CSV, and process it accordingly
    5.0--Use a web crawler to fetch a city's air quality in real time
    6.0--Parse and process HTML efficiently with beautifulsoup4
    7.0--Use beautifulsoup4 to fetch the air quality of all cities
    8.0--Save the air quality of all cities to a CSV data file
    9.0--Data processing and analysis with Pandas
"""
import pandas as pd


def main():
    """
        Main function
    """
    aqi_data = pd.read_csv('china_city_aqi.csv')
    print('Basic information:')
    print(aqi_data.info())

    print('Data preview:')
    print(aqi_data.head())

    # Basic statistics
    print('Maximum AQI:', aqi_data['AQI'].max())
    print('Minimum AQI:', aqi_data['AQI'].min())
    print('Mean AQI:', aqi_data['AQI'].mean())

    # top 10 (sort_values is ascending by default)
    top10_citys = aqi_data.sort_values(by=['AQI']).head(10)
    print('The ten cities with the best air quality:')
    print(top10_citys)

    # bottom 10 (sorted in descending order)
    # bottom10_citys = aqi_data.sort_values(by=['AQI']).tail(10)
    bottom10_citys = aqi_data.sort_values(by=['AQI'], ascending=False).head(10)
    print('The ten cities with the worst air quality:')
    print(bottom10_citys)

    # Save to CSV files
    top10_citys.to_csv('top10_aqi_city.csv', index=False)
    bottom10_citys.to_csv('bottom10_aqi_city.csv', index=False)


if __name__ == '__main__':
    main()

• Data cleaning
• Data visualisation with Pandas

AQI Calculation 10.0: data cleaning; data visualisation with Pandas

Pandas data cleaning
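
A minimal cleaning sketch on made-up data. It assumes, as the v_10.0 code does, that rows with a non-positive AQI are invalid; dropna() and fillna() are shown as two common ways of dealing with missing values.

```python
import pandas as pd

# Made-up data: 0 marks a station with no reading, None a missing value.
aqi_data = pd.DataFrame({
    'City': ['A', 'B', 'C', 'D'],
    'AQI': [80, 0, 120, 35],
    'CO/1h': [0.8, None, 1.2, 0.5],
})

# Boolean mask: keep only the rows whose AQI is positive.
clean_aqi_data = aqi_data[aqi_data['AQI'] > 0]
print(clean_aqi_data['City'].tolist())      # ['A', 'C', 'D']

# Missing values can be dropped or filled.
print(len(aqi_data.dropna()))               # 3
filled = aqi_data.fillna({'CO/1h': 0.0})
print(filled['CO/1h'].tolist())             # [0.8, 0.0, 1.2, 0.5]
```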

Pandas data visualisation

More examples: https://pandas.pydata.org/pandas-docs/stable/visualization.html
The kind argument determines the type of chart.
To use plt.savefig(), add import matplotlib.pyplot as plt.
To display Chinese characters correctly:

plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

The v_10.0 code is as follows:

"""
    Author: xpt
    Function: AQI calculation
    Version: 10.0
    Date: 04/12/2018
    2.0--Read a previously obtained JSON data file and write the top 5 AQI records to a file
    3.0--Read a previously obtained JSON data file and convert it to a CSV file
    4.0--Determine from the input file whether it is JSON or CSV, and process it accordingly
    5.0--Use a web crawler to fetch a city's air quality in real time
    6.0--Parse and process HTML efficiently with beautifulsoup4
    7.0--Use beautifulsoup4 to fetch the air quality of all cities
    8.0--Save the air quality of all cities to a CSV data file
    9.0--Data processing and analysis with Pandas
    10.0--Data cleaning; data visualisation with Pandas
"""
import pandas as pd
import matplotlib.pyplot as plt

# Use a font that contains Chinese glyphs, and render the minus sign correctly
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False


def main():
    """
        Main function
    """
    aqi_data = pd.read_csv('china_city_aqi.csv')
    print('Basic information:')
    print(aqi_data.info())

    print('Data preview:')
    print(aqi_data.head())

    # Data cleaning
    # Keep only the rows with AQI > 0
    # filter_condition = aqi_data['AQI'] > 0
    # clean_aqi_data = aqi_data[filter_condition]
    clean_aqi_data = aqi_data[aqi_data['AQI'] > 0]

    # Basic statistics
    print('Maximum AQI:', clean_aqi_data['AQI'].max())
    print('Minimum AQI:', clean_aqi_data['AQI'].min())
    print('Mean AQI:', clean_aqi_data['AQI'].mean())

    top50_citys = clean_aqi_data.sort_values(by=['AQI']).head(50)
    top50_citys.plot(kind='line', x='City', y='AQI', title='The 50 cities with the best air quality',
                     figsize=(20, 10))
    plt.savefig('top50_city_aqi.png')
    plt.show()


if __name__ == '__main__':
    main()

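For a line chart, use kind='line'; other values of kind, such as 'bar', produce other chart types.

That is the end of the introductory course.
-- from 夏普通, dog-tired at 1:18 a.m.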
