Data Analysis with matplotlib

Posted by lose bai on 2020-12-05

Original URL: https://blog.csdn.net/weixin_45904404/article/details/110673124

Data visualization with matplotlib

Development environment: JupyterLab

What is data visualization

https://matplotlib.org/

Installing matplotlib

python -m pip install -U matplotlib

Matplotlib depends on the following packages:
Python (>= 3.6)
NumPy (>= 1.15)
setuptools
cycler (>= 0.10.0)
dateutil (>= 2.1)
kiwisolver (>= 1.0.0)
Pillow (>= 6.2)
pyparsing (>=2.0.3)
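
After installing, it can help to confirm which versions were actually picked up. The snippet below is a minimal check, not part of the original post; it only reads version strings and the active backend name.

```python
import matplotlib
import numpy

# Print the installed versions and the backend matplotlib selected
print("matplotlib", matplotlib.__version__)
print("numpy", numpy.__version__)
print("backend", matplotlib.get_backend())
```

In JupyterLab the backend is normally an inline one, so figures appear directly below the cell.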

Basic usage

# coding=utf-8
from matplotlib import pyplot as plt

x = range(2,26,2)
y = [15,13,14.5,17,20,25,26,26,27,22,18,15]

# Set the figure size; a higher dpi gives a sharper image
plt.figure(figsize=(20,8),dpi=80)

# Draw the line
plt.plot(x,y)
# Display the figure
plt.show()

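This is a line chart built by pairing each x value with the corresponding y value.
It looks rather plain, though, so the ticks can be customized: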

# Call these before plt.show()
plt.xticks(range(2,29,2)) # set the x-axis ticks
plt.yticks(range(2,29,2)) # set the y-axis ticks

plt.savefig('./01-test.png') # save the figure
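
The order of these calls matters in a script: ticks and savefig need to run while the figure is still open, which in practice means before plt.show(). The sketch below puts the pieces of this section together in one workable order, reusing the data and file name from above; the y tick range of 12 to 28 is an assumption chosen to fit the data.

```python
from matplotlib import pyplot as plt

x = range(2, 26, 2)
y = [15, 13, 14.5, 17, 20, 25, 26, 26, 27, 22, 18, 15]

# 20 x 8 inches at 80 dpi gives a 1600 x 640 pixel image
fig = plt.figure(figsize=(20, 8), dpi=80)
plt.plot(x, y)

# One tick per data point on x; every 2 units across the data range on y
plt.xticks(range(2, 26, 2))
plt.yticks(range(12, 29, 2))

# Save while the figure is still open; call plt.show() only after this line
plt.savefig("./01-test.png")
```

In JupyterLab the figure is rendered when the cell finishes, so plt.show() can be left out entirely.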

Usage examples from the plt.yticks docstring (plt.xticks takes the same arguments):

>>> locs, labels = yticks()  # Get the current locations and labels.
>>> yticks(np.arange(0, 1, step=0.2))  # Set label locations.
>>> yticks(np.arange(3), ['Tom', 'Dick', 'Sue'])  # Set text labels.
>>> yticks([0, 1, 2], ['January', 'February', 'March'],
...        rotation=45)  # Set text labels and properties.
>>> yticks([])  # Disable yticks.

Passing extra arguments changes how the ticks are drawn. For example, rotation tilts the tick labels:

plt.yticks(range(2,29,2),rotation=45)

Adding text

matplotlib's default font does not include Chinese glyphs, so Chinese text needs extra font configuration:

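import random
import matplotlib
from matplotlib import pyplot as plt

# Font settings for Windows and Linux
font = {'family' : 'MicroSoft YaHei',
        'weight': 'bold',
        'size': 14}
matplotlib.rc("font",**font)
# The same settings can also be passed as keyword arguments
matplotlib.rc("font",family='MicroSoft YaHei',weight="bold")

x = range(0,120)
y = [random.randint(20,35) for i in range(120)]
# Chinese tick labels: "10點5分" reads as 10:05, "11點5分" as 11:05
_xtick_labels = ["10點{}分".format(i) for i in range(60)]
_xtick_labels += ["11點{}分".format(i) for i in range(60)]

plt.plot(x,y)
# Take every third tick; positions and labels pair up one-to-one, so both lists must have the same length
plt.xticks(list(x)[::3],_xtick_labels[::3],rotation=45,fontproperties=font) # rotation is the angle in degrees
# Add descriptive labels
plt.xlabel("時間",fontproperties=font) # "Time"
plt.ylabel("溫度 單位(℃)",fontproperties=font) # "Temperature (℃)"
plt.title("10點到12點每分鐘的氣溫變化情況",fontproperties=font) # "Temperature per minute from 10:00 to 12:00"
plt.show()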
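
The font named above only exists on some systems. As a hedged sketch, the snippet below looks through the fonts matplotlib has registered and picks the first match from a candidate list. The candidate names are assumptions, not something the original post specifies; replace them with fonts installed on your machine.

```python
import matplotlib
from matplotlib import font_manager

# Assumed candidate family names; adjust to the fonts on your system
candidates = ["Microsoft YaHei", "SimHei", "PingFang SC",
              "Noto Sans CJK SC", "WenQuanYi Micro Hei"]

# Family names of every font file matplotlib knows about
installed = {entry.name for entry in font_manager.fontManager.ttflist}
chosen = next((name for name in candidates if name in installed), None)

if chosen is not None:
    # Put the chosen font first in the sans-serif fallback list
    current = list(matplotlib.rcParams["font.sans-serif"])
    matplotlib.rcParams["font.sans-serif"] = [chosen] + current
    # Use the ASCII hyphen for minus signs, since some CJK fonts lack U+2212
    matplotlib.rcParams["axes.unicode_minus"] = False

print("Using font:", chosen)
```

If chosen comes back as None, no candidate is installed and Chinese labels will still show as empty boxes until a suitable font is added.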

Using the different chart types

Switching between chart types only requires changing the plotting call:

plt.plot(x,y) ->  plt.scatter(x,y)

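Line chart

Line chart: a statistical chart that shows increases and decreases in a quantity through the rise and fall of a line.
Characteristics: shows the trend of the data and reflects how things change. (change)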

plt.plot(x,y)

@_copy_docstring_and_deprecators(Axes.plot)
def plot(*args, scalex=True, scaley=True, data=None, **kwargs):
    return gca().plot(
        *args, scalex=scalex, scaley=scaley,
        **({"data": data} if data is not None else {}), **kwargs)
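
The source above shows that plt.plot simply forwards to the plot method of the current Axes. The sketch below calls that method directly and draws two lines with different styles and a legend. The second series is hypothetical data derived from the first, for illustration only.

```python
import matplotlib.pyplot as plt

x = list(range(2, 26, 2))
y = [15, 13, 14.5, 17, 20, 25, 26, 26, 27, 22, 18, 15]
y_shifted = [value - 2 for value in y]  # hypothetical second series

fig, ax = plt.subplots(figsize=(10, 4))

# ax.plot returns a list of Line2D objects, one per line drawn
line1, = ax.plot(x, y, color="tab:red", marker="o", label="series 1")
line2, = ax.plot(x, y_shifted, color="tab:blue", linestyle="--", label="series 2")

ax.legend()          # uses the label= values given above
ax.grid(alpha=0.3)   # light grid to make values easier to read
```

Keeping a reference to the Axes makes it clear which plot each call modifies once a figure holds more than one.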

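Histogram

Histogram: a series of vertical bars or line segments of varying height that show how data is distributed.
The horizontal axis usually shows the data range and the vertical axis shows the distribution.
Characteristics: plots continuous data and shows the distribution of one or more data sets. (statistics)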

def hist(
        x, bins=None, range=None, density=False, weights=None,
        cumulative=False, bottom=None, histtype='bar', align='mid',
        orientation='vertical', rwidth=None, log=False, color=None,
        label=None, stacked=False, *, data=None, **kwargs):
    return gca().hist(
        x, bins=bins, range=range, density=density, weights=weights,
        cumulative=cumulative, bottom=bottom, histtype=histtype,
        align=align, orientation=orientation, rwidth=rwidth, log=log,
        color=color, label=label, stacked=stacked,
        **({"data": data} if data is not None else {}), **kwargs)

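A histogram shows how often values occur, either as counts or as frequencies. Several parameters adjust it, as in the following example: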

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(19680801)

n_bins = 10
x = np.random.randn(1000, 3)

fig, axes = plt.subplots(nrows=2, ncols=2)
ax0, ax1, ax2, ax3 = axes.flatten()

colors = ['red', 'tan', 'lime']
ax0.hist(x, n_bins, density=True, histtype='bar', color=colors, label=colors)
ax0.legend(prop={'size': 10})
ax0.set_title('bars with legend')

ax1.hist(x, n_bins, density=True, histtype='barstacked')
ax1.set_title('stacked bar')

ax2.hist(x,  histtype='barstacked', rwidth=0.9)
ax2.set_title('stacked bar, rwidth=0.9')

ax3.hist(x[:, 0], rwidth=0.9)
ax3.set_title('single data set')

fig.tight_layout()
plt.show()

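The density parameter controls whether the y-axis shows a probability density or a count, and it matches the first value returned by hist. histtype controls the style of the histogram: the default is 'bar', which places multiple data sets side by side as in subplot 1, while 'barstacked' stacks them on top of each other as in subplots 2 and 3. rwidth controls the relative width of the bars, which leaves gaps between them; compare subplots 2 and 3. Subplot 4 shows a single data set.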
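
To make the effect of density concrete, the sketch below draws the same sample twice and inspects the values hist returns. The sample is randomly generated here and is not data from the original post.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(19680801)
data = rng.normal(size=1000)

fig, (ax_count, ax_density) = plt.subplots(1, 2, figsize=(10, 4))

# hist returns the bar heights, the bin edges, and the drawn patches
counts, edges, _ = ax_count.hist(data, bins=10, rwidth=0.9)
ax_count.set_title("density=False (counts)")

dens, edges_d, _ = ax_density.hist(data, bins=10, density=True, rwidth=0.9)
ax_density.set_title("density=True (probability density)")

# Counts add up to the sample size; densities integrate to about 1
print(counts.sum())
print((dens * np.diff(edges_d)).sum())
```

With density=True the bar heights are not probabilities by themselves; each height times its bin width gives the share of samples in that bin.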

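Bar chart

Bar chart: data arranged in the rows or columns of a worksheet can be plotted as a bar chart.
Characteristics: plots discrete data, makes the size of each value visible at a glance, and makes differences between values easy to compare. (statistics)

plt.bar(x,y,width=1)

It takes an iterable of x positions and the corresponding bar heights (the second positional argument, y here); width is optional.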

# Autogenerated by boilerplate.py.  Do not edit as changes will be lost.
@_copy_docstring_and_deprecators(Axes.bar)
def bar(
        x, height, width=0.8, bottom=None, *, align='center',
        data=None, **kwargs):
    return gca().bar(
        x, height, width=width, bottom=bottom, align=align,
        **({"data": data} if data is not None else {}), **kwargs)
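
As the signature shows, bar takes positions first and heights second. The sketch below draws a vertical and a horizontal version of the same chart; the category names and values are hypothetical, for illustration only.

```python
import matplotlib.pyplot as plt

# Hypothetical category data
names = ["A", "B", "C", "D"]
values = [23, 17, 35, 29]
positions = list(range(len(names)))

# Vertical bars: bar(x, height, width=...)
fig, ax = plt.subplots()
bars = ax.bar(positions, values, width=0.6)
ax.set_xticks(positions)
ax.set_xticklabels(names)

# Horizontal bars: barh(y, width, height=...), so the value becomes the bar width
fig2, ax2 = plt.subplots()
hbars = ax2.barh(positions, values, height=0.6)
ax2.set_yticks(positions)
ax2.set_yticklabels(names)
```

Horizontal bars tend to read better when the category labels are long.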

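Scatter plot: two sets of data form a set of coordinate points; the distribution of the points is examined to judge whether the two variables are related, or to summarize the pattern of the distribution.
Characteristics: shows whether variables follow a quantitative trend and reveals outliers. (distribution pattern)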

plt.scatter(x,y)

Source code

# Autogenerated by boilerplate.py.  Do not edit as changes will be lost.
@_copy_docstring_and_deprecators(Axes.scatter)
def scatter(
        x, y, s=None, c=None, marker=None, cmap=None, norm=None,
        vmin=None, vmax=None, alpha=None, linewidths=None,
        verts=cbook.deprecation._deprecated_parameter,
        edgecolors=None, *, plotnonfinite=False, data=None, **kwargs):
    __ret = gca().scatter(
        x, y, s=s, c=c, marker=marker, cmap=cmap, norm=norm,
        vmin=vmin, vmax=vmax, alpha=alpha, linewidths=linewidths,
        verts=verts, edgecolors=edgecolors,
        plotnonfinite=plotnonfinite,
        **({"data": data} if data is not None else {}), **kwargs)
    sci(__ret)
    return __ret
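
As a small illustration of judging whether two variables are related, the sketch below scatters synthetic data with a linear trend plus noise and prints the Pearson correlation coefficient in the title. The data is generated here and is purely an assumption for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2 * x + rng.normal(scale=2, size=100)  # linear trend plus noise

fig, ax = plt.subplots()
points = ax.scatter(x, y, s=20, alpha=0.7)

# Pearson correlation between the two variables
r = np.corrcoef(x, y)[0, 1]
ax.set_title(f"Pearson r = {r:.2f}")
ax.set_xlabel("x")
ax.set_ylabel("y")
```

A value of r close to 1 or -1 suggests a strong linear relationship, while points far from the main cloud stand out as outliers.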

Plotting through Axes

import matplotlib.pyplot as plt
import numpy as np 

fig = plt.figure()
ax = fig.add_subplot(111)
ax.set(xlim=[0.5, 4.5], ylim=[-2, 8], title='An Example Axes',
       ylabel='Y-Axis', xlabel='X-Axis')
plt.show()
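
The same Axes methods scale to several plots in one figure. The sketch below is an illustrative extension rather than code from the original post: plt.subplots creates two stacked Axes that share an x-axis, and each one is configured through its own object.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

# Two rows, one column; sharex keeps the x-axis limits in sync
fig, (ax_top, ax_bottom) = plt.subplots(nrows=2, sharex=True, figsize=(8, 6))

ax_top.plot(x, np.sin(x))
ax_top.set(title="sin(x)", ylabel="value")

ax_bottom.plot(x, np.cos(x))
ax_bottom.set(title="cos(x)", xlabel="x", ylabel="value")

fig.tight_layout()  # avoid overlapping titles and labels
```

Because every call names its Axes explicitly, there is no dependence on which plot happens to be current.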

Pie chart

labels = 'Frogs', 'Hogs', 'Dogs', 'Logs'
sizes = [15, 30, 45, 10]
explode = (0, 0.1, 0, 0)  # only "explode" the 2nd slice (i.e. 'Hogs')

fig1, (ax1, ax2) = plt.subplots(2)
ax1.pie(sizes, labels=labels, autopct='%1.1f%%', shadow=True)
ax1.axis('equal')
ax2.pie(sizes, autopct='%1.2f%%', shadow=True, startangle=90, explode=explode,
    pctdistance=1.12)
ax2.axis('equal')
ax2.legend(labels=labels, loc='upper right')

plt.show()

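A pie chart draws wedges sized by each value's share of the total. labels gives the label for each wedge, as in the first subplot. autopct='%1.1f%%' is the format string for the percentage text. explode pulls selected wedges away from the center, and larger values move a wedge farther out. pctdistance=1.12 sets the distance of the percentage text from the center as a fraction of the radius; the default is 0.6.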
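
Besides a format string, autopct also accepts a function that receives each wedge's percentage and returns the text to draw. The sketch below uses that to show the percentage and the underlying value together; the helper name show_pct_and_count is made up for this example.

```python
import matplotlib.pyplot as plt

labels = ["Frogs", "Hogs", "Dogs", "Logs"]
sizes = [15, 30, 45, 10]
total = sum(sizes)

def show_pct_and_count(pct):
    """Format a wedge label from its percentage share (hypothetical helper)."""
    count = round(pct * total / 100)  # recover the original value
    return f"{pct:.1f}%\n({count})"

fig, ax = plt.subplots()
ax.pie(sizes, labels=labels, autopct=show_pct_and_count, startangle=90)
ax.axis("equal")  # keep the pie circular
```

The function is called once per wedge, so any per-wedge text can be built this way.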

Box plot

To keep the focus on plotting, the data preparation is omitted. data has shape (n,) and data2 has shape (n, 3).

fig, (ax1, ax2) = plt.subplots(2)
ax1.boxplot(data)
ax2.boxplot(data2, vert=False) # vert=False draws the boxes horizontally
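
Since the original data is not shown, the sketch below generates stand-in arrays with the stated shapes so the example can run on its own. The distributions and the sample size of 200 are assumptions made for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(19680801)
data = rng.normal(size=200)                    # shape (n,)
data2 = rng.normal(loc=[0, 1, 2],              # one mean per column
                   scale=[1, 0.5, 2],          # one spread per column
                   size=(200, 3))              # shape (n, 3)

fig, (ax1, ax2) = plt.subplots(2)
result1 = ax1.boxplot(data)                # a single box
result2 = ax2.boxplot(data2, vert=False)   # one box per column, drawn horizontally

# boxplot returns a dict of the artists it drew
print(sorted(result1.keys()))
```

Each column of a 2-D array becomes its own box, which is why data2 produces three.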

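Bubble chart

A bubble chart is a kind of scatter plot with a third value, s, added. An ordinary scatter plot shows two dimensions; in a bubble chart the marker size shows the magnitude of a third variable Z, as in the following example: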

np.random.seed(19680801)


N = 50
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.rand(N)
area = (30 * np.random.rand(N))**2  # 0 to 15 point radii

plt.scatter(x, y, s=area, c=colors, alpha=0.5)
plt.show()

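Contour plot

Contour plots are useful for drawing boundaries; decision boundaries in machine learning are also commonly drawn this way, as in the following example: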

fig, (ax1, ax2) = plt.subplots(2)
x = np.arange(-5, 5, 0.1)
y = np.arange(-5, 5, 0.1)
xx, yy = np.meshgrid(x, y, sparse=True)
z = np.sin(xx**2 + yy**2) / (xx**2 + yy**2)
ax1.contourf(x, y, z)
ax2.contour(x, y, z)

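The code above draws the same contour plot twice; contourf fills the regions between contour lines with color. x, y and z are usually 2-D arrays of the same shape. x and y may also be 1-D vectors, but then z.shape must equal (y.n, x.n), where y.n and x.n are the lengths of y and x. Z is the height above the X-Y plane, and the X and Y passed in set the region over which the contours are drawn.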
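
The shape rule is easier to see when x and y have different lengths. The sketch below uses a Gaussian bump as a stand-in surface (an assumption for illustration), checks the shape of z, and adds a colorbar and inline labels.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 61)   # 61 points along x
y = np.linspace(-2, 2, 41)   # 41 points along y
xx, yy = np.meshgrid(x, y)   # both have shape (len(y), len(x))
z = np.exp(-(xx**2 + yy**2))

print(z.shape)  # rows follow y, columns follow x

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Filled contours from 1-D x and y, with a colorbar for the levels
filled = ax1.contourf(x, y, z, levels=10)
fig.colorbar(filled, ax=ax1)

# Line contours from the 2-D grids, with the level value written on each line
lines = ax2.contour(xx, yy, z, levels=5)
ax2.clabel(lines, inline=True, fontsize=8)
```

Passing 1-D x and y or the 2-D grids from meshgrid gives the same picture, as long as z has rows for y and columns for x.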
