    # Plot a scatter chart
    import matplotlib.pyplot as plt
    # file2matrix is the data-loading helper used in this post; it is assumed to be defined or imported already

    def f():
        datingDataMat, datingLabels = file2matrix("datingTestSet3.txt")
        fig = plt.figure()
        # ax = fig.add_subplot(199, projection='polar')
        # ax = fig.add_subplot(111, projection='hammer')
        # ax = fig.add_subplot(111, projection='lambert')
        # ax = fig.add_subplot(111, projection='mollweide')
        # ax = fig.add_subplot(111, projection='aitoff')
        # ax = fig.add_subplot(111, projection='rectilinear')
        # add_subplot(3, 4, 2) splits the canvas into 3 rows and 4 columns and
        # draws in the 2nd cell, counting left to right, then top to bottom
        ax = fig.add_subplot(3, 4, 2)  # fig.add_subplot(342) also works, but that form cannot express two-digit numbers
        ax.scatter(datingDataMat[:, 1], datingDataMat[:, 2])
        # ax1 = fig.add_subplot(221)
        # ax1.plot(datingDataMat[:, 1], datingDataMat[:, 2])
        plt.show()
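The result of fig.add_subplot(3,4,2) is shown below (I added the red frame; it is not part of the original output):
So the result of fig.add_subplot(3,4,12) is:
In other words, the third argument cannot exceed the product of the first two. Writing the call as fig.add_subplot(a,b,c), we need 1 <= c <= a*b; otherwise Matplotlib raises a ValueError.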
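The rule is easy to check directly. What follows is a minimal sketch, not code from the original post: the helper name `accepts_index` is made up for illustration, and the Agg backend is selected only so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, chosen only so the sketch runs anywhere
import matplotlib.pyplot as plt


def accepts_index(nrows, ncols, index):
    """Return True if add_subplot accepts the index, False if it raises ValueError."""
    fig = plt.figure()
    try:
        fig.add_subplot(nrows, ncols, index)
        return True
    except ValueError:
        return False
    finally:
        plt.close(fig)


# Try a few positions on the 3 x 4 grid used above.
results = {index: accepts_index(3, 4, index) for index in (1, 2, 12, 13)}
print(results)
```

On the Matplotlib versions I am aware of, positions 1 through 12 are accepted on a 3 x 4 grid and 13 is rejected.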
For fig.add_subplot(3,4,12), the explanation on the official website seems to be slightly off. Link: https://matplotlib.org/api/_as_gen/matplotlib.figure.Figure.html?highlight=add_subplot#matplotlib.figure.Figure.add_subplot
Look up add_subplot(*args, **kwargs) there and you get the following explanation:
*args
Either a 3-digit integer or three separate integers describing the position of the subplot. If the three integers are I, J, and K, the subplot is the Ith plot on a grid with J rows and K columns.
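Read literally, the three arguments are I, J, K in that order, with J rows, K columns, and I as the position of the plot. That order does not match the actual behavior, where the first two arguments are the row and column counts and the position comes last.
The matplotlib.pyplot.subplot page that "See also" points to, on the other hand, gives the correct explanation.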
subplot(nrows, ncols, index, **kwargs)
In the current figure, create and return an Axes, at position index of a (virtual) grid of nrows by ncols axes. Indexes go from 1 to nrows * ncols, incrementing in row-major order.
If nrows, ncols and index are all less than 10, they can also be given as a single, concatenated, three-digit number.
For example, subplot(2, 3, 3) and subplot(233) both create an Axes at the top right corner of the current figure, occupying half of the figure height and a third of the figure width.
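The row-major numbering described in the quote can be written out in a few lines. This is only an illustrative sketch in plain Python; `cell_of` and `split_three_digit` are hypothetical helper names, not Matplotlib functions.

```python
def cell_of(nrows, ncols, index):
    """Map a 1-based, row-major subplot index to a 0-based (row, col) pair."""
    if not 1 <= index <= nrows * ncols:
        raise ValueError(f"index must satisfy 1 <= index <= {nrows * ncols}, got {index}")
    return divmod(index - 1, ncols)


def split_three_digit(code):
    """Split the concatenated form, e.g. 233, into (nrows, ncols, index)."""
    nrows, rest = divmod(code, 100)
    ncols, index = divmod(rest, 10)
    return nrows, ncols, index


print(cell_of(3, 4, 2))                   # (0, 1): first row, second column
print(cell_of(3, 4, 12))                  # (2, 3): last row, last column
print(cell_of(*split_three_digit(233)))   # (0, 2): top right corner of a 2 x 3 grid
```

The three-digit form stops working as soon as any of the three numbers reaches 10, which is why add_subplot(3, 4, 12) has to be written with separate arguments.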
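Because the plot does not use the class labels of the samples, it is hard to read any useful information from it. The scatter function in Matplotlib supports customizing how each point on the scatter chart is drawn.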
    # Plot a scatter chart with the points marked by class
    from numpy import array

    def g():
        datingDataMat, datingLabels = file2matrix("datingTestSet2.txt")
        fig = plt.figure()
        ax = fig.add_subplot(111)
        ax.set_title("scatter")
        # ax.scatter(datingDataMat[:, 1], datingDataMat[:, 2])
        # ax.scatter(datingDataMat[:, 0], datingDataMat[:, 1], 15.0 * array(datingLabels), 15.0 * array(datingLabels))
        print(datingLabels)
        ax.scatter(datingDataMat[:, 1], datingDataMat[:, 2], 15.0 * array(datingLabels), 15.0 * array(datingLabels))
        # The last two positional arguments above are s and c, which set the
        # size and the color of each point. They do not have to be equal:
        # ax.scatter(datingDataMat[:, 0], datingDataMat[:, 1], s=15.0 * array(datingLabels), c=15.0 * array(datingLabels))
        # The factor 15 only scales the values so the differences are easier to see;
        # 1000 or 100000 would work too. For c the values are normalized through the
        # colormap, so the factor changes the sizes but not the colors.
        plt.show()
The scatter function deserves a closer look:
Axes.scatter(x, y, s=None, c=None, marker=None, cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None, verts=None, edgecolors=None, *, data=None, **kwargs)
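x and y give the positions of the points. s gives the size of the points. The official description:
scalar or array_like, shape (n, ), optional (that is, a single number or an array-like)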
size in points^2. Default is rcParams['lines.markersize'] ** 2
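The description is terse and I did not fully follow it at first. s stands for size, and it is an area measured in points squared, so a round marker is about sqrt(s) points wide and doubling its width means multiplying s by 4. The results below were worked out by step-by-step testing.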
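To see what "size in points^2" means in practice, the sizes stored on the object returned by scatter can be inspected. This is a small sketch with made-up coordinates; it assumes the default (non-classic) style, where the default s equals rcParams['lines.markersize'] ** 2.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend, chosen only so the sketch runs anywhere
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x, y = [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]   # made-up points, not the dating data

default_points = ax.scatter(x, y)          # s omitted, so the default is used
big_points = ax.scatter(x, y, s=100)       # s given as a single number

default_area = float(default_points.get_sizes()[0])
big_area = float(big_points.get_sizes()[0])

# The default area should equal the squared default marker size.
print(default_area, plt.rcParams["lines.markersize"] ** 2)
# An area of 100 points^2 corresponds to a round marker about 10 points wide.
print(big_area, math.sqrt(big_area))
plt.close(fig)
```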
To make testing easier, I kept only the first 5 samples in datingTestSet2.txt:
40920 8.326976 0.953952 3
14488 7.153469 1.673904 2
26052 1.441871 0.805124 1
75136 13.147394 0.428964 1
38344 1.669788 0.134296 1
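The five rows above are enough to reproduce the experiments without the data file. The sketch below is an assumption-laden stand-in: `load_samples` is a hypothetical replacement for file2matrix (which is not shown in this post), and it returns the labels as a NumPy integer array rather than whatever type file2matrix returns.

```python
import io

import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, chosen only so the sketch runs anywhere
import matplotlib.pyplot as plt

SAMPLES = """\
40920 8.326976 0.953952 3
14488 7.153469 1.673904 2
26052 1.441871 0.805124 1
75136 13.147394 0.428964 1
38344 1.669788 0.134296 1
"""


def load_samples(text):
    """Hypothetical stand-in for file2matrix: returns (features, labels)."""
    table = np.loadtxt(io.StringIO(text))
    return table[:, :3], table[:, 3].astype(int)


datingDataMat, datingLabels = load_samples(SAMPLES)

fig, ax = plt.subplots()
# Size and color both follow the class label, as in g() above.
points = ax.scatter(datingDataMat[:, 1], datingDataMat[:, 2],
                    s=15.0 * datingLabels, c=15.0 * datingLabels)
print(datingDataMat.shape, datingLabels.tolist())
print(points.get_sizes().tolist())
plt.close(fig)
```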
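ax.scatter(datingDataMat[:,1],datingDataMat[:,2],s=1) gives the following result:
ax.scatter(datingDataMat[:,1],datingDataMat[:,2],s=100)
To make the change easier to see, s was increased 100-fold. The result:
That covers a single number. The official description also mentions an array_like form, so let's test that too.
ax.scatter(datingDataMat[:,1],datingDataMat[:,2],s=[1]): no screenshot for this one, because it looks the same as the number 1; all points have the same size.
ax.scatter(datingDataMat[:,1],datingDataMat[:,2],s=[1,50]): let's see what this does:
Some points change and some do not, so what is the rule? After some testing (I will skip the intermediate steps), it turns out that each sample takes the element of s at the matching position. For example: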
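The 1st sample is x=8.326976, y=0.953952, and the 1st value in s is 1, so this point has size 1.
The 2nd sample is x=7.153469, y=1.673904, and the 2nd value in s is 50, so this point has size 50.
The 3rd sample is x=1.441871, y=0.805124. s has only two values, so we wrap around to the 1st value, which is 1, and this point has size 1.
The remaining samples follow the same pattern, cycling through s.
The same applies to s=[1,50,500].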
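The cycling can also be spelled out explicitly by building a size array with one entry per point. The sketch below uses np.resize for that; it is my own illustration, not code from the original test. As far as I know, newer Matplotlib releases are stricter about an s whose length differs from the number of points, so passing a full-length array is the safer way to get this effect.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, chosen only so the sketch runs anywhere
import matplotlib.pyplot as plt

# Columns 1 and 2 of the five samples listed above.
x = np.array([8.326976, 7.153469, 1.441871, 13.147394, 1.669788])
y = np.array([0.953952, 1.673904, 0.805124, 0.428964, 0.134296])

pattern = [1, 50]
# np.resize repeats the pattern until it has exactly x.size entries.
sizes = np.resize(pattern, x.size)

fig, ax = plt.subplots()
points = ax.scatter(x, y, s=sizes)
print(sizes.tolist())   # [1, 50, 1, 50, 1]
plt.close(fig)
```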
The c parameter
ax.scatter(datingDataMat[:,1],datingDataMat[:,2],s=[1,50], c='r')
The c parameter sets the color of the points.
c : color, sequence, or sequence of color, optional, default: ‘b’
c can be a single color format string, or a sequence of color specifications of length N, or a sequence of N numbers to be mapped to colors using the cmap and norm specified via kwargs (see below). Note that c should not be a single numeric RGB or RGBA sequence because that is indistinguishable from an array of values to be colormapped. c can be a 2-D array in which the rows are RGB or RGBA, however, including the case of a single row to specify the same color for all points.
Matplotlib recognizes the following formats to specify a color:
- an RGB or RGBA tuple of float values in [0, 1] (e.g., (0.1, 0.2, 0.5) or (0.1, 0.2, 0.5, 0.3));
- a hex RGB or RGBA string (e.g., '#0F0F0F' or '#0F0F0F0F');
- a string representation of a float value in [0, 1] inclusive for gray level (e.g., '0.5');
- one of {'b', 'g', 'r', 'c', 'm', 'y', 'k', 'w'};
- a X11/CSS4 color name;
- a name from the xkcd color survey; prefixed with 'xkcd:' (e.g., 'xkcd:sky blue');
- one of {'tab:blue', 'tab:orange', 'tab:green', 'tab:red', 'tab:purple', 'tab:brown', 'tab:pink', 'tab:gray', 'tab:olive', 'tab:cyan'} which are the Tableau Colors from the ‘T10’ categorical palette (which is the default color cycle);
- a “CN” color spec, i.e. 'C' followed by a single digit, which is an index into the default property cycle (matplotlib.rcParams['axes.prop_cycle']); the indexing occurs at artist creation time and defaults to black if the cycle does not include color.
All string specifications of color, other than “CN”, are case-insensitive.
c='r' turns every point red.
To give the points different colors, pass a list or a tuple, like this:
ax.scatter(datingDataMat[:,1],datingDataMat[:,2],s=[1,50], c=('r','b'))
The colors are assigned by the same rule as for s: the values are taken in order and reused in a cycle.
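As with s, the cycle can be made explicit by building one color per point. This is a minimal sketch using itertools; the variable names are mine, and, as far as I know, newer Matplotlib releases expect c to have either one entry or exactly one entry per point, so the full-length list is the safer form.

```python
from itertools import cycle, islice

import matplotlib
matplotlib.use("Agg")  # headless backend, chosen only so the sketch runs anywhere
import matplotlib.pyplot as plt

# Columns 1 and 2 of the five samples listed earlier.
x = [8.326976, 7.153469, 1.441871, 13.147394, 1.669788]
y = [0.953952, 1.673904, 0.805124, 0.428964, 0.134296]

palette = ("r", "b")
# Repeat the palette until there is one color for every point.
colors = list(islice(cycle(palette), len(x)))

fig, ax = plt.subplots()
points = ax.scatter(x, y, s=50, c=colors)
print(colors)   # ['r', 'b', 'r', 'b', 'r']
plt.close(fig)
```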
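The marker parameter
marker : MarkerStyle, optional, default: ‘o’
This sets the shape of the points on the chart. The default is 'o', the round dot we see most often. I could not see any difference between "." and "o" in my plots, although "." (point) is meant to be drawn as a smaller dot than "o" (circle) at the same s.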
All possible markers are defined here:
Below are all the possible styles. Try them out if you are interested; it is quite fun. I do not know how to use the names from TICKLEFT onward.
marker | description |
---|---|
"." | point |
"," | pixel |
"o" | circle |
"v" | triangle_down |
"^" | triangle_up |
"<" | triangle_left |
">" | triangle_right |
"1" | tri_down |
"2" | tri_up |
"3" | tri_left |
"4" | tri_right |
"8" | octagon |
"s" | square |
"p" | pentagon |
"P" | plus (filled) |
"*" | star |
"h" | hexagon1 |
"H" | hexagon2 |
"+" | plus |
"x" | x |
"X" | x (filled) |
"D" | diamond |
"d" | thin_diamond |
"\|" | vline |
"_" | hline |
TICKLEFT | tickleft |
TICKRIGHT | tickright |
TICKUP | tickup |
TICKDOWN | tickdown |
CARETLEFT | caretleft (centered at tip) |
CARETRIGHT | caretright (centered at tip) |
CARETUP | caretup (centered at tip) |
CARETDOWN | caretdown (centered at tip) |
CARETLEFTBASE | caretleft (centered at base) |
CARETRIGHTBASE | caretright (centered at base) |
CARETUPBASE | caretup (centered at base) |
"None", " " or "" | nothing |
'$...$' | render the string using mathtext. |
verts | a list of (x, y) pairs used for Path vertices. The center of the marker is located at (0,0) and the size is normalized. |
path | a Path instance. |
(numsides, style, angle) | The marker can also be a tuple (numsides, style, angle). |

For backward compatibility, the form (verts, 0) is also accepted, but it is equivalent to just verts for giving a raw set of vertices that define the shape.
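On the names from TICKLEFT onward: as far as I can tell, they are integer constants exported by the matplotlib.markers module, so they are imported and passed without quotes. The sketch below is a guess at the intended usage with made-up points, not something taken from the documentation quoted above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, chosen only so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.markers import CARETUP, TICKLEFT, MarkerStyle

fig, ax = plt.subplots()
x = [1, 2, 3]   # made-up points, not the dating data

# The constants are passed as-is, not as strings.
ax.scatter(x, [1, 1, 1], s=200, marker=TICKLEFT)
ax.scatter(x, [2, 2, 2], s=200, marker=CARETUP)

# MarkerStyle.markers maps every accepted marker to its description.
print(MarkerStyle.markers[TICKLEFT], MarkerStyle.markers[CARETUP])

fig.canvas.draw()   # force a render so any marker problem shows up here
plt.close(fig)
```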
I will not go through the remaining parameters for now; they can wait until they are needed.