4. Basic pandas usage

Posted by weixin_34007291 on 2017-07-10

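pandas is a data analysis package built on top of NumPy that provides higher-level data structures and tools.
Just as NumPy is built around the ndarray, pandas is built around two core data structures, Series and DataFrame. A Series is a one-dimensional sequence, and a DataFrame is a two-dimensional table.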
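As a quick illustration of the one-dimensional versus two-dimensional distinction, the following minimal sketch builds one object of each kind and inspects its dimensions. The sample values and the column names x and y are made up for this illustration.

```python
import numpy as np
import pandas as pd

# A Series is one-dimensional: a sequence of values plus an index of labels.
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
print(s.ndim, s.shape)                      # 1 (3,)

# A DataFrame is two-dimensional: rows and named columns.
df = pd.DataFrame({'x': [1, 2, 3], 'y': [4.0, 5.0, 6.0]})
print(df.ndim, df.shape)                    # 2 (3, 2)

# The values of a numeric Series are exposed as a NumPy ndarray.
print(isinstance(s.values, np.ndarray))     # True

# Selecting a single column of a DataFrame gives back a Series.
print(isinstance(df['x'], pd.Series))       # True
```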
Installation

conda install -n tensorflow pandas
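Here -n tensorflow names the conda environment to install into. One simple way to check that the install worked, assuming the interpreter is started from inside that environment, is to import the package and print its version:

```python
import pandas as pd

# If the import succeeds, pandas is installed in the active environment.
# The version string printed depends on what conda installed.
print(pd.__version__)
```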

Demo

import sys
from pandas import Series,DataFrame
import pandas as pd
print(sys.version)
'''
3.5.3 |Continuum Analytics, Inc.| (default, May 15 2017, 10:43:23) [MSC v.1900 64 bit (AMD64)]
'''
print('Series section')
# Mixing int, float, and str values makes pandas fall back to the generic object dtype
s = Series([1,2,3.0,'abc'])
print(s)
'''
0      1
1      2
2      3
3    abc
dtype: object
'''
# An explicit index replaces the default integer labels 0..n-1
s = Series(data=[1, 3, 5, 7], index=['a', 'b', 'x', 'y'])
print(s)
'''
a    1
b    3
x    5
y    7
dtype: int64
'''
print(s.index)
print(s.values)
'''
Index(['a', 'b', 'x', 'y'], dtype='object')
[1 3 5 7]
'''
print('DataFrame section')
# Each dict key becomes a column; the lists must all have the same length
data = {'state':['Ohio','Ohio','Ohio','Nevada','Nevada'],
        'year':[2000,2001,2002,2001,2002],
        'pop':[1.5,1.7,3.6,2.4,2.9]}
df = DataFrame(data)
# With no columns argument, this run (Python 3.5) shows the columns in sorted order
print(df)
'''
   pop   state  year
0  1.5    Ohio  2000
1  1.7    Ohio  2001
2  3.6    Ohio  2002
3  2.4  Nevada  2001
4  2.9  Nevada  2002
'''
# columns fixes the column order; 'debt' is not a key of data, so it is filled with NaN
df = DataFrame(data,index=['one','two','three','four','five'],
               columns=['year','state','pop','debt'])
print(df)
print(df.index)
print(df.columns)
print(type(df['debt']))
'''
       year   state  pop debt
one    2000    Ohio  1.5  NaN
two    2001    Ohio  1.7  NaN
three  2002    Ohio  3.6  NaN
four   2001  Nevada  2.4  NaN
five   2002  Nevada  2.9  NaN
Index(['one', 'two', 'three', 'four', 'five'], dtype='object')
Index(['year', 'state', 'pop', 'debt'], dtype='object')
<class 'pandas.core.series.Series'>
'''
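The demo stops at constructing the objects. As a possible next step, the sketch below shows label-based access and filling the empty debt column. It uses a shortened three-row copy of the data, and the fill value 0.0 is an arbitrary choice for illustration, not something from the original demo.

```python
import pandas as pd

data = {'state': ['Ohio', 'Ohio', 'Nevada'],
        'year': [2000, 2001, 2001],
        'pop': [1.5, 1.7, 2.4]}
df = pd.DataFrame(data, index=['one', 'two', 'three'],
                  columns=['year', 'state', 'pop', 'debt'])

# 'debt' is not a key of data, so every value in it starts out missing.
print(df['debt'].isnull().all())   # True

# Assigning a scalar to a column broadcasts it to every row.
df['debt'] = 0.0

# .loc selects by label: row label first, then column label.
print(df.loc['two', 'state'])      # Ohio

# A Series with a custom index can be indexed by label in the same way.
s = pd.Series([1, 3, 5, 7], index=['a', 'b', 'x', 'y'])
print(s['x'])                      # 5
```

Bracket access such as df['pop'] is used here rather than attribute access, because df.pop already refers to a DataFrame method.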