An Analysis of the pd.Series() Function

Posted by ckxllf on 2019-09-04

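  1. Introduction to Series

  The pandas module has two main data structures: (1) Series and (2) DataFrame.

  A Series is a one-dimensional labeled array built on NumPy's ndarray. By default, pandas uses the integers 0 to n-1 as the index of a Series, but you can also specify the index yourself (think of the index as playing the role of the keys in a dict).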
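A minimal sketch of the two indexing modes described above. The variable names and sample values are illustrative assumptions, not taken from the original post.

```python
import pandas as pd

# Without an explicit index, pandas labels the values 0..n-1.
default_s = pd.Series([10, 20, 30])
print(list(default_s.index))   # [0, 1, 2]

# With an explicit index, each value is looked up by its label,
# much like a key in a dict.
labeled_s = pd.Series([10, 20, 30], index=["x", "y", "z"])
print(labeled_s["y"])          # 20

# The underlying data can be exposed as a NumPy ndarray.
raw = labeled_s.to_numpy()
print(type(raw).__name__)      # ndarray
```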

  2. Creating a Series

  pd.Series([list],index=[list])

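  The first argument is a list. index is an optional argument: if omitted, the index defaults to integers starting from 0; if given, its length must equal the number of values.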

  import pandas as pd

  s=pd.Series([1,2,3,4,5],index=['a','b','c','f','e'])

  print(s)
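The length requirement can be checked directly. The sketch below uses illustrative values and assumes a reasonably recent pandas, where a mismatch between the data and the index raises ValueError.

```python
import pandas as pd

values = [1, 2, 3]

# Matching lengths: the Series is built normally.
ok = pd.Series(values, index=["a", "b", "c"])

# Mismatched lengths: three values but only two labels.
try:
    pd.Series(values, index=["a", "b"])
    mismatch_raised = False
except ValueError as exc:
    mismatch_raised = True
    print("ValueError:", exc)
```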

  pd.Series({dict})

  Takes a dictionary as the argument; the keys become the index and the values become the data.

  import pandas as pd

  s=pd.Series({'a':1,'b':2,'c':3,'f':4,'e':5})

  print(s)
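When a dict is combined with an explicit index, pandas looks up each requested label in the dict instead of pairing values by position. A small sketch with made-up data (the label "z" is deliberately absent from the dict):

```python
import pandas as pd

data = {"a": 1, "b": 2, "c": 3}

# Keys become the index, values become the data.
s = pd.Series(data)

# With index=, the Series follows the order of the requested labels;
# a label that is not a key in the dict ("z") gets NaN.
picked = pd.Series(data, index=["c", "a", "z"])
print(picked)
```

Because of the NaN, the dtype of `picked` becomes float64 even though the dict holds integers.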

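  3. Selecting values from a Series

  s[index] or s[[list of indexes]]

  Selection works like array indexing. To select several non-contiguous values, pass a list as the argument.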

  import pandas as pd

  import numpy as np

  v = np.random.random_sample(50)

  s = pd.Series(v)

  s1 = s[[3, 13, 23, 33]]

  s2 = s[3:13]

  s3 = s[43]

  print("s1", s1)

  print("s2", s2)

  print("s3", s3)

  s1 3 0.064095

  13 0.354023

  23 0.225739

  33 0.959288

  dtype: float64

  s2 3 0.064095

  4 0.405651

  5 0.024181

  6 0.367606

  7 0.844005

  8 0.405313

  9 0.102824

  10 0.806400

  11 0.950502

  12 0.735310

  dtype: float64

  s3 0.42803253918
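With the default 0 to n-1 index, labels and positions coincide, so s[43] is unambiguous. When integer labels differ from positions, the explicit accessors .loc (by label) and .iloc (by position) make the intent clear. A minimal sketch with hypothetical data:

```python
import pandas as pd

# Integer labels that differ from positions (illustrative values).
s = pd.Series([0.1, 0.2, 0.3, 0.4], index=[10, 20, 30, 40])

by_label = s.loc[20]          # the value labeled 20
by_position = s.iloc[1]       # the second element
label_slice = s.loc[20:40]    # label slices include both endpoints
pos_slice = s.iloc[1:3]       # positional slices exclude the end
several = s.loc[[10, 40]]     # non-contiguous labels via a list
```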

  4. Getting the first and last values of a Series

  .head(n); .tail(n)

  Return the first n or last n rows. n is optional and defaults to 5.

  import pandas as pd

  import numpy as np

  v = np.random.random_sample(50)

  s = pd.Series(v)

  print("s.head()", s.head())

  print("s.head(3)", s.head(3))

  print("s.tail()", s.tail())

  print("s.head(3)", s.head(3))

  s.head() 0 0.714136

  1 0.333600

  2 0.683784

  3 0.044002

  4 0.147745

  dtype: float64

  s.head(3) 0 0.714136

  1 0.333600

  2 0.683784

  dtype: float64

  s.tail() 45 0.779509

  46 0.778341

  47 0.331999

  48 0.444811

  49 0.028520

  dtype: float64

  s.head(3) 0 0.714136

  1 0.333600

  2 0.683784

  dtype: float64
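Because the example above uses random data, its output differs on every run. The sketch below uses a fixed range so the results are reproducible; it also shows that a negative n excludes rows from the opposite end.

```python
import pandas as pd

s = pd.Series(range(10))

first_default = s.head()          # first 5 rows (default n=5)
last_two = s.tail(2)              # last 2 rows
all_but_last_three = s.head(-3)   # negative n: all rows except the last 3

print(first_default.tolist())       # [0, 1, 2, 3, 4]
print(last_two.tolist())            # [8, 9]
print(all_but_last_three.tolist())  # [0, 1, 2, 3, 4, 5, 6]
```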

  5. Common Series operations

  import pandas as pd

  import numpy as np

  v = [10, 3, 2, 2, np.nan]

  v = pd.Series(v)

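  print("len():", len(v)) # length of the Series, including NaN

  print("shape():", np.shape(v)) # shape of the array, a 1-tuple such as (5,)

  print("count():", v.count()) # number of values, excluding NaN

  print("unique():", v.unique()) # the distinct values, including NaN

  print("value_counts():\n", v.value_counts()) # occurrences of each value, NaN excluded by default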

  len(): 5

  shape(): (5,)

  count(): 4

  unique(): [ 10. 3. 2. nan]

  value_counts():

  2.0 2

  3.0 1

  10.0 1

  dtype: int64
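The output above shows that NaN is counted by len() and listed by unique(), but skipped by count() and value_counts(). A short sketch, on the same data, of a few related calls that make the handling of missing values explicit:

```python
import numpy as np
import pandas as pd

v = pd.Series([10, 3, 2, 2, np.nan])

n_missing = int(v.isna().sum())                  # number of NaN entries
n_distinct = v.nunique()                         # distinct values, NaN excluded
counts_with_nan = v.value_counts(dropna=False)   # NaN gets its own row

print(n_missing)    # 1
print(n_distinct)   # 3
print(counts_with_nan)
```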

  6. Series addition

  import pandas as pd

  import numpy as np

  v = [10, 3, 2, 2, np.nan]

  v = pd.Series(v)

  sum = v[1:3] + v[1:3]

  sum1 = v[1:4] + v[1:4]

  sum2 = v[1:3] + v[1:4]

  sum3 = v[:3] + v[1:]

  print("sum", sum)

  print("sum1", sum1)

  print("sum2", sum2)

  print("sum3", sum3)

  sum 1 6.0

  2 4.0

  dtype: float64

  sum1 1 6.0

  2 4.0

  3 4.0

  dtype: float64

  sum2 1 6.0

  2 4.0

  3 NaN

  dtype: float64

  sum3 0 NaN

  1 6.0

  2 4.0

  3 NaN

  4 NaN

  dtype: float64
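The NaN results above come from index alignment: "+" matches elements by label, and a label present on only one side yields NaN. If a missing label should instead count as zero, Series.add with fill_value is one option. A sketch on the same data, using .iloc for the positional slices:

```python
import numpy as np
import pandas as pd

v = pd.Series([10, 3, 2, 2, np.nan])

left = v.iloc[:3]    # labels 0, 1, 2
right = v.iloc[1:]   # labels 1, 2, 3, 4 (label 4 holds NaN)

# Plain "+": labels 0, 3 and 4 are not matched by real values on both sides.
plain = left + right

# fill_value=0 substitutes 0 where only one side is missing.
# Label 4 stays NaN because it is missing on both sides after alignment.
filled = left.add(right, fill_value=0)
print(filled)
```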

  7. Querying a Series

  Range query

  import pandas as pd

  import numpy as np

  s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

  sa = pd.Series(s, name="age")

  print(sa[sa>19])

  jim 22.0

  lj 24.0

  ton 20.0

  Name: age, dtype: float64
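The comparison sa>19 produces a boolean Series that is then used as a mask. A sketch of how a two-sided range could be expressed on the same data; the explicit dtype="float64" is an assumption added here so that None reliably becomes NaN:

```python
import pandas as pd

ages = pd.Series(
    {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None},
    name="age",
    dtype="float64",  # assumption: force float so None is stored as NaN
)

# Combine conditions with & (and) or | (or); each comparison needs parentheses.
in_range = ages[(ages >= 19) & (ages <= 22)]

# Series.between expresses the same inclusive range.
same_range = ages[ages.between(19, 22)]

# Comparisons against NaN are False, so "car" is never selected.
print(in_range)
```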

  Median

  import pandas as pd

  import numpy as np

  s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

  sa = pd.Series(s, name="age")

  print("sa.median()", sa.median())

  sa.median() 20.0
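The result 20.0 is the median of the five known ages; the NaN for "car" is skipped by default. A sketch showing the skipna switch, with dtype="float64" again added as an assumption so that None is stored as NaN:

```python
import pandas as pd

ages = pd.Series(
    {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None},
    name="age",
    dtype="float64",  # assumption: force float so None is stored as NaN
)

median_default = ages.median()              # NaN ignored: median of 18, 19, 20, 22, 24
median_strict = ages.median(skipna=False)   # NaN propagates to the result
mean_default = ages.mean()                  # (18 + 19 + 20 + 22 + 24) / 5

print(median_default)   # 20.0
print(median_strict)    # nan
print(mean_default)     # 20.6
```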

  8. Assigning values in a Series

  import pandas as pd

  import numpy as np

  s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

  sa = pd.Series(s, name="age")

  print(s)

  print('----------------')

  sa['ton'] = 99

  print(sa)

  {'ton': 20, 'mary': 18, 'jack': 19, 'jim': 22, 'lj': 24, 'car': None}

  ----------------

  car NaN

  jack 19.0

  jim 22.0

  lj 24.0

  mary 18.0

  ton 99.0

  Name: age, dtype: float64
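Note that the example prints the original dict s, which is unchanged, and then the modified Series sa. Beyond assigning to a single existing label, a few other assignment forms are sketched below on illustrative data:

```python
import pandas as pd

ages = pd.Series({"ton": 20.0, "mary": 18.0, "jack": 19.0}, name="age")

ages.loc["ton"] = 99      # overwrite an existing label
ages.loc["new"] = 30      # assigning to an unknown label appends it
ages[ages < 19] = 0       # boolean-mask assignment: only "mary" matches

print(ages)
```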


Source: ITPUB blog, link: http://blog.itpub.net/69945560/viewspace-2655968/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
