Pandas Knowledge Summary (2): Boolean Indexing
Dataset link:
1. Computing statistics on boolean values
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Read the movie dataset and use movie_title as the row index
pd.options.display.max_columns = 50
movie = pd.read_csv("./data/movie.csv",index_col = 'movie_title')
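# Check whether each movie runs longer than two hours  # Figure1
movie_2_hours = movie['duration'] > 120
# Count the movies longer than two hours (True counts as 1, False as 0)
print(movie_2_hours.sum()) #result:1039
# Proportion of movies longer than two hours (rows with a missing duration count as False)
print(movie_2_hours.mean())
# Proportions of False and True
print(movie_2_hours.value_counts(normalize = True))
# Compare two columns of the same DataFrame, dropping rows with missing values first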
actors = movie[['actor_1_facebook_likes','actor_2_facebook_likes']].dropna()
print((actors['actor_1_facebook_likes'] > actors['actor_2_facebook_likes']).mean()) #Figure2
Output: Figure 1, Figure 2
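To make the arithmetic concrete, here is a minimal sketch on toy data. The `duration` values are invented for illustration and are not taken from movie.csv. It shows that `sum()` counts the `True` values, that `mean()` gives their share, and that a missing value compares as `False` unless it is dropped first, which is why the actor comparison above calls `dropna()`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the 'duration' column of movie.csv
duration = pd.Series([95, 130, 121, 120, np.nan, 150], name="duration")

# Element-wise comparison; NaN > 120 evaluates to False
over_2_hours = duration > 120

count_true = over_2_hours.sum()    # True counts as 1, False as 0
share_all = over_2_hours.mean()    # share over all 6 rows, the NaN row included

# Dropping missing values first gives the share among known durations only
share_known = (duration.dropna() > 120).mean()

print(count_true, share_all, share_known)
```

With one missing value out of six rows, the two shares differ (3 of 6 versus 3 of 5), so whether to call `dropna()` depends on which population the proportion should describe.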
2. Building multiple boolean conditions
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Read the movie dataset and use movie_title as the row index
pd.options.display.max_columns = 50
movie = pd.read_csv("./data/movie.csv",index_col = 'movie_title')
# Build several boolean conditions
criteria1 = movie.imdb_score > 8
criteria2 = movie.content_rating == "PG-13"
criteria3 = (movie.title_year < 2000) | (movie.title_year >= 2010)
"""
print(criteria1.head())
print(criteria2.head())
print(criteria3.head())
Output: Figure1
"""
# Combine the boolean conditions into a single one
criteria_final = criteria1 & criteria2 & criteria3
print(criteria_final.head())
# Output: Figure2
Output: Figure 1, Figure 2
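The parentheses around each comparison in the code above matter, because in Python `&` and `|` bind more tightly than `>`, `<` and `==`. The sketch below illustrates this on a small made-up frame (the rows are hypothetical; only the column names mirror the ones used above). It also shows that the keyword `and` cannot be used in place of `&`.

```python
import pandas as pd

# Hypothetical toy frame; column names mirror those used above
df = pd.DataFrame({
    "imdb_score": [8.5, 7.0, 9.0, 8.2],
    "content_rating": ["PG-13", "PG-13", "R", "PG-13"],
    "title_year": [1995, 2012, 2015, 2005],
})

# Each comparison is parenthesized so it is evaluated before & and |
combined = (
    (df["imdb_score"] > 8)
    & (df["content_rating"] == "PG-13")
    & ((df["title_year"] < 2000) | (df["title_year"] >= 2010))
)
print(combined.tolist())

# Python's `and` asks for one truth value of a whole Series, which pandas refuses
try:
    (df["imdb_score"] > 8) and (df["content_rating"] == "PG-13")
    and_failed = False
except ValueError:
    and_failed = True
print(and_failed)
```

Only the first row satisfies all three conditions, and the `and` attempt raises a `ValueError` because the truth value of a Series is ambiguous.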
3. Filtering with boolean indexing
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Read the movie dataset and use movie_title as the row index
pd.options.display.max_columns = 50
movie = pd.read_csv("./data/movie.csv",index_col = 'movie_title')
# Build the first set of boolean conditions
crit_a1 = movie.imdb_score > 8
crit_a2 = movie.content_rating == 'PG-13'
crit_a3 = (movie.title_year < 2000) | (movie.title_year > 2009)
final_crit_a = crit_a1 & crit_a2 & crit_a3
# Build the second set of boolean conditions
crit_b1 = movie.imdb_score < 5
crit_b2 = movie.content_rating == 'R'
crit_b3 = (movie.title_year >= 2000) & (movie.title_year <= 2010)
final_crit_b = crit_b1 & crit_b2 & crit_b3
# Combine the two sets with a logical OR
final_crit_all = final_crit_a | final_crit_b
print(final_crit_all.head()) #Figure 1
# Filter the data with the final boolean condition
print(movie[final_crit_all].head()) #Figure2
Output: Figure 1, Figure 2
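As a quick way to check that such a combined filter does what it is meant to, the same two rule sets can be applied to a handful of invented rows. The data and the single-letter index labels below are hypothetical; the conditions are the ones from the code above.

```python
import pandas as pd

# Hypothetical toy rows, indexed by made-up titles A to E
df = pd.DataFrame(
    {
        "imdb_score": [8.5, 4.2, 9.0, 4.8, 6.0],
        "content_rating": ["PG-13", "R", "PG-13", "R", "PG"],
        "title_year": [1995, 2005, 2005, 2015, 2001],
    },
    index=["A", "B", "C", "D", "E"],
)

# First rule set: highly rated PG-13 titles from before 2000 or after 2009
crit_a = (df["imdb_score"] > 8) & (df["content_rating"] == "PG-13") & (
    (df["title_year"] < 2000) | (df["title_year"] > 2009)
)
# Second rule set: poorly rated R titles from 2000 through 2010
crit_b = (df["imdb_score"] < 5) & (df["content_rating"] == "R") & (
    (df["title_year"] >= 2000) & (df["title_year"] <= 2010)
)

mask = crit_a | crit_b
selected = df[mask]   # keeps the rows where mask is True
print(selected)
```

Row A passes the first rule set and row B the second; C and D each fail on the year condition, and E matches neither rating. The number of rows kept always equals `mask.sum()`.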
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Read the movie dataset and use movie_title as the row index
pd.options.display.max_columns = 50
movie = pd.read_csv("./data/movie.csv",index_col = 'movie_title')
# Build the first set of boolean conditions
crit_a1 = movie.imdb_score > 8
crit_a2 = movie.content_rating == 'PG-13'
crit_a3 = (movie.title_year < 2000) | (movie.title_year > 2009)
final_crit_a = crit_a1 & crit_a2 & crit_a3
# Build the second set of boolean conditions
crit_b1 = movie.imdb_score < 5
crit_b2 = movie.content_rating == 'R'
crit_b3 = (movie.title_year >= 2000) & (movie.title_year <= 2010)
final_crit_b = crit_b1 & crit_b2 & crit_b3
# Combine the two sets with a logical OR
final_crit_all = final_crit_a | final_crit_b
# Use loc to filter rows and keep only the columns involved, which makes it easy to see whether the filter worked
cols = ['imdb_score','content_rating','title_year']
movie_filtered = movie.loc[final_crit_all,cols]
print(movie_filtered.head(10))
Output:
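A small sketch of the same `loc` pattern on invented data may help here. The frame below is hypothetical, and the `~` operator for inverting a mask is an addition that the code above does not use. The point is that `loc[mask, cols]` selects rows by the boolean mask and columns by label in a single step.

```python
import pandas as pd

# Hypothetical toy frame with one extra column that the filter should leave out
df = pd.DataFrame(
    {
        "imdb_score": [8.5, 4.2, 6.0],
        "content_rating": ["PG-13", "R", "PG"],
        "title_year": [1995, 2005, 2001],
        "duration": [142, 98, 110],
    },
    index=["A", "B", "C"],
)

mask = (df["imdb_score"] > 8) | (df["imdb_score"] < 5)
cols = ["imdb_score", "content_rating"]

filtered = df.loc[mask, cols]   # rows where mask is True, only the listed columns
rest = df.loc[~mask, cols]      # ~ inverts the mask: the rows that were filtered out

print(filtered)
print(rest)
```

Looking at the complement `rest` alongside `filtered` is one simple way to confirm that no row was kept or dropped by mistake.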
Reference tutorial:
Source: "ITPUB Blog", link: http://blog.itpub.net/1747/viewspace-2824422/. If you repost this article, please credit the source; otherwise legal liability will be pursued.