Sharing Python Fundamentals

Posted by FlyOceanFish on 2018-03-12

Original URL: https://juejin.im/post/5aa5f3e151882555745955f7

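Introduction

I had skimmed the Python basics before, but my grasp never felt solid, so I went through them again carefully and wrote down the points I consider important. At the end of the article I share a PDF of the book 《Python爬蟲開發與專案實戰》; it is a high-resolution copy with a table of contents, for anyone who needs it.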

Tuples

The elements of a tuple cannot be modified or deleted.

Python expression	Result	Description
('Hi!',) * 4	('Hi!', 'Hi!', 'Hi!', 'Hi!')	Repetition
3 in (1, 2, 3)	True	Membership test

Any objects written without enclosing brackets and separated by commas are treated as a tuple by default. Example: x, y = 1, 2
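
The points above can be sketched as follows. This is a minimal illustration in Python 3 syntax (the article's own snippets use Python 2), and the variable names are made up for the example.

```python
# Tuple basics; all names here are illustrative, not from the article.
point = 1, 2                 # commas alone build a tuple (packing)
x, y = point                 # unpacking assigns each element to a name

repeated = ("Hi!",) * 4      # repetition builds a new tuple
has_three = 3 in (1, 2, 3)   # membership test

try:
    point[0] = 10            # tuples do not support item assignment
    modified = True
except TypeError:
    modified = False

print(point, x, y, repeated, has_three, modified)
```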

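Creating a single-element tuple

The trailing comma is required; without it the parentheses only group the expression and the result is not a tuple.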

fruits = ("apple",)  # named fruits rather than tuple, to avoid shadowing the built-in
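
A quick way to see the difference the comma makes, as a small Python 3 sketch with illustrative names:

```python
not_a_tuple = ("apple")    # parentheses only group; this is a plain str
one_item = ("apple",)      # the trailing comma makes a one-element tuple

print(type(not_a_tuple).__name__)
print(type(one_item).__name__, len(one_item))
```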

Swapping values with tuples

def test2():
    x = 2
    y = 3
    x, y = y, x  # the right side is packed into a tuple, then unpacked into x and y
    print x,y

Viewing the help documentation

help

help(list)

Dictionaries

d = {}
d["x"] = "value"

If the key "x" is not yet in the dictionary d, this adds a new entry; otherwise it updates the existing value.
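
As a small sketch of that add-or-update behaviour (Python 3 syntax; the name settings is hypothetical):

```python
settings = {}
settings["x"] = "value"        # key absent: a new entry is added
size_after_add = len(settings)

settings["x"] = "new value"    # key present: the existing entry is overwritten
size_after_update = len(settings)

print(settings, size_after_add, size_after_update)
```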

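The set() built-in function

set() creates an unordered collection of unique elements. It supports relationship tests, removes duplicates, and can compute intersections, differences, and unions.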

x = set(["1","2"])
y = set(["1","3","4"])
print x&y # intersection
print x|y # union
print x-y # difference
zip(x) # wraps each element in a 1-tuple, e.g. [('1',), ('2',)]; set order is arbitrary
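
For comparison, here is roughly the same idea in Python 3 syntax, using set literals and storing the results so they can be inspected. The variable names are illustrative.

```python
x = {"1", "2"}
y = {"1", "3", "4"}

intersection = x & y     # elements in both sets
union = x | y            # elements in either set
difference = x - y       # elements in x but not in y

# Removing duplicates: build a set, then sort for a stable order.
unique = sorted(set(["a", "b", "a"]))

print(intersection, union, difference, unique)
```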

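The zip() built-in function

zip() takes iterables as arguments, packs their corresponding elements into tuples, and returns a list of those tuples.

If the iterables have different lengths, the returned list is as long as the shortest one. With the * operator, zip can also reverse the operation and unzip a list of tuples.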

a = [1,2,3]
b = [4,5,6]
c = [4,5,6,7,8]
zipped = zip(a,b)     # pack into a list of tuples: [(1, 4), (2, 5), (3, 6)]
zip(a,c) # length matches the shortest list: [(1, 4), (2, 5), (3, 6)]
zip(*zipped)  # the inverse of zip (an "unzip"); returns the rows of a 2-D matrix:
              # [(1, 2, 3), (4, 5, 6)]
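
One caveat when trying this on Python 3: there zip() returns a lazy iterator rather than a list, so the results need to be wrapped in list() to be displayed or reused. A minimal sketch under that assumption:

```python
a = [1, 2, 3]
b = [4, 5, 6]
c = [4, 5, 6, 7, 8]

zipped = list(zip(a, b))        # materialize the iterator into a list of tuples
truncated = list(zip(a, c))     # stops at the end of the shorter input
unzipped = list(zip(*zipped))   # "unzip": regroup by position

print(zipped, truncated, unzipped)
```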

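Variable-length arguments

Prefixing a function parameter with "*" gives variable-length arguments: "*" collects multiple positional arguments into a tuple, and "**" collects keyword arguments into a dictionary.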

def search(*t,**d):
    keys = d.keys()
    for arg in t:
        for key in keys:
            if arg == key:
                print("find: %s" % d[key])

search("a","two",a="1",b="2") # call: t = ("a", "two"), d = {"a": "1", "b": "2"}
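
A variant of the same function, sketched in Python 3 syntax, returns the matches instead of printing them and also shows the operators used at the call site to unpack an existing tuple and dictionary. The parameter names names and table are my own choice.

```python
def search(*names, **table):
    """Return the values in table whose keys appear in names."""
    # names is a tuple of positional arguments; table is a dict of keyword arguments.
    return [table[name] for name in names if name in table]

found = search("a", "two", a="1", b="2")

# The same operators unpack a tuple and a dict when calling.
args = ("a", "b")
kwargs = {"a": "1", "b": "2"}
found_all = search(*args, **kwargs)

print(found, found_all)
```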

Converting between times and strings

To convert a time to a string, use the strftime() function from the time module.

import time

print time.strftime("%Y-%m-%d",time.localtime())

To convert a string to a time, use strptime() from the time module together with the datetime class from the datetime module.

import time
import datetime

t = time.strptime("2018-3-8", "%Y-%m-%d")
y, m, d = t[0:3]

print datetime.datetime(y,m,d)
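
An alternative that skips the intermediate struct_time is to call strptime() and strftime() directly on the datetime class. This is a different route from the one shown above, sketched in Python 3 syntax:

```python
from datetime import datetime

parsed = datetime.strptime("2018-3-8", "%Y-%m-%d")   # string -> datetime
formatted = parsed.strftime("%Y-%m-%d")              # datetime -> zero-padded string

print(parsed, formatted)
```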

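File and directory operations

For example, renaming, deleting, and searching for files.

os library: renaming files, listing all files under a path, and so on. The os.path module operates on paths, file names, and the like.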

import os

files = os.listdir(".")
print type(os.path)
for filename in files:
    print os.path.splitext(filename) # split the file name from its extension

shutil library: copying, moving, and similar file operations
glob library: glob.glob("*.txt") finds all files with the .txt suffix in the current directory
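
The three libraries can be tried together in a scratch directory. The sketch below uses Python 3 syntax; the file names are invented for the demo, and everything happens inside a temporary directory that is removed at the end.

```python
import glob
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()                     # scratch directory for the demo
source = os.path.join(workdir, "notes.txt")
with open(source, "w") as f:
    f.write("hello")

copied = os.path.join(workdir, "copy.txt")
shutil.copy(source, copied)                      # copy a file
renamed = os.path.join(workdir, "renamed.log")
os.rename(copied, renamed)                       # rename a file

txt_files = glob.glob(os.path.join(workdir, "*.txt"))   # match by suffix
names = sorted(os.listdir(workdir))                     # everything in the directory
stem, suffix = os.path.splitext("notes.txt")            # split name and extension

shutil.rmtree(workdir)                           # delete the directory tree
```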

Reading configuration files

Use the configparser library (3.x; named ConfigParser in 2.x) to read, modify, and add entries in configuration files.

import ConfigParser  # Python 2 module name; it is configparser in Python 3

config = ConfigParser.ConfigParser()
config.add_section("System")
config.set("System", "SystemName", "iOS")
f = open("Sys.ini", "a+")
config.write(f)
f.close()
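
The snippet above only writes. A fuller round trip (write, read back, modify, add) might look like the following Python 3 sketch. The section and option names are hypothetical, and the file lives in a temporary directory so the example leaves nothing behind.

```python
import configparser
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Sys.ini")

    config = configparser.ConfigParser()
    config.add_section("System")
    config.set("System", "name", "iOS")
    with open(path, "w") as f:            # "w" avoids appending duplicate sections on reruns
        config.write(f)

    loaded = configparser.ConfigParser()
    loaded.read(path)                     # read the file back
    sections = loaded.sections()
    system_name = loaded.get("System", "name")

    loaded.set("System", "name", "macOS") # modify an existing option
    loaded.add_section("Build")           # add a new section and option
    loaded.set("Build", "number", "42")   # option values must be strings
    with open(path, "w") as f:
        loaded.write(f)

    reread = configparser.ConfigParser()
    reread.read(path)
    final = {s: dict(reread.items(s)) for s in reread.sections()}
```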

Regular expressions

The re module provides regular-expression matching, searching, and related operations.
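
Since no example is given for re, here is a brief sketch of the most common calls (Python 3 syntax). The sample text and pattern are made up for illustration.

```python
import re

text = "Posted on 2018-03-12, updated on 2018-03-15"
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

first = date_pattern.search(text)           # first match anywhere in the string
all_dates = date_pattern.findall(text)      # every match, as tuples of groups
at_start = date_pattern.match(text)         # match() only tries the start of the string
masked = date_pattern.sub("<date>", text)   # replace every match

print(first.group(0), all_dates, at_start, masked)
```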

Classes

Attributes

Private attribute names are prefixed with "__".

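class Fruits:
    price = 0               # class attribute, shared by all instances; readable through the class or an instance, but should be modified through the class

    def __init__(self):
        self.color = "red"  # instance attribute, accessible only through an instance
        zone = "China"      # local variable
        self.__weight = "12" # private attribute; not directly accessible from outside, but reachable as _classname__attribute


if __name__ == "__main__":
    apple = Fruits()
    print (apple._Fruits__weight) # access the private attribute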
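
Two of the comments above are easier to believe once seen in action: why a class attribute should be changed through the class, and what "private" actually means. The following is a trimmed-down Python 3 sketch of the same Fruits class; banana and the result variables are illustrative additions.

```python
class Fruits:
    price = 0                    # class attribute

    def __init__(self):
        self.__weight = "12"     # stored under the mangled name _Fruits__weight

apple = Fruits()
banana = Fruits()

Fruits.price = 5                 # changing it through the class is seen by every instance
shared = (apple.price, banana.price)

apple.price = 9                  # assigning through an instance creates a shadowing instance attribute
after_shadow = (apple.price, banana.price, Fruits.price)

try:
    apple.__weight               # no name mangling outside the class body, so this lookup fails
    direct_access = True
except AttributeError:
    direct_access = False

mangled = apple._Fruits__weight  # the mangled name is still reachable
```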

Methods

Static method

    @staticmethod
    def getPrice():
        print (Fruits.price)

Private method

    def __getWeight(self):
        print self.__weight

Class method

    @classmethod
    def getPrice2(cls):
        print (cls.price)

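Adding methods dynamically

Python is a dynamic scripting language, so programs written in it can also be highly dynamic: a function can be attached to a class at runtime as a method.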

class_name.method_name = function_name
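
A concrete instance of that assignment, sketched in Python 3 syntax. The function describe and the price value are hypothetical.

```python
class Fruits:
    price = 3

def describe(self):
    # An ordinary function; once attached to the class it receives the instance as self.
    return "price is %d" % self.price

apple = Fruits()                  # created before the method exists
Fruits.describe = describe        # class_name.method_name = function_name
description = apple.describe()    # existing instances see the new method too

print(description)
```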

Class inheritance

Python classes support inheritance, including multiple inheritance.

Syntax:

class class_name(super_class1,super_class2):
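
When two base classes define the same method, the order in the class statement decides which one wins. A minimal Python 3 sketch with invented class names:

```python
class Fruit:
    def describe(self):
        return "fruit"

class Sweet:
    def describe(self):
        return "sweet"

    def taste(self):
        return "sugary"

class Apple(Fruit, Sweet):     # bases are searched left to right
    pass

apple = Apple()
described = apple.describe()   # found on Fruit first
tasted = apple.taste()         # only Sweet defines it
mro_names = [cls.__name__ for cls in Apple.__mro__]   # the lookup order
```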

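Abstract methods

from abc import ABCMeta, abstractmethod  # the enclosing class also needs ABCMeta as its metaclass

    @abstractmethod
    def grow(self):
        pass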
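
The fragment above needs an enclosing class to have any effect. A complete version might look like this in Python 3, where inheriting from abc.ABC is the usual shorthand for setting the metaclass; the class names are illustrative.

```python
from abc import ABC, abstractmethod

class Fruit(ABC):
    @abstractmethod
    def grow(self):
        """Subclasses must override this."""

class Apple(Fruit):
    def grow(self):
        return "apple tree"

try:
    Fruit()                    # a class with abstract methods cannot be instantiated
    instantiated = True
except TypeError:
    instantiated = False

result = Apple().grow()        # the concrete subclass works normally
```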

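Operator overloading

Python associates operators with a class's special methods; each operator corresponds to one method. For example, __add__() implements the + operator and __gt__() implements the > operator.

By overloading operators, we can add, subtract, or compare objects.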
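
As an illustration of the two methods just named, here is a small Python 3 sketch. The Weight class is hypothetical.

```python
class Weight:
    def __init__(self, grams):
        self.grams = grams

    def __add__(self, other):      # called for: self + other
        return Weight(self.grams + other.grams)

    def __gt__(self, other):       # called for: self > other
        return self.grams > other.grams

total = Weight(100) + Weight(50)
heavier = Weight(100) > Weight(50)

print(total.grams, heavier)
```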

Exceptions

Catching exceptions

try: ... except: ... finally: ...

Raising exceptions

Use the raise statement to raise an exception.

Assertions

assert len(t)==1
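
The three constructs can be combined in one short Python 3 sketch. The function parse_age and the sample inputs are invented for the example.

```python
def parse_age(text):
    value = int(text)                                   # may raise ValueError itself
    if value < 0:
        raise ValueError("age must be non-negative")    # raise our own exception
    return value

log = []
for raw in ["42", "-1", "abc"]:
    try:
        log.append(parse_age(raw))
    except ValueError:
        log.append("error")
    finally:
        log.append("done")       # runs whether or not an exception occurred

t = ("only",)
assert len(t) == 1               # passes silently

try:
    assert len(t) == 2, "expected two items"
    assertion_failed = False
except AssertionError:
    assertion_failed = True
```

Note that assert statements are skipped entirely when Python runs with the -O option, so they suit internal sanity checks rather than input validation.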

File persistence

`shelve`: a local key-value store

The shelve module provides a way to persist data locally.

import shelve

addresses = shelve.open("addresses") # created locally if it does not exist yet
addresses["city"] = "Beijing"
addresses["pro"] = "Guangdong"
addresses.close()
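
To check that the data really persists, the store can be closed and reopened. The sketch below is Python 3 (where a shelf can be used as a context manager) and writes into a temporary directory; the keys and values are illustrative.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "addresses")     # hypothetical store name

    with shelve.open(path) as db:             # created on first open
        db["city"] = "Beijing"
        db["tags"] = ["capital", "north"]     # values can be any picklable object

    with shelve.open(path) as db:             # reopen: the data is still there
        city = db["city"]
        tags = db["tags"]
        keys = sorted(db.keys())
```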

cPickle serialization

The cPickle and pickle modules both implement serialization; the former is written in C and is faster.

Serialization:

import cPickle as pickle
text = "I need to be serialized"  # named text rather than str, to avoid shadowing the built-in
f = open("serial.txt", "wb")
pickle.dump(text, f)
f.close()

Deserialization:

f = open("serial.txt","rb")
text = pickle.load(f)
f.close()
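
In Python 3 there is no separate cPickle module; the plain pickle module uses its C implementation automatically when available. The sketch below uses that module with the in-memory counterparts dumps() and loads(), which avoid the file altogether. The record contents are made up.

```python
import pickle

record = {"title": "Python basics", "tags": ["tuple", "dict"], "views": 12}

payload = pickle.dumps(record)      # object -> bytes
restored = pickle.loads(payload)    # bytes -> a new, equal object

print(type(payload).__name__, restored == record, restored is record)
```

Unpickling can execute arbitrary code, so pickle.load() and pickle.loads() should only be used on data from a trusted source.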

Storing JSON files

Python has a built-in json module for working with JSON data.

Serializing to a local file

import json
new_str = [{'a': 1}, {'b': 2}]
f = open('json.txt', 'w')
json.dump(new_str, f,ensure_ascii=False)
f.close()

Reading from the local file

import json
f = open('json.txt', 'r')
data = json.load(f)  # named data rather than str, to avoid shadowing the built-in
print data
f.close()
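
The ensure_ascii=False argument used when writing only matters once the data contains non-ASCII text. A short Python 3 sketch with the string-based dumps() and loads() makes the difference visible; the sample records are illustrative.

```python
import json

records = [{"a": 1}, {"city": "北京"}]

escaped = json.dumps(records)                        # non-ASCII is escaped by default
readable = json.dumps(records, ensure_ascii=False)   # keep the characters as they are
round_trip = json.loads(readable)                    # parse back into Python objects

print(escaped)
print(readable)
```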

Threads

The threading module

class threading.Thread(group=None, target=None, name=None, args=(), kwargs={}, *, daemon=None)

Threads and Queue

# -*- coding:UTF-8 -*-

import threading
import Queue

class MyJob(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self, name="aa")

    def run(self):
        print threading.currentThread()

        while not q.empty():
            a = q.get()
            print("my item %d"%a)
            print "my thread"
            q.task_done()


def job(a, b):
    print a+b
    print threading.activeCount()
    print "multithreading"


thread = threading.Thread(target=job, args=(2, 4), name="mythread")
q = Queue.Queue()
if __name__ == "__main__":
    myjob = MyJob()
    for i in range(100):
        q.put(i)
    myjob.start()
    thread.start() # also run job(2, 4) in its own thread
    q.join() # every finished task must call task_done(), otherwise the main thread blocks here
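
With more than one consumer thread, checking q.empty() before q.get() can race: another thread may take the last item in between. A common alternative, which is not what the code above does, is to have each worker block on get() and stop on a sentinel value. The sketch below shows that pattern in Python 3 (where the module is named queue); the worker function, the sentinel None, and the squaring task are all assumptions made for the example.

```python
import queue
import threading

tasks = queue.Queue()
results = []
results_lock = threading.Lock()

def worker():
    while True:
        item = tasks.get()            # blocks until an item is available
        if item is None:              # sentinel: no more work for this worker
            tasks.task_done()
            break
        with results_lock:
            results.append(item * item)
        tasks.task_done()             # exactly one call per get()

threads = [threading.Thread(target=worker, name="worker-%d" % i) for i in range(3)]
for t in threads:
    t.start()
for n in range(10):
    tasks.put(n)
for _ in threads:
    tasks.put(None)                   # one sentinel per worker

tasks.join()                          # returns once every item has been marked done
for t in threads:
    t.join()
```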

Processes

In multiprocessing, Process creates a process, and a Pool (process pool) can be used to manage processes.

from multiprocessing import Process
import os

def run_pro(name):
    print 'process %s(%s)' % (os.getpid(),name)

if __name__ == "__main__":
    print 'parent process %s' % os.getpid()
    for i in range(5):
        p = Process(target=run_pro, args=(str(i),)) # args must be a tuple, hence the trailing comma
        p.start()

Web crawlers

Fetching data

urllib2/urllib: built into Python, commonly used, and sufficient for writing a crawler

import urllib2
response = urllib2.urlopen('http://www.baidu.com')
html = response.read()
print html

try:
    request = urllib2.Request('http://www.google.com')
    response = urllib2.urlopen(request,timeout=5)
    html = response.read()
    print html
except urllib2.URLError as e:
    if hasattr(e, 'code'):
        print 'error code:',e.code
    print e

Requests: a more user-friendly third-party library

import requests
r = requests.get('http://www.baidu.com')
print r.content
print r.url
print r.headers

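Parsing the fetched data

BeautifulSoup parses HTML data. It can use the Python standard library parser (html.parser), whose tolerance of malformed markup is relatively poor, so the third-party lxml parser is generally used instead for its better performance and error tolerance.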
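
BeautifulSoup and lxml are third-party packages, so to stay self-contained the sketch below uses only the standard library's html.parser directly (Python 3). It is not BeautifulSoup's API, just an illustration of what extracting links from a page involves. The LinkCollector class and the sample page are hypothetical.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

page = '<html><body><a href="/a">A</a><p>text</p><a href="http://example.com/b">B</a></body></html>'
collector = LinkCollector()
collector.feed(page)
links = collector.links

print(links)
```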

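Hash algorithm library

Introduction to hashlib

hashlib is a Python standard library module that provides several popular hash algorithms, including md5, sha1, sha224, sha256, sha384, and sha512. In addition, the module defines new(name, string=''), which constructs the corresponding hash object for any hash algorithm the system supports, selected by name.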
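
A minimal usage sketch in Python 3 syntax, where the input must be bytes. The sample data is arbitrary.

```python
import hashlib

data = b"hello"

md5_hex = hashlib.md5(data).hexdigest()     # one-shot: construct with the data

sha256 = hashlib.sha256()
sha256.update(b"hel")                       # data can also be fed in chunks
sha256.update(b"lo")
sha256_hex = sha256.hexdigest()

by_name = hashlib.new("sha256", data).hexdigest()   # choose the algorithm by name

print(md5_hex, sha256_hex == by_name)
```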

Recommended resources

PDF of the book 《Python爬蟲開發與專案實戰》

Link: Python爬蟲開發與專案實戰 Password: g19d

My blog

FlyOceanFish
