7. File Operations in Python
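1. The basic workflow of file operations
A computer system has three layers: hardware, the operating system, and applications.
An application written in Python (or any other language) that wants to keep data permanently has to store it on disk, which means the application needs to operate on hardware. Applications cannot operate hardware directly; that is the operating system's job. The operating system wraps the complex hardware operations in simple interfaces for users and applications, and a file is the virtual concept the operating system provides for working with the disk. By operating on files, a user or an application can persist its data.
With files, we no longer need to think about the details of driving the disk; we only need to follow the file workflow: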
# 1. Open the file, get a file handle, and assign it to a variable
f = open('a.txt', 'r', encoding='utf-8')  # 'r' is the default mode
# 2. Operate on the file through the handle
data = f.read()
# 3. Close the file
f.close()
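Notes on closing a file:
An open file involves two resources: the file opened at the operating-system level, and the variable in the application. When we are done with a file, both must be reclaimed, in this order:
1. f.close()  # release the file opened at the operating-system level
2. del f  # release the application-level variable
del f must come after f.close(); otherwise the name is gone while the file opened by the operating system may still be open, holding resources for nothing.
Python's automatic garbage collection means we do not need to worry about del f, so the rule to remember is: once you are done with a file, always call f.close().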
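One way to make sure f.close() runs even when an operation in between raises an exception is try/finally. The sketch below is only an illustration: the scratch path built with tempfile is an assumption so that the example does not touch a real a.txt.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'a.txt')

f = open(path, 'w', encoding='utf-8')
try:
    f.write('hello\n')   # any operation here might raise
finally:
    f.close()            # runs whether or not the write succeeded

closed_after = f.closed  # the handle reports that it is closed

with open(path, 'r', encoding='utf-8') as check:
    content = check.read()
```

The finally clause is what guarantees the operating-system resource is released; the variable f itself is left to the garbage collector.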
Even so, f.close() is easy to forget. The foolproof approach is to let the with keyword manage the context for us:
with open('a.txt', 'w') as f:
    pass
with open('a.txt', 'r') as read_f, open('b.txt', 'w') as write_f:
    data = read_f.read()
    write_f.write(data)
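To see what with does for us, the following sketch checks the closed attribute after the block ends, once on the normal path and once after an exception. The scratch path and the simulated ValueError are assumptions made for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'a.txt')

with open(path, 'w', encoding='utf-8') as f:
    f.write('line 1\n')
closed_normally = f.closed        # the file was closed when the block ended

try:
    with open(path, 'r', encoding='utf-8') as g:
        raise ValueError('simulated failure while reading')
except ValueError:
    pass
closed_after_error = g.closed     # still closed, even though the block raised
```

In both cases the file is closed without an explicit f.close() call.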
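2. File encoding
f = open(…) asks the operating system to open the file. If we do not pass an encoding to open, text is decoded with the platform's preferred encoding (the value returned by locale.getpreferredencoding(False)), so the default depends on the system: typically gbk on Chinese-language Windows and utf-8 on Linux.
# To avoid garbled text, open the file with the same encoding it was saved with.
f = open('a.txt', 'r', encoding='utf-8')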
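The effect of a mismatched encoding can be shown with a short sketch. It writes two Chinese characters as utf-8 and then decodes the same bytes as gbk; the scratch path and the sample text are assumptions chosen for the demonstration.

```python
import locale
import os
import tempfile

# Hypothetical scratch file and sample text.
path = os.path.join(tempfile.mkdtemp(), 'a.txt')
text = '\u4f60\u597d'  # the two characters of "ni hao"

with open(path, 'w', encoding='utf-8') as f:
    f.write(text)

# Same encoding for reading and writing: the text comes back intact.
with open(path, 'r', encoding='utf-8') as f:
    same = f.read()

# Decoding the utf-8 bytes as gbk gives garbled text or a decode error.
with open(path, 'rb') as f:
    raw = f.read()
try:
    wrong = raw.decode('gbk')
except UnicodeDecodeError:
    wrong = None

# What open() would use on this machine when no encoding is given.
default_encoding = locale.getpreferredencoding(False)
```

Passing encoding explicitly, as in the line above, removes the dependence on default_encoding.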
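3. File open modes
file_handle = open('file path', 'mode')
# 1. Text modes (text is the default):
r, read-only [the default mode; the file must exist, otherwise an exception is raised]
w, write-only [not readable; creates the file if it does not exist; empties it if it does]
a, append-only [not readable; creates the file if it does not exist; otherwise only appends to the end]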
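# 2. For non-text files we can only use b mode. "b" means operating on bytes (every file is stored as bytes anyway, so in this mode there is no need to think about a text file's character encoding, an image file's jpg format, or a video file's avi format):
rb
wb
ab
Note: in b mode, reads return bytes and writes require bytes; an encoding cannot be specified.
# 3. "+" modes (each adds the missing capability to a mode):
r+, read and write [readable, writable]
w+, write and read [writable, readable]
a+, append and read [writable, readable]
# 4. Read/write, write/read, and append/read on bytes:
r+b, read and write [readable, writable]
w+b, write and read [writable, readable]
a+b, append and read [writable, readable]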
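The differences between the modes listed above are easiest to see by running them against one file. This is a rough sketch; the scratch directory and the file names modes.txt and missing.txt are assumptions.

```python
import io
import os
import tempfile

# Hypothetical scratch directory and file names.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'modes.txt')

with open(path, 'w', encoding='utf-8') as f:   # w creates the file
    f.write('first\n')
with open(path, 'w', encoding='utf-8') as f:   # w empties the existing file
    f.write('second\n')
with open(path, 'a', encoding='utf-8') as f:   # a only appends
    f.write('third\n')
with open(path, 'r', encoding='utf-8') as f:
    text = f.read()

# a (like w) is write-only: reading raises an error.
with open(path, 'a', encoding='utf-8') as f:
    try:
        f.read()
        read_failed = False
    except io.UnsupportedOperation:
        read_failed = True

# r requires the file to exist.
try:
    open(os.path.join(folder, 'missing.txt'), 'r')
    missing_raises = False
except FileNotFoundError:
    missing_raises = True

# r+ can read and write; writing starts at the current position (0 here).
with open(path, 'r+', encoding='utf-8') as f:
    f.write('SECOND')
    f.seek(0)
    after = f.read()

# b mode returns bytes and takes no encoding.
with open(path, 'rb') as f:
    raw = f.read()
```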
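4. File methods
4.1 Common methods
read(3):
1. When the file is opened in text mode, it reads 3 characters.
2. When the file is opened in b mode, it reads 3 bytes.
All other cursor movement inside a file is measured in bytes, for example seek, tell, and truncate.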
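A small sketch of the difference between characters and bytes in read(3). The sample text (three Chinese characters followed by abc) and the scratch path are assumptions; each of those characters takes 3 bytes in utf-8.

```python
import os
import tempfile

# Hypothetical scratch file and sample text.
path = os.path.join(tempfile.mkdtemp(), 'a.txt')
sample = '\u4f60\u597d\u5417abc'  # three 3-byte characters, then 'abc'

with open(path, 'w', encoding='utf-8') as f:
    f.write(sample)

# Text mode: read(3) returns 3 characters (9 bytes on disk).
with open(path, 'r', encoding='utf-8') as f:
    chars = f.read(3)

# b mode: read(3) returns 3 bytes, which is only the first character.
with open(path, 'rb') as f:
    data = f.read(3)
    position = f.tell()   # the cursor has moved 3 bytes
```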
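Note:
1. seek has three modes, 0, 1, and 2 (relative to the start of the file, the current position, and the end of the file). Modes 1 and 2 with a nonzero offset only work in b mode; in every mode the movement is counted in bytes.
2. truncate cuts the file off, so the file must be opened in a writable mode. Do not open it with w or w+, though, because those modes empty the file as soon as it is opened; try truncate with r+, a, or a+ instead.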
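Both notes can be checked with a short sketch. The scratch path and the ten-digit sample content are assumptions made for the illustration.

```python
import io
import os
import tempfile

# Hypothetical scratch file holding ten ASCII digits.
path = os.path.join(tempfile.mkdtemp(), 'a.txt')
with open(path, 'wb') as f:
    f.write(b'0123456789')

with open(path, 'rb') as f:
    f.seek(3)             # mode 0: 3 bytes from the start
    f.seek(2, 1)          # mode 1: 2 bytes forward from the current position
    middle = f.read(1)
    f.seek(-2, 2)         # mode 2: 2 bytes before the end
    tail = f.read()

# In text mode a nonzero offset with mode 1 is rejected.
with open(path, 'r', encoding='utf-8') as f:
    try:
        f.seek(2, 1)
        relative_seek_ok = True
    except io.UnsupportedOperation:
        relative_seek_ok = False

# truncate needs a writable mode that does not empty the file on open.
with open(path, 'r+b') as f:
    f.truncate(4)         # keep only the first 4 bytes
with open(path, 'rb') as f:
    truncated = f.read()
```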
4.2 All methods (for reference)
Python 2.x:
class file(object):
    def close(self):  # real signature unknown; restored from __doc__
        # Close the file.
        """
        close() -> None or (perhaps) an integer.  Close the file.
        Sets data attribute .closed to True.  A closed file cannot be used for
        further I/O operations.  close() may be called more than once without
        error.  Some kinds of file objects (for example, opened by popen())
        may return an exit status upon closing.
        """
    def fileno(self):  # real signature unknown; restored from __doc__
        # Return the file descriptor.
        """
        fileno() -> integer "file descriptor".
        This is needed for lower-level file interfaces, such os.read().
        """
        return 0
    def flush(self):  # real signature unknown; restored from __doc__
        # Flush the file's internal buffer.
        """ flush() -> None.  Flush the internal I/O buffer. """
        pass
    def isatty(self):  # real signature unknown; restored from __doc__
        # Check whether the file is connected to a tty device.
        """ isatty() -> true or false.  True if the file is connected to a tty device. """
        return False
    def next(self):  # real signature unknown; restored from __doc__
        # Return the next line; raise an error if there is none.
        """ x.next() -> the next value, or raise StopIteration """
        pass
    def read(self, size=None):  # real signature unknown; restored from __doc__
        # Read the specified number of bytes.
        """
        read([size]) -> read at most size bytes, returned as a string.
        If the size argument is negative or omitted, read until EOF is reached.
        Notice that when in non-blocking mode, less data than what was requested
        may be returned, even if no size parameter was given.
        """
        pass
    def readinto(self):  # real signature unknown; restored from __doc__
        # Read into a buffer. Do not use; it may be removed.
        """ readinto() -> Undocumented.  Don't use this; it may go away. """
        pass
    def readline(self, size=None):  # real signature unknown; restored from __doc__
        # Read a single line.
        """
        readline([size]) -> next line from the file, as a string.
        Retain newline.  A non-negative size argument limits the maximum
        number of bytes to return (an incomplete line may be returned then).
        Return an empty string at EOF.
        """
        pass
    def readlines(self, size=None):  # real signature unknown; restored from __doc__
        # Read all data and return it as a list split on line breaks.
        """
        readlines([size]) -> list of strings, each a line from the file.
        Call readline() repeatedly and return a list of the lines so read.
        The optional size argument, if given, is an approximate bound on the
        total number of bytes in the lines returned.
        """
        return []
    def seek(self, offset, whence=None):  # real signature unknown; restored from __doc__
        # Set the position of the file pointer.
        """
        seek(offset[, whence]) -> None.  Move to new file position.
        Argument offset is a byte count.  Optional argument whence defaults to
        0 (offset from start of file, offset should be >= 0); other values are 1
        (move relative to current position, positive or negative), and 2 (move
        relative to end of file, usually negative, although many platforms allow
        seeking beyond the end of a file).  If the file is opened in text mode,
        only offsets returned by tell() are legal.  Use of other offsets causes
        undefined behavior.
        Note that not all file objects are seekable.
        """
        pass
    def tell(self):  # real signature unknown; restored from __doc__
        # Return the current position of the file pointer.
        """ tell() -> current file position, an integer (may be a long integer). """
        pass
    def truncate(self, size=None):  # real signature unknown; restored from __doc__
        # Truncate the file, keeping only the data before the given position.
        """
        truncate([size]) -> None.  Truncate the file to at most size bytes.
        Size defaults to the current file position, as returned by tell().
        """
        pass
    def write(self, p_str):  # real signature unknown; restored from __doc__
        # Write content.
        """
        write(str) -> None.  Write string str to file.
        Note that due to buffering, flush() or close() may be needed before
        the file on disk reflects the data written.
        """
        pass
    def writelines(self, sequence_of_strings):  # real signature unknown; restored from __doc__
        # Write a list of strings to the file.
        """
        writelines(sequence_of_strings) -> None.  Write the strings to the file.
        Note that newlines are not added.  The sequence can be any iterable object
        producing strings. This is equivalent to calling write() for each string.
        """
        pass
    def xreadlines(self):  # real signature unknown; restored from __doc__
        # Read the file line by line instead of all at once.
        """
        xreadlines() -> returns self.
        For backward compatibility. File objects now include the performance
        optimizations previously implemented in the xreadlines module.
        """
        pass
Python 3.x:
class TextIOWrapper(_TextIOBase):
    r"""
    Character and line based layer over a BufferedIOBase object, buffer.
    encoding gives the name of the encoding that the stream will be
    decoded or encoded with. It defaults to locale.getpreferredencoding(False).
    errors determines the strictness of encoding and decoding (see
    help(codecs.Codec) or the documentation for codecs.register) and
    defaults to "strict".
    newline controls how line endings are handled. It can be None, '',
    '\n', '\r', and '\r\n'.  It works as follows:
    * On input, if newline is None, universal newlines mode is
      enabled. Lines in the input can end in '\n', '\r', or '\r\n', and
      these are translated into '\n' before being returned to the
      caller. If it is '', universal newline mode is enabled, but line
      endings are returned to the caller untranslated. If it has any of
      the other legal values, input lines are only terminated by the given
      string, and the line ending is returned to the caller untranslated.
    * On output, if newline is None, any '\n' characters written are
      translated to the system default line separator, os.linesep. If
      newline is '' or '\n', no translation takes place. If newline is any
      of the other legal values, any '\n' characters written are translated
      to the given string.
    If line_buffering is True, a call to flush is implied when a call to
    write contains a newline character.
    """
    def close(self, *args, **kwargs):  # real signature unknown
        # Close the file.
        pass
    def fileno(self, *args, **kwargs):  # real signature unknown
        # Return the file descriptor.
        pass
    def flush(self, *args, **kwargs):  # real signature unknown
        # Flush the file's internal buffer.
        pass
    def isatty(self, *args, **kwargs):  # real signature unknown
        # Check whether the file is connected to a tty device.
        pass
    def read(self, *args, **kwargs):  # real signature unknown
        # Read up to the specified number of characters.
        pass
    def readable(self, *args, **kwargs):  # real signature unknown
        # Whether the file is readable.
        pass
    def readline(self, *args, **kwargs):  # real signature unknown
        # Read a single line.
        pass
    def seek(self, *args, **kwargs):  # real signature unknown
        # Set the position of the file pointer.
        pass
    def seekable(self, *args, **kwargs):  # real signature unknown
        # Whether the file pointer can be moved.
        pass
    def tell(self, *args, **kwargs):  # real signature unknown
        # Return the position of the file pointer.
        pass
    def truncate(self, *args, **kwargs):  # real signature unknown
        # Truncate the file, keeping only the data before the given position.
        pass
    def writable(self, *args, **kwargs):  # real signature unknown
        # Whether the file is writable.
        pass
    def write(self, *args, **kwargs):  # real signature unknown
        # Write content.
        pass
    def __getstate__(self, *args, **kwargs):  # real signature unknown
        pass
    def __init__(self, *args, **kwargs):  # real signature unknown
        pass
    @staticmethod  # known case of __new__
    def __new__(*args, **kwargs):  # real signature unknown
        """ Create and return a new object.  See help(type) for accurate signature. """
        pass
    def __next__(self, *args, **kwargs):  # real signature unknown
        """ Implement next(self). """
        pass
    def __repr__(self, *args, **kwargs):  # real signature unknown
        """ Return repr(self). """
        pass
    buffer = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
    closed = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
    encoding = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
    errors = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
    line_buffering = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
    name = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
    newlines = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
    _CHUNK_SIZE = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
    _finalizing = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
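A few of the 3.x methods and attributes listed above can be tried on a real handle. This is only a sketch under assumptions: the scratch path and the sample text with mixed line endings are made up for the demonstration.

```python
import io
import os
import tempfile

# Hypothetical scratch file.
path = os.path.join(tempfile.mkdtemp(), 'a.txt')

# newline='' switches off line-ending translation on output.
with open(path, 'w', encoding='utf-8', newline='') as f:
    is_text_wrapper = isinstance(f, io.TextIOWrapper)
    flags = (f.readable(), f.writable(), f.seekable())
    encoding_name = f.encoding
    f.write('one\r\ntwo\n')
closed = f.closed

# Default newline=None: universal newlines, '\r\n' is translated to '\n'.
with open(path, 'r', encoding='utf-8') as f:
    translated = f.read()

# newline='': line endings are returned untranslated.
with open(path, 'r', encoding='utf-8', newline='') as f:
    untranslated = f.read()
```

In text mode, open returns a TextIOWrapper, which is why the methods in the listing are available on f.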
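5. Modifying files
File data lives on disk, where existing content can only be overwritten, not edited in place. What we see as "modifying a file" is a simulated effect, and there are two ways to achieve it:
Approach one: load the whole file from disk into memory, modify it there, and then write the result from memory back over the disk copy (this is what editors such as Word, vim, and Notepad++ do).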
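import os  # operating-system interfaces
with open('a.txt') as read_f, open('.a.txt.swap', 'w') as write_f:
    data = read_f.read()  # read everything into memory; slow if the file is large
    data = data.replace('alex', 'SB')  # make the change in memory
    write_f.write(data)  # write the new file in one go
os.remove('a.txt')  # delete the original file
os.rename('.a.txt.swap', 'a.txt')  # rename the new file to the original name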
Approach two: read the file from disk into memory one line at a time, write each modified line to a new file, and finally replace the original file with the new one.
import os
with open('a.txt') as read_f, open('.a.txt.swap', 'w') as write_f:
    for line in read_f:
        line = line.replace('alex', 'SB')
        write_f.write(line)
os.remove('a.txt')
os.rename('.a.txt.swap', 'a.txt')
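The line-by-line approach can be wrapped in a reusable function. The sketch below is a variation, not the code above: replace_in_file and the '.swap' suffix are hypothetical names, and it swaps in os.replace for the os.remove plus os.rename pair, since os.replace overwrites the destination in one call.

```python
import os
import tempfile

def replace_in_file(path, old, new, encoding='utf-8'):
    """Hypothetical helper: rewrite path line by line, replacing old with new."""
    swap_path = path + '.swap'   # assumed name for the temporary file
    with open(path, encoding=encoding) as read_f, \
            open(swap_path, 'w', encoding=encoding) as write_f:
        for line in read_f:
            write_f.write(line.replace(old, new))
    os.replace(swap_path, path)  # overwrite the original with the new file

# Usage on a scratch file with made-up content.
demo = os.path.join(tempfile.mkdtemp(), 'a.txt')
with open(demo, 'w', encoding='utf-8') as f:
    f.write('alex says hi\nnobody here\nbye alex\n')

replace_in_file(demo, 'alex', 'SB')

with open(demo, encoding='utf-8') as f:
    result = f.read()
```

Only one line is held in memory at a time, so this form also suits large files.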
6. Exercises
1. The file a.txt holds one product per line: name, price, and quantity.
apple 10 3
tesla 100000 1
mac 3000 2
lenovo 30000 3
chicken 10 3
Write code that builds this data structure: [{'name': 'apple', 'price': 10, 'amount': 3}, {'name': 'tesla', 'price': 100000, 'amount': 1}, ……] and computes the total price.
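One possible solution sketch for exercise 1 follows. It assumes the fields are separated by whitespace and that the total price means the sum of price times amount; it first writes the sample data to a scratch file so that it can run on its own.

```python
import os
import tempfile

# Hypothetical scratch copy of a.txt holding the sample data.
path = os.path.join(tempfile.mkdtemp(), 'a.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('apple 10 3\ntesla 100000 1\nmac 3000 2\nlenovo 30000 3\nchicken 10 3\n')

goods = []
with open(path, encoding='utf-8') as f:
    for line in f:
        fields = line.split()
        if len(fields) != 3:      # skip blank or malformed lines
            continue
        name, price, amount = fields
        goods.append({'name': name, 'price': int(price), 'amount': int(amount)})

# Assumed meaning of "total price": sum of price * amount over all products.
total = sum(item['price'] * item['amount'] for item in goods)
print(goods)
print(total)
```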
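2. Given the following file:
-------
alex has a bump on his head.
alex is actually not what he appears to be.
Who says alex is sb?
You are all hilarious: however impressive alex is, he cannot hide his veteran-loser vibe.
-------
Replace every alex in the file with an uppercase SB.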