[PY3] Built-in Data Structures (3): Strings and Common String Operations

Posted by Jelly_lyj on 2017-03-18

[Figure: XMind mind map of strings and their common operations]

Defining strings

1. Single quotes / double quotes

In [1]: s1='hello world'
In [2]: s1="hello world"

2. Triple single quotes / triple double quotes

In [8]: s1='''hello
   ...: world''';  print(s1)    # triple quotes let a string span multiple lines
hello
world

In [9]: s1="""hello world"""

 

Escaping

In [24]: s='I\'m jelly';print(s)
I'm jelly

In [25]: s='file path:c:\\windows\\';print(s)
file path:c:\windows\

# Inside triple quotes, single and double quotes can be used freely without escaping them
In [20]: sql='''select * from table where name='jelly' ''';print(sql)
select * from table where name='jelly'

In [21]: sql='''select * from table where name="jelly" ''';print(sql)
select * from table where name="jelly"


# r'...' is a raw string: backslashes are kept as-is and not treated as escapes
In [27]: path=r'c:\windows\system32\???';print(path)
c:\windows\system32\???
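
As a quick check of the spellings above, here is a minimal sketch that builds the same Windows path with escaped backslashes and with a raw string and compares the two. The variable names are illustrative only.

```python
# The same path written with escaped backslashes and as a raw string.
escaped = 'c:\\windows\\system32'
raw = r'c:\windows\system32'
print(escaped == raw)      # True: both spellings give the same string

# A raw string cannot end with a single backslash, so append it separately.
folder = r'c:\windows' + '\\'
print(folder)              # c:\windows\

# Inside triple quotes, single and double quotes need no escaping.
quoted = '''He said "I'm jelly"'''
print(quoted)              # He said "I'm jelly"
```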

 

Accessing strings; strings are immutable

In [1]: print(s)
I'm jelly

In [2]: print(s[0])
I

In [3]: print(s[3])    # index 3 is the space, so the output line looks empty

In [4]: type(s[3])
Out[4]: str

In [5]: s[3]='B'
TypeError: 'str' object does not support item assignment
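
Because item assignment is not allowed, "changing" one character really means building a new string. A minimal sketch using slicing, with the same sample string as above:

```python
s = "I'm jelly"

# Take everything before index 3, the new character, and everything after index 3.
changed = s[:3] + 'B' + s[4:]
print(changed)   # I'mBjelly
print(s)         # I'm jelly  (the original string is untouched)
```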

 

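Common string operations

###Concatenation###

1. join()

# join() is a str method: its argument is an iterable whose items are all strings, and the string it is called on is the separator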

In [7]: s=str()
In [8]: help(s.join)
join(...) method of builtins.str instance
    S.join(iterable) -> str
    Return a string which is the concatenation of the strings in the
    iterable.  The separator between elements is S.

In [1]: lst=['I','\'m','jelly']

In [2]: print(lst)
['I', "'m", 'jelly']

In [3]: ' '.join(lst)
Out[3]: "I 'm jelly"

In [4]: '/'.join(lst)
Out[4]: "I/'m/jelly"

In [5]: lst   # the items of this lst are ints, not strings, so join() raises TypeError
Out[5]: [1, 2, 3, 4, 5, 6, 7, 8]

In [6]: '.'.join(lst)   
TypeError: sequence item 0: expected str instance, int found
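
One common way around this TypeError is to convert every item to a string before joining. A minimal sketch, reusing the list of ints from the session above:

```python
lst = [1, 2, 3, 4, 5, 6, 7, 8]

# join() only accepts strings, so convert each element first.
dotted = '.'.join(str(n) for n in lst)
print(dotted)   # 1.2.3.4.5.6.7.8
```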

###Splitting###

1. split()

In [13]: help(s.split)
split(...) method of builtins.str instance
    S.split(sep=None, maxsplit=-1) -> list of strings
    Return a list of the words in S, using sep as the
    delimiter string.  If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified or is None, any
    whitespace string is a separator and empty strings are
    removed from the result.

# By default split() splits on whitespace, and a run of several whitespace characters is treated as a single separator
In [17]: s1="I  love  python";s1.split()
Out[17]: ['I', 'love', 'python']

# Note that if a space is passed explicitly as the separator, every single space is split on
In [18]: s1="I  love  python";s1.split(' ')
Out[18]: ['I', '', 'love', '', 'python']

# The sep parameter specifies the separator, which can be any non-empty string
In [21]: s2="I love python";s2.split('o')
Out[21]: ['I l', 've pyth', 'n']

# split() works from left to right; maxsplit is the maximum number of splits, and the default of -1 means split at every separator
In [25]: s3="i i i i i i"; s3.split(maxsplit=-1)
Out[25]: ['i', 'i', 'i', 'i', 'i', 'i']

In [26]: s3="i i i i i i"; s3.split(maxsplit=1)
Out[26]: ['i', 'i i i i i']

In [27]: s3="i i i i i i"; s3.split(maxsplit=2)
Out[27]: ['i', 'i', 'i i i i']

# Edge cases: a string that does not contain the separator, and a string that consists only of the separator
In [11]: ''.split('=')
Out[11]: ['']

In [12]: '='.split('=')
Out[12]: ['', '']
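
To illustrate where maxsplit is useful, here is a small sketch that peels the leading fields off a log line while leaving the free-text message intact. The log line and field names are made up for the example.

```python
record = '2017-03-18 12:00:01 ERROR disk is full'   # made-up log line

# Split off the first three fields; the remainder stays in one piece.
date, time, level, message = record.split(maxsplit=3)
print(level)     # ERROR
print(message)   # disk is full

# Without a separator argument, leading, trailing and repeated whitespace is ignored.
words = '  I  love   python '.split()
print(words)     # ['I', 'love', 'python']
```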

2. rsplit()

In [14]: help(s.rsplit)
rsplit(...) method of builtins.str instance
    S.rsplit(sep=None, maxsplit=-1) -> list of strings
    Return a list of the words in S, using sep as the
    delimiter string, starting at the end of the string and
    working to the front.  If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified, any whitespace string
    is a separator.
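# rsplit() works in the opposite direction to split(), from right to left; when maxsplit is not involved, its result is exactly the same as that of split()
# split() is reported to be slightly more efficient than rsplit(), so prefer it when the direction does not matter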
In [29]: s1="I  love python";s1.rsplit()
Out[29]: ['I', 'love', 'python']

In [30]: s1="I  love python";s1.rsplit(' ')
Out[30]: ['I', '', 'love', 'python']

In [31]: s2="I love python";s2.rsplit('o')
Out[31]: ['I l', 've pyth', 'n']

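# With maxsplit, rsplit() counts its splits from the right, so the unsplit remainder ends up on the left (split() leaves it on the right)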
In [32]: s3="i i i i i i";s3.rsplit(maxsplit=-1)
Out[32]: ['i', 'i', 'i', 'i', 'i', 'i']

In [33]: s3="i i i i i i";s3.rsplit(maxsplit=1)
Out[33]: ['i i i i i', 'i']

In [34]: s3="i i i i i i";s3.rsplit(maxsplit=2)
Out[34]: ['i i i i', 'i', 'i']
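
A typical use of rsplit() with maxsplit=1 is cutting a string at its last separator. The sketch below contrasts it with split() on a made-up file name.

```python
filename = 'backup.2017-03-18.tar.gz'   # made-up file name

# rsplit with maxsplit=1 cuts at the last dot only.
stem, ext = filename.rsplit('.', 1)
print(stem)   # backup.2017-03-18.tar
print(ext)    # gz

# split with maxsplit=1 cuts at the first dot instead.
head, rest = filename.split('.', 1)
print(head)   # backup
print(rest)   # 2017-03-18.tar.gz
```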

3. splitlines()

In [15]: help(s.splitlines)
splitlines(...) method of builtins.str instance
    S.splitlines([keepends]) -> list of strings
    Return a list of the lines in S, breaking at line boundaries.
    Line breaks are not included in the resulting list unless keepends
    is given and true.

In [1]: s="""first line
   ...: second line"""
# splitlines() splits at line boundaries; by default the results do not include the line breaks
In [2]: s.splitlines()
Out[2]: ['first line', 'second line']

# Passing True keeps the line breaks in the results
In [3]: s.splitlines(True)
Out[3]: ['first line\n', 'second line']
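
It can help to compare splitlines() with split('\n') on text that mixes line endings and ends with a newline. A minimal sketch with made-up text:

```python
text = 'first line\nsecond line\r\nthird line\n'

# splitlines() recognises both \n and \r\n and adds no trailing empty item.
by_lines = text.splitlines()
print(by_lines)     # ['first line', 'second line', 'third line']

# split('\n') keeps the stray '\r' and yields an empty string after the final newline.
by_newline = text.split('\n')
print(by_newline)   # ['first line', 'second line\r', 'third line', '']
```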

4. partition()/rpartition()

In [4]: help(s.partition)
partition(...) method of builtins.str instance
    S.partition(sep) -> (head, sep, tail)
    Search for the separator sep in S, and return the part before it,
    the separator itself, and the part after it.  If the separator is not
    found, return S and two empty strings.

# Splits the string once at the separator sep; the result is always a 3-tuple (head, sep, tail)

In [5]: s="first/second/third";s.partition('/')
Out[5]: ('first', '/', 'second/third')

In [6]: s="first/second/third";s.rpartition('/')
Out[6]: ('first/second', '/', 'third')

# partition() is often used to split lines of a configuration file
In [8]: cfg='env=PATH=/usr/bin:$PATH'; cfg.partition('=')
Out[8]: ('env', '=', 'PATH=/usr/bin:$PATH')

# Edge cases: a string that does not contain the separator, and a string that consists only of the separator
In [9]: ''.partition('=')
Out[9]: ('', '', '')

In [10]: '='.partition('=')
Out[10]: ('', '=', '')         # either way, the result is always a 3-tuple
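
Since partition() always returns three items, the middle one can be used to tell whether the separator was present. Here is a sketch of a key=value parser built on that idea; parse_line is a hypothetical helper, not part of any library.

```python
def parse_line(line):
    """Split 'key=value' into (key, value); value is None if '=' is absent."""
    key, sep, value = line.partition('=')
    if not sep:   # separator not found: partition returned (line, '', '')
        return key.strip(), None
    return key.strip(), value.strip()

print(parse_line('env = PATH=/usr/bin:$PATH'))   # ('env', 'PATH=/usr/bin:$PATH')
print(parse_line('verbose'))                     # ('verbose', None)
```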

###Case conversion and formatting###

1. Case conversion

# upper() converts to upper case
In [1]: s='test';s.upper()
Out[1]: 'TEST'

# lower() converts to lower case
In [3]: s='Test';s.lower()
Out[3]: 'test'

# title() capitalizes the first letter of each word
In [6]: s='i love python';s.title()
Out[6]: 'I Love Python'

# capitalize() uppercases only the first character of the string and lowercases the rest
In [7]: s='i love python';s.capitalize()
Out[7]: 'I love python'

# casefold() is typically used for case-insensitive comparison
In [24]: s='Test TTest';s.casefold()
Out[24]: 'test ttest'

# swapcase() swaps upper and lower case
In [27]: s='TEst';s.swapcase()
Out[27]: 'teST'
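
For plain ASCII text casefold() and lower() give the same result; the difference shows up with characters such as the German sharp s. A minimal sketch, where same_ignoring_case is a hypothetical helper:

```python
# lower() leaves the sharp s alone; casefold() maps it to 'ss'.
print('Straße'.lower())      # straße
print('Straße'.casefold())   # strasse

def same_ignoring_case(a, b):
    """Compare two strings without regard to case."""
    return a.casefold() == b.casefold()

print(same_ignoring_case('STRASSE', 'Straße'))   # True
```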

2. Formatting (good to know, but rarely used)

# center()
In [9]: s='test';s.center(20)
Out[9]: '        test        '

# zfill()
In [22]: s="700";s.zfill(20)
Out[22]: '00000000000000000700'

# expandtabs(n) replaces each tab with spaces up to the next multiple of n columns (a lone tab becomes n spaces)
In [29]: '\t'.expandtabs(6)
Out[29]: '      '
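
Two details of these methods are easy to miss: expandtabs() pads to tab stops rather than inserting a fixed number of spaces, and zfill() keeps a leading sign in front of the zeros. A short sketch with made-up inputs:

```python
# Each tab advances to the next multiple of the tab size.
expanded = 'a\tbc\td'.expandtabs(4)
print(repr(expanded))   # 'a   bc  d'

# zfill() inserts the zeros after a leading sign.
padded = '-42'.zfill(6)
print(padded)           # -00042
```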

###Modification###

1. replace()

In [2]: help(s.replace)
replace(...) method of builtins.str instance
    S.replace(old, new[, count]) -> str
    Return a copy of S with all occurrences of substring
    old replaced by new.  If the optional argument count is
    given, only the first count occurrences are replaced.

# replace('old','new') replaces every occurrence of old in the string with new
In [3]: s='red red green';s.replace('red','yellow')
Out[3]: 'yellow yellow green'

# replace('old','new',count): count limits the number of replacements
In [4]: s='red red green';s.replace('red','yellow',1)
Out[4]: 'yellow red green'
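
As the docstring says, replace() returns a copy, so the original string is never modified and calls can be chained. A minimal sketch with the same sample string:

```python
s = 'red red green'

# replace() returns a new string; s itself is unchanged.
t = s.replace('red', 'yellow')
print(s)   # red red green
print(t)   # yellow yellow green

# Each call in a chain works on the result of the previous one.
chained = s.replace('red', 'blue', 1).replace('green', 'red')
print(chained)   # blue red red
```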

2. strip()/lstrip()/rstrip()

In [5]: help(s.strip)
strip(...) method of builtins.str instance
    S.strip([chars]) -> str
    Return a copy of the string S with leading and trailing
    whitespace removed.
    If chars is given and not None, remove characters in chars instead.

In [6]: help(s.lstrip)
lstrip(...) method of builtins.str instance
    S.lstrip([chars]) -> str
    Return a copy of the string S with leading whitespace removed.
    If chars is given and not None, remove characters in chars instead.

In [7]: help(s.rstrip)
rstrip(...) method of builtins.str instance
    S.rstrip([chars]) -> str
    Return a copy of the string S with trailing whitespace removed.
    If chars is given and not None, remove characters in chars instead.

# strip() removes whitespace from both ends of the string
In [1]: s=' hhh hhh ';s.strip()
Out[1]: 'hhh hhh'

In [3]: s='\n \r \t hhh hhh \t \n \r';s.strip()
Out[3]: 'hhh hhh'

# strip() can also remove a given set of characters from both ends
In [4]: s='###hhh###kkk###';s.strip('#')
Out[4]: 'hhh###kkk'

In [5]: s='{{ hhh }}';s.strip('{}')
Out[5]: ' hhh '

In [6]: s='{{ hhh }}';s.strip('{} ')
Out[6]: 'hhh'

# lstrip() and rstrip() strip only the left end or only the right end, respectively
In [8]: s='{{ hhh }}';s.lstrip('{} ')
Out[8]: 'hhh }}'

In [9]: s='{{ hhh }}';s.rstrip('{} ')
Out[9]: '{{ hhh'
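
The chars argument is treated as a set of individual characters, not as a prefix or suffix, which sometimes surprises people. A short sketch with made-up inputs:

```python
# Every leading and trailing character found in the set is removed.
host = 'www.example.com'.strip('cmowz.')
print(host)      # example

# Stripping stops at the first character that is not in the set.
inner = 'xxhixx'.strip('x')
print(inner)     # hi

left = 'abcba'.lstrip('ab')
print(left)      # cba
```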

3. center()/ljust()/rjust()

In [16]: help(s.center)
center(...) method of builtins.str instance
    S.center(width[, fillchar]) -> str
    Return S centered in a string of length width. Padding is
    done using the specified fill character (default is a space)

In [18]: help(s.ljust)
ljust(...) method of builtins.str instance
    S.ljust(width[, fillchar]) -> str
    Return S left-justified in a Unicode string of length width. Padding is
    done using the specified fill character (default is a space).

In [19]: help(s.rjust)
rjust(...) method of builtins.str instance
    S.rjust(width[, fillchar]) -> str
    Return S right-justified in a string of length width. Padding is
    done using the specified fill character (default is a space).

In [9]: s='test';len(s)
Out[9]: 4

# rjust() pads the string so that the original sits on the right (the fill character defaults to a space and can be specified)
In [10]: s1=s.rjust(10,'#');print(s1);len(s1)
######test
Out[10]: 10

# ljust() pads the string so that the original sits on the left
In [11]: s1=s.ljust(10,'#');print(s1);len(s1)
test######
Out[11]: 10

# center() pads the string so that the original sits in the middle
In [13]: s1=s.center(11,'#');print(s1);len(s1)
####test###
Out[13]: 11

# If the requested width is smaller than the length of the original string, the string is returned unchanged
In [14]: s1=s.center(3,'#');print(s1);len(s1)
test
Out[14]: 4
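
One place these padding methods come in handy is lining up columns in plain-text output. A minimal sketch; the rows and column widths are arbitrary sample values.

```python
rows = [('apple', 3), ('kiwi', 12), ('watermelon', 1)]   # sample data

# Left-justify the name in 12 columns (filled with dots), right-justify the quantity in 4.
lines = [name.ljust(12, '.') + str(qty).rjust(4) for name, qty in rows]
print('\n'.join(lines))
# apple.......   3
# kiwi........  12
# watermelon..   1
```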

###Searching###

1. find()

In [17]: help(s.find)
find(...) method of builtins.str instance
    S.find(sub[, start[, end]]) -> int
    Return the lowest index in S where substring sub is found,
    such that sub is contained within S[start:end].  Optional
    arguments start and end are interpreted as in slice notation.
    Return -1 on failure.

# find() searches from left to right and returns the index of the first character of the first match
In [22]: s='holler hello love';s.find('hello')
Out[22]: 7

In [23]: s='holler hello love';s.find('h')
Out[23]: 0

# rfind() searches from right to left
In [30]: s='holler hello love';s.rfind('h')
Out[30]: 7

# find() also accepts start (and end) indexes to restrict the search range
In [25]: s='holler hello love';s.find('h',3)
Out[25]: 7

# Returns -1 if nothing is found
In [24]: s='holler hello love';s.find('hhh')
Out[24]: -1
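
The start argument makes it possible to walk through every match in turn. Here is a sketch of that loop; find_all is a hypothetical helper and it assumes a non-empty substring.

```python
def find_all(s, sub):
    """Return the start index of every non-overlapping occurrence of sub in s."""
    if not sub:
        raise ValueError('sub must not be empty')
    positions = []
    start = s.find(sub)
    while start != -1:
        positions.append(start)
        start = s.find(sub, start + len(sub))   # resume after the previous match
    return positions

print(find_all('holler hello love', 'l'))     # [2, 3, 9, 10, 13]
print(find_all('holler hello love', 'hhh'))   # []
```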

2. index()/rindex()

In [18]: help(s.index)
index(...) method of builtins.str instance
    S.index(sub[, start[, end]]) -> int
    Like S.find() but raise ValueError when the substring is not found.

# index() is used the same way as find()
In [31]: s='holler hello love';s.index('hello')
Out[31]: 7

# The only difference: it raises ValueError when the substring is not found
In [32]: s='holler hello love';s.index('hhh')
ValueError: substring not found
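
Which of the two to use is partly a matter of taste. The sketch below shows one reason to be careful with find(): its -1 result is itself a valid index.

```python
s = 'holler hello love'

# find() signals "not found" with -1, which must be checked explicitly,
# because -1 also indexes the last character.
pos = s.find('hhh')
print(pos)      # -1
print(s[pos])   # e  (silently wrong if the check is forgotten)

# index() raises instead, so a missing substring cannot go unnoticed.
try:
    s.index('hhh')
    found = True
except ValueError as exc:
    found = False
    print('not found:', exc)   # not found: substring not found

# For a plain membership test, the in operator is the simplest choice.
print('hello' in s)            # True
```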

3. count()

In [20]: help(s.count)
count(...) method of builtins.str instance
    S.count(sub[, start[, end]]) -> int
    Return the number of non-overlapping occurrences of substring sub in
    string S[start:end].  Optional arguments start and end are
    interpreted as in slice notation.

# count() counts how many times a substring occurs
In [33]: s='holler hello love';s.count('hello')
Out[33]: 1

In [35]: s='holler hello love';s.count('o')
Out[35]: 3

In [37]: s='holler hello love';s.count('o',0,6)
Out[37]: 1

# Unlike index(), count() does not raise when the substring is missing; it returns 0
In [36]: s='holler hello love';s.count('hhh')
Out[36]: 0
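
The docstring's phrase "non-overlapping occurrences" is worth a concrete look. A short sketch with a made-up string:

```python
# Matches are counted without overlap: 'aa' fits twice in 'aaaa', not three times.
whole = 'aaaa'.count('aa')
print(whole)   # 2

# start and end restrict the count to the slice s[start:end]; here that is 'aaa'.
tail = 'aaaa'.count('aa', 1)
print(tail)    # 1
```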

###Tests###

1. startswith()
In [39]: help(s.startswith)
startswith(...) method of builtins.str instance
    S.startswith(prefix[, start[, end]]) -> bool
    Return True if S starts with the specified prefix, False otherwise.
    With optional start, test S beginning at that position.
    With optional end, stop comparing S at that position.
    prefix can also be a tuple of strings to try.

# startswith() tests whether the string starts with a given substring; the result is a bool
In [41]: s='holler hello love';s.startswith('h')
Out[41]: True

In [42]: s='holler hello love';s.startswith('holler')
Out[42]: True

In [43]: s='holler hello love';s.startswith('hhh')
Out[43]: False

# start and end can likewise be given, restricting the test to s[start:end]
In [44]: s='holler hello love';s.startswith('h',2,8)
Out[44]: False
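
As the docstring notes, the prefix may also be a tuple of strings. A minimal sketch of that form and of the start argument; the URL is a placeholder.

```python
url = 'https://example.com/index.html'   # placeholder URL

# A tuple of prefixes matches if any one of them matches.
is_web = url.startswith(('http://', 'https://'))
print(is_web)                        # True

# With start, the test applies from that position onwards.
s = 'holler hello love'
print(s.startswith('hello', 7))      # True
print(s[7:].startswith('hello'))     # True, the same test spelled as a slice
```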

2. endswith()

In [45]: help(s.endswith)
endswith(...) method of builtins.str instance
    S.endswith(suffix[, start[, end]]) -> bool
    Return True if S ends with the specified suffix, False otherwise.
    With optional start, test S beginning at that position.
    With optional end, stop comparing S at that position.
    suffix can also be a tuple of strings to try.

# endswith() tests whether the string ends with a given substring; the result is a bool
In [1]: s='holler hello love';s.endswith('e')
Out[1]: True

In [2]: s='holler hello love';s.endswith('love')
Out[2]: True

In [3]: s='holler hello love';s.endswith('o',2,12)
Out[3]: True
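
A common use of endswith() is filtering file names by extension. The sketch below uses made-up names and passes a tuple of suffixes so that several extensions are checked at once.

```python
names = ['a.py', 'b.txt', 'c.PY', 'd.pyc']   # made-up file names

# Normalise the case first, then test against several suffixes at once.
python_files = [n for n in names if n.lower().endswith(('.py', '.pyw'))]
print(python_files)   # ['a.py', 'c.PY']
```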

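3. is*(): str methods whose names start with is perform tests

# isalnum() tests whether the string contains only letters and digits
# It can be used to detect spaces or other characters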
In [8]: s='holler';s.isalnum()
Out[8]: True

In [9]: s='holler\t';s.isalnum()
Out[9]: False

In [10]: s='holler hello love';s.isalnum()
Out[10]: False

In [11]: s='holler123';s.isalnum()
Out[11]: True

# isalpha() tests whether the string contains only letters
In [13]: s='123';s.isalpha()
Out[13]: False

In [14]: s='abc';s.isalpha()
Out[14]: True

In [15]: s='abc123';s.isalpha()
Out[15]: False

# isdecimal() tests whether the string contains only decimal digits
In [19]: s='123';s.isdecimal()
Out[19]: True

In [20]: s='abc';s.isdecimal()
Out[20]: False

In [21]: s='abc123';s.isdecimal()
Out[21]: False
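
isdecimal() has two close relatives, isdigit() and isnumeric(), which accept progressively more characters. The sketch below compares the three on a few sample strings chosen for illustration.

```python
samples = ['123', '²', '½', '-1', '1.5']

for text in samples:
    print(repr(text), text.isdecimal(), text.isdigit(), text.isnumeric())
# '123' True True True
# '²' False True True
# '½' False False True
# '-1' False False False
# '1.5' False False False
```

None of the three accepts a sign or a decimal point, so they are not a general test for "looks like a number".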

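# isidentifier() tests whether the string is a valid identifier
# A valid identifier starts with a letter or an underscore and contains only letters, digits and underscores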

In [22]: s='_abc';s.isidentifier()
Out[22]: True

In [23]: s='1abc';s.isidentifier()
Out[23]: False

In [24]: s='abc_123';s.isidentifier()
Out[24]: True
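
Note that isidentifier() only checks the syntax, so reserved words such as class also pass. A minimal sketch that combines it with the standard keyword module; is_usable_name is a hypothetical helper.

```python
import keyword

def is_usable_name(name):
    """True if name is a valid identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

print('class'.isidentifier())    # True: keywords pass the syntax check
print(is_usable_name('class'))   # False
print(is_usable_name('_abc'))    # True
```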


