NumPy: The dtype Data Type Object

Posted by flydean on 2021-04-30

Original URL: https://www.cnblogs.com/flydean/p/14720858.html

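Introduction

As mentioned earlier, NumPy has many data types, and each of them is an instance of dtype (numpy.dtype). This article takes a detailed look at the dtype object.

Definition of dtype

First, the definition of the dtype class: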

class numpy.dtype(obj, align=False, copy=False)

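It converts the object obj into a dtype object.

It takes two optional parameters:

align - whether to pad the fields so that they are aligned the way a C compiler would lay out an equivalent struct.
copy - whether to make a new copy of the data type object or return a reference to an existing one.

A dtype describes the kind of data (int, float, Python object, and so on), the size of the data, and its byte order (little-endian or big-endian).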
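As a rough illustration of these properties, the sketch below inspects a few dtype attributes and shows the effect of align on a two-field structure. The field names a and b are made up for the example, and the aligned size assumes a platform where a 4-byte integer is 4-byte aligned.

```python
import numpy as np

# Kind, size in bytes, and string form of a big-endian 32-bit integer
dt = np.dtype('>i4')
print(dt.kind, dt.itemsize, dt.str)

# The same two fields, packed versus aligned like a C struct
fields = [('a', 'i1'), ('b', 'i4')]   # hypothetical field names
packed = np.dtype(fields)
aligned = np.dtype(fields, align=True)
print(packed.itemsize, aligned.itemsize)
```

With align=True, padding is inserted after the 1-byte field so that the 4-byte field starts on a 4-byte boundary.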

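Objects that can be converted to dtype

Many kinds of obj can be converted. We go through them one by one.

dtype objects

If obj is already a dtype object, it is used as is.

None

Passing None gives the default, float_ (float64). The same default is why array-creation functions such as np.zeros produce float arrays when no dtype is specified.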
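A quick check of this default, as a minimal sketch. Note that the default only applies when there is no data to infer a type from; np.array([1, 2]) would still produce an integer array.

```python
import numpy as np

default_dt = np.dtype(None)      # None selects the default data type
zeros_dt = np.zeros(3).dtype     # no dtype given, so the default is used
print(default_dt, zeros_dt)
```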

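Array scalar types

The built-in array scalar types can be converted to the corresponding data-type objects.

The previous article explained what array scalar types are: they are the data types accessible as attributes of the np module, for example np.int32 and np.complex128.

Here is how array scalar types convert: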

In [85]: np.dtype(np.int32)
Out[85]: dtype('int32')

In [86]: np.dtype(np.complex128)
Out[86]: dtype('complex128')

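For the built-in array scalar types that start with np, see my earlier article "NumPy: Data Types".

Note that array scalar types are not dtype objects, although in many cases they can be used wherever a dtype object is required.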
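The distinction can be seen in a short sketch: np.int32 is a Python class, np.dtype(np.int32) is a dtype instance, and the dtype keeps a link back to its scalar type through the type attribute.

```python
import numpy as np

is_dtype = isinstance(np.int32, np.dtype)   # the scalar type itself is not a dtype
dt = np.dtype(np.int32)                     # explicit conversion to a dtype

# A scalar type is accepted wherever a dtype is expected
arr = np.zeros(2, dtype=np.int32)
same = arr.dtype == dt

# The dtype refers back to its array scalar type
back = dt.type is np.int32
print(is_dtype, same, back)
```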

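Generic types

Some generic type objects can be converted to a corresponding dtype (this conversion has been deprecated since NumPy 1.19):

Generic type object	dtype
`number`, `inexact`, `floating`	`float`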
`complexfloating`	`cfloat`
`integer`, `signedinteger`	`int_`
`unsignedinteger`	`uint`
`character`	`string`
`generic`, `flexible`	`void`

Built-in Python types

Several built-in Python types are equivalent to array scalar types and can also be converted to dtype:

Python type	dtype
int	`int_`
bool	`bool_`
float	`float_`
complex	`cfloat`
bytes	`bytes_`
str	`str_`
buffer	`void`
(all others)	`object_`

Examples of converting built-in Python types:

In [82]: np.dtype(float)
Out[82]: dtype('float64')

In [83]: np.dtype(int)
Out[83]: dtype('int64')

In [84]:  np.dtype(object)
Out[84]: dtype('O')
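The mapping in the table can be checked with a few comparisons. This is only a sketch: the size of the default integer depends on the platform and NumPy version, so the code does not assume int64, and dict is used here as an arbitrary stand-in for "all others".

```python
import numpy as np

float_dt = np.dtype(float)        # double precision
bool_dt = np.dtype(bool)
complex_dt = np.dtype(complex)    # two double-precision parts
int_dt = np.dtype(int)            # platform default integer; size can vary
other_dt = np.dtype(dict)         # no scalar equivalent, falls back to object

print(float_dt, bool_dt, complex_dt, int_dt, other_dt)
```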

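Objects with a .dtype attribute

Any type object that has a dtype attribute can be converted, as long as that attribute is itself convertible to dtype.

One-character strings

Every built-in data type has a corresponding character code, and these codes can also be used for conversion: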

In [134]: np.dtype('b')  # byte, native byte order
Out[134]: dtype('int8')

In [135]: np.dtype('>H')  # big-endian unsigned short
Out[135]: dtype('>u2')

In [136]: np.dtype('<f') # little-endian single-precision float
Out[136]: dtype('float32')

In [137]: np.dtype('d') # double-precision floating-point number
Out[137]: dtype('float64')
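The sketch below collects the name and element size that each of these character codes resolves to, and shows that an explicit byte-order prefix is kept in the dtype's string form.

```python
import numpy as np

# Map each character code to the dtype name and element size it produces
summary = {code: (np.dtype(code).name, np.dtype(code).itemsize)
           for code in ['b', 'H', 'f', 'd']}
print(summary)

# A byte-order prefix is preserved in the dtype's string form
big_endian_short = np.dtype('>H')
print(big_endian_short.str)
```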

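Array-protocol type strings

NumPy arrays expose a string called typestr through the array interface (the 'typestr' entry of __array_interface__).

typestr describes the type and size of the data stored in the array.

A typestr has three parts. The first part gives the byte order: < for little-endian, > for big-endian.

The second part is a character code for the basic type of the array elements:

Code	Description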
`t`	Bit field (following integer gives the number of bits in the bit field).
`b`	Boolean (integer type where all values are only True or False)
`i`	Integer
`u`	Unsigned integer
`f`	Floating point
`c`	Complex floating point
`m`	Timedelta
`M`	Datetime
`O`	Object (i.e. the memory contains a pointer to PyObject)
`S`	String (fixed-length sequence of char)
`U`	Unicode (fixed-length sequence of Py_UNICODE)
`V`	Other (void * – each item is a fixed-size chunk of memory)

The last part is the size of the data: the number of bytes per element (for `U`, the number of characters).
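To make the three parts concrete, here is a small sketch that reads the typestr of an array and splits it. The helper split_typestr is hypothetical, written only for this example, and it handles simple strings such as '>f8' but not ones with a unit suffix such as datetime types.

```python
import numpy as np

def split_typestr(typestr):
    """Split a simple typestr such as '<f8' into (byte order, type code, size)."""
    return typestr[0], typestr[1], int(typestr[2:])

arr = np.zeros(3, dtype='>f8')
typestr = arr.__array_interface__['typestr']   # same string as arr.dtype.str
parts = split_typestr(typestr)
print(typestr, parts)
```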

dtype accepts the following type codes in such strings:

Code	Description
`'?'`	boolean
`'b'`	(signed) byte
`'B'`	unsigned byte
`'i'`	(signed) integer
`'u'`	unsigned integer
`'f'`	floating-point
`'c'`	complex-floating point
`'m'`	timedelta
`'M'`	datetime
`'O'`	(Python) objects
`'S'`, `'a'`	zero-terminated bytes (not recommended)
`'U'`	Unicode string
`'V'`	raw data (`void`)

A few examples:

In [137]: np.dtype('d')
Out[137]: dtype('float64')

In [138]: np.dtype('i4')
Out[138]: dtype('int32')

In [139]: np.dtype('f8')
Out[139]: dtype('float64')

In [140]:  np.dtype('c16')
Out[140]: dtype('complex128')

In [141]: np.dtype('a25')
Out[141]: dtype('S25')

In [142]: np.dtype('U25')
Out[142]: dtype('<U25')
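The number in these strings relates directly to the element size, which the following sketch checks through itemsize. For 'U' the number counts characters, and NumPy stores each character in 4 bytes, so 'U25' occupies 100 bytes.

```python
import numpy as np

# Element size in bytes for several type strings
sizes = {code: np.dtype(code).itemsize
         for code in ['i4', 'f8', 'c16', 'S25', 'U25']}
print(sizes)

# A type string and the matching scalar type give equal dtypes
same_int = np.dtype('i4') == np.int32
print(same_int)
```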

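Comma-separated strings

A comma-separated string can describe a structured data type.

Such a string can also be converted to a dtype. The fields of the resulting dtype are named f0, f1, ..., f(n-1). For example: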

In [143]: np.dtype("i4, (2,3)f8, f4")
Out[143]: dtype([('f0', '<i4'), ('f1', '<f8', (2, 3)), ('f2', '<f4')])

In the example above, f0 holds a 32-bit integer, f1 holds a 2 x 3 array of 64-bit floating-point numbers, and f2 holds a 32-bit floating-point number.

Another example:

In [144]: np.dtype("a3, 3u8, (3,4)a10")
Out[144]: dtype([('f0', 'S3'), ('f1', '<u8', (3,)), ('f2', 'S10', (3, 4))])
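The generated field names and shapes can be inspected on the resulting dtype. The sketch below reuses the first string from above and also shows how a sub-array field shows up when indexing an array.

```python
import numpy as np

dt = np.dtype("i4, (2,3)f8, f4")
names = dt.names                  # automatically generated field names
sub_shape = dt['f1'].shape        # shape of the sub-array field
itemsize = dt.itemsize            # 4 + 2*3*8 + 4 bytes per record
print(names, sub_shape, itemsize)

# Selecting a sub-array field appends its shape to the array shape
records = np.zeros(2, dtype=dt)
print(records['f1'].shape)
```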

Type strings

Any string key in numpy.sctypeDict.keys() can be converted to dtype:

In [146]: np.sctypeDict.keys()
Out[146]: dict_keys(['?', 0, 'byte', 'b', 1, 'ubyte', 'B', 2, 'short', 'h', 3, 'ushort', 'H', 4, 'i', 5, 'uint', 'I', 6, 'intp', 'p', 7, 'uintp', 'P', 8, 'long', 'l', 'L', 'longlong', 'q', 9, 'ulonglong', 'Q', 10, 'half', 'e', 23, 'f', 11, 'double', 'd', 12, 'longdouble', 'g', 13, 'cfloat', 'F', 14, 'cdouble', 'D', 15, 'clongdouble', 'G', 16, 'O', 17, 'S', 18, 'unicode', 'U', 19, 'void', 'V', 20, 'M', 21, 'm', 22, 'bool8', 'Bool', 'b1', 'float16', 'Float16', 'f2', 'float32', 'Float32', 'f4', 'float64', 'Float64', 'f8', 'float128', 'Float128', 'f16', 'complex64', 'Complex32', 'c8', 'complex128', 'Complex64', 'c16', 'complex256', 'Complex128', 'c32', 'object0', 'Object0', 'bytes0', 'Bytes0', 'str0', 'Str0', 'void0', 'Void0', 'datetime64', 'Datetime64', 'M8', 'timedelta64', 'Timedelta64', 'm8', 'int64', 'uint64', 'Int64', 'UInt64', 'i8', 'u8', 'int32', 'uint32', 'Int32', 'UInt32', 'i4', 'u4', 'int16', 'uint16', 'Int16', 'UInt16', 'i2', 'u2', 'int8', 'uint8', 'Int8', 'UInt8', 'i1', 'u1', 'complex_', 'int0', 'uint0', 'single', 'csingle', 'singlecomplex', 'float_', 'intc', 'uintc', 'int_', 'longfloat', 'clongfloat', 'longcomplex', 'bool_', 'unicode_', 'object_', 'bytes_', 'str_', 'string_', 'int', 'float', 'complex', 'bool', 'object', 'str', 'bytes', 'a'])

Usage examples:

In [147]: np.dtype('uint32')
Out[147]: dtype('uint32')

In [148]: np.dtype('float64')
Out[148]: dtype('float64')

Tuples

A tuple built around a dtype can be used to create a new dtype.

There are several tuple forms.

(flexible_dtype, itemsize)

For a flexible dtype, whose size is not fixed, the tuple specifies the item size:

In [149]: np.dtype((np.void, 10))
Out[149]: dtype('V10')

In [150]: np.dtype(('U', 10))
Out[150]: dtype('<U10')
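A short sketch of what the size means for each flexible type: for void and byte strings it is a byte count, while for 'U' it is a character count.

```python
import numpy as np

void10 = np.dtype((np.void, 10))   # 10 raw bytes
str10 = np.dtype(('U', 10))        # 10 characters, 4 bytes each
bytes5 = np.dtype(('S', 5))        # 5 bytes

print(void10.itemsize, str10.itemsize, bytes5.itemsize)
```

The tuple form is equivalent to writing the size in the string, so ('U', 10) and 'U10' describe the same dtype.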

(fixed_dtype, shape)

For a fixed-size dtype, the tuple specifies a shape:

In [151]:  np.dtype((np.int32, (2,2)))
Out[151]: dtype(('<i4', (2, 2)))

In [152]: np.dtype(('i4, (2,3)f8, f4', (2,3)))
Out[152]: dtype(([('f0', '<i4'), ('f1', '<f8', (2, 3)), ('f2', '<f4')], (2, 3)))
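The sketch below looks at such a sub-array dtype more closely. When it is used directly as the dtype of an array, the sub-array dimensions are appended to the array's shape and the element type becomes the base type.

```python
import numpy as np

block = np.dtype((np.int32, (2, 2)))
print(block.shape, block.base, block.itemsize)

# Used for an array, the sub-array dimensions are appended to the array shape
arr = np.zeros(3, dtype=block)
print(arr.shape, arr.dtype)
```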

[(field_name, field_dtype, field_shape), ...]

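Each element of the list is a field made up of two or three parts: the field name, the field dtype, and optionally the field shape.

If field_name is the empty string '', a default name of the form f0, f1, ... is used. field_name can also be a 2-tuple of (title, name).

field_dtype is the dtype of the field.

field_shape is optional. Specify it when the field is a sub-array of field_dtype elements.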

In [153]: np.dtype([('big', '>i4'), ('little', '<i4')])
Out[153]: dtype([('big', '>i4'), ('little', '<i4')])

This defines two fields: a big-endian 32-bit int and a little-endian 32-bit int.

In [154]: np.dtype([('R','u1'), ('G','u1'), ('B','u1'), ('A','u1')])
Out[154]: dtype([('R', 'u1'), ('G', 'u1'), ('B', 'u1'), ('A', 'u1')])

Four fields, each an unsigned 8-bit integer.
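To see what these two dtypes mean in memory, the following sketch stores the value 1 in both byte orders and then fills one field of a small RGBA array. The array contents are made up for illustration.

```python
import numpy as np

# The same value stored with two different byte orders
endian = np.dtype([('big', '>i4'), ('little', '<i4')])
pair = np.array([(1, 1)], dtype=endian)
raw = pair.tobytes()
print(raw)

# An RGBA pixel made of four unsigned 8-bit fields
rgba = np.dtype([('R', 'u1'), ('G', 'u1'), ('B', 'u1'), ('A', 'u1')])
pixels = np.zeros(2, dtype=rgba)
pixels['R'] = 255            # set one field across all records
print(pixels, rgba.itemsize)
```

In the raw bytes, the big-endian field puts the 1 in its last byte and the little-endian field puts it in its first byte.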

{'names': ..., 'formats': ..., 'offsets': ..., 'titles': ..., 'itemsize': ...}

This form takes a list of names and a list of formats:

In [157]: np.dtype({'names': ['r','g','b','a'], 'formats': [np.uint8, np.uint8, np.uint8, np.uint8]})
Out[157]: dtype([('r', 'u1'), ('g', 'u1'), ('b', 'u1'), ('a', 'u1')])

offsets gives the byte offset of each field, titles gives the title of each field, and itemsize is the total size of the dtype in bytes.

In [158]: np.dtype({'names': ['r','b'], 'formats': ['u1', 'u1'],
     ...:                'offsets': [0, 2],
     ...:                'titles': ['Red pixel', 'Blue pixel']})
     ...:
Out[158]: dtype({'names':['r','b'], 'formats':['u1','u1'], 'offsets':[0,2], 'titles':['Red pixel','Blue pixel'], 'itemsize':3})
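The offsets and titles can be read back from the fields attribute, as sketched below. The second dtype, padded, is an extra illustration that is not in the example above: it sets itemsize explicitly to reserve more room per record.

```python
import numpy as np

dt = np.dtype({'names': ['r', 'b'], 'formats': ['u1', 'u1'],
               'offsets': [0, 2],
               'titles': ['Red pixel', 'Blue pixel']})

# fields maps each name (and each string title) to (dtype, offset, title)
print(dt.fields['b'])
print(dt.itemsize)            # includes the unused byte at offset 1

# An explicit itemsize pads every record to the given total size
padded = np.dtype({'names': ['r', 'b'], 'formats': ['u1', 'u1'],
                   'offsets': [0, 2], 'itemsize': 8})
print(padded.itemsize)
```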

(base_dtype, new_dtype)

A base dtype can be reinterpreted as a structured dtype of the same total size:

In [159]: np.dtype((np.int32,{'real':(np.int16, 0),'imag':(np.int16, 2)}))
Out[159]: dtype([('real', '<i2'), ('imag', '<i2')])

A 32-bit int interpreted as two 16-bit ints.

In [161]: np.dtype(('i4', [('r','u1'),('g','u1'),('b','u1'),('a','u1')]))
Out[161]: dtype([('r', 'u1'), ('g', 'u1'), ('b', 'u1'), ('a', 'u1')])

A 32-bit int interpreted as four unsigned 8-bit integers.
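As a sketch of how the fields overlay the base type, the code below rebuilds both dtypes and reads back their field names, total size, and byte offsets. It only inspects the dtypes themselves; the variable names are chosen for the example.

```python
import numpy as np

complex_int = np.dtype((np.int32, {'real': (np.int16, 0), 'imag': (np.int16, 2)}))
rgba = np.dtype(('i4', [('r', 'u1'), ('g', 'u1'), ('b', 'u1'), ('a', 'u1')]))

print(complex_int.names, complex_int.itemsize)
print(rgba.names, rgba.itemsize)

# Byte offset of each field inside the 4-byte integer
offsets = {name: rgba.fields[name][1] for name in rgba.names}
print(offsets)
```

In both cases the fields add up to exactly the 4 bytes of the base integer, which is the requirement for this tuple form.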

This article is also available at http://www.flydean.com/04-python-numpy-datatype-obj/
