A roundup of image-reading libraries: cv2, PIL, skimage with numpy, and pytorch (ToPILImage)

1 Reading images and their attributes

1.1 Converting between PIL and numpy

import numpy as np
from PIL import Image

# read a 3-channel image, 500x889 pixels (width x height)
img_pil = Image.open('./test.png')
# display the image
img_pil.show()

# print the image info
print(img_pil)

# get a pixel value from the PIL image; the argument is (x, y)
print(img_pil.getpixel((0, 0)))

# convert PIL to numpy
img_np = np.array(img_pil)
print(img_np.shape)

# get a pixel value from the numpy array; the index is [row, column]
print(img_np[0, 0])

# convert numpy back to PIL
img_pil = Image.fromarray(img_np)
print(img_pil)

"""
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x889 at 0x193331AD240>
(219, 210, 193)
(889, 500, 3)
[219 210 193]
<PIL.Image.Image image mode=RGB size=500x889 at 0x1933330ADA0>
"""

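Note:

PIL reads the three channels in RGB order, and the width and height it reports match the original image.
Converting between PIL and numpy has one subtlety: PIL reports the size as (width, height), whereas np.array() yields an array of shape (height, width, channels). Image.fromarray() restores the original PIL size.
To read the pixel at a given position, PIL uses img_pil.getpixel((x, y)). A numpy array is indexed directly as a matrix, row first, so the same pixel is img_np[y, x].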
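
The difference between PIL's (x, y) and numpy's [row, column] is easy to miss when the test pixel is (0, 0). The following is a minimal sketch that makes it visible; it assumes a small synthetic image built in memory rather than test.png, and the variable names are only for illustration.

```python
import numpy as np
from PIL import Image

# Hypothetical 4x3 test image (width=4, height=3), built in memory so no file is needed.
width, height = 4, 3
img_np = np.zeros((height, width, 3), dtype=np.uint8)
img_np[1, 2] = (255, 0, 0)  # row 1 (y=1), column 2 (x=2) is red

img_pil = Image.fromarray(img_np)

print(img_pil.size)              # PIL reports (width, height): (4, 3)
print(img_np.shape)              # numpy reports (height, width, channels): (3, 4, 3)
print(img_pil.getpixel((2, 1)))  # PIL takes (x, y): (255, 0, 0)
print(img_np[1, 2].tolist())     # numpy takes [y, x]: [255, 0, 0]
```

Both calls read the same red pixel, but the two coordinates are written in opposite order.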

1.2 Converting between cv2 and numpy

import numpy as np
import cv2

# read a 3-channel image, 500x889 pixels (width x height)
img_cv = cv2.imread('./test.png')
# display the image
cv2.imshow('img', img_cv)

# print the image shape
print(img_cv.shape)

# get a pixel value; cv2 images are numpy arrays, indexed [row, column]
print(img_cv[0, 0])

# convert cv2 to numpy (cv2.imread already returns a numpy array)
img_np = np.array(img_cv)
print(img_np.shape)

# get a pixel value from the numpy array
print(img_np[0, 0])

# convert numpy to cv2 (not necessary: cv2 displays numpy arrays directly)
cv2.imshow('img_np', img_np)

# keep the windows open until a key is pressed
cv2.waitKey(0)

"""
(889, 500, 3)
[193 210 219]
(889, 500, 3)
[193 210 219]
"""

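Note:

cv2 reads the three channels in BGR order, not RGB. The shape is reported as (height, width, channels), so width and height appear swapped relative to PIL's size.
cv2 accesses elements the same way as numpy, directly by index.
cv2 can display a numpy array (uint8) directly.
To keep the cv2 window from closing immediately, cv2.waitKey() is usually added so that the program waits for a key press before exiting.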
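
Because cv2 returns BGR while PIL and skimage return RGB, an image usually needs its channel axis reversed before it is passed from one library to the other. The sketch below does this with plain numpy slicing on a hypothetical 1x1 image that holds the BGR pixel printed above; bgr_to_rgb is a helper name introduced here, not a cv2 function. Within cv2 itself, cv2.cvtColor(img, cv2.COLOR_BGR2RGB) performs the same conversion.

```python
import numpy as np


def bgr_to_rgb(img_bgr):
    """Reverse the last (channel) axis of an H x W x 3 array."""
    # ascontiguousarray makes a normal copy instead of a negatively strided view
    return np.ascontiguousarray(img_bgr[..., ::-1])


# Hypothetical 1x1 image holding the BGR pixel from the cv2 output above.
img_bgr = np.array([[[193, 210, 219]]], dtype=np.uint8)
img_rgb = bgr_to_rgb(img_bgr)

print(img_rgb[0, 0])  # [219 210 193], the value PIL reported for the same pixel
```

Applying the same function a second time converts RGB back to BGR.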

1.3 Converting between skimage and numpy

import numpy as np
from skimage import io, transform
import matplotlib.pyplot as plt


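# read a 3-channel image, 500x889 pixels (width x height)
img_sk = io.imread('./test.png')

# print the image shape and draw the image
print(img_sk.shape)
io.imshow(img_sk)

# get a pixel value; skimage images are numpy arrays, indexed [row, column]
print(img_sk[0, 0])

# convert skimage to numpy (io.imread already returns a numpy array)
img_np = np.array(img_sk)
print(img_np.shape)

# get a pixel value from the numpy array
print(img_np[0, 0])

# convert numpy to skimage (not necessary: io.imshow accepts numpy arrays)
io.imshow(img_np)

# the figures only appear once pyplot shows them
plt.show()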

"""
(889, 500, 3)
[219 210 193]
(889, 500, 3)
[219 210 193]
"""

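Note:

skimage is similar to cv2: both return numpy arrays, so the printed shape is the same and conversion to numpy is straightforward. The pixel values show the difference: skimage keeps RGB order ([219 210 193]), whereas cv2 returns BGR ([193 210 219]).
skimage does not display images on its own; it relies on matplotlib.pyplot. skimage is therefore usually combined with pyplot to visualize intermediate results, which makes it easy to draw figures and tables.

In summary, PIL preserves as much of the original input information as possible and is quick and convenient to use; it is also commonly combined with imageio for image preprocessing. cv2 loads the image as an array, which makes further processing easy. skimage combined with matplotlib makes it more convenient to compare images.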

2 Reading images with PyTorch

torch.utils.data.DataLoader(dataset, batch_size=batch_size,
                                  shuffle=False, num_workers=8, drop_last=False)
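
As a rough illustration of what the call above asks of its dataset argument, the sketch below walks a map-style dataset in index order and groups the samples into batches. ToyDataset and iterate_batches are hypothetical names introduced here, and the loop only imitates the shuffle=False case. The real DataLoader additionally collates samples into tensors and can load them in worker processes.

```python
class ToyDataset:
    """Hypothetical stand-in for a map-style dataset."""

    def __init__(self, samples):
        self.samples = samples  # list of (image, label) pairs

    def __getitem__(self, index):
        return self.samples[index]

    def __len__(self):
        return len(self.samples)


def iterate_batches(dataset, batch_size, drop_last=False):
    """Yield lists of samples in index order (the shuffle=False case)."""
    batch = []
    for index in range(len(dataset)):
        batch.append(dataset[index])
        if len(batch) == batch_size:
            yield batch
            batch = []
    # the final, shorter batch is kept unless drop_last is set
    if batch and not drop_last:
        yield batch


dataset = ToyDataset([("img%d" % i, i) for i in range(5)])
batches = list(iterate_batches(dataset, batch_size=2))
print([len(b) for b in batches])  # [2, 2, 1]
```

The only methods the loop relies on are __len__ and __getitem__, which is why a custom dataset class defines both.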

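Calling PyTorch's DataLoader requires a dataset. Here the dataset is a user-defined class that returns each image together with its label and applies data augmentation to the image; at the augmentation stage the image is a PIL object. Stanford_car is used as an example (code source: sourcecode):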

import os

import imageio
import numpy as np
from PIL import Image
from torchvision import transforms


class STANFORD_CAR():
    def __init__(self, input_size, root, is_train=True, data_len=None):
        self.input_size = input_size
        self.root = root
        self.is_train = is_train
        train_img_path = os.path.join(self.root, 'cars_train')
        test_img_path = os.path.join(self.root, 'cars_test')
        train_img_label = []
        test_img_label = []
        # each label line is "<image file> <label>"; the label is decremented by 1
        with open(os.path.join(self.root, 'train.txt')) as train_label_file:
            for line in train_label_file:
                train_img_label.append([os.path.join(train_img_path,
                                                     line[:-1].split(' ')[0]),
                                        int(line[:-1].split(' ')[1]) - 1])
        with open(os.path.join(self.root, 'test.txt')) as test_label_file:
            for line in test_label_file:
                test_img_label.append([os.path.join(test_img_path,
                                                    line[:-1].split(' ')[0]),
                                       int(line[:-1].split(' ')[1]) - 1])
        # data_len=None keeps every sample
        self.train_img_label = train_img_label[:data_len]
        self.test_img_label = test_img_label[:data_len]

    def __getitem__(self, index):
        if self.is_train:
            img = imageio.imread(self.train_img_label[index][0])
            target = self.train_img_label[index][1]
            # expand a grayscale image to 3 channels
            if len(img.shape) == 2:
                img = np.stack([img] * 3, 2)
            img = Image.fromarray(img, mode='RGB')

            # augmentation is applied to the PIL image
            img = transforms.Resize((self.input_size,
                                     self.input_size), Image.BILINEAR)(img)
            # img = transforms.RandomResizedCrop(size=self.input_size,
            #                                    scale=(0.4, 0.75), ratio=(0.5, 1.5))(img)
            # img = transforms.RandomCrop(self.input_size)(img)
            img = transforms.RandomHorizontalFlip()(img)
            img = transforms.ColorJitter(brightness=0.2, contrast=0.2)(img)

            # convert to a tensor and normalize each channel
            img = transforms.ToTensor()(img)
            img = transforms.Normalize([0.485, 0.456, 0.406],
                                       [0.229, 0.224, 0.225])(img)

        else:
            img = imageio.imread(self.test_img_label[index][0])
            target = self.test_img_label[index][1]
            # expand a grayscale image to 3 channels
            if len(img.shape) == 2:
                img = np.stack([img] * 3, 2)
            img = Image.fromarray(img, mode='RGB')
            # test images are only resized, with no random augmentation
            img = transforms.Resize((self.input_size,
                                     self.input_size), Image.BILINEAR)(img)
            # img = transforms.CenterCrop(self.input_size)(img)
            img = transforms.ToTensor()(img)
            img = transforms.Normalize([0.485, 0.456, 0.406],
                                       [0.229, 0.224, 0.225])(img)
        return img, target

    def __len__(self):
        if self.is_train:
            return len(self.train_img_label)
        else:
            return len(self.test_img_label)

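The STANFORD_CAR class combines several libraries: imageio reads the file, numpy expands grayscale images to three channels, and PIL provides the image object on which the augmentation transforms operate.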
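
To make the array handling in __getitem__ concrete, here is a numpy-only sketch of two of its steps: stacking a grayscale image into three channels, and the arithmetic that ToTensor followed by Normalize performs (scale to [0, 1], move channels first, subtract the mean and divide by the standard deviation per channel). The helper names gray_to_rgb and to_normalized_chw are hypothetical, and the result is a numpy array rather than a torch tensor, so this only approximates the torchvision transforms.

```python
import numpy as np

# per-channel statistics taken from the Normalize call in the dataset code
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def gray_to_rgb(img):
    """Stack an H x W grayscale array into H x W x 3, as the dataset code does."""
    if img.ndim == 2:
        img = np.stack([img] * 3, 2)
    return img


def to_normalized_chw(img_hwc_uint8):
    """Approximate ToTensor followed by Normalize on a numpy array."""
    x = img_hwc_uint8.astype(np.float32) / 255.0  # scale to [0, 1]
    x = np.transpose(x, (2, 0, 1))                # H x W x C -> C x H x W
    return (x - MEAN[:, None, None]) / STD[:, None, None]


gray = np.full((2, 4), 255, dtype=np.uint8)  # hypothetical all-white 4x2 image
out = to_normalized_chw(gray_to_rgb(gray))
print(out.shape)  # (3, 2, 4)
```

After these steps the layout is (channels, height, width), the reverse of the (height, width, channels) layout that PIL, cv2, and skimage hand to numpy.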
