Hard to Park During Spring Festival? Find an Open Spot with Python

Author | Adam Geitgey

Translator | 風車雲馬

Editor | Jane

Published by | AI科技大本營 (ID: rgznai100)

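[Editor's note] Today's topic is very close to everyday life. The editor lives in Beijing and knows that the worst parts of driving anywhere are traffic jams and not finding a parking space. On the weekend of the winter solstice, a few friends came back from skiing and went to a dumpling restaurant, and only after circling street after street did they finally find a spot. On seeing this technical article, we immediately wanted to study it and share it, in the hope that it helps with this pain point. It might even come in handy for Spring Festival travel.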

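The author combines a camera with deep learning algorithms to build, in Python, a high-accuracy parking space notification system that sends him a text message whenever a new spot opens up. It may sound complicated, and you might wonder whether it is really practical. In fact, the tools involved are all off the shelf; combine them sensibly and the system can be built quickly and easily.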

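Let's walk through the whole pipeline.

Breaking Down the Problem

The first step in solving a complex problem is to break it into several simple subtasks. Each subtask can then be tackled with a suitable machine learning technique, and the pieces are finally chained together into a complete solution.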

Here is the flow chart for detecting open parking spaces:

[Image: flow chart of the parking space detection pipeline]

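Input: a video stream from an ordinary camera

With the input data in hand, we next need to know which parts of the image are parking spaces and which of those spaces are unoccupied.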

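Step 1: Detect all possible parking spaces in a frame of video.

Step 2: Detect all the cars in each frame of video, so that each car can be tracked from one frame to the next.

Step 3: Determine which parking spaces are currently occupied and which are not, by combining the results of steps 1 and 2.

Final step: Send a notification when a parking space becomes available.

Each of these steps can be done in several different ways, and each way has its own strengths and weaknesses. Let's look at them in detail.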

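I. Detecting Parking Spaces

The camera's view looks like this:

[Image: the camera view]

We need to scan this image and find the valid parking areas, like the ones marked in yellow below:

[Image: the camera view with valid parking areas marked in yellow]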

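A lazy approach would be to hard-code the location of each parking space into the program instead of detecting the spaces automatically. But if the camera is moved, or pointed at parking spaces on a different street, the locations would have to be marked by hand all over again. That makes hard-coding a poor choice, so we want a way to detect parking spaces automatically.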

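One idea is to detect parking meters and assume that each meter has a parking space next to it:

[Image: detected parking meters]

But this approach has problems too. First, not every parking space has a meter, and free parking spaces are the ones we most want to find. Second, knowing where a meter is does not tell us exactly where the parking space is.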

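Another idea is to build an object detection model that finds the parking space markings painted on the road, like those marked in the image below:

[Image: parking space markings on the road]

This approach has two difficulties as well. First, seen from a distance the markings are small and hard to make out, which makes them harder to detect. Second, the road carries many other markings, such as lane lines and pedestrian crossings, which adds to the difficulty.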

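Perhaps we can sidestep some of these technical challenges by thinking differently. What is a parking space, really? It is simply a place where a car stays parked for a while. So we may not need to detect parking spaces at all: we can detect cars that stay put for a long time and assume that the places they occupy are parking spaces.

So if we can detect cars and work out which ones do not move between video frames, we can infer where the parking spaces are.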
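
As a rough illustration of "cars that do not move between frames", the sketch below compares the bounding boxes from two frames and keeps those that barely shift. It is not the article's code: the helper name find_stationary_boxes, the (y1, x1, y2, x2) box layout, and the 5-pixel tolerance are all assumptions made for this example.

```python
def find_stationary_boxes(prev_boxes, curr_boxes, tolerance=5):
    """Return the boxes in curr_boxes that also appear, nearly unchanged, in prev_boxes.

    Boxes are (y1, x1, y2, x2) pixel coordinates. `tolerance` is a hypothetical
    threshold: the largest per-coordinate shift still counted as "not moving".
    """
    stationary = []
    for box in curr_boxes:
        for prev in prev_boxes:
            # Largest shift of any of the four coordinates between the two frames
            max_shift = max(abs(a - b) for a, b in zip(box, prev))
            if max_shift <= tolerance:
                stationary.append(tuple(box))
                break
    return stationary


# Made-up boxes: the first car stays put, the second drives forward
frame_1 = [(100, 50, 180, 200), (100, 300, 180, 450)]
frame_2 = [(101, 51, 181, 199), (100, 380, 180, 530)]
parked = find_stationary_boxes(frame_1, frame_2)
print(parked)
```

Only the first box survives, so its location would be treated as a parking space. A real system would compare more than two frames before trusting the result.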

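II. Detecting Cars in an Image

Detecting cars in video is a classic object detection problem, and many machine learning approaches can handle it. Some of the most common object detection algorithms are: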

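1. Use a HOG (Histogram of Oriented Gradients) object detector to find all the cars. This non-deep-learning approach is relatively fast to run, but it cannot cope with cars rotated in different orientations.

2. Use a CNN (Convolutional Neural Network) object detector to find all the cars. This approach is accurate but not very efficient, because the same image has to be scanned multiple times to find every car. It handles cars in different orientations easily, but it needs more training data than a HOG-based detector.

3. Use a newer deep learning approach such as Mask R-CNN, Faster R-CNN, or YOLO, which combines accuracy with efficiency and greatly speeds up detection. Given plenty of data to train the model, it also runs fast on a GPU.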

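In general, we want the simplest solution that works with the least training data, rather than reaching for whatever algorithm is new and popular. In this particular case, though, Mask R-CNN is a reasonable choice.

The Mask R-CNN architecture detects objects across the whole image without a sliding window, so it runs quickly. With a GPU, we can detect cars in high-resolution video at several frames per second.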

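Mask R-CNN also gives us a lot of information about each detection. Most object detection algorithms return only a bounding box for each object, but Mask R-CNN gives us both the location of each object and its outline, like this:

[Image: object outlines (masks) produced by Mask R-CNN]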

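To train a Mask R-CNN model, we need many images of the kind of object we want to detect. We could spend a few days outside taking photos, but public datasets of car images already exist. One popular dataset is COCO (short for Common Objects in Context), which already contains more than 12,000 images of cars. Here is an image from the COCO dataset:

[Image: sample image from the COCO dataset]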

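This data is well suited to training a Mask R-CNN model, and many people have already trained on the COCO dataset and shared the results. So we can start from a pre-trained model; this project uses Matterport's open-source implementation.

[Image: objects detected in the camera image by the pre-trained model]

Applied to our camera image, the pre-trained model identifies not only cars but also traffic lights and people. Amusingly, it labels one of the trees as a "potted plant". For each object detected in the image, the Mask R-CNN model gives us four things: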

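(1) The class of the object. The COCO model can recognize 80 different kinds of objects, such as cars and trucks.

(2) The confidence score of the detection. The higher the number, the more certain the model is that it has identified the object correctly.

(3) The bounding box of the object in the image, given as X/Y pixel positions.

(4) A bitmap "mask" that marks which pixels inside the bounding box belong to the object and which do not. From the mask data we can also work out the object's outline.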

Here is the Python code that uses Matterport's pre-trained Mask R-CNN model and OpenCV to detect car bounding boxes:

import os
import numpy as np
import cv2
import mrcnn.config
import mrcnn.utils
from mrcnn.model import MaskRCNN
from pathlib import Path


# Configuration that will be used by the Mask-RCNN library
class MaskRCNNConfig(mrcnn.config.Config):
    NAME = "coco_pretrained_model_config"
    IMAGES_PER_GPU = 1
    GPU_COUNT = 1
    NUM_CLASSES = 1 + 80  # COCO dataset has 80 classes + one background class
    DETECTION_MIN_CONFIDENCE = 0.6


# Filter a list of Mask R-CNN detection results to get only the detected cars / buses / trucks
def get_car_boxes(boxes, class_ids):
    car_boxes = []

    for i, box in enumerate(boxes):
        # Keep the box only if the class is car (3), bus (6) or truck (8)
        if class_ids[i] in [3, 8, 6]:
            car_boxes.append(box)

    # Always return an N x 4 array, even when no vehicles were found
    return np.array(car_boxes).reshape(-1, 4)


# Root directory of the project
ROOT_DIR = Path(".")

# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")

# Local path to trained weights file
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")

# Download COCO trained weights from Releases if needed
if not os.path.exists(COCO_MODEL_PATH):
    mrcnn.utils.download_trained_weights(COCO_MODEL_PATH)

# Directory of images to run detection on
IMAGE_DIR = os.path.join(ROOT_DIR, "images")

# Video file or camera to process - set this to 0 to use your webcam instead of a video file
VIDEO_SOURCE = "test_images/parking.mp4"

# Create a Mask-RCNN model in inference mode
model = MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=MaskRCNNConfig())

# Load pre-trained model
model.load_weights(COCO_MODEL_PATH, by_name=True)

# Location of parking spaces
parked_car_boxes = None

# Load the video file we want to run detection on
video_capture = cv2.VideoCapture(VIDEO_SOURCE)

# Loop over each frame of video
while video_capture.isOpened():
    success, frame = video_capture.read()
    if not success:
        break

    # Convert the image from BGR color (which OpenCV uses) to RGB color
    rgb_image = frame[:, :, ::-1]

    # Run the image through the Mask R-CNN model to get results.
    results = model.detect([rgb_image], verbose=0)

    # Mask R-CNN assumes we are running detection on multiple images.
    # We only passed in one image to detect, so only grab the first result.
    r = results[0]

    # The r variable will now have the results of detection:
    # - r['rois'] are the bounding box of each detected object
    # - r['class_ids'] are the class id (type) of each detected object
    # - r['scores'] are the confidence scores for each detection
    # - r['masks'] are the object masks for each detected object (which gives you the object outline)

    # Filter the results to only grab the car / bus / truck bounding boxes
    car_boxes = get_car_boxes(r['rois'], r['class_ids'])

    print("Cars found in frame of video:")

    # Draw each box on the frame
    for box in car_boxes:
        print("Car: ", box)

        y1, x1, y2, x2 = box

        # Draw the box
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 1)

    # Show the frame of video on the screen
    cv2.imshow('Video', frame)

    # Hit 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Clean up everything when finished
video_capture.release()
cv2.destroyAllWindows()

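Running this script shows the cars detected in the image, each with a box drawn around it:

[Image: video frame with a box around each detected car]

It also prints the pixel coordinates of each detected car:

[Image: console output listing the pixel coordinates of each car]

With that, we have successfully detected cars in the image. On to the next step.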

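III. Detecting Empty Parking Spaces

Once we know the pixel location of each car in the image, looking at several consecutive frames makes it easy to work out which cars have not moved. But how do we detect when a car leaves a parking space? On inspection, the bounding boxes of the cars in the image overlap somewhat:

[Image: overlapping bounding boxes of neighbouring cars]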

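If we assume that each bounding box represents a parking space, a box may still be partly covered by a neighbouring car even after the car parked in it has driven away. So we need a way to measure overlap and pick out the boxes that are "mostly empty". The measure we use is called Intersection over Union (IoU). It is calculated by dividing the number of pixels where two objects overlap by the number of pixels covered by the two objects together:

[Image: illustration of Intersection over Union]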

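With this value, it is easy to decide whether a car is in a parking space. A low IoU, such as 0.15, means the car does not occupy much of the space. A high IoU, such as 0.6, means the car occupies most of the space, so we can be sure the space is taken.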
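
To make the measure concrete, here is a minimal sketch of IoU for two axis-aligned boxes. The function name box_iou and the sample coordinates are made up for illustration; the (y1, x1, y2, x2) layout is assumed to match the box order used by the scripts in this article.

```python
def box_iou(box_a, box_b):
    """Intersection over Union of two boxes given as (y1, x1, y2, x2)."""
    ay1, ax1, ay2, ax2 = box_a
    by1, bx1, by2, bx2 = box_b

    # Height and width of the overlapping rectangle (zero if the boxes are disjoint)
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    intersection = inter_h * inter_w

    area_a = (ay2 - ay1) * (ax2 - ax1)
    area_b = (by2 - by1) * (bx2 - bx1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


parking_space = (0, 0, 100, 200)
same_car = (0, 0, 100, 200)         # a car still parked in the space
neighbour_car = (0, 180, 100, 380)  # a car in the next space, slightly overlapping

iou_same = box_iou(parking_space, same_car)
iou_neighbour = box_iou(parking_space, neighbour_car)
print(iou_same, round(iou_neighbour, 3))
```

The parked car scores 1.0, while the neighbouring car scores about 0.05, well under the 0.15 level mentioned above, so it would not count as occupying the space.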

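IoU is a common measurement in computer vision, so ready-made code is available. Matterport's Mask R-CNN library provides it as the function mrcnn.utils.compute_overlaps(). Assuming we have a list of bounding boxes that represent parking spaces, checking whether the detected cars fall inside those boxes takes only a line or two of code: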

    # Filter the results to only grab the car / bus / truck bounding boxes
    car_boxes = get_car_boxes(r['rois'], r['class_ids'])

    # See how much the cars overlap with the known parking spaces.
    # Rows correspond to parking spaces and columns to detected cars.
    overlaps = mrcnn.utils.compute_overlaps(parking_areas, car_boxes)

    print(overlaps)

The output looks like this:

[Image: the printed overlap array]

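In this 2D array, each row represents one parking space bounding box, and each column shows how much of that space is covered by one of the detected cars. A score of 1.0 means the car occupies the space completely, while a low score such as 0.02 means the car touches the space but does not occupy most of it.

To find unoccupied parking spaces, we only need to check each row of this array. If every number in a row is zero or very small, nothing is occupying that space, so it must be free.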
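
The row check can be sketched in a few lines. In this example a plain nested list stands in for the array returned by compute_overlaps, the numbers are made up, and find_free_spaces is a hypothetical helper; the 0.15 threshold is the one used later in the full script.

```python
def find_free_spaces(overlaps, threshold=0.15):
    """Return the row indexes (parking spaces) that no car covers by `threshold` or more."""
    free = []
    for index, row in enumerate(overlaps):
        # Highest IoU between this space and any detected car (0.0 if no cars at all)
        best = max(row, default=0.0)
        if best < threshold:
            free.append(index)
    return free


# Rows are parking spaces, columns are detected cars (illustrative values)
overlaps = [
    [1.00, 0.07, 0.00],  # space 0: fully covered by the first car
    [0.02, 0.00, 0.00],  # space 1: barely touched, so it is free
    [0.00, 0.90, 0.05],  # space 2: mostly covered by the second car
]
free_spaces = find_free_spaces(overlaps)
print(free_spaces)
```

Only space 1 is reported as free, because every value in its row stays below the threshold.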

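Although Mask R-CNN is very accurate, object detection is never perfect, and it occasionally misses a car or two in a single frame of video. So before flagging a space as free, we should check that it stays free for a little while, say 5 or 10 consecutive frames. This also guards against false alerts caused by a momentary glitch in the video. As soon as a space has been free for several consecutive frames, we send the alert!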
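
The "several consecutive frames" rule amounts to a small counter that resets whenever the space looks occupied again. The class below is a sketch under that assumption; the name FreeSpaceDebouncer and the three-frame requirement in the usage lines are chosen only to keep the example short.

```python
class FreeSpaceDebouncer:
    """Confirm a free space only after it was seen free in enough consecutive frames."""

    def __init__(self, required_frames=10):
        self.required_frames = required_frames
        self.consecutive_free = 0

    def update(self, space_is_free):
        # Count consecutive free frames; any occupied frame resets the count
        if space_is_free:
            self.consecutive_free += 1
        else:
            self.consecutive_free = 0
        return self.consecutive_free >= self.required_frames


# One missed detection (False) in the middle restarts the count
observations = [True, True, False, True, True, True]
debouncer = FreeSpaceDebouncer(required_frames=3)
confirmed = [debouncer.update(seen) for seen in observations]
print(confirmed)
```

Only the last frame confirms the space, since it is the first point at which three free frames in a row have been seen.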

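IV. Sending the Message

The last step is to send an SMS alert. Sending an SMS from Python with Twilio is very simple and takes only a few lines of code. Twilio is just the method used in this project; you can use another service instead.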

To use Twilio, sign up for a trial account, create a Twilio phone number, and get your account credentials. Then install the Twilio Python client library:

pip3 install twilio

Here is the Python code for sending an SMS message (replace the values with your own account details):

from twilio.rest import Client

# Twilio account details
twilio_account_sid = 'Your Twilio SID here'
twilio_auth_token = 'Your Twilio Auth Token here'
twilio_source_phone_number = 'Your Twilio phone number here'

# Create a Twilio client object instance
client = Client(twilio_account_sid, twilio_auth_token)

# Send an SMS
message = client.messages.create(
    body="This is my SMS message!",
    from_=twilio_source_phone_number,
    to="Destination phone number here"
)

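When adding SMS sending to the script, take care not to keep sending alerts for a free space that has already been reported. A flag can track whether a message has already been sent, so that a new one goes out only after a set period of time has passed or when a new free space is detected.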
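
One possible shape for that flag is sketched below: it remembers which spaces have already triggered an alert and when the last alert went out. The class name AlertThrottle, the integer space IDs, and the 600-second cooldown are assumptions for illustration; the full script later in this article uses a simpler single sms_sent flag.

```python
class AlertThrottle:
    """Decide whether a new alert is due for the currently free spaces."""

    def __init__(self, cooldown_seconds=600):
        self.cooldown_seconds = cooldown_seconds
        self.last_sent_at = None
        self.alerted_spaces = set()

    def should_send(self, free_spaces, now):
        free_spaces = set(free_spaces)
        # Forget spaces that have been taken again since the last alert
        self.alerted_spaces &= free_spaces
        new_spaces = free_spaces - self.alerted_spaces
        cooled_down = (
            self.last_sent_at is not None
            and now - self.last_sent_at >= self.cooldown_seconds
        )
        # Alert for a newly free space, or repeat the reminder after the cooldown
        if new_spaces or (free_spaces and cooled_down):
            self.alerted_spaces |= free_spaces
            self.last_sent_at = now
            return True
        return False


throttle = AlertThrottle(cooldown_seconds=600)
decisions = [
    throttle.should_send({2}, now=0),       # first free space
    throttle.should_send({2}, now=30),      # same space, too soon to repeat
    throttle.should_send({2, 5}, now=60),   # a new space opened up
    throttle.should_send({2, 5}, now=700),  # cooldown elapsed, remind again
    throttle.should_send(set(), now=800),   # nothing free
]
print(decisions)
```

In a live script, `now` could come from time.monotonic(), and a True result would be the point at which client.messages.create() is called.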

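V. Putting It All Together

Now let's combine every step into a single Python script. Here is the complete code. To run it, you need Python 3.6+, Matterport's Mask R-CNN, OpenCV, and the Twilio client library: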

import os
import numpy as np
import cv2
import mrcnn.config
import mrcnn.utils
from mrcnn.model import MaskRCNN
from pathlib import Path
from twilio.rest import Client


# Configuration that will be used by the Mask-RCNN library
class MaskRCNNConfig(mrcnn.config.Config):
    NAME = "coco_pretrained_model_config"
    IMAGES_PER_GPU = 1
    GPU_COUNT = 1
    NUM_CLASSES = 1 + 80  # COCO dataset has 80 classes + one background class
    DETECTION_MIN_CONFIDENCE = 0.6


# Filter a list of Mask R-CNN detection results to get only the detected cars / buses / trucks
def get_car_boxes(boxes, class_ids):
    car_boxes = []

    for i, box in enumerate(boxes):
        # Keep the box only if the class is car (3), bus (6) or truck (8)
        if class_ids[i] in [3, 8, 6]:
            car_boxes.append(box)

    # Always return an N x 4 array, even when no vehicles were found
    return np.array(car_boxes).reshape(-1, 4)


# Twilio config
twilio_account_sid = 'YOUR_TWILIO_SID'
twilio_auth_token = 'YOUR_TWILIO_AUTH_TOKEN'
twilio_phone_number = 'YOUR_TWILIO_SOURCE_PHONE_NUMBER'
destination_phone_number = 'THE_PHONE_NUMBER_TO_TEXT'
client = Client(twilio_account_sid, twilio_auth_token)


# Root directory of the project
ROOT_DIR = Path(".")

# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")

# Local path to trained weights file
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")

# Download COCO trained weights from Releases if needed
if not os.path.exists(COCO_MODEL_PATH):
    mrcnn.utils.download_trained_weights(COCO_MODEL_PATH)

# Directory of images to run detection on
IMAGE_DIR = os.path.join(ROOT_DIR, "images")

# Video file or camera to process - set this to 0 to use your webcam instead of a video file
VIDEO_SOURCE = "test_images/parking.mp4"

# Create a Mask-RCNN model in inference mode
model = MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=MaskRCNNConfig())

# Load pre-trained model
model.load_weights(COCO_MODEL_PATH, by_name=True)

# Location of parking spaces
parked_car_boxes = None

# Load the video file we want to run detection on
video_capture = cv2.VideoCapture(VIDEO_SOURCE)

# How many frames of video we've seen in a row with a parking space open
free_space_frames = 0

# Have we sent an SMS alert yet?
sms_sent = False

# Loop over each frame of video
while video_capture.isOpened():
    success, frame = video_capture.read()
    if not success:
        break

    # Convert the image from BGR color (which OpenCV uses) to RGB color
    rgb_image = frame[:, :, ::-1]

    # Run the image through the Mask R-CNN model to get results.
    results = model.detect([rgb_image], verbose=0)

    # Mask R-CNN assumes we are running detection on multiple images.
    # We only passed in one image to detect, so only grab the first result.
    r = results[0]

    # The r variable will now have the results of detection:
    # - r['rois'] are the bounding box of each detected object
    # - r['class_ids'] are the class id (type) of each detected object
    # - r['scores'] are the confidence scores for each detection
    # - r['masks'] are the object masks for each detected object (which gives you the object outline)

    if parked_car_boxes is None:
        # This is the first frame of video - assume all the cars detected are in parking spaces.
        # Save the location of each car as a parking space box and go to the next frame of video.
        parked_car_boxes = get_car_boxes(r['rois'], r['class_ids'])
    else:
        # We already know where the parking spaces are. Check if any are currently unoccupied.

        # Get where cars are currently located in the frame
        car_boxes = get_car_boxes(r['rois'], r['class_ids'])

        # See how much those cars overlap with the known parking spaces
        overlaps = mrcnn.utils.compute_overlaps(parked_car_boxes, car_boxes)

        # Assume no spaces are free until we find one that is free
        free_space = False

        # Loop through each known parking space box
        for parking_area, overlap_areas in zip(parked_car_boxes, overlaps):

            # For this parking space, find the max amount it was covered by any
            # car that was detected in our image (doesn't really matter which car).
            # If no cars were detected at all, treat the overlap as zero.
            max_IoU_overlap = np.max(overlap_areas) if overlap_areas.size else 0.0

            # Get the top-left and bottom-right coordinates of the parking area
            y1, x1, y2, x2 = parking_area

            # Check if the parking space is occupied by seeing if any car overlaps
            # it by more than 0.15 using IoU
            if max_IoU_overlap < 0.15:
                # Parking space not occupied! Draw a green box around it
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 3)
                # Flag that we have seen at least one open space
                free_space = True
            else:
                # Parking space is still occupied - draw a red box around it
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), 1)

            # Write the IoU measurement inside the box
            font = cv2.FONT_HERSHEY_DUPLEX
            cv2.putText(frame, f"{max_IoU_overlap:0.2}", (x1 + 6, y2 - 6), font, 0.3, (255, 255, 255))

        # If at least one space was free, start counting frames.
        # This is so we don't alert based on one frame of a spot being open.
        # This helps prevent the script from being triggered by one bad detection.
        if free_space:
            free_space_frames += 1
        else:
            # If no spots are free, reset the count
            free_space_frames = 0

        # If a space has been free for several frames, we are pretty sure it is really free!
        if free_space_frames > 10:
            # Write SPACE AVAILABLE! at the top of the screen
            font = cv2.FONT_HERSHEY_DUPLEX
            cv2.putText(frame, "SPACE AVAILABLE!", (10, 150), font, 3.0, (0, 255, 0), 2, cv2.FILLED)

            # If we haven't sent an SMS yet, send it!
            if not sms_sent:
                print("SENDING SMS!!!")
                message = client.messages.create(
                    body="Parking space open - go go go!",
                    from_=twilio_phone_number,
                    to=destination_phone_number
                )
                sms_sent = True

        # Show the frame of video on the screen
        cv2.imshow('Video', frame)

    # Hit 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Clean up everything when finished
video_capture.release()
cv2.destroyAllWindows()

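This code is kept fairly short and implements only the basic functionality. Feel free to modify it for different scenarios. Changing just a few model parameters can produce a completely different result, so let your imagination run free across different applications!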

Original article:
https://medium.com/@ageitgey/snagging-parking-spaces-with-mask-r-cnn-and-python-955f2231c400

(This article was translated and compiled by AI科技大本營. To republish it, contact WeChat 1092722531.)
