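Stickyworld's web app has supported video playback for some time, but only through YouTube embeds. We have started rolling out a new version with our own video handling, so that our users no longer depend on YouTube's service.
I once worked on a project where the client needed video transcoding, and it was not an easy requirement to meet. It takes a lot of work to read every video, audio and container format and then output video formats that suit the web.
With that in mind, we decided to hand the transcoding work to Encoding.com. The service lets you encode 1GB of video for free; beyond 1GB, tiered pricing applies.
The code we developed is below. I uploaded a two-second, 178KB video to check that the code worked. Once that test ran without raising any exceptions, I went on to test with other, larger external files.
Stage 1: The user uploads a video file
The new snippet below provides a quick, HTML5-based upload mechanism. It is written in CoffeeScript and uploads a file from the client to the server.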

    $scope.upload_slide = (upload_slide_form) ->
        file = document.getElementById("slide_file").files[0]
        reader = new FileReader()
        reader.readAsDataURL file

        reader.onload = (event) ->
            result = event.target.result
            fileName = document.getElementById("slide_file").files[0].name
            $.post "/world/upload_slide",
                data: result
                name: fileName
                room_id: $scope.room.id
                (response_data) ->
                    # `is not yes` parses as `is (not yes)`, so this branch
                    # runs when the `success` key is missing from the reply.
                    if response_data.success? is not yes
                        console.error "There was an error uploading the file", response_data
                    else
                        console.log "Upload successful", response_data

        reader.onloadstart = ->
            console.log "onloadstart"

        reader.onprogress = (event) ->
            console.log "onprogress", event.total, event.loaded, (event.loaded / event.total) * 100

        reader.onabort = ->
            console.log "onabort"

        reader.onerror = ->
            console.log "onerror"

        reader.onloadend = (event) ->
            console.log "onloadend", event

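Ideally you would loop over every entry in the "slide_file" input's files list and upload each file in its own POST, rather than sending all the files in a single POST request. We'll explain why later.
Stage 2: Validate and upload to Amazon S3
On the back end we run Django and RabbitMQ. The main packages are: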

    $ pip install 'Django>=1.5.2' 'django-celery>=3.0.21' \
          'django-storages>=1.1.8' 'lxml>=3.2.3' 'python-magic>=0.4.3'

I created two models: SlideUploadQueue stores the data for each upload, and SlideVideoMedia stores the data for each video file.

    from datetime import datetime

    from django.contrib.auth.models import User
    from django.db import models
    import pytz

    # MediaType is another project model that is not shown here;
    # filename_sanitiser appears further down.


    class SlideUploadQueue(models.Model):
        created_by = models.ForeignKey(User)
        created_time = models.DateTimeField(db_index=True)
        original_file = models.FileField(
            upload_to=filename_sanitiser, blank=True, default='')
        media_type = models.ForeignKey(MediaType)
        encoding_com_tracking_code = models.CharField(
            default='', max_length=24, blank=True)

        STATUS_AWAITING_DATA = 0
        STATUS_AWAITING_PROCESSING = 1
        STATUS_PROCESSING = 2
        STATUS_AWAITING_3RD_PARTY_PROCESSING = 5
        STATUS_FINISHED = 3
        STATUS_FAILED = 4

        STATUS_LIST = (
            (STATUS_AWAITING_DATA, 'Awaiting Data'),
            (STATUS_AWAITING_PROCESSING, 'Awaiting processing'),
            (STATUS_PROCESSING, 'Processing'),
            (STATUS_AWAITING_3RD_PARTY_PROCESSING,
                'Awaiting 3rd-party processing'),
            (STATUS_FINISHED, 'Finished'),
            (STATUS_FAILED, 'Failed'),
        )

        status = models.PositiveSmallIntegerField(
            default=STATUS_AWAITING_DATA, choices=STATUS_LIST)

        class Meta:
            verbose_name = 'Slide'
            verbose_name_plural = 'Slide upload queue'

        def save(self, *args, **kwargs):
            # Stamp new rows with a timezone-aware UTC creation time.
            if not self.created_time:
                self.created_time = \
                    datetime.utcnow().replace(tzinfo=pytz.utc)
            return super(SlideUploadQueue, self).save(*args, **kwargs)

        def __unicode__(self):
            if self.id is None:
                return 'new <SlideUploadQueue>'
            return '<SlideUploadQueue> %d' % self.id


    class SlideVideoMedia(models.Model):
        converted_file = models.FileField(
            upload_to=filename_sanitiser, blank=True, default='')

        FORMAT_MP4 = 0
        FORMAT_WEBM = 1
        FORMAT_OGG = 2
        FORMAT_FL9 = 3
        FORMAT_THUMB = 4

        supported_formats = (
            (FORMAT_MP4, 'MPEG 4'),
            (FORMAT_WEBM, 'WebM'),
            (FORMAT_OGG, 'OGG'),
            (FORMAT_FL9, 'Flash 9 Video'),
            (FORMAT_THUMB, 'Thumbnail'),
        )

        mime_types = (
            (FORMAT_MP4, 'video/mp4'),
            (FORMAT_WEBM, 'video/webm'),
            (FORMAT_OGG, 'video/ogg'),
            (FORMAT_FL9, 'video/mp4'),
            (FORMAT_THUMB, 'image/jpeg'),
        )

        format = models.PositiveSmallIntegerField(
            default=FORMAT_MP4, choices=supported_formats)

        class Meta:
            verbose_name = 'Slide video'
            verbose_name_plural = 'Slide videos'

        def __unicode__(self):
            if self.id is None:
                return 'new <SlideVideoMedia>'
            return '<SlideVideoMedia> %d' % self.id

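Both models use filename_sanitiser. The FileField automatically rewrites each file name into the form <model>/<uuid4>.<extension>, which sanitises every file name and guarantees that it is unique. We serve the files through time-limited signed URLs, which lets us control who can use our service and for how long.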

    import uuid


    def filename_sanitiser(instance, filename):
        folder = instance.__class__.__name__.lower()
        ext = 'jpg'  # fallback when the name has no usable extension

        if '.' in filename:
            t_ext = filename.split('.')[-1].strip().lower()

            if t_ext != '':
                ext = t_ext

        return '%s/%s.%s' % (folder, str(uuid.uuid4()), ext)

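To see what the sanitiser produces without a Django project, here is a minimal standalone sketch in Python 3 that runs the same logic. The empty SlideUploadQueue class is a hypothetical stand-in for the real model; only its class name is used.

```python
import uuid


def filename_sanitiser(instance, filename):
    """Same logic as the article: <model name>/<uuid4>.<extension>."""
    folder = instance.__class__.__name__.lower()
    ext = 'jpg'  # fallback when the upload has no usable extension
    if '.' in filename:
        t_ext = filename.split('.')[-1].strip().lower()
        if t_ext != '':
            ext = t_ext
    return '%s/%s.%s' % (folder, str(uuid.uuid4()), ext)


class SlideUploadQueue(object):
    """Hypothetical stand-in for the Django model; only the name matters."""


path = filename_sanitiser(SlideUploadQueue(), 'Testing.MOV')
folder, name = path.split('/')
stem, ext = name.rsplit('.', 1)
print(path)  # the UUID part differs on every run
```

The user's original file name never reaches storage. Only the lower-cased extension survives, so two uploads named testing.mov cannot collide.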
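Our test file, testing.mov, ends up at the following URL: https://our-bucket.s3.amazonaws.com/slideuploadqueue/3fe27193-e87f-4244-9aa2-66409f70ebd3.mov and is uploaded through the Django Storages module.
We use Magic to validate the files uploaded from the user's browser. Magic detects a file's type from its contents.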

    import base64

    import magic


    @verify_auth_token
    @return_json
    def upload_slide(request):
        file_data = request.POST.get('data', '')
        # The browser sends a data URL; keep only the base64 payload.
        file_data = base64.b64decode(file_data.split(';base64,')[1])
        description = magic.from_buffer(file_data)

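If the file type matches "MPEG v4 system" or "Apple QuickTime movie", we know the file should transcode without much trouble. If the format is neither of those, we flag it to the user.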
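The decode-and-check step can be sketched in isolation. This is a minimal Python 3 sketch using only the standard library. The helper names and the substring comparison are assumptions for illustration, because the article does not show how the description is compared. In the real view the description string would come from magic.from_buffer.

```python
import base64

# Descriptions the article treats as safe to transcode.
ACCEPTED_DESCRIPTIONS = ('MPEG v4 system', 'Apple QuickTime movie')


def decode_data_url(data_url):
    """Split a FileReader data URL and return the raw bytes."""
    header, separator, payload = data_url.partition(';base64,')
    if not separator:
        raise ValueError('not a base64 data URL')
    return base64.b64decode(payload)


def is_transcodable(description):
    """True when a file-type description mentions an accepted type."""
    return any(accepted in description
               for accepted in ACCEPTED_DESCRIPTIONS)


# Build a tiny data URL the way the browser's FileReader would.
data_url = ('data:video/quicktime;base64,' +
            base64.b64encode(b'demo').decode('ascii'))
raw = decode_data_url(data_url)
print(len(raw), is_transcodable('ISO Media, Apple QuickTime movie'))
```

Using partition instead of split(...)[1] turns a malformed upload into a clear ValueError rather than an IndexError.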
Next, we save the video to the queue through the SlideUploadQueue model and send a request to RabbitMQ. Because we use the Django Storages module, the file is uploaded to Amazon S3 automatically.

    slide_upload = SlideUploadQueue()
    ...
    slide_upload.status = SlideUploadQueue.STATUS_AWAITING_PROCESSING
    slide_upload.save()

    slide_upload.original_file.\
        save('anything.%s' % file_ext, ContentFile(file_data))
    slide_upload.save()

    task = ConvertRawSlideToSlide()
    task.delay(slide_upload)

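Stage 3: Send the video to a third party
RabbitMQ handles the task.delay(slide_upload) call.
All we need to do now is send Encoding.com the video's URL and the output formats we want. The service replies with a job ID that we can use to check on the transcoding progress.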

    class ConvertRawSlideToSlide(Task):
        queue = 'backend_convert_raw_slides'
        ...
        def _handle_video(self, slide_upload):
            mp4 = {
                'output': 'mp4',
                'size': '320x240',
                'bitrate': '256k',
                'audio_bitrate': '64k',
                'audio_channels_number': '2',
                'keep_aspect_ratio': 'yes',
                'video_codec': 'mpeg4',
                'profile': 'main',
                'vcodecparameters': 'no',
                'audio_codec': 'libfaac',
                'two_pass': 'no',
                'cbr': 'no',
                'deinterlacing': 'no',
                'keyframe': '300',
                'audio_volume': '100',
                'file_extension': 'mp4',
                'hint': 'no',
            }

            webm = {
                'output': 'webm',
                'size': '320x240',
                'bitrate': '256k',
                'audio_bitrate': '64k',
                'audio_sample_rate': '44100',
                'audio_channels_number': '2',
                'keep_aspect_ratio': 'yes',
                'video_codec': 'libvpx',
                'profile': 'baseline',
                'vcodecparameters': 'no',
                'audio_codec': 'libvorbis',
                'two_pass': 'no',
                'cbr': 'no',
                'deinterlacing': 'no',
                'keyframe': '300',
                'audio_volume': '100',
                'preset': '6',
                'file_extension': 'webm',
                'acbr': 'no',
            }

            ogg = {
                'output': 'ogg',
                'size': '320x240',
                'bitrate': '256k',
                'audio_bitrate': '64k',
                'audio_sample_rate': '44100',
                'audio_channels_number': '2',
                'keep_aspect_ratio': 'yes',
                'video_codec': 'libtheora',
                'profile': 'baseline',
                'vcodecparameters': 'no',
                'audio_codec': 'libvorbis',
                'two_pass': 'no',
                'cbr': 'no',
                'deinterlacing': 'no',
                'keyframe': '300',
                'audio_volume': '100',
                'file_extension': 'ogg',
                'acbr': 'no',
            }

            flv = {
                'output': 'fl9',
                'size': '320x240',
                'bitrate': '256k',
                'audio_bitrate': '64k',
                'audio_channels_number': '2',
                'keep_aspect_ratio': 'yes',
                'video_codec': 'libx264',
                'profile': 'high',
                'vcodecparameters': 'no',
                'audio_codec': 'libfaac',
                'two_pass': 'no',
                'cbr': 'no',
                'deinterlacing': 'no',
                'keyframe': '300',
                'audio_volume': '100',
                'file_extension': 'mp4',
            }

            thumbnail = {
                'output': 'thumbnail',
                'time': '5',
                'video_codec': 'mjpeg',
                'keep_aspect_ratio': 'yes',
                'file_extension': 'jpg',
            }

            encoder = Encoding(settings.ENCODING_API_USER_ID,
                settings.ENCODING_API_USER_KEY)
            resp = encoder.add_media(source=[slide_upload.original_file.url],
                formats=[mp4, webm, ogg, flv, thumbnail])

            media_id = None

            if resp is not None and resp.get('response') is not None:
                media_id = resp.get('response').get('MediaID')

            if media_id is None:
                slide_upload.status = SlideUploadQueue.STATUS_FAILED
                slide_upload.save()
                log.error('Unable to communicate with encoding.com')
                return False

            slide_upload.encoding_com_tracking_code = media_id
            slide_upload.status = \
                SlideUploadQueue.STATUS_AWAITING_3RD_PARTY_PROCESSING
            slide_upload.save()
            return True

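Encoding.com recommends a serviceable Python wrapper for talking to their service. I have modified the module in a few places, but it still needs more work before I am happy with it. Below is the modified code we are currently using: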

    import httplib
    from lxml import etree
    import urllib
    from xml.parsers.expat import ExpatError

    import xmltodict

    ENCODING_API_URL = 'manage.encoding.com:80'


    class Encoding(object):

        def __init__(self, userid, userkey, url=ENCODING_API_URL):
            self.url = url
            self.userid = userid
            self.userkey = userkey

        def get_media_info(self, action='GetMediaInfo', ids=[],
            headers={'Content-Type': 'application/x-www-form-urlencoded'}):
            query = etree.Element('query')

            nodes = {
                'userid': self.userid,
                'userkey': self.userkey,
                'action': action,
                'mediaid': ','.join(ids),
            }

            query = self._build_tree(etree.Element('query'), nodes)
            results = self._execute_request(query, headers)

            return self._parse_results(results)

        def get_status(self, action='GetStatus', ids=[], extended='no',
            headers={'Content-Type': 'application/x-www-form-urlencoded'}):
            query = etree.Element('query')

            nodes = {
                'userid': self.userid,
                'userkey': self.userkey,
                'action': action,
                'extended': extended,
                'mediaid': ','.join(ids),
            }

            query = self._build_tree(etree.Element('query'), nodes)
            results = self._execute_request(query, headers)

            return self._parse_results(results)

        def add_media(self, action='AddMedia', source=[], notify='', formats=[],
            instant='no',
            headers={'Content-Type': 'application/x-www-form-urlencoded'}):
            query = etree.Element('query')

            nodes = {
                'userid': self.userid,
                'userkey': self.userkey,
                'action': action,
                'source': source,
                'notify': notify,
                'instant': instant,
            }

            query = self._build_tree(etree.Element('query'), nodes)

            for format in formats:
                format_node = self._build_tree(etree.Element('format'), format)
                query.append(format_node)

            results = self._execute_request(query, headers)
            return self._parse_results(results)

        def _build_tree(self, node, data):
            for k, v in data.items():
                if isinstance(v, list):
                    for item in v:
                        element = etree.Element(k)
                        element.text = item
                        node.append(element)
                else:
                    element = etree.Element(k)
                    element.text = v
                    node.append(element)

            return node

        def _execute_request(self, xml, headers, path='', method='POST'):
            params = urllib.urlencode({'xml': etree.tostring(xml)})

            conn = httplib.HTTPConnection(self.url)
            conn.request(method, path, params, headers)
            response = conn.getresponse()
            data = response.read()
            conn.close()
            return data

        def _parse_results(self, results):
            try:
                return xmltodict.parse(results)
            except ExpatError, e:
                print 'Error parsing encoding.com response'
                print e
                return None

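To make the request body concrete, the sketch below rebuilds the AddMedia XML with Python 3's standard-library ElementTree in place of lxml. The credentials and source URL are placeholders, and the notify and instant nodes are left out for brevity, so treat it as an illustration of the tree-building idea rather than the module itself.

```python
import xml.etree.ElementTree as ET


def build_tree(node, data):
    """Append one child per key; list values become repeated elements."""
    for key, value in data.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            child = ET.SubElement(node, key)
            child.text = item
    return node


def build_add_media_query(userid, userkey, source, formats):
    """Return the <query> document for an AddMedia call as a string."""
    query = build_tree(ET.Element('query'), {
        'userid': userid,
        'userkey': userkey,
        'action': 'AddMedia',
        'source': source,
    })
    # Each output format becomes its own <format> element.
    for fmt in formats:
        query.append(build_tree(ET.Element('format'), fmt))
    return ET.tostring(query, encoding='unicode')


payload = build_add_media_query(
    'example-id', 'example-key',            # placeholder credentials
    ['https://example.com/video.mov'],      # placeholder source URL
    [{'output': 'mp4', 'size': '320x240'}])
print(payload)
```

Every value must already be a string, which is why the format dictionaries in the task above quote numbers such as '300' and '2'.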
Other outstanding work includes talking to Encoding.com over HTTPS only, with strict SSL verification, and writing some unit tests.
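One way the HTTPS-only item might look in Python 3, where http.client replaces httplib, is sketched here. It assumes that the API host accepts HTTPS on port 443, which the article does not confirm, and both function names are hypothetical.

```python
import http.client
import ssl


def strict_ssl_context():
    """Context that verifies the certificate chain and the hostname."""
    context = ssl.create_default_context()
    # The default context already enables both checks; setting them
    # explicitly documents the intent and guards against later edits.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


def open_connection(host, timeout=30):
    """Create an HTTPS connection object; nothing is sent until request()."""
    return http.client.HTTPSConnection(
        host, 443, timeout=timeout, context=strict_ssl_context())


# Assumed host, taken from ENCODING_API_URL without the port.
conn = open_connection('manage.encoding.com')
print(conn.host, conn.port)
```

With this context, a certificate that fails validation raises an ssl.SSLError when the request is made, instead of the connection silently falling back to plain HTTP.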
Stage 4: Download all the new video formats
We have a periodic task, run through RabbitMQ, that checks on the transcoding progress every 15 seconds:

    class CheckUpOnThirdParties(PeriodicTask):
        run_every = timedelta(seconds=settings.THIRD_PARTY_CHECK_UP_INTERVAL)
        ...
        def _handle_encoding_com(self, slides):
            format_lookup = {
                'mp4': SlideVideoMedia.FORMAT_MP4,
                'webm': SlideVideoMedia.FORMAT_WEBM,
                'ogg': SlideVideoMedia.FORMAT_OGG,
                'fl9': SlideVideoMedia.FORMAT_FL9,
                'thumbnail': SlideVideoMedia.FORMAT_THUMB,
            }

            encoder = Encoding(settings.ENCODING_API_USER_ID,
                settings.ENCODING_API_USER_KEY)

            job_ids = [item.encoding_com_tracking_code for item in slides]
            resp = encoder.get_status(ids=job_ids)

            if resp is None:
                log.error('Unable to check up on encoding.com')
                return False

We check Encoding.com's response to verify that every part of it is correct before we continue.

            if resp.get('response') is None:
                log.error('Unable to get response node from encoding.com')
                return False

            resp_id = resp.get('response').get('id')

            if resp_id is None:
                log.error('Unable to get media id from encoding.com')
                return False

            slide = SlideUploadQueue.objects.filter(
                status=SlideUploadQueue.STATUS_AWAITING_3RD_PARTY_PROCESSING,
                encoding_com_tracking_code=resp_id)

            if len(slide) != 1:
                log.error('Unable to find a single record for %s' % resp_id)
                return False

            resp_status = resp.get('response').get('status')

            if resp_status is None:
                log.error('Unable to get status from encoding.com')
                return False

            if resp_status != u'Finished':
                log.debug("%s isn't finished, will check back later" % resp_id)
                return True

            formats = resp.get('response').get('format')

            if formats is None:
                log.error("No output formats were found. Something's wrong.")
                return False

            for format in formats:
                try:
                    assert format.get('status') == u'Finished', \
                        "%s is not finished. Something's wrong." % format.get('id')

                    output = format.get('output')
                    assert output in ('mp4', 'webm', 'ogg', 'fl9',
                        'thumbnail'), 'Unknown output format %s' % output

                    s3_dest = format.get('s3_destination')
                    assert 'http://encoding.com.result.s3.amazonaws.com/'\
                        in s3_dest, 'Suspicious S3 url: %s' % s3_dest

                    https_link = \
                        'https://s3.amazonaws.com/encoding.com.result/%s' %\
                        s3_dest.split('/')[-1]
                    file_ext = https_link.split('.')[-1].strip()

                    assert len(file_ext) > 0,\
                        'Unable to get file extension from %s' % https_link

                    count = SlideVideoMedia.objects.filter(slide_upload=slide,
                        format=format_lookup[output]).count()

                    if count != 0:
                        print 'There is already a %s file for this slide' % output
                        continue

                    content = self.download_content(https_link)

                    assert content is not None,\
                        'There is no content for %s' % format.get('id')
                except AssertionError, e:
                    log.error('A format did not pass all assertions: %s' % e)
                    continue

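The response checks can be exercised without the network. The Python 3 sketch below works on a hand-written dictionary shaped like the parsed response. It is illustrative data, not a captured Encoding.com reply, and the helper names are hypothetical. It also handles one xmltodict behaviour worth knowing: a single <format> child is parsed as a dictionary, while several are parsed as a list.

```python
RESULT_PREFIX = 'http://encoding.com.result.s3.amazonaws.com/'


def as_list(value):
    """Normalise xmltodict output: one child is a dict, several a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def https_link_for(s3_dest):
    """Rewrite the result URL to the path-style HTTPS form used above."""
    if RESULT_PREFIX not in s3_dest:
        raise ValueError('Suspicious S3 url: %s' % s3_dest)
    return ('https://s3.amazonaws.com/encoding.com.result/%s'
            % s3_dest.split('/')[-1])


def finished_outputs(resp):
    """Map each finished output name to its HTTPS download link."""
    response = (resp or {}).get('response') or {}
    if response.get('status') != 'Finished':
        return {}
    links = {}
    for fmt in as_list(response.get('format')):
        if fmt.get('status') == 'Finished':
            links[fmt.get('output')] = https_link_for(
                fmt.get('s3_destination'))
    return links


# Hand-written sample with a single format, so 'format' is a dict.
sample = {'response': {'id': '1234', 'status': 'Finished', 'format': {
    'output': 'mp4', 'status': 'Finished',
    's3_destination': RESULT_PREFIX + 'abc.mp4'}}}
links = finished_outputs(sample)
print(links)
```

The sketch raises ValueError instead of using assert because assertions are skipped when Python runs with the -O flag, which would silently disable the URL check.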
At this point we have confirmed that everything is in order, so we can save the video files:

                media = SlideVideoMedia()
                media.format = format_lookup[output]
                media.converted_file.save('blah.%s' % file_ext, ContentFile(content))
                media.save()

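Stage 5: Play the video files with HTML5
We added a page with an HTML5 video element to our front end, and we use video.js, which has the best support across browsers, to display the video.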

    $ bower install video.js
    bower caching git://github.com/videojs/video.js-component.git
    bower cloning git://github.com/videojs/video.js-component.git
    bower fetching video.js
    bower checking out video.js#v4.0.3
    bower copying /home/mark/.bower/cache/video.js/5ab058cd60c5615aa38e8e706cd0f307
    bower installing video.js#4.0.3

Our index page includes the other files it depends on:

    !!! 5
    html(lang="en", class="no-js")
      head
        meta(http-equiv='Content-Type', content='text/html; charset=UTF-8')
        ...
        link(rel='stylesheet', type='text/css', href='/components/video-js-4.1.0/video-js.css')
        script(type='text/javascript', src='/components/video-js-4.1.0/video.js')

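In a template within our Angular.js/Jade-based framework, we include a <video> tag and its <source> child tags. Each video has a thumbnail, shown through the <video> tag's poster attribute, that we capture from the first few seconds of the video.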

    #main.span12
        video#example_video_1.video-js.vjs-default-skin(controls, preload="auto", width="640", height="264", poster="{{video_thumbnail}}", data-setup='{"example_option":true}', ng-show="videos")
            source(ng-repeat="video in videos", src="{{video.src}}", type="{{video.type}}")

Every video format we converted is listed as well, each in its own <source> tag. Video.js decides which format to play based on the browser the user is running.
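The article does not show how the template's videos and video_thumbnail values are assembled, so the following Python 3 sketch is only one plausible shape for that step. The function name and the input pairs are hypothetical. The MIME types mirror the mime_types tuple on SlideVideoMedia.

```python
# Output name -> MIME type, mirroring SlideVideoMedia.mime_types.
MIME_TYPES = {
    'mp4': 'video/mp4',
    'webm': 'video/webm',
    'ogg': 'video/ogg',
    'fl9': 'video/mp4',
    'thumbnail': 'image/jpeg',
}


def build_player_context(files):
    """Turn (output_name, url) pairs into the values the template reads."""
    context = {'videos': [], 'video_thumbnail': None}
    for output, url in files:
        if output == 'thumbnail':
            # The thumbnail feeds the poster attribute, not a <source>.
            context['video_thumbnail'] = url
        else:
            context['videos'].append(
                {'src': url, 'type': MIME_TYPES[output]})
    return context


player = build_player_context([
    ('mp4', 'https://example.com/a.mp4'),        # placeholder URLs
    ('thumbnail', 'https://example.com/a.jpg'),
    ('webm', 'https://example.com/a.webm'),
])
print(player)
```

Keeping the sources in a list preserves their order, which matters because browsers pick the first <source> whose type they can play.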
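We still have plenty of work to do, including building unit tests and hardening the code that talks to Encoding.com. If this work interests you, please get in touch.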