The problem with bulk iOS push notifications: the PushSharp author's post explains it in the most detail

Posted by 阿債 on 2015-12-04

Original: http://redth.codes/the-problem-with-apples-push-notification-ser/

The problem with Apple’s Push Notification Service… Solutions and Workarounds…

Push Notifications have intrigued me since Apple first introduced them in iOS years ago. RIM had been doing this for a while, but as a platform it never excited me. As soon as the documentation for their APNs protocol was released I started busily implementing a solution to send push notifications in C#. The first version of the protocol was already terrible, but I’m not going to harp on that as we’ve got a newer version that’s only slightly better to tear apart.

I’m the author of PushSharp (https://github.com/Redth/PushSharp) which is a .NET library to assist developers in delivering push notifications on as many platforms as possible (iOS, Android, Windows, Windows Phone, and HTML5). This library was the culmination of my previous efforts in individual libraries (APNS-Sharp and C2DM-Sharp mostly), and represents a more abstracted, standardized, easier way to support push notifications on all the platforms you may target as a developer.

This post is a chance for me to vent, to explore my frustrations with Apple’s APNS protocol, and hope that they somehow listen and change it. You’ll notice how there’s no complaining about the way the other platforms implement their protocols. This is because they aren’t terrible (though they aren’t perfect either).

Apple’s Enhanced Format for Push Notifications

I had great hopes that Apple finally fixed its protocol when they introduced the Enhanced format (basically v2 of their protocol). Both the original and the enhanced format are binary protocols. You can quickly see the differences between the two in the diagrams below:

Original Protocol:
[Figure: original binary protocol]

Enhanced Protocol:
[Figure: enhanced binary protocol]

You’ll notice that in the enhanced protocol there are two additional fields (besides the first byte being 1, indicating that it’s the new protocol, as opposed to 0 in the original). These are two very good additions to the protocol:

Identifier – Your own 4 byte data to uniquely identify the notification – this is important as it’s returned to you in the error response packet if there’s a problem.
Expiry – A point in time after which the message is no longer valid if it has not already been delivered and can be discarded from Apple’s servers.

In the original protocol, whenever you send Apple a notification that has a problem with it (maybe it’s too big, or it has an invalid device token or malformed payload, etc.), it simply closes the TCP connection without any warning. You are left to assume something is corrupt in your notification.
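To make the enhanced frame concrete, here is a minimal Python sketch of how one could be packed. Since the diagram did not survive, the field order (command, identifier, expiry, token length, token, payload length, payload) is taken from Apple's legacy binary interface as commonly documented; it adds up to the 301-byte maximum quoted later in this post. The function name, the placeholder token and the expiry value are all hypothetical.

```python
import json
import struct


def pack_enhanced_notification(identifier, expiry, token_hex, payload):
    """Build one enhanced-format (command 1) frame as bytes."""
    token = bytes.fromhex(token_hex)  # device tokens are 32 raw bytes
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(body) > 256:
        raise ValueError("payload exceeds the 256-byte limit")
    # "!" = network byte order, no padding
    # B = 1-byte command, I = 4-byte identifier, I = 4-byte expiry,
    # H = 2-byte token length
    header = struct.pack("!BIIH", 1, identifier, expiry, len(token))
    return header + token + struct.pack("!H", len(body)) + body


frame = pack_enhanced_notification(
    identifier=42,                 # echoed back in any error response
    expiry=1449187200,             # hypothetical UNIX timestamp
    token_hex="ab" * 32,           # placeholder, not a real device token
    payload={"aps": {"alert": "Hello"}},
)
print(len(frame), "bytes")
```

The identifier is the part that matters for everything that follows: it is the only handle you get for matching an error response to the notification that caused it.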

The enhanced protocol also handles bad notifications a bit differently. It still closes the connection when there’s an error, but before it does so, it sends back an error response packet. You can see the format in the diagram below, along with a list of status codes and their meanings:

Error Response Packet:
[Figure: error response packet]

Status Codes & Meanings
0 – No errors encountered
1 – Processing error
2 – Missing device token
3 – Missing topic
4 – Missing payload
5 – Invalid token size
6 – Invalid topic size
7 – Invalid payload size
8 – Invalid token
255 – None (unknown)

This packet is quite simple, with the first byte presumably indicating the protocol version, the second byte being the status or error code (Apple provides us with a list of status codes and what they mean – interestingly enough one of the status codes is ‘0 – No errors encountered’ – just keep this in mind for later). The final piece of info is the Identifier. This identifier will correspond to the identifier of the notification you sent which caused the error condition.
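Decoding that 6-byte packet could look roughly like the sketch below. The status table is copied from the list above; the function name is hypothetical, and the value 8 used for the first byte in the sample is an assumption based on Apple's legacy documentation, where that byte is described as a command rather than a version.

```python
import struct

STATUS_MEANINGS = {
    0: "No errors encountered",
    1: "Processing error",
    2: "Missing device token",
    3: "Missing topic",
    4: "Missing payload",
    5: "Invalid token size",
    6: "Invalid topic size",
    7: "Invalid payload size",
    8: "Invalid token",
    255: "None (unknown)",
}


def parse_error_response(packet):
    """Decode a 6-byte error response: first byte, status, identifier."""
    if len(packet) != 6:
        raise ValueError("an error response is exactly 6 bytes")
    first_byte, status, identifier = struct.unpack("!BBI", packet)
    return {
        "first_byte": first_byte,
        "status": status,
        "meaning": STATUS_MEANINGS.get(status, "Unrecognized status"),
        "identifier": identifier,
    }


# Sample packet: status 8 for the notification sent with identifier 42.
sample = struct.pack("!BBI", 8, 8, 42)
result = parse_error_response(sample)
print(result["meaning"], "for notification", result["identifier"])
```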

What’s the problem here?

So far so good, right? Well, not so much. In theory this all sounds very good. Finally, we get an error response from the service, and some additional functionality. But there are still two major issues with the protocol that you would discover very quickly if you decided to try and implement a client for it yourself:

Apple never sends a response for messages that succeeded.
An error response may not be returned immediately, so you can end up writing more notifications to the stream after the bad one but before Apple closes the connection. These notifications that might still ‘make it through’ are left in limbo. They are never acknowledged or delivered, and never have error responses returned for them.

How could Apple fix this?

Remember how I told you to keep in mind the error response status code of ‘0 – No errors encountered’? This is the silver bullet. If Apple simply always returned an error response, for every notification, even if the notification succeeded, we could simply build a library that wrote a notification to the stream, waited for a response, and then moved on to the next notification, over and over. There’d be no business of waiting around for an error response that might never come, and implementing this protocol would be far less painful. Apple might argue that this would consume more bandwidth, and while they may be right, in this day and age it would only amount to another ~6 MB per 1,000,000 notifications delivered. Considering that Google and Windows Phone both use HTTP protocols, which consume significantly more bandwidth from the underlying protocol alone, and are able to keep their infrastructure running, an extra 6 bytes per notification should be pocket change to Apple: the cost of maybe a few additional servers and some bandwidth allocation.

It’s such a simple answer. It’s even in Apple’s own documentation. Yet, it’s not our reality.

So what is the workaround?

I’ve looked at many libraries, written in many different languages, to see how they worked around this problem. In about 99% of the cases I’ve observed, they all use the same, sadly inefficient approach: Waiting.

So again, we have a connection to Apple’s APNS server, and we want to send notifications over that connection repeatedly, and as fast as possible. Apple never sends us a response if a notification was sent successfully, but if one failed, they will send us an error response and close our connection.

The problem is, if we keep sending notifications over and over again, we might send a second, or third notification before Apple ever sends us an error response for the first one that failed. If this happens, the second and third notifications are never delivered, and are lost forever.

The easiest way to solve this is to asynchronously read from the connection stream, waiting for an error response. Doing so, however, means you must also wait a little while after you write each notification to the stream to see if your asynchronous read ever receives anything. You can’t just do a synchronous blocking read on the stream since you’d be waiting indefinitely if the notification succeeded (since Apple sends no response in this case). To make matters worse, Apple doesn’t guarantee how quickly an error response will be sent to us. I’ve seen libraries wait for an error response for anywhere from 100 to 500 milliseconds.
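The write-then-wait pattern described above can be sketched in Python with `select` and a timeout. This is only an illustration: a local socket pair stands in for the TLS connection to APNS, and `send_and_wait`, the frame contents and the 0.1-second timeout are all hypothetical.

```python
import select
import socket
import struct


def send_and_wait(sock, frame, timeout):
    """Write one frame, then wait up to `timeout` seconds for an error packet.

    Returns (status, identifier) if one arrives, or None if nothing was
    received, in which case the caller can only assume success.
    """
    sock.sendall(frame)
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    packet = sock.recv(6)
    if len(packet) < 6:
        raise ConnectionError("closed without a complete error response")
    _, status, identifier = struct.unpack("!BBI", packet)
    return status, identifier


# A local socket pair plays the roles of our client and the remote server.
client, fake_apns = socket.socketpair()

# Nothing comes back, so the full timeout is spent waiting.
quiet = send_and_wait(client, b"good-frame", timeout=0.1)

# Here the fake server has an error response queued for identifier 7.
fake_apns.sendall(struct.pack("!BBI", 8, 8, 7))
rejected = send_and_wait(client, b"bad-frame", timeout=0.1)

client.close()
fake_apns.close()
print(quiet, rejected)
```

Note that the successful case is the slow one: every notification that Apple accepts costs the full timeout, because silence is the only signal of success.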

It should be painfully obvious to you by now why this approach is flawed. If you have to wait even 100 milliseconds after sending every notification, that would take you almost 28 hours to send 1,000,000 notifications over a single connection!

Most libraries use multiple connections to circumvent this new issue they’ve created for themselves by waiting between each notification. If you use 10 connections to Apple’s servers, that cuts your time down to 2.8 hours. This is better, but why should it take 10 connections 2.8 hours to deliver a theoretical maximum of only about 300 MB of data (1,000,000 notifications * 301 bytes maximum size per notification)? This is asinine!
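For anyone who wants to check the numbers in this post, the arithmetic is reproduced below. All inputs come straight from the text (1,000,000 notifications, a 100 ms wait, 301 bytes per frame, 6 bytes per hypothetical acknowledgement); only the variable names are mine.

```python
notifications = 1_000_000
wait_seconds = 0.1      # shortest wait mentioned above (100 ms)
frame_bytes = 301       # maximum size of one enhanced-format notification
ack_bytes = 6           # size of one error response packet

single_connection_hours = notifications * wait_seconds / 3600
ten_connection_hours = single_connection_hours / 10
total_megabytes = notifications * frame_bytes / 1_000_000
ack_megabytes = notifications * ack_bytes / 1_000_000

print(f"one connection:  {single_connection_hours:.1f} hours")
print(f"ten connections: {ten_connection_hours:.1f} hours")
print(f"data sent:       {total_megabytes:.0f} MB")
print(f"ack overhead:    {ack_megabytes:.0f} MB")
```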

A better workaround

I just couldn’t stand the thought of wasting 100-500 milliseconds per notification sent. I figured there had to be a better way, and I think I’ve found it! PushSharp employs a technique that is fairly easy in theory, and was a bit more difficult to implement in code.

Each time a notification is written to the connection stream, it is then added to a ‘Sent’ queue. If an error response is received, the corresponding notification is located in the ‘Sent’ queue (by its identifier). Anything before the error-causing notification in the queue is removed and assumed to be successfully sent. Anything after the error-causing notification is assumed to be lost, and re-queued to the ‘To Send’ queue to be tried again. There is also a cleanup thread running that constantly checks the oldest notification in the ‘Sent’ queue to see if it’s older than a few seconds and if so, it is assumed to have been successfully sent and is removed from the ‘Sent’ queue. This effectively moves the waiting period for an error response outside of the scope of the connection to the APNS servers. The diagram below illustrates how the Sent queue works:
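The bookkeeping described above can be sketched in a few lines of Python. This is not PushSharp's actual code (which is C#), just a minimal model of the idea: the class and method names are hypothetical, timestamps are passed in explicitly so the example is deterministic, and raising an error for an unknown identifier is my own choice, since the post does not cover that case.

```python
from collections import deque


class SentQueue:
    """Notifications written to the stream but not yet assumed delivered."""

    def __init__(self, max_age_seconds=3.0):
        self.max_age_seconds = max_age_seconds
        self._sent = deque()  # (identifier, notification, sent_at), oldest first

    def record(self, identifier, notification, sent_at):
        self._sent.append((identifier, notification, sent_at))

    def handle_error(self, failed_identifier):
        """Split the queue around the notification named in an error response.

        Returns (delivered, failed, to_retry): everything before the failure
        is assumed sent, everything after it is assumed lost.
        """
        entries = list(self._sent)
        identifiers = [identifier for identifier, _, _ in entries]
        if failed_identifier not in identifiers:
            raise LookupError(f"identifier {failed_identifier} is not queued")
        position = identifiers.index(failed_identifier)
        delivered = [n for _, n, _ in entries[:position]]
        failed = entries[position][1]
        to_retry = [n for _, n, _ in entries[position + 1:]]
        self._sent.clear()
        return delivered, failed, to_retry

    def expire(self, now):
        """Cleanup pass: drop entries older than max_age_seconds."""
        delivered = []
        while self._sent and now - self._sent[0][2] > self.max_age_seconds:
            delivered.append(self._sent.popleft()[1])
        return delivered


sent = SentQueue(max_age_seconds=3.0)
for identifier, sent_at in [(1, 0.0), (2, 0.1), (3, 0.2), (4, 0.3)]:
    sent.record(identifier, f"notification-{identifier}", sent_at)

# An error response arrives naming identifier 2.
delivered, failed, to_retry = sent.handle_error(2)

# Later, a notification that never triggers an error simply ages out.
sent.record(5, "notification-5", 10.0)
aged_out = sent.expire(now=14.0)
print(delivered, failed, to_retry, aged_out)
```

In a real sender, `to_retry` would go back onto the ‘To Send’ queue and `expire` would be driven by the cleanup thread, so the writer never has to pause between notifications.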

[Figure: PushSharp Sent queue]

Conclusion

So that’s it, that’s my big speech on why Apple’s Push Notifications are so troublesome, how they could make my life a lot easier with a couple small changes, and how I’ve learned to cope with the situation for now. I hope you enjoyed my ramble, and do please check out PushSharp!
