Gunfish
Push Provider Server with HTTP/2
Yokohama\.(pm6?|go) #14
Takuya Yoshimura (takyoshi)
About me
- Takuya Yoshimura
- @takyoshi (moulin) (github, work)
- @tkyshm (github, private)
- KAYAC Inc.
- Lobi, server side engineer
Lobi - Chat & Game Community
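Push Notification (diagram)
1. A performs an action toward B
2. The server retrieves all device tokens bound to B from the DB
3. The server sends notifications for B's device tokens to APNS
4. APNS delivers the push notifications to those devices
B's devices are logged in with the same user account, and each one gets the push ("Got my push!")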
iOS Push Notification on Lobi
- iOS push notifications using APNs
- Legacy APNs Provider API
- Delivers to all devices bound to the user account in our application
About APNs
- Apple Push Notification service
- The service that delivers push notifications to iOS devices.
- Developers use the service through the APNs Provider API
About APNs
- Send notifications to APNs in the background
  - Network latency.
  - e.g. the average response time from Lobi to APNs is 0.15 sec
- Pool TCP connections (with APNs)
  - APNs may treat too many connections as a denial-of-service attack.
  - Do not create too many TCP connections
- Screen invalid device tokens
  - Notifications sent after an invalid token may not be delivered
  - On Lobi, some notifications were not delivered after APNs responded with EOF
- 2 screening patterns:
  - Error response = delete immediately
  - Feedback service = delete periodically
Implementing the Notification Provider Server
- Implements push notifications with AnyEvent::APNS (Perl)
- Persistent TCP connection
- Receives HTTP POST requests in JSON array format
- Deletes invalid tokens immediately on an error response
- Uses the Feedback service from cron
- Retries push notifications that were not sent because of invalid tokens
- Stores the sending history
Legacy APNs
- Because of invalid tokens, some notifications cannot be delivered...
- Cannot send properly when the retry queue overflows
- Cannot detect every invalid token
- So, cannot screen out all invalid tokens.
- Invalid tokens gradually accumulate in the DB.
- These tokens cause EOF.
New APNs
- HTTP/2
- The Feedback service is folded into the error response
- Easier authentication with APNs
- Payload size extended to 4096 bytes
WWDC-2015 (8/26) 「What's New in Notifications」
HTTP/2
- Notifications after an invalid token are still delivered
- HTTP/2 multiplexes requests and responses
- Better performance
- One HTTP client can be used efficiently because HTTP/2 allows multiple concurrent requests and responses
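As a rough illustration of what one request to the new Provider API looks like in Go: the `/3/device/<token>` path and the `apns-topic` header come from Apple's HTTP/2 Provider API, while the function name `newPushRequest`, the topic `com.example.app`, and the payload are assumptions made up for this sketch. Nothing is sent over the network here; the request is only constructed.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// apnsHost is the production endpoint of the HTTP/2 Provider API.
const apnsHost = "https://api.push.apple.com"

// newPushRequest (hypothetical helper) builds one notification request.
// Each notification is a separate POST; HTTP/2 lets many of them share
// a single connection.
func newPushRequest(host, token, topic string, payload []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, host+"/3/device/"+token, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("apns-topic", topic) // usually the app's bundle ID
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newPushRequest(apnsHost, "83fa0eb8a743c118fca80b0136bfee0",
		"com.example.app", []byte(`{"aps":{"alert":"hoge"}}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Requests built this way would all go through one shared `http.Client` configured with the APNs certificate, so an invalid token only fails its own request.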
No Feedback Service cron needed
- The new APNs returns many error types
- The new APNs reports, as an error response, what the previous Feedback service used to report
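A minimal sketch of how a provider might act on these error responses instead of polling the Feedback service. On failure the new APNs returns a JSON body with a `reason` field (`BadDeviceToken` and `Unregistered` both show up in the benchmark log later in this deck). The helper name `shouldDeleteToken` and the choice of which reasons trigger deletion are assumptions for illustration, not Gunfish's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apnsError mirrors the JSON body APNs returns when a request fails.
type apnsError struct {
	Reason    string `json:"reason"`
	Timestamp int64  `json:"timestamp,omitempty"` // sent with status 410
}

// shouldDeleteToken (hypothetical) reports whether the error body says
// the device token is no longer usable, together with the reason.
func shouldDeleteToken(body []byte) (bool, string, error) {
	var e apnsError
	if err := json.Unmarshal(body, &e); err != nil {
		return false, "", err
	}
	switch e.Reason {
	case "BadDeviceToken", "Unregistered":
		return true, e.Reason, nil
	}
	// Other reasons (e.g. MissingTopic) point at the request, not the token.
	return false, e.Reason, nil
}

func main() {
	del, reason, err := shouldDeleteToken([]byte(`{"reason":"Unregistered","timestamp":1458276491000}`))
	fmt.Println(del, reason, err)
}
```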
Golang ...
introduced HTTP/2 in Go 1.6.
Go 1.6 will be released in February!
Gunfish
- The goal is to deliver every push notification
- Notification provider server written in Go 1.6
- Currently supports only the new APNs Provider API
- Accepts notifications as a JSON array POSTed over HTTP/1.1
- for backward compatibility
- Command hook and custom error handler invoked on error responses
- Can be used to delete invalid tokens.
- Graceful restart
- Performance can be tuned
Gunfish Architecture
[Diagram: Lobi sends requests to Gunfish, which sends them on to APNS. Inside Gunfish: server, supervisor, worker, sender. Legend: chan, goroutine, write, read. Steps ① to ⑦ mark the flow.]
Worker & Sender
Worker
- 1 worker has 1 HTTP/2 client
- Receives APNs responses from its senders asynchronously
Sender
- 1 worker has N senders
- Sends requests to APNs
- Passes each APNs response to its own worker
- The number of senders is the maximum multiplexing level of the HTTP/2 client
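The worker/sender relationship can be sketched with goroutines and channels roughly as follows. This is a simplified model under assumptions, not Gunfish's code: `runWorker`, `notification`, and `response` are made-up names, and the `send` function stands in for the request that would go through the worker's HTTP/2 client.

```go
package main

import (
	"fmt"
	"sync"
)

type notification struct{ Token string }

type response struct {
	Token string
	OK    bool
}

// runWorker starts senderNum sender goroutines that read from one queue.
// Each sender passes its result back to the worker over respq, so the
// worker sees responses asynchronously and in no particular order.
func runWorker(queue <-chan notification, senderNum int, send func(notification) response) []response {
	respq := make(chan response)
	var wg sync.WaitGroup
	for i := 0; i < senderNum; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range queue {
				respq <- send(n)
			}
		}()
	}
	// Close respq once every sender has drained the queue.
	go func() {
		wg.Wait()
		close(respq)
	}()

	var out []response
	for r := range respq {
		out = append(out, r)
	}
	return out
}

func main() {
	queue := make(chan notification, 10)
	for i := 0; i < 10; i++ {
		queue <- notification{Token: fmt.Sprintf("token-%d", i)}
	}
	close(queue)

	// Fake send: every notification succeeds.
	res := runWorker(queue, 3, func(n notification) response {
		return response{Token: n.Token, OK: true}
	})
	fmt.Println(len(res), "responses")
}
```

With `senderNum` senders, at most that many requests are in flight per worker, which is why the sender count acts as the multiplexing limit.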
Starting Gunfish
$ gunfish -c /etc/gunfish/config.toml
config.toml example:
[provider]
port = 38003
worker_num = 4 # number of HTTP/2 clients
queue_size = 2000 # queue length for posted requests
max_request_size = 1000 # max JSON array length of a posted request
max_connections = 1000 # max connections between the developer's application and Gunfish
[apns]
cert_file = "/path/to/cert_file.pem"
key_file = "/path/to/key_file.pem"
request_per_sec = 2000 # notification flow rate per second
sender_num = 30 # HTTP/2 client multiplexing level
error_hook = "your_hook_cmd.sh" # hook command run on error responses
Gunfish Push Notification
$ curl -X POST -H "Content-Type: application/json" \
-d '[{"token":"83fa0eb8a743c118fca80b0136bfee0",
"payload": {"aps": {"alert": "hoge", "sound": "default", "badge": 1}}}]' \
http://localhost:38103/apns/push
{"result": "ok"}
// POST json array format
[
{
"token": "apns published device token",
"payload": "APNs Payload"
}
]
// Payload structure
{
"aps": {
"alert": {
"title": "foo",
"body": "bar"
},
"sound": "default",
"badge": 1
},
"option1": "some properties your application"
}
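For a Go client, the POST body above could be built along these lines. The type names (`postedRequest`, `payload`, `aps`, `alert`) and the helper `buildBody` are assumptions for this sketch; only the JSON keys come from the format shown. Application-specific keys such as `option1` are left out for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type alert struct {
	Title string `json:"title"`
	Body  string `json:"body"`
}

type aps struct {
	Alert alert  `json:"alert"`
	Sound string `json:"sound,omitempty"`
	Badge int    `json:"badge,omitempty"`
}

type payload struct {
	APS aps `json:"aps"`
}

// postedRequest is one element of the JSON array that is POSTed.
type postedRequest struct {
	Token   string  `json:"token"`
	Payload payload `json:"payload"`
}

// buildBody (hypothetical) encodes a batch of notifications as a JSON array.
func buildBody(reqs []postedRequest) ([]byte, error) {
	return json.Marshal(reqs)
}

func main() {
	body, err := buildBody([]postedRequest{{
		Token: "83fa0eb8a743c118fca80b0136bfee0",
		Payload: payload{APS: aps{
			Alert: alert{Title: "foo", Body: "bar"},
			Sound: "default",
			Badge: 1,
		}},
	}})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```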
Custom Error Response Handler
// ResponseHandler interface
type ResponseHandler interface {
OnResponse(*Request, *Response, error)
HookCmd() string
}
- Handles error responses in the Go layer
- Implement ResponseHandler and register your handler:
InitErrorResponseHandler(YourCustomErrorHandler{})
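A sketch of what a custom handler could look like. The `Request` and `Response` types below are stand-ins with invented fields (`Token`, `StatusCode`, `Reason`) so that the example compiles on its own; the real Gunfish types differ, and the meaning of the `error` argument is also assumed here. Only the `ResponseHandler` interface itself is taken from the slide.

```go
package main

import (
	"fmt"
	"sync"
)

// Stand-in types for illustration only; not the real Gunfish definitions.
type Request struct{ Token string }

type Response struct {
	StatusCode int
	Reason     string
}

// ResponseHandler as shown on the slide.
type ResponseHandler interface {
	OnResponse(*Request, *Response, error)
	HookCmd() string
}

// invalidTokenCollector (hypothetical) remembers tokens APNs rejected,
// so the application can delete them from its DB later.
type invalidTokenCollector struct {
	mu      sync.Mutex
	invalid []string
}

func (h *invalidTokenCollector) OnResponse(req *Request, resp *Response, err error) {
	if err != nil || req == nil || resp == nil {
		return // assumed: err means no usable APNs response
	}
	if resp.Reason == "BadDeviceToken" || resp.Reason == "Unregistered" {
		h.mu.Lock()
		h.invalid = append(h.invalid, req.Token)
		h.mu.Unlock()
	}
}

func (h *invalidTokenCollector) HookCmd() string { return "your_hook_cmd.sh" }

// Compile-time check that the collector satisfies the interface.
var _ ResponseHandler = (*invalidTokenCollector)(nil)

func main() {
	h := &invalidTokenCollector{}
	h.OnResponse(&Request{Token: "aaa"}, &Response{StatusCode: 400, Reason: "BadDeviceToken"}, nil)
	h.OnResponse(&Request{Token: "bbb"}, &Response{StatusCode: 400, Reason: "MissingTopic"}, nil)
	fmt.Println(h.invalid)
}
```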
Graceful Restart
- The application does not terminate until it has finished sending the data left in the queue
- The timeout before a forced shutdown is 2 min.
- Uses Server::Starter
$ start_server --port 38003 --interval 5 -- gunfish -c /etc/gunfish/config.toml
- About Server::Starter
How to do a graceful restart in Go
http://shogo82148.github.io/blog/2015/05/03/golang-graceful-restart/
Reducing dropped requests during a graceful restart in Go
http://shogo82148.github.io/blog/2015/11/23/golang-graceful-restart-2nd/
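The "finish the queue, but force shutdown after a timeout" behaviour can be sketched with a `sync.WaitGroup` and `time.After`. This is a generic pattern under assumptions, not Gunfish's shutdown code: `waitDrain` and `simulateShutdown` are hypothetical names, and only the 2-minute timeout comes from the slide.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// waitDrain blocks until all in-flight work tracked by wg has finished
// or the timeout elapses. It reports whether the drain completed.
func waitDrain(wg *sync.WaitGroup, timeout time.Duration) bool {
	done := make(chan struct{})
	go func() {
		wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false // the caller would force the shutdown here
	}
}

// simulateShutdown runs one fake job that takes workMs milliseconds and
// waits for it with a timeout of timeoutMs milliseconds.
func simulateShutdown(workMs, timeoutMs int) bool {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		time.Sleep(time.Duration(workMs) * time.Millisecond) // pretend to flush the queue
	}()
	return waitDrain(&wg, time.Duration(timeoutMs)*time.Millisecond)
}

func main() {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		time.Sleep(10 * time.Millisecond)
	}()
	fmt.Println("drained:", waitDrain(&wg, 2*time.Minute))
}
```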
Performance tuning
Problem (example):
- The goal is to deliver all notifications at a flow rate of 5000 notifications/sec
Setting parameters:
- request_per_sec = 5000
- worker num
- sender num
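One way to get a starting point for `worker_num` and `sender_num` before benchmarking is a back-of-the-envelope estimate: the number of requests that must be in flight is roughly the target rate multiplied by the response time. The sketch below assumes the 0.15 sec average response time quoted earlier for Lobi, one outstanding request per sender, and the `worker_num = 4` from the config example; the function names are hypothetical, and the real values in this talk were found by benchmarking.

```go
package main

import "fmt"

// requiredInFlight estimates how many requests must be outstanding at
// once to sustain ratePerSec when each response takes latencyMs
// milliseconds (rate x latency, rounded up).
func requiredInFlight(ratePerSec, latencyMs int) int {
	return (ratePerSec*latencyMs + 999) / 1000
}

// sendersPerWorker splits the in-flight requests across workerNum
// workers, rounding up.
func sendersPerWorker(inFlight, workerNum int) int {
	return (inFlight + workerNum - 1) / workerNum
}

func main() {
	inFlight := requiredInFlight(5000, 150) // 5000/sec at 0.15 sec each
	fmt.Println("in-flight requests:", inFlight)
	fmt.Println("senders per worker (4 workers):", sendersPerWorker(inFlight, 4))
}
```

Under these assumptions the estimate is 750 in-flight requests, or 188 senders per worker with 4 workers.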
Performance tuning
- h2o as an APNs mock server (mruby)
- Benchmark with the wrk2 command
- 200 notifications per POST
- Start Gunfish for the benchmark
$ h2o -c conf/h2o/h2o.conf
$ wrk2 -t2 -c20 -d10 -s bench/scripts/err_and_success.lua -L -R25 http://localhost:38103
$ gunfish -c /etc/gunfish/config.toml -E test 2> gunfish.log
Performance tuning
# result of wrk2
#[Mean = 23.171, StdDeviation = 22.064]
#[Max = 160.384, Total count = 240]
#[Buckets = 27, SubBuckets = 2048]
----------------------------------------------------------
242 requests in 10.01s, 31.43KB read
Requests/sec: 24.18
Transfer/sec: 3.14KB
# log
38542 msg:"Succeeded to send a notification", type:worker # sent successfully
9067 msg:"Response queue is full.", type:sender # logged when there are too many notifications
968 msg:"Catch error response.", type:"http/2-client" # received an error response
506 msg:MissingTopic, type:worker
190 msg:BadDeviceToken, type:worker
95 msg:Unregistered, type:worker
8 msg:"Worker Queue size: 2083", type:
8 msg:"Succeeded to establish new connection.", type:worker
8 msg:"Response queue size: 2083", type:
1 msg:, type:
242 requests * 200 notifications = 48400 (sent)
38542 + 9067 + 506 + 190 + 95 = 48400
Release
- First, deployed to only one production server
- Monitored Gunfish for a few days...
- with Zabbix
- by watching the 2 status APIs.
{
"pid": 19184,
"debug_port": 17889,
"uptime": 14,
"start_at": 1458276491,
"su_at": 0,
"period": 1,
"retry_after": 10,
"workers": 8,
"senders": 400,
"queue_size": 0,
"retry_queue_size": 0,
"workers_queue_size": 0,
"cmdq_queue_size": 0,
"retry_count": 0,
"req_count": 0,
"sent_count": 0,
"err_count": 0
}
{
"time": 1458276536967232300,
"go_version": "go1.6",
"go_os": "linux",
"go_arch": "amd64",
"cpu_num": 4,
"goroutine_num": 424,
"gomaxprocs": 4,
"cgo_call_num": 5,
"memory_alloc": 7276200,
"memory_total_alloc": 9109144,
"memory_sys": 13641976,
"memory_lookups": 104,
"memory_mallocs": 41548,
"memory_frees": 24228,
"memory_stack": 2228224,
"heap_alloc": 7276200,
"heap_sys": 8257536,
"heap_idle": 352256,
"heap_inuse": 7905280,
"heap_released": 0,
"heap_objects": 17320,
"gc_next": 8999555,
"gc_last": 1458276491942982000,
"gc_num": 2,
"gc_per_second": 0,
"gc_pause_per_second": 0,
"gc_pause": []
}
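Several fields in the second status output correspond to values the Go `runtime` package exposes. The sketch below shows how such numbers can be collected; it is not necessarily how Gunfish produces its stats, and the type `runtimeStats` and function `collectStats` are invented for this example. Only a subset of the keys is reproduced.

```go
package main

import (
	"encoding/json"
	"fmt"
	"runtime"
)

// runtimeStats covers a subset of the keys in the status output.
type runtimeStats struct {
	GoVersion    string `json:"go_version"`
	GoOS         string `json:"go_os"`
	GoArch       string `json:"go_arch"`
	CPUNum       int    `json:"cpu_num"`
	GoroutineNum int    `json:"goroutine_num"`
	Gomaxprocs   int    `json:"gomaxprocs"`
	MemoryAlloc  uint64 `json:"memory_alloc"`
	HeapObjects  uint64 `json:"heap_objects"`
	GCNum        uint32 `json:"gc_num"`
}

// collectStats (hypothetical) takes a snapshot of the Go runtime.
func collectStats() runtimeStats {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return runtimeStats{
		GoVersion:    runtime.Version(),
		GoOS:         runtime.GOOS,
		GoArch:       runtime.GOARCH,
		CPUNum:       runtime.NumCPU(),
		GoroutineNum: runtime.NumGoroutine(),
		Gomaxprocs:   runtime.GOMAXPROCS(0), // 0 reads the value without changing it
		MemoryAlloc:  m.Alloc,
		HeapObjects:  m.HeapObjects,
		GCNum:        m.NumGC,
	}
}

func main() {
	b, err := json.MarshalIndent(collectStats(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Watching `goroutine_num` and `memory_alloc` over time is one way a monitoring tool can spot a leak like the one described next.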
Memory Leak Problem
- The HTTP/2 client caused a memory leak (2016/2/2-2/3)
- As an interim measure, restarted Gunfish every 30 min from cron (2016/2/3)
Memory Leak Problem
- Analyzed with pprof
$ go tool pprof gunfish http://localhost:2412/debug/pprof/profile
(pprof) > top
(pprof) > list
- Investigated the problem and its cause with pprof
- Checked whether it had already been reported in the official repository
Memory Leak Problem
net/http: http2 Transport retains Request.Body after request is complete, not GCed #14084
Memory Leak Problem
The memory leak was fixed, so Gunfish was deployed to all production servers
Summary
- APNs provider server built on the Go HTTP/2 client
- Gunfish's design concept is "deliver all notifications"
- When using the latest technology, keep up with its problems quickly
- I wanted a benchmark setup that makes tuning easier
- I want to support GCM in the future