This project serves as an experimental area for developing and testing real-time communication.
Following is the primary use case:
- user connects device to push cloud
- user's device/app is pushed number of unread feeds (and change id/timestamp)
- user's app uses rest api for getting the unread feeds using the change id
There are two publish-subscribe servers: Push & Publisher servers. There are also two in-memory data stores: DeviceDB & PublishDB. DeviceDB holds current information on connected devices. PublishDB holds transient publishing information for Push server Clients.
Mapping the use case to our architecture:
- Users authenticate their Client to a given Push server
- Push server spawns a subscription on PublishDB (using the user credentials)
- Push server also updates Client info to DeviceDB
- Publisher server looks up Client info in DeviceDB to publish any unread feed items
Each component is implemented as follows:
- Push & Publisher servers are Node.js scripts
- Client is a multi-threaded Java program
- DeviceDB & PublishDB are Redis database servers
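The flow described above can be sketched in a few lines of Java. This is only an in-memory illustration, not the project's code: plain maps stand in for the two Redis stores, and the class name, method names, and the choice of the user id as the subscription channel are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory model of the Push / Publisher flow (no Redis, no sockets).
final class PushFlowSketch {
    // Stand-in for DeviceDB: user id -> id of the Push server holding the connection.
    static final Map<String, String> deviceDb = new HashMap<>();
    // Stand-in for PublishDB: channel (assumed to be the user id) -> subscribers.
    static final Map<String, List<Consumer<String>>> publishDb = new HashMap<>();

    // Push server side: runs once a Client has authenticated.
    static void onClientAuthenticated(String userId, String pushServerId,
                                      Consumer<String> deliverToClient) {
        // Spawn a subscription on PublishDB for this user.
        publishDb.computeIfAbsent(userId, k -> new ArrayList<>()).add(deliverToClient);
        // Record the connected Client in DeviceDB.
        deviceDb.put(userId, pushServerId);
    }

    // Publisher server side: publish only to users that DeviceDB lists as connected.
    static boolean publishUnread(String userId, String message) {
        if (!deviceDb.containsKey(userId)) {
            return false; // no connected device, nothing to push
        }
        List<Consumer<String>> subscribers = publishDb.get(userId);
        if (subscribers == null) {
            return false;
        }
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(message);
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> received = new ArrayList<>();
        onClientAuthenticated("alice", "push-1", received::add);
        System.out.println(publishUnread("alice", "3 unread")); // connected user
        System.out.println(publishUnread("bob", "1 unread"));   // unknown user
        System.out.println(received);
    }
}
```

In the real setup the two maps would be Redis servers and the delivery callback would be a socket write on the Push server, so the sketch only shows which component touches which store.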
The test aims to discover the maximum number of clients that can be subscribed to a PUSH server (for a given hardware configuration).
A way to measure this is to track the latency between when a message is created by the PUBLISHER server and when it is actually received by the client. As more active subscriber clients are added to the mix, we expect this latency to increase until it becomes intolerable. The number of active clients connected at that point is our maximum. Of course, not every client needs to be active, meaning that not every client needs to be receiving messages. For that case, we have a PERCENT_COLLAB factor which can be used (in the PUBLISHER & PUSH servers) to find a maximum client number based on a realistic mix of active clients vs clients that are just listeners.
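As a rough illustration of how such a factor could split clients into active and listening groups, here is a small Java sketch. It assumes PERCENT_COLLAB is a whole-number percentage (0 to 100) of connected clients that are active; that interpretation, the class name, and both method names are assumptions for the example, not something the servers are known to implement.

```java
// Hypothetical helper: applies an assumed PERCENT_COLLAB percentage to a client population.
final class CollabMix {
    // Number of clients treated as active (receiving messages) out of all connected clients.
    static int activeClients(int connectedClients, int percentCollab) {
        if (percentCollab < 0 || percentCollab > 100) {
            throw new IllegalArgumentException("percentCollab must be between 0 and 100");
        }
        return (int) Math.round(connectedClients * (percentCollab / 100.0));
    }

    // Deterministic choice: within every block of 100 clients, the first percentCollab are active.
    static boolean isActive(int clientIndex, int percentCollab) {
        return clientIndex % 100 < percentCollab;
    }

    public static void main(String[] args) {
        System.out.println(activeClients(3200, 25)); // prints 800
        System.out.println(isActive(24, 25));        // prints true
        System.out.println(isActive(25, 25));        // prints false
    }
}
```

A deterministic rule like `isActive` keeps runs repeatable; a random choice per client would model real traffic more closely at the cost of run-to-run variation.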
More specific details of the test follow below:
- Push server sends out message payload
payload = {chg_id: <id of last known read item>, items: <unread items>, publish_ts: <timestamp> }
- Client monitors message latency
ReceivedTs = Calendar.getInstance().getTimeInMillis()
ServerTs = payload.publish_ts
Latency = ReceivedTs - ServerTs
TOLERANCE = <client program input>
- For each given concurrency in [1, 10, 100, 1000, 10000, 100000, 1000000]
  - Find Median(Latency[concurrency])
  - if Median(Latency[concurrency]) > TOLERANCE then stop the test and publish the concurrency#
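The measurement loop above can be sketched in Java as follows. This is a minimal illustration under stated assumptions: the class and method names are made up for the example, the per-level latency samples are supplied by a caller-provided function rather than by real clients, and the fake data in `main` exists only to exercise the stopping rule.

```java
import java.util.Arrays;
import java.util.function.IntFunction;

// Hypothetical sketch of the median-latency stopping rule.
final class LatencySketch {
    static final int[] CONCURRENCY_LEVELS = {1, 10, 100, 1000, 10000, 100000, 1000000};

    // Latency of one message: client receive time minus the payload's publish_ts.
    static long latencyMillis(long receivedTs, long publishTs) {
        return receivedTs - publishTs;
    }

    // Median of the latency samples; the input array is not modified.
    static double median(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no latency samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Returns the first concurrency level whose median latency exceeds the tolerance,
    // or -1 if every level stays within it.
    static int firstLevelOverTolerance(IntFunction<long[]> runLevel, long toleranceMillis) {
        for (int level : CONCURRENCY_LEVELS) {
            if (median(runLevel.apply(level)) > toleranceMillis) {
                return level;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Fake samples whose latency grows with the concurrency level (illustration only).
        IntFunction<long[]> fake = level -> new long[] {level / 10, level / 5, level / 2};
        System.out.println(firstLevelOverTolerance(fake, 150)); // prints 1000
    }
}
```

Note that subtracting publish_ts from the client's receive time is only meaningful if the client and server clocks are synchronized, or if both run on the same machine.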
# ulimit -n 65536
# ifconfig eth0 txqueuelen 8192 # replace eth0 with the ethernet interface you are using
# /sbin/sysctl -w net.core.somaxconn=4096
# /sbin/sysctl -w net.core.netdev_max_backlog=16384
# /sbin/sysctl -w net.core.rmem_max=16777216
# /sbin/sysctl -w net.core.wmem_max=16777216
# /sbin/sysctl -w net.ipv4.tcp_max_syn_backlog=8192
# /sbin/sysctl -w net.ipv4.tcp_syncookies=1
- Push server is Done
- PublishDB is Done
- DeviceDB is Done
- Client is Done
- Publisher is Done
- LOAD BALANCER: performs SSL termination and proxies HTTP 1.1/TCP traffic
- PUSH: Node.js/Socket.IO servers machine
- CLIENT: Java client machine
- PUBLISHER: Node.js machine
- DATA STORE: Redis machine
- Unit testing is Done
- Performance testing is WIP
- Able to sustain 1200 active device connections on a MacBook Pro (alongside the GUI, Client, Redis, and Node.js)
- Able to sustain 3200 active device connections on a single-core server machine (with Redis and Node.js running on the same machine)