MigratoryData Demonstrates Record-Breaking 8X Higher WebSocket Scalability than Competition


This benchmark shows that MigratoryData achieves 8X higher scalability than the record obtained by the competition in the same benchmark category, reaffirming that it is the most scalable WebSocket server. The result also demonstrates that, with MigratoryData WebSocket Server, it is feasible and affordable to build real-time web applications that deliver high volumes of real-time information to a large number of concurrent users.

Benchmark Results

In this benchmark scenario, MigratoryData scales up to 192,000 concurrent users (delivering 8.88 Gbps of throughput) from a single Dell R610 1U server and achieves 8X higher scalability than the record obtained by the competition (who used a more recent Dell 1U server with similar specifications). Moreover, MigratoryData achieves lower bandwidth utilization and lower latency, as shown in the diagram and table below.

[Figure: benchmark_high_volume]

Latency is defined here as the time needed for a message to propagate from the publisher to the client, via the MigratoryData server. Thus, the latency of a message is the difference between the time at which the message is sent by the publisher to the MigratoryData server and the time at which the message is received by the client from the MigratoryData server as detailed in the following diagram:

[Figure: latency]

Detailed Results of MigratoryData WebSocket Server Running on a Single Instance of a Dell R610 1U Server

Note that the results in the table below were obtained using the default configuration of MigratoryData WebSocket Server, a fresh installation of CentOS Linux 6.4 (without any kernel recompilation or other special tuning), and the standard network configuration (default MTU of 1500, default kernel buffer sizes, etc.).

| Number of client connections | 24,000 | 48,000 | 72,000 | 96,000 | 120,000 | 144,000 | 168,000 | 192,000 |
|---|---|---|---|---|---|---|---|---|
| Number of messages per second to each client | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| Total message throughput (messages per second) | 240,000 | 480,000 | 720,000 | 960,000 | 1,200,000 | 1,440,000 | 1,680,000 | 1,920,000 |
| Average latency (milliseconds) | 2.35 | 3.09 | 5.76 | 39.95 | 83.23 | 139.46 | 225.87 | 597.27 |
| Standard deviation of latency (milliseconds) | 3.74 | 3.79 | 6.73 | 20.80 | 39.36 | 65.12 | 106.00 | 269.74 |
| Maximum latency (milliseconds) | 49 | 54 | 88 | 168 | 291 | 391 | 760 | 1732 |
| Network utilization | 1.21 Gbps | 2.39 Gbps | 3.59 Gbps | 4.65 Gbps | 5.75 Gbps | 6.79 Gbps | 7.87 Gbps | 8.88 Gbps |
| CPU utilization | 25% | 49% | 72% | 82% | 88% | 90% | 92% | 96% |
| RAM allocated to the JVM | 2.5 GB | 26 GB | 26 GB | 26 GB | 26 GB | 26 GB | 30 GB | 48 GB |

Note: As RAM is inexpensive, we did not tune the memory configuration and used a reasonable value for each benchmark test.
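As a rough cross-check of the table, the network utilization can be divided by the total message throughput to estimate how many bytes each delivered message occupies on the wire. The sketch below is only an illustration: it assumes 1 Gbps means 10^9 bits per second and that the reported network utilization is dominated by traffic sent to clients, neither of which is stated in the results. The 512-byte payload size comes from the benchmark scenario described later in this post.

```python
# Rows copied from the results table above.
CONNECTIONS = [24_000, 48_000, 72_000, 96_000, 120_000, 144_000, 168_000, 192_000]
NETWORK_GBPS = [1.21, 2.39, 3.59, 4.65, 5.75, 6.79, 7.87, 8.88]

MSGS_PER_CLIENT = 10   # messages per second delivered to each client
PAYLOAD_BYTES = 512    # payload size from the benchmark scenario


def bytes_per_message(gbps: float, clients: int) -> float:
    """Estimate the wire footprint of one delivered message, in bytes.

    Assumption: 1 Gbps = 10**9 bits/s and all measured traffic is
    outbound message delivery.
    """
    messages_per_second = clients * MSGS_PER_CLIENT
    return gbps * 1e9 / 8 / messages_per_second


for clients, gbps in zip(CONNECTIONS, NETWORK_GBPS):
    wire = bytes_per_message(gbps, clients)
    overhead = wire - PAYLOAD_BYTES
    print(f"{clients:>7,} clients: {wire:6.1f} bytes/message, "
          f"{overhead:5.1f} bytes beyond the payload")
```

Under these assumptions, the estimated footprint is about 630 bytes per message at 24,000 clients and about 578 bytes at 192,000 clients, so protocol and framing overhead stays at roughly 13% to 23% of the 512-byte payload across all eight tests.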

Hardware & Setup

MigratoryData WebSocket Server version 4.0.3 ran on a single Dell PowerEdge R610 server with the following specifications:

| Component | Specification |
|---|---|
| Model Name | Dell PowerEdge R610 |
| Manufacturing Date | Q4 2011 |
| Dimension | 1U |
| Number of CPUs | 2 |
| Number of Cores per CPU | 6 |
| Total Number of Cores | 12 |
| CPU Type | Intel Xeon Processor X5650 (12 MB Cache, 2.66 GHz, 6.40 GT/s QPI) |
| Memory | 64 GB RAM (DDR3 1333 MHz) |
| Network | Intel X520-DA2 SFP+ with two 10 Gbps ports; two dual-port 1 Gbps adapters embedded on the motherboard |
| Operating System | CentOS 6.4, Linux kernel 2.6.32-358.2.1.el6.x86_64 |
| Java Version | Oracle (Sun) JRE 1.6.0_37 |

The Benchmark Publisher and the Benchmark Client instances ran on 14 identical Dell PowerEdge SC1435 servers. The Dell R610 server (running MigratoryData WebSocket Server) and the 14 Dell SC1435 servers (running the Benchmark Clients and the Benchmark Publisher) were connected via two gigabit switches: a Dell PowerConnect 5424 and a Dell PowerConnect 6224 (enhanced with a 2-port 10 Gbps module), as detailed in the diagram below:

[Figure: high-volume-benchmark-setup]

The total number of concurrent client connections for each benchmark test is achieved using 13 of the 14 Dell SC1435 servers. One instance of the Benchmark Client runs on each of these 13 servers. Thus, one simulates 1/13 of the total concurrent client connections from each of these 13 servers.

For example, in the benchmark test that simulated 192,000 concurrent users, we deployed 13 instances of the Benchmark Client, each opening approximately 14,770 (i.e. 192,000 / 13, rounded up) concurrent client connections to the MigratoryData server.

The 14th Dell SC1435 server is used to run both an instance of the Benchmark Client (opening 30 concurrent client connections) and an instance of the Benchmark Publisher.
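The even split of connections across the 13 load-generating machines can be sketched as follows. The function name `split_connections` is hypothetical and is not part of the benchmark tooling; it only illustrates the arithmetic, since 192,000 does not divide evenly by 13.

```python
def split_connections(total: int, machines: int) -> list[int]:
    """Spread `total` connections over `machines` as evenly as possible."""
    base, extra = divmod(total, machines)
    # The first `extra` machines each take one more connection than the rest.
    return [base + 1 if i < extra else base for i in range(machines)]


shares = split_connections(192_000, 13)
print(min(shares), max(shares), sum(shares))  # prints "14769 14770 192000"
```

An exact split therefore gives 14,769 or 14,770 connections per machine, which matches the rounded figure quoted above.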

The Benchmark Scenario

  • There are a total of 100 different subjects (see MigratoryData Architecture Guide to learn more about MigratoryData subjects).
  • The publisher sends 1000 messages per second.
  • The subject of each message is randomly selected from the 100 subjects; thus, each subject is updated 10 times a second on average.
  • Each client subscribes to a single subject randomly selected from the 100 subjects; thus, each client receives 10 messages per second on average.
  • The payload of each message is a 512-byte string (consisting of 512 random alphanumeric characters).
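The scenario's arithmetic can be checked with a small simulation. The following is a minimal sketch, not the actual Benchmark Publisher: it assumes subjects are drawn uniformly at random, and the function names `simulate_subject_counts` and `payload_gbps` are made up for this illustration. It confirms that each subject receives about 10 updates per second and computes the payload-only bandwidth for a given number of clients.

```python
import random
from collections import Counter

SUBJECTS = 100        # distinct subjects in the scenario
PUBLISH_RATE = 1000   # messages per second sent by the publisher
PAYLOAD_BYTES = 512   # payload size of each message


def simulate_subject_counts(seconds: int, seed: int = 0) -> Counter:
    """Count how many messages each subject receives over `seconds`.

    Assumption: the publisher picks subjects uniformly at random.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(seconds * PUBLISH_RATE):
        counts[rng.randrange(SUBJECTS)] += 1
    return counts


def payload_gbps(clients: int) -> float:
    """Payload-only bandwidth (no protocol overhead), in 10**9 bits/s."""
    msgs_per_client = PUBLISH_RATE / SUBJECTS  # 10 messages per second
    return clients * msgs_per_client * PAYLOAD_BYTES * 8 / 1e9


counts = simulate_subject_counts(seconds=60)
rates = [counts[s] / 60 for s in range(SUBJECTS)]
print(f"per-subject rate: min {min(rates):.2f}, max {max(rates):.2f} msg/s")
print(f"payload at 192,000 clients: {payload_gbps(192_000):.2f} Gbps")
```

For 192,000 clients, the payload alone amounts to about 7.86 Gbps, which is why this scenario approaches the capacity of a single 10 Gbps port.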

Methodology

We performed 8 benchmark tests, corresponding to the 8 results summarized above, to simulate 24,000 / 48,000 / 72,000 / 96,000 / 120,000 / 144,000 / 168,000 / 192,000 concurrent users connected to a single instance of MigratoryData WebSocket Server, using 13 instances of the Benchmark Client.

For the duration of each test, we ran a 14th instance of the Benchmark Client on the same machine that ran the instance of the Benchmark Publisher. The 14th instance of the Benchmark Client was used to measure latency results. It simulated an additional 30 users on top of the total number of simulated users, ran for 600 seconds, and computed the average, standard deviation, and maximum statistics of the latency of the received messages.

Note: Because the 14th instance of the Benchmark Client ran on the same machine as the instance of the Benchmark Publisher, there was no need for clock synchronization. Thus, the latency results are perfectly accurate as far as time synchronization is concerned.

Moreover, the sample size for each test is 180,000 messages (600 seconds x 10 messages per second x 30 concurrent client connections), which is large enough for the latency results to be statistically meaningful.
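The latency measurement described above can be sketched in a few lines. This is an illustration under stated assumptions, not the Benchmark Client's actual code: the helper names are hypothetical, timestamps are assumed to come from the same clock (as they do when publisher and client share a machine), and the population standard deviation is used because the source does not say which estimator was applied.

```python
import statistics

# 600 seconds x 10 messages per second x 30 client connections
SAMPLE_SIZE = 600 * 10 * 30


def latencies_ms(events):
    """Convert (sent_at, received_at) pairs in seconds to latencies in ms.

    Both timestamps must be read from the same clock.
    """
    return [(received - sent) * 1000.0 for sent, received in events]


def latency_stats(samples_ms):
    """Return (average, standard deviation, maximum) in milliseconds."""
    return (
        statistics.fmean(samples_ms),
        statistics.pstdev(samples_ms),  # assumption: population std. dev.
        max(samples_ms),
    )


# Three made-up events, for illustration only (not benchmark data).
events = [(0.000, 0.002), (0.100, 0.103), (0.200, 0.207)]
average, deviation, worst = latency_stats(latencies_ms(events))
print(f"n={SAMPLE_SIZE}, avg={average:.2f} ms, "
      f"std={deviation:.2f} ms, max={worst:.2f} ms")
```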

Linear Horizontal Scalability

Not only does MigratoryData WebSocket Server offer horizontal scalability via its built-in clustering feature, it also offers linear horizontal scalability because each instance of MigratoryData WebSocket Server in the cluster runs independently from the other cluster members. It exchanges only negligible coordination information or, depending on the clustering type you configure, does not exchange any information at all with the other cluster members.

Therefore, if one wants to deliver real-time information to 1 million concurrent users in this benchmark scenario, then one can deploy 6 instances of MigratoryData WebSocket Server on 6 Dell R610 servers to deliver data to 1,152,000 concurrent users (i.e. 6 servers x 192,000 maximum concurrent connections, as demonstrated by this benchmark).

Note: Although MigratoryData WebSocket Server scales linearly, a production deployment must also account for the failure of a cluster member. If a server goes down, the MigratoryData API automatically reconnects its users to the remaining cluster members, which must then absorb the load of the failed member.

Consequently, for the example above, a production deployment should use at least 7-8 servers to serve 1 million concurrent users, so that each server keeps enough reserve capacity to accept a share of the users of a failed cluster member.
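The sizing reasoning above can be written down as a short calculation. This is a minimal sketch that assumes users are redistributed evenly across the surviving servers and that 192,000 connections per server (the maximum demonstrated in this benchmark) is the per-server limit; the function names are hypothetical.

```python
import math

PER_SERVER_CAPACITY = 192_000  # maximum demonstrated in this benchmark


def servers_needed(users: int, tolerated_failures: int = 0) -> int:
    """Servers required so the cluster still fits `users` after failures."""
    return math.ceil(users / PER_SERVER_CAPACITY) + tolerated_failures


def load_after_failures(users: int, servers: int, failures: int) -> int:
    """Connections per surviving server, assuming even redistribution."""
    return math.ceil(users / (servers - failures))


for failures in (0, 1, 2):
    n = servers_needed(1_000_000, failures)
    load = load_after_failures(1_000_000, n, failures)
    print(f"tolerate {failures} failure(s): {n} servers, "
          f"{load:,} users per surviving server")
```

With 6 servers and one failure, the 5 survivors would each need to carry 200,000 users, which exceeds the demonstrated maximum; 7 servers tolerate one failure and 8 tolerate two.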

Conclusion

This benchmark result reaffirms MigratoryData’s leadership in WebSocket server scalability.

Using MigratoryData’s high vertical scalability and linear horizontal scalability, one can build cost-effective real-time applications scalable to meet any growth in number of users and data volumes.

Push Notifications to Millions of Android Devices using MigratoryData WebSocket Server

At the request of a new Fortune Global 500 (South Korea-based) customer, with whom we have recently signed an enterprise-wide license with premium support for MigratoryData WebSocket Server, we have developed a new MigratoryData API for building Android apps able to push notifications to end users even when the mobile device is in “sleep” mode.

MigratoryData WebSocket Server is best known, and most in demand, for its ability to reliably distribute real-time data to a large number of web clients.

However, our customers want to leverage the high scalability and the enterprise features of MigratoryData WebSocket Server and use it to distribute real-time data to mobile clients too – not only to mobile web browsers but also to native mobile applications.

As a result, we’ve extended our client API for JavaScript, used to build pure real-time web applications accessible from any standard browser, to mobile technologies. In this way, we’ve developed client APIs for Android (Java), Windows Mobile (.NET Compact Framework), iOS (Objective-C), and BlackBerry (Java ME).

The new MigratoryData Android Push Notification API enhances the existing MigratoryData API for Android with the ability to push notifications even when the mobile device is in “sleep” mode.

A push notification is similar to an SMS notification. However, push notifications can be sent at almost no cost, and their delivery is instant and guaranteed.

The new API implements all the features of the MigratoryData Client APIs, including:

  • Encrypted SSL/TLS Connectivity
  • Load Balancing and Fault Tolerance
  • Guaranteed Message Delivery
  • Entitlement

It is also important to note that this API connects end users directly to the MigratoryData server using a persistent TCP connection (even in sleep mode), without using the Google Cloud Messaging (GCM) service. Thus, data is delivered directly and securely to mobile users, without passing through a third-party service.

The API will be available shortly via our download portal, but if you want to try it sooner, just let us know.

P.S. I promised some new benchmarks for MigratoryData WebSocket Server version 4.0 but of course premium customer support is MigratoryData’s top priority. I hope to be able to perform the new benchmark tests shortly.

WebSocket Server Handling Millions of Concurrent Users: True or False?

Some providers of enterprise WebSocket servers claim their real-time web technology scales to millions of users. However, AFAIK, only MigratoryData currently offers evidence of how this scaling is achieved. Some claims:

Kaazing

A true veteran performance guru (Kirk Pepperdine) joins the Kaazing team to verify test Kaazing’s 1 million connection benchmark on a 1U blade

Lightstreamer

… platform for pushing live data over WebSockets and other web protocols, scalable up to millions of users

Pusher

Scaling infrastructure to maintain millions of long running connections is not trivial, but we’ve worked out the tricks to handle it with ease

MigratoryData

A single MigratoryData instance is able to handle up to 1 million concurrent users (see benchmarks)

Most likely, all the products mentioned above come with some form of clustering. Thus, this feature can be used to scale horizontally and push data to millions of concurrent users by deploying multiple WebSocket server instances. In this case, scaling to millions of users is possible but expensive.

MigratoryData implements built-in high availability clustering, so it scales horizontally. More importantly, MigratoryData demonstrates (as published in its Performance Benchmarking Guide) that pushing real-time data to 1 million concurrently connected users is possible from a single instance running on a 1U blade. Thus, MigratoryData is able to push data to millions of users and, in fact, it is currently used in production to push real-time data to millions of end users every day.

Over the next couple of weeks, I’m going to work on the performance benchmarking of the just released MigratoryData WebSocket Server 4.0 (see the Architecture Guide for an overview of the new version 4.0).

The last time I ran benchmark tests, for MigratoryData server 3.5, I compared the results with those published by Caplin Systems for their WebSocket server, Caplin Liberator. Caplin Liberator’s performance results were very good in terms of latency and throughput and, if I recall correctly, we achieved comparable results: slightly better for some use cases and slightly worse for others. According to Martin Tyler, Caplin Systems was not targeting high vertical scalability at that time.

I would be interested to know whether any WebSocket server provider that claims high vertical scalability has published results that I can compare with the new results for MigratoryData 4.0. MigratoryData benchmark results for version 4.0 are going to be released soon.

The Must-Have Features of a WebSocket / Comet Server

I plan to blog about some important features of the WebSocket / Comet servers:

  1. High Vertical Scalability,
  2. Clustering (Horizontal Scalability / Load Balancing),
  3. Low Latency,
  4. Guaranteed Message Delivery (in presence of failures),
  5. Fault Tolerance (both on client-side and server-side),
  6. Security (including Authorization / Entitlement),
  7. Low Bandwidth,
  8. Monitoring,
  9. APIs,
  10. Benchmarking.

If you think I missed an important feature, please add a comment here with your suggestions.