LiteSpeed Beats nginx in HTTP/3 Benchmark Tests
LiteSpeed’s HTTP/3 implementation outperforms the nginx/quiche hybrid in a battery of benchmark tests. OpenLiteSpeed transfers resources faster, scales better, and uses less CPU and memory while doing it. On each of these metrics, LiteSpeed beats nginx by a factor of 2 or more. nginx could not complete some tests, while in others it achieved only a fraction of TCP throughput. Below, we describe the setup and the individual benchmarks.
Why Compare LiteSpeed and nginx HTTP/3 Now?
HTTP/3 is a new protocol for the web, a successor to both Google QUIC and HTTP/2. As the QUIC Working Group at the IETF gets closer to finalizing the drafts, the nascent HTTP/3 implementations are maturing, and some are now starting to see production use.
LiteSpeed was the first to ship HTTP/3 support in LSWS in July of this year. We have been supporting QUIC since 2017, making improvements along the way. The HTTP/3 support rests on this foundation.
Several weeks ago, Cloudflare released a special HTTP/3 patch for nginx, encouraging everyone to experiment. Because nginx is our competitor, we were excited to get a chance to kick nginx’s shiny new HTTP/3 tires. Cloudflare uses this patch in production, so it has to be good!
Cloudflare’s Quiche Patch
Quiche is an HTTP/3 and QUIC library by Cloudflare. It is written in Rust, a modern memory-safe systems language. The library provides a C API, which is how nginx uses it.
Benchmark Setup
Platform
Both servers and the load tool run on the same VM, an Ubuntu 14.04 machine with 32 GB of RAM and a 20-core Intel Xeon E7-4870. The bandwidth and RTT are modified using netem.
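As a sketch, a condition like 100 Mbps with 100 ms RTT can be emulated with netem roughly as follows. The device name is an assumption; because everything runs on one VM, both directions of a connection traverse the loopback qdisc, so the one-way delay is half the target RTT.

```shell
# Emulate 100 Mbps bandwidth with 100 ms RTT on the loopback device.
# Requires root. Both directions pass through this qdisc, so the one-way
# delay (50 ms) is half the target RTT.
tc qdisc add dev lo root netem delay 50ms rate 100mbit

# Adjust between runs, e.g. for a 20 ms RTT:
tc qdisc change dev lo root netem delay 10ms rate 100mbit

# Remove the shaping when done:
tc qdisc del dev lo root
```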
Web Servers
For LiteSpeed, we use OpenLiteSpeed, which is the open-source version of our flagship LiteSpeed Web Server. We use version 1.6.4, which can be downloaded here.
For nginx, we use 1.16.1 with the Cloudflare quiche patch. Compilation steps are described here.
Both OpenLiteSpeed and nginx were configured to use one worker process. To be able to issue 1,000,000 requests using 100 connections, nginx’s maximum requests setting was increased to 10,000.
    # OpenLiteSpeed
    httpdWorkers 1

    # nginx
    worker_processes 1;
    http {
        server {
            access_log off;
            http3_max_requests 10000;
        }
    }
Website
The website is a simple selection of static files: a 163-byte index file and files of 1 MB, 10 MB, 100 MB, and 1 GB in size. You can use this script to generate them.
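A minimal sketch of such a generation script (the article links the real one; file names and the use of sparse files are assumptions, which is fine since a transfer benchmark does not care about file contents):

```shell
#!/bin/sh
set -e

# 163-byte index page with random placeholder content.
head -c 163 /dev/urandom > index.html

# Payload files. truncate creates them instantly as sparse, zero-filled
# files of exactly the requested size.
for size in 1M 10M 100M 1G; do
    truncate -s "$size" "file-$size.bin"
done
```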
Load Tool
We use h2load with HTTP/3 support to generate load. It is built very easily using the supplied Dockerfile.
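The build boils down to a standard Docker workflow, something like the following (the image tag is our own choice, and whether `h2load` sits on the image's PATH depends on the supplied Dockerfile):

```shell
# Run from the source directory containing the supplied Dockerfile.
docker build -t h2load-http3 .

# Sanity check that the binary is available inside the image:
docker run --rm h2load-http3 h2load --version
```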
Benchmark Tests
To get each number — requests per second or time to fetch a resource — three tests were run and the median value was taken.
Fetching a small page
The index page is 163 bytes. We will fetch it in several ways using different network conditions.
h2load options of interest:
- -n: Total number of requests to send
- -c: Number of connections
- -m: Number of concurrent requests per connection
- -t: Number of h2load threads
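Put together, a run for the first test looks something like this. The host, port, and ALPN token are assumptions for illustration; the draft token must match what both your h2load build and the server negotiate.

```shell
# 10,000 requests over 100 connections against the index page.
# h3-23 is a hypothetical draft token; use the one your build supports.
h2load -n 10000 -c 100 --npn-list=h3-23 https://127.0.0.1:443/
```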
-n 10000 -c 100
| | OLS | nginx |
|---|---|---|
| 100 Mbps, 100 ms RTT | 935 reqs/sec | 890 reqs/sec |
| 100 Mbps, 20 ms RTT | 3915 reqs/sec | 2910 reqs/sec |
| 100 Mbps, 10 ms RTT | 6420 reqs/sec | 4100 reqs/sec |
-n 100000 -c 100 -t 10
This is a longer run: each connection now sends 1,000 requests.
| | OLS | nginx |
|---|---|---|
| 100 Mbps, 100 ms RTT | 985 reqs/sec | 980 reqs/sec |
| 100 Mbps, 20 ms RTT | 4650 reqs/sec | 4525 reqs/sec |
| 100 Mbps, 10 ms RTT | 8450 reqs/sec | 7155 reqs/sec |
OpenLiteSpeed is a little faster at 100 ms and significantly faster at 20 ms and 10 ms RTT.
-n 100000 -c 100 -m 10 -t 10
| | OLS | nginx |
|---|---|---|
| 100 Mbps, 100 ms RTT | 9010 reqs/sec | 7365 reqs/sec |
| 100 Mbps, 20 ms RTT | 24,700 reqs/sec | 5850 reqs/sec * |
| 100 Mbps, 10 ms RTT | 25,230 reqs/sec | 6855 reqs/sec * |

\* High variance
Tellingly, even in the very first test, nginx was using 100% CPU, while OpenLiteSpeed was using about 45%. This is why the nginx numbers do not improve as the RTT goes down. OpenLiteSpeed, on the other hand, still does not reach 100% CPU even at 20 and 10 ms RTT.
-n 1000000 -c 100 -m 10 -t 10
To issue more than 1000 requests per connection, we need to set nginx’s http3_max_requests parameter to 10000 from its default value of 1000.
| | OLS | nginx |
|---|---|---|
| 200 Mbps, 10 ms RTT | 29,900 reqs/sec | 7180 reqs/sec * |

\* High variance
Now we’ve managed to get OpenLiteSpeed to use 100% CPU. During this test, nginx allocated more than 1 GB of memory (resident size, as shown by top(1)). OLS never exceeded 28 MB.
That’s more than 4 times the throughput at about 1/37th the memory cost.
Fetching a single file
In this scenario, we will fetch a single file under different network conditions and measure how long it takes to download the file.
10 MB
| | OLS | nginx |
|---|---|---|
| 10 Mbps, 100 ms RTT | 9.8 sec | 11.2 sec |
| 10 Mbps, 20 ms RTT | 9.7 sec | 10.8 sec |
| 10 Mbps, 10 ms RTT | 9.4 sec | 10.8 sec |
We see that nginx is somewhat slower in this test. At the same time, it uses a lot more CPU than OpenLiteSpeed in each of the tests above: between 3 and 4 times more.
100 MB
| | OLS | nginx |
|---|---|---|
| 100 Mbps, 100 ms RTT | 12.2 sec | 40 sec * |
| 100 Mbps, 20 ms RTT | 9.4 sec | 40 sec * |
| 100 Mbps, 10 ms RTT | 9.3 sec | 30 sec |

\* High variance
In all three benchmarks, nginx used 100% CPU, which is the most likely reason for its poor performance.
1 GB
We tried downloading a 1 GB file from nginx at 1 Gbps, but got tired of waiting for the test to finish. Our guess is that the performance difference between OLS and nginx is even more drastic at this speed.
Shallow Queue
We have seen that nginx struggles when bandwidth is high. Let’s see how it does when bandwidth is low. One twist: we emulate a shallow queue using netem’s limit parameter, setting it to 7 packets.
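The shallow-queue condition adds the limit parameter to the same kind of netem command used above (a sketch with the same assumptions: loopback device, one-way delay at half the target RTT):

```shell
# 5 Mbps, 20 ms RTT, and a 7-packet queue (netem's default limit is 1000).
tc qdisc add dev lo root netem delay 10ms rate 5mbit limit 7
```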
Fetching a single 10 MB file
| | limit | OLS | nginx |
|---|---|---|---|
| 5 Mbps, 20 ms RTT | 1000 * | 19.6 sec | 22.5 sec |
| 5 Mbps, 20 ms RTT | 7 | 29.6 sec | 48.1 sec |

\* netem default
Introducing a shallow queue on the path increases OLS’s download time by about 50%, whereas nginx’s more than doubles. In both cases, LiteSpeed is significantly faster than nginx.
OpenLiteSpeed HTTP/3 is Better Than nginx
We compared OpenLiteSpeed and nginx using several types of benchmarks. In all tests, LiteSpeed performs better than nginx: it transfers files faster and uses less CPU and memory. nginx never reaches TCP-level throughput at low bandwidth. At high bandwidth, nginx throughput is a fraction of that of LiteSpeed.
nginx’s HTTP/3 is not ready for production use. It delivers poor performance and, at the same time, uses too much CPU and memory.
This result is not surprising. QUIC and HTTP/3 are complex protocols. New implementations will have a hard time matching the performance of LiteSpeed. Ours is a mature implementation, as we first shipped production-grade QUIC support back in the summer of 2017.
nginx will likely improve in the future. We look forward to more benchmark testing when that occurs. Until then, LiteSpeed HTTP/3 cannot be beat.