LiteSpeed Beats nginx in HTTP/3 Benchmark Tests


LiteSpeed’s HTTP/3 implementation outperforms that of the hybrid nginx/quiche in a battery of benchmark tests. OpenLiteSpeed transfers resources more quickly, scales better, and uses less CPU and memory while doing it. In each of these metrics, LiteSpeed betters nginx by a factor of 2 or more. nginx could not complete some tests, while in others, it achieved only a fraction of TCP speed. Below, we describe the setup and the various benchmarks.

Why Compare LiteSpeed and nginx HTTP/3 Now?

HTTP/3 is a new protocol for the web, a successor to both Google QUIC and HTTP/2. As the QUIC Working Group at IETF is getting closer to finalizing the drafts, the nascent HTTP/3 implementations are maturing and some are now starting to see production use.

LiteSpeed was the first to ship HTTP/3 support in LSWS in July of this year. We have been supporting QUIC since 2017, making improvements along the way. The HTTP/3 support rests on this foundation.

Several weeks ago, Cloudflare released a special HTTP/3 patch for nginx, encouraging everyone to experiment. Because nginx is our competitor, we were excited to get a chance to kick nginx’s shiny new HTTP/3 tires. Cloudflare uses this patch in production, so it has to be good!

Cloudflare’s Quiche Patch

Quiche is an HTTP/3 and QUIC library by Cloudflare. It is written in Rust, a new high-level language. The library provides a C API, which is how it is used by nginx.

Benchmark Setup

Platform

Both servers and the load tool run on the same VM, an Ubuntu 14 machine with 32 GB of RAM and a 20-core Intel Xeon E7-4870. Bandwidth and RTT are controlled using netem.
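The exact netem invocation is not listed in this post. As a rough sketch, assuming traffic runs over the loopback device `lo` and that each direction gets half of the target RTT (both are assumptions, as is the helper name `netem_command`), the `tc` command could be assembled like this:

```python
def netem_command(rate_mbit, rtt_ms, limit=1000, dev="lo"):
    """Build a `tc` argv list that shapes `dev` with netem.

    Assumptions (not taken from the benchmark write-up): traffic uses the
    loopback device, and the one-way delay is half of the target RTT.
    `limit` is the netem queue length in packets; 1000 is netem's default.
    """
    one_way_ms = rtt_ms / 2
    return [
        "tc", "qdisc", "replace", "dev", dev, "root", "netem",
        "delay", f"{one_way_ms:g}ms",
        "rate", f"{rate_mbit:g}mbit",
        "limit", str(limit),
    ]

# 100 Mbps with a 20 ms round trip
print(" ".join(netem_command(100, 20)))
```

Running the resulting command requires root. Passing `limit=7` gives a shallow queue of seven packets.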

Web Servers

For LiteSpeed, we use OpenLiteSpeed, which is the open-source version of our flagship LiteSpeed Web Server. We use version 1.6.4, which can be downloaded here.

For nginx, we use 1.16.1 with the Cloudflare quiche patch. Compilation steps are described here.

Both OpenLiteSpeed and nginx were configured to use one worker process. To be able to issue 1,000,000 requests using 100 connections, nginx’s maximum requests setting was increased to 10,000.

# OpenLiteSpeed
httpdWorkers      1

# nginx
worker_processes  1;
http {
  server {
    access_log off;
    http3_max_requests 10000;
  }
}

Website

The website is a simple selection of static files: a 163-byte index file and files of 1 MB, 10 MB, 100 MB, and 1 GB in size. You can use this script to generate these.

Load Tool

We use h2load with HTTP/3 support to generate load. It is built very easily using the supplied Dockerfile.

Benchmark Tests

To get each number — requests per second or time to fetch a resource — three tests were run and the median value was taken.
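Taking the median of three runs keeps a single outlier from skewing the reported number. A minimal sketch (the function name and the sample readings are placeholders, not measured data):

```python
from statistics import median

def reported_value(runs):
    """Return the median of exactly three benchmark runs."""
    if len(runs) != 3:
        raise ValueError("expected three runs")
    return median(runs)

# Placeholder readings in requests per second, not real measurements.
print(reported_value([910, 955, 935]))
```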

Fetching small page

The index page is 163 bytes. We will fetch it in several ways using different network conditions.

h2load options of interest:

  • -n: Total number of requests to send
  • -c: Number of connections
  • -m: Number of concurrent requests per connection
  • -t: Number of h2load threads
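To show how these four options combine, here is a small sketch that assembles an h2load command line. The URL and the helper name are hypothetical. Any option needed to select HTTP/3 depends on the h2load build and is deliberately left out.

```python
def h2load_command(url, requests, connections, streams=1, threads=1):
    """Build an h2load argv list from the four options described above."""
    return [
        "h2load",
        "-n", str(requests),     # total number of requests
        "-c", str(connections),  # number of connections
        "-m", str(streams),      # concurrent requests per connection
        "-t", str(threads),      # h2load threads
        url,
    ]

# Hypothetical target URL, for illustration only.
print(" ".join(h2load_command("https://localhost/", 100000, 100, streams=10, threads=10)))
```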

-n 10000 -c 100

Conditions              OLS              nginx
100 Mbps, 100 ms RTT    935 reqs/sec     890 reqs/sec
100 Mbps, 20 ms RTT     3915 reqs/sec    2910 reqs/sec
100 Mbps, 10 ms RTT     6420 reqs/sec    4100 reqs/sec

-n 100000 -c 100 -t 10

This is a longer run: each connection now sends 1000 requests.

Conditions              OLS              nginx
100 Mbps, 100 ms RTT    985 reqs/sec     980 reqs/sec
100 Mbps, 20 ms RTT     4650 reqs/sec    4525 reqs/sec
100 Mbps, 10 ms RTT     8450 reqs/sec    7155 reqs/sec

OpenLiteSpeed is a little faster at 100 ms and significantly faster at 20 ms and 10 ms RTT.

-n 100000 -c 100 -m 10 -t 10

Conditions              OLS                nginx
100 Mbps, 100 ms RTT    9010 reqs/sec      7365 reqs/sec
100 Mbps, 20 ms RTT     24,700 reqs/sec    5850 reqs/sec *
100 Mbps, 10 ms RTT     25,230 reqs/sec    6855 reqs/sec *

* High variance

Tellingly, in the very first test, nginx was using 100% CPU, while OpenLiteSpeed was using about 45%. This is why nginx's numbers do not improve as the RTT goes down. OpenLiteSpeed, on the other hand, still does not reach 100% CPU even at 20 ms and 10 ms RTT.
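One way to read these tables: while the server keeps up, each stream can complete at most one request per round trip, so the request rate is capped at roughly connections × streams / RTT. The following back-of-the-envelope sketch ignores handshakes, transfer time, and load-tool overhead. The function name is ours.

```python
def latency_bound_rps(connections, streams, rtt_ms):
    """Upper bound on requests/sec when every stream waits one RTT per request."""
    return connections * streams * 1000 / rtt_ms

for rtt_ms in (100, 20, 10):
    without_m = latency_bound_rps(100, 1, rtt_ms)   # -c 100
    with_m = latency_bound_rps(100, 10, rtt_ms)     # -c 100 -m 10
    print(f"{rtt_ms} ms RTT: {without_m:.0f} and {with_m:.0f} reqs/sec")
```

At 100 ms RTT with -c 100 -m 10, the cap is 10,000 reqs/sec, and OpenLiteSpeed's 9010 is close to it. At 20 ms the cap rises to 50,000, yet nginx stays below 6000, which is consistent with it being limited by CPU rather than by latency.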

-n 1000000 -c 100 -m 10 -t 10

To issue more than 1000 requests per connection, we need to set nginx’s http3_max_requests parameter to 10000 from its default value of 1000.
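The arithmetic behind this setting, as a quick sketch. It assumes h2load spreads requests evenly across connections. The names are ours.

```python
NGINX_DEFAULT_MAX_REQUESTS = 1000  # default value stated above

def requests_per_connection(total_requests, connections):
    """Ceiling division: the busiest connection's share of the requests."""
    return -(-total_requests // connections)

for total in (100_000, 1_000_000):
    per_conn = requests_per_connection(total, 100)
    needs_raise = per_conn > NGINX_DEFAULT_MAX_REQUESTS
    print(f"-n {total} -c 100: {per_conn} per connection, raise limit: {needs_raise}")
```

With 1,000,000 requests over 100 connections, each connection carries 10,000 requests, which is why the limit is set to 10000.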

Conditions             OLS                nginx
200 Mbps, 10 ms RTT    29,900 reqs/sec    7180 reqs/sec *

* High variance

Now we’ve managed to get OpenLiteSpeed to use 100% CPU. During this test, nginx allocated more than 1 GB of memory (resident size, as shown by top(1)). OLS never exceeded 28 MB.

That’s more than 4 times the performance at about 1/37th the cost.
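Both ratios come straight from the numbers above. In this sketch, "more than 1 GB" is taken as 1024 MB, so the memory ratio is a lower bound:

```python
ols_rps, nginx_rps = 29_900, 7_180   # requests per second from the table
nginx_rss_mb, ols_rss_mb = 1024, 28  # resident memory; 1 GB taken as 1024 MB

speedup = ols_rps / nginx_rps
memory_ratio = nginx_rss_mb / ols_rss_mb

print(f"{speedup:.2f}x the throughput at 1/{memory_ratio:.0f} the memory")
```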

Fetching single file

In this scenario, we will fetch a single file under different network conditions and measure how long it takes to download the file.

10 MB

Conditions             OLS        nginx
10 Mbps, 100 ms RTT    9.8 sec    11.2 sec
10 Mbps, 20 ms RTT     9.7 sec    10.8 sec
10 Mbps, 10 ms RTT     9.4 sec    10.8 sec

We see that nginx is somewhat slower in this test. At the same time, it uses a lot more CPU than OpenLiteSpeed in each of the tests above: between 3 and 4 times more.

100 MB

Conditions              OLS         nginx
100 Mbps, 100 ms RTT    12.2 sec    40 sec *
100 Mbps, 20 ms RTT     9.4 sec     40 sec *
100 Mbps, 10 ms RTT     9.3 sec     30 sec

* High variance

In all three benchmarks, nginx used 100% CPU, which is the most likely reason for its poor performance.
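For context, the download times can be compared with the time the link itself needs. This sketch assumes decimal units (1 MB = 8,000,000 bits, 1 Mbps = 1,000,000 bits per second) and ignores protocol overhead. If the files are sized in binary megabytes, the ideal time is about 8.4 seconds instead of 8. The function names are ours.

```python
def ideal_seconds(size_mb, rate_mbps):
    """Time to push size_mb megabytes through a rate_mbps link, no overhead."""
    return size_mb * 8 / rate_mbps

def utilization(size_mb, rate_mbps, measured_seconds):
    """Fraction of the link's capacity that the measured download achieved."""
    return ideal_seconds(size_mb, rate_mbps) / measured_seconds

print(ideal_seconds(10, 10), ideal_seconds(100, 100))
print(round(utilization(100, 100, 9.3), 2))  # OLS, 10 ms RTT
print(round(utilization(100, 100, 30), 2))   # nginx, 10 ms RTT
```

Under these assumptions, both the 10 MB and 100 MB downloads have an ideal time of 8 seconds. At 100 Mbps and 10 ms RTT, OpenLiteSpeed uses about 86% of the link and nginx about 27%.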

1 GB

I tried downloading the 1 GB file from nginx at 1 Gbps, but I got tired of waiting for it to finish. My guess is that the performance difference between OLS and nginx is even more drastic at this speed.

Shallow Queue

We have seen that nginx struggles when bandwidth is high. Let’s see how it does when bandwidth is low. One twist is that we will use a shallow queue using netem’s limit parameter. Here, we will set it to 7.

Fetching single 10 MB file

Conditions           limit     OLS         nginx
5 Mbps, 20 ms RTT    1000 *    19.6 sec    22.5 sec
5 Mbps, 20 ms RTT    7         29.6 sec    48.1 sec

* netem default

Introducing a shallow queue on the path increases OLS download time by about 50%, whereas nginx download time more than doubles. In both cases, LiteSpeed is significantly faster than nginx.
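The percentages follow from the download times in the table. A quick check (the helper name is ours):

```python
def slowdown_pct(baseline_s, shallow_s):
    """Percentage increase in download time relative to the default queue."""
    return (shallow_s - baseline_s) / baseline_s * 100

print(f"OLS:   {slowdown_pct(19.6, 29.6):.0f}% longer")
print(f"nginx: {slowdown_pct(22.5, 48.1):.0f}% longer")
```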

OpenLiteSpeed HTTP/3 is Better Than nginx

We compared OpenLiteSpeed and nginx using several types of benchmarks. In all tests, LiteSpeed performs better than nginx: it transfers files faster and uses less CPU and memory. nginx never reaches TCP-level throughput at low bandwidth. At high bandwidth, nginx throughput is a fraction of that of LiteSpeed.

nginx’s HTTP/3 is not ready for production use. It delivers poor performance and, at the same time, uses too much CPU and memory.

This result is not surprising. QUIC and HTTP/3 are complex protocols. New implementations will have a hard time matching the performance of LiteSpeed. Ours is a mature implementation, as we first shipped production-grade QUIC support back in the summer of 2017.

nginx will likely improve in the future. We look forward to more benchmark testing when that occurs. Until then, LiteSpeed HTTP/3 cannot be beat.

