QUIC Performance Benchmarks with HTTP/2
As we develop our own IETF QUIC implementation, we are focused on protocol correctness and robust performance. To see how our implementation's performance compares, we decided to benchmark it against other publicly available implementations.
Since there are few stable IETF QUIC implementations, we have restricted the list of servers tested to mature GQUIC implementations. We assume that the performance of an IETF QUIC implementation and its GQUIC counterpart should match closely.
We have designed a flexible QUIC benchmarking program and included a testing script for easy test automation. This open-source program is available in the lsquic repository on GitHub. It can request an array of files over many concurrent connections using many concurrent streams. This flexibility makes it a useful tool for understanding the performance of a QUIC implementation.
We used this program and testing script to run a series of benchmark tests to understand the strengths and weaknesses of our own QUIC implementation, as well as other publicly available implementations. The benchmarking tests are meant to push servers to their limit and simulate realistic QUIC traffic. They range from requesting many small files to requesting a few large files.
The benchmark tests follow the client-server model and take place on a private network with a 10G link between the two testing machines. The client testing program used is the http_client sample client distributed with the lsquic-client repository.
The independent variables in this benchmark are server type, the number of requests, the size of requests, the number of concurrent clients, the number of concurrent connections per client, and the number of concurrent streams per client.
The server types tested in this experiment are lsws, proto-quic, caddy, and quic-go. Not all server types will be included in each test.
The dependent variables in this benchmark are client time, client CPU usage, server CPU usage, and, in some tests, packets received on the client machine.
The testing script runs five trials and calculates the average client time as well as the standard deviation across trials. Using the Linux top(1) command, the tester manually monitors client and server CPU usage and reports an average value, rounded to the nearest multiple of 5.
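The per-test summary described above can be sketched in a few lines of Python. This is only an illustration and not the testing script itself: the function names and the sample trial times are hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption, since the text does not say which variant the script computes.

```python
import statistics


def summarize_trials(client_times_sec):
    """Return (mean, standard deviation) of the per-trial client times.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    mean = statistics.mean(client_times_sec)
    stdev = statistics.stdev(client_times_sec)
    return mean, stdev


def round_to_multiple_of_5(cpu_percent):
    """Round a non-negative CPU reading to the nearest multiple of 5.

    Halves round up (97.5 becomes 100), which avoids the round-half-to-even
    behaviour of Python's built-in round().
    """
    return 5 * int(cpu_percent / 5 + 0.5)


# Hypothetical client times, in seconds, for five trials of one test.
example_times = [10.0, 12.0, 11.0, 13.0, 14.0]
example_mean, example_stdev = summarize_trials(example_times)
```

With the hypothetical times above, the mean is 12.0 seconds, and a CPU reading of 93% would be reported as 95%.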
Round 1: Many Small Requests
With the first test in this series of benchmarks, we looked at how each server would handle a high number of small requests. A single test client would open one hundred concurrent connections, with a maximum of ten concurrent request streams. The client would make one million requests; each request was for a 630-byte file.
|server|client_time_avg (sec)|client_time_stdev (sec)|client_cpu (%)|server_cpu (%)|
The server with the lowest average client time was the LiteSpeed Web Server. It reached full server CPU utilization for a single thread of execution. Google’s proto-quic server had the second lowest average client time and also reached full server CPU utilization for a single thread of execution. Finally, the Golang-based implementations, quic-go and Caddy, reported the highest average client times. Both quic-go and Caddy also used more than a single thread, consuming more resources than the other two servers.
A single test client opens ten concurrent connections, with a maximum of ten concurrent request streams. The client makes four hundred requests; each request is for a 10-megabyte file.
./tools/bench/lsqb.sh -a <CLIENT_PATH> -H <HOSTNAME> -s <IP>:<PORT> -p <FILE_PATH> -T 5 -C 1 -r 400 -c 10 -m 50 -w 10 -K
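For a sense of scale, the payload of a test can be compared against the capacity of the link between the two machines. The sketch below computes a lower bound on transfer time. It assumes that "10G" means 10 gigabits per second and that a megabyte is 10^6 bytes, and it ignores protocol headers, encryption overhead, and retransmissions, so measured client times will be higher. The function name is hypothetical.

```python
def min_transfer_time_sec(num_requests, file_size_bytes, link_bits_per_sec):
    """Lower bound on the time to move the payload at full line rate."""
    total_bits = num_requests * file_size_bytes * 8
    return total_bits / link_bits_per_sec


# Assumption: the "10G link" is 10 gigabits per second.
LINK_BITS_PER_SEC = 10 * 10**9

# Many small requests: one million requests for a 630-byte file.
small_files_sec = min_transfer_time_sec(1_000_000, 630, LINK_BITS_PER_SEC)

# Few large requests: four hundred requests for a 10-megabyte file
# (assumption: 1 megabyte = 10**6 bytes).
large_files_sec = min_transfer_time_sec(400, 10 * 10**6, LINK_BITS_PER_SEC)
```

Under these assumptions the small-file payload needs about 0.5 seconds of link time and the large-file payload about 3.2 seconds, so client time well beyond those bounds reflects protocol and CPU cost rather than link capacity.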
Among the LiteSpeed Web Server (lsws) trials, HTTP/2 had the lowest average client time and HTTP/1.1 had the second lowest. QUIC had the highest average client time of the three protocols.
Among the QUIC trials, LiteSpeed Web Server (lsws) showed the lowest average client time. It was followed by Google’s proto-quic server and then the Golang-based implementations, quic-go and Caddy.