Developer’s Corner: HoL Blocking and Priority Inversion
Producing a native header compression mechanism for QUIC is one of the goals of the IETF QUIC Working Group. The approach used in the Google QUIC (gQUIC) suffers from head-of-line blocking (HoLB) and is thus suboptimal. I had been interested in header compression for some time and, in November of 2017, I added my own proposal (QMIN) to the other two proposals already under consideration. The Working Group had empaneled the Compression Design Team and tasked it with coming up with a single proposal. Our team has six members:
- Alan Frindell, the author of proxygen;
- Charles “Buck” Krasic, the author of the QCRAM compression proposal;
- Dmitri Tikhonov, the author of the QMIN compression proposal.
- Martin Thomson, the co-editor of the QUIC Internet Draft;
- Mike Bishop, the author of the QPACK compression proposal; and
- Roberto Peon, the co-author of HPACK (RFC 7541).
There are many design decisions that go into producing a compression mechanism. The resulting proposal (Internet Draft) will hopefully become a standard (RFC) one day. The rationale behind many aspects of a standard usually does not get included into RFCs: this is not what the RFCs are for. In this article — which may become first in a series of several — I will describe an interesting problem that could impact the choices the Compression Design Team makes.
QUIC streams have priorities which the application layer can set. HTTP/2 and, consequently, HTTP/QUIC use priorities to improve UX. Usually, priorities are assigned by content type: HTML has higher priority than CSS, which in turn has higher priority than images. In general, an application using the protocol may assign any priorities to suit its needs.
In gQUIC, the header blocks are delivered on a separate, high-priority (higher than all other priorities on the connection save the HANDSHAKE stream), unbounded HEADERS stream. The HTTP payload-bearing streams’ priorities are used to select which HTTP message to process. Lost or late packets on the HEADERS stream may cause other streams to wait. (This is the head-of-line blocking problem that the Design Team is trying to solve.) The HEADERS stream itself does not depend on other streams and it is always processed first.
In the current IETF HTTP/QUIC draft, on the other hand, the header blocks are part of the request/response stream itself. In the case where the header compression protocol mandates that the decoder be able to wait on unresolved entries (this is the decoder HoL blocking), the compression protocol and the QUIC protocol can actively hinder each other in the following manner:
- HoL blocking can cause priority inversion; and
- Priority inversion can cause HoL blocking.
HoL Blocking Causes Priority Inversion
This situation can occur when a server sends two responses (let’s call them A and B) where the header block of a higher-priority response (for instance, A) references entries first inserted in the header block of the lower-priority response.
Here, the server does the correct thing — since it got the B request first and the A request has not yet arrived, it encodes and sends the response, even though a higher-priority request is coming.
The client, too, does everything correctly: after receiving the responses, it processes the higher-priority A response first.
Stream Priorities Cause HoL Blocking
Priority-induced HoL blocking can happen whenever there is packet loss. If a high-priority response A depends — via its header block compression — on a low-priority response B which gets lost, the server will choose to send the higher-priority data on stream A before retransmitting the lost B response. The client will have to buffer the whole A response or fill the A stream buffer before it receives the required B response and can proceed.
The example is a variation of the above when the B response gets lost:
The server’s choice to send higher-priority stream data is in accordance with the QUIC recommendation. It does everything correctly.
Attention: This is worse than the original gQUIC HoL blocking problem, for the gQUIC server would choose to send the missing header block first.
More Potential Troubles
Above, only static priorities are considered. QUIC allows stream priorities to be adjusted dynamically. An implementation that changes stream priorities based on some performance metrics — for example, as an indirect result of priority inversion — may only make the problem worse.
Adding stream dependencies in the form of header block dependencies in the decoder introduces a conflict with the QUIC stream priorities whenever there is packet loss or reordering. This conflict can cause HoL blocking, resulting in a markedly degraded user experience. Since both packet loss and packet reordering happen constantly (the former is how congestion control works), the problem will be ubiquitous.
These issues can be solved by designing the protocol so that either
- Response or request headers can be processed immediately; or
- Operations that modify the table state are sent on a different, special-purpose stream (or streams).
Text by: Dmitri Tikhonov
Graphics by: Mark Zou