Notes from the Road: QUIC Interim Zurich
Push this button to melt this protocol!
– Brian Trammell, 2/5/2020
The QUIC Working Group Interim meeting in Zurich took place on February 5 and 6 (Wednesday and Thursday). This is the second article in the two-article series – the first described the Interop that ran earlier that week. My recap is not exhaustive; see the official minutes for the full list of agenda items.
At the beginning of Wednesday’s session, Lars Eggert (one of the co-chairs; NetApp) updated the Working Group on the results of the Interop. In short: we’re still playing catch-up with the changes in the draft; getting the transfer performance batch (the “T” letter) is tricky; and very few implementations support the ECN feature. (lsquic does support ECN, which is an optional part of the protocol.)
Varinting the Transport Parameters
The discussion began with one of the more contentious issues (#3294): changing the way transport parameters are encoded. In the current draft (25), they look odd because they do not use variable-length integers; almost everything else in the transport draft uses varints. The purist point of view, which David Schinazi (Google) advocated, is that we should make the draft more uniform and use variable-length integers throughout. The opposing view (and my initial position) is that this is a cosmetic change and it is too late in the process to make cosmetic changes.
The argument then moved on to whether the limit of 64K possible transport parameters imposed by the current encoding is important. More experienced members of the WG stated that this is indeed the case. A smaller set of codepoints leads to registrars being very protective of the unallocated values. A practically infinite (2^62 – 1) set of codepoints would make these concerns moot.
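To make the size difference concrete, here is a minimal sketch of the variable-length integer encoding used elsewhere in the transport draft: the two most significant bits of the first byte select a length of 1, 2, 4, or 8 bytes, leaving 6, 14, 30, or 62 bits for the value. The function names are mine, not taken from any implementation.

```python
def encode_varint(value):
    """Encode a non-negative integer as a QUIC variable-length integer."""
    if value < 0:
        raise ValueError("varints are unsigned")
    # Each entry: (exclusive upper bound, encoded length, two-bit prefix).
    for limit, length, prefix in ((1 << 6, 1, 0b00), (1 << 14, 2, 0b01),
                                  (1 << 30, 4, 0b10), (1 << 62, 8, 0b11)):
        if value < limit:
            encoded = bytearray(value.to_bytes(length, "big"))
            encoded[0] |= prefix << 6  # the top two bits carry the length
            return bytes(encoded)
    raise ValueError("value does not fit in 62 bits")


def decode_varint(data):
    """Return (value, bytes consumed) for the varint at the start of data."""
    if not data:
        raise ValueError("empty input")
    length = 1 << (data[0] >> 6)  # 1, 2, 4, or 8 bytes
    if len(data) < length:
        raise ValueError("truncated varint")
    mask = (1 << (8 * length - 2)) - 1  # strip the two prefix bits
    return int.from_bytes(data[:length], "big") & mask, length
```

With this encoding, small parameter IDs still fit in a single byte, while the largest representable ID is 2^62 – 1 rather than 65535.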
Having considered these, I switched my preference from “no” to “neutral,” even though this, in effect, becomes wasted work. In addition, we would have to support both encoding types when doing both ID-25 and ID-26 drafts.
In the end, the consensus was to make this change, and so we are again changing something significant in the draft. To lessen the impact of this change, the editors promised to release ID-26 promptly, so that we have time to implement it before the Vancouver Interop in late March.
Two related issues were briefly touched upon. First, should there be a requirement to detect duplicate transport parameters? Second, should the transport parameters be required to be sorted? Certainly, duplicate detection is trivial when parameters are sorted. Nevertheless, both these proposals were shot down. Martin Thomson (Mozilla) pointed out that the lack of ordering requirements for TLS extensions turned out to be integral to making backwards compatibility with TLS 1.2 possible.
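For what it is worth, an implementation that chooses to detect duplicates on its own does not need sorted input. The following is a small illustrative sketch; the function name is hypothetical, and parameter IDs are modeled as plain integers.

```python
def find_duplicate_ids(param_ids):
    """Return the set of transport parameter IDs that occur more than once."""
    seen = set()
    duplicates = set()
    for param_id in param_ids:
        if param_id in seen:
            duplicates.add(param_id)
        seen.add(param_id)
    return duplicates
```

A single pass with a set costs about as much as checking neighbors in a sorted list, so the ordering requirement buys little for this purpose.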
CID Retirement Timeout
Gorry Fairhurst expressed his concern with the lack of clarity regarding the wait time before a connection ID must be retired in issue 3215. The discussion revolved around just how long an endpoint could wait after receiving a NEW_CONNECTION_ID frame before issuing a RETIRE_CONNECTION_ID in response. If this time is too short, the tarrying endpoint risks receiving a stateless reset when it tries to use the connection ID that was supposed to be retired. Igor Lubashev (Akamai) and a few others pointed out that forgetting a stateless reset token and sending the RETIRE_CONNECTION_ID frame for the corresponding CID do not have to happen at the same time. That is to say, the risk of an aborted connection due to a stateless reset is minimal.
As this discussion wore on without resolution, Lars preempted it. He asked for a self-organized lunch group to come to a consensus during lunch (naturally). At lunch, it was resolved to close this issue with no action (that is, no design change) and potentially a new editorial issue to clean up existing text.
Padding Outside QUIC Packet
Issue 3333 asks whether UDP datagram padding should be counted toward the server’s amplification limit. To prevent amplification attacks, a QUIC server is not to send more than three times the number of bytes it received on the connection until the client address is verified. The current version (25) of the Transport Draft recommends padding Initial QUIC packets using PADDING frames. On the other hand, it does not proscribe padding the UDP datagrams that carry QUIC packets with garbage. At least one client implementation pads Initial packets in this manner.
One view is that what matters are bytes received from the client. Since garbage bytes that follow an Initial packet in a UDP datagram are bytes, those bytes should count. The opposing view, advocated by Christian Huitema and others, is that QUIC is a cryptographic protocol, which means that the only way to attribute bytes to a connection is by using QUIC packets.
In the end, the first view had more support. The next version of the draft will count garbage trailing bytes toward the server amplification limit.
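A minimal sketch of the bookkeeping this implies for a server, assuming it tracks one counter pair per unvalidated client address. The class and method names are hypothetical; the point is only that the received count uses the size of the whole UDP payload, trailing garbage included.

```python
class AmplificationLimiter:
    """Track how much a server may send before the client address is validated."""

    FACTOR = 3  # the three-times limit described above

    def __init__(self):
        self.bytes_received = 0
        self.bytes_sent = 0
        self.address_validated = False

    def on_datagram_received(self, datagram_size):
        # Count the entire UDP payload, including any bytes that follow
        # the last QUIC packet in the datagram.
        self.bytes_received += datagram_size

    def on_address_validated(self):
        self.address_validated = True

    def can_send(self, datagram_size):
        if self.address_validated:
            return True
        budget = self.FACTOR * self.bytes_received
        return self.bytes_sent + datagram_size <= budget

    def on_datagram_sent(self, datagram_size):
        self.bytes_sent += datagram_size
```

Under the opposing view, on_datagram_received would instead be given only the length of the QUIC packets that were successfully processed.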
Recommend ECN Strongly
Mirja Kühlewind, one of the co-authors of the QUIC Operability Drafts, advocated for recommending ECN more strongly in issue 3373, for she deemed the current text to be too cautious. Agreeing with her were Brian Trammell (Google), David Schinazi (Google), and Lars Eggert. Eric Kinnear (Apple) reported that Apple turned on ECN for TCP three years ago and it hasn’t caused outages. Matt Joras (Facebook) and Roberto Peon (Facebook) were more cautious: “recommending ECN without deployment experience seems strange.”
After a rather prolonged back-and-forth, the consensus was that the current text in the draft is encouraging enough and that Appendix B describing a sample ECN support detection algorithm will stay until we have something better to recommend. Mirja will work on minor editorial text improvements.
There Be Extension Dragons
What happens when two QUIC extensions are not compatible? For example, they might modify an existing frame, such as the ACK frame, in conflicting ways. This is the subject of issue 3332.
Some proposed disallowing modifying existing frames. Igor Lubashev offered an algorithm for extending frames in an unambiguous manner. Others pointed out that extensions may be incompatible in a variety of ways, of which frame formatting is only one.
In the end, it was decided to let the problem resolve itself naturally – in chronological order, as extensions get proposed – and to add a comment to the draft that there be dragons here. Dragons indeed.
DCID for Handshake Retransmission
Should the server be allowed to issue new connection IDs before the handshake is complete? This is the subject of issue 3348, raised by Marten Seemann. Why would a client change its Destination CID at that time? Brian pointed out that there is no reason for the client to do it. Yet, the discussion went on. Martin Thomson said that coupling the handshake state machine with the CID state machine complicates things, and so it is uncomfortable to forbid changing CIDs until 1-RTT packets. Eventually Lars stopped the back-and-forth, directing the people most interested in this problem to come up with something during lunch.
The lunch consensus – I take it, people tend to be more agreeable when their bellies are full – was to allow the client to change DCID at any time. In other words, if a server does not want the client to do this, it should not issue new CIDs too early. An editorial clarification or two may follow.
ACK Generation Recommendation
Jana Iyengar (Fastly) opened issue 3304 before he and Ian Swett (Google) co-authored the Delayed Acks extension. (I think their collaboration was the result of the conversation in this GitHub thread). Jana pointed out that the current recommendation – produce one ACK for every two ACK-eliciting packets – results in too many ACK frames in the normal case. “Too many” here means that the large number of ACK packets sent and received affects throughput negatively. (This was likely one of the reasons for the poor nginx/quiche performance uncovered in our November benchmark tests.)
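As a rough illustration of the recommendation under discussion, the sketch below counts ACK-eliciting packets and signals an ACK once a threshold is reached. It is deliberately simplified: the names are hypothetical, and a real receiver also needs an ACK delay timer and special handling of reordered packets, both omitted here.

```python
class AckPolicy:
    """Decide when to send an ACK based on a packet-count threshold."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.unacked = 0  # ACK-eliciting packets received since the last ACK

    def on_packet_received(self, ack_eliciting=True):
        """Return True when an ACK should be sent right away."""
        if not ack_eliciting:
            return False
        self.unacked += 1
        if self.unacked >= self.threshold:
            self.unacked = 0
            return True
        return False


def acks_generated(policy, packet_count):
    """Count the ACKs triggered by a run of ACK-eliciting packets."""
    return sum(policy.on_packet_received() for _ in range(packet_count))
```

With the default threshold of two, a burst of 1000 ACK-eliciting packets triggers 500 ACKs; a threshold of ten would trigger 100. That difference is the cost Jana was pointing at.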
Several implementations already do not follow this recommendation. Is providing a recommendation in the draft and expecting people to ignore it the best we can do? The tricky part, of course, is making a specific recommendation that works well most of the time. The several schemes discussed in the GitHub issue all suffer from some form of degradation under some circumstances or when interacting with a particular type of congestion controller.
Not having come up with anything conclusive, we resolved to close this issue with no action. The current recommendation is good enough, at least for now. In addition, we have the Delayed Acks extension that we could use to experiment.
Issuing New CIDs in 0-RTT Packets
In issue 3423, Marten posed another corner-case riddle. Because the client does not have to remember the maximum number of active connection IDs the server supports, it risks overrunning this limit by issuing new connection IDs in its 0-RTT packet.
Corner cases are the bane of our Working Group. Discussing them takes up valuable time, time that would be better spent discussing more practical matters. This corner case took up about 1.5 hours – almost as much as all of Thursday’s discussions of the QUIC extension drafts combined!
The scope of the discussion is documented well in the minutes. None of us could put forth a plausible scenario for using NEW_CONNECTION_ID frames in 0-RTT packets. The voices of reason who advocated banning these frames in 0-RTT packets outright (mine and Christian Huitema’s) almost prevailed at one point, only to see our ten-votes-to-three advantage go down to a one-vote-to-ten shortage. (Christian was remote and so his vote naturally was not counted.) Martin Duke (F5), who was running the vote counting, offered me a chance to “die on this hill,” which is an IETF expression for a last-ditch effort to object, akin to Demi Moore’s character’s strenuous objection in the movie A Few Good Men. Dying is a bit too dramatic for me, so I quipped, “I refuse to die!” to the clapping of three dozen pairs of hands. Thus was it decided that the client would just have to remember the server’s active_connection_id_limit setting.
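In practice, “remembering” could look something like the sketch below: the client keeps the server’s active_connection_id_limit with the rest of its saved 0-RTT state and consults it before issuing a new connection ID. The class is hypothetical, and it assumes that the connection ID used during the handshake (sequence number 0) counts toward the limit.

```python
class CidIssuer:
    """Issue connection IDs without exceeding the peer's remembered limit."""

    def __init__(self, peer_active_cid_limit):
        # Assumption: this value was saved from a previous connection
        # together with the other remembered transport parameters.
        self.peer_limit = peer_active_cid_limit
        self.active = {0}   # sequence numbers issued and not yet retired
        self.next_seq = 1

    def can_issue(self):
        return len(self.active) < self.peer_limit

    def issue(self):
        """Return the sequence number for a new NEW_CONNECTION_ID frame."""
        if not self.can_issue():
            raise RuntimeError("peer's active connection ID limit reached")
        seq = self.next_seq
        self.next_seq += 1
        self.active.add(seq)
        return seq

    def on_retire(self, seq):
        # The peer sent RETIRE_CONNECTION_ID for this sequence number.
        self.active.discard(seq)
```

A client built this way simply holds back further NEW_CONNECTION_ID frames in 0-RTT packets once can_issue() returns False.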
Changing ACK Frequency
The authors of the Delayed Acks extension explored the problem area, highlighting the tension between having so many ACKs that generating and processing them reduces throughput and having so few ACKs that the resulting retransmissions also reduce throughput.
The Working Group was mildly enthusiastic about the possibilities this extension could offer. There is plenty of room for optimization, but there are also some hidden stones. Martin Duke asked whether the parameters communicated via the new ACK_FREQUENCY frame were advisory or mandatory. The answer is clearly advisory, as the sender of the ACK has its own limits or preferences that would trump those of its peer.
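If the parameters are treated as advisory, the receiving side ends up clamping the peer’s request to what it is willing to do. The sketch below shows one way that might look. The parameter names and default limits are illustrative assumptions, not fields or values taken from the extension draft.

```python
def effective_ack_settings(requested_packet_tolerance,
                           requested_max_ack_delay_ms,
                           local_max_packet_tolerance=10,
                           local_min_ack_delay_ms=5):
    """Combine a peer's ACK frequency request with local limits.

    Returns (packet_tolerance, max_ack_delay_ms) that the ACK sender
    will actually use. The local limits take precedence.
    """
    # Never wait for more packets than this endpoint is comfortable with,
    # and always ACK after at least one packet.
    packet_tolerance = max(1, min(requested_packet_tolerance,
                                  local_max_packet_tolerance))
    # Never promise a delay shorter than local timers can honor.
    max_ack_delay_ms = max(requested_max_ack_delay_ms, local_min_ack_delay_ms)
    return packet_tolerance, max_ack_delay_ms
```

For example, a request to ACK every 100 packets with a 1 ms delay would be trimmed to every 10 packets and 5 ms under the default limits above.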
Most of us agreed that this work is important, and that experimentation should continue.
Quentin De Coninck (Université catholique de Louvain) talked about multi-path QUIC. He introduced the concept of a QUIC uniflow, which is one half of today’s QUIC path. Using uniflows, one can model MPQUIC effectively.
Quentin shared his implementation experience. Basic multi-path support in quicly weighed in at about 350 lines of code; a full-fledged implementation based on picoquic was 2500 lines of code – still a very reasonable number.
The Working Group peppered Quentin with questions. How does path prioritization affect the scheduler? Is it necessary to change QUIC and HTTP/3 APIs to make this useful? Does this work need its own Working Group?
The biggest chill was put on the proposal by Jana’s observation that there is no application that could use this capability. That is, unless it is completely transparent to the application. (This seems like a chicken-and-egg argument to me. An application utilizing multi-path capabilities cannot be created until those capabilities exist; and multi-path capabilities cannot be developed until there is an application that can use them.) We need a Research Group, said Jana, not a Working Group.
Packet Loss Signaling
Igor presented our packet loss signaling proposal. (I say “our” because I recently became a co-editor of the draft along with Igor and Alexandre Ferrieux (Orange Labs).) I touched upon the proposal in the previous post. The WG expressed several concerns. Matt pointed out that this proposal is unlike others because it modifies the packet headers and it could become a “de facto” extension that everyone is expected to implement. This change would burn both the reserved bits: what if QUICv2 needed more bits in the first byte? Does this change preclude other extensions from using these bits?
The real answer is that this functionality should have been part of the core protocol to begin with. These bits provide real value to network operators; without them, troubleshooting of QUIC network flows is much more difficult. Another extension (that doesn’t yet exist) that would want to use the reserved bits in a similar fashion can use the L bit for its square wave, thus differentiating itself. Further than that, obviously the space is limited, and no amount of ingenuity will be able to squeeze many extensions into two bits.
Other questions resembled those posed at the trial of Joan of Arc. Like the Virgin (although likely possessing less divine guidance), Igor parried admirably:
Marten: Loss detection and recovery in our spec is only advisory.
Igor: When you decide the packet is lost, you increment the counter.
Marten: This is an implementation decision.
Igor: Yes, when you decide it’s lost you increment the counter.
Marten: So, the network needs to know your algorithm.
Igor: No, the assumption is that if the sender thinks it’s lost, then it’s lost.
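The counter Igor refers to can be sketched as follows. This is my own simplified model of a sender that marks two header bits: one bit toggles as a square wave every fixed number of packets, and the other is set on one outgoing packet for each packet the sender has declared lost. The class name, the bit names, and the 64-packet period are assumptions made for illustration.

```python
class LossBitsSender:
    """Compute the two signaling bits for each outgoing packet."""

    Q_PERIOD = 64  # assumed number of packets per half-period of the square wave

    def __init__(self):
        self.packets_sent = 0
        self.unreported_losses = 0

    def on_packet_declared_lost(self):
        # Whatever loss detection algorithm is in use, only its verdict
        # matters here: one declared loss, one increment.
        self.unreported_losses += 1

    def next_bits(self):
        """Return (square_bit, loss_bit) for the next outgoing packet."""
        square_bit = (self.packets_sent // self.Q_PERIOD) % 2
        self.packets_sent += 1
        loss_bit = 0
        if self.unreported_losses > 0:
            self.unreported_losses -= 1
            loss_bit = 1
        return square_bit, loss_bit
```

An observer on the path never needs to know how the sender detects loss; it only counts marked packets, which is the point Igor was making.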
The extension was deemed important enough. To quote Igor, “we need this to make the Internet better.” Mark Nottingham (co-chair of the working group; Fastly) directed us to participate in the experimental QUIC DISPATCH meeting at IETF 107 in Vancouver in March of this year.
The Last Lunch
The Interim wrapped up early: around half past one on Thursday. About half of us headed for a free lunch at the Google cafeteria, which was just down the corridor.
The food is fancy at Google Zurich. The “small plates” concept is in fashion there as well, so I got myself two plates. And a sandwich.
I sat across from Junho Choi (Cloudflare), with whom I talked shop: QUIC performance and how it is affected by ACK rate, quirks of Rust and C++ programming, and QUIC deployment issues. Later we were joined by Lucas Pardue (Cloudflare), the newly minted QUIC WG co-chair. We discussed the events of the past few days and assessed the results of the Interim.
As usual, it was great to share implementation experience with like-minded people. I look forward to comparing notes with them again.
The Chagall Windows
I stepped out from the Google building into a fresh and sunny winter afternoon. After several gloomy, rainy days everyone was visibly in a better mood. I hurried to the Fraumünster Church while the sun was still reasonably high.
Marc Chagall created these three (and three more, for a total of six) beautiful stained-glass windows for the Fraumünster Church in Zurich. These were installed in 1970, when he was 83. He started doing this type of art late in life. Fame had come to Chagall earlier for his paintings, about ten of which are on display in the Zurich art museum (Kunsthaus).
The church itself is a reincarnation of previous churches, which have existed on this spot for the last 1200 years. The crypt reveals parts of the original church’s foundation. What were the names of the workers who cut and laid those stones? – I wondered – Were they happy or sad? Did they know that what they had wrought would be still around after so much time?
Like the builders of the newer church, we base the workings of a new protocol on an old foundation. Like Chagall’s, our conception of what could be must be fitted into a canvas of a predetermined size. If we be granted only a fraction of those craftsmen’s and that artist’s inspiration and perseverance, then our project – QUIC – will surely be successful.