On bufferbloat and shoulders of the giants

Is your internet connection slow?

Yeah yeah, you have a connection with a bandwidth of 7.2 Mbps, but it sulks (no pun intended) and you wonder why.

The answer, mostly, is #bufferbloat [1]. In short, when data flows from Google+ to your computer sitting somewhere in a dark corner on the opposite side of the world, ‘packets’ of this data are stored and forwarded at each intermediate stop they pass through. The protocol your computer and a Google server use to talk to each other across all those intermediate stops is called TCP, and that beast is pretty good at adapting to the best speed at which those two end points can talk to each other.
The way TCP achieves that: when the Google server sends a few thousand packets in one go at the maximum speed it has, your computer, or a stop along the way, may fail to cope with the volume and start dropping packets. (UPDATE: An acknowledgement, hereafter referred to as ACK, is sent on successful delivery of packets. Dropping is assumed if no ACK is received in a reasonable amount of time. For more technical details, see the excellent comment from James Cape.) The Google server then slows down its sending accordingly.
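
That feedback loop can be sketched as a toy model. This is not real TCP: the "add one, halve on loss" rule, the function name and the numbers are illustrative assumptions, chosen only to show a sender speeding up until ACKs go missing and then backing off.

```python
def simulate_sender(link_capacity, rounds, start_rate=1):
    """Toy model of a sender probing for the capacity of a link.

    Each round the sender transmits `rate` packets. If that exceeds
    `link_capacity` (a hypothetical packets-per-round limit), we pretend
    an ACK went missing and the sender halves its rate. Otherwise every
    ACK arrived and the sender tries one packet more next round.
    """
    rate = start_rate
    history = []
    for _ in range(rounds):
        history.append(rate)
        if rate > link_capacity:
            rate = max(1, rate // 2)  # missing ACK: back off sharply
        else:
            rate += 1                 # all ACKs arrived: probe a bit faster
    return history


print(simulate_sender(link_capacity=8, rounds=12))
# -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 4, 5, 6]
```

The sender only learns that it overshot when a packet is actually lost, which is why anything that postpones the loss also postpones the slowdown.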

Wondering where the problem is, then? Well, all those intermediate points, including servers, routers, your broadband ISP’s servers and routers, your WiFi router and even the network card (NIC) in your computer, have a ‘buffer’ that holds a large number of packets until they can be processed or transmitted further. That excessive buffering is the problem: packets that would otherwise have been dropped sit in a queue instead, so the sender hears about congestion late and TCP’s adaptation is defeated. +Jim Gettys was the mastermind who got irritated by this, decided to investigate, zeroed in on the problem, created widespread awareness and, towards the end of 2010, coined the term #bufferbloat for it.
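
To put a rough number on it: the delay a full buffer adds is simply its size divided by the speed of the link draining it. The sketch below uses the 7.2 Mbps figure from the top of this post; the buffer sizes and the function name are made up for illustration, not measured from any real device.

```python
def drain_time_seconds(buffer_bytes, link_bits_per_second):
    """Time the last packet in a full buffer waits before it is sent."""
    return buffer_bytes * 8 / link_bits_per_second


link = 7.2e6  # the 7.2 Mbps connection from the opening example

for size_kb in (16, 256, 1024):  # hypothetical buffer sizes
    delay = drain_time_seconds(size_kb * 1024, link)
    print(f"{size_kb:5d} KB buffer -> {delay * 1000:7.1f} ms of added delay")
```

With these assumed sizes the largest buffer adds more than a second of waiting on its own, before any distance between the two end points is counted.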

Now the question is how to fix it. Looks pretty simple: reduce the size of the buffers, right? Unfortunately, not quite. What size should be the standard? No single size works: a buffer large enough to keep a fast connection busy adds seconds of delay on a slow one, and a buffer small enough for a slow connection cannot keep a fast one full. What was needed was another algorithm, of the kind usually called Active Queue Management, or AQM (UPDATE: not ‘Adaptive’, as corrected by Stuart Cheshire in the comments). There were various attempts to find the One True ™ algorithm, independent of network, bandwidth, timing, buffer size and queue size, and devoid of knobs for fine tuning.

Now, Kathleen Nichols and Van Jacobson (yes, /the/ Van Jacobson) have come up with the algorithm closest to that, called #CoDel [2]. The very first implementations of CoDel, by +Dave Taht and +Eric Dumazet, are going into the #Linux kernel now, along with similar implementations in #CeroWrt (based on #Linux) for routers [3]. This is one piece of the solution to bufferbloat, not the only one, but definitely a step in the right direction.
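
For the curious, here is a heavily simplified sketch of the idea behind CoDel as I understand it from [2]: measure how long each packet sat in the queue, and once that delay has stayed above a small target for a whole interval, start dropping packets at a slowly increasing rate until the delay comes back down. The 5 ms target and 100 ms interval are the values given in the article; the class name, the method names and the `dropped` counter are my own, and details of the real algorithm (such as how it resumes after a recent dropping phase) are left out. It is not the kernel implementation.

```python
import math
from collections import deque

TARGET_MS = 5      # acceptable standing queue delay
INTERVAL_MS = 100  # how long delay must stay above target before dropping


class CoDelSketch:
    """Simplified CoDel-style queue; all times are in milliseconds."""

    def __init__(self):
        self.queue = deque()     # entries are (enqueue_time, packet)
        self.first_above = None  # deadline set when delay first exceeds target
        self.dropping = False
        self.drop_next = 0.0
        self.count = 0           # drops in the current dropping phase
        self.dropped = 0         # total drops, kept only for inspection

    def enqueue(self, packet, now):
        self.queue.append((now, packet))

    def _pop(self, now):
        """Pop one packet and say whether the delay has been bad for long enough."""
        if not self.queue:
            self.first_above = None
            return None, False
        enqueued_at, packet = self.queue.popleft()
        if now - enqueued_at < TARGET_MS:
            self.first_above = None          # delay is fine again
            return packet, False
        if self.first_above is None:
            self.first_above = now + INTERVAL_MS
            return packet, False
        return packet, now >= self.first_above

    def dequeue(self, now):
        packet, ok_to_drop = self._pop(now)
        if self.dropping:
            if not ok_to_drop:
                self.dropping = False
            while self.dropping and now >= self.drop_next:
                self.count += 1
                self.dropped += 1            # discard `packet`, fetch the next
                packet, ok_to_drop = self._pop(now)
                if not ok_to_drop:
                    self.dropping = False
                else:
                    # drop more often the longer the delay stays high
                    self.drop_next += INTERVAL_MS / math.sqrt(self.count)
        elif ok_to_drop:
            self.dropped += 1                # discard `packet`, fetch the next
            packet, _ = self._pop(now)
            self.dropping = True
            self.count = 1
            self.drop_next = now + INTERVAL_MS / math.sqrt(self.count)
        return packet


# A burst of 50 packets arrives at once and drains at one packet per 10 ms.
q = CoDelSketch()
for i in range(50):
    q.enqueue(i, now=0)
delivered = [q.dequeue(now) for now in range(10, 300, 10)]
print("dropped:", q.dropped)
```

Note that nothing in the sketch depends on the link speed or the buffer size. The only thing it looks at is how long packets have been waiting, which is what makes the approach attractive as a knob-free candidate.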

These extraordinary people are, today, silently fixing tomorrow’s internet, and they deserve big props for that. All you technologists who didn’t get a chance to appreciate Nikola Tesla or Dennis Ritchie, here is your chance to do that for some of the real heroes of our time.

[1] http://gettys.wordpress.com/category/bufferbloat/
[2] http://queue.acm.org/detail.cfm?id=2209336
[3] http://lwn.net/Articles/496509/

Disclaimer: The descriptions of various technological aspects in this article are overly simplistic and may not be one hundred percent correct. Please add a note or comment if you find an inaccuracy.


4 responses to “On bufferbloat and shoulders of the giants”

  1. Nitpick: ACKs are sent on successful delivery, not failure. Though other protocols (most interestingly, PGM) have mechanisms for negative acknowledgement (typically called a NAK), TCP doesn’t.

    When the sender doesn’t receive an ACK within an agreed upon time frame (window), it will retransmit the original packet again. What follows is my understanding:

    The TCP window is (AFAIK) calculated on the fly to be the bandwidth of the smallest link * round trip delay / MTU, depending on how the algorithm actually works. These algorithms watch for drops/missed packets/etc. to see how big the window should actually be, and when to resend, etc.

    What bufferbloat is (so far as I understand) doing is throwing a huge cache in the middle of the link, which changes the results of the B*D product because it’s able to store extra data.

    So, for example, if you’re on a 10Mb/s internet hookup in London, and the server is 100ms away (e.g. Denver) on a 1Gb/s link, the slowest link sets B, so taking the 100ms as the round-trip delay, B*D = 10Mb/s * 0.1s = 1Mb = 125KB. Which means you can have up to 125KB of traffic (data + headers) en-route before you should expect to see your first ACK, and so that’s how TCP will (effectively) figure this stuff out.

    Now, if I throw a pair of 1MB buffers on either end of the 10Mb/s internet hookup, what happens? Well, I can now send 8x more data before I expect to see a drop… so as far as TCP is concerned I’ve actually got an 80Mbps connection.

    Except I actually only have a 10Mbps connection, and that “buffer-faked” 80Mbps is short-lived.

    If I don’t actually use the full 1MB of buffers, then my latency is all over the place, because the buffered frames at the end have to wait for the buffered frames at the start to dequeue, and so I’m fiddling with the “D” in that equation in realtime.

    If I *do* actually use the full 1MB of buffers, then not only is my latency all over the place, but my bandwidth is too: it looks like I’ve got 80Mbps of bandwidth until I fill the buffers, at which point I can only add new data to the buffers at the speed they can be emptied, 10Mbps.

    Which means lots of drops, and when TCP hits a drop, it falls back to “wait, let me figure this out again”/slow-start mode.

    But otherwise I’m looking forward to seeing legitimate AQM in place as well 🙂

  2. James, thanks a lot for such a technically detailed explanation and corrections!

  3. I’m happy to see you appreciating the good work these people are doing. BTW, AQM stands for “Active Queue Management”, not “Adaptive Queue Management”
