Beating Our Bloat
At my brother's wedding last week (way to go Steve!), one of his friends noted I had stopped blogging... Instead, I've been making serious progress on this New Year's resolution:
Collaborate more. Make a stronger effort to find people worth collaborating with. Use email more. Use usenet again. Push into the mainstream more patches - but logout at the end of the day - create some music.
Back in mid-November I first caught wind of bufferbloat from Jim Gettys' blog. At first I was merely intrigued...
I had seen the kinds of TCP traces jg was getting while I was in Nicaragua (working on the wisp6 greenfield wireless mesh network), and in several cybercafes and hotels. I'd assumed then it was merely the tin cans and string connecting Nica to the rest of the world. I'd seen traces like this, in particular, a lot:
I didn't understand the effects of bufferbloat on TCP until I saw his traces, and then his more detailed analysis with different tools.
This is a normal TCP trace:
This is a bufferbloated TCP trace:
It looks like an EKG on crack! (That's what Steve Lord called it, anyway.) I envision this picture on a milk carton, with the caption: "Have you seen this trace? Log in to bufferbloat.net to learn how to fix it..."
I got interested.
So I re-ran his experiments against the wisp6 router testbed. The results, under bad conditions (heavy rain), were horrifying: tens of seconds of delay in the routers (!@#@!#!). That explained why NTP, DNS, ND, DHCP, and most other traffic had stopped working under those conditions.
Still, even at this point (late December), I thought it was a local, device-specific problem. I did a little patch to those routers and fixed it (but good!) and went on my merry way, trying to cope with my other wisp6 problems: autoconfiguration, IPv4-in-IPv6 encapsulation, IPsec, MTU size...
Then I saw the Netalyzr data, and watched and listened to Jim's presentation about 3 times...
The diagonal lines are showing latencies - across paths that should be taking under 100ms to do anything - all over the world - measured in SECONDS.
I realized, finally, it wasn't just me and my devices and my little network in Nicaragua.
Bufferbloat was a global internet-wide problem, one probably growing worse, rapidly.
I got alarmed. If NTP, DNS, DHCP, ND, etc., start breaking, we're in a world of hurt, but if TCP/IP starts breaking down, really bad things will happen...
I emailed Jim Gettys on January 10th about the misunderstandings thus far in the press that I'd been trying to correct, and volunteered to donate a pair of servers that I had lying around, and maybe write an article about traffic shaping... He told me I was exactly correct in my own analysis...
I'd met him a couple of times, we'd worked on the same stuff, like handhelds.org, X11, and OLPC...
...and so I found myself instead hacking Ruby and Redmine, getting multiple servers running, using my rock and roll promotion skills to get people all over the world in disparate disciplines involved, hacking kernels, fiddling with AQMs and new algorithms, reading 70+ theoretical papers, writing multiple pieces and wiki pages, making deals, swapping services, picking up dropped balls, making a ton of phone calls and exhausting my personal email address book to get bufferbloat.net to be a real, functioning entity, with developers, theorists and users from all over the world, and not a talk shop.
And the rest is history in the making.
I still haven't got around to writing the piece about traffic shaping.
Basically, Bufferbloat (see FAQ) is a new name for an old problem (RFC 970) that has gradually been re-introduced over the last 10 years. It's especially bad in cable modems, 802.11n gear, and FIOS, but can also be seen in just about anything that has a wide dynamic range (GigE switches that do 100Mbit). It's bad, it's ugly, it's screwing up the Net, big time, and it's just a mistake that we (as engineers and network designers) have all been making for a long time...
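To put rough numbers on the "wide dynamic range" point, here is a back-of-the-envelope sketch. It assumes a simple FIFO buffer that is completely full; the buffer size (256 packets of 1500 bytes) and the function name are made up for illustration, not measurements from any real device.

```python
def drain_time_seconds(buffer_bytes: int, link_bits_per_second: float) -> float:
    """Time for a full FIFO buffer to empty at the given link rate."""
    return buffer_bytes * 8 / link_bits_per_second


# Hypothetical buffer: 256 full-size (1500 byte) packets.
BUFFER_BYTES = 256 * 1500

for name, rate in [("1 Gbit/s", 1e9), ("100 Mbit/s", 100e6), ("1 Mbit/s", 1e6)]:
    delay_ms = drain_time_seconds(BUFFER_BYTES, rate) * 1000
    print(f"{name:>10}: {delay_ms:8.2f} ms of queueing delay when full")
```

The same buffer that adds about 3 ms at gigabit speed adds about 30 ms at 100Mbit, and over 3 seconds at 1Mbit. A buffer sized for the fastest rate a device supports turns into seconds of latency when the link actually runs slowly.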
Head. Desk. Head. Desk. Head. Desk.
The Bufferbloat problem is almost as bad as Y2K... And more solvable. It's just that the Internet is so much bigger now than in 1999 that it is intimidating. More cell phones are being added to the Internet every quarter than we had total users in 1999. There's also a persistent fear that it will get much worse before it gets better.
So we've been lining up people to fix it ever since.
While doing all that, along the way, I came up with a good idea for a cosmic background bufferbloat detector that was extensively discussed on usenet and the bufferbloat mailing list. Nobody found any holes in the concept, which means (darn it) I'm going to have to code it up - or convince someone else to do so.
Good stuff keeps happening... There are nearly 200 members of the bloat mailing list now. John Linville just released a debloat-testing kernel containing not only a new algorithm (eBDP) for wireless, but two new AQMs and some driver patches. Doc Searls graciously loaned me his column for an editorial in Linux Journal's upcoming June issue... Vint Cerf loaned Jim Gettys his column for IEEE computer (due out in a few days), multiple other writers have chipped in... Theorists, coders, cats and dogs, all talking to one another on the mailing lists.
About the only flaw in all this activity of mine is that I've been so buried by it all as to stop blogging!! The effort required to write something for a more general audience is so much greater than carrying out conversations with the people I'm collaborating with presently on email and irc that I've stopped journaling entirely. I'm trying to fix that today, a little.
I've learnt that while journaling/blogging is important, even necessary, to the writer and his/her creative process, writing down the history behind the writing matters to no one else. (I'm journaling today so that I can remember the timelines here.)
Also, cutting the history from the finished work helps a lot. I just learned this trick from esr, who has also taken time out on irc to teach me more about writing in the last 2 months than I've learned in 10 years of blogging. (It also took 5 other polished writers - Evan Hunt, Bill Weinberg, Richard Pitt, & Jim Gettys - to tell me in no uncertain terms that I was doing some things wrong for it to register. I've undergone a writerly "intervention". It was painful, but I'll survive.)
I wish now that I'd opened up my writing to a writers' cabal 25 years ago, or earlier. I might have got a few books done by now.
Tomorrow (Wednesday) I'm in an open-to-all VOIP conference call about bufferbloat with the freeswitch folk. Please join the call to hear more. Or check out bufferbloat.net.
When I gave up on SIP-based VOIP (after working on it for 6 years) and gave my last presentation on it, in 2006, at Astricon, I had no idea that a goodly portion of the problems I'd had with SIP were tied to bufferbloat. No idea whatsoever.
Solutions seem feasible, across the Internet, for a whole new level of interactive applications after we get bufferbloat fixed. SIP phones now do IPv6, which solves a lot of problems, too. I'm seriously encouraged.
Sometimes it takes giving up on something, utterly, in order to make progress. It's been a zen 2011 that way. And actually resolving to keep your New Year's resolutions works, too.
All this said, I'm going to take a break from all this soon and write a bit about listening to, and making, great music, and about an old, cherished concept of mine (and Jeff Stram's) called the jam-o-phone.