Postcards from the Bleeding Edge
Monday, October 05, 2009

  Some news from the life and death and life of Pocobelle, the 300mw mail router



The Pocobelle project, day 10



Pocobelle has been acting as a backup email router for a few days now.

I get several hundred emails per day. It used to be thousands, but I switched to using netnews & gmane for my more high-traffic mailing lists, the lists that I mostly read and don't write to, such as lkml. Pocobelle successfully coped with the email I get in the dribs and drabs I get it in - I didn't have any complaints. It was transparent to me, and everybody. It did STARTTLS crypto without cracking 10% of cpu. My tests included sending a few dozen mails from a server outside my network, but that was it. Most of the time mail goes right to my laptop...

Today was the perfect day to try pocobelle out in a real world scenario.

I was without power or internet for 8 hours. My generator didn't work. (Most likely, I'm out of gas). My ice cream melted. Oh, well. I needed to defrost the chicken anyway.

The UPS that pocobelle was running on showed 95% of it's battery available after that period. I was really happy about that. Assuming I have a week when the internet stays up and power doesn't I should be able to have my email delivered without a problem, and periodically fire up the laptop to read it.

At 7PM, power and internet came back on simultaneously. I had previously turned off email to my laptop (and turned the laptop off, to save power). I booted up the laptop just to watch pocobelle do it's stuff...

Pocobelle got on the Net... Got an ipv6 address... And started getting the backlog of mail...

The Bind9 DNS server rapidly got to 23MB in size... The cpu went to 100%... 93% of it, gone, waiting for disk access.. the Loadavg lept past 5... available memory dropped to zero... My ssh session locked up...

Midway through the 15th email it bounced 3 messages, then it died.

Pocobelle ran completely out of memory and came to a screeching halt.

Sigh. The perils of engineering.

I hadn't thought deeply about the interaction between DNS services and email. Freshly booted, there is no DNS cache on the system.

I hadn't thought about a complete and utter cold start of absolutely everything pocobelle was connected to. There were no DNS caches anywhere it talked to that weren't "cold".

I'd got into a pathological situation, where the bandwidth being chewed up by all the mail being sent, and the time it took to "walk" DNS to verify it as "good" mail, competed and combined to bounce mails it couldn't do a reverse lookup on.

At the same time, the load on the system was such as to put it on the moon in short order.

As crashes go, it was not pretty. It brought back a flashback from 1995 where a CEO I knew, ecstatic with his new 26MB powerpoint presentation, emailed it to everyone in the company, and everyone he knew in the world, besides. That was an age, also, when a *good* mail server only had 64MB of ram....

1) Pocobelle only has 64MB of memory. (Pocobelle 2 will have 256MB or more) An easy "cure" for the memory problem was to enable swapping. When running without swap a Linux system will free up memory by discarding unused (read-only) program text pages, which are read-only, and swapping them in from the filesystem when they get used.

While there are a lot of binary pages you can do this to, it doesn't work on pages that have been modified by the linker, and it (especially) doesn't work on interpreted languages like python and perl. These languages often do have plenty of little-used pages, but they are *data* and can't get discarded because some day they MIGHT be modified further.

This arm build does not appear to have Jakub Jelinek's prelink utility installed, which will free up more memory by prelinking the various binaries. Prelink solved a few problems, but in the arm world, most people (I'm not) use a libc that wasn't compatable with prelink. I'm still researching this...

So, anyway, there are plenty of things that can't get swapped out that could, if swap was enabled. So I added 128MB of swap on the flash. Linux doesn't require that you have swap on a raw partition (although it is a good idea), so I just did a:

dd if=/dev/zero bs=1024k of=/etc/swap count=128
mkswap /etc/swap
swapon /etc/swap # and add to /etc/fstab

An even better cure for this would be to use a box with more memory but that's a problem reserved for pocobelle 2.

With swapping enabled, pocobelle grew decidedly less "chunky" in the general case. There is always a lot more free memory available for general use - for example, bind9 dropped from 23MB of ram down to 14MB. In normal use, 16MB is living on swap by default.

Whenever I get around to reformatting this USB key, I will put swap on a raw partition. I might put it on the built-in flash on the board, actually. We'll see.

2) Pocobelle was configured to use one DNS server - it's own - and forward to several local servers attached to my wireless network, provided by my provider. While this is a decent config... One that a normal client would use... given that all of the servers it was connecting to were ALSO freshly booted and ALSO had to walk DNS there, they ALL failed within the default DNS timeouts.

What I decided to do was establish a robust set of DNS servers (5), having pocobelle talk to itself twice - once in the beginning of the loop and another time at the end. In the middle it talks to my main mail server, which having already done the anti-spam protection in the first place should have a cached record of the remote server's origin already.

It should effectively put a 10 second timeout on the DNS lookup instead of a 2 second one, AND get to at least one server that has a good, primed, cache; a server in the US that's impervious to power failures.

(Getting this to work was a little tricky in that I'm using bind views internally to give me a consistent picture of my network and routing configuration(s), but I'm not going to go into that here)

I hope this is sufficiently robust. I'm not going to purposely instigate another 8 hour delay on my email, at least, not in the near future.

Another answer to this is to cache more of the internet's DNS service at the start, before accepting mail. (My mail is mostly not random, but comes from a limited number of mailing list servers). I have a buddy that used to cache the entire DNS root zones back in 1995. Maybe that's still possible.

It would be good to have some sort of cache log that I could replay on name service startup (or at certain times of the day, for example, shortly before I wake up in the morning) to prime the cache(s). I can sort of do this by replaying the mail logs through the DNS system, but it would be cleaner if I could figure out a way to get my top 100 sites out of bind periodically.

3) Given that write speeds to the flash are so slow it would be best to always keep at least 512K reserved for disk buffers. Smaller writes than 128k at a time are *bad* with flash.

I used to know how to do that, but the interface to the Linux swapper has changed so much that I have to google to figure it out. (Most of this blog was written after the power failed again)

4) Given that one of pocobelle's purposes is to be a mail router, and it lives on ipv6 which has little to no spam on it, it's somewhat pointless having even the minimal anti-spam services I have on it (like those reverse lookups that caused the bounces in the first place)

I'm not going to do that, I actually want to make this into a system capable of the best anti-spam measures I can come up with because spam is just never going to go away.

...

So, after fixing 1 and 2, I fired up rss2email on a new user on pocobelle. Rss2email is written in python. It took 12 seconds to start, and 16MB of ram, and was really going slow, so I decided that I didn't need to do that on pocobelle itself, but on my smart host elsewhere. Pocobelle just needs the mail itself, not the process that generates it.

Result: I got 25 messages as fast as they could be delivered.

It ran happily with 16MB of ram out on swap.

I'm happy with pocobelle today. I'm going to turn off my laptop tonight and see what happens.

AM Update: I turned the laptop on again, and got about 60 emails sent in rapid succession. The night before I'd double the default number of connections to 12 in a burst of optimism.... Pocobelle handled the load, but I think I'm going to limit the number of inbound and outbound connections to 4. At 12, it ran at 93% of cpu and got down to very little memory during it's burst of email. Pocobelle needs to remain responsive to DNS, in particular, as it's the main DNS server for the household, and has quite a few other things to do besides email.

Now, I'm running full starttls (encrypted) email inside of my household, which probably accounts for some of the cpu usage, but I think the overhead was of startup and running all those processes, not the crypto.

Maybe I'll try rate-limiting the number of inbound connections via iptables, tarpitting them maybe, to keep the mail server on the other side happy once pocobelle it gets past 3, keeping it from rescheduling the mail repeatedly. That will ensure a burst of email actually gets sent, albeit slowly. (this is also a good anti-spam measure)

On to figuring out 3 and 4...

Labels: , , , , , , ,

 
Comments:
In the end of days, I completely disagree that 'it's easy to run your own mail server'.

since I've been running mail servers lo these many years, I feel qualified to speak to this issue.

Folks who run their own mail servers, are as often as not, the problem, not the solution. This is another one of those places were folks are better off leaving the service to those who do it for a living.

I concur with the sentiment. But in this very real world, please look to the lessons from the tragedy of the commons. He with the most broken moral compass wins.
 
CPM: As I have been very dependent on you for a good anti-spam setup, I should probably back off on the statement that "it is easy to setup an email server".

It really is easy to setup an email server, when compared to configuring bind and DNS.

Setting up a good email server is much harder, and currently is the province of experts. I would like that to change, as outsourcing my email to the Ministry of Information really bugs me.

Are you saying my moral compass is broken?
 
oh, yea, and the bounce messages bounced, too, because they came from an invalid user that wasn't permitted to send mail.(I kind of like that, it makes me feel freer to mess up. It's just my email, not the worlds)
 
Post a Comment

Links to this post:

Create a Link



<< Home
David Täht writes about politics, space, copyright, the internet, audio software, operating systems and surfing.


Resume,Songs,
My new blog, NeX-6, My facebook page
Orgs I like
The EFF - keeping free speech in the world
Musical stuff I like
Jeff, Rick, Ardour, Jack
Prior Rants - ipv6 and smart(er) mail relaying in postfix Squid 3.1 (with ipv6 support) lands in debian Building a virtual mesh potato, part 1 Accidentally cross building Linux 2.6.31 Beach calm #1 Designing an architecture for 2010 and beyond, for... Going retro, re-adopting Emacs! This is for the sysadmins Living in the eternal now Hell in Honduras
Best of the blog:
Uncle Bill's Helicopter - A speech I gave to ITT Tech - Chicken soup for engineers
Beating the Brand - A pathological exploration of how branding makes it hard to think straight
Inside the Internet Mind - trying to map the weather within the global supercomputer that consists of humans and google
Sex In Politics - If politicians spent more time pounding the flesh rather than pressing it, it would be a better world
Getting resources from space - An alternative to blowing money on mars using NEAs.
On the Columbia - Why I care about space
Authors I like:
Doc Searls
Where's Cherie?
UrbanAgora
Jerry Pournelle
The Cubic Dog
Evan Hunt
The Bay Area is talking
Brizzled
Zimnoiac Emanations
Eric Raymond
Unlocking The Air
Bob Mage
BroadBand & Me
SpaceCraft
Selenian Boondocks
My Pencil
Transterrestial Musings
Bear Waller Hollar
Callahans
Pajamas Media BlogRoll Member

If you really want to, you can poke through the below links as well.

ARCHIVES
06/09/2002 - 06/16/2002 / 07/28/2002 - 08/04/2002 / 08/11/2002 - 08/18/2002 / 08/18/2002 - 08/25/2002 / 08/25/2002 - 09/01/2002 / 09/22/2002 - 09/29/2002 / 11/10/2002 - 11/17/2002 / 12/15/2002 - 12/22/2002 / 12/22/2002 - 12/29/2002 / 12/29/2002 - 01/05/2003 / 01/05/2003 - 01/12/2003 / 01/19/2003 - 01/26/2003 / 01/26/2003 - 02/02/2003 / 02/09/2003 - 02/16/2003 / 02/16/2003 - 02/23/2003 / 03/02/2003 - 03/09/2003 / 03/16/2003 - 03/23/2003 / 04/06/2003 - 04/13/2003 / 04/13/2003 - 04/20/2003 / 04/20/2003 - 04/27/2003 / 05/04/2003 - 05/11/2003 / 05/18/2003 - 05/25/2003 / 05/25/2003 - 06/01/2003 / 06/01/2003 - 06/08/2003 / 06/08/2003 - 06/15/2003 / 06/15/2003 - 06/22/2003 / 06/22/2003 - 06/29/2003 / 06/29/2003 - 07/06/2003 / 07/20/2003 - 07/27/2003 / 07/27/2003 - 08/03/2003 / 08/03/2003 - 08/10/2003 / 08/10/2003 - 08/17/2003 / 08/17/2003 - 08/24/2003 / 08/24/2003 - 08/31/2003 / 08/31/2003 - 09/07/2003 / 09/07/2003 - 09/14/2003 / 09/14/2003 - 09/21/2003 / 09/21/2003 - 09/28/2003 / 09/28/2003 - 10/05/2003 / 10/05/2003 - 10/12/2003 / 10/12/2003 - 10/19/2003 / 10/19/2003 - 10/26/2003 / 10/26/2003 - 11/02/2003 / 11/02/2003 - 11/09/2003 / 11/09/2003 - 11/16/2003 / 11/30/2003 - 12/07/2003 / 12/07/2003 - 12/14/2003 / 12/14/2003 - 12/21/2003 / 12/28/2003 - 01/04/2004 / 01/11/2004 - 01/18/2004 / 01/18/2004 - 01/25/2004 / 01/25/2004 - 02/01/2004 / 02/01/2004 - 02/08/2004 / 02/08/2004 - 02/15/2004 / 02/15/2004 - 02/22/2004 / 02/22/2004 - 02/29/2004 / 02/29/2004 - 03/07/2004 / 03/14/2004 - 03/21/2004 / 03/21/2004 - 03/28/2004 / 03/28/2004 - 04/04/2004 / 04/04/2004 - 04/11/2004 / 04/11/2004 - 04/18/2004 / 04/18/2004 - 04/25/2004 / 04/25/2004 - 05/02/2004 / 05/02/2004 - 05/09/2004 / 05/09/2004 - 05/16/2004 / 05/16/2004 - 05/23/2004 / 05/30/2004 - 06/06/2004 / 06/06/2004 - 06/13/2004 / 06/13/2004 - 06/20/2004 / 06/20/2004 - 06/27/2004 / 06/27/2004 - 07/04/2004 / 07/04/2004 - 07/11/2004 / 07/11/2004 - 07/18/2004 / 07/18/2004 - 07/25/2004 / 08/08/2004 - 08/15/2004 / 08/22/2004 - 08/29/2004 / 08/29/2004 - 09/05/2004 / 09/05/2004 - 09/12/2004 / 09/19/2004 - 09/26/2004 / 09/26/2004 - 10/03/2004 / 10/03/2004 - 10/10/2004 / 10/10/2004 - 10/17/2004 / 10/17/2004 - 10/24/2004 / 10/24/2004 - 10/31/2004 / 10/31/2004 - 11/07/2004 / 11/07/2004 - 11/14/2004 / 11/14/2004 - 11/21/2004 / 11/21/2004 - 11/28/2004 / 11/28/2004 - 12/05/2004 / 12/05/2004 - 12/12/2004 / 12/12/2004 - 12/19/2004 / 12/19/2004 - 12/26/2004 / 12/26/2004 - 01/02/2005 / 01/02/2005 - 01/09/2005 / 01/16/2005 - 01/23/2005 / 01/23/2005 - 01/30/2005 / 01/30/2005 - 02/06/2005 / 02/06/2005 - 02/13/2005 / 02/13/2005 - 02/20/2005 / 02/20/2005 - 02/27/2005 / 02/27/2005 - 03/06/2005 / 03/06/2005 - 03/13/2005 / 03/27/2005 - 04/03/2005 / 04/03/2005 - 04/10/2005 / 04/10/2005 - 04/17/2005 / 05/29/2005 - 06/05/2005 / 06/05/2005 - 06/12/2005 / 06/12/2005 - 06/19/2005 / 06/19/2005 - 06/26/2005 / 06/26/2005 - 07/03/2005 / 07/03/2005 - 07/10/2005 / 07/10/2005 - 07/17/2005 / 07/24/2005 - 07/31/2005 / 07/31/2005 - 08/07/2005 / 08/07/2005 - 08/14/2005 / 08/14/2005 - 08/21/2005 / 08/21/2005 - 08/28/2005 / 08/28/2005 - 09/04/2005 / 09/04/2005 - 09/11/2005 / 09/11/2005 - 09/18/2005 / 09/18/2005 - 09/25/2005 / 09/25/2005 - 10/02/2005 / 10/02/2005 - 10/09/2005 / 10/09/2005 - 10/16/2005 / 10/16/2005 - 10/23/2005 / 10/23/2005 - 10/30/2005 / 10/30/2005 - 11/06/2005 / 11/06/2005 - 11/13/2005 / 11/13/2005 - 11/20/2005 / 11/20/2005 - 11/27/2005 / 11/27/2005 - 12/04/2005 / 12/04/2005 - 12/11/2005 / 12/11/2005 - 12/18/2005 / 12/18/2005 - 12/25/2005 / 01/01/2006 - 01/08/2006 / 01/08/2006 - 01/15/2006 / 01/15/2006 - 01/22/2006 / 01/22/2006 - 01/29/2006 / 01/29/2006 - 02/05/2006 / 02/19/2006 - 02/26/2006 / 03/05/2006 - 03/12/2006 / 03/19/2006 - 03/26/2006 / 03/26/2006 - 04/02/2006 / 04/02/2006 - 04/09/2006 / 04/09/2006 - 04/16/2006 / 04/23/2006 - 04/30/2006 / 05/07/2006 - 05/14/2006 / 05/14/2006 - 05/21/2006 / 05/21/2006 - 05/28/2006 / 06/04/2006 - 06/11/2006 / 06/11/2006 - 06/18/2006 / 06/18/2006 - 06/25/2006 / 06/25/2006 - 07/02/2006 / 07/02/2006 - 07/09/2006 / 07/09/2006 - 07/16/2006 / 07/23/2006 - 07/30/2006 / 08/06/2006 - 08/13/2006 / 08/13/2006 - 08/20/2006 / 09/03/2006 - 09/10/2006 / 09/17/2006 - 09/24/2006 / 09/24/2006 - 10/01/2006 / 10/01/2006 - 10/08/2006 / 10/22/2006 - 10/29/2006 / 11/19/2006 - 11/26/2006 / 11/26/2006 - 12/03/2006 / 12/03/2006 - 12/10/2006 / 12/10/2006 - 12/17/2006 / 12/17/2006 - 12/24/2006 / 12/24/2006 - 12/31/2006 / 01/07/2007 - 01/14/2007 / 01/14/2007 - 01/21/2007 / 01/28/2007 - 02/04/2007 / 02/11/2007 - 02/18/2007 / 02/18/2007 - 02/25/2007 / 02/25/2007 - 03/04/2007 / 03/04/2007 - 03/11/2007 / 03/18/2007 - 03/25/2007 / 04/01/2007 - 04/08/2007 / 04/08/2007 - 04/15/2007 / 04/15/2007 - 04/22/2007 / 04/22/2007 - 04/29/2007 / 04/29/2007 - 05/06/2007 / 05/06/2007 - 05/13/2007 / 05/20/2007 - 05/27/2007 / 05/27/2007 - 06/03/2007 / 06/03/2007 - 06/10/2007 / 06/10/2007 - 06/17/2007 / 06/17/2007 - 06/24/2007 / 07/01/2007 - 07/08/2007 / 07/08/2007 - 07/15/2007 / 07/22/2007 - 07/29/2007 / 07/29/2007 - 08/05/2007 / 08/05/2007 - 08/12/2007 / 08/26/2007 - 09/02/2007 / 09/09/2007 - 09/16/2007 / 09/23/2007 - 09/30/2007 / 09/30/2007 - 10/07/2007 / 10/07/2007 - 10/14/2007 / 10/14/2007 - 10/21/2007 / 10/21/2007 - 10/28/2007 / 10/28/2007 - 11/04/2007 / 11/04/2007 - 11/11/2007 / 11/11/2007 - 11/18/2007 / 11/18/2007 - 11/25/2007 / 11/25/2007 - 12/02/2007 / 12/02/2007 - 12/09/2007 / 12/09/2007 - 12/16/2007 / 12/16/2007 - 12/23/2007 / 12/23/2007 - 12/30/2007 / 01/06/2008 - 01/13/2008 / 02/03/2008 - 02/10/2008 / 02/17/2008 - 02/24/2008 / 02/24/2008 - 03/02/2008 / 03/02/2008 - 03/09/2008 / 03/09/2008 - 03/16/2008 / 03/16/2008 - 03/23/2008 / 03/23/2008 - 03/30/2008 / 03/30/2008 - 04/06/2008 / 04/20/2008 - 04/27/2008 / 04/27/2008 - 05/04/2008 / 05/04/2008 - 05/11/2008 / 05/11/2008 - 05/18/2008 / 05/18/2008 - 05/25/2008 / 05/25/2008 - 06/01/2008 / 06/01/2008 - 06/08/2008 / 06/08/2008 - 06/15/2008 / 06/15/2008 - 06/22/2008 / 06/22/2008 - 06/29/2008 / 07/06/2008 - 07/13/2008 / 07/13/2008 - 07/20/2008 / 07/20/2008 - 07/27/2008 / 07/27/2008 - 08/03/2008 / 08/03/2008 - 08/10/2008 / 08/10/2008 - 08/17/2008 / 08/17/2008 - 08/24/2008 / 08/31/2008 - 09/07/2008 / 09/07/2008 - 09/14/2008 / 09/14/2008 - 09/21/2008 / 09/21/2008 - 09/28/2008 / 09/28/2008 - 10/05/2008 / 10/05/2008 - 10/12/2008 / 10/12/2008 - 10/19/2008 / 10/19/2008 - 10/26/2008 / 10/26/2008 - 11/02/2008 / 11/02/2008 - 11/09/2008 / 11/09/2008 - 11/16/2008 / 11/16/2008 - 11/23/2008 / 12/07/2008 - 12/14/2008 / 12/21/2008 - 12/28/2008 / 12/28/2008 - 01/04/2009 / 01/18/2009 - 01/25/2009 / 01/25/2009 - 02/01/2009 / 03/22/2009 - 03/29/2009 / 05/10/2009 - 05/17/2009 / 05/17/2009 - 05/24/2009 / 05/31/2009 - 06/07/2009 / 06/14/2009 - 06/21/2009 / 06/21/2009 - 06/28/2009 / 06/28/2009 - 07/05/2009 / 07/05/2009 - 07/12/2009 / 07/12/2009 - 07/19/2009 / 07/26/2009 - 08/02/2009 / 08/09/2009 - 08/16/2009 / 08/23/2009 - 08/30/2009 / 09/06/2009 - 09/13/2009 / 09/20/2009 - 09/27/2009 / 09/27/2009 - 10/04/2009 / 10/04/2009 - 10/11/2009 / 10/11/2009 - 10/18/2009 / 10/18/2009 - 10/25/2009 / 10/25/2009 - 11/01/2009 / 11/29/2009 - 12/06/2009 / 12/27/2009 - 01/03/2010 / 01/31/2010 - 02/07/2010 / 02/07/2010 - 02/14/2010 / 02/28/2010 - 03/07/2010 / 03/07/2010 - 03/14/2010 / 03/28/2010 - 04/04/2010 / 04/18/2010 - 04/25/2010 / 05/16/2010 - 05/23/2010 / 05/30/2010 - 06/06/2010 / 06/13/2010 - 06/20/2010 / 06/20/2010 - 06/27/2010 / 07/04/2010 - 07/11/2010 / 07/11/2010 - 07/18/2010 / 07/18/2010 - 07/25/2010 / 08/08/2010 - 08/15/2010 / 08/29/2010 - 09/05/2010 / 09/05/2010 - 09/12/2010 / 09/19/2010 - 09/26/2010 / 09/26/2010 - 10/03/2010 / 10/10/2010 - 10/17/2010 / 10/17/2010 - 10/24/2010 / 10/31/2010 - 11/07/2010 / 11/28/2010 - 12/05/2010 / 12/05/2010 - 12/12/2010 / 12/12/2010 - 12/19/2010 / 12/26/2010 - 01/02/2011 / 03/06/2011 - 03/13/2011 / 03/13/2011 - 03/20/2011 / 05/22/2011 - 05/29/2011 / 08/07/2011 - 08/14/2011 / 08/14/2011 - 08/21/2011 / 09/18/2011 - 09/25/2011 / 10/02/2011 - 10/09/2011 / 10/09/2011 - 10/16/2011 / 11/06/2011 - 11/13/2011 / 01/15/2012 - 01/22/2012 / 04/22/2012 - 04/29/2012 / 06/24/2012 - 07/01/2012 / 08/05/2012 - 08/12/2012 / 08/11/2013 - 08/18/2013 / 03/01/2015 - 03/08/2015 / 10/04/2015 - 10/11/2015 / 11/08/2015 - 11/15/2015 /


Powered by Blogger