Some news from the life and death and life of Pocobelle, the 300mw mail router
Briefly: what Pocobelle is about - is trying to create the internet I thought (in 1990) we'd have in 1996. Back then, I thought the internet would be a network of peers - not clients and servers - but peers. What we call P2P networking is a bitter joke compared to what could have been, what could have existed on the edge of the network, in every household, in every business, in the USA and in Timbuktu.
I expected email, netnews, DNS, radio services, web services, etc - to all exist - inside your home and not out in the "cloud". I expected to be playing concerts via the jamophone with my neighbor down the street and games with a buddy across town - with sub 1 ms latency...
Yes, the world changed. Mostly everyone became cyberserfs.
I didn't. My dream didn't die, it was just resting. Netnews isn't dead, it's still there, lots of people use it. It's easy to run your own email server, less easy to run your own DNS... Radio... Well, one day...
It IS impossible to be a true peer on the internet with IPv4 without a lot of expense. I'm doing it (mostly successfully) on the cheap with ipv6. And... I'm trying to do it, with a 140 dollar arm box with 64MB of memory that's 100% solid state and eats 300mw. That's milliwatts. Less than 1/3 of a watt.
The machine's called Pocobelle.sjds.teklibre.org.
I'm not going to bother posting much about it on my blog here, if you want to see what's going on with Pocobelle, get ipv6 working, and see for yourself.
This is an excerpt, from a lot of writing in progress:
Pocobelle has been acting as a backup email router for a few days now.
I get several hundred emails per day. It used to be thousands, but I switched to using netnews & gmane for my more high-traffic mailing lists, the lists that I mostly read and don't write to, such as lkml. Pocobelle successfully coped with the email I get in the dribs and drabs I get it in - I didn't have any complaints. It was transparent to me, and everybody. It did STARTTLS crypto without cracking 10% of cpu. My tests included sending a few dozen mails from a server outside my network, but that was it. Most of the time mail goes right to my laptop...
Today was the perfect day to try pocobelle out in a real world scenario.
I was without power or internet for 8 hours. My generator didn't work. (Most likely, I'm out of gas). My ice cream melted. Oh, well. I needed to defrost the chicken anyway.
The UPS that pocobelle was running on showed 95% of it's battery available after that period. I was really happy about that. Assuming I have a week when the internet stays up and power doesn't I should be able to have my email delivered without a problem, and periodically fire up the laptop to read it.
At 7PM, power and internet came back on simultaneously. I had previously turned off email to my laptop (and turned the laptop off, to save power). I booted up the laptop just to watch pocobelle do it's stuff...
Pocobelle got on the Net... Got an ipv6 address... And started getting the backlog of mail...
The Bind9 DNS server rapidly got to 23MB in size... The cpu went to 100%... 93% of it, gone, waiting for disk access.. the Loadavg lept past 5... available memory dropped to zero... My ssh session locked up...
Midway through the 15th email it bounced 3 messages, then it died.
Pocobelle ran completely out of memory and came to a screeching halt.
Sigh. The perils of engineering.
I hadn't thought deeply about the interaction between DNS services and email. Freshly booted, there is no DNS cache on the system.
I hadn't thought about a complete and utter cold start of absolutely everything pocobelle was connected to. There were no DNS caches anywhere it talked to that weren't "cold".
I'd got into a pathological situation, where the bandwidth being chewed up by all the mail being sent, and the time it took to "walk" DNS to verify it as "good" mail, competed and combined to bounce mails it couldn't do a reverse lookup on.
At the same time, the load on the system was such as to put it on the moon in short order.
As crashes go, it was not pretty. It brought back a flashback from 1995 where a CEO I knew, ecstatic with his new 26MB powerpoint presentation, emailed it to everyone in the company, and everyone he knew in the world, besides. That was an age, also, when a *good* mail server only had 64MB of ram....
1) Pocobelle only has 64MB of memory. (Pocobelle 2 will have 256MB or more) An easy "cure" for the memory problem was to enable swapping. When running without swap a Linux system will free up memory by discarding unused (read-only) program text pages, which are read-only, and swapping them in from the filesystem when they get used.
While there are a lot of binary pages you can do this to, it doesn't work on pages that have been modified by the linker, and it (especially) doesn't work on interpreted languages like python and perl. These languages often do have plenty of little-used pages, but they are *data* and can't get discarded because some day they MIGHT be modified further.
This arm build does not appear to have
Jakub Jelinek's prelink utility installed, which will free up more memory by prelinking the various binaries. Prelink solved a few problems, but in the arm world, most people (I'm not) use a libc that wasn't compatable with prelink. I'm still researching this...
So, anyway, there are plenty of things that can't get swapped out that could, if swap was enabled. So I added 128MB of swap on the flash. Linux doesn't require that you have swap on a raw partition (although it is a good idea), so I just did a:
dd if=/dev/zero bs=1024k of=/etc/swap count=128
mkswap /etc/swap
swapon /etc/swap # and add to /etc/fstab
An even better cure for this would be to use a box with more memory but that's a problem reserved for pocobelle 2.
With swapping enabled, pocobelle grew decidedly less "chunky" in the general case. There is always a lot more free memory available for general use - for example, bind9 dropped from 23MB of ram down to 14MB. In normal use, 16MB is living on swap by default.
Whenever I get around to reformatting this USB key, I will put swap on a raw partition. I might put it on the built-in flash on the board, actually. We'll see.
2) Pocobelle was configured to use one DNS server - it's own - and forward to several local servers attached to my wireless network, provided by my provider. While this is a decent config... One that a normal
client would use... given that all of the servers it was connecting to were ALSO freshly booted and ALSO had to walk DNS there, they ALL failed within the default DNS timeouts.
What I decided to do was establish a robust set of DNS servers (5), having pocobelle talk to itself twice - once in the beginning of the loop and another time at the end. In the middle it talks to my main mail server, which having already done the anti-spam protection in the first place should have a cached record of the remote server's origin already.
It should effectively put a 10 second timeout on the DNS lookup instead of a 2 second one, AND get to at least one server that has a good, primed, cache; a server in the US that's impervious to power failures.
(Getting this to work was a little tricky in that I'm using bind views internally to give me a consistent picture of my network and routing configuration(s), but I'm not going to go into that here)
I hope this is sufficiently robust. I'm not going to purposely instigate another 8 hour delay on my email, at least, not in the near future.
Another answer to this is to cache more of the internet's DNS service at the start, before accepting mail. (My mail is mostly not random, but comes from a limited number of mailing list servers). I have a buddy that used to cache the entire DNS root zones back in 1995. Maybe that's still possible.
It would be good to have some sort of cache log that I could replay on name service startup (or at certain times of the day, for example, shortly before I wake up in the morning) to prime the cache(s). I can sort of do this by replaying the mail logs through the DNS system, but it would be cleaner if I could figure out a way to get my top 100 sites out of bind periodically.
3) Given that write speeds to the flash are so slow it would be best to always keep at least 512K reserved for disk buffers. Smaller writes than 128k at a time are *bad* with flash.
I used to know how to do that, but the interface to the Linux swapper has changed so much that I have to google to figure it out. (Most of this blog was written after the power failed again)
4) Given that one of pocobelle's purposes is to be a mail router, and it lives on ipv6 which has little to no spam on it, it's somewhat pointless having even the minimal anti-spam services I have on it (like those reverse lookups that caused the bounces in the first place)
I'm not going to do that, I actually want to make this into a system capable of the best anti-spam measures I can come up with because spam is just never going to go away.
...
So, after fixing 1 and 2, I fired up rss2email on a new user on pocobelle. Rss2email is written in python. It took 12 seconds to start, and 16MB of ram, and was really going slow, so I decided that I didn't need to do that on pocobelle itself, but on my smart host elsewhere. Pocobelle just needs the mail itself, not the process that generates it.
Result: I got 25 messages as fast as they could be delivered.
It ran happily with 16MB of ram out on swap.
I'm happy with pocobelle today. I'm going to turn off my laptop tonight and see what happens.
AM Update: I turned the laptop on again, and got about 60 emails sent in rapid succession. The night before I'd double the default number of connections to 12 in a burst of optimism.... Pocobelle handled the load, but I think I'm going to limit the number of inbound and outbound connections to 4. At 12, it ran at 93% of cpu and got down to very little memory during it's burst of email. Pocobelle needs to remain responsive to DNS, in particular, as it's the main DNS server for the household, and has quite a few other things to do besides email.
Now, I'm running full starttls (encrypted) email inside of my household, which probably accounts for some of the cpu usage, but I think the overhead was of startup and running all those processes, not the crypto.
Maybe I'll try rate-limiting the number of inbound connections via iptables, tarpitting them maybe, to keep the mail server on the other side happy once pocobelle it gets past 3, keeping it from rescheduling the mail repeatedly. That will ensure a burst of email actually gets sent, albeit slowly. (this is also a good anti-spam measure)
On to figuring out 3 and 4...
Labels: arm debian, dns, email, embedded, internet, ipv6, multicast, pocobelle