Postcards from the Bleeding Edge
Tuesday, March 02, 2010

  Zoneminder/Facebook gateway and other random project notes

I woke up with a really good idea this morning that I need to write down, but first, an aside.

I am very happy that facebook supports jabber now (XMPP). That makes it the LAST of the chat protocols that I have to wedge into one multi-chat client (pidgin or emacs' erc).

( That is, if I ever get bitlbee 1.2.4 + otr + bitlbee skype plugin + znc working again. My last attempt at getting otr patched into bitlbee failed big time. )

I really hate how "chat" - merely talking to another person over the internet - has fragmented into 14+ major protocols including irc - which has itself fragmented into 260+ irc networks. Perhaps it's an example of the human condition that we form tribes in this way, that the tower of babel lives on... (has anyone rigged a translation bot into chat with google translate yet?)

Anyway, that's the rant part of this morning, on to the interesting stuff.

One major subproject of mine has been trying to come up with a working security and sensor system for my home. I've been calling the hardware side the "pocobelle" project, and part of the software side is built around zoneminder, which is a motion detection/video capture/alert system.

A month or so ago, I got zoneminder to automagically push alerts ("Motion detected at the front door, see http://whatever/whatever for details") into jabber. It was just a couple libraries and a few dozen lines of perl to do this. (cool)

To make it faster and more reliable for local alerting, I installed my own jabber server on the pocobelle box... and also tried to register a zoneminder specific account on the main jabber.org server.

Unfortunately, jabber.org hasn't been taking new signups, which frustrated me because I'm trying NOT to use google (which also does jabber) for new stuff and wanted to demo the idea to a few people. And because the zoneminder box is behind NAT, I can't use the nifty federation (s2s) feature that jabber has, which I could otherwise use to "link" my jabber server to another jabber server like jabber.org. AND although jabber works great over IPv6, nobody seems to do s2s over it....

I woke up this morning with a new idea. Since facebook supports jabber now, I could set up a jabber account on IT, and have zoneminder send its alerts out via facebook. Not only that, but by using the facebook API/uploader I could have it automatically upload the captured video, and, using the facebook friends system, only allow certain people to view it...

That's it for the new ideas on the zoneminder front. In other news... My mom came down last month and brought 4 of Ubiquiti's NanoStation M5 5GHz Wireless-n capable radios. I just got them configured - 2 with the default firmware, 1 with openwrt's own build of openwrt, and (this weekend) 1 with my own openwrt build...

Pluses - The hardware is great, it's low power, and rated to 70C, and it has a nifty POE passthrough feature for the second ethernet port. The factory firmware is dirt simple. The web based site survey tools are great. The power monitoring LEDs are useful. I can get two of the radios to talk at 108 megabits/sec, with a real transfer rate in the 80Mbit range. I'm told I can do better than that at longer ranges than within my house (shorter ranges overdrive the local receiver), but it's still basically double the theoretical performance of wireless-g and easily triple the real world performance I've observed with wireless-g...

Minuses - Lacking IPv6 and (especially) routing protocol support, the factory firmware is USELESS. Maybe I'm spoilt by dd-wrt and openwrt having support for ospf and olsr out of the box, but believe me, as soon as you add a couple routers to your network, having a routing protocol to automagically figure out how to get stuff from point A to points B,C,D is flat out necessary... and it's not there. (aside: the regular unavailability and lack of standardization of routing protocols is what made bridging so inevitable - there is ONE standard for bridging - stp - and everybody implements it - and it works well for most things)

Secondly, the factory firmware is based on an ancient version of linux - 2.6.15 - and that, in itself, scares me. That's over 4 years old.

So I installed openwrt, not without a little trepidation (you need to install the nano-m firmware via tftp from a recent build of openwrt trunk, then install a bunch of modules like ath9k, and yes, I'll write up more documentation)

oooh, luxury. 8MB of flash and 32MB of ram goes a LOT further than you'd think. Linux 2.6.32.9, even, less than 4 months behind the mainline linux kernel. I installed ipv6, babel, ahcpd, quagga, snmpd, ntp, and a few other things, and still had room to spare. The web interface is not all that great, but I don't really care, I want to manage these things via command line tools anyway. After some tweaking, I got openwrt talking to the Ubiquiti firmware...

but so far, only at 54Mbit. I'm missing some configuration parameter (I hope). I have a 40MHz channel configured, but... I'd like it very much if I could basically drive the radios at 100Mbit ethernet speeds; it would lower the need to shape the outgoing traffic as much, and anytime I can get something that is 5x better than an older technology (wireless 11b) I'm happier...

I also haven't figured out how to override the power settings to get the full range available; I'm limited to 17dBm of transmit power instead of 27dBm...

Update: These two problems are related. I'm definitely overdriving the receivers. I can actually connect as high as 162 Mbits/sec between the two. I found this out when I accidentally pointed one of the radios at the ceiling... Ooh. 162Mbits.... NICE...

Now, since I envision a day where I have to build out a bunch more of these radios, and partially because I'm a glutton for punishment, I downloaded openwrt and built my own firmware. That worked... on the first try! (that's amazing in itself, but usually a bad sign indicating real trouble ahead, as I will always then proceed to dig my own hole). On my second try - after configuring a few more modules - it fails on building the kernel for some reason:

LD [M] fs/autofs4/autofs4.o
mips-openwrt-linux-uclibc-ld: unrecognized option '-Wl,-rpath,/usr/local/lib'
mips-openwrt-linux-uclibc-ld: use the --help option for usage information

I'm stuck on that right now. I have a bunch of tests to do (reliability, traffic shaping, different routing protocols, etc) but I'd like the two different firmwares to at least be talking at the same rate.

But it is kind of neat to see babel "just working" to route between the routers. Hmm... maybe I can get vlans to work... I wonder if this thing has any pins I could hang a temp sensor off of...


 
Friday, February 12, 2010

  Rebooting for 2010

Computationally, I've been limping along, flying on the shreds of one wing. My quad core box died in August, my laptop died in December, and I killed Pocobelle back in November.

The quad-core? Dead beyond repair. I tried replacing the fan, the cpu, and the memory. Pocobelle requires some finicky jtag work to repair and was at the end of its useful life anyway, so I replaced it with an open-rd box.

And, rather than fix the laptop, I ended up building a new machine (specific to a project I can't talk about), in x86 mode. "Buddy" is a nice machine, a dual core atom, and the zotac nvidia graphics interface is amazingly fast, even when driving two screens. I don't miss my old power sucking quad core box, except when compiling kernels and ardour. I should have made buddy be x86_64, however, so I could just clone the laptop...

Back in August, I switched over to using emacs for chat, email, news. I took major steps to try and integrate chat, in particular, into emacs - using bitlbee and znc to integrate skype, irc, and jabber into one interface. It worked great!

I do the majority of my writing and coding in emacs, already, so I ended up with two fullscreen windows that used every pixel I had available, sparingly.

I wasn't done. In particular, I started working on some new blogging software and got stuck on it. Then the machine I was developing it on, pocobelle, died too, and the memory stick it ran on is around here... Somewhere...

It took a LOT of work to set all that up, and I haven't replicated it on the new box. It would have been easy had I stuck with x86_64 mode instead of reverting to x86. Instead I switched to using the new thunderbird for mail, dropped netnews entirely, and went back to pidgin for chat, because it was easier to set those up quickly. And went from the keyboard driven window manager I like, back to gnome. A major mistake, I'm thinking all that was, but I did get a chance to try some new tools.

Thunderbird 3 is awesome, in particular the tabbed searching facility is to die for. But I haven't got around to sorting my mailboxes with the same level of filtering I had with emacs. Due to switching to using imap for the email backend, I'm not sure how hard it's going to be to use emacs for mail again, and I'd rather like to keep thunderbird around as an option.

I HATE the gnome window manager. It's so dumbed down as to make me feel like Harrison Bergeron.

Pidgin is good but...

The thing is, I can feel the productivity seeping out of me every time I switch from one glossy white (web,chat,news) window to another. I feel far less productive minute by minute, hour by hour, by using these tools not integrated into my main editing interface. My personal, searchable database of everything I do was all living in emacs, and it was letting me manage far more "stuff" than I can without these tools integrated.

Sigh. So I kind of need to reboot my life - repairing the dead laptop, resurrecting pocobelle, getting a new desk made, laying out all the stuff that needs repair or upgrades on a workbench... and some time all by myself to methodically go through it all.

Facebook has finally adopted a standards based chat system, built around XMPP. That's one timesump I can just move to being chat. There's also a fix for yahoo out there in bitlbee. My first attempt at integrating the yahoo and otr patches into bitlbee did not go well; I hope someone else has fixed it.

I built out my email systems using digital certificates. They are about to expire and the cert I had for the website to un-expire them is on the dead laptop....

Hopefully I'll find the time this month to rebuild my environment to where I feel productive again. What I really want to be able to do is wake up, and sit down, and work, without having to reboot my world every time I get up. Solving THAT problem is going to require I get a solar/battery/inverter system so that I can keep the core systems powered up, and/or that I simplify my environment a lot more... need money for the inverter system...

Maybe I can outsource some of the backlog. Need money for that. Guess I need to work some, first.

While all that stuff was going backwards I did make major forward progress with "jaco", the open-rd replacement for pocobelle. It's a darn nice box.

At one point I had high hopes I could actually use jaco for a desktop replacement. At 11 watts, it eats ~1/4th the power of the atom, and runs emacs just fine. Given that my display eats 20 watts, I could halve my power consumption and double run time on battery. (Lest you think this is an obscure requirement, I frequently undergo day long power failures, and running the generator for long periods is both annoying and expensive, and my existing UPS lasts about 3 hours on the atom.)
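The "halve the power, double the run time" claim can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the atom's 44 watts is inferred by taking "~1/4th" literally, the 3 hour UPS figure is assumed to include the display, and the UPS is assumed to deliver the same usable energy at either load (real batteries usually do a bit better at lighter loads).

```python
# Figures from the post; anything marked "assumption" is not stated there.
JACO_WATTS = 11.0
ATOM_WATTS = JACO_WATTS * 4     # assumption: "~1/4th" taken literally -> 44 W
DISPLAY_WATTS = 20.0
UPS_HOURS_ON_ATOM = 3.0         # assumption: measured with the display attached


def total_draw(box_watts, display_watts=DISPLAY_WATTS):
    """Total load on the UPS: the computer plus the display."""
    return box_watts + display_watts


def estimated_runtime_hours(new_draw, old_draw, old_runtime_hours):
    """Scale a known runtime to a new load, assuming fixed usable energy."""
    return old_runtime_hours * old_draw / new_draw


atom_total = total_draw(ATOM_WATTS)
jaco_total = total_draw(JACO_WATTS)
ratio = jaco_total / atom_total
runtime = estimated_runtime_hours(jaco_total, atom_total, UPS_HOURS_ON_ATOM)

print(f"atom + display: {atom_total:.0f} W")
print(f"jaco + display: {jaco_total:.0f} W ({ratio:.0%} of the atom setup)")
print(f"estimated UPS runtime on jaco: {runtime:.1f} hours")
```

Under those assumptions the jaco setup draws a little under half of what the atom setup does, so roughly doubling the 3 hours on battery is about right.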

The video driver, however, is so horribly slow that I just can't stand using it as a desktop.

Jaco is great for everything else though. I have it running zoneminder, in particular, as part of my home security system, in addition to it being primary DNS, squid cache, web server, bittorrent client, ipv6 gateway, mesh networking, email/imap, znc, jabber server, music and I forget what else.

I have had so much fun developing stuff to work on the open-rd that I've been afraid to turn it to the production use I'd planned for it, namely replacing my router/firewall. I guess I need to get another one, or wait for the new "guruplug"s to arrive.

First up this morning - make an outline of everything I need to do this month. Importantly, this has to include revenue generating stuff, as I burned myself out back in December and need to get on the stick, but also importantly it has to get me to where I can "wake up and work"....

Second up this morning - fix the !@#@! Laptop. To do that, I need to find the pesky little screws for a new DVD drive, and burn a new cd, and that new DVD drive is going into the box I'm typing this on so I guess I'll have to power off in a sec before I can develop more of an outl


 
Tuesday, December 01, 2009

  Rad Decision

Tonight's online reading was Rad Decision, a novel about the events leading up to a nuclear accident in the US that didn't happen, but could have.

I found the plot gripping, the situations utterly believable, the characters decent, the backstory quite plausible, and the denouement worth thinking about. Recommended.

I still think nuclear technology belongs in the mix of energy sources for the future.

I'm impressed with how far Nanosolar has come in the past few years, although I find it weird that they are concentrating on Germany rather than on places, like Central America and Mexico, that have boatloads of solar power.

I do appreciate the rush towards greener consumer technology; LED lighting, in particular, seems to be getting better at a rapid rate. Earlier today I'd given up on getting any sheeva plugs this year, and went with an open-rd box, which promises about 1/3 the power consumption of a nearly equivalent PC. I also decided not to build out another quad core machine to replace my broken one.

Whether or not the open-rds could be used as a desktop and be as useful as an atom (the nearest competitor for the desktop) remains to be seen, but I intend to retire one of my computers in favor of it for use as the first iteration of Pocobelle 2, probably with added duties as a music server, and perhaps it - or the next Arm A9 based generation - will give intel a run for their money.

I read David Rowe's blog religiously. Here he talks about the power audit he did on his house and what he did to improve his negawatts. He also talks about what he did for his pool system to improve it, too.

I'm also big on geothermal power - I've been watching a Nicaraguan company, formerly known as Polaris Geothermal, now known as Ram Power complete a merger and obtain 150+ million in funding to bring up (at least) 70MW more geothermal power here. A terawatt is feasible, long term.

Nicaragua has enough geothermal energy in the ground to power 3 countries, if only they - or someone - would try harder.

Through such slow, incremental change is the future made.

I'd had a really bad week last week. Nothing for me, personally, was really working, and I'd had some bad news about the Montavista merger that basically tore up some retirement plans entirely.

I took a cheap cab out halfway to Rivas, as I was too broke to get all the way to the frontier to renew my passport and return - and started walking south. I made it as far as the windmill project down that way, which has been under construction ever since I'd got here, with numerous hassles and delays.

The wind off the lake was blowing hard, and every last one of those enormous shiny white windmills was turning, and they'd finished running the power lines out - They were actually delivering power!

It was so beautiful that I sat down in front of the land, watched those windmills turn and turn, and cried. Maybe only another engineer can understand what it is like to struggle and fail and flail and fail again and then see something else that was actually working.

Shortly thereafter someone picked me up and took me to the border.

I made it home with 70 cordoba in my pocket, but my good friends at El Pozo let me run a tab, and all the friends I hadn't seen in a month showed up that night to visit.

So my life goes on, with a few moments like these to make it worthwhile to struggle on, and to keep trying. I keep saying to myself that the only way to win is to not play the game, but it's deeper than that, you need to invent new rules for the game. Better ones.


 
Sunday, October 25, 2009

  Still not finishing the spec for Pocobelle 2, but satisfied all the same

The big reason why I haven't been participating in the otherwise fascinating thread on health care on my blog is that I swore to myself I'd finish the spec for Pocobelle version 2 by this weekend, and get it posted, and commented on.

Hopefully, with some feedback, I could order the "right thing" by the end of next week.

I write stuff and post it because - although the average intelligence of the internet may be low - the cumulative intelligence is inconceivably high. Out there, there is always someone that has the answer to your problem...

Examples of cumulative intelligence are the health care thread, and what I read today on slashdot about embedded hardware while taking a mental health break.

I had pretty much settled on the TS7500 board as being close to ideal for what I was looking to build. It was extremely low power, and had enough DIOs, host USB, flash, etc to do most of what I needed to do.

The only problem with it was that it didn't have enough ram to do all I needed to do, and was kind of slow. I was working out the implications for motion detection in zoneminder, trying to offload that function to other devices, and not liking what the shifts in costs were like - my camera costs went way, way up, and functionality down. Zoneminder's motion detect algorithm was very, very good, but also cpu intensive. I couldn't see my way clear to embedding anything even close to as good in an FPGA, and doubted your typical camera could, either... and I was trying to keep the moving parts and power budget to a bare minimum.

Slashdot steered me at two really cool looking products - most notably the Sheeva Plug, and secondarily the Open RD...

Which - aside from slightly higher power consumption - still far less than a laptop or atom board - are pretty much perfect for most of what I need (except the DIO relays, which I can probably do via USB).

I kind of wish they'd posted on this topic before I burned the hours writing my spec, and for all I know there is some showstopper in one of these boards, but it hurts a lot less to read than to write, so I'm going to go read up on them now, and maybe place an order or two.

The sheeva plugs have 512MB of RAM, 512MB of flash, USB 2.0, eat 3w of power at idle, and cost 99 BUCKS! For that price someone can afford to buy a couple, just to play with!

I am sore tempted to order a touchbook, too. Life is getting a bit more interesting in the arm world.


 
Sunday, October 11, 2009

  One mistake, and PocoBelle becomes a brick

It was a dark and stormy Saturday night. Power and internet were offline. I'd got to where I was satisfied with the Linux kernel I was running. It was doing everything I needed to do, I'd loaded up all my apps and (with 256MB of swap) hadn't had a crash in a week...

I wanted to focus on userspace, and make Pocobelle boot standalone so it didn't need to fetch, via tftp, its kernel anymore, and could stand completely alone...

(About the only bit of tuning I wanted to do was make it boot faster, I had had to put a rootdelay=10 seconds into the Linux boot so I didn't have to do any special magic (e.g. switchroot) to complete the boot off of the USB stick. But it booted in less than 2 minutes, and that was fine.)

I was *happy*. (Getting to where you have a stable kernel makes EVERYTHING else a thousand times easier)

Soooo... I decided to write the "stable" kernel to the on-board flash. I went into RedBoot and did a:



fis list

Name              FLASH addr  Mem addr    Length      Entry point
(reserved)        0x60000000  0x60000000  0x07D04000  0x00000000
RedBoot           0x67D04000  0x67D04000  0x00040000  0x00000000
vmlinux           0x67D44000  0x00218000  0x00160000  0x00218000
RedBoot config    0x67FF8000  0x67FF8000  0x00001000  0x00000000
FIS directory     0x67FFC000  0x67FFC000  0x00004000  0x00000000


Not remembering how Pocobelle got this way, (I'd had it in a box for 4 years or so) I blithely assumed that this was all I needed. I kept scratching my head over how this layout failed to match what was in the kernel for the layout:


static struct mtd_partition partition_info128[] = {
	{
		.name	= "TS-BOOTROM",
		.offset	= 0x00000000,
		.size	= 0x00020000,
	}, {
		.name	= "Linux",
		.offset	= 0x00020000,
		.size	= 0x07d00000,
	}, {
		.name	= "RedBoot",
		.offset	= 0x07d20000,
		.size	= 0x002fc000,
	},
};


My assumption was basically that I had an advanced kernel with a preproduction redboot and that the partition table info here was incorrect...

WRONG.

I did an "fis init -y" to clear out the uselessly small Linux partition, hacked the TS7250 driver to take my new partition table, built a new kernel, allocated 4MB for the new Linux kernel at 0x60000000, erased that partition, then tftp loaded and wrote the Linux kernel there...

I completely forgot that this device has a complex bootstrap process. The tiny on-board NOR flash has a tiny bootloader that then bootstraps another bootloader out on the NAND flash, which is then smart enough to load up Redboot, which in turn is smart enough to load up Linux.

Result: 1 brick. Pocobelle never gets to Redboot. It jumps right to Linux and blows up shortly thereafter.

I figure I overwrote the TS-BOOTROM, which, although it wasn't marked in the bogus fis table, was hiding at 0x60000000. The system has enough brains to jump to that spot, that's all....
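In hindsight, the collision is easy to see once the kernel's partition table is treated as offsets from the flash base. Here is a minimal sketch of that check. It assumes the offsets in partition_info128 are relative to the 0x60000000 base shown by fis list, and that the kernel's table (not the fis table) described the real layout; the function names are made up for illustration.

```python
FLASH_BASE = 0x60000000  # flash base address, as shown in the "fis list" output

# (name, offset from flash base, size), copied from partition_info128
KERNEL_PARTITIONS = [
    ("TS-BOOTROM", 0x00000000, 0x00020000),
    ("Linux",      0x00020000, 0x07D00000),
    ("RedBoot",    0x07D20000, 0x002FC000),
]


def overlaps(start_a, size_a, start_b, size_b):
    """True if the half-open ranges [start, start + size) intersect."""
    return start_a < start_b + size_b and start_b < start_a + size_a


def regions_hit(write_addr, write_size, partitions=KERNEL_PARTITIONS,
                base=FLASH_BASE):
    """Names of the partitions touched by a write at an absolute address."""
    offset = write_addr - base
    return [name for name, start, size in partitions
            if overlaps(offset, write_size, start, size)]


# The write described above: 4MB at the very start of flash.
hit = regions_hit(0x60000000, 4 * 1024 * 1024)
print("4MB at 0x60000000 touches:", ", ".join(hit))

# The same image placed just past the boot ROM would have stayed clear of it.
safe = regions_hit(0x60020000, 4 * 1024 * 1024)
print("4MB at 0x60020000 touches:", ", ".join(safe))
```

Run against the numbers in this post, the first write lands on both TS-BOOTROM and Linux, while the second touches only the Linux partition.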

Now, there may yet be something to do that will get me into rewriting the ts-bootloader into the NAND flash from the serial port. But I doubt it. (I'm writing this blog entry unattached from the internet, like I do a lot these days, so I can't google)

Post googling Update: Joy! There's a tool to boot from the serial port! Boo! I have to build redboot from scratch for it to work. I've been meaning to do that anyway, but...

There's no one to blame for this but myself. I should have dumped the contents of the flash there and looked at the strings in it to see what it meant before I overwrote it. Shoulda, but I didn't. I normally would have done that, back when I was doing this professionally. I would have puzzled and agonized for days over the mismatch between the flash info, the documentation, and the source code...

(Back then, the boards I was working on were usually very early prototypes, worth thousands or tens of thousands of dollars. My fear factor with a 145 dollar board is considerably less.)

It would be easy to fix if I had a jtag interface available, or was still living in the Silicon Valley, where I could visit any of a bunch of friends at companies that keep stuff like that lying around.

Sadly, this is the furthest I've been away from a jtag debugger in a decade. I keep meaning to get one, but they tend to be rather project specific and I've been unable to settle on the right boards for what I intend to be doing.

The nearest jtag debugger is probably 1200 miles away, in Florida, maybe further than that as Florida isn't exactly known as a hotbed of embedded development.

OK, so I rationalize to myself:

This was just an experiment, after all, to see what I could do that was new and different in the embedded world.

I've pretty much determined that the range of software I want to run is going to require at least 256MB of ram, and I might as well go for 512MB. I was kind of hoping to keep prototyping on the board (it was already quite useful as it was).

At the time that I killed Pocobelle... I was successfully running (all over ipv6):



All on this 300mw power sipping machine. I'd transferred every function I'd had a dedicated server (an old laptop) doing onto this box and was ready to turn it off...

The only things that were annoyingly slow were heavy duty disk writes (databases in particular) and the startup of interpreted programs, particularly on the web service side.

I had just switched (yesterday) from apache to using lighttpd (which was much faster than apache and, as a bonus, gave me high speed .flv streaming) and was working on getting fastcgi to work with the web interfaces for dimp, cricket, zoneminder, and my new blogging software.

And I killed it...

The darn thing was getting useful. I was actually getting dependent on it... it was routing my /48 ipv6 network, and running DNS and mail for the whole house and was serving up a bunch of mp3s and videos...

I'll miss you, pocobelle. I'll get you fixed as soon as I can, I promise.

The beauty of this particular project is that I can retreat, for a while, into booting into a qemu emulator of the arm processor in it. I never needed actual hardware to run it in the first place. It WAS essential I prove the board and kernel reliable and get a feel for its performance. That's it.

All the binary code for it actually lives on a USB stick. I can just pop that stick over there and resume working.

(This is so much better than life in the old days, pre-usb sticks, pre USB, even... I have spent months of my life in a jtag debugger, just trying to get a freshly designed and probably buggy board to run the first 50 instructions...)

So I slammed the USB sticks into another machine, made a copy, converted the result into qcow2 format, and booted up pocobelle virtually via qemu. The out of the box Linux kernel I have for that emulator is the Versatile variant of arm, which needed a bunch of modules, so I booted up another copy of the emulator that accessed the original versatile system image, and copied those over. I pointed inittab at a slightly different serial port (ttyAMA0) and added that port to /etc/securetty (to let me log in on it).

And vPocobelle came to life once again! (I still need to figure out the tun interface to get the emulator on the net, however)

It's actually 3x faster to run out of the emulator in this case!! And the kernel I was using was fully baked! I was done. I didn't need to work on it anymore! And I'd intended to focus on userspace issues anyway and see what more memory did for me...

So (temporarily) losing the hardware is a setback, but only a minor one.

I really, really liked that it let me stay on the internet for a week without power. I'm writing this now, without power, or internet... (Note to self, get more gas for the generator Monday)

I have no friggin idea how or when I'll get Pocobelle to boot again.

Maybe I'll find some California surfer dude visiting that can pick me up one in the Valley on the way down here....

It's probably cheaper to just get a new board...

I wonder what that will be?


 
Wednesday, October 07, 2009

  A quick automounter rant

A nice feature added to most Linux desktops in the last couple of years is the robust support for plugging almost any USB device in and having it "just work". Printers, USB keys, Mp3 players, pretty much all "do the right thing" now. USB support in linux is comparable to Mac and Windows now...

... Including the flaws.

I don't understand all the magic that takes place behind the scenes to get a mp3 player to mount on /media. I understand it picks up the partition table name somehow, and gets an event...

But...

Why do we mount the devices read/write? And leave them mounted all the time? And require human intervention to eject them? And have bad things happen if you don't do that? And (at least on the box I'm in front of) require you be logged in to the desktop to have it automount it in the first place?

Linux has had an automounter since the days of Apollo (Not the NASA version, the HP version), and for local devices it works pretty well. Automounters have never worked perfectly of course, particularly over failing networks...

It turned out pocobelle didn't do any of the magic that my desktop did, not mounting the drive at all, so I sat down and implemented automounts mostly the way I wanted them to work in the first place.

I added the following lines to /etc/auto.misc

sansa -fstype=vfat,rw,uid=1000,gid=1000,umask=022 :/dev/disk/by-id/usb-Rockbox_Internal_Storage_90000000000000000A4B4511712031AFD-0\:0-part1

sansa2 -fstype=vfat,rw,uid=1000,gid=1000,umask=022 :/dev/disk/by-id/usb-Rockbox_Internal_Storage_90000000000000000A4B4511712031AFD-0\:0-part2

(These are each one line. yes, my sansa runs Rockbox)

I created symlinks to the device in my public_html directory, so when it was plugged in I could stream media from it from any of the other devices in the house. When not plugged in, attempts to access it result in file not found, which is what I want. Also, when it is plugged in, I have a cron job set up to suck down some music at night from some internet radio stations I like (and not when it's not connected).

According to the setting in auto.master, it auto-dismounts 60 seconds after the last program using it stops, which makes it easy to just unplug it when I want to take it on the road... 99.99% of the time it will be dismounted - and I'd like it to mount read-only until something wants to write to it, which would take care of a few other nines... and implement a forced killall processes and umount triggered by a switch on one of the DIOs on the board to take care of the last .000001% of the problem.


 
Monday, October 05, 2009

  Some news from the life and death and life of Pocobelle, the 300mw mail router



The Pocobelle project, day 10



Pocobelle has been acting as a backup email router for a few days now.

I get several hundred emails per day. It used to be thousands, but I switched to using netnews & gmane for my more high-traffic mailing lists, the lists that I mostly read and don't write to, such as lkml. Pocobelle successfully coped with the email I get in the dribs and drabs I get it in - I didn't have any complaints. It was transparent to me, and everybody. It did STARTTLS crypto without cracking 10% of cpu. My tests included sending a few dozen mails from a server outside my network, but that was it. Most of the time mail goes right to my laptop...

Today was the perfect day to try pocobelle out in a real world scenario.

I was without power or internet for 8 hours. My generator didn't work. (Most likely, I'm out of gas). My ice cream melted. Oh, well. I needed to defrost the chicken anyway.

The UPS that pocobelle was running on showed 95% of its battery available after that period. I was really happy about that. Assuming I have a week when the internet stays up and power doesn't, I should be able to have my email delivered without a problem, and periodically fire up the laptop to read it.
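The "about a week" estimate follows from a simple extrapolation of those two numbers. This is a rough sketch, not a battery model: it assumes the UPS discharges linearly, that its reported percentage is accurate, and that the load stays the same as during the outage (no laptop plugged in).

```python
def projected_runtime_hours(hours_elapsed, percent_remaining):
    """Extrapolate total runtime from one reading, assuming linear discharge."""
    percent_used = 100.0 - percent_remaining
    if percent_used <= 0:
        raise ValueError("no measurable discharge yet; cannot extrapolate")
    return hours_elapsed * 100.0 / percent_used


# Figures from the outage above: 8 hours on battery, 95% left afterwards.
total_hours = projected_runtime_hours(8, 95)
total_days = total_hours / 24

print(f"projected runtime: {total_hours:.0f} hours (~{total_days:.1f} days)")
```

That works out to about 160 hours, a bit short of 7 days, so a week is plausible but leaves little margin for firing up the laptop off the same UPS.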

At 7PM, power and internet came back on simultaneously. I had previously turned off email to my laptop (and turned the laptop off, to save power). I booted up the laptop just to watch pocobelle do its stuff...

Pocobelle got on the Net... Got an ipv6 address... And started getting the backlog of mail...

The Bind9 DNS server rapidly got to 23MB in size... The cpu went to 100%... 93% of it, gone, waiting for disk access... the Loadavg leapt past 5... available memory dropped to zero... My ssh session locked up...

Midway through the 15th email it bounced 3 messages, then it died.

Pocobelle ran completely out of memory and came to a screeching halt.

Sigh. The perils of engineering.

I hadn't thought deeply about the interaction between DNS services and email. Freshly booted, there is no DNS cache on the system.

I hadn't thought about a complete and utter cold start of absolutely everything pocobelle was connected to. There were no DNS caches anywhere it talked to that weren't "cold".

I'd got into a pathological situation, where the bandwidth being chewed up by all the mail being sent, and the time it took to "walk" DNS to verify it as "good" mail, competed and combined to bounce mails it couldn't do a reverse lookup on.

At the same time, the load on the system was such as to put it on the moon in short order.

As crashes go, it was not pretty. It brought back a flashback from 1995 where a CEO I knew, ecstatic with his new 26MB powerpoint presentation, emailed it to everyone in the company, and everyone he knew in the world, besides. That was an age, also, when a *good* mail server only had 64MB of ram....

1) Pocobelle only has 64MB of memory. (Pocobelle 2 will have 256MB or more.) An easy "cure" for the memory problem was to enable swapping. When running without swap, a Linux system will free up memory by discarding unused read-only program text pages, and reading them back in from the filesystem when they get used.

While there are a lot of binary pages you can do this to, it doesn't work on pages that have been modified by the linker, and it (especially) doesn't work on interpreted languages like python and perl. These languages often do have plenty of little-used pages, but they are *data* and can't get discarded because some day they MIGHT be modified further.

This arm build does not appear to have Jakub Jelinek's prelink utility installed, which would free up more memory by prelinking the various binaries. Prelink solved a few problems, but in the arm world, most people (not me) use a libc that wasn't compatible with prelink. I'm still researching this...

So, anyway, there are plenty of things that can't get swapped out that could, if swap was enabled. So I added 128MB of swap on the flash. Linux doesn't require that you have swap on a raw partition (although it is a good idea), so I just did a:

dd if=/dev/zero bs=1024k of=/etc/swap count=128
mkswap /etc/swap
swapon /etc/swap # and add to /etc/fstab
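
The "add to /etc/fstab" step in that last comment could look something like the sketch below. The function name add_swap_entry is made up for illustration, and the entry uses the usual "none swap sw 0 0" fields for a swap file. It only appends the line if it isn't there already, so it is safe to run twice.

```shell
#!/bin/sh
# add_swap_entry SWAPFILE [FSTAB]
# Append a swap entry for SWAPFILE to FSTAB (default /etc/fstab) unless the
# exact same line is already present. The function name is hypothetical.
add_swap_entry() {
    swapfile=$1
    fstab=${2:-/etc/fstab}
    entry="$swapfile none swap sw 0 0"
    # -x: match whole lines only, -F: treat the entry as a fixed string
    grep -qxF "$entry" "$fstab" 2>/dev/null || printf '%s\n' "$entry" >> "$fstab"
}

# Usage, as root:  add_swap_entry /etc/swap
# Swap files are normally readable by root only:  chmod 600 /etc/swap
```

Passing a second argument points it at a scratch file, which is handy for checking the result before touching the real fstab.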

An even better cure for this would be to use a box with more memory but that's a problem reserved for pocobelle 2.

With swapping enabled, pocobelle grew decidedly less "chunky" in the general case. There is always a lot more free memory available for general use - for example, bind9 dropped from 23MB of ram down to 14MB. In normal use, 16MB is living on swap by default.

Whenever I get around to reformatting this USB key, I will put swap on a raw partition. I might put it on the built-in flash on the board, actually. We'll see.

2) Pocobelle was configured to use one DNS server - its own - and forward to several local servers attached to my wireless network, provided by my provider. While this is a decent config... One that a normal client would use... given that all of the servers it was connecting to were ALSO freshly booted and ALSO had to walk DNS there, they ALL failed within the default DNS timeouts.

What I decided to do was establish a robust set of DNS servers (5), having pocobelle talk to itself twice - once at the beginning of the loop and again at the end. In the middle it talks to my main mail server, which, having already done the anti-spam protection in the first place, should have a cached record of the remote server's origin already.

It should effectively put a 10 second timeout on the DNS lookup instead of a 2 second one, AND get to at least one server that has a good, primed, cache; a server in the US that's impervious to power failures.
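
The arithmetic behind that "10 seconds instead of 2" can be sketched as below, assuming each server in the list is tried in turn and gets 2 seconds before the next one is asked. The function name and all of the addresses are placeholders (documentation prefixes), not the real configuration.

```shell
#!/bin/sh
# worst_case_wait TIMEOUT SERVER...
# Seconds spent if every server in the list times out, one after another.
# ASSUMPTION: servers are tried strictly in order, TIMEOUT seconds each.
worst_case_wait() {
    per_server=$1
    shift
    echo $(( per_server * $# ))
}

# Hypothetical order: pocobelle itself first and last, the main mail server
# (the one with the primed cache) in the middle.
worst_case_wait 2 ::1 2001:db8::53 2001:db8:1::25 2001:db8::54 ::1
```

With five entries at 2 seconds each this prints 10, and the primed cache on the mail server gets asked by the 6 second mark at the latest.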

(Getting this to work was a little tricky in that I'm using bind views internally to give me a consistent picture of my network and routing configuration(s), but I'm not going to go into that here)

I hope this is sufficiently robust. I'm not going to purposely instigate another 8 hour delay on my email, at least, not in the near future.

Another answer to this is to cache more of the internet's DNS service at the start, before accepting mail. (My mail is mostly not random, but comes from a limited number of mailing list servers). I have a buddy that used to cache the entire DNS root zones back in 1995. Maybe that's still possible.

It would be good to have some sort of cache log that I could replay on name service startup (or at certain times of the day, for example, shortly before I wake up in the morning) to prime the cache(s). I can sort of do this by replaying the mail logs through the DNS system, but it would be cleaner if I could figure out a way to get my top 100 sites out of bind periodically.
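
Replaying the mail logs could be sketched roughly like this. It assumes Postfix-style "connect from name[address]" log lines, which is a guess; the function names are made up, and the pattern would need adjusting for any other mail server or log format.

```shell
#!/bin/sh
# top_mail_hosts N
# Read mail log lines on stdin and print the N hosts that connected most
# often, busiest first.
# ASSUMPTION: Postfix-style "connect from name[address]" lines.
top_mail_hosts() {
    sed -n 's/.*connect from \([^[]*\)\[.*/\1/p' |
        grep -v -x 'unknown' |
        sort | uniq -c | sort -rn |
        awk '{ print $2 }' |
        head -n "$1"
}

# prime_cache N LOGFILE
# Resolve each of those hosts once so the answers are already in the local
# resolver's cache before mail starts arriving. `host` comes with the bind
# utilities; lookup failures are ignored on purpose.
prime_cache() {
    top_mail_hosts "$1" < "$2" | while read -r name; do
        host "$name" >/dev/null 2>&1 || true
    done
}

# Usage (the log path is a placeholder):  prime_cache 100 /var/log/mail.log
```

Run from cron shortly before the morning mail rush, or right after named starts, this only warms forward lookups. Extracting the bracketed addresses the same way and feeding them to host would cover the reverse lookups too.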

3) Given that write speeds to the flash are so slow, it would be best to always keep at least 512K reserved for disk buffers. Writes smaller than 128k at a time are *bad* with flash.

I used to know how to do that, but the interface to the Linux swapper has changed so much that I have to google to figure it out. (Most of this blog was written after the power failed again)

4) Given that one of pocobelle's purposes is to be a mail router, and it lives on ipv6 which has little to no spam on it, it's somewhat pointless having even the minimal anti-spam services I have on it (like those reverse lookups that caused the bounces in the first place).

I'm not going to strip them out, though. I actually want to make this into a system capable of the best anti-spam measures I can come up with, because spam is just never going to go away.

...

So, after fixing 1 and 2, I fired up rss2email on a new user on pocobelle. Rss2email is written in python. It took 12 seconds to start, and 16MB of ram, and was really going slow, so I decided that I didn't need to do that on pocobelle itself, but on my smart host elsewhere. Pocobelle just needs the mail itself, not the process that generates it.

Result: I got 25 messages as fast as they could be delivered.

It ran happily with 16MB of ram out on swap.

I'm happy with pocobelle today. I'm going to turn off my laptop tonight and see what happens.

AM Update: I turned the laptop on again, and got about 60 emails sent in rapid succession. The night before I'd doubled the default number of connections to 12 in a burst of optimism.... Pocobelle handled the load, but I think I'm going to limit the number of inbound and outbound connections to 4. At 12, it ran at 93% of cpu and got down to very little memory during its burst of email. Pocobelle needs to remain responsive to DNS, in particular, as it's the main DNS server for the household, and has quite a few other things to do besides email.

Now, I'm running full starttls (encrypted) email inside of my household, which probably accounts for some of the cpu usage, but I think the overhead was of startup and running all those processes, not the crypto.

Maybe I'll try rate-limiting the number of inbound connections via iptables, tarpitting them maybe, to keep the mail server on the other side happy once pocobelle gets past 3, keeping it from rescheduling the mail repeatedly. That will ensure a burst of email actually gets sent, albeit slowly. (This is also a good anti-spam measure.)
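
One way that iptables idea might look, as a sketch only: the connlimit match counts open connections, and silently dropping the extra SYNs (rather than rejecting them) leaves the sending server's TCP stack to retry on its own a little later. The function name is made up, and it just prints the rules for review instead of applying them. A true tarpit would need the TARPIT target from xtables-addons, which is not assumed here.

```shell
#!/bin/sh
# smtp_limit_rules MAX
# Print (do not apply) rules that drop new inbound SMTP connections once MAX
# are already open. --connlimit-mask 0 lumps every source address into one
# bucket, so the limit is on the total rather than per sender.
smtp_limit_rules() {
    max=$1
    for cmd in iptables ip6tables; do
        echo "$cmd -A INPUT -p tcp --syn --dport 25 -m connlimit --connlimit-above $max --connlimit-mask 0 -j DROP"
    done
}

smtp_limit_rules 3
```

Piping the output to sh as root would install the rules. Printing first makes it easy to eyeball them, which matters on a box that is also the household DNS server.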

On to figuring out 3 and 4...


 
David Täht writes about politics, space, copyright, the internet, audio software, operating systems and surfing.

