Postcards from the Bleeding Edge
Saturday, May 24, 2008

  Colorizing the big bright green state machine

I've been busily optimizing OSLEC for the blackfin, X86, and X86_64 architectures.

It took a long period of thinking about how the algorithm really worked and how it interacted with the x86 hardware to figure out how to speed x86 up (My subconcious has labored on it since I hung out with David Rowe in February, watching an IP08 kind of fall over under the workload). I'm still not entirely sure how OSLEC works, actually, the inner mystery slips in and out of my head... David has been living inside it for far longer than I have and I keep referring back to his emails)

But: I had multiple eureka moments on the most mips eating loop on sunday night... I spent 2 days coding and testing... and (yea!)

I think I have achieved a major speedup for oslec on x86 and even more major on x86_64. I'm not going to say how much better it is until I finish checking to make sure it's bit exact with the original. (It's a lot faster. A virtual beer to the person that comes up with the closest guess. )

Flush with that success, creatively inspired, and wanting to defer writing a mind sucking tedious series of bitwalking tests for bit exactness for as long as possible, I took on the Blackfin.

I've spent about a month over the past quarter devouring manuals on the blackfin architecture and assembly language. It took reading some of them two or three times to really "get it". It is a cool processor!

So, for oslec, I came up with a parallel algorithm that I figured would be easily twice as fast as the original. (and with a little work, maybe more)

I had my eureka moment on wednesday I spent a day writing the code. 2 days cutting it down from 9 instructions to 7 in the inner loop. 3 days trying to get it to run. (you'll note how the math works out here, is that I didn't sleep very much)

It ran for the first time today. It's twice as fast as the original.

I think I have a major win there, but it is definitely NOT bit exact. !@#!@!!.

It's a rather big state machine to look at...

Worse.... I've got another major state machine that I'm debugging (alphatrack driver). I gave up on THAT last week, because it was just too tedious..... when programming stops being fun is when I start wanting a paycheck....

anyway, my whole life I have tended to look at enormous pictures of state that kind of look like this:


N0: 11000011111111000000000000000000
N1: 10000000000000000000000000000000
N2: 10000000000000000000100000000000
N3: 10000000000000000000100000000001
N4: 10000000000000000000100000000011
...


And I would let my eye pick out when it had gone wrong. Sometimes, when two parallel operations are going on, or I'm comparing a working piece of code with an optimized replacement, it makes sense to have BEFORE and AFTER on the same line:


BEFORE: 11000011111111000000000000000000 AFTER : 10000000000000000000000000000000
BEFORE: 11000011111111000000000000000000 AFTER : 10000000000000000000000000000000
BEFORE: 11000011111111000000000000000000 AFTER : 10000000000000000000000000000000
BEFORE: 11000011111111000000000000000000 AFTER : 10000000000000000000000000001000


Eyes are really good at seeing pictures in patterns like this, but when a state machine gets too big to fit on the screen, the patterns get harder to find. I would use smaller and smaller fonts until I could see as much as possible (sometimes hundreds of state changes on a screen - you don't have to actually be able to read the ones and zeros, font sizes of 2 are ok!)

You can take this to a real extreme if you have graphics available, and I've always kind of wanted to be able to zoom out of my session far enough to reduce each bit to a pixel...

It never occurred to me before now to use color for the *differences* between two machines.

It may be because I used to do this stuff on a huge monochrome Xterminal and it was simply easier to print out huge amounts of data and let my eyes due the work. I still miss that old Xterminal... and for that matter, green screens. So do my eyes.

My terminal windows are all green. The whole web has gone bright white and I find it, well, unsettling. Green is calming. it lets me think....

Anyway, I was looking for changes in new code vs old code in a 40 + 32 + 32 bit state machine and I decided that this was a better way to look at the differences:

10000000000000000000000000000000000000001000000000000000000000000000000011000011111111000000000000000000
10000000000000000000000000000000000000001100000000000000000000000000000001100100100100000000000000000000
00001011111000000000000000000000000000001000000000000000000000000000000010010110101100000000000000000000
10000000000000000000000000000000000000001000000000000000000000000000000011000011111111000000000000000000

And wrote an itty bitty little C program (that compiled the first time!) to generate both HTML and ansi character sequences. I like my cool green, and prefer cool colors, so on the green terminal this actually generates blue colors....

C++ has a nicer way of dealing with values larger than 64 bits (bitsets - LOVE EM), but this was good enough for my purposes....

I could turn this into a library but I gotta state machine to debug. Now that I've blogged I've run out of reasons to put it off... I sure hope a swell comes in soon....


/* Diffbits - released into the public domain 2008 by Michael Taht */

#include

// Some ansi standard strings

#ifdef HTML
#define REDON ""
#define REDOFF ""
#else
#define REDON "\033[01;34m"
#define REDOFF "\033[00m"
#endif

static int diffbits_64(long a,long b,int size) {
int color = 0;
int abit = 0;
int bbit = 0;
int status = 0;
int i;
for(i = 0; i < size; i++) {
abit = (a&1);
bbit = (b&1);
if(abit != bbit) {
status = 1;
if(color !=1) {
printf(REDON);
color = 1;
}
} else {
if(color ==1) {
printf(REDOFF);
color = 0;
}
}
printf("%d",abit);
a = a >> 1;
b = b >> 1;
}
if(color ==1) {
printf(REDOFF);
color = 0;
}
return (status);
}

int diffbits64(int a, int b) {
return diffbits_64(a,b,sizeof(long long)*8);
}

int diffbits32(int a, int b) {
return diffbits_64(a,b,sizeof(int)*8);
}

int diffbits16(int a, int b) {
return diffbits_64(a,b,sizeof(short)*8);
}

int diffbits8(char a, char b) {
return diffbits_64(a,b,sizeof(char)*8);
}

diffbits(long a, long b) {
diffbits_64(a,b,sizeof(int)*8);
}

#ifdef TEST

int main () {
diffbits(1,0); printf("\n");
diffbits(-1,1); printf("\n");
diffbits(555,225); printf("\n");
diffbits(16323,-12225); printf("\n");

diffbits8(1,-1); printf("\n");
diffbits16(1,-1); printf("\n");
diffbits16(16323,-12225); printf("\n");
diffbits32(1,-1); printf("\n");
diffbits32(1,2); printf("\n");
diffbits32(16323,-12225); printf("\n");
diffbits64(1,-1); printf("\n");
diffbits64(1,2); printf("\n");
diffbits64(16323,-12225); printf("\n");
}

Labels: , ,

 
Comments:
Working with a EE here, we did a box for the power utility. They wanted to measure voltage and current to a DC magnet and, with the coefficient of heating for copper, calculate the coil's temperature. Output was a DC voltage proportional to the calculated temperature.
What made it fun was two things. First, the power levels we were measuring. You may recall from physics that P = E*I. The DC magnet consumes a little of two megawatts (P in the formula). Voltage (E) and current (I) were several hundred to a couple of thousand volts and amps, respectively.
The magnet is the armature of a gas-fired, steam-powered AC generator feeding the local grid. Output of the generator is a couple of hundred megawatts.
And they wanted a 10baseT connector so they could hook up a garden variety notebook, crank up a browser and tweak the parameters via a web interface.
We bought a cheap 386 board with two A to D converters, used 1000:1 transducers at the generator with loooong wires and low pass filters at the 386 board, and the output goes through another looong wire to the power station control room.
It boots (Linux, of course) from compact flash and runs, and runs, and runs.
Sorry, no state machines. Just inexpensive, easy to assemble, a smidgin of floating point calculation once a second, and lots of CPU cycles left over.
We did it as a "proof of concept" and haven't made any money on it, at least not yet.
But it sure was fun dealing with real hardware and some impressive numbers.
 
Post a Comment



<< Home
David Täht writes about politics, space, copyright, the internet, audio software, operating systems and surfing.


Resume,Songs,
My new blog, NeX-6, My facebook page
Orgs I like
The EFF - keeping free speech in the world
Musical stuff I like
Jeff, Rick, Ardour, Jack
Prior Rants - Transport strike over, working too hard ioquake3 gets ipv6 No Luz, no taxi Blood in the water getting ready to hang it up Lock up the *ssholes Big batch of OLPCs distributed in Nepal No right to not remain silent Election insanity, and other unpopular causes Back in Nicaragua, haunted by jellyfish
Best of the blog:
Uncle Bill's Helicopter - A speech I gave to ITT Tech - Chicken soup for engineers
Beating the Brand - A pathological exploration of how branding makes it hard to think straight
Inside the Internet Mind - trying to map the weather within the global supercomputer that consists of humans and google
Sex In Politics - If politicians spent more time pounding the flesh rather than pressing it, it would be a better world
Getting resources from space - An alternative to blowing money on mars using NEAs.
On the Columbia - Why I care about space
Authors I like:
Doc Searls
Where's Cherie?
UrbanAgora
Jerry Pournelle
The Cubic Dog
Evan Hunt
The Bay Area is talking
Brizzled
Zimnoiac Emanations
Eric Raymond
Unlocking The Air
Bob Mage
BroadBand & Me
SpaceCraft
Selenian Boondocks
My Pencil
Transterrestial Musings
Bear Waller Hollar
Callahans
Pajamas Media BlogRoll Member

If you really want to, you can poke through the below links as well.

ARCHIVES
06/09/2002 - 06/16/2002 / 07/28/2002 - 08/04/2002 / 08/11/2002 - 08/18/2002 / 08/18/2002 - 08/25/2002 / 08/25/2002 - 09/01/2002 / 09/22/2002 - 09/29/2002 / 11/10/2002 - 11/17/2002 / 12/15/2002 - 12/22/2002 / 12/22/2002 - 12/29/2002 / 12/29/2002 - 01/05/2003 / 01/05/2003 - 01/12/2003 / 01/19/2003 - 01/26/2003 / 01/26/2003 - 02/02/2003 / 02/09/2003 - 02/16/2003 / 02/16/2003 - 02/23/2003 / 03/02/2003 - 03/09/2003 / 03/16/2003 - 03/23/2003 / 04/06/2003 - 04/13/2003 / 04/13/2003 - 04/20/2003 / 04/20/2003 - 04/27/2003 / 05/04/2003 - 05/11/2003 / 05/18/2003 - 05/25/2003 / 05/25/2003 - 06/01/2003 / 06/01/2003 - 06/08/2003 / 06/08/2003 - 06/15/2003 / 06/15/2003 - 06/22/2003 / 06/22/2003 - 06/29/2003 / 06/29/2003 - 07/06/2003 / 07/20/2003 - 07/27/2003 / 07/27/2003 - 08/03/2003 / 08/03/2003 - 08/10/2003 / 08/10/2003 - 08/17/2003 / 08/17/2003 - 08/24/2003 / 08/24/2003 - 08/31/2003 / 08/31/2003 - 09/07/2003 / 09/07/2003 - 09/14/2003 / 09/14/2003 - 09/21/2003 / 09/21/2003 - 09/28/2003 / 09/28/2003 - 10/05/2003 / 10/05/2003 - 10/12/2003 / 10/12/2003 - 10/19/2003 / 10/19/2003 - 10/26/2003 / 10/26/2003 - 11/02/2003 / 11/02/2003 - 11/09/2003 / 11/09/2003 - 11/16/2003 / 11/30/2003 - 12/07/2003 / 12/07/2003 - 12/14/2003 / 12/14/2003 - 12/21/2003 / 12/28/2003 - 01/04/2004 / 01/11/2004 - 01/18/2004 / 01/18/2004 - 01/25/2004 / 01/25/2004 - 02/01/2004 / 02/01/2004 - 02/08/2004 / 02/08/2004 - 02/15/2004 / 02/15/2004 - 02/22/2004 / 02/22/2004 - 02/29/2004 / 02/29/2004 - 03/07/2004 / 03/14/2004 - 03/21/2004 / 03/21/2004 - 03/28/2004 / 03/28/2004 - 04/04/2004 / 04/04/2004 - 04/11/2004 / 04/11/2004 - 04/18/2004 / 04/18/2004 - 04/25/2004 / 04/25/2004 - 05/02/2004 / 05/02/2004 - 05/09/2004 / 05/09/2004 - 05/16/2004 / 05/16/2004 - 05/23/2004 / 05/30/2004 - 06/06/2004 / 06/06/2004 - 06/13/2004 / 06/13/2004 - 06/20/2004 / 06/20/2004 - 06/27/2004 / 06/27/2004 - 07/04/2004 / 07/04/2004 - 07/11/2004 / 07/11/2004 - 07/18/2004 / 07/18/2004 - 07/25/2004 / 08/08/2004 - 08/15/2004 / 08/22/2004 - 08/29/2004 / 08/29/2004 - 09/05/2004 / 09/05/2004 - 09/12/2004 / 09/19/2004 - 09/26/2004 / 09/26/2004 - 10/03/2004 / 10/03/2004 - 10/10/2004 / 10/10/2004 - 10/17/2004 / 10/17/2004 - 10/24/2004 / 10/24/2004 - 10/31/2004 / 10/31/2004 - 11/07/2004 / 11/07/2004 - 11/14/2004 / 11/14/2004 - 11/21/2004 / 11/21/2004 - 11/28/2004 / 11/28/2004 - 12/05/2004 / 12/05/2004 - 12/12/2004 / 12/12/2004 - 12/19/2004 / 12/19/2004 - 12/26/2004 / 12/26/2004 - 01/02/2005 / 01/02/2005 - 01/09/2005 / 01/16/2005 - 01/23/2005 / 01/23/2005 - 01/30/2005 / 01/30/2005 - 02/06/2005 / 02/06/2005 - 02/13/2005 / 02/13/2005 - 02/20/2005 / 02/20/2005 - 02/27/2005 / 02/27/2005 - 03/06/2005 / 03/06/2005 - 03/13/2005 / 03/27/2005 - 04/03/2005 / 04/03/2005 - 04/10/2005 / 04/10/2005 - 04/17/2005 / 05/29/2005 - 06/05/2005 / 06/05/2005 - 06/12/2005 / 06/12/2005 - 06/19/2005 / 06/19/2005 - 06/26/2005 / 06/26/2005 - 07/03/2005 / 07/03/2005 - 07/10/2005 / 07/10/2005 - 07/17/2005 / 07/24/2005 - 07/31/2005 / 07/31/2005 - 08/07/2005 / 08/07/2005 - 08/14/2005 / 08/14/2005 - 08/21/2005 / 08/21/2005 - 08/28/2005 / 08/28/2005 - 09/04/2005 / 09/04/2005 - 09/11/2005 / 09/11/2005 - 09/18/2005 / 09/18/2005 - 09/25/2005 / 09/25/2005 - 10/02/2005 / 10/02/2005 - 10/09/2005 / 10/09/2005 - 10/16/2005 / 10/16/2005 - 10/23/2005 / 10/23/2005 - 10/30/2005 / 10/30/2005 - 11/06/2005 / 11/06/2005 - 11/13/2005 / 11/13/2005 - 11/20/2005 / 11/20/2005 - 11/27/2005 / 11/27/2005 - 12/04/2005 / 12/04/2005 - 12/11/2005 / 12/11/2005 - 12/18/2005 / 12/18/2005 - 12/25/2005 / 01/01/2006 - 01/08/2006 / 01/08/2006 - 01/15/2006 / 01/15/2006 - 01/22/2006 / 01/22/2006 - 01/29/2006 / 01/29/2006 - 02/05/2006 / 02/19/2006 - 02/26/2006 / 03/05/2006 - 03/12/2006 / 03/19/2006 - 03/26/2006 / 03/26/2006 - 04/02/2006 / 04/02/2006 - 04/09/2006 / 04/09/2006 - 04/16/2006 / 04/23/2006 - 04/30/2006 / 05/07/2006 - 05/14/2006 / 05/14/2006 - 05/21/2006 / 05/21/2006 - 05/28/2006 / 06/04/2006 - 06/11/2006 / 06/11/2006 - 06/18/2006 / 06/18/2006 - 06/25/2006 / 06/25/2006 - 07/02/2006 / 07/02/2006 - 07/09/2006 / 07/09/2006 - 07/16/2006 / 07/23/2006 - 07/30/2006 / 08/06/2006 - 08/13/2006 / 08/13/2006 - 08/20/2006 / 09/03/2006 - 09/10/2006 / 09/17/2006 - 09/24/2006 / 09/24/2006 - 10/01/2006 / 10/01/2006 - 10/08/2006 / 10/22/2006 - 10/29/2006 / 11/19/2006 - 11/26/2006 / 11/26/2006 - 12/03/2006 / 12/03/2006 - 12/10/2006 / 12/10/2006 - 12/17/2006 / 12/17/2006 - 12/24/2006 / 12/24/2006 - 12/31/2006 / 01/07/2007 - 01/14/2007 / 01/14/2007 - 01/21/2007 / 01/28/2007 - 02/04/2007 / 02/11/2007 - 02/18/2007 / 02/18/2007 - 02/25/2007 / 02/25/2007 - 03/04/2007 / 03/04/2007 - 03/11/2007 / 03/18/2007 - 03/25/2007 / 04/01/2007 - 04/08/2007 / 04/08/2007 - 04/15/2007 / 04/15/2007 - 04/22/2007 / 04/22/2007 - 04/29/2007 / 04/29/2007 - 05/06/2007 / 05/06/2007 - 05/13/2007 / 05/20/2007 - 05/27/2007 / 05/27/2007 - 06/03/2007 / 06/03/2007 - 06/10/2007 / 06/10/2007 - 06/17/2007 / 06/17/2007 - 06/24/2007 / 07/01/2007 - 07/08/2007 / 07/08/2007 - 07/15/2007 / 07/22/2007 - 07/29/2007 / 07/29/2007 - 08/05/2007 / 08/05/2007 - 08/12/2007 / 08/26/2007 - 09/02/2007 / 09/09/2007 - 09/16/2007 / 09/23/2007 - 09/30/2007 / 09/30/2007 - 10/07/2007 / 10/07/2007 - 10/14/2007 / 10/14/2007 - 10/21/2007 / 10/21/2007 - 10/28/2007 / 10/28/2007 - 11/04/2007 / 11/04/2007 - 11/11/2007 / 11/11/2007 - 11/18/2007 / 11/18/2007 - 11/25/2007 / 11/25/2007 - 12/02/2007 / 12/02/2007 - 12/09/2007 / 12/09/2007 - 12/16/2007 / 12/16/2007 - 12/23/2007 / 12/23/2007 - 12/30/2007 / 01/06/2008 - 01/13/2008 / 02/03/2008 - 02/10/2008 / 02/17/2008 - 02/24/2008 / 02/24/2008 - 03/02/2008 / 03/02/2008 - 03/09/2008 / 03/09/2008 - 03/16/2008 / 03/16/2008 - 03/23/2008 / 03/23/2008 - 03/30/2008 / 03/30/2008 - 04/06/2008 / 04/20/2008 - 04/27/2008 / 04/27/2008 - 05/04/2008 / 05/04/2008 - 05/11/2008 / 05/11/2008 - 05/18/2008 / 05/18/2008 - 05/25/2008 / 05/25/2008 - 06/01/2008 / 06/01/2008 - 06/08/2008 / 06/08/2008 - 06/15/2008 / 06/15/2008 - 06/22/2008 / 06/22/2008 - 06/29/2008 / 07/06/2008 - 07/13/2008 / 07/13/2008 - 07/20/2008 / 07/20/2008 - 07/27/2008 / 07/27/2008 - 08/03/2008 / 08/03/2008 - 08/10/2008 / 08/10/2008 - 08/17/2008 / 08/17/2008 - 08/24/2008 / 08/31/2008 - 09/07/2008 / 09/07/2008 - 09/14/2008 / 09/14/2008 - 09/21/2008 / 09/21/2008 - 09/28/2008 / 09/28/2008 - 10/05/2008 / 10/05/2008 - 10/12/2008 / 10/12/2008 - 10/19/2008 / 10/19/2008 - 10/26/2008 / 10/26/2008 - 11/02/2008 / 11/02/2008 - 11/09/2008 / 11/09/2008 - 11/16/2008 / 11/16/2008 - 11/23/2008 / 12/07/2008 - 12/14/2008 / 12/21/2008 - 12/28/2008 / 12/28/2008 - 01/04/2009 / 01/18/2009 - 01/25/2009 / 01/25/2009 - 02/01/2009 / 03/22/2009 - 03/29/2009 / 05/10/2009 - 05/17/2009 / 05/17/2009 - 05/24/2009 / 05/31/2009 - 06/07/2009 / 06/14/2009 - 06/21/2009 / 06/21/2009 - 06/28/2009 / 06/28/2009 - 07/05/2009 / 07/05/2009 - 07/12/2009 / 07/12/2009 - 07/19/2009 / 07/26/2009 - 08/02/2009 / 08/09/2009 - 08/16/2009 / 08/23/2009 - 08/30/2009 / 09/06/2009 - 09/13/2009 / 09/20/2009 - 09/27/2009 / 09/27/2009 - 10/04/2009 / 10/04/2009 - 10/11/2009 / 10/11/2009 - 10/18/2009 / 10/18/2009 - 10/25/2009 / 10/25/2009 - 11/01/2009 / 11/29/2009 - 12/06/2009 / 12/27/2009 - 01/03/2010 / 01/31/2010 - 02/07/2010 / 02/07/2010 - 02/14/2010 / 02/28/2010 - 03/07/2010 / 03/07/2010 - 03/14/2010 / 03/28/2010 - 04/04/2010 / 04/18/2010 - 04/25/2010 / 05/16/2010 - 05/23/2010 / 05/30/2010 - 06/06/2010 / 06/13/2010 - 06/20/2010 / 06/20/2010 - 06/27/2010 / 07/04/2010 - 07/11/2010 / 07/11/2010 - 07/18/2010 / 07/18/2010 - 07/25/2010 / 08/08/2010 - 08/15/2010 / 08/29/2010 - 09/05/2010 / 09/05/2010 - 09/12/2010 / 09/19/2010 - 09/26/2010 / 09/26/2010 - 10/03/2010 / 10/10/2010 - 10/17/2010 / 10/17/2010 - 10/24/2010 / 10/31/2010 - 11/07/2010 / 11/28/2010 - 12/05/2010 / 12/05/2010 - 12/12/2010 / 12/12/2010 - 12/19/2010 / 12/26/2010 - 01/02/2011 / 03/06/2011 - 03/13/2011 / 03/13/2011 - 03/20/2011 / 05/22/2011 - 05/29/2011 / 08/07/2011 - 08/14/2011 / 08/14/2011 - 08/21/2011 / 09/18/2011 - 09/25/2011 / 10/02/2011 - 10/09/2011 / 10/09/2011 - 10/16/2011 / 11/06/2011 - 11/13/2011 / 01/15/2012 - 01/22/2012 / 04/22/2012 - 04/29/2012 / 06/24/2012 - 07/01/2012 / 08/05/2012 - 08/12/2012 / 08/11/2013 - 08/18/2013 / 03/01/2015 - 03/08/2015 / 10/04/2015 - 10/11/2015 / 11/08/2015 - 11/15/2015 / 12/22/2019 - 12/29/2019 / 04/05/2020 - 04/12/2020 / 07/21/2024 - 07/28/2024 /


Powered by Blogger