Quit bugging me: surprise NaN!

You've got a complex system: dozens of tasks and a handful of interrupt routines. It runs fine for hours, sometimes days, on end. The load is somewhat "bursty": average load is only about 20% of peak, and peaks happen only once every hour, for less than 5 seconds. Every once in a while, for no apparent reason, one of your floating point calculations suddenly produces a divide by zero, a not-a-number (NaN), or some other unanticipated erroneous result.

Only a few tasks in your system use floating point. You're pretty sure none of your ISRs do; all they do is copy or move data around, so they shouldn't have to compute anything. Yet occasionally any one of the tasks running floating point will go belly-up with inexplicable results. It looks like some kind of random corruption.
