Open Standards Posts

December 01, 2006

Mitigating Risk

The phone rings.  A customer has an application destined for a high-risk environment, and somewhere in test they've found a new, unanticipated condition.  A "Tiger Team" is formed - a group of experts who've "been here before", who understand the priority and the risks, and know how serious it is.  Many times a Tiger Team may give the go / no-go, the final answer that either saves a mission, or drops all that work into the dust-bin.  Such a team itself represents a *lot* of work, frustration, and time - investigations may run for days, weeks, or months, as long as they reach their conclusion on time.  And there's the one facet that can't be changed by any engineering practice:  Time.

Developing space, mission- or life-critical applications is serious business.  The deployment environment may be harsh - and much about it may be unknown.  To make sure missions are successful, the known risks are eliminated as well as possible.  Some of these risks may include high radiation, occlusion by the Sun (Solar Conjunction), extreme differences in temperatures, harsh chemical environment, vibrations, or electrical noise.  Some of this you can plan for with hardware, some of this you can't.  You address the things you know about, that's all you can do.

With software engineering, you're also faced with unknowns.  One of the biggest impacts is the unknown date when you'll finally have the actual hardware the project is going to run on.  Using simulators can help out, a lot, enabling some degree of parallel development when hardware time is rare.  I encourage most of my customers to leverage the simulator wherever possible.  It's a good way to kick-start development, enable more hands on the project, and bring new engineers up to speed.

Once you have your hardware, there may be some hardware-level debug issues.  Having a hardware debugging interface of some sort - an In Circuit Emulator (ICE), for example - can be invaluable.  As long as one has an LED to blink, one can write polled-output routines to display register settings (etc.) to assist in debugging a hardware interaction, but nothing is as good as being able to dump desired address ranges (registers, etc.) at will.  An ICE can give you that ability to see inside the hardware.
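Here is a minimal sketch of the blink-an-LED approach to reading a register, in C.  Everything in it is an assumption made for illustration: the hook types `led_set_fn` and `delay_fn`, the long-flash / short-flash encoding, and the `host_*` stand-ins that record the pattern on a development machine.  On a real board the hooks would write a GPIO register and busy-wait.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical board hooks: passed in so the routine itself stays portable. */
typedef void (*led_set_fn)(int on);
typedef void (*delay_fn)(unsigned units);

/* Blink out the low `nbits` bits of `value`, most significant bit first.
 * A 1 is a long flash (3 units), a 0 is a short flash (1 unit), and every
 * bit is followed by a 1-unit gap.  Returns the total delay units used. */
unsigned led_dump_bits(uint32_t value, unsigned nbits,
                       led_set_fn led_set, delay_fn delay)
{
    unsigned total = 0;
    assert(nbits >= 1 && nbits <= 32);
    for (unsigned i = nbits; i-- > 0; ) {
        unsigned on_time = ((value >> i) & 1u) ? 3u : 1u;
        led_set(1);
        delay(on_time);
        led_set(0);
        delay(1u);
        total += on_time + 1u;
    }
    return total;
}

/* Host-side stand-ins (illustrative only): record '#' for each unit the
 * LED is on and '.' for each unit it is off, instead of driving a pin. */
char host_trace[128];
unsigned host_len;
int host_led_on;

void host_led_set(int on) { host_led_on = on; }

void host_delay(unsigned units)
{
    while (units-- > 0 && host_len + 1 < sizeof host_trace)
        host_trace[host_len++] = host_led_on ? '#' : '.';
    host_trace[host_len] = '\0';
}
```

Calling `led_dump_bits(reg, 8, host_led_set, host_delay)` on a desktop leaves the flash pattern in `host_trace`, which is a cheap way to check the encoding before the same routine is pointed at a real pin.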

What do you do if the hardware is just not going to be available?  You can use similar boards, processors, etc, but.. it's not the *same* thing.  The minute you change the foundation, everything above it is going to have a new set of interactions with the foundation, and some of those may induce problems.  When updating hardware, timing, electrical, or even errata related to updated chips or even PAL equations may cause engineering delays - even if it's just a newer version of the same board.  Switching entire boards - or from engineering / test boards to "flight qualified" boards - can complicate the issue.

So.. what can you do when you don't have the real hardware on-hand, to help reduce risks associated with developing for that hardware?  In the past few years, hardware emulation tools have come a long way.  Hardware emulation is like a simulator - VxSim for example - except there's a layer of software that's designed to act *just like* the real hardware would, right down to devices and registers within those devices, etc.
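To make "right down to the registers" concrete, here is a toy sketch in C of a software model standing in for a device.  The UART, its register offsets, the `STATUS_TX_READY` bit, and the two-poll busy period are all invented for illustration and do not describe any real part or any particular emulation product.  On real hardware the reads and writes would be volatile memory-mapped accesses rather than calls into a model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register map for an imaginary UART. */
enum { REG_STATUS = 0x0, REG_DATA = 0x4 };
#define STATUS_TX_READY 0x01u

/* Software model of the device state. */
struct uart_model {
    uint32_t busy_polls;   /* status reads left before TX_READY is set again */
    char     out[64];      /* bytes the "device" has transmitted */
    size_t   out_len;
};

/* Register read: STATUS reports not-ready while the model is "busy". */
uint32_t model_read(struct uart_model *m, uint32_t offset)
{
    if (offset == REG_STATUS) {
        if (m->busy_polls > 0) {
            m->busy_polls--;
            return 0;
        }
        return STATUS_TX_READY;
    }
    return 0;
}

/* Register write: DATA captures the byte and makes the model busy. */
void model_write(struct uart_model *m, uint32_t offset, uint32_t value)
{
    if (offset == REG_DATA && m->out_len + 1 < sizeof m->out) {
        m->out[m->out_len++] = (char)value;
        m->out[m->out_len] = '\0';
        m->busy_polls = 2;   /* assumed: transmitter busy for two polls */
    }
}

/* Driver-style code: poll STATUS until ready, then write DATA.
 * Returns how many status polls were needed. */
unsigned uart_puts(struct uart_model *m, const char *s)
{
    unsigned polls = 0;
    for (; *s != '\0'; s++) {
        do {
            polls++;
        } while ((model_read(m, REG_STATUS) & STATUS_TX_READY) == 0);
        model_write(m, REG_DATA, (uint8_t)*s);
    }
    return polls;
}
```

The point of the design is that `uart_puts` has to poll and wait just as it would on the board, so timing-dependent driver logic gets exercised before the hardware shows up.  A real emulation tool models far more than this, of course.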

Leveraging known-good (mature) software designs is another way to limit risk.  Re-using application code that is well understood is one example.  Selecting a base of software that's been certified to adhere to accepted standards is another way.  It's always nice to know that software has been scrutinized by eyes other than the manufacturer's, and been found fit.  And there's no substitute for a rigorous test strategy, or following the oft-learned adage: test as you will fly, fly as you did test.

Safe design practices can't eliminate tiger teams completely - there will still be unknowns, still be conditions that aren't discovered until the last moment.  But by combining proper practices, mature systems, and the proper tools, many of the problems that lead to the need for tiger teams may be addressed, and may be discovered far enough in advance to prevent a project from running out of Time.

November 01, 2006

MS taking some pain out of WinCE?

Microsoft has released WinCE 6.0 - this time as a "shared source" release, with various enhancements.  Some of the big claims-to-fame:  It's the "first commercial hard real-time OS released as a shared-source product".

I'm not sure exactly what this means.  Perhaps they're using their source licensing model - "shared sourcing" - as a qualifier?  Since it's the only RTOS made by Microsoft with sources shared through their license, that makes the "shared" part true.  From that perspective, there are licensing options for source code from most proprietary OS companies, you just can't post it all to the web for anyone to download.

As far as "open" goes, MS is capitalizing on the fear of the GPL - no-one has challenged it far enough yet to know if using a GPL OS would make it so you have to release your *applications* under a GPL umbrella. But ignoring this bit - there are a number of realtime or at least nearly-deterministic kernels available via GPL and Open Source communities (a quick online search yields over a thousand pages with the exact phrase "Open Source RTOS").

Determinism: putting the "Hard" in Real-time.

There are hundreds of open-source OS's - some are even deterministic enough to be "hard realtime".  RealTime is basically "fast enough to keep up" - but hard realtime means "measurably deterministic" - not just fast enough to keep up, but measurably and predictably able to keep up with a given load.  WinCE 6 makes the claim that it's hard realtime.  But... is it really a hard-realtime system?  Just claiming you're "RealTime" isn't enough - as this Stanford comparison shows, some systems claiming to be "Real Time" don't quite show determinism, especially when running as a "loaded" system.  Part of determinism is being able to give "deterministic" execution - that is, you can be guaranteed that the OS itself won't take longer than a given window of time to react to the world.  Execution times for critical system activities that vary over an order of magnitude are not a shining example of "determinism" (see page 4!).

I don't know if you've done any research into this part of the topic - but if you do, you'll notice that many comparisons for determinism measure things like context-switch time and interrupt latency (how long it takes the OS to handle an interrupt or change contexts, usually measured under idle and highly-loaded conditions).  I've never seen such a document for WinCE - and most of what I've seen was either from MS themselves or from companies hired by MS to run their tests.  Though their method is scientific, it requires specific hardware and tools, and their graphs don't give you the complete story for context switching or interrupt latency.  Since the method isn't shown against any other RTOS, there's no comparison possible.
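For what such a comparison boils down to, here is a small sketch in C that summarizes latency samples taken under idle and loaded conditions.  The struct, the function names, and the factor-of-ten test are my own illustration (the factor mirrors the "order of magnitude" remark above), not any vendor's or benchmark's method, and the samples themselves would have to come from real measurements on the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Best case, worst case, and spread of a set of latency samples. */
struct latency_summary {
    uint32_t min_us;
    uint32_t max_us;
    uint32_t jitter_us;   /* max_us - min_us */
};

/* Scan the samples once and record the extremes. */
struct latency_summary summarize(const uint32_t *samples_us, size_t n)
{
    assert(samples_us != NULL && n > 0);
    struct latency_summary s = { samples_us[0], samples_us[0], 0 };
    for (size_t i = 1; i < n; i++) {
        if (samples_us[i] < s.min_us) s.min_us = samples_us[i];
        if (samples_us[i] > s.max_us) s.max_us = samples_us[i];
    }
    s.jitter_us = s.max_us - s.min_us;
    return s;
}

/* Returns 1 only if the worst case stays inside the deadline both when
 * the system is idle and when it is loaded. */
int meets_deadline(const struct latency_summary *idle,
                   const struct latency_summary *loaded,
                   uint32_t deadline_us)
{
    return idle->max_us <= deadline_us && loaded->max_us <= deadline_us;
}

/* Returns 1 if the worst case is at least ten times the best case. */
int varies_by_order_of_magnitude(const struct latency_summary *s)
{
    return s->max_us > s->min_us &&
           (uint64_t)s->max_us >= 10u * (uint64_t)s->min_us;
}
```

The averages are deliberately left out: for hard realtime it's the worst case under load that matters, so the sketch only looks at the extremes.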

WinCE 6 makes extensive use of virtual memory spaces, including supporting up to 32,000 processes each with a 2GB virtual address space.  This requires at least changing memory maps, and possibly "swapping" chunks of text and data between RAM and some fixed storage.  The act of swapping - pushing data to/from physical media - induces indeterminism.

Other than this - there are some interesting developments.

It's being bundled with Visual Studio with an integrated Platform Builder plug-in.  Microsoft says "Under one roof, you have the entire development chain from device to application".  Eventually it might have some sort of data visualization tools, perhaps some kind of system event display, maybe even.. hardware-assisted debug... could it be that Microsoft may be joining the DSO (Device Software Optimization) trend?

Mike Deliman

  • As an Engineering Specialist, it is Mike Deliman's responsibility to enable customers to achieve success in their endeavors, assist sales groups in evangelizing Wind River's technologies, and bring feedback of customer needs and experiences back into Marketing and Engineering. Mike has over 15 years of experience with VxWorks.
    "Mike's forgotten more about VxWorks than most people will ever know." -J Carlstrom