The US National Highway Traffic Safety Administration (NHTSA) recently released an in-depth report on last year's issue with "unintended acceleration" in certain Toyota cars. NHTSA brought in a team from NASA, which analyzed the throttle control software using a wide range of cutting-edge tools. Reading their report gives a good idea of how embedded control software is developed, and of the challenges inherent in validating it.
The conclusion is that the software is not a likely cause of the problems. The real meat of the report (and its appendices) is how this conclusion is reached, and what it says about the tools that could be applied to develop embedded software with greater confidence in its correctness.
The NASA team used several types of static analysis: code-semantic analysis looking for coding errors, as well as static timing and stack-depth analysis. They also applied model checking, and built models of the environment that the control code interacts with. What they did not do was simulate the actual code in use, since no simulation environment was available that could run the complete software setup.
The source code is complex and has many dependencies that make full-scale simulation outside its native hardware environment difficult. Even within the ... compilation and debug environment, the practice ... has been to only use a software simulation environment for “one-shot” unit testing, i.e., one input vector yielding one output vector. Any further testing beyond the unit level was done ... on hardware platforms with integrated software loads.
What this appears to be saying is that the developers of the code created unit-testing scaffolds for the control system and ran these on some small simulation setup. The simulated testing of the actual compiled code did not extend to the complete system, and most likely did not include running the actual target OS, handling interrupts, or dealing with other asynchronous events. This problem is elegantly solved with a virtual platform like Simics, where you replicate the hardware platform inside a host machine. You can then run the real code just as you would on the physical machine. No other solution really gives you that ability: running the whole software stack with all its bits and pieces integrated.
Using virtual platforms for such functional testing has several advantages, in particular for reproducing complex issues. Considering the effort spent on code validation, unit testing, and static analysis, I would expect the code to be quite solid at the unit level. However, when the code is integrated into its environment, with an OS, hardware, and interrupt handlers, a whole new set of issues appears.
Virtual platforms are great for integration testing. They are much more available than hardware. When rare issues appear, they are trivial to capture, reproduce, and communicate. With multiple communicating processor cores, you really want the debug power of a virtual platform. You also want to test what happens when the hardware starts to act up, with faults such as bad sensor readings or interrupts arriving faster than expected.
In the end, all software has to be tested on the real hardware. There is no getting around that: you "fly what you test and you test what you fly". Using a virtual platform alongside environment simulations and static analysis tools will, however, make that testing much less painful. More bugs will have been removed before committing to hardware, and the overall code quality will be better. More than likely, the software development will also have taken less time and been less stressful for the developers.