Essentially, all models are wrong, but some are useful*

D.Aarno


I had to learn this the hard way. About a decade ago I was sitting among the five computers occupying most of the space in my small student dorm room. I was looking for a project for my master’s thesis and was pondering a proposal from a company called Virtutech to create a model of a Power Mac G3. I remember thinking about how I would simulate the bits and interconnects of the processor at the logic level, something akin to what synthesizable SystemC would look like. It was not until later that I realized how completely wrong this approach is for building a model that can be used for system-level software development. More alarming is how many people still take the same wrong approach I started with.

In the end I did not pursue the project, but went on to do my thesis and subsequent doctoral studies at the Centre for Autonomous Systems (CAS), focusing on human-machine collaboration. At CAS our motto was “those who can’t, simulate; those who can, do” – taking a stab at all the robotics research that never leaves MATLAB. The lesson I learned here was that debugging the software and the hardware at the same time is extremely difficult. We are so used to having stable hardware in the form of PCs that we always expect the hardware to work. On a robot in a lab this is not always the case. Often, hours or days of software debugging ended with the problem being fixed by replacing a faulty sensor or repairing a bad solder joint.

My experiences at CAS convinced me that you really want to do as much of the development and testing as possible in a controlled environment, such as a simulation, and only do final testing and tuning on the real system. That last point is an important one: the simulator is not a replacement for the real thing; it’s a powerful tool that helps you build the real thing more efficiently.

As part of my research at CAS I was on a research exchange program in Copenhagen, and one dreary morning in April, with wet snow outside the window, I was approached by Virtutech about a developer position that I later accepted. Once I joined Virtutech, it became clear to me how naïve and completely useless my previous ponderings on how to simulate a computer system for the sake of running actual software were. I was amazed to learn how the team attacked the problem and just how much you can “cheat” without the software running on the system being aware of it. With the upcoming Simics book I hope to share my experiences in building models for the sake of system software development in an easily accessible format.

The first lesson I learned was that you should not try to model the hardware itself; rather, it is the hardware/software interface that you need to model. This means that abstractions such as registers and interrupts are much more important than buses and bits. This is captured in the maxim that you should model what the hardware does, not how it does it. After all, from the software’s perspective, it is the architecture that matters, not a particular implementation of that architecture. In the upcoming book, Software and System Development using Virtual Platforms, which I have written with Jakob Engblom, we will discuss how to determine what you need to model and how to model it at the right abstraction level.
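To make the maxim concrete, here is a minimal sketch of what a register-level model might look like. It is not Simics code and does not use any Simics API; the timer device, its register layout, and every name in it are hypothetical and invented for this illustration. The point is only that the model exposes registers and an interrupt, and is free to skip everything in between.

```python
# Hypothetical register-level model of a countdown timer.
# Nothing here is a Simics API; all names are invented for illustration.

CTRL = 0x0    # control register offset: bit 0 enables the timer
COUNT = 0x4   # count register offset: cycles remaining until expiry


class TimerModel:
    def __init__(self, raise_interrupt):
        self.regs = {CTRL: 0, COUNT: 0}
        self.raise_interrupt = raise_interrupt  # callback into the platform

    def write(self, offset, value):
        """Software writes a 32-bit register; this is what the driver sees."""
        self.regs[offset] = value & 0xFFFFFFFF

    def read(self, offset):
        return self.regs[offset]

    def advance(self, cycles):
        """Jump ahead in one step instead of ticking cycle by cycle."""
        if not self.regs[CTRL] & 1:
            return  # timer disabled, nothing to do
        if cycles >= self.regs[COUNT]:
            self.regs[COUNT] = 0
            self.regs[CTRL] &= ~1 & 0xFFFFFFFF  # timer disables itself on expiry
            self.raise_interrupt()
        else:
            self.regs[COUNT] -= cycles


fired = []
timer = TimerModel(raise_interrupt=lambda: fired.append("timer irq"))
timer.write(COUNT, 1000)
timer.write(CTRL, 1)
timer.advance(400)   # no interrupt yet, 600 cycles remain
timer.advance(600)   # timer expires and the interrupt is raised
print(timer.read(COUNT), fired)
```

There are no clock edges, bus signals, or flip-flops anywhere in the sketch. The driver under test can only observe register values and the interrupt, so that is all the model needs to get right.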

To better understand the importance of choosing the right abstraction level, consider the analogy with physics. In modern physics there are four major “abstraction levels,” or models, used: classical (Newtonian) mechanics, relativistic mechanics, quantum mechanics, and quantum field theory.

Classical mechanics is very good at dealing with everyday conditions; that is, objects typically found on Earth moving at speeds far below the speed of light. However, once objects start to move at speeds close to the speed of light, or start to approach the size of atoms, other models are needed. Trying to calculate the motion of planets using quantum field theory becomes intractable, since the chosen model is too detailed for the purpose. Similarly, trying to calculate the motion of subatomic particles using relativistic or even classical mechanics is not possible because the model is too abstract.

The analogy carries over to virtual platform models. If the abstraction level is too low, there is a risk that the model will become a performance bottleneck and bring the entire simulation down to an unacceptable speed, not to mention the increased time it takes to create the model. On the other hand, if the abstraction level is too high, the model may lack the detail needed to perform all the required tasks.

A model created at the right abstraction level can be so much more than the real thing. Such models help software developers increase their productivity since they provide unparalleled inspection, configuration, and injection capabilities, as well as debug tools superior to traditional hardware-based solutions. For example, a simple thing such as providing good log messages in a device model can help a device driver developer tremendously, as the developer will now have access to the device’s view of what is going on.

One interesting case that convinced me of this came when we booted a customer’s operating system on a Simics model. Simics emitted a warning message that a 64 MB page was being force-aligned to the next lower 64 MB boundary. In effect, an MMU mapping that tried to put a 64 MB page at 0xefe00000 was in practice putting it at 0xec000000. The specification for the architecture makes it clear that if an MMU page is not aligned on its own size, the mapping will be implicitly force-aligned, and the processor will keep executing. Simics was doing the same thing as the real processor, and the code kept running, but in addition Simics emitted the warning (since this is likely an error).

Indeed, it was an error on the part of the programmers. The intention had been to map just 1 MB at 0xefe00000, but the effect was to also map 62 MB below that address, reserving it for the operating system. This bug had been latent for a few years, until a user finally managed to create some tasks that made the OS attempt to use the erroneously mapped area for user data. The access failed, resulting in a crashed task and a bug report to the OS vendor. This shows the value of providing good log messages in the model. Even if the code appears to work on hardware, it can have latent issues that the model spots even if the hardware does not complain.
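The arithmetic behind this story is easy to check. The sketch below is not the Simics implementation, and the warning text is my own wording rather than the actual Simics log output; the function name `map_page` and the logger name are hypothetical. It assumes the page size is a power of two, force-aligns the address the way the architecture specification describes, and logs a warning when it has to do so.

```python
import logging

logging.basicConfig(level=logging.WARNING, format="%(levelname)s: %(message)s")
log = logging.getLogger("mmu_model")  # hypothetical logger name

MB = 1 << 20


def map_page(vaddr, size):
    """Return the (base, end) range actually mapped for a page of `size` bytes.

    Assumes `size` is a power of two. A misaligned address is force-aligned
    down to the next lower multiple of `size`, and a warning is logged.
    """
    base = vaddr & ~(size - 1)  # clear the low bits to align on the page size
    if base != vaddr:
        log.warning("%d MB page at 0x%08x force-aligned to 0x%08x",
                    size // MB, vaddr, base)
    return base, base + size


base, end = map_page(0xEFE00000, 64 * MB)
extra_below = (0xEFE00000 - base) // MB  # memory mapped below the intended address
print(hex(base), hex(end), extra_below)  # prints: 0xec000000 0xf0000000 62
```

The hardware performs the same masking silently. The only difference in the model is the single `log.warning` call, and that one line is what exposed the bug.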

Software and System Development using Virtual Platforms will discuss, in depth, these and many other issues commonly encountered in the world of virtual platform development. The book will also address some common mistakes and misunderstandings that are surprisingly persistent. To provide a good understanding of how device models are created, one of the chapters in the book will provide a concrete modeling example where a model of a DMA controller is built step by step from a specification, taking into account several of the tradeoffs and design decisions that have to be made. The finished model is then integrated into a virtual platform and tested with a custom Linux device driver to bring everything together.

I have now seen the gains that can be achieved by doing software and system development on virtual platforms so many times that I’m convinced this is the best way to do development, both pre- and post-silicon. Remember, a virtual platform is not the real thing; it’s the thing that helps you build the thing that will eventually make you successful. Therefore it is always necessary to do the final testing and tuning on the real system. With this in mind I’m now ready to update my old motto to: those who can’t, fail; those who can, do, but will first simulate.

Daniel Aarno is an Engineering Manager at Intel where he leads a team working on the Simics full system simulation product. Daniel holds a master’s degree in electrical engineering and a licentiate’s degree in computer science.
The views and opinions expressed here are the author’s and not necessarily those of Intel Corporation, Wind River Systems Inc, or affiliates.

* George E. P. Box, Norman R. Draper. “Empirical Model-Building and Response Surfaces”, 1987. ISBN: 0471810339