By Jakob Engblom
Let's start with a story from the 1904 Olympics in St. Louis. In the marathon race, a runner crossed the finish line far ahead of the competition with an incredibly good time. It quickly became clear that he had cheated: he had been riding in a car for about half the race. This was obviously wrong, and he was quickly stripped of his "victory." Running a marathon is defined as covering the distance on foot. Using a car is not an allowed option.
However, if the goal of the marathon race had instead been defined as "get from A to B in the shortest time" with no particular care for the mode of transportation, our cheater would have been hailed as the smart guy. The other runners who insisted on covering the distance on foot would have been considered dumb and behind the times. Thus, we can see that what is considered "cheating" really depends on the rules of the game and what is considered the essential goal of the exercise.
In the world of virtual platforms, such considerations about the rules of the game are common. By looking at what we are trying to achieve from a new angle (redefining or bending the rules of the game, or even making a complete paradigm shift), we can often get users to where they want to be faster and with less friction. In our engineering tradition, we call this "cheating." In marketing, we call it "working smarter."
A recent example serves as a good illustration of how working from the real problem backwards can make a huge difference. Some users had perceived a problem with their virtual platforms: writing files to a flash-based file system was taking a painfully long time to complete (in terms of real-world time). How can this be optimized? It all depends on what we think is the core problem. With the right view of the problem, we might be able to take some shortcuts and cut down the time of the operation.
If the goal is to test the correctness of flash file system operations, there is not much to do. The code has to run every step, and any kind of shortcut is plain bad, affecting the final results and not testing all the code that should be tested. However, if the flash file system is considered as a stable base, and the problem is defined as getting the contents of the flash updated as quickly as possible, there are things that can be done.
In some cases, the contents of the flash can be prepared offline, completely bypassing the programming on the target. This requires a host-based tool for creating flash file system images, and having the virtual platform load the flash memory contents using a back door (essentially dumping a stream of bytes from the flash image file into the simulated flash memory).
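As a rough illustration of the back-door idea, here is a minimal Python sketch. It is a toy model, not Simics code: the class SimulatedFlash, its backdoor_load method, and the stand-in image bytes are all hypothetical names made up for this example. In a real setup the image would come from a host-based file system image tool.

```python
class SimulatedFlash:
    """Toy flash model (hypothetical): a byte array that starts erased (0xFF)."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.mem = bytearray(b"\xff" * size)

    def backdoor_load(self, image: bytes, offset: int = 0) -> None:
        """Copy an image straight into the backing store.

        No driver code runs and no per-word delay is modeled: the bytes
        simply appear in the simulated memory.
        """
        if offset < 0 or offset + len(image) > self.size:
            raise ValueError("image does not fit in flash")
        self.mem[offset:offset + len(image)] = image


# Stand-in for an image built offline on the host (assumed content).
image = b"FSIMG" + bytes(range(16))

flash = SimulatedFlash(size=64 * 1024)
flash.backdoor_load(image, offset=0x1000)
```

The point of the sketch is that the load is a single bulk copy on the host side, so its cost does not depend on how slowly the target software would have programmed the same bytes.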
When flash programming has to happen on the target, it turns out that we can still optimize the process (i.e., cheat). The flash driver being used in this case has a very raw interface to the flash memory. Each word written is followed by a delay in order not to violate the write access times of the flash. Running this busy-wait loop turns out to be what is consuming almost all the simulation time. The delay is on the order of 10 microseconds, which is the kind of time frame where a busy-wait loop is the only reasonable implementation. You cannot set a timer to 100 kHz and fill the flash from an interrupt-driven driver. The ideal flash would have a built-in DMA engine to program itself, but that is asking the hardware to do things it does not do.
On physical hardware, this delay is absolutely necessary for correct operation. On a Simics virtual platform, it is not, as the flash model in a functional simulator does not require wait time for a write to go through. We can just write each word immediately following the end of the previous word, which saves some 90 to 99% of the execution time of the flash write operation.
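To see where a figure like 90 to 99% can come from, here is a back-of-the-envelope model in Python. The 10-microsecond delay is taken from the text; the per-word write costs of 1.0 and 0.1 microseconds are assumptions chosen for illustration, not measurements, and the function name fraction_saved is made up for this sketch.

```python
def fraction_saved(delay_us: float, write_us: float) -> float:
    """Share of the per-word time spent in the delay.

    This is the fraction of the flash write operation that disappears
    if the delay is skipped and only the write itself remains.
    """
    if delay_us < 0 or write_us <= 0:
        raise ValueError("delay must be >= 0 and write cost must be > 0")
    return delay_us / (delay_us + write_us)


DELAY_US = 10.0  # from the text: delay on the order of 10 microseconds

# Assumed per-word write costs (illustrative only).
for write_us in (1.0, 0.1):
    saved = fraction_saved(DELAY_US, write_us)
    print(f"write cost {write_us} us per word: {saved:.1%} saved")
```

Under these assumed costs, the model gives roughly 91% and 99% respectively. The actual saving depends on how much work the driver does per word besides waiting.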
To implement this shortcut, we have several options. The simplest is to locate the place in the flash driver that calls the delay function, and just replace it with a NOP. This is fairly easy to do with some scriptable symbolic debugging. Another alternative is to make the delay function itself return immediately, although that might also affect other code that calls the same function. You could also imagine compiling a BSP that is slightly modified to run better on Simics. That is sometimes an acceptable strategy (it depends on the build configurations, how important unmodified code is, the structure of the company, and many other factors).
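The first option can be mimicked in a small Python toy model. This is not how the patch is applied on a real target (there it would be done on the binary through the debugger); every name here (VirtualClock, busy_wait, nop, program_flash) and the 0.5-microsecond write cost are hypothetical, chosen only to show that skipping the delay at one call site leaves the flash contents unchanged while shrinking the time spent.

```python
class VirtualClock:
    """Counts simulated microseconds (toy stand-in for target time)."""

    def __init__(self) -> None:
        self.now_us = 0.0

    def advance(self, us: float) -> None:
        self.now_us += us


def busy_wait(clock: VirtualClock, us: float) -> None:
    """Stand-in for the driver's delay function: burns simulated time."""
    clock.advance(us)


def nop(clock: VirtualClock, us: float) -> None:
    """Replacement for the delay call: does nothing at all."""


def program_flash(flash, words, clock, delay=busy_wait,
                  delay_us=10.0, write_us=0.5):
    """Write each word, then call the delay hook, as the raw driver does."""
    for addr, word in enumerate(words):
        flash[addr] = word
        clock.advance(write_us)   # assumed cost of the write itself
        delay(clock, delay_us)    # the call site we may patch out


words = [0xDEAD, 0xBEEF, 0x1234, 0x5678]

# Unmodified driver: every word is followed by the delay.
stock_flash, stock_clock = [0xFFFF] * 8, VirtualClock()
program_flash(stock_flash, words, stock_clock)

# "Patched" driver: only this call site gets the NOP.
fast_flash, fast_clock = [0xFFFF] * 8, VirtualClock()
program_flash(fast_flash, words, fast_clock, delay=nop)
```

Passing the replacement in at the call site, rather than redefining busy_wait globally, mirrors the trade-off in the text: other users of the delay function keep their original behavior.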
Such shortcuts are going to change the virtual time the program takes to run, but in this case, that does not matter. Remember that we assume that the file system and flash driver are correct, and we just want to get the contents changed as quickly as possible. If it turns out to be a problem in some cases, we can simply stop doing the optimization. It is an optional extra, like all good optimizations.
This is just one example of how working with a virtual platform can be subtly different from working with physical hardware, and how taking advantage of the properties of a virtual platform can make your development work proceed with less friction.