Working Faster with Less (Simulated) Sweat

One very important property of a virtual platform like Simics is its speed of execution. Without sufficient execution speed, a virtual platform is not very useful: users want results in a reasonable time. Raw simulation speed (simulating as many target instructions as possible each second) is important, and Simics is certainly pretty good at that. However, processing instructions faster is not necessarily the best way to get a job done faster. Sometimes, working smarter rather than harder is possible.

Typically, working smarter means doing less to achieve the same goal. If the simulator needs to perform less work, the task will complete in less time.

One way of avoiding work is to do something only once, and reuse the result many times. In Simics, this is supported by checkpointing. For example, we can save the state of a booted target to avoid having to redo the boot each time the target is put to use. A previous blog post discusses checkpointing in more detail.

It is worth pointing out that a checkpoint can be prepared by one person and used by many. This can be used to implement a nightly boot workflow, where a platform team configures and boots a standard setup for all developers to use. In this way, a complex target is booted only once for an entire project, rather than once per individual developer.
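To make the saving concrete, here is a minimal sketch of the boot-once, reuse-many idea. It is a toy model written in plain Python, not the Simics checkpoint API: the `ToyTarget` class, its methods, and the cost of one million host steps per boot are all made up for illustration.

```python
import copy

BOOT_STEPS = 1_000_000  # assumed cost of one boot, in arbitrary host work units


class ToyTarget:
    """Hypothetical stand-in for a simulated machine (not the Simics API)."""

    def __init__(self):
        self.state = {"booted": False, "uptime_steps": 0}
        self.host_steps = 0  # host work spent simulating this instance

    def boot(self, boot_steps=BOOT_STEPS):
        # Pretend every boot step costs one unit of host work.
        self.host_steps += boot_steps
        self.state["uptime_steps"] += boot_steps
        self.state["booted"] = True

    def save_checkpoint(self):
        # A checkpoint is just a snapshot of the target state.
        return copy.deepcopy(self.state)

    @classmethod
    def from_checkpoint(cls, checkpoint):
        # Restoring costs no simulated boot work in this toy model.
        target = cls()
        target.state = copy.deepcopy(checkpoint)
        return target


# The platform team boots once and saves the result.
golden = ToyTarget()
golden.boot()
checkpoint = golden.save_checkpoint()

# Five developers each start from the shared checkpoint.
developers = [ToyTarget.from_checkpoint(checkpoint) for _ in range(5)]

total_with_checkpoint = golden.host_steps + sum(d.host_steps for d in developers)
total_without_checkpoint = len(developers) * BOOT_STEPS
print(total_with_checkpoint, total_without_checkpoint)
```

In this model the boot cost is paid once regardless of how many developers start from the checkpoint, which is the whole point of the nightly boot workflow.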

We can also use simulator backdoors to speed up certain tasks. A typical example is loading an OS kernel directly into simulated memory, rather than having a boot ROM running on the target download the kernel over a network from the development host. Another example is to replace flash programming on the target with direct updates to the flash content (which works when we have an image for the entire flash file system). In both cases, procedures that might take many minutes (and many billions of target instructions) collapse into an instant.
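The difference between the two loading paths can be sketched with a toy memory model. This is only an illustration under simplifying assumptions: the function names are hypothetical, the "boot ROM" is modeled as one target instruction per byte copied, and the backdoor is modeled as a direct write to the memory array that costs no target instructions at all.

```python
def load_via_target(memory, image, base):
    """Model target software copying the image, one instruction per byte."""
    instructions = 0
    for offset, byte in enumerate(image):
        memory[base + offset] = byte
        instructions += 1
    return instructions


def load_via_backdoor(memory, image, base):
    """Model the simulator writing the image straight into target memory."""
    memory[base:base + len(image)] = image
    return 0  # no target instructions are executed


image = bytes(range(256)) * 16  # a made-up 4096-byte "kernel"
mem_a = bytearray(8192)
mem_b = bytearray(8192)

slow_instructions = load_via_target(mem_a, image, 0x1000)
fast_instructions = load_via_backdoor(mem_b, image, 0x1000)

# Both paths leave memory in the same state; only the cost differs.
print(mem_a == mem_b, slow_instructions, fast_instructions)
```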

Even when you are running software on the target to update disk and flash contents (for example, adding software on the fly to a booted target), such operations can be optimized by having the simulator reduce the latencies required. The target software will spend fewer instructions and less simulation time in wait loops. I blogged about one such example related to flash programming, where the execution time could be improved by a factor of 100 by not executing the wait loops needed on physical hardware.
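As a rough illustration of why shorter device latencies help, consider a toy flash driver that issues a write and then polls a status register until the device reports ready. The function and the poll counts below are assumptions picked for the example, not measurements from any real flash device or from Simics.

```python
def program_flash(num_words, busy_polls_per_word):
    """Count target instructions for a toy flash-programming loop.

    Each word costs one write command, one poll for every time the
    device still reports busy, and one final poll that sees it ready.
    """
    instructions = 0
    for _ in range(num_words):
        instructions += 1  # issue the write command
        remaining_busy = busy_polls_per_word
        while remaining_busy > 0:
            instructions += 1  # poll status: device still busy
            remaining_busy -= 1
        instructions += 1  # final poll: device ready
    return instructions


# Hardware-like latency: the device stays busy for 48 polls per word.
hardware_like = program_flash(num_words=1000, busy_polls_per_word=48)

# Simulator-reduced latency: the device is ready on the first poll.
reduced_latency = program_flash(num_words=1000, busy_polls_per_word=0)

print(hardware_like, reduced_latency, hardware_like / reduced_latency)
```

The ratio in this sketch depends entirely on the poll count chosen. The point is only that the target software is unchanged, and the instructions saved are exactly the ones spent waiting.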

Another optimization is to use automation. By using Simics scripting to automate the interactive steps needed to load software, set up target state, and run target programs, significant time can be saved. Simulator scripts react immediately to the completion of tasks and type as fast as the target can accept input. We get an "ideal" user that does not pause to think and types incredibly quickly.
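The wait-for-prompt, then-type pattern behind such scripts can be sketched as follows. This is plain Python with a made-up `ToyConsole` class standing in for a target serial console; it does not use Simics scripting commands, and the prompts and the `./run_tests` command are hypothetical.

```python
class ToyConsole:
    """Hypothetical target console that echoes input and prints a prompt."""

    def __init__(self):
        self.output = "login: "
        self.log = []  # everything the script has typed

    def send(self, text):
        self.log.append(text)
        if text == "root":
            self.output += "root\n# "
        elif text == "./run_tests":
            self.output += "./run_tests\nall tests passed\n# "
        else:
            self.output += text + "\nunknown command\n# "


def run_script(console, steps):
    """Type each command as soon as its expected prompt is on the console.

    steps is a list of (expected_prompt, text_to_send) pairs.
    """
    for prompt, text in steps:
        if not console.output.endswith(prompt):
            raise RuntimeError(f"expected prompt {prompt!r} not seen")
        console.send(text)  # no thinking time, no typing delay
    return console.output


console = ToyConsole()
transcript = run_script(console, [("login: ", "root"), ("# ", "./run_tests")])
print(transcript)
```

A real script would block until the prompt appears rather than fail immediately, but the structure is the same: each step is triggered by target output, never by a human noticing it.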

The video below shows a simple example of automation in action. Note that the target console is never at the front and never the target of manual typing.

Automated tasks can also be running on their own, while the user is doing something else. From the user's perspective, this compresses the task into taking no time at all (or at least no interactive time). For really large tasks, it makes sense to run them overnight, or to execute them on a server or other computation resource, freeing the user's personal machine for other work.

Going more into the details of target execution, yet another common optimization possible in a simulator like Simics is to detect and skip idle work. Rather than have the simulator actively execute an OS idle loop or a polled wait, we detect the repeated operation and skip ahead in time to the point where the target state might have changed in such a way that the loop exits. This is known as hypersimulation.

Hypersimulation is typically easy to achieve for OS idle time, since that is implemented using wait/halt instructions or other operations that put the processor into some lower-power mode. Another easy example is a loop-to-self instruction (which results from compiling C code like while(1);). For other code, it is sometimes necessary to create a custom hypersimulation pattern that knows the code to skip and its exit condition. It is worth noting that hypersimulation is a pure optimization that does not affect target software behavior or the virtual time needed to complete the operation.
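The core idea can be shown with a very small model: a processor idles until a timer event fires, and the simulator either steps through every idle cycle or jumps straight to the event. This is a simplified sketch of the principle, not how Simics implements it; the function name and the cycle counts are assumptions.

```python
def run_until_event(event_time, hypersim):
    """Simulate an idle CPU until a timer event at event_time (in cycles).

    Returns (virtual_time, host_steps): the virtual time at which the
    event is reached and the number of simulator steps it took.
    """
    virtual_time = 0
    host_steps = 0
    while virtual_time < event_time:
        if hypersim:
            # Idle detected: nothing can change before the next event,
            # so advance virtual time directly to it.
            virtual_time = event_time
        else:
            # Execute one iteration of the idle loop.
            virtual_time += 1
        host_steps += 1
    return virtual_time, host_steps


stepped = run_until_event(1_000_000, hypersim=False)
skipped = run_until_event(1_000_000, hypersim=True)
print(stepped, skipped)
```

Both runs arrive at the same virtual time, so the target software cannot tell the difference; only the host work needed to get there changes.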

Thus, we can see that there are many ways in which virtual platforms can be used intelligently to speed up work. It is not just about making the virtual platform itself faster; it is also about optimizing workflows.