Travelling into 64-bit Land with Simics

The new Freescale QorIQ P5020 SoC that was announced this week at the Freescale Technology Forum means that yet another chip family has now moved from 32 bits to 64 bits. This is a familiar scenario that has played out many times before, starting in the mid-1990s as Sun, IBM and MIPS upgraded their server processor architectures to 64 bits. Before that, we had Intel extending the x86 family from 16 bits to 32 bits and IBM extending the System/360 architecture from 24 bits to 31 bits. Each time, the change caused software pain for a while as the software was updated to work in the new world with more bits to use. It is also something that Simics has helped with in the past.

The move from 32 bits to 64 bits is really disruptive for software. All processor registers double in size, and code which had assumed a certain register size will break in interesting ways. Just like with endianness, 64-bitness is insidious. I have seen code that runs well on 64-bit SPARC and 32-bit x86 break on 64-bit x86, from a single small mistake in variable types.

In the case of C and C++ code, the compiler will catch many of the errors. Doing pointer arithmetic using "int" is a typical example that will generate compiler errors and warnings (since int stays 32 bits as pointers become 64 bits). However, the compiler will not catch everything, and that's where testing and running code on the actual 64-bit target becomes crucial to getting through the transition. If you have code that is interpreting data stored in memory, you can easily trip up as "long" becomes 64 bits rather than 32 bits (as it does under the LP64 data model used by most 64-bit Unix-like systems). In C, using type casts, you can easily mix up 32-bit and 64-bit data, leading to very strange results from computations despite the compiler approving of the code.
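To make the "long" and "int" pitfalls concrete, here is a minimal sketch in C. The function names and the record layout (a 32-bit length field followed by another 32-bit field) are hypothetical and chosen purely for illustration; they do not come from any particular code base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fragile: "long" is 4 bytes under ILP32 but 8 bytes under LP64, so on a
   typical 64-bit Unix-like target this swallows the neighbouring field too.
   The caller must supply at least sizeof(long) readable bytes. */
long read_length_fragile(const unsigned char *buf)
{
    long v;
    assert(buf != NULL);
    memcpy(&v, buf, sizeof v);
    return v;
}

/* Robust: uint32_t is exactly 32 bits on every target that provides it. */
uint32_t read_length_fixed(const unsigned char *buf)
{
    uint32_t v;
    assert(buf != NULL);
    memcpy(&v, buf, sizeof v);
    return v;
}

/* Returns 1 if an address survives being squeezed through "unsigned int",
   0 if upper bits were lost on the way. */
int survives_int_round_trip(uintptr_t addr)
{
    unsigned int narrowed = (unsigned int)addr;
    return (uintptr_t)narrowed == addr;
}
```

On a 32-bit build both readers agree and every address survives the round trip, which is exactly why this kind of code passes its tests for years and then fails once it is run on a 64-bit target.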

Furthermore, most relevant code should be updated to take advantage of the fact that it can now operate on 64-bit quantities rather than 32-bit quantities in a single operation. The source and destination addresses from an IP packet header can both fit into a single register, for example. Using 64 bits allows much larger counters without overflow (even though one should not assume that 64 bits is enough for everything; we recently had to extend some time counters in Simics to 128 bits to accommodate really long simulation runs). The most obvious advantage to many users is that they can now address more than four gigabytes of memory. Four gigabytes is really becoming a limitation in many applications even in the embedded world, as workloads and databases become bigger and memory cheaper. Updated code obviously needs to be tested hard, so that no code starts interpreting a set bit 31 as a negative number, for example.
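As a rough illustration of both points, the sketch below packs two 32-bit addresses into one 64-bit value and shows how a set bit 31 turns into a run of ones when a value is widened through a signed type. It assumes IPv4 (32-bit) addresses, and all function names are made up for this example. The conversion to int32_t of a value above INT32_MAX is implementation-defined before C23; the comments assume the usual two's-complement wraparound that mainstream compilers implement.

```c
#include <assert.h>
#include <stdint.h>

/* Pack source (upper half) and destination (lower half) IPv4 addresses. */
uint64_t pack_addr_pair(uint32_t src, uint32_t dst)
{
    return ((uint64_t)src << 32) | dst;
}

/* One 64-bit comparison instead of two 32-bit ones. */
int same_addr_pair(uint64_t a, uint64_t b)
{
    return a == b;
}

/* Bug pattern: going through a signed 32-bit type sign-extends bit 31. */
uint64_t widen_signed(uint32_t v)
{
    return (uint64_t)(int64_t)(int32_t)v;
}

/* Correct: unsigned widening zero-extends. */
uint64_t widen_unsigned(uint32_t v)
{
    return (uint64_t)v;
}

/* Small self-check that can be called from main(). */
void check_widening_examples(void)
{
    /* 192.168.0.1 -> 10.0.0.1 */
    uint64_t pair = pack_addr_pair(0xC0A80001u, 0x0A000001u);
    assert(pair == UINT64_C(0xC0A800010A000001));
    assert(same_addr_pair(pair, pack_addr_pair(0xC0A80001u, 0x0A000001u)));
    assert(widen_unsigned(0x80000000u) == UINT64_C(0x0000000080000000));
    assert(widen_signed(0x80000000u) == UINT64_C(0xFFFFFFFF80000000));
}
```

The two widening functions agree for every input below 0x80000000, so a test suite that never sets bit 31 will not tell them apart.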

All of this points to the fact that a crucial part of moving to a 64-bit platform is to actually have the platform available to run code on, and that's where Simics comes in. Since its start as a research project in the mid-1990s, Simics has supported simulating arbitrary word length targets on arbitrary hosts. In particular, Simics started off simulating 64-bit UltraSPARCs on 32-bit x86 PCs. Since then, we have added support for 64-bit x86-64, 64-bit POWER architecture (with the IBM PPC970MP), and 64-bit MIPS. With this tradition, creating a 64-bit e500 core was not particularly hard.

With Simics, the P5020 has been available for software development for quite a while now, making it possible to port operating systems, middleware, and user applications to the platform before silicon availability.

Simics does allow some interesting additional tricks beyond just running code. In one Simics classic, Simics was configured to simulate a physical RAM memory filling the entire possible physical address space of a machine (an amount of RAM that used to cost something like the US annual defense budget, although it is many orders of magnitude cheaper today). This tested that the operating system actually handled maxed-out memory correctly, and that it did not accidentally treat the memory size as a signed value somewhere. Since Simics simulates memory in a lazy fashion, we never actually had to represent the contents of 2^64 bytes of memory, which is still a pretty expensive proposition even when using magnetic drives.
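The general idea of lazily simulated memory can be sketched in a few lines of C. This is not how Simics implements it; it is a toy illustration under stated assumptions (4 KiB pages, a small fixed-size hash table with linear probing, untouched memory reads as zero), and every name in it is hypothetical. Host memory is only allocated for pages that have actually been written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_BITS 12u
#define PAGE_SIZE (1u << PAGE_BITS)
#define SLOTS 1024u /* capacity of this toy table; must be a power of two */

typedef struct { uint64_t page_no; uint8_t *data; } slot_t;
typedef struct { slot_t slots[SLOTS]; } lazy_mem_t; /* must start zeroed */

/* Find the page holding addr; with create != 0, allocate it on demand. */
static uint8_t *lazy_lookup(lazy_mem_t *m, uint64_t addr, int create)
{
    uint64_t page_no = addr >> PAGE_BITS;
    for (uint32_t i = 0; i < SLOTS; i++) {
        slot_t *s = &m->slots[(page_no + i) & (SLOTS - 1)];
        if (s->data == NULL) {          /* empty slot: page never written */
            if (!create)
                return NULL;
            s->data = calloc(PAGE_SIZE, 1);
            s->page_no = page_no;
            return s->data;             /* NULL if the host is out of memory */
        }
        if (s->page_no == page_no)
            return s->data;
    }
    return NULL;                        /* toy table is full */
}

/* Reads of never-written memory return 0 without allocating anything. */
uint8_t lazy_read(lazy_mem_t *m, uint64_t addr)
{
    uint8_t *page = lazy_lookup(m, addr, 0);
    return page ? page[addr & (PAGE_SIZE - 1)] : 0;
}

/* Returns 0 on success, -1 if no page could be allocated. */
int lazy_write(lazy_mem_t *m, uint64_t addr, uint8_t value)
{
    uint8_t *page = lazy_lookup(m, addr, 1);
    if (page == NULL)
        return -1;
    page[addr & (PAGE_SIZE - 1)] = value;
    return 0;
}

/* Number of host pages actually backing the simulated address space. */
size_t lazy_pages_allocated(const lazy_mem_t *m)
{
    size_t n = 0;
    for (uint32_t i = 0; i < SLOTS; i++)
        n += (m->slots[i].data != NULL);
    return n;
}

/* Small self-check that can be called from main(). */
void check_lazy_memory(void)
{
    static lazy_mem_t mem;              /* static storage is zero-initialized */
    assert(lazy_read(&mem, UINT64_MAX) == 0);
    assert(lazy_write(&mem, UINT64_MAX, 0xAB) == 0);
    assert(lazy_read(&mem, UINT64_MAX) == 0xAB);
    assert(lazy_pages_allocated(&mem) == 1);
}
```

The simulated machine sees a full 64-bit address space, while the host pays only for the pages the target software touches. Writing one byte at the very top of memory costs a single 4 KiB page here.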

1 Comment

  1. JoachimS

    Good article. This is one good reason why code should have started to use stdint.h a long time ago. Finding the hidden assumptions is the hard part, and a cause of a lot of misery – in portability as well as security.
    A nitpick: You assume IPv4 addresses in your example. And would storing both the source and destination address really be that useful for an IP application? Forwarding, for example, normally requires a comparison between two addresses (two operands), but rarely the source and destination from the same packet.
