The move from 32 bits to 64 bits is really disruptive for software. All processor registers double in size, and code which had assumed a certain register size will break in interesting ways. Just like with endianness, 64-bitness is insidious. I have seen code that runs well on 64-bit SPARC and 32-bit x86 break on 64-bit x86, from a single small mistake in variable types.
In the case of C and C++ code, the compiler will catch many of the errors. Doing pointer arithmetic using "int" is a typical example that will generate compiler errors and warnings (since int stays 32 bits as pointers become 64 bits). However, the compiler will not catch everything, and that's where testing and running code on the actual 64-bit target becomes crucial to getting through the transition. If you have code that is interpreting data stored in memory, you can easily trip up as "long" becomes 64 bits rather than 32 bits (as it does on the LP64 data model used by most Unix-like systems). In C, using type casts, you can easily mix up 32-bit and 64-bit data, leading to very strange results from computations despite the compiler approving of the code.
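To make the two pitfalls above concrete, here is a minimal sketch in C. The record layout and all function names are hypothetical, made up for illustration. It shows what happens to a 64-bit address value that passes through a 32-bit integer, and how fixed-width types from stdint.h keep the interpretation of stored data independent of the size of "long".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record stored in memory or on disk. Fixed-width types
   keep the layout the same whether "long" is 32 or 64 bits. */
struct record {
    uint32_t id;
    uint32_t flags;
    uint64_t timestamp;
};

/* Models a 64-bit address being squeezed through a 32-bit integer,
   which is what storing a pointer in an "int" does on a 64-bit target:
   the upper 32 bits are silently dropped. */
uint64_t truncate_through_32_bits(uint64_t address)
{
    uint32_t narrowed = (uint32_t)address;
    return (uint64_t)narrowed;
}

/* Returns 1 if the address survives the round trip, 0 if it is damaged. */
int survives_32_bit_round_trip(uint64_t address)
{
    return truncate_through_32_bits(address) == address;
}

/* Reads a 32-bit field from raw bytes in host byte order. memcpy into a
   uint32_t avoids both alignment problems and any assumption about
   sizeof(long). */
uint32_t read_u32(const unsigned char *bytes)
{
    uint32_t value;
    assert(bytes != NULL);
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

Addresses below four gigabytes survive the round trip, which is one reason such bugs can stay hidden until a program happens to be loaded or to allocate memory higher up in the address space.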
Furthermore, most relevant code should be updated to take advantage of the fact that it can now operate on 64-bit quantities rather than 32-bit quantities in a single operation. An IP packet header can fit both the source and destination into a single register, for example. Using 64 bits allows much larger counters without overflow (even though one should not assume that 64 bits is enough for anything; we recently had to extend some time counters in Simics to 128 bits to accommodate really long simulation runs). The most obvious advantage to many users is that they can now address more than four gigabytes of memory. Four gigabytes is really becoming a limitation in many applications even in the embedded world, as workloads and databases become bigger and memory cheaper. Updated code obviously needs to be tested hard, so that no code starts interpreting a set bit 31 as a negative number, for example.
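As a rough illustration of both points, the sketch below packs two 32-bit addresses into one 64-bit value, and shows how a set bit 31 turns into a run of ones when a value is widened through a signed type. It assumes IPv4-sized (32-bit) addresses, and all function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 32-bit source and destination address into one 64-bit value,
   source in the upper half, destination in the lower half. */
uint64_t pack_addresses(uint32_t source, uint32_t destination)
{
    return ((uint64_t)source << 32) | destination;
}

uint32_t unpack_source(uint64_t packed)
{
    return (uint32_t)(packed >> 32);
}

uint32_t unpack_destination(uint64_t packed)
{
    return (uint32_t)packed;
}

/* Sanity check: packing followed by unpacking returns the inputs. */
void check_round_trip(uint32_t source, uint32_t destination)
{
    uint64_t packed = pack_addresses(source, destination);
    assert(unpack_source(packed) == source);
    assert(unpack_destination(packed) == destination);
}

/* The bug: widening through a signed 32-bit type. The conversion to
   int32_t is implementation-defined for values above INT32_MAX, but on
   mainstream two's complement compilers it yields a negative number,
   which is then sign-extended to 64 bits. */
uint64_t widen_through_signed(uint32_t value)
{
    return (uint64_t)(int64_t)(int32_t)value;
}

/* The fix: widen the unsigned value directly, which zero-extends. */
uint64_t widen_unsigned(uint32_t value)
{
    return (uint64_t)value;
}
```

With bit 31 set, for example the value 0x80000000, the signed path produces 0xFFFFFFFF80000000 on typical compilers while the unsigned path produces 0x0000000080000000. Values with bit 31 clear give the same result either way, so tests need to include inputs at and above 0x80000000 to expose the difference.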
All of this points to the fact that a crucial part of moving to a 64-bit platform is to actually have the platform available to run code on, and that's where Simics comes in. Since its start as a research project in the mid-1990s, Simics has supported simulating arbitrary word length targets on arbitrary hosts. In particular, Simics started off simulating 64-bit UltraSPARCs on 32-bit x86 PCs. Since then, we have added support for 64-bit x86-64, 64-bit POWER architecture (with the IBM PPC970MP), and 64-bit MIPS. With this tradition, creating a 64-bit e500 core was not particularly hard.
With Simics, the P5020 has been available for software development for quite a while now, making it possible to port operating systems, middleware, and user applications to the platform before silicon availability.
Simics does allow some interesting additional tricks beyond just running code. In one Simics classic, Simics was configured to simulate a physical RAM memory filling the entire possible physical address space of a machine (an amount of RAM that used to cost something like the US annual defense budget, although it is many orders of magnitude cheaper today). This tested that the operating system actually handled maxed-out memory correctly, and that it did not accidentally treat the memory size as a signed value somewhere. Since Simics simulates memory in a lazy fashion, we never actually had to represent the contents of 2^64 bytes of memory, which is still a pretty expensive proposition even when using magnetic drives.
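The general idea behind lazily simulated memory can be sketched in a few lines of C. This is not how Simics implements it; it is only a minimal illustration, with made-up names and sizes, of a memory that spans the full 64-bit address range but allocates a page only when that page is first written. Unwritten addresses read as zero.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_BITS   12u                  /* 4 KiB pages (arbitrary choice) */
#define PAGE_SIZE   (1u << PAGE_BITS)
#define TABLE_SLOTS 1024u                /* power of two; caps the number of
                                            pages this toy can hold */

/* Open-addressing hash table from page number to backing storage.
   A zero-initialized struct is an empty memory. */
struct sparse_memory {
    uint64_t page_number[TABLE_SLOTS];
    uint8_t *page[TABLE_SLOTS];          /* NULL marks an unused slot */
};

/* Find the page holding an address; optionally allocate it if missing. */
static uint8_t *find_page(struct sparse_memory *mem, uint64_t address,
                          int create)
{
    uint64_t number = address >> PAGE_BITS;
    /* Multiplicative hash; the top 10 bits select one of 1024 slots. */
    uint64_t start = (number * 0x9E3779B97F4A7C15ULL) >> 54;

    assert(mem != NULL);
    for (unsigned probe = 0; probe < TABLE_SLOTS; probe++) {
        unsigned i = (unsigned)((start + probe) & (TABLE_SLOTS - 1));
        if (mem->page[i] == NULL) {
            if (!create)
                return NULL;             /* page was never written */
            mem->page[i] = calloc(PAGE_SIZE, 1);
            if (mem->page[i] != NULL)
                mem->page_number[i] = number;
            return mem->page[i];
        }
        if (mem->page_number[i] == number)
            return mem->page[i];
    }
    return NULL;                         /* table is full */
}

/* Reads of untouched memory return zero without allocating anything. */
uint8_t sparse_read(struct sparse_memory *mem, uint64_t address)
{
    uint8_t *page = find_page(mem, address, 0);
    return page ? page[address & (PAGE_SIZE - 1)] : 0;
}

/* Returns 0 on success, -1 if no page could be allocated. */
int sparse_write(struct sparse_memory *mem, uint64_t address, uint8_t value)
{
    uint8_t *page = find_page(mem, address, 1);
    if (page == NULL)
        return -1;
    page[address & (PAGE_SIZE - 1)] = value;
    return 0;
}
```

Writing a single byte at the very top of the address space costs one 4 KiB page on the host, regardless of how large the simulated memory claims to be. Note that addresses are kept as uint64_t throughout, so the top of the range is not mistaken for a negative number.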
Jakob Engblom is Technical Marketing Manager for the Simics product line at Wind River. He came to Wind River with the Virtutech acquisition in March 2010, and has been working with Simics since 2002. As technical marketing manager, he works with the what and how of Simics usage, including actually writing real code.

Aloha!
Good article. This is one good reason why code should have started to use stdint.h a long time ago. Finding the hidden assumptions is the hard part, and a cause of a lot of misery - in portability as well as security.
A nitpick: You assume IPv4 addresses in your example. And would storage of both source and dest addr for IP application really be that useful? Forwarding for example normally reqs comparison between two addr (two operands), but rarely source and dest from same packet.
Posted by: JoachimS | June 22, 2010 at 07:19 AM