Introduction
Truly, "necessity is the mother of invention." The need for more memory,
more address space, and more speed gave rise to 64-bit computing. But it is
either necessity or a lack of programming discipline that pushed memory
requirements past 2 GB, and past 3 GB with the /LARGEADDRESSAWARE switch enabled.
Remember the good old days when a few KB or MB of memory were sufficient, and
gigabytes of memory appeared practically unlimited? With a 64-bit processor and
terabytes of (seemingly infinite) address space, one wonders whether that much
memory will ever be required. But as Einstein is supposed to have said, "only
two things are infinite: the universe and human stupidity. I'm not sure about
the former." So with a poorly designed enterprise application, I see no reason
why we can't hit the terabyte limit and give rise to 128-bit computing. I feel
the natural home of 64-bit computing includes areas such as game programming,
CAD, and image processing.
I guess I am on the wrong track. You might have started wondering by now
whether this article throws some light on 64-bit porting or is anti-64-bit.
So let's say something really nice about 64-bit application porting.
32-bit operating systems, such as Windows NT and Windows
2000, support up to 4 gigabytes (GB) of memory. Over the years, 32-bit Windows
applications have not only grown in size, but the need for applications to
manipulate large amounts of data has also dramatically increased. Today,
millions of users around the world need to access terabytes of data in real time
and the demand for an advanced, scalable architecture that can support a large
amount of memory is understandable. The obvious question is: should you be
considering a move to 64-bit Windows? In this column, we will explore the answer
to this question. We will discuss the advantages of 64-bit Windows over 32-bit,
talk about a few concepts and give you some tips on how you can prepare for the
next wave of Windows. We will also examine some limitations of 64-bit Windows,
including the reasons that 64-bit processes cannot load 32-bit DLLs. We will
address the fate of 32-bit applications in a 64-bit environment. Finally, we
will talk about some general guidelines for porting 32-bit applications to
64-bit.
64-bit is the natural progression of computing technology and
Microsoft, as usual, is bringing to market a seamless migration strategy and
powerful 64-bit platform. Just as
computing migrated from 16-bit to 32-bit once the more powerful platform became
available, it is almost inevitable that virtually all computers, over the next
few years, will be 64-bit. However, the
migration to 64-bit has some key differences:
- the porting will be simpler and will require far fewer resources
- the migration is driven by the desire to take advantage of the technology.
Prerequisite
Before getting into a discussion of 64-bit operating systems and porting
issues, we need to focus on the technology of 64-bit processors.
First, here's a look at the Intel 64-bit Itanium 2 processor of the Itanium Processor Family (IPF):
- The processor's EPIC architecture provides dramatic
performance gains over older 32-bit chipsets.
- EPIC stands for Explicitly Parallel Instruction Computing;
it is a new instruction set designed for a high level of parallelism and
allows for up to nine instructions to be executed in
parallel.
The alternative to the IPF is the x64 processor. There
are actually two groups of x64 processors: the AMD64 processors (Opteron and
Athlon 64) from Advanced Micro Devices, and the Intel Xeon processors with
EM64T (Extended Memory 64 Technology).
Because both the AMD and Intel processors run the same binaries, as with
x86, they are collectively referred to as x64. One advantage of x64
is that it can run both 64-bit and 32-bit applications natively. This
means there is no performance degradation when running a 32-bit app on a
64-bit system. This is not the case with IPF.
The next section digs into processor internals. If you are interested, read on; if not, skip the section.
In computers, the front side bus (FSB) is a term for the
physical bi-directional data bus that carries all electronic signal
information between the CPU and other devices within the system such as
RAM, the system BIOS, AGP video cards, PCI expansion slots, hard disks
etc.
To fully realize the performance gains provided by multiple processor
cores, chip companies need to find a way to deliver enough data to the
processor from the main memory to keep the cores as productive as
possible. Intel's current front-side system bus design should be able to
keep as many as four cores satisfied, depending on their frequency, but
with technology growing faster, the FSB will be a bottleneck for the
quad-core processors due for release in 2007. A company can instead adopt
on-chip memory controllers to connect central processing units to memory.
AMD64 processors eliminate the front-side bus architecture that dominates in
x86-based systems today.
By integrating the memory controller, and using
industry-standard HyperTransport technology for chip-to-chip
communication, AMD64 processors reduce the bottlenecks and latencies
commonly found in other x86-based systems. The drawback of an integrated
memory controller is that it can only work with
the memory standard for which it was designed. With memory standards
changing every 18 months or so, this means companies have to tweak their
chips to enable transitions such as the switch from DDR (double data rate)
memory to some other type.
Why dig into so much detail on processors? In case you are asking, it goes
to show that in many cases 32-bit apps run faster on the 64-bit
AMD64 architecture than they would on an equivalent 32-bit x86 processor.
This is because AMD64 processors eliminate the front-side bus
architecture that dominates in x86-based systems today.
About 64-bit operating system
Microsoft Corp. is currently offering two operating systems compatible
with the AMD64 architecture that provide WOW64 capability. Those
operating systems are Windows XP for 64-bit Extended Systems and
Windows Server 2003 for 64-bit Extended Systems.
Wow64 (Windows On Windows)
WOW64 (Win32 emulation on 64-bit Windows) refers to the
software that permits the execution of 32-bit x86 applications on 64-bit
Windows. It is implemented as a set of three user-mode DLLs: Wow64.dll,
which is the core interface to
the NT kernel that translates between 32-bit and 64-bit calls, including
pointer and stack manipulations; Wow64win.dll, which provides the
appropriate entry points for 32-bit apps; and Wow64cpu.dll, which takes care
of switching the processor from 32-bit to 64-bit mode.
Despite its
outwardly similar appearance on all versions of 64-bit Windows, WOW64's
implementation varies depending on the target processor architecture. For
example, the version of 64-bit Windows developed for the Intel Itanium 2
processor uses Wow64win.dll to set up the emulation of x86 instructions
within the Itanium 2's unique instruction set. That's a more
computationally expensive task than the Wow64win.dll's functions on the
AMD64 architecture, which switches the processor hardware from its 64-bit
mode to 32-bit mode when it's time to execute a 32-bit thread, and then
handles the switch back to 64-bit mode. No emulation is required!
The
WOW64 subsystem also handles other key aspects of running 32-bit
applications. For example, it's involved in managing the interaction of
32-bit apps with the Windows registry, which is somewhat different in
64-bit versions of the OS, and in providing an interface to the storage
subsystem. WOW64 also ensures that 32-bit apps, utilities, dynamic link
libraries (DLLs) and other files are stored in the appropriate
directories.
Two of the more interesting
functions of the WOW64 infrastructure are its Registry
Redirector and File System Redirector.
Registry Redirector: In the case
of the registry redirector, 64-bit Windows actually maintains two separate
HKEY_LOCAL_MACHINE\Software trees. One is used by native 64-bit apps, the
other is for 32-bit apps. This allows a 32-bit application to see a system
image and resources that make it believe that it's running on a standard
32-bit version of Windows; obviously a 32-bit app wouldn't understand the
hardware changes in a 64-bit
system.
File System Redirector: The file system
redirector is used to ensure that 64-bit Windows doesn't suffer from
file-location overloading. That's because 64-bit Windows still uses the
C:\windows\system32 directory for native applications. Oddly enough, this
misnamed directory is for 64-bit apps only! So, 32-bit apps trying to read
or write to C:\windows\system32 are redirected to C:\windows\SysWOW64
instead. Similarly, only 64-bit apps can use C:\Program Files; 32-bit
apps are invisibly redirected to C:\Program Files (x86). You'll need to
pay attention to this, however, if you're executing command-shell scripts
that need to call a 32-bit application, because they'll be looking in the
wrong directory. By default, a command shell launched on 64-bit Windows is
a 64-bit command shell. You can still launch 32-bit apps from such a shell
by looking in the correct directory. Or, you can simply start a 32-bit
command shell, from C:\windows\SysWOW64\cmd.exe, and let it handle the
directory translations for you.
WOW64 executes and operates differently on the x64 and IPF chipsets.
Because x64 is designed to run
32-bit code natively, there is no performance loss for 32-bit
applications. The Itanium 2
requires an execution layer because the x86 binary has to be converted on
the fly into EPIC instructions. In many
cases, 32-bit applications will run slower on the Itanium 2 than on a
32-bit processor because of the execution layer. An application's particular
requirements and specifications will dictate which processor is best
suited to its needs.
Why 64-bit apps cannot load 32-bit DLLs and 32-bit apps cannot load 64-bit DLLs
I mentioned earlier that 32-bit processes can't load 64-bit DLLs and
64-bit processes can't load 32-bit DLLs. You might be wondering why. By
default, 64-bit applications can use 8 TB of user-mode address space. You
have the option to specify that all memory below 2 GB be allocated to the
application. Because a 32-bit DLL can't address memory above 2 GB, the
thunk layer would have to copy all data into the low 2 GB of the 64-bit
application. Obviously, this won't work if the 64-bit application tries to
pass a pointer to data that is larger than 2 GB.
Also 32-bit DLLs use x86 style exception handling and 4K pages. On an
IA-64 processor, the native page size is 8K and the WOW64 emulator is
responsible for simulating 4K pages. Because on an x86 machine exceptions
do not "unwind" from user mode to kernel mode and back, WOW64 implements
x86-style exception without switching from x86 code to IA-64 and back.
Finally, another reason why 64-bit and 32-bit processes can't load each
other's DLLs is that system DLLs (kernel32.dll, user32.dll, and gdi32.dll)
expect only one instance per process, 32-bit or 64-bit. If a process
contained more than one instance of, say, user32.dll, Win32k.sys would not
be able to distinguish between them and wouldn't know which one to
call.
So every DLL that a 64-bit application loads must itself be 64-bit.
64-bit Porting Issues
This section explains the most effective strategy for migrating a 32-bit
C++ solution. In the majority of cases the port to 64-bit is
straightforward. For
unmanaged code (C/C++), the move to 64-bit focuses on 64-bit pointers and
how to handle them in the code base. The
memory model in 32-bit Windows is ILP32; in 64-bit Windows it is IL32P64.
The most significant difference
between this memory model and the Unix 64-bit memory model (I32LP64) is
that a long is still 32 bits, whereas in Unix it is 64 bits.
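To check which of these models a particular build uses, one can classify it from the sizes of int, long, and a pointer. The following is only a small sketch; the helper names data_model_name and current_data_model are invented for this illustration and do not come from any SDK.

```cpp
#include <cstddef>

// Hypothetical helper: name the data model implied by the sizes (in bytes)
// of int, long, and a pointer.
constexpr const char* data_model_name(std::size_t int_bytes,
                                      std::size_t long_bytes,
                                      std::size_t ptr_bytes)
{
    return (int_bytes == 4 && long_bytes == 4 && ptr_bytes == 4) ? "ILP32"
         : (int_bytes == 4 && long_bytes == 4 && ptr_bytes == 8) ? "IL32P64"
         : (int_bytes == 4 && long_bytes == 8 && ptr_bytes == 8) ? "I32LP64"
         : "other";
}

// Data model of whatever compiler and target this file is built with.
constexpr const char* current_data_model()
{
    return data_model_name(sizeof(int), sizeof(long), sizeof(void*));
}
```

Printing current_data_model() from a 32-bit build and from a 64-bit build of the same project is a quick way to confirm which assumptions about long and pointer sizes still hold.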
The majority of
coding issues for migrating C/C++ applications to 64-bit can be
categorized into three groups:
pointer casting, pointer arithmetic, and alignment. Additional challenges may
present themselves when dealing with in-line assembler, use of one of the five
modified API calls, and when attempting to communicate across a 32/64-bit
boundary. Because the data types
int and long remain the same size (32 bits), the amount of code that will
need to be modified should be very small. Typically, the number of lines of
code touched should be less than one percent of the total code base. This is
different from Unix, in
which long moves to 64 bits. Developers must be very careful with
the alignment of variables. The
penalty for misalignment can be very severe in terms of performance. On x64
there is a performance hit,
but on Itanium systems the problem is more grave: the exception propagates to
the application level and will cause the application to
crash. Developers can use
the /Wp64 compiler switch to ask the compiler to display possible
portability issues. This will
bring the vast majority of porting issues to their attention. This flag is
also available in 32-bit mode.
I was working on porting an enterprise application
written in C++ from 32-bit to 64-bit. I was very happy with the
challenging assignment I got and started working on it enthusiastically.
Microsoft makes life so simple. I just opened all the
project .dsp files (DLLs) in VS.NET 2005. It asked if I wanted to convert
the projects to VS.NET 2005 format and I clicked "Yes to All." Then I changed
some compiler settings, rebuilt the solution, and the 64-bit DLLs were ready. I
really felt blessed working with Microsoft technologies: the cool GUI, the
next-next-next-finish approach, and the /Wp64 compiler switch that gives you
all probable warnings and, best of all, also gives you the solution
for 64-bit porting.
The porting assignment was completed in a few days, and I
was feeling proud of one of the biggest achievements of my life. But God
had some different plans for me. I think this time God had decided to
annihilate me and had chosen this assignment as one of the weapons to use
against me. The project was put up for testing and it started crashing now
and then. I started having sleepless nights, with crashes, crash dumps, and
GPFs following me in my dreams (if I actually got some sleep). Suddenly I
started cursing the creator of the 64-bit processor. By now you might have
understood why I was anti-64-bit at the start of this article. But believe
me, friends, if God gets you there, he will get you through it. Here are a
few tips that might prevent you from having nightmares.
Porting Guidelines
64-bit clean: Before you start
using any other tools or apply any 64-bit porting guidelines, get your code
64-bit clean using the /Wp64 switch in the VS.NET 2005 IDE. It points out
many porting issues.
Pointer Casting: When moving from 32-bit
to 64-bit, the main type that grows is the pointer, along with derived data
types such as handles. In 64-bit Windows, pointers and their derived types
are 64 bits long. Some other types also increase in
size: WPARAM, LPARAM, LRESULT, and SIZE_T are all derived from the
pointer. One reason
for this is that they are used as parameters, and some functions expect
pointers as parameters. All
of the types derived from "int" and "long" continue to be 32 bits in
size; these include DWORD, UINT, and ULONG. Types
that were smaller than 32 bits remain at their current sizes. An example of
this is the "short" data type, which remains a 16-bit signed integer.
Look for code where you have typecast any
pointer-derived data type to long or DWORD, which was perfectly acceptable
in 32-bit but is an unforgivable sin in 64-bit. Use polymorphic data types
such as INT_PTR and DWORD_PTR for typecasting, as they are defined roughly
as:
#ifdef _WIN64
typedef __int64 INT_PTR;
#else
typedef int INT_PTR;
#endif
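The effect of such a cast is easy to see with plain integers. The sketch below simulates a 64-bit pointer value being squeezed through a 32-bit variable such as a long or DWORD; the function names and the sample addresses are made up for this illustration.

```cpp
#include <cstdint>

// Simulate storing a 64-bit pointer value in a 32-bit long or DWORD and
// reading it back: the upper 32 bits are discarded.
constexpr std::uint64_t through_32_bits(std::uint64_t address)
{
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(address));
}

// True when the address is unchanged by the 32-bit round trip.
constexpr bool survives_32_bit_cast(std::uint64_t address)
{
    return through_32_bits(address) == address;
}
```

An address below 4 GB survives the round trip, which is why such code can appear to work in light testing and then crash once allocations land higher in the address space.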
Pointer Arithmetic: Look for code where you have done pointer
subtraction or other arithmetic. Subtracting two pointers and
storing the value in a long is acceptable in 32-bit, but on 64-bit it
truncates the result and can leave you pointing to the wrong address. Use
ptrdiff_t to save the results of pointer arithmetic instead of long or
DWORD.
Incorrect, the pointer is cast to int:
char lcTestArray[16], *char_ptr;
char_ptr = (char*)((int)lcTestArray + 1);
*char_ptr = 'a';
Use polymorphic types instead:
char lcTestArray[16], *char_ptr;
char_ptr = (char*)((size_t)lcTestArray + 1);
*char_ptr = 'a';
The first example illustrates the improper use of the int
type in pointer arithmetic. The lcTestArray pointer is cast to an int in
order to calculate an offset into the array, and the result is
cast back to a pointer.
Because a pointer is 64 bits wide and an int only 32, data will be lost
and memory faults will be inevitable.
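A simpler alternative is to avoid the integer cast altogether and let the compiler do the arithmetic on the pointer itself, keeping any difference in a ptrdiff_t. This is a minimal sketch; offset_between, advance_by, and kTestArray are names chosen for the example only.

```cpp
#include <cstddef>

// Distance between two pointers into the same array, kept in ptrdiff_t
// so that it is wide enough on both 32-bit and 64-bit builds.
constexpr std::ptrdiff_t offset_between(const char* base, const char* element)
{
    return element - base;
}

// Move a pointer without converting it to an integer type at all.
constexpr const char* advance_by(const char* base, std::ptrdiff_t count)
{
    return base + count;
}

// Sample buffer used only to exercise the two helpers.
constexpr char kTestArray[16] = {};
```

With helpers like these there is no intermediate int or long anywhere, so nothing can be truncated regardless of the pointer size.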
Polymorphic Data Types: Be careful
with polymorphic data types such as INT_PTR, DWORD_PTR, size_t, time_t, and
many more. CTime returns time_t values, so look for places where you have
collected those values in a long int. In my code, and also in my DB, I saved
a time stamp in long format. This will not pose any problem until the year
2038, but it will hurt a lot of programmers, as it is against our ethical
code to keep something in the code that we know might stop working after
some time. So better
change it if you are storing the value in a long int.
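The 2038 limit comes from the largest value a signed 32-bit long can hold, 2^31 - 1 seconds after 1 January 1970. A check along the following lines can flag time stamps that would no longer fit; the names are invented for this sketch and the code assumes time stamps are counted in seconds since 1970.

```cpp
#include <cstdint>

// Largest value a signed 32-bit long can hold: 2^31 - 1 seconds after
// 1 January 1970, which falls in January 2038.
constexpr std::int64_t kMax32BitTime = 2147483647;

// True when a 64-bit time stamp can be stored in a 32-bit long unchanged.
constexpr bool fits_in_32_bit_long(std::int64_t seconds_since_epoch)
{
    return seconds_since_epoch >= -kMax32BitTime - 1 &&
           seconds_since_epoch <= kMax32BitTime;
}
```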
API Updated: The Win32 API remains the same. The only changes are five
replacement functions: four that are replaced by a polymorphic version
and one that is used for flat scroll bars. The four polymorphic
replacements are:
GetClassLongPtr()
GetWindowLongPtr()
SetClassLongPtr()
SetWindowLongPtr()
The names of these functions have been changed.
They have also been adjusted to use polymorphic data
types (such as UINT_PTR) and any updated constants.
Structure Alignment: Another common source of porting problems is data
structure alignment. Data types tend to
be aligned on boundaries related to the size of the data type itself. For
instance, chars align on one-byte boundaries, whereas integers align
on four-byte boundaries. The structure below illustrates this issue. The a
field (which is a character) correctly aligns at the head of the structure.
The b and c integer fields,
however, align at the next available 4-byte boundary in the
structure. This forces the
use of 3 bytes of padding between a and b in order to conform to the
4-byte alignment necessary for the integers.
struct ExampleStruct
{
    char a;
    int b, c;
    void* d;
};
Similarly, d (which is a pointer) will align at the
next available eight-byte boundary on a 64-bit platform.
Because a, b, and c, together with the padding between a and b, total
12 bytes, 4 additional padding bytes are necessary after c so that d
can be properly aligned
on an 8-byte boundary. These changes in structure layout can cause
problems if incorrect assumptions are made about the way the structure is
laid out in memory at run time. For instance, assuming that
d sits 12 bytes from the head of the
structure could cause problems if direct offset-based assignments or
accesses are attempted. There
are mechanisms in place to allow the use of offset-based access in a safe
and platform-neutral manner. A structure may be padded differently on the
32-bit platform and the 64-bit
platform. Developers should
understand the architecture's padding rules and ideally align all structure
members on natural boundaries. Because the padding can differ
significantly between platforms, any transfer of objects
across the 32/64-bit boundary may cause problems. Structure alignment
issues may also give rise to invalid offset arithmetic.
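One platform-neutral mechanism for offset-based access is the standard offsetof macro, which asks the compiler for the real offset instead of hard-coding 12 or 16. The sketch below applies it to ExampleStruct; the constant names and the set_d_by_offset helper are invented for this illustration.

```cpp
#include <cstddef>

struct ExampleStruct
{
    char  a;
    int   b, c;
    void* d;
};

// Offsets computed by the compiler for the platform being built.
constexpr std::size_t kOffsetOfB = offsetof(ExampleStruct, b);
constexpr std::size_t kOffsetOfD = offsetof(ExampleStruct, d);

// Write the d member through its offset without assuming where it sits.
inline void set_d_by_offset(ExampleStruct* s, void* value)
{
    unsigned char* raw = reinterpret_cast<unsigned char*>(s);
    *reinterpret_cast<void**>(raw + kOffsetOfD) = value;
}
```

On a typical 32-bit build kOffsetOfD comes out as 12 and on a typical 64-bit build as 16, yet the same source works on both because the number is never written by hand.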
The results of violating alignment vary from platform
to platform. The following cases apply:
- x86 - An exception is raised, but the
operating system fixes the misalignment on the fly.
- IPF - It behaves similarly to the x86, except that
the operating system does not fix it.
- x64 - The hardware does not raise the
exception; the fix is done at the hardware level.
There are several ways to avoid
misalignment. One of them is
to use the __unaligned keyword.
It allows access to misaligned data; however, even if the data is
aligned properly, the application will pay a performance penalty. This
approach is not recommended
except for cases where no other option is practical. The __unaligned
keyword causes the compiler to insert code to correct misalignment
problems on the fly. This
increases the overall size of the executable and is the source of the
performance penalty. It is also possible to indicate that data
should be aligned on a specific boundary using __declspec(align()).
Furthermore, the _aligned_malloc()
call allows the developer to allocate memory in a pre-aligned
fashion. This is the
recommended best practice for ensuring that all data is aligned on natural
boundaries. Because the
majority of application providers are moving to the 64-bit platform for
performance-oriented reasons, it is essential that the developer pays
attention to alignment to prevent performance degradation of the
application.
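For code that must also build outside the Microsoft compiler, standard C++11 offers alignas and alignof as counterparts of __declspec(align()). The sketch below uses those standard keywords rather than the Microsoft-specific ones named above; Vector4 and is_aligned are illustrative names only.

```cpp
#include <cstdint>

// Standard C++11 counterpart of __declspec(align(16)): every Vector4
// starts on a 16-byte boundary.
struct alignas(16) Vector4
{
    float v[4];
};

// True when `address` is a multiple of `alignment`, which must be a
// power of two.
constexpr bool is_aligned(std::uint64_t address, std::uint64_t alignment)
{
    return (address & (alignment - 1)) == 0;
}
```

A check such as is_aligned can be placed in debug assertions around buffers received from other modules, where the alignment cannot be guaranteed by the type alone.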
Messaging Architecture: As in many
enterprise client/server applications, communication takes place by
means of requests and responses. The client sends a request and the server
sends the response. The requests and responses are nothing but messages with
a predefined structure. Be careful not to use polymorphic data types
like size_t or time_t in these messages: if the client is running
on a 32-bit machine, these data types will be 32 bits wide, while on
the server, a 64-bit application running on a 64-bit OS, they will be
64 bits wide, and the request will be misinterpreted. For messaging
architecture, stick to the basic data types.
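One way to follow this advice is to build message structures only from fixed-width types and let the compiler verify the layout. The header below is a hypothetical example, not taken from any real protocol, and it assumes both sides use the same byte order.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical request header built only from fixed-width types, so a
// 32-bit client and a 64-bit server agree on its layout.
struct RequestHeader
{
    std::uint32_t message_id;      // fixed 32 bits instead of size_t
    std::uint32_t payload_length;  // fixed 32 bits instead of size_t
    std::int64_t  timestamp;       // fixed 64 bits instead of time_t
};

// Fail the build if the layout ever differs from the agreed 16 bytes.
static_assert(sizeof(RequestHeader) == 16, "RequestHeader must be 16 bytes");
```

Because the size check runs at compile time, a mismatch shows up when either the client or the server is built rather than as a misinterpreted request in production.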
Data File Sharing: File handling
becomes a very important aspect, particularly if a file is handled by both
64-bit and 32-bit apps. For example, a 64-bit server writes files and
distributes them to a 32-bit client, which reads them. Suppose the server
application uses fwrite to write a value to a file, with sizeof(size_t)
specifying the size. On a
64-bit platform that size is 8 bytes, and a 32-bit application attempting
to read from the file can create havoc, because there the size is 4 bytes.
I will throw
some light on this aspect with an example later, as I don't have VC 6.0 and
MSDN installed.
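Until then, here is one possible way around the problem: store lengths in the file as an explicit four-byte field rather than as a raw size_t. This is only a sketch under stated assumptions; the LengthField type, the function names, and the choice of little-endian byte order are all made up for this illustration.

```cpp
#include <cstdint>

// Four-byte length field exactly as it would be written to the file.
struct LengthField
{
    unsigned char bytes[4];
};

// Split a 32-bit length into four bytes, least significant byte first
// (little-endian order is an assumption of this sketch).
constexpr LengthField encode_length(std::uint32_t n)
{
    return LengthField{{ static_cast<unsigned char>(n & 0xFF),
                         static_cast<unsigned char>((n >> 8) & 0xFF),
                         static_cast<unsigned char>((n >> 16) & 0xFF),
                         static_cast<unsigned char>((n >> 24) & 0xFF) }};
}

// Rebuild the length from the four bytes read back from the file.
constexpr std::uint32_t decode_length(const LengthField& f)
{
    return  static_cast<std::uint32_t>(f.bytes[0])
         | (static_cast<std::uint32_t>(f.bytes[1]) << 8)
         | (static_cast<std::uint32_t>(f.bytes[2]) << 16)
         | (static_cast<std::uint32_t>(f.bytes[3]) << 24);
}
```

The writer would pass the four bytes to fwrite and the reader would pass them to decode_length, so the on-disk size is 4 bytes whether the program is built for 32-bit or 64-bit.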
Deprecated Functions: The best practice is
for developers to aim for a single code base that compiles for both 32-bit
and 64-bit. This allows
developers to protect their investment and expertise in 32-bit code. They
should not write code that
depends on or assumes the sizes of data types for its calculations and
operation. Such code will
very likely not be portable and could create difficulties in porting. Things
seemed to be settling down for me when, all of a sudden, out of nowhere, the
VS.NET IDE started shouting about deprecated functions such as strcpy and
strncpy, and suggested that I use strncpy_s. But I needed the same code for
32-bit apps in VC 6.0
and 64-bit apps in VS.NET 2005, and our old friend #ifdef came to
the rescue again. Write your own custom function, port_strncpy. The
signature below mirrors strncpy so that a find and replace works; it is one
possible choice:
char* port_strncpy(char* dest, const char* src, size_t count)
{
#ifdef _WIN64
    /* VS.NET 2005 build; assumes count is the size of the dest buffer */
    strncpy_s(dest, count, src, _TRUNCATE);
#else
    /* VC 6.0 build, where strncpy_s is not available */
    strncpy(dest, src, count);
#endif
    return dest;
}
Replace all occurrences of strncpy with port_strncpy. Using find and replace
across the project, it won't take more than
a couple of minutes. Handle every other deprecated function the same way, by
writing your own custom wrapper.
Format Specifiers: Use the proper format specifiers in printf or
wsprintf. Use %p to print pointers in hexadecimal.
This is the best choice for printing pointers. Refer to MSDN for other
format specifiers.
A good indication that developers
have written clean code is when they can compile it cleanly with level 4
(/W4) warnings turned on. This
doesn't specifically target 64-bit issues, but many portability issues are
identified this way.
The New Data Types and Helper Functions
64-bit Windows introduces new data types that your applications should
be aware of: fixed-precision data types, pointer-precision types, and
specific-precision pointers. These types were added to the development
environment to allow developers to prepare for 64-bit Windows well before
its introduction.
The developer tool kits from Microsoft also
include new helper functions that can be useful in managing a code base
across 32-bit and 64-bit systems. Some examples of these helper functions
are:
unsigned long HandleToUlong( const void *h )
long HandleToLong( const void *h )
void *LongToHandle( const long h )
unsigned long PtrToUlong( const void *p )
unsigned int PtrToUint( const void *p)
The New 64-bit Compiler
The Platform SDK
includes a pre-release version of a 64-bit compiler that can be used to
identify pointer truncation, improper type casts, and other
64-bit-specific problems. You can run it on a project or a set of code. This
is a great place to start. The first time you run the compiler, it will
generate many pointer truncation or type mismatch warnings. You can also
use the VC 6.0 IDE for test builds: launch msdev from the 64-bit compiler
command prompt using the /useenv switch. Note that VC 6.0 does not seem to
support the x64 processor family, particularly Xeon (EM64T). I tried to
build on the 64-bit 2003 OS with dual hyper-threaded Xeon processors; I
guess VC 6.0 supports Itanium only. Correct me if I am wrong or if some
settings need to be changed.
The New Rules for Using Pointers
Porting your code to
compile for both 32- and 64-bit Windows is reasonably straightforward,
but you do need to be careful and consistent. You need to follow a few
simple rules about casting pointers, and use the new data types in your
code.
Some of these rules for pointer manipulation are as
follows.
- Do not cast pointers to int, long, ULONG, or DWORD.
- Use UINT_PTR and INT_PTR where appropriate (and if you
are uncertain whether they are required, there is no harm in using them
just in case).
- If you must truncate a pointer to a 32-bit value, use the
PtrToLong or PtrToUlong function.
- When setting the cbWndExtra member of the WNDCLASS structure, be
sure to reserve enough space for pointers.
Finally
Thanks for your patience in reading this far. This is not a complete guide
to the 64-bit OS; things
are relatively new, and some of you may have dug up more information, which
you are always welcome to contribute. Suggestions for improvement are always
welcome. So is a rating, so don't forget to rate the article. I hope the
article is useful in
addressing any porting issues you have. Also write back if you
run into new porting issues or have further porting guidelines. Please let
me know if any information in
this article is misleading and needs correction.
References
Microsoft Windows 64-Bit Technology White Paper
AMD processor