This article explains why rebasing a DLL improves startup performance for an application that uses more than one DLL. It covers how to change the linker settings to rebase a DLL and how the /FIXED switch interacts with rebasing.
Every executable and DLL module has a preferred base address, which identifies the ideal memory address where the module should get mapped into a process' address space. When you build an executable module, the linker sets the module's preferred base address to 0x00400000. For a DLL module, the linker sets a preferred base address of 0x10000000. Using Visual Studio's DumpBin utility (with the /headers switch), you can see an image's preferred base address.
Go to the command line and run the command dumpbin /headers exename.exe.
Alternatively, open the EXE in Visual Studio's Depends (Dependency Walker) utility, which lists all the DLLs the EXE uses together with the base addresses at which they are loaded.
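To make it clearer what DumpBin is reporting, here is a minimal sketch in portable C that pulls the preferred base address (the ImageBase field) out of a PE image that has already been read into memory. The function name pe_preferred_base and the helper read_le are made up for this illustration; the field offsets follow the published PE layout (e_lfanew at 0x3C, the optional header 24 bytes after the "PE" signature, ImageBase at offset 28 for PE32 and 24 for PE32+). Treat it as a sketch rather than a replacement for DumpBin, since it does only minimal validation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read an n-byte little-endian value (n <= 8) starting at p. */
static uint64_t read_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    for (size_t k = n; k > 0; --k)
        v = (v << 8) | p[k - 1];
    return v;
}

/* Hypothetical helper: stores the preferred base address of the PE image
   in *base and returns 0, or returns -1 if the buffer does not look like
   a PE image. */
int pe_preferred_base(const uint8_t *img, size_t len, uint64_t *base)
{
    assert(img != NULL && base != NULL);

    /* DOS header: starts with "MZ", e_lfanew lives at offset 0x3C. */
    if (len < 0x40 || img[0] != 'M' || img[1] != 'Z')
        return -1;
    size_t pe = (size_t)read_le(img + 0x3C, 4);

    /* Need the signature (4), the COFF header (20) and the first
       32 bytes of the optional header (enough to cover ImageBase). */
    if (pe > len || len - pe < 4 + 20 + 32)
        return -1;
    if (read_le(img + pe, 4) != 0x00004550u)   /* "PE\0\0" */
        return -1;

    const uint8_t *opt = img + pe + 24;        /* optional header */
    uint64_t magic = read_le(opt, 2);
    if (magic == 0x10B) {                      /* PE32: 4-byte ImageBase */
        *base = read_le(opt + 28, 4);
        return 0;
    }
    if (magic == 0x20B) {                      /* PE32+: 8-byte ImageBase */
        *base = read_le(opt + 24, 8);
        return 0;
    }
    return -1;
}
```

For a DLL built with the default settings described above, the value stored in *base would be 0x10000000.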
When this executable module is invoked, the operating system loader creates a virtual address space for the new process. Then the loader maps the executable module at memory address 0x00400000 and the DLL module at 0x10000000. Why is this preferred base address so important?
Using the code
Let's look at a simple piece of code. I am assigning a value to a global integer i in a function.
int i;
void Func() { i = 5; }
When the compiler processes the Func function, the compiler and linker produce machine code that looks something like this:
MOV [0x10014540], 5
In other words, the compiler and linker have created machine code with the address of the "i" variable hard-coded into it, i.e., 0x10014540. This memory address is absolutely correct as long as the DLL does in fact load at its preferred base address.
OK, now let's say that you're designing an application that requires two DLLs. By default, the linker sets the .exe module's preferred base address to 0x00400000 and the linker sets the preferred base address for both DLLs to 0x10000000. If you attempt to run the .exe, the loader creates the virtual address space and maps the .exe module at the 0x00400000 memory address. Then the loader maps the first DLL to the 0x10000000 memory address. But now, when the loader attempts to map the second DLL into the process' address space, it can't possibly map it at the module's preferred base address. It must relocate the DLL module, placing it somewhere else.
Below are the dependencies for a test EXE using DLL1 and DLL2 without rebasing. As seen, both DLLs have the same preferred base address, so only one can be loaded at that address and the other has to be relocated.
Relocating an executable (or DLL) module is an absolutely horrible process, and you should take measures to avoid it. Let's see why. Suppose that the loader relocates the second DLL to address 0x20000000. In that case, the code that sets the "i" variable to 5 should be:
MOV [0x20014540], 5
But the code in the file's image looks like this:
MOV [0x10014540], 5
If the code from the file's image is allowed to execute without the address being changed, some 4-byte value in the first DLL module will be overwritten with the value 5. This can't possibly be allowed. The loader must somehow fix this code. When the linker builds your module, it embeds a relocation section in the resulting file. If the loader can map a module at its preferred base address, the module's relocation section is never accessed by the system. This is certainly what we want: you never want the relocation section to be used, for the reasons given below.
If the module cannot be mapped at its preferred base address, the loader opens the module's relocation section and iterates through all the entries. For each entry found, the loader goes to the page of storage that contains the machine code instruction to be modified. It then grabs the memory address that the machine instruction is currently using and adds to it the difference between the address where the module actually got mapped and the module's preferred base address.
So, in the example above, the second DLL was mapped at 0x20000000, but its preferred base address is 0x10000000. This yields a difference of 0x10000000, which is then added to the address in the machine code instruction, giving us this:
MOV [0x20014540], 5
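The fix-up arithmetic described above can be sketched in a few lines of C. This is only an illustration, not the real loader: the function apply_fixups and its parameters are invented for this example, the relocation section is reduced to a plain array of byte offsets, and every entry is assumed to point at a 32-bit absolute address stored in the host's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the loader's fix-up pass: for every offset listed
   in fixups[], add (actual_base - preferred_base) to the 32-bit address
   stored at that offset inside the mapped image. */
void apply_fixups(uint8_t *image, size_t image_size,
                  const size_t *fixups, size_t count,
                  uint32_t preferred_base, uint32_t actual_base)
{
    assert(image_size >= sizeof(uint32_t));

    /* Unsigned arithmetic wraps, so this also works when the module
       is moved to a lower address than its preferred base. */
    uint32_t delta = actual_base - preferred_base;

    for (size_t k = 0; k < count; ++k) {
        assert(fixups[k] <= image_size - sizeof(uint32_t));

        uint32_t addr;
        memcpy(&addr, image + fixups[k], sizeof addr);  /* address in use now */
        addr += delta;                                  /* move it with the module */
        memcpy(image + fixups[k], &addr, sizeof addr);  /* write it back */
    }
}
```

With preferred_base 0x10000000 and actual_base 0x20000000, an operand of 0x10014540 becomes 0x20014540, which matches the instruction shown above. Note that the write-back step is what modifies the module's code pages.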
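To avoid relocation at load time, change the project's linker settings (the Base Address setting, which corresponds to the linker's /BASE option) so that each DLL is given a different preferred base address when it is built.
The figure below shows how to achieve it: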
Below are the dependencies for a test EXE using DLL1 and DLL2 with rebasing. As seen, the DLLs now have different base addresses (DLL1: 0x10000000 and DLL2: 0x20000000) and both will be loaded at their preferred addresses. If, for some reason, a DLL cannot be loaded at the address specified, the loader still has to relocate it, and the fix-up process described above is carried out.
Now this code in the second DLL will reference its "i" variable correctly.
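When an application uses more than a couple of DLLs, picking the base addresses by hand gets tedious. The sketch below shows one possible way to lay modules out back to back so that their address ranges never overlap. It is an assumption-laden illustration: assign_bases and BASE_ALIGNMENT are names invented here, the image sizes are taken as given, and each base is rounded up to a 64 KB boundary because base addresses on Windows are expected to be multiples of 64 KB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BASE_ALIGNMENT 0x10000u   /* 64 KB, assumed base address granularity */

/* Hypothetical helper: given the in-memory size of each module, fill
   bases[] with consecutive, non-overlapping preferred base addresses
   starting at first_base. */
void assign_bases(const uint32_t *sizes, uint32_t *bases, size_t count,
                  uint32_t first_base)
{
    assert(first_base % BASE_ALIGNMENT == 0);

    uint32_t next = first_base;
    for (size_t k = 0; k < count; ++k) {
        bases[k] = next;
        /* Round the module size up to the next 64 KB multiple. */
        uint32_t rounded =
            (sizes[k] + BASE_ALIGNMENT - 1) / BASE_ALIGNMENT * BASE_ALIGNMENT;
        next += rounded;
    }
}
```

The resulting values would then be entered as the base address of each DLL project. Widely spaced addresses such as 0x10000000 and 0x20000000, as used in this article, work just as well for a small number of DLLs.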
There are two major drawbacks when a module cannot load at its preferred base address:
- The loader has to iterate through the relocation section and modify a lot of the module's code. This produces a major performance hit and can really hurt an application's initialization time.
- As the loader writes to the module's code pages, the system's copy-on-write mechanism forces these pages to be backed by the system's paging file.
The second point above is truly bad. It means that the module's code pages can no longer be discarded and reloaded from the module's file image on disk. Instead, the pages are swapped to and from the system's paging file as necessary. This hurts performance too. But wait, it gets worse. Since the paging file backs all of the module's code pages, the system has less storage available for all processes running in the system.
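To get a feel for the second drawback, the following sketch counts how many distinct code pages a set of fix-ups would dirty. It is a rough model only: pages_touched and SKETCH_PAGE_SIZE are hypothetical names, the page size is assumed to be 4 KB, each fix-up is assumed to patch 4 bytes, and the offsets are assumed to be sorted in ascending order.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size */

/* Hypothetical helper: number of distinct pages written to when a 4-byte
   fix-up is applied at each offset in fixups[] (sorted ascending). */
size_t pages_touched(const size_t *fixups, size_t count)
{
    size_t pages = 0;
    size_t last_counted = (size_t)-1;   /* sentinel: nothing counted yet */

    for (size_t k = 0; k < count; ++k) {
        assert(k == 0 || fixups[k] >= fixups[k - 1]);

        /* A 4-byte patch can straddle a page boundary. */
        size_t first = fixups[k] / SKETCH_PAGE_SIZE;
        size_t last = (fixups[k] + 3) / SKETCH_PAGE_SIZE;

        for (size_t p = first; p <= last; ++p) {
            if (last_counted == (size_t)-1 || p > last_counted) {
                ++pages;
                last_counted = p;
            }
        }
    }
    return pages;
}
```

Every page counted here is a page that copy-on-write turns into a private copy backed by the paging file, so even a handful of fix-ups scattered across the code can cost far more memory than the few bytes actually changed.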
By the way, you can create an executable or DLL module that doesn't have a relocation section in it. You do this by passing the /FIXED switch to the linker when you build the module. Using this switch makes the module smaller in bytes but it means that the module cannot be relocated. If the module cannot load at its preferred base address, it cannot load at all. If the loader must relocate a module but no relocation section exists for the module, the loader kills the entire process and displays an "Abnormal Process Termination" message to the user.
In this example, I have given both DLLs the same base address, i.e., 2000000, and linked them with the /FIXED switch, so only one DLL can be loaded at that address. The other cannot be loaded there and cannot be relocated, and you get an error as shown:
Points of Interest
In this example, I have tried to cover all aspects: what happens without rebasing, why rebasing is needed, how to rebase, how the /FIXED switch behaves, and the drawbacks of using /FIXED. Suggestions for improvement are most welcome.
This is my first article; I hope all of you liked it.
Acknowledgement and References
I would like to acknowledge the author Mr. Jeffrey Richter and his book on Windows OS, which is one of the best books for learning about Windows operating system internals. Parts of this article are taken from the book, and examples were added to simplify things.