This article explains how to create a dynamic library that loads each exported function the first time it is used, as opposed to loading all of them when the library itself is loaded. We will use the Win32 platform and the DLL format for illustration, but the concept can be applied to other platforms and library formats.
Please note that this concept is different from standard delayed library loading, where the whole library (i.e. all of its functions) is loaded at once. With the mechanism described here, you can load individual functions from the same DLL when they are needed and then free them independently of each other.
How Dynamic Libraries Are Loaded into Memory
When an application asks the operating system to load a specific library (assuming that the library is not yet in memory), the OS loader first locates the library file. It then allocates memory with attributes that permit code execution and reads the library from the file into the allocated buffer. Before library code can be executed, the memory references within it must be relocated to match the actual base address of the library. Relocation is not needed when the library is loaded at its default base address. After this initialization, the binary code of the library can be executed. Normally, a standard initialization procedure within the library is called first (DllMain for DLL files).
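The relocation step can be illustrated with a deliberately simplified model. The sketch below is not the real loader and does not follow the actual DLL file layout: the `Image` struct, its fields, and the `relocate` function are hypothetical names, and the image is modeled as an array of pointer-sized words with a list of positions that hold absolute addresses.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a library image (not the real file layout).
struct Image {
    std::uintptr_t preferredBase;        // base address assumed at link time
    std::vector<std::uintptr_t> words;   // image contents as pointer-sized words
    std::vector<std::size_t> fixups;     // indices of words that hold absolute addresses
};

// Adjusts every absolute address by the distance between the actual and
// the preferred base address.
void relocate(Image& img, std::uintptr_t actualBase) {
    // Unsigned arithmetic wraps, so this also works when actualBase < preferredBase.
    const std::uintptr_t delta = actualBase - img.preferredBase;
    if (delta == 0) {
        return;                          // loaded at the preferred base: nothing to fix
    }
    for (std::size_t i = 0; i < img.fixups.size(); ++i) {
        img.words[img.fixups[i]] += delta;
    }
}
```

Words that are not listed in `fixups` (plain data, relative jumps) are left untouched, which is why an image loaded at its preferred base needs no work at all.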
How Functions Are Called From a DLL
If binary code needs to call a function located in a dynamic library, it first has to obtain the function address using a handle to the loaded library and the function name. This is done by calling the Win32 API function GetProcAddress. The OS uses the DLL export directory to determine the function address. The export directory is a table in the DLL meta-information that lists the names of all exported functions together with their entry addresses. These addresses are stored relative to the library base, so they are resolved against the actual base address once the DLL has been loaded.
GetProcAddress returns the entry address of the library function, and the calling code can use that address to call it. The caller must know the calling convention and the signature (function header) of the function.
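The lookup can be pictured with a small model. This is only a sketch of the idea, not the code behind GetProcAddress: `ExportEntry` and `findExport` are hypothetical names, the table is searched linearly for simplicity, and entries are assumed to hold offsets relative to the library base, as the DLL export directory does.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified export entry: a name and an offset from the library base.
struct ExportEntry {
    const char* name;
    std::uint32_t rva;   // relative virtual address of the function entry
};

// Returns base + offset for the named export, or 0 if the name is not exported.
std::uintptr_t findExport(std::uintptr_t base, const ExportEntry* table,
                          std::size_t count, const char* name) {
    for (std::size_t i = 0; i < count; ++i) {
        if (std::strcmp(table[i].name, name) == 0) {
            return base + table[i].rva;
        }
    }
    return 0;
}
```

Because the table stores offsets rather than absolute addresses, the same table yields correct results wherever the library ends up in memory.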
Exploiting the Export Directory to Load Functions Dynamically
How can we use the load/call mechanism described above to implement lazy loading at the function level?
One possible solution is to create a proxy DLL with a single loader function. The sole task of this function is to allocate memory, read the payload DLL file, copy the binary code of the requested function into the allocated buffer, and then pass execution to the loaded function with all parameters in place. The export entry of the proxy DLL should then be updated to the entry address of the function code in the allocated buffer, so that later calls reach the loaded function directly.
If the proxy DLL contains all export entries of the payload DLL, each initially pointing to the single loader function, the calling code can use the proxy DLL without any changes, simply by referencing the proxy DLL file instead of the payload DLL file.
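The load-on-first-call idea can be sketched in portable C++ without touching real DLL structures. Everything here is an assumption made for illustration: `g_exports` stands in for the proxy's export table, `loadFunction` stands in for reading code from the payload DLL file, and one small stub per slot is used instead of a single loader function so that the stub knows which function was requested.

```cpp
#include <cstddef>

typedef int (*BinaryFn)(int, int);

// Stand-ins for payload functions. In the scheme described in the article, their
// code would be read from the payload DLL file into executable memory.
static int payloadAdd(int a, int b) { return a + b; }
static int payloadMul(int a, int b) { return a * b; }

static int g_loadCount = 0;   // how many times the loader actually ran

// Hypothetical loader: "loads" the function for a slot and returns its entry address.
static BinaryFn loadFunction(std::size_t slot) {
    ++g_loadCount;
    return slot == 0 ? payloadAdd : payloadMul;
}

template <std::size_t Slot> int lazyStub(int a, int b);

// Model of the proxy's export table: every entry initially points to a loader stub.
static BinaryFn g_exports[2] = { lazyStub<0>, lazyStub<1> };

// First call through a slot: load the function, patch the table entry so that
// later calls bypass the stub, then forward the original arguments.
template <std::size_t Slot>
int lazyStub(int a, int b) {
    g_exports[Slot] = loadFunction(Slot);
    return g_exports[Slot](a, b);
}
```

Calling `g_exports[0](2, 3)` twice runs the loader only once, because the first call replaces the table entry with the address of the loaded function. Resetting an entry back to its stub would model freeing a function so that it is reloaded on the next use.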
You can use this tool to help you generate a DEF file for the proxy DLL you want to create.
Properties of the Described Technique
- If an exported function depends on another exported function or on a non-exported one, the dependency must be taken into account and the additional functions must be loaded as well. Maintaining a function-level dependency graph is imperative.
- Modern operating systems such as Windows 7 can prevent the execution of memory that is not marked as executable (Data Execution Prevention), as well as writing to pages that contain code. Further consideration is required to make this technique comply with these protections.
- Antivirus software may report false positives when this technique is used.
- For large DLLs with a large number of relatively independent functions, this technique can save memory, because only the functions that are actually needed are loaded.
- The loaded functions can be freed by a garbage collector to save memory if they have not been needed for a specific time, or if the system runs low on memory.
- The loader code introduces a performance penalty the first time each function is called.
- In small DLLs with few functions, or in larger DLLs with highly interdependent functions, the technique brings no advantage. It only makes sense for DLLs whose exported functions are relatively independent.
- In the appropriate cases, the technique delivers a smaller memory footprint at the cost of longer execution time.
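The dependency requirement from the list above can be made concrete with a short sketch. It assumes the dependency graph is already available as a map from each function name to the functions it calls directly. How that graph is obtained is outside the scope of this example, and `DependencyGraph` and `collectDependencies` are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical dependency graph: function name -> functions it calls directly.
typedef std::map<std::string, std::vector<std::string> > DependencyGraph;

// Collects `name` and everything it depends on, directly or indirectly.
// The resulting set is what the loader would have to bring into memory.
void collectDependencies(const DependencyGraph& graph, const std::string& name,
                         std::set<std::string>& toLoad) {
    if (!toLoad.insert(name).second) {
        return;                          // already visited (this also stops cycles)
    }
    DependencyGraph::const_iterator it = graph.find(name);
    if (it == graph.end()) {
        return;                          // no recorded dependencies
    }
    for (std::size_t i = 0; i < it->second.size(); ++i) {
        collectDependencies(graph, it->second[i], toLoad);
    }
}
```

If the collected set for most exports covers a large part of the library, the functions are too interdependent for function-level loading to pay off.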
Another possible application of the described technique is to let the proxy DLL load the whole payload DLL and then redirect each function call to it, executing additional code on every call and return.
This can be used for debugging, gathering statistics, tracing parameter and return values, applying transformations to parameter and return values, monitoring, etc.
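A forwarding export of this kind might look like the following sketch. The names `tracedSub`, `g_target`, and `CallStats` are hypothetical, and `payloadSub` is a local stand-in: in a real proxy the target address would presumably be obtained from the loaded payload DLL rather than defined in the same file.

```cpp
#include <cstddef>
#include <cstdio>

typedef int (*TracedFn)(int, int);

// Hypothetical per-function statistics gathered by the proxy.
struct CallStats {
    std::size_t calls;
    int lastResult;
};

// Stand-in for a function exported by the payload DLL.
static int payloadSub(int a, int b) { return a - b; }

static TracedFn g_target = payloadSub;   // address of the real implementation
static CallStats g_stats = { 0, 0 };

// Proxy export with the same signature as the payload function.
int tracedSub(int a, int b) {
    std::printf("-> Sub(%d, %d)\n", a, b);   // code executed before the call
    const int result = g_target(a, b);       // forward to the payload
    ++g_stats.calls;                         // code executed after the return
    g_stats.lastResult = result;
    std::printf("<- Sub = %d\n", result);
    return result;
}
```

The caller sees an ordinary function with the expected signature, while the proxy is free to log, count, or transform the parameters and the return value.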
An example Visual C++ 2008 solution is attached to this article. The code is meant for illustration only, so error handling and generality are intentionally neglected to keep it simple and avoid distracting the reader. Readers are encouraged to develop more robust solutions and provide feedback.