Introduction
A dynamic-link library (DLL) is a module that contains functions and data that can be used by another module (application or DLL). In Linux/UNIX, the same concept is implemented in shared object (.so) files. From now on, I use the term shared libraries to refer to DLL and SO files.
Advantages of using shared libraries are:
- Create modular applications
- Reduce memory overhead when several applications use the same functionality at the same time, because although each application gets its own copy of the data, they can share the code.
This article will address the following topics:
- Fundamentals of shared libraries
- Differences between DLL and SO
- How to create and use shared libraries on both Windows and Linux
- How to write platform-independent code: keep the same source files and compile them for different platforms
Creating, Linking and Compiling the DLL or SO:
- Write code for the DLL or SO. Identify the functions or variables that are to be available for the calling process.
- Compile the source code into an object file.
- Link that object file into either a DLL or SO.
Accessing the DLL or SO from a Calling Process:
- Load the DLL or SO.
- Get a pointer to the exported function or variable.
- Utilize the exported function or variable.
- Close the library.
There are many differences in the way shared libraries are created, exported and used.
A DLL can define two kinds of functions: exported and internal. The exported functions are intended to be called by other modules, as well as from within the DLL where they are defined. Internal functions are typically intended to be called only from within the DLL where they are defined. Although a DLL can export data, its data is generally used only by its functions. However, there is nothing to prevent another module from reading or writing that address.
But in the case of Linux/Unix, no special export statement needs to be added to the code to indicate exportable symbols, since all symbols are available to an interrogating process (the process which loads the SO/DLL).
| Operation | Unix/Linux | Windows |
|---|---|---|
| Export symbols in src file | No export symbol required. | __declspec( dllexport ) |
| Header file | #include <dlfcn.h> | #include <windows.h> |
| Loading the shared library | void* dlopen( const char *pathname, int mode ); | HINSTANCE LoadLibrary( LPCTSTR lpLibFileName ); |
| Runtime access of functions | void* dlsym( void* handle, const char *name); | GetProcAddress( HMODULE hModule, LPCSTR lpProcName); |
| Closing the shared library | int dlclose( void *handle ); | BOOL FreeLibrary( HMODULE hLibModule ); |
Read further for more information on the above functions.
Creating, Compiling and Linking the DLL or SO
All UNIX object files are candidates for inclusion into a shared object library. No special export statements need to be added to the code to indicate exportable symbols, since all symbols are available to an interrogating process (the process which loads the SO/DLL).
In Windows NT, however, only the specified symbols will be exported (i.e., available to an interrogating process). Exportable objects are indicated by including the keyword '__declspec(dllexport)'. The following examples demonstrate how to export variables and functions.
__declspec( dllexport ) void MyExportFunction();
__declspec (dllexport) int MyExportVariable;
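A library meant to build on both platforms usually hides the export keyword behind a macro. The sketch below assumes a hypothetical macro name, MY_API, and defines the fnAdd function that the sample program later in this article looks up by name. The extern "C" linkage is an assumption added here so that the exported name is the unmangled fnAdd.

```cpp
// add.cpp: a minimal sketch of a library source file.
// MY_API is a hypothetical macro name, not part of any system header.
#if defined(_WIN32)
  #define MY_API __declspec(dllexport)   // Windows: export explicitly
#else
  #define MY_API                         // Unix/Linux: exported by default
#endif

// extern "C" turns off C++ name mangling, so the symbol can be found
// at run time under the plain name "fnAdd".
extern "C" MY_API int fnAdd(int a, int b)
{
    return a + b;
}
```

Without extern "C", a C++ compiler would export a decorated name, and a run-time lookup of "fnAdd" would fail.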
Both DLL and SO files are linked from compiled object files.
In Windows, most of the IDEs automatically help you compile and link the DLL.
Under UNIX, object code is linked into a shared library with the '-shared' option, which the compiler driver passes on to the linker executable 'ld'. For example, the following makefile creates the SO file add.so from add.cpp.
CC = g++
# -c compiles without linking; -fPIC produces position-independent code
CFLAGS = -c -fPIC
add.so : add.o
	$(CC) add.o -shared -o add.so
add.o : add.cpp
	$(CC) $(CFLAGS) add.cpp
Accessing the DLL or SO
To use the shared objects in UNIX, the include directive '#include <dlfcn.h>' must be used. Under Windows, the include directive '#include <windows.h>' must be used.
In Unix, loading the SO file is accomplished with the function dlopen(). The function prototype is:
void* dlopen( const char *pathname, int mode )
The argument pathname is either the absolute or relative (from the current directory) path and filename of the .SO file to load. The argument mode is either the symbol RTLD_LAZY or RTLD_NOW. RTLD_LAZY will locate symbols in the file given by pathname as they are referenced, while RTLD_NOW will locate all symbols before returning. The function dlopen() returns a handle to the opened library, or NULL if there is an error.
#define RTLD_LAZY 1
#define RTLD_NOW 2
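As a hedged illustration of the call described above, the sketch below wraps dlopen() in a hypothetical helper, openLibrary, that prints the dlerror() message on failure. The helper name is an assumption; dlopen(), dlerror() and RTLD_NOW are the standard <dlfcn.h> interfaces. On some systems the program must also be linked with -ldl.

```cpp
#include <dlfcn.h>
#include <cstdio>

// openLibrary is a hypothetical helper name used only for this sketch.
// Returns the handle from dlopen(), or nullptr after printing the reason.
void* openLibrary(const char* pathname)
{
    // RTLD_NOW resolves all symbols before dlopen() returns, so an
    // unresolved symbol is reported here rather than at first use.
    void* handle = dlopen(pathname, RTLD_NOW);
    if (handle == nullptr) {
        const char* reason = dlerror();   // human-readable error text
        std::fprintf(stderr, "dlopen failed: %s\n",
                     reason ? reason : "unknown error");
    }
    return handle;
}
```

Passing a null pathname asks dlopen() for a handle to the main program itself, which is a convenient way to try the helper without building a library first.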
Under Windows, the function to load a library is given by:
HINSTANCE LoadLibrary( LPCTSTR lpLibFileName );
In this case, lpLibFileName holds the filename of an executable module. This function returns a handle to the DLL (of type HINSTANCE), or NULL if there is an error.
Under UNIX, the shared object will be searched for in the following places:
- In the directory specified by the pathname argument to dlopen() if it is not a simple file name (i.e. it contains a '/' character). In this case, the exact file is the only place searched; steps two through four below are ignored.
- In any path specified via the -rpath argument to ld(1) when the executable was statically linked.
- In any directory specified by the environment variable LD_LIBRARY_PATH. If LD_LIBRARY_PATH is not set, 64-bit programs will also examine the variable LD_LIBRARY64_PATH, and new 32-bit ABI programs will examine the variable LD_LIBRARYN32_PATH to determine if an ABI-specific path has been specified. All three of these variables will be ignored if the process is running setuid or setgid.
- The default search paths will be used. These are /usr/lib:/lib for 32-bit programs, /usr/lib64:/lib64 for 64-bit programs, and /usr/lib32:/lib32 for new 32-bit ABI programs.
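The first rule above hinges on whether the name passed to dlopen() contains a '/' character. The sketch below shows that check in isolation; usesSearchPath is a hypothetical helper written for this article, not a system function.

```cpp
#include <cstring>

// Hypothetical helper: returns true when dlopen() would consult the
// search path (rpath, LD_LIBRARY_PATH, default directories) because
// the name is a simple file name with no '/' in it.
bool usesSearchPath(const char* pathname)
{
    return std::strchr(pathname, '/') == nullptr;
}
```

Under this rule, "add.so" is looked up along the search path, while "./add.so" names exactly one file relative to the current directory.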
Under Windows, the DLL will be searched for in the following places:
- The directory from which the application loaded
- The current directory
- Windows 95 and Windows 98: The Windows system directory. Use the GetSystemDirectory function to get the path of this directory.
- Windows NT: The 32-bit Windows system directory. Use the GetSystemDirectory function to get the path of this directory. The name of this directory is SYSTEM32.
- Windows NT: The 16-bit Windows system directory. There is no function that obtains the path of this directory, but it is searched. The name of this directory is SYSTEM.
- The Windows directory. Use the GetWindowsDirectory function to get the path of this directory.
- The directories that are listed in the PATH environment variable.
Under Unix, symbols can be referenced from an SO once the library is loaded using dlopen(). The function dlsym() will return a pointer to a symbol in the library.
void* dlsym( void* handle, const char *name);
The handle argument is the handle to the library returned by dlopen(). The name argument is a string containing the name of the symbol. The function returns a pointer to the symbol if it is found and NULL if not or if there is an error.
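A NULL result from dlsym() does not by itself prove an error, because a symbol's value can legitimately be NULL. A common pattern, sketched below under the assumption of a POSIX system, is to clear the error state with dlerror() before the call and inspect it afterwards. findSymbol is a hypothetical helper name.

```cpp
#include <dlfcn.h>
#include <cstdio>

// findSymbol is a hypothetical helper name used only for this sketch.
// Returns the symbol's address, or nullptr if the lookup failed.
void* findSymbol(void* handle, const char* name)
{
    dlerror();                              // clear any stale error
    void* address = dlsym(handle, name);
    const char* error = dlerror();          // non-null only on failure
    if (error != nullptr) {
        std::fprintf(stderr, "dlsym failed: %s\n", error);
        return nullptr;
    }
    return address;
}
```

The returned pointer still has to be cast to the correct function or variable type before use; the cast is the caller's responsibility, since the library carries no type information.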
Under Windows, the functions can be accessed with a call to GetProcAddress().
FARPROC GetProcAddress( HMODULE hModule, LPCSTR lpProcName);
The argument hModule is the handle to the module returned from LoadLibrary(). The argument lpProcName is the string containing the name of the function. This procedure returns the function pointer to the procedure if successful, else it returns NULL.
Closing the library is accomplished in Unix using the function dlclose, and in Windows using the function FreeLibrary. Note that the two functions use opposite return conventions: dlclose returns 0 on success and a non-zero value on error, while FreeLibrary returns a non-zero value on success and 0 on error.
In Unix, the library is closed with a call to dlclose.
int dlclose( void *handle );
The argument handle is the handle to the opened SO file (the handle returned by dlopen). This function returns 0 if successful, a non-zero value if not successful.
In Windows NT, the library is closed using the function FreeLibrary.
BOOL FreeLibrary( HMODULE hLibModule );
The argument hLibModule is the handle to the loaded DLL library module. This function returns a non-zero value if the library closes successfully, and a 0 if there is an error.
Coding for Multiple Platforms
Most large applications make many calls to API functions that are specific to the operating system, which makes the application platform dependent. The ideal situation, as shown in the figure below, is source code that compiles on every platform without modification. This can be achieved by routing the operating-system-specific calls through a common function, which in turn calls the appropriate operating system function for the platform.
One solution for creating platform-independent code is a header file that handles all platform-dependent calls. Based on the compiler (or operating system), the same code will generate applications for different platforms.
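Compiler macros and operating-system macros are not the same thing: GCC, for instance, also runs on Windows. A hedged alternative is to test the operating-system macro _WIN32, as in the sketch below. The function names sharedLibrarySuffix and libraryFileName are hypothetical.

```cpp
#include <string>

// Hypothetical helper: file-name suffix for shared libraries on the
// platform this translation unit is being compiled for.
inline const char* sharedLibrarySuffix()
{
#if defined(_WIN32)
    return ".dll";   // _WIN32 is defined by compilers targeting Windows
#else
    return ".so";    // assumption: every other target is Unix-like
#endif
}

// Hypothetical helper: builds "add.dll" or "add.so" from "add".
inline std::string libraryFileName(const std::string& baseName)
{
    return baseName + sharedLibrarySuffix();
}
```

Keeping the platform test inside one small function means the rest of the program never mentions a platform macro.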
The main functions that differ between Windows and Linux are:
- Functions related to shared libraries, e.g. LoadLibrary/dlopen, …
- Functions related to the creation and use of threads and forks.
………
The sample given below demonstrates a simple example of such a header file, which handles platform specific calls. In this example, the compiler is being checked to differentiate different platforms.
Advantages
- The same result could be achieved with preprocessor directives around each and every OS-specific call, but that would make the code ugly and hard to read. The header keeps those directives in one place.
- Once such code is written, using it is simple and easy. This applies especially when the code base is large.
- Extending to a new platform or modifying calls is very simple.
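#ifndef os_call_h
#define os_call_h
#include <string>
#if defined(_MSC_VER)
#include <windows.h>
#elif defined(__GNUC__)
#include <dlfcn.h>
#else
#error define your compiler
#endif
// Loads "<name>.dll" on Windows or "<name>.so" on Unix/Linux.
// iMode is passed to dlopen() (2 is RTLD_NOW on Linux); Windows ignores it.
inline void* LoadSharedLibrary(const char *pcDllname, int iMode = 2)
{
    std::string sDllName = pcDllname;
#if defined(_MSC_VER)
    sDllName += ".dll";
    // LoadLibraryA takes a narrow string even in UNICODE builds.
    return (void*)LoadLibraryA(sDllName.c_str());
#elif defined(__GNUC__)
    sDllName += ".so";
    return dlopen(sDllName.c_str(), iMode);
#endif
}
// Returns the address of an exported symbol, or NULL if it is not found.
inline void *GetFunction(void *Lib, const char *Fnname)
{
#if defined(_MSC_VER)
    return (void*)GetProcAddress((HINSTANCE)Lib, Fnname);
#elif defined(__GNUC__)
    return dlsym(Lib, Fnname);
#endif
}
// Returns true on success on both platforms.
inline bool FreeSharedLibrary(void *hDLL)
{
#if defined(_MSC_VER)
    return FreeLibrary((HINSTANCE)hDLL) != 0;   // non-zero means success
#elif defined(__GNUC__)
    return dlclose(hDLL) == 0;                  // zero means success
#endif
}
#endif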
#include "os_call.h"
#include <iostream>
using namespace std;
typedef int (*AddFnPtr)(int,int);
int main(int argc, char* argv[])
{
    AddFnPtr AddFn;
    void *hDLL;
    hDLL = LoadSharedLibrary("add");
    if(hDLL == 0)
        return 1;
    AddFn = (AddFnPtr)GetFunction(hDLL,"fnAdd");
    if(AddFn == 0)
    {
        // The symbol was not found; release the library before exiting.
        FreeSharedLibrary(hDLL);
        return 1;
    }
    int iTmp = AddFn(8,5);
    cout<<"8 + 5 = "<<iTmp<<endl;
    FreeSharedLibrary(hDLL);
    return 0;
}
Another problem we face when coding for multiple platforms is byte ordering, the so-called "big endian versus little endian" issue. It arises because machines use different byte orders to store multi-byte values. Some machines store an object in memory ordered from least significant byte to most, while other machines store it from most to least. The former convention, where the least significant byte comes first, is referred to as little endian. Most machines follow this convention. The latter convention, where the most significant byte comes first, is referred to as big endian. This convention is followed by most machines from IBM, Motorola, and Sun Microsystems. A simple example illustrates the problem. Say you want to store a 4-byte integer, 0x12345678, in memory addresses 0x5000 through 0x5003. The data will be arranged in memory as shown below.
Big Endian
| Address | 0x5000 | 0x5001 | 0x5002 | 0x5003 |
|---|---|---|---|---|
| Value | 12 | 34 | 56 | 78 |
Little Endian
| Address | 0x5000 | 0x5001 | 0x5002 | 0x5003 |
|---|---|---|---|---|
| Value | 78 | 56 | 34 | 12 |
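The layout on the machine at hand can be inspected by copying the integer into a byte array. The sketch below is an illustration with hypothetical function names; it uses std::memcpy to read the raw representation of the value.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the byte stored at the lowest address
// of a 32-bit value on the host machine.
unsigned char firstByteInMemory(std::uint32_t value)
{
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);   // copy raw representation
    return bytes[0];
}

// Hypothetical helper: true if the least significant byte comes first.
bool hostIsLittleEndian()
{
    return firstByteInMemory(0x12345678u) == 0x78;
}
```

For 0x12345678, a little-endian host yields 0x78 as the first byte and a big-endian host yields 0x12.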
This issue will become critical if we store data in external binary files. If two different platforms use the same binary data file, the data retrieved from the file will be completely different. Keep this point in mind when you are targeting many platforms.
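One common way around the file problem, offered here as a sketch rather than as this article's own method, is to fix the byte order of the file format and convert with shifts, which behave identically on every host. The function names below are hypothetical, and little endian is chosen arbitrarily as the file order.

```cpp
#include <cstdint>

// Hypothetical helper: stores value into out[0..3], least significant
// byte first, regardless of the host's own byte order.
void storeLittleEndian32(std::uint32_t value, unsigned char* out)
{
    out[0] = static_cast<unsigned char>(value & 0xFF);
    out[1] = static_cast<unsigned char>((value >> 8) & 0xFF);
    out[2] = static_cast<unsigned char>((value >> 16) & 0xFF);
    out[3] = static_cast<unsigned char>((value >> 24) & 0xFF);
}

// Hypothetical helper: the inverse of storeLittleEndian32.
std::uint32_t loadLittleEndian32(const unsigned char* in)
{
    return  static_cast<std::uint32_t>(in[0])
         | (static_cast<std::uint32_t>(in[1]) << 8)
         | (static_cast<std::uint32_t>(in[2]) << 16)
         | (static_cast<std::uint32_t>(in[3]) << 24);
}
```

Writing the four bytes produced by storeLittleEndian32 to the file, and reading them back through loadLittleEndian32, gives the same integer on both kinds of machine.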
With increasing usage of Linux, it will be good always to target multiple platforms when we write code. If we can keep the same source code and just compile for a different platform after coding, we can reduce the time spent on porting from one platform to another.