Inside the executable: an introduction to the Portable Executable format for VB programmers

6 May 2003
Describes the layout of a Windows executable file and how to read it.

Introduction

The Portable Executable (PE) format is the data structure that describes how the various parts of a Win32 executable file are held together. It allows the operating system to load the executable, to locate the dynamically linked libraries that the executable requires, and to navigate the code, data and resource sections compiled into it.

Getting over DOS

The PE format was created for Windows, but Microsoft had to make sure that running such an executable in DOS would yield a meaningful error message and exit. To this end, the very first part of a Windows executable file is actually a DOS executable (sometimes known as the stub) which writes "This program requires Windows" or similar, then exits.

The stub begins with a 64-byte header in this format:

VB
Private Type IMAGE_DOS_HEADER
    e_magic As Integer   ''\\ Magic number
    e_cblp As Integer    ''\\ Bytes on last page of file
    e_cp As Integer      ''\\ Pages in file
    e_crlc As Integer    ''\\ Relocations
    e_cparhdr As Integer ''\\ Size of header in paragraphs
    e_minalloc As Integer ''\\ Minimum extra paragraphs needed
    e_maxalloc As Integer ''\\ Maximum extra paragraphs needed
    e_ss As Integer    ''\\ Initial (relative) SS value
    e_sp As Integer    ''\\ Initial SP value
    e_csum As Integer  ''\\ Checksum
    e_ip As Integer  ''\\ Initial IP value
    e_cs As Integer  ''\\ Initial (relative) CS value
    e_lfarlc As Integer ''\\ File address of relocation table
    e_ovno As Integer ''\\ Overlay number
    e_res(0 To 3) As Integer ''\\ Reserved words
    e_oemid As Integer ''\\ OEM identifier (for e_oeminfo)
    e_oeminfo As Integer ''\\ OEM information; e_oemid specific
    e_res2(0 To 9) As Integer ''\\ Reserved words
    e_lfanew As Long ''\\ File address of new exe header
End Type

The only field of this structure that is of interest to Windows is e_lfanew, which holds the file offset of the new Windows executable header. To skip over the DOS part of the program, set the file pointer to the value held in this field:

VB
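Private Sub SkipDOSStub(ByVal hfile As Long)

    Dim BytesRead As Long
    Dim stub As IMAGE_DOS_HEADER

    '\\ Go to start of file...
    Call SetFilePointer(hfile, 0, 0, FILE_BEGIN)
    If Err.LastDllError Then
        Debug.Print LastSystemError
    End If

    '\\ Read the DOS header and check for its "MZ" magic number
    Call ReadFileLong(hfile, VarPtr(stub), Len(stub), BytesRead, ByVal 0&)
    If BytesRead < Len(stub) Or stub.e_magic <> IMAGE_DOS_SIGNATURE Then
        Debug.Print "Not a DOS or Windows executable"
        Exit Sub
    End If

    '\\ ...then move to the start of the new executable header
    Call SetFilePointer(hfile, stub.e_lfanew, 0, FILE_BEGIN)

End Sub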
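
The same step can be tried outside VB. The following is a minimal sketch in Python (used here only because it runs without the API declarations that the VB code relies on); the function name read_e_lfanew and the sample bytes are made up for illustration. It relies on the layout shown above: 30 two-byte Integers followed by one four-byte Long, which puts e_lfanew at offset 60.

```python
import struct

DOS_HEADER_SIZE = 64          # 30 Integers (2 bytes each) + 1 Long (4 bytes)
IMAGE_DOS_SIGNATURE = 0x5A4D  # "MZ" read as a little-endian 16-bit value
E_LFANEW_OFFSET = 60          # e_lfanew is the last field of the header


def read_e_lfanew(data):
    """Return the file offset of the new header, taken from the DOS header."""
    if len(data) < DOS_HEADER_SIZE:
        raise ValueError("too short to contain a DOS header")
    (e_magic,) = struct.unpack_from("<H", data, 0)
    if e_magic != IMAGE_DOS_SIGNATURE:
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<l", data, E_LFANEW_OFFSET)
    return e_lfanew


# Hypothetical 64-byte header: "MZ", 58 unused bytes, then e_lfanew = 0x80
sample = b"MZ" + bytes(58) + struct.pack("<l", 0x80)
print(hex(read_e_lfanew(sample)))  # prints 0x80
```

For a real file, the bytes would come from reading the first 64 bytes of the executable opened in binary mode.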

The NT header

The NT header holds the information needed by the Windows program loader to load the program. It consists of the PE file signature followed by IMAGE_FILE_HEADER and IMAGE_OPTIONAL_HEADER records.

For applications designed to run under Windows (i.e. not OS/2 or VxD files) the four bytes of the PE file signature, read as a Long, should equal &H4550 (the characters "PE" followed by two zero bytes). The other defined signatures are:

VB
Public Enum ImageSignatureTypes
    IMAGE_DOS_SIGNATURE = &H5A4D     ''\\ MZ
    IMAGE_OS2_SIGNATURE = &H454E     ''\\ NE
    IMAGE_OS2_SIGNATURE_LE = &H454C  ''\\ LE
    IMAGE_VXD_SIGNATURE = &H454C     ''\\ LE
    IMAGE_NT_SIGNATURE = &H4550      ''\\ PE00
End Enum

Following the PE file signature is the IMAGE_FILE_HEADER structure that stores information about the target environment of the executable. The structure is:

VB
Private Type IMAGE_FILE_HEADER
    Machine As Integer
    NumberOfSections As Integer
    TimeDateStamp As Long
    PointerToSymbolTable As Long
    NumberOfSymbols As Long
    SizeOfOptionalHeader As Integer
    Characteristics As Integer
End Type
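
As an illustration of how the signature and this structure sit in the file, here is a small Python sketch (not part of the original VB source). The names read_file_header and FileHeader are hypothetical, and the sample values are invented; the field order and sizes follow the IMAGE_FILE_HEADER declaration above, which occupies 20 bytes.

```python
import struct
from collections import namedtuple

IMAGE_NT_SIGNATURE = 0x4550      # "PE\0\0" read as a little-endian 32-bit value
FILE_HEADER_FORMAT = "<HHLLLHH"  # mirrors IMAGE_FILE_HEADER: 20 bytes

FileHeader = namedtuple(
    "FileHeader",
    "Machine NumberOfSections TimeDateStamp PointerToSymbolTable "
    "NumberOfSymbols SizeOfOptionalHeader Characteristics",
)


def read_file_header(data, e_lfanew):
    """Check the PE signature at e_lfanew and unpack the header after it."""
    (signature,) = struct.unpack_from("<L", data, e_lfanew)
    if signature != IMAGE_NT_SIGNATURE:
        raise ValueError("not a PE file: signature is %#x" % signature)
    fields = struct.unpack_from(FILE_HEADER_FORMAT, data, e_lfanew + 4)
    return FileHeader._make(fields)


# Hypothetical values: an Intel 386 image with 3 sections
sample = b"PE\0\0" + struct.pack(FILE_HEADER_FORMAT, 0x14C, 3, 0, 0, 0, 224, 0x010F)
header = read_file_header(sample, 0)
print(hex(header.Machine), header.NumberOfSections, header.SizeOfOptionalHeader)
```

The offset passed as e_lfanew is the value read from the DOS header; it is 0 here only because the sample starts directly at the signature.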

The Machine member describes what target CPU the executable was compiled for. It can be one of:

VB
Public Enum ImageMachineTypes
    IMAGE_FILE_MACHINE_I386 = &H14C   ''\\ Intel 386.
    ''\\ MIPS little-endian (&H160 is big-endian)
    IMAGE_FILE_MACHINE_R3000 = &H162
    IMAGE_FILE_MACHINE_R4000 = &H166  ''\\ MIPS little-endian
    IMAGE_FILE_MACHINE_R10000 = &H168  ''\\ MIPS little-endian
    IMAGE_FILE_MACHINE_WCEMIPSV2 = &H169  ''\\ MIPS little-endian WCE v2
    IMAGE_FILE_MACHINE_ALPHA = &H184      ''\\ Alpha_AXP
    IMAGE_FILE_MACHINE_POWERPC = &H1F0    ''\\ IBM PowerPC Little-Endian
    IMAGE_FILE_MACHINE_SH3 = &H1A2   ''\\ SH3 little-endian
    IMAGE_FILE_MACHINE_SH3E = &H1A4  ''\\ SH3E little-endian
    IMAGE_FILE_MACHINE_SH4 = &H1A6   ''\\ SH4 little-endian
    IMAGE_FILE_MACHINE_ARM = &H1C0   ''\\ ARM Little-Endian
    IMAGE_FILE_MACHINE_IA64 = &H200  ''\\ Intel Itanium (IA-64)
End Enum

The SizeOfOptionalHeader member indicates the size (in bytes) of the IMAGE_OPTIONAL_HEADER structure that immediately follows the file header. The name is a bit of a misnomer: the structure is optional only in object files, and every executable image has one. It begins with the following fields:

VB
Private Type IMAGE_OPTIONAL_HEADER
    Magic As Integer
    MajorLinkerVersion As Byte
    MinorLinkerVersion As Byte
    SizeOfCode As Long
    SizeOfInitializedData As Long
    SizeOfUninitializedData As Long
    AddressOfEntryPoint As Long
    BaseOfCode As Long
    BaseOfData As Long
End Type

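These are immediately followed by the NT-specific fields, which are declared here as a separate IMAGE_OPTIONAL_HEADER_NT structure (the Windows SDK headers declare both parts as a single structure):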

VB
Private Type IMAGE_OPTIONAL_HEADER_NT
    ImageBase As Long
    SectionAlignment As Long
    FileAlignment As Long
    MajorOperatingSystemVersion As Integer
    MinorOperatingSystemVersion As Integer
    MajorImageVersion As Integer
    MinorImageVersion As Integer
    MajorSubsystemVersion As Integer
    MinorSubsystemVersion As Integer
    Win32VersionValue As Long
    SizeOfImage As Long
    SizeOfHeaders As Long
    CheckSum As Long
    Subsystem As Integer
    DllCharacteristics As Integer
    SizeOfStackReserve As Long
    SizeOfStackCommit As Long
    SizeOfHeapReserve As Long
    SizeOfHeapCommit As Long
    LoaderFlags As Long
    NumberOfRvaAndSizes As Long
    DataDirectory(0 To 15) As IMAGE_DATA_DIRECTORY
End Type

The most useful fields of this structure (for my purposes, anyhow) are the 16 IMAGE_DATA_DIRECTORY entries. These describe where (if at all) the particular sections of the executable are located. The structure is defined thus:

VB
Private Type IMAGE_DATA_DIRECTORY
    VirtualAddress As Long
    Size As Long
End Type

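The directories are held in the following order. The enumeration names the first 14 of the 16 entries; entry 14 describes the COM+ (.NET) runtime header and entry 15 is reserved: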

VB
Public Enum ImageDataDirectoryIndexes
    IMAGE_DIRECTORY_ENTRY_EXPORT = 0  ''\\ Export Directory
    IMAGE_DIRECTORY_ENTRY_IMPORT = 1  ''\\ Import Directory
    IMAGE_DIRECTORY_ENTRY_RESOURCE = 2 ''\\ Resource Directory
    IMAGE_DIRECTORY_ENTRY_EXCEPTION = 3   ''\\ Exception Directory
    IMAGE_DIRECTORY_ENTRY_SECURITY = 4   ''\\ Security Directory
    IMAGE_DIRECTORY_ENTRY_BASERELOC = 5  ''\\ Base Relocation Table
    IMAGE_DIRECTORY_ENTRY_DEBUG = 6   ''\\ Debug Directory
    IMAGE_DIRECTORY_ENTRY_ARCHITECTURE = 7   ''\\ Architecture Specific Data
    IMAGE_DIRECTORY_ENTRY_GLOBALPTR = 8  ''\\ RVA of GP
    IMAGE_DIRECTORY_ENTRY_TLS = 9  ''\\ TLS Directory
    ''\\ Load Configuration Directory
    IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG = 10    
    ''\\ Bound Import Directory in headers
    IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT = 11   
    IMAGE_DIRECTORY_ENTRY_IAT = 12  ''\\ Import Address Table
    ''\\ Delay Load Import Descriptors
    IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT = 13   
End Enum

Note that if an executable does not contain one of the sections (as is often the case) there will be an IMAGE_DATA_DIRECTORY for it, but the address and size will both be zero.
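
To make the layout concrete, here is a short Python sketch (not from the original source) that pulls the directory entries out of an optional header and reports which ones are present. It assumes the 32-bit layout declared above, where the 28 bytes of IMAGE_OPTIONAL_HEADER and the first 68 bytes of IMAGE_OPTIONAL_HEADER_NT put NumberOfRvaAndSizes at offset 92 and the DataDirectory array at offset 96. The function names and sample values are invented for illustration.

```python
import struct

NUMBER_OF_RVA_AND_SIZES_OFFSET = 92  # last Long before the DataDirectory array
DATA_DIRECTORY_OFFSET = 96           # 28 + 68 bytes of preceding fields
DATA_DIRECTORY_SIZE = 8              # VirtualAddress (Long) + Size (Long)
MAX_DIRECTORIES = 16


def read_data_directories(optional_header):
    """Return a list of (VirtualAddress, Size) pairs, one per directory."""
    (count,) = struct.unpack_from(
        "<L", optional_header, NUMBER_OF_RVA_AND_SIZES_OFFSET)
    count = min(count, MAX_DIRECTORIES)
    return [
        struct.unpack_from(
            "<LL", optional_header,
            DATA_DIRECTORY_OFFSET + DATA_DIRECTORY_SIZE * index)
        for index in range(count)
    ]


def present_directories(directories):
    """Return the indexes of directories whose address and size are non-zero."""
    return [index for index, (rva, size) in enumerate(directories)
            if rva != 0 and size != 0]


# Hypothetical 224-byte optional header with only imports (1) and resources (2)
sample = bytearray(224)
struct.pack_into("<L", sample, NUMBER_OF_RVA_AND_SIZES_OFFSET, 16)
struct.pack_into("<LL", sample, DATA_DIRECTORY_OFFSET + 8 * 1, 0x2000, 0x50)
struct.pack_into("<LL", sample, DATA_DIRECTORY_OFFSET + 8 * 2, 0x3000, 0x400)
print(present_directories(read_data_directories(sample)))  # prints [1, 2]
```

The indexes returned correspond to the ImageDataDirectoryIndexes values listed above.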

The image data directories

The exports directory

The exports directory holds details of the functions exported by this executable. For example, the exports directory of MSVBVM50.dll lists all the functions that make up the Visual Basic 5 runtime environment.

This directory consists of some info to tell you how many exported functions there are, followed by the addresses of three arrays. The functions array holds the address of every exported function. The names array and the name-ordinals array run in parallel: entry n of the name-ordinals array is the index into the functions array of the function named by entry n of the names array, and adding Base to that index gives the function's ordinal. The structure is defined thus:

VB
Private Type IMAGE_EXPORT_DIRECTORY
    Characteristics As Long
    TimeDateStamp As Long
    MajorVersion As Integer
    MinorVersion As Integer
    lpName As Long
    Base As Long
    NumberOfFunctions As Long
    NumberOfNames As Long
    lpAddressOfFunctions As Long    '\\ Array of function addresses (LONG)
    lpAddressOfNames As Long        '\\ Array of name addresses (LONG)
    lpAddressOfNameOrdinals As Long '\\ Array of function indexes (INTEGER)
End Type

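And you can read this info from an executable loaded into a process being debugged thus (DebugProcess, image, fExport and the ...FromOutOfProcessPointer helpers are not shown here):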

VB
Private Sub ProcessExportTable(ExportDirectory As IMAGE_DATA_DIRECTORY)

Dim deThis As IMAGE_EXPORT_DIRECTORY
Dim lBytesWritten As Long
Dim lpAddress As Long

Dim nName As Long
Dim nIndex As Long

If ExportDirectory.VirtualAddress > 0 And ExportDirectory.Size > 0 Then
    '\\ Get the true address from the RVA
    lpAddress = AbsoluteAddress(ExportDirectory.VirtualAddress)
    '\\ Copy the image_export_directory structure...
    Call ReadProcessMemoryLong(DebugProcess.Handle, lpAddress, _
                   VarPtr(deThis), Len(deThis), lBytesWritten)
    With deThis
        If .lpName <> 0 Then
            image.Name = StringFromOutOfProcessPointer(DebugProcess.Handle, _
                   image.AbsoluteAddress(.lpName), 32, False)
        End If
        '\\ The names and name ordinals arrays hold NumberOfNames entries
        For nName = 0 To .NumberOfNames - 1
            '\\ RVA of this export's name
            lpAddress = LongFromOutOfprocessPointer(DebugProcess.Handle, _
                   image.AbsoluteAddress(.lpAddressOfNames) + (nName * 4))
            fExport.Name = StringFromOutOfProcessPointer(DebugProcess.Handle, _
                   image.AbsoluteAddress(lpAddress), 64, False)
            '\\ Index of this export in the functions array (unsigned)
            nIndex = IntegerFromOutOfprocessPointer(DebugProcess.Handle, _
                   image.AbsoluteAddress(.lpAddressOfNameOrdinals) + _
                   (nName * 2)) And &HFFFF&
            fExport.Ordinal = .Base + nIndex
            '\\ Look the address up by that index, not by name position
            fExport.ProcAddress = LongFromOutOfprocessPointer(DebugProcess.Handle, _
                   image.AbsoluteAddress(.lpAddressOfFunctions) + _
                   (nIndex * 4))
        Next nName
    End With
End If

End Sub
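
The lookup from name to function address goes through the name-ordinals array, which is easy to get wrong. The Python sketch below (not part of the original source) walks a small hand-built image to show the indirection. It assumes the image is held in a byte buffer in which an RVA can be used directly as an offset; the names walk_exports and read_cstring and all sample values are invented for illustration.

```python
import struct

EXPORT_DIRECTORY_FORMAT = "<LLHHLLLLLLL"  # mirrors IMAGE_EXPORT_DIRECTORY: 40 bytes


def read_cstring(image, rva):
    """Read a NUL-terminated ASCII string starting at the given offset."""
    end = image.index(b"\0", rva)
    return image[rva:end].decode("ascii")


def walk_exports(image, export_rva):
    """Return (name, ordinal, address) for every named export."""
    (_, _, _, _, _, base, _, number_of_names,
     functions_rva, names_rva, ordinals_rva) = struct.unpack_from(
        EXPORT_DIRECTORY_FORMAT, image, export_rva)
    exports = []
    for i in range(number_of_names):
        (name_rva,) = struct.unpack_from("<L", image, names_rva + 4 * i)
        # The name-ordinals entry is an index into the functions array
        (index,) = struct.unpack_from("<H", image, ordinals_rva + 2 * i)
        (address,) = struct.unpack_from("<L", image, functions_rva + 4 * index)
        exports.append((read_cstring(image, name_rva), base + index, address))
    return exports


# Hypothetical image: 3 exported functions, only 2 of them exported by name
image = bytearray(0x400)
struct.pack_into(EXPORT_DIRECTORY_FORMAT, image, 0x100,
                 0, 0, 0, 0, 0, 1, 3, 2, 0x200, 0x220, 0x240)
struct.pack_into("<LLL", image, 0x200, 0x1000, 0x1010, 0x1020)  # functions
struct.pack_into("<LL", image, 0x220, 0x300, 0x310)             # names
struct.pack_into("<HH", image, 0x240, 2, 0)                     # name ordinals
image[0x300:0x306] = b"Alpha\0"
image[0x310:0x315] = b"Beta\0"

for name, ordinal, address in walk_exports(image, 0x100):
    print(name, ordinal, hex(address))
```

In this sample "Alpha" is the first name but maps to the third function (ordinal 3), which is why the address must be looked up by index rather than by the position of the name.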

The imports directory

The imports directory lists the dynamic link libraries that this executable depends on and the functions it imports from each of them. It consists of an array of IMAGE_IMPORT_DESCRIPTOR structures, terminated by an instance of this structure in which the lpName member is zero. The structure is defined as:

VB
Private Type IMAGE_IMPORT_DESCRIPTOR
    ''\\ RVA of the table of imported names;
    ''\\ 0 for terminating null import descriptor
    lpImportByName As Long
    ''\\ 0 if not bound,
    ''\\ -1 if bound, and real date/time stamp
    ''\\ in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND),
    ''\\ otherwise date/time stamp of DLL bound to (old BIND)
    TimeDateStamp As Long
    ForwarderChain As Long ''\\ -1 if no forwarders
    lpName As Long         ''\\ RVA of the DLL's name
    ''\\ RVA to IAT (if bound this IAT has actual addresses)
    lpFirstThunk As Long
End Type

And you can walk the import directory thus:

VB
Private Sub ProcessImportTable(ImportDirectory As IMAGE_DATA_DIRECTORY)

Dim lpAddress As Long
Dim diThis As IMAGE_IMPORT_DESCRIPTOR
Dim byteswritten As Long
Dim sDllName As String
Dim sName As String
Dim lpNextName As Long
Dim lpNextThunk As Long

Dim lImportEntryIndex As Long

Dim nOrdinal As Integer
Dim lpFuncAddress As Long


'\\ If the image has an imports section...
If ImportDirectory.VirtualAddress > 0 And ImportDirectory.Size > 0 Then
    '\\ Get the true address from the RVA
    lpAddress = AbsoluteAddress(ImportDirectory.VirtualAddress)
    Call ReadProcessMemoryLong(DebugProcess.Handle, lpAddress, _
             VarPtr(diThis), Len(diThis), byteswritten)

    While diThis.lpName <> 0
        '\\ Process this import directory entry: the name of the DLL
        sDllName = StringFromOutOfProcessPointer(DebugProcess.Handle, _
             image.AbsoluteAddress(diThis.lpName), 32, False)

        '\\ Process the import file's functions list
        If diThis.lpImportByName <> 0 Then
            '\\ Each DLL's tables start again at entry zero
            lImportEntryIndex = 0
            lpNextName = LongFromOutOfprocessPointer(DebugProcess.Handle, _
                     image.AbsoluteAddress(diThis.lpImportByName))
            lpNextThunk = LongFromOutOfprocessPointer(DebugProcess.Handle, _
                     image.AbsoluteAddress(diThis.lpFirstThunk))
            '\\ Note: functions imported by ordinal (high bit of
            '\\ lpNextName set) have no name and are not handled here
            While (lpNextName <> 0) And (lpNextThunk <> 0)
                '\\ Once the loader has resolved the import, the thunk
                '\\ itself holds the function address
                lpFuncAddress = lpNextThunk
                '\\ lpNextName is the RVA of a two-byte ordinal hint...
                nOrdinal = IntegerFromOutOfprocessPointer(DebugProcess.Handle, _
                     image.AbsoluteAddress(lpNextName))
                '\\ ...which is followed by this function's name
                lpNextName = lpNextName + 2
                sName = StringFromOutOfProcessPointer(DebugProcess.Handle, _
                     image.AbsoluteAddress(lpNextName), 64, False)
                If Trim$(sName) <> "" Then
                    '\\ Get the next imported function...
                    lImportEntryIndex = lImportEntryIndex + 1

                    lpNextName = LongFromOutOfprocessPointer( _
                       DebugProcess.Handle, _
                       image.AbsoluteAddress(diThis.lpImportByName _
                       + (lImportEntryIndex * 4)))

                    lpNextThunk = LongFromOutOfprocessPointer( _
                       DebugProcess.Handle, _
                       image.AbsoluteAddress(diThis.lpFirstThunk _
                       + (lImportEntryIndex * 4)))
                Else
                    lpNextName = 0
                End If
            Wend
        End If

        '\\ And get the next one
        lpAddress = lpAddress + Len(diThis)
        Call ReadProcessMemoryLong(DebugProcess.Handle, lpAddress, _
                VarPtr(diThis), Len(diThis), byteswritten)
    Wend

End If

End Sub
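
The two nested walks (descriptors, then each descriptor's table of names) can be seen more plainly on a small synthetic image. The Python sketch below is not from the original source; it assumes the image is held in a byte buffer in which an RVA can be used directly as an offset, and the names walk_imports and read_cstring and the sample contents are invented. Unlike the VB routine, it also reports entries imported by ordinal, which are flagged by the high bit of the table entry.

```python
import struct

IMPORT_DESCRIPTOR_FORMAT = "<LLLLL"  # mirrors IMAGE_IMPORT_DESCRIPTOR: 20 bytes
IMPORT_DESCRIPTOR_SIZE = 20
ORDINAL_FLAG = 0x80000000            # high bit set: import by ordinal


def read_cstring(image, rva):
    """Read a NUL-terminated ASCII string starting at the given offset."""
    end = image.index(b"\0", rva)
    return image[rva:end].decode("ascii")


def walk_imports(image, import_rva):
    """Return {dll name: [imported function names]} for the whole directory."""
    imports = {}
    offset = import_rva
    while True:
        lookup_rva, _, _, name_rva, _ = struct.unpack_from(
            IMPORT_DESCRIPTOR_FORMAT, image, offset)
        if name_rva == 0:          # terminating null descriptor
            break
        functions = []
        entry_rva = lookup_rva
        while lookup_rva != 0:
            (entry,) = struct.unpack_from("<L", image, entry_rva)
            if entry == 0:         # end of this DLL's table
                break
            if entry & ORDINAL_FLAG:
                functions.append("#%d" % (entry & 0xFFFF))
            else:
                # Skip the two-byte ordinal hint that precedes the name
                functions.append(read_cstring(image, entry + 2))
            entry_rva += 4
        imports[read_cstring(image, name_rva)] = functions
        offset += IMPORT_DESCRIPTOR_SIZE
    return imports


# Hypothetical image: one DLL, one import by name and one by ordinal
image = bytearray(0x400)
struct.pack_into(IMPORT_DESCRIPTOR_FORMAT, image, 0x100, 0x200, 0, 0, 0x180, 0x240)
image[0x180:0x18D] = b"KERNEL32.dll\0"
struct.pack_into("<LLL", image, 0x200, 0x300, ORDINAL_FLAG | 16, 0)
struct.pack_into("<H", image, 0x300, 5)  # ordinal hint
image[0x302:0x308] = b"Sleep\0"

print(walk_imports(image, 0x100))
```

The zero-filled descriptor that follows the first one in the buffer acts as the terminator, just as the null descriptor does in a real imports directory.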

The resource directory

The structure of the resource directory is somewhat more involved. It consists of a root directory (defined by the structure IMAGE_RESOURCE_DIRECTORY) immediately followed by NumberOfNamedEntries + NumberOfIdEntries resource directory entries (defined by the structure IMAGE_RESOURCE_DIRECTORY_ENTRY). These are defined thus:

VB
Private Type IMAGE_RESOURCE_DIRECTORY
    Characteristics As Long '\\Seems to be always zero?
    TimeDateStamp As Long
    MajorVersion As Integer
    MinorVersion As Integer
    NumberOfNamedEntries As Integer
    NumberOfIdEntries As Integer
End Type

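Private Type IMAGE_RESOURCE_DIRECTORY_ENTRY
    dwName As Long        '\\ Integer ID, or (high bit set) offset of a name
    dwDataOffset As Long  '\\ High bit set: offset of a subdirectory
End Type

'\\ A leaf entry's dwDataOffset is the offset of one of these
Private Type IMAGE_RESOURCE_DATA_ENTRY
    OffsetToData As Long  '\\ RVA of the resource data
    Size As Long
    CodePage As Long
    Reserved As Long
End Type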

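Each resource directory entry can point either to the actual resource data or to another layer of resource directory entries. If the highest bit of dwDataOffset is set, the remaining 31 bits are the offset, from the start of the resource directory, of another directory. Otherwise dwDataOffset is the offset of a data entry that gives the location and size of the resource data.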
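
The high-bit test can be sketched as follows. This Python example is not part of the original source; it assumes the layout in the Windows SDK headers, where each directory entry is 8 bytes (a name/ID field followed by an offset field) and offsets are relative to the start of the resource directory. The function name read_directory_entries and the sample values are invented for illustration.

```python
import struct

RESOURCE_DIRECTORY_FORMAT = "<LLHHHH"  # mirrors IMAGE_RESOURCE_DIRECTORY: 16 bytes
RESOURCE_DIRECTORY_SIZE = 16
DIRECTORY_ENTRY_SIZE = 8               # name/ID (4 bytes) + offset (4 bytes)
HIGH_BIT = 0x80000000


def read_directory_entries(section, directory_offset):
    """Return (name_or_id, is_directory, offset) for each entry of a directory."""
    (_, _, _, _, named_entries, id_entries) = struct.unpack_from(
        RESOURCE_DIRECTORY_FORMAT, section, directory_offset)
    first_entry = directory_offset + RESOURCE_DIRECTORY_SIZE
    entries = []
    for i in range(named_entries + id_entries):
        name, data_offset = struct.unpack_from(
            "<LL", section, first_entry + DIRECTORY_ENTRY_SIZE * i)
        is_directory = bool(data_offset & HIGH_BIT)
        entries.append((name, is_directory, data_offset & ~HIGH_BIT))
    return entries


# Hypothetical root directory with two ID entries:
# ID 3 points to a subdirectory at 0x20, ID 16 points to a data entry at 0x48
section = bytearray(0x100)
struct.pack_into(RESOURCE_DIRECTORY_FORMAT, section, 0, 0, 0, 0, 0, 0, 2)
struct.pack_into("<LL", section, 16, 3, HIGH_BIT | 0x20)
struct.pack_into("<LL", section, 24, 16, 0x48)

for name, is_directory, offset in read_directory_entries(section, 0):
    print(name, "directory" if is_directory else "data", hex(offset))
```

Calling the function again with the offset of an entry marked as a directory descends one level, which is how the whole tree would be walked.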

How is this information useful?

Once you know how an executable is put together, you can use this information to peer into its workings. You can view the resources compiled into it, the DLLs it depends on and the actual functions it imports from them. More importantly, you can attach a debugger to the executable and track down any of those really troublesome general protection faults. The next article will describe how to attach a debugger and use the PE file format.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer
Ireland
C# / SQL Server developer
Microsoft MVP (Azure) 2017
Microsoft MVP (Visual Basic) 2006, 2007
