Understanding PE

What does it mean?

The Portable Executable (PE) format is the standard file format used for executables, object code, dynamic-link libraries (DLLs), and other binary files on both 32-bit and 64-bit versions of Windows, as well as in UEFI environments. It serves as the primary format for executable files on Windows NT-based systems, including file types such as .exe, .dll, .sys (system drivers), and .mui. Essentially, the PE format is a structured data container that provides the Windows loader with all the necessary information to correctly handle and execute the code. This includes references to dynamic libraries, import and export tables for APIs, resource data, and thread-local storage (TLS) details.

In short, a PE file consists of a series of headers followed by sections. The headers tell the Windows loader how to map the file into memory and prepare it to run.

Portable Executable (PE) Structure

DOS Header

The first part of the PE format is the IMAGE_DOS_HEADER (DOS Header), a 64-byte structure.

The most important members that we need to know from a malware development and analysis perspective are:

e_magic
e_lfanew

All the other members hold information that the DOS loader uses to calculate offsets, and they tell us very little about the file. The full structure is defined as follows:

typedef struct _IMAGE_DOS_HEADER
{
     WORD e_magic; // Magic number: 0x5A4D when read as a little-endian WORD, i.e. the bytes 4D 5A ("MZ") at the start of every valid MS-DOS executable
     WORD e_cblp;
     WORD e_cp;
     WORD e_crlc;
     WORD e_cparhdr;
     WORD e_minalloc;
     WORD e_maxalloc;
     WORD e_ss;
     WORD e_sp;
     WORD e_csum;
     WORD e_ip;
     WORD e_cs;
     WORD e_lfarlc;
     WORD e_ovno;
     WORD e_res[4];
     WORD e_oemid;
     WORD e_oeminfo;
     WORD e_res2[10];
     LONG e_lfanew; // File offset of the NT headers (IMAGE_NT_HEADERS)
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

Size of Types (on typical Windows platforms):

WORD = 2 bytes
LONG = 4 bytes

The IMAGE_DOS_HEADER is exactly 64 bytes because:

It contains only fixed-size data types (WORD and LONG)
No padding is required (the layout is tightly packed)
It was deliberately designed to be a compact and predictable header format at the beginning of executable files.

Add all of these up:

2 * 14 (first 14 WORD fields) = 28
+ 8 (e_res[4])
+ 2 (e_oemid)
+ 2 (e_oeminfo)
+ 20 (e_res2[10])
+ 4 (e_lfanew)
= 64 bytes
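
The same arithmetic can be checked by the compiler. The following is a minimal sketch that assumes a stand-in structure named DosHeader (a hypothetical name, not the winnt.h definition) built from fixed-width types so that it compiles without the Windows headers; dos_header_size is likewise an illustrative helper.

```c
#include <stddef.h>  /* offsetof */
#include <stdint.h>

/* Stand-in for IMAGE_DOS_HEADER using fixed-width types:
   WORD -> uint16_t (2 bytes), LONG -> int32_t (4 bytes). */
typedef struct {
    uint16_t e_magic;
    uint16_t e_cblp;
    uint16_t e_cp;
    uint16_t e_crlc;
    uint16_t e_cparhdr;
    uint16_t e_minalloc;
    uint16_t e_maxalloc;
    uint16_t e_ss;
    uint16_t e_sp;
    uint16_t e_csum;
    uint16_t e_ip;
    uint16_t e_cs;
    uint16_t e_lfarlc;
    uint16_t e_ovno;       /* 14 WORDs so far = 28 bytes */
    uint16_t e_res[4];     /* + 8  = 36 */
    uint16_t e_oemid;      /* + 2  = 38 */
    uint16_t e_oeminfo;    /* + 2  = 40 */
    uint16_t e_res2[10];   /* + 20 = 60 */
    int32_t  e_lfanew;     /* + 4  = 64 */
} DosHeader;

/* Compilation fails if the layout does not match the sum above. */
_Static_assert(sizeof(DosHeader) == 64, "DOS header must be 64 bytes");
_Static_assert(offsetof(DosHeader, e_lfanew) == 60, "e_lfanew must sit at offset 0x3C");

/* Runtime accessor, convenient for printing or testing. */
static size_t dos_header_size(void) {
    return sizeof(DosHeader);
}
```

Because e_lfanew starts at offset 60, which is a multiple of 4, the compiler has no reason to insert padding in front of it.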

The DOS Header is what makes the PE file an MS-DOS executable.
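
To make the role of e_magic and e_lfanew concrete, here is a small sketch that validates the magic number and extracts the offset of the NT headers from a raw byte buffer. The function name pe_read_dos_header and the helper readers are assumptions made for illustration; the code reads individual bytes so it does not depend on the Windows headers or on the host's endianness.

```c
#include <stddef.h>
#include <stdint.h>

#define DOS_HEADER_SIZE  64u
#define E_MAGIC_OFFSET   0u
#define E_LFANEW_OFFSET  60u      /* 0x3C, the last member of the header */
#define DOS_MAGIC        0x5A4Du  /* bytes 4D 5A ("MZ") as a little-endian WORD */

static uint16_t read_u16_le(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and stores e_lfanew in *nt_offset when buf begins with a
   plausible DOS header; returns 0 otherwise. */
static int pe_read_dos_header(const uint8_t *buf, size_t len, uint32_t *nt_offset) {
    if (buf == NULL || nt_offset == NULL || len < DOS_HEADER_SIZE)
        return 0;
    if (read_u16_le(buf + E_MAGIC_OFFSET) != DOS_MAGIC)
        return 0;

    uint32_t lfanew = read_u32_le(buf + E_LFANEW_OFFSET);
    /* e_lfanew is a signed LONG; a negative value shows up here as a huge
       unsigned number and is rejected by the same bounds check. */
    if (lfanew >= len)
        return 0;

    *nt_offset = lfanew;
    return 1;
}
```

A caller would typically load the file into memory, call pe_read_dos_header on the buffer, and then continue parsing at the returned offset.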

DOS Stub

While PE files maintain backward compatibility for historical reasons, modern Windows PE files are not designed to run in DOS. Instead, they include a DOS stub—a small piece of code that displays an error message if the file is executed in a DOS environment. By default, this message is: “This program cannot be run in DOS mode.”

Rich Header

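Present in executables built with Microsoft development tools, the Rich Header is an undocumented block that sits between the DOS stub and the NT headers. It records which Microsoft tools, identified by product ID and build number, produced the objects that went into the file. Analysts can use this metadata to fingerprint a build environment or to link related samples, while malware authors may strip or forge it to mislead attribution.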

NT Headers

The structure looks like the following:

typedef struct _IMAGE_NT_HEADERS64 {
  DWORD                   Signature;
  IMAGE_FILE_HEADER       FileHeader;
  IMAGE_OPTIONAL_HEADER64 OptionalHeader;
} IMAGE_NT_HEADERS64, *PIMAGE_NT_HEADERS64;

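There are two variants of the NT Headers structure:

For 32-bit (PE32) images it is called: IMAGE_NT_HEADERS32
For 64-bit (PE32+) images it is called: IMAGE_NT_HEADERS64

The unsuffixed name IMAGE_NT_HEADERS is a typedef in winnt.h that resolves to one or the other, depending on whether the code is compiled for 64-bit Windows.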

The NT Headers structure contains three main parts:

PE Signature: A 4-byte value (DWORD) that marks the file as a PE image. It always holds the value 0x00004550, stored as the bytes 50 45 00 00, which corresponds to the ASCII string 'PE\0\0'.
File Header: A standard COFF File Header, a structure with seven fields that holds key information about the PE file, including the machine architecture, number of sections, time-date stamp, size of the optional header, and various file characteristics.
Optional Header: Despite its name, the Optional Header is a critical part of the NT Headers for executable image files (such as .exe files). It is called "optional" because it is omitted in certain file types such as object files, but it is required for executables. This header supplies essential information to the operating system loader.
Data Directories: The last member of the Optional Header, an array of 16 data directories, each describing a table that the PE loader uses while loading and running the program.
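
As a sketch of how the signature check works in practice, the following code looks for 'PE\0\0' at the offset given by e_lfanew and derives where the two following structures begin. The function names are hypothetical; the 20-byte size of the File Header follows from its seven fields (four WORDs and three DWORDs).

```c
#include <stddef.h>
#include <stdint.h>

#define PE_SIGNATURE      0x00004550u  /* bytes 'P' 'E' 0 0 as a little-endian DWORD */
#define PE_SIGNATURE_SIZE 4u
#define FILE_HEADER_SIZE  20u

static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the PE signature is present at offset e_lfanew, 0 otherwise. */
static int pe_has_signature(const uint8_t *buf, size_t len, uint32_t e_lfanew) {
    if (buf == NULL || len < PE_SIGNATURE_SIZE || e_lfanew > len - PE_SIGNATURE_SIZE)
        return 0;
    return read_u32_le(buf + e_lfanew) == PE_SIGNATURE;
}

/* The File Header starts immediately after the 4-byte signature. */
static uint32_t pe_file_header_offset(uint32_t e_lfanew) {
    return e_lfanew + PE_SIGNATURE_SIZE;
}

/* The Optional Header starts immediately after the 20-byte File Header. */
static uint32_t pe_optional_header_offset(uint32_t e_lfanew) {
    return e_lfanew + PE_SIGNATURE_SIZE + FILE_HEADER_SIZE;
}
```

The two offset helpers are only meaningful once pe_has_signature has succeeded, and a real parser would bounds-check them against the file size as well.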

NT Headers - File Header

The structure looks like the following:

typedef struct _IMAGE_FILE_HEADER {
  WORD  Machine; // Target CPU architecture the file was built for
  WORD  NumberOfSections; // Number of sections (and therefore section headers) in the PE file
  DWORD TimeDateStamp; // Creation time of the file as a UNIX timestamp (seconds since 1970-01-01 00:00:00 UTC)
  DWORD PointerToSymbolTable; // File offset of the COFF symbol table; typically zero in PE images today, meaning no COFF symbol table exists
  DWORD NumberOfSymbols; // Number of symbol table entries; typically zero as well
  WORD  SizeOfOptionalHeader; // Size of the optional header in bytes
  WORD  Characteristics; // Flags describing the file, e.g. whether it is an executable image, a DLL, or has had debugging information stripped
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;

This structure is also referred to as the COFF File Header, where COFF stands for Common Object File Format.
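
The fields above can be decoded from raw bytes as in the sketch below. The FileHeader type and the function names are illustrative stand-ins rather than winnt.h definitions. The machine values (0x014c, 0x8664, 0xAA64) and the DLL flag (0x2000) are the documented IMAGE_FILE_MACHINE_I386, IMAGE_FILE_MACHINE_AMD64, IMAGE_FILE_MACHINE_ARM64 and IMAGE_FILE_DLL constants; other machine types are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

#define FILE_HEADER_SIZE 20u
#define FILE_FLAG_DLL    0x2000u  /* IMAGE_FILE_DLL */

/* Fixed-width stand-in for IMAGE_FILE_HEADER. */
typedef struct {
    uint16_t machine;
    uint16_t number_of_sections;
    uint32_t time_date_stamp;
    uint32_t pointer_to_symbol_table;
    uint32_t number_of_symbols;
    uint16_t size_of_optional_header;
    uint16_t characteristics;
} FileHeader;

static uint16_t read_u16_le(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* p points at the first byte of the File Header (e_lfanew + 4). */
static int pe_parse_file_header(const uint8_t *p, size_t len, FileHeader *out) {
    if (p == NULL || out == NULL || len < FILE_HEADER_SIZE)
        return 0;
    out->machine                 = read_u16_le(p + 0);
    out->number_of_sections      = read_u16_le(p + 2);
    out->time_date_stamp         = read_u32_le(p + 4);
    out->pointer_to_symbol_table = read_u32_le(p + 8);
    out->number_of_symbols       = read_u32_le(p + 12);
    out->size_of_optional_header = read_u16_le(p + 16);
    out->characteristics         = read_u16_le(p + 18);
    return 1;
}

/* Maps a few common Machine values to short names. */
static const char *pe_machine_name(uint16_t machine) {
    switch (machine) {
    case 0x014c: return "x86";
    case 0x8664: return "x64";
    case 0xAA64: return "ARM64";
    default:     return "other";
    }
}

static int pe_is_dll(const FileHeader *fh) {
    return (fh->characteristics & FILE_FLAG_DLL) != 0;
}
```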

NT Headers - Optional Header

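There are two variants of the Optional Header:

For 32-bit (PE32) images it is called: IMAGE_OPTIONAL_HEADER32
For 64-bit (PE32+) images it is called: IMAGE_OPTIONAL_HEADER64

The 32-bit variant has an extra BaseOfData member and uses DWORD rather than ULONGLONG for ImageBase and the stack and heap size fields. The 64-bit structure looks like the following: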

typedef struct _IMAGE_OPTIONAL_HEADER64 {
  WORD                 Magic; // 0x10B = PE32 (32-bit), 0x20B = PE32+ (64-bit)
  BYTE                 MajorLinkerVersion;
  BYTE                 MinorLinkerVersion;
  DWORD                SizeOfCode;
  DWORD                SizeOfInitializedData;
  DWORD                SizeOfUninitializedData;
  DWORD                AddressOfEntryPoint; // RVA of the first instruction executed once the image is loaded; this usually points to an initialization function
  DWORD                BaseOfCode; // RVA of the start of the code section
  ULONGLONG            ImageBase; // Preferred base address at which to load the PE file in memory
  DWORD                SectionAlignment;
  DWORD                FileAlignment;
  WORD                 MajorOperatingSystemVersion;
  WORD                 MinorOperatingSystemVersion;
  WORD                 MajorImageVersion;
  WORD                 MinorImageVersion;
  WORD                 MajorSubsystemVersion;
  WORD                 MinorSubsystemVersion;
  DWORD                Win32VersionValue;
  DWORD                SizeOfImage;
  DWORD                SizeOfHeaders;
  DWORD                CheckSum;
  WORD                 Subsystem;
  WORD                 DllCharacteristics;
  ULONGLONG            SizeOfStackReserve;
  ULONGLONG            SizeOfStackCommit;
  ULONGLONG            SizeOfHeapReserve;
  ULONGLONG            SizeOfHeapCommit;
  DWORD                LoaderFlags;
  DWORD                NumberOfRvaAndSizes;
  IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER64, *PIMAGE_OPTIONAL_HEADER64;
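
Since the Magic field decides which variant of the structure is present, a parser normally reads it first. The sketch below, with hypothetical names, does that and then pulls out AddressOfEntryPoint and ImageBase. It relies on the documented layout: AddressOfEntryPoint sits at offset 16 in both variants, while ImageBase is a 4-byte field at offset 28 in PE32 (after BaseOfData) and an 8-byte field at offset 24 in PE32+.

```c
#include <stddef.h>
#include <stdint.h>

#define OPT_MAGIC_PE32     0x10Bu
#define OPT_MAGIC_PE32PLUS 0x20Bu

typedef struct {
    int      is_pe32_plus;            /* 1 for PE32+, 0 for PE32 */
    uint32_t address_of_entry_point;  /* RVA */
    uint64_t image_base;              /* preferred load address */
} OptionalSummary;

static uint16_t read_u16_le(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t read_u64_le(const uint8_t *p) {
    return (uint64_t)read_u32_le(p) | ((uint64_t)read_u32_le(p + 4) << 32);
}

/* opt points at the first byte of the Optional Header. */
static int pe_summarize_optional_header(const uint8_t *opt, size_t len,
                                        OptionalSummary *out) {
    /* Every field read below lies within the first 32 bytes. */
    if (opt == NULL || out == NULL || len < 32)
        return 0;

    uint16_t magic = read_u16_le(opt);
    if (magic != OPT_MAGIC_PE32 && magic != OPT_MAGIC_PE32PLUS)
        return 0;

    out->is_pe32_plus = (magic == OPT_MAGIC_PE32PLUS);
    out->address_of_entry_point = read_u32_le(opt + 16);
    out->image_base = out->is_pe32_plus ? read_u64_le(opt + 24)
                                        : (uint64_t)read_u32_le(opt + 28);
    return 1;
}
```

Adding image_base and address_of_entry_point gives the virtual address of the entry point when the image is loaded at its preferred base.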

Optional Header - Data Directories

DataDirectory is one of the most important members of the Optional Header. It is an array of IMAGE_DATA_DIRECTORY structures with room for 16 entries; the NumberOfRvaAndSizes member records how many of them are actually present.

The array is declared with IMAGE_NUMBEROF_DIRECTORY_ENTRIES elements:

IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];

The constant is defined as 16: #define IMAGE_NUMBEROF_DIRECTORY_ENTRIES 16

Each IMAGE_DATA_DIRECTORY struct has two members, VirtualAddress and Size:

typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD   VirtualAddress;
    DWORD   Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

VirtualAddress is the relative virtual address (RVA) of the table or data that the entry describes.
Size is the size of that data in bytes.

Data directories store information essential to the PE loader. Each entry is identified by its index in the array and locates a specific table by RVA, that is, by an offset from the base address of the loaded image rather than a raw file offset.

Among these, two of the most important are the Export Directory (IMAGE_DIRECTORY_ENTRY_EXPORT, index 0) and the Import Address Table (IMAGE_DIRECTORY_ENTRY_IAT, index 12).

The Export Directory is commonly found in DLLs that provide exported functions. It’s a data structure that holds the addresses of these exported functions and variables, allowing other executables to access and use them.

The Import Address Table (IAT) holds the addresses of functions and data imported from other modules. The loader fills in these addresses when the image is loaded, and the program reaches external code and resources through them at runtime.
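
Looking up an entry by index can be sketched as follows. The names are hypothetical, and the code assumes the documented offsets within the Optional Header: NumberOfRvaAndSizes is at offset 92 in PE32 and 108 in PE32+, and the DataDirectory array follows it directly (offsets 96 and 112 respectively), with 8 bytes per entry.

```c
#include <stddef.h>
#include <stdint.h>

#define DIR_ENTRY_EXPORT 0u   /* IMAGE_DIRECTORY_ENTRY_EXPORT */
#define DIR_ENTRY_IAT    12u  /* IMAGE_DIRECTORY_ENTRY_IAT */
#define MAX_DIRECTORIES  16u  /* IMAGE_NUMBEROF_DIRECTORY_ENTRIES */

typedef struct {
    uint32_t virtual_address;  /* RVA of the table */
    uint32_t size;             /* size of the table in bytes */
} DataDirectory;

static uint16_t read_u16_le(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* opt points at the Optional Header; len is the number of bytes available.
   Returns 1 and fills *out if the entry exists, 0 otherwise. */
static int pe_get_data_directory(const uint8_t *opt, size_t len,
                                 uint32_t index, DataDirectory *out) {
    if (opt == NULL || out == NULL || len < 2)
        return 0;

    size_t count_off;
    uint16_t magic = read_u16_le(opt);
    if (magic == 0x10Bu)
        count_off = 92;   /* PE32 */
    else if (magic == 0x20Bu)
        count_off = 108;  /* PE32+ */
    else
        return 0;

    size_t dirs_off = count_off + 4;  /* array follows NumberOfRvaAndSizes */
    if (len < dirs_off)
        return 0;

    uint32_t count = read_u32_le(opt + count_off);
    if (count > MAX_DIRECTORIES)
        count = MAX_DIRECTORIES;
    if (index >= count)
        return 0;

    size_t entry_off = dirs_off + (size_t)index * 8u;
    if (entry_off + 8u > len)
        return 0;

    out->virtual_address = read_u32_le(opt + entry_off);
    out->size            = read_u32_le(opt + entry_off + 4);
    return 1;
}
```

An entry whose VirtualAddress and Size are both zero simply means that the corresponding table is absent from the file.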

Section Headers

Following the Optional Header and preceding the actual sections are the Section Headers. There is one header per section (the NumberOfSections field of the File Header gives the count), and each provides metadata about its section.

Each Section Header is represented by the IMAGE_SECTION_HEADER structure, which is defined in winnt.h as follows:

typedef struct _IMAGE_SECTION_HEADER {
    BYTE    Name[IMAGE_SIZEOF_SHORT_NAME];
    union {
            DWORD   PhysicalAddress;
            DWORD   VirtualSize;
    } Misc;
    DWORD   VirtualAddress;
    DWORD   SizeOfRawData;
    DWORD   PointerToRawData;
    DWORD   PointerToRelocations;
    DWORD   PointerToLinenumbers;
    WORD    NumberOfRelocations;
    WORD    NumberOfLinenumbers;
    DWORD   Characteristics;
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;
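
One common use of the section headers is translating an RVA, such as the VirtualAddress of a data directory, into an offset within the file on disk. The sketch below uses hypothetical names and assumes the caller already has a pointer to the section table, which starts at e_lfanew + 4 + 20 + SizeOfOptionalHeader. Each header is 40 bytes, with VirtualAddress at offset 12, SizeOfRawData at 16 and PointerToRawData at 20. As a simplification, only bytes that are actually backed by file data are mapped.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTION_HEADER_SIZE 40u

static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* table points at the first section header, table_len is the number of bytes
   available there, and count is NumberOfSections from the File Header.
   Returns 1 and stores the file offset if rva falls inside the raw data of
   some section, 0 otherwise. */
static int pe_rva_to_file_offset(const uint8_t *table, size_t table_len,
                                 uint16_t count, uint32_t rva,
                                 uint32_t *file_offset) {
    if (table == NULL || file_offset == NULL)
        return 0;

    for (uint16_t i = 0; i < count; i++) {
        size_t base = (size_t)i * SECTION_HEADER_SIZE;
        if (base + SECTION_HEADER_SIZE > table_len)
            return 0;  /* table is truncated */

        const uint8_t *sh = table + base;
        uint32_t virtual_address = read_u32_le(sh + 12);
        uint32_t size_of_raw     = read_u32_le(sh + 16);
        uint32_t pointer_to_raw  = read_u32_le(sh + 20);

        if (rva < virtual_address)
            continue;
        uint32_t delta = rva - virtual_address;
        if (delta < size_of_raw) {
            *file_offset = pointer_to_raw + delta;
            return 1;
        }
    }
    return 0;
}
```

RVAs that fall beyond SizeOfRawData but within VirtualSize refer to memory that the loader zero-fills, so they have no file offset and the function reports failure for them.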

Sections

Sections hold the actual data of the executable and make up the remainder of the PE file after the headers, that is, after the section headers.

Some sections have special names that reflect their purpose. We’ll cover a few of these, but a complete list can be found in Microsoft’s official documentation under the “Special Sections” section.

.text – Holds the program’s executable code.
.data – Stores initialized global and static variables.
.bss – Contains uninitialized data that is zeroed at runtime.
.rdata – Contains read-only initialized data, such as constants.
.edata – Holds the export table, which lists functions and data the file exports.
.idata – Stores the import table, detailing external functions and libraries the file uses.
.reloc – Contains relocation information used when the image is loaded at a different base address.
.rsrc – Includes application resources such as icons, images, dialogs, and embedded binaries.
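
The purposes listed above are also reflected in each section header's Name and Characteristics fields. The following sketch, with hypothetical helper names, extracts a printable section name and a compact permission string. It assumes the documented facts that Name is an 8-byte field (IMAGE_SIZEOF_SHORT_NAME) padded with NUL bytes and not terminated when the name is exactly 8 characters long, and that the memory permission flags are IMAGE_SCN_MEM_EXECUTE (0x20000000), IMAGE_SCN_MEM_READ (0x40000000) and IMAGE_SCN_MEM_WRITE (0x80000000).

```c
#include <stddef.h>
#include <stdint.h>

#define SECTION_NAME_SIZE 8u           /* IMAGE_SIZEOF_SHORT_NAME */
#define SCN_MEM_EXECUTE   0x20000000u  /* IMAGE_SCN_MEM_EXECUTE */
#define SCN_MEM_READ      0x40000000u  /* IMAGE_SCN_MEM_READ */
#define SCN_MEM_WRITE     0x80000000u  /* IMAGE_SCN_MEM_WRITE */

/* Copies the 8-byte Name field into out, which must hold at least 9 bytes,
   and always NUL-terminates the result. */
static void pe_section_name(const uint8_t name[SECTION_NAME_SIZE],
                            char out[SECTION_NAME_SIZE + 1]) {
    size_t i = 0;
    while (i < SECTION_NAME_SIZE && name[i] != 0) {
        out[i] = (char)name[i];
        i++;
    }
    out[i] = '\0';
}

/* Writes a string such as "r-x" or "rw-" describing the memory permissions
   encoded in a section's Characteristics. out must hold at least 4 bytes. */
static void pe_section_perms(uint32_t characteristics, char out[4]) {
    out[0] = (characteristics & SCN_MEM_READ)    ? 'r' : '-';
    out[1] = (characteristics & SCN_MEM_WRITE)   ? 'w' : '-';
    out[2] = (characteristics & SCN_MEM_EXECUTE) ? 'x' : '-';
    out[3] = '\0';
}
```

A .text section is typically readable and executable, while .data is readable and writable. A section that is both writable and executable is unusual in ordinary builds and is often worth a closer look during analysis.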