“Understanding how shellcode actually resolves API addresses — not just calling functions blindly.”
When I started exploring shellcode and reverse engineering, I kept running into examples that used Windows APIs without explaining how those functions were actually found or called in shellcode. I wanted to go deeper — to really understand how to write a reverse shell in pure x64 assembly, without relying on hardcoded addresses or import tables.
This post is the first step in that journey.
We’ll start by figuring out how to find the base address of a DLL (like kernel32.dll or user32.dll) directly from the PEB (Process Environment Block), a structure Windows maintains internally for each process.
I’ll walk through each line of the assembly code I wrote, explaining what it does, why it matters, and how I figured it out — piece by piece.
If you’re also trying to understand malware development, red teaming, or AV evasion at a deeper level, this post is for you.
Understanding How to Locate DLL Base Addresses in Windows
┌────────────────────────────────────────────────────────────────────────────┐
│ Thread Environment Block (TEB)
│
│ GS:[0x000] -> TEB Header
│ ... │ ...
│ GS:[0x060] -> PEB* (Pointer to Process Environment Block) ──────────────┐
└───────────────────────────────────────────────────────────────
↓
┌──────────────────────────────────────────────────────────────────────┐
│ Process Environment Block (PEB)
│
│ +0x000 -> PEB Header
│ +0x018 -> PEB_LDR_DATA* (Loader Data) ─────────────────────────────┐
└───────────────────────────────────────────────────────────────
↓
┌────────────────────────────────────────────────────────────┐
│ PEB_LDR_DATA Structure
│
│ +0x000 -> Length / size info
│ +0x020 -> InMemoryOrderModuleList (LIST_ENTRY) ─────────┐
└──────────────────────────────────────────────────────────┘
↓
┌──────────────────────────────────────────────────┐
│ LIST_ENTRY (first module: the EXE itself)
│
│ +0x000 -> Flink → LDR_DATA_TABLE_ENTRY ──────┐
│ +0x008 -> Blink
└──────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ LDR_DATA_TABLE_ENTRY (ntdll.dll)
│ │
│ +0x000 -> LIST_ENTRY (InMemoryOrderLinks)
│ +0x020 -> DllBase* (Base Address of ntdll.dll)
│ +0x038 -> FullDllName (UNICODE_STRING)
│ +0x048 -> BaseDllName (UNICODE_STRING)
└─────────────────────────────────────────────────────────────┘
↓
Flink → kernel32.dll's entry
↓
┌─────────────────────────────────────────────────────────────┐
│ LDR_DATA_TABLE_ENTRY (kernel32.dll)
│ │
│ +0x000 -> LIST_ENTRY (InMemoryOrderLinks)
│ +0x020 -> DllBase* (Base Address of kernel32.dll)
└─────────────────────────────────────────────────────────────┘
↓
Flink → user32.dll's entry
↓
┌─────────────────────────────────────────────────────────────┐
│ LDR_DATA_TABLE_ENTRY (user32.dll)
│ │
│ +0x000 -> LIST_ENTRY (InMemoryOrderLinks)
│ +0x020 -> DllBase* (Base Address of user32.dll) ← This is it
└─────────────────────────────────────────────────────────────┘
When a program is executed, Windows sets up a structure called the Thread Environment Block (TEB) for each thread. The TEB holds various pieces of information about the thread, including a pointer to the Process Environment Block (PEB) at offset 0x60. On x64 Windows, the GS segment register points to the current thread’s TEB, so reading the quadword at GS:[0x60] yields the PEB pointer.
At offset 0x018 in the PEB, there is a pointer to the PEB_LDR_DATA structure. This structure contains the InMemoryOrderModuleList (a LIST_ENTRY) at offset 0x020.
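The three dereferences described above can be modeled in a few lines of C. This is only a sketch over a mock byte buffer, not real process memory: the buffer, the helper names (load64, store64, first_module_link) and any addresses used to populate it are assumptions made for illustration. Only the offsets 0x60, 0x18 and 0x20 come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets taken from the text; MEM_SIZE is an arbitrary size for the mock buffer. */
enum {
    TEB_PEB_OFFSET      = 0x60, /* TEB -> pointer to PEB               */
    PEB_LDR_OFFSET      = 0x18, /* PEB -> pointer to PEB_LDR_DATA      */
    LDR_IN_MEMORY_ORDER = 0x20, /* PEB_LDR_DATA -> list head (Flink)   */
    MEM_SIZE            = 0x400
};

/* Read a 64-bit value at mem[addr]; "addresses" are indexes into the mock buffer. */
uint64_t load64(const uint8_t mem[MEM_SIZE], uint64_t addr) {
    uint64_t v;
    assert(addr <= MEM_SIZE - sizeof v);
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* Write a 64-bit value at mem[addr] (used only to set up the mock layout). */
void store64(uint8_t mem[MEM_SIZE], uint64_t addr, uint64_t v) {
    assert(addr <= MEM_SIZE - sizeof v);
    memcpy(mem + addr, &v, sizeof v);
}

/* Follow TEB -> PEB -> PEB_LDR_DATA and return InMemoryOrderModuleList.Flink. */
uint64_t first_module_link(const uint8_t mem[MEM_SIZE], uint64_t teb) {
    uint64_t peb = load64(mem, teb + TEB_PEB_OFFSET);  /* first dereference  */
    uint64_t ldr = load64(mem, peb + PEB_LDR_OFFSET);  /* second dereference */
    return load64(mem, ldr + LDR_IN_MEMORY_ORDER);     /* third dereference  */
}
```

Each load64 call corresponds to one memory read in the chain, which is why the assembly later in this post needs exactly three mov instructions before it reaches the module list.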
If we look at the official Microsoft documentation for PEB_LDR_DATA, we find that it references two important structures: LIST_ENTRY and _LDR_DATA_TABLE_ENTRY.
Let’s first discuss the LIST_ENTRY structure. It consists of two pointers: Flink and Blink. The Flink of the list head in PEB_LDR_DATA points to the InMemoryOrderLinks field of the first _LDR_DATA_TABLE_ENTRY, which corresponds to the executable itself. The next entry typically corresponds to ntdll.dll, followed by kernel32.dll.
_LDR_DATA_TABLE_ENTRY is a very important structure in Windows because it stores metadata about loaded modules such as kernel32.dll and user32.dll. Right now, we are interested in two fields from this structure: InMemoryOrderLinks and DllBase.
typedef struct _LDR_DATA_TABLE_ENTRY {
    PVOID Reserved1[2];
    LIST_ENTRY InMemoryOrderLinks;
    PVOID Reserved2[2];
    PVOID DllBase;
    PVOID EntryPoint;
    PVOID Reserved3;
    UNICODE_STRING FullDllName;
    BYTE Reserved4[8];
    PVOID Reserved5[3];
    union {
        ULONG CheckSum;
        PVOID Reserved6;
    };
    ULONG TimeDateStamp;
} LDR_DATA_TABLE_ENTRY, *PLDR_DATA_TABLE_ENTRY;
As you can see, the InMemoryOrderLinks field is just a LIST_ENTRY, the same structure we explained earlier. It’s part of a doubly linked list that links all loaded modules in memory order. The Flink points to the InMemoryOrderLinks field of the next module’s _LDR_DATA_TABLE_ENTRY in this list, not to the start of that structure.
For example, if we are currently looking at kernel32.dll’s _LDR_DATA_TABLE_ENTRY, the Flink inside its InMemoryOrderLinks will point to the entry for whichever module comes next in the list, for example user32.dll.
DllBase is the base address of the DLL in memory. It’s crucial because the address of every function exported by that DLL is computed relative to it, and that’s exactly what we need in most shellcode or reverse engineering tasks.
In the _LDR_DATA_TABLE_ENTRY structure, DllBase is located at an offset of 0x20 from the start of the InMemoryOrderLinks field (0x30 from the start of the structure on x64).
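The 0x20 figure can be checked mechanically. The sketch below mirrors the first four members of the documented structure using fixed-width 64-bit fields, so the offsets come out the same on any host. The type names ListEntry64 and LdrEntry64 and the helper dllbase_delta are hypothetical, introduced only for this check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x64 LIST_ENTRY modeled with fixed-width fields: two 8-byte pointers. */
typedef struct {
    uint64_t Flink;
    uint64_t Blink;
} ListEntry64;

/* Leading members of _LDR_DATA_TABLE_ENTRY as documented, with PVOID = 8 bytes. */
typedef struct {
    uint64_t    Reserved1[2];        /* 0x00 */
    ListEntry64 InMemoryOrderLinks;  /* 0x10 */
    uint64_t    Reserved2[2];        /* 0x20 */
    uint64_t    DllBase;             /* 0x30 */
} LdrEntry64;

/* Compile-time check of the absolute offsets on x64. */
static_assert(offsetof(LdrEntry64, InMemoryOrderLinks) == 0x10, "links at 0x10");
static_assert(offsetof(LdrEntry64, DllBase) == 0x30, "DllBase at 0x30");

/* Distance from the InMemoryOrderLinks field to DllBase. */
size_t dllbase_delta(void) {
    return offsetof(LdrEntry64, DllBase) - offsetof(LdrEntry64, InMemoryOrderLinks);
}
```

Because the list pointers land on InMemoryOrderLinks rather than on the start of the structure, the difference 0x30 - 0x10 = 0x20 is the displacement to use when reading DllBase through a Flink.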
Assembly code
; filename: peb.asm
; assemble: nasm -f win64 peb.asm -o peb.obj
; link: gcc peb.obj -o peb.exe
BITS 64
global main
extern printf, ExitProcess
section .text
main:
; Get PEB
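mov rax, [gs:0x60] ; PEB (NASM syntax: segment override goes inside the brackets)
mov rax, [rax + 0x18] ; PEB->Ldr
mov rsi, [rax + 0x20] ; Ldr->InMemoryOrderModuleList.Flink (first entry: the EXE)
lodsq ; rax = EXE entry's Flink -> ntdll.dll's entry
xchg rax, rsi ; rsi = ntdll.dll's entry
lodsq ; rax = ntdll.dll entry's Flink -> kernel32.dll's entry
mov rbx, [rax + 0x20] ; DllBase (base address of kernel32.dll)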
; Print the address
sub rsp, 40
mov rcx, msg_fmt
mov rdx, rbx
call printf
add rsp, 40
; Exit cleanly (reserve shadow space again so the stack is aligned at the call)
sub rsp, 40
mov ecx, 0
call ExitProcess
section .data
msg_fmt db "K32: %p", 10, 0
Code Explanation
BITS 64
global main
extern printf, ExitProcess
section .text
Let’s break down this part of the code:
- BITS 64: This tells the assembler that we’re writing 64-bit assembly code, so instructions and registers are interpreted in 64-bit mode.
- global main: This declares the main label as a global symbol, making it visible to the linker. This is necessary when linking with tools like gcc, whose C runtime startup code expects a main entry point.
- extern printf, ExitProcess: These are external functions we plan to use in our code. We’re telling the assembler that these functions are defined elsewhere (usually in external libraries like msvcrt.dll or kernel32.dll) and will be resolved at link time.
main:
; Get PEB
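mov rax, [gs:0x60] ; PEB (NASM syntax: segment override goes inside the brackets)
mov rax, [rax + 0x18] ; PEB->Ldr
mov rsi, [rax + 0x20] ; Ldr->InMemoryOrderModuleList.Flink (first entry: the EXE)
lodsq ; rax = EXE entry's Flink -> ntdll.dll's entry
xchg rax, rsi ; rsi = ntdll.dll's entry
lodsq ; rax = ntdll.dll entry's Flink -> kernel32.dll's entry
mov rbx, [rax + 0x20] ; DllBase (base address of kernel32.dll)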
Let’s break this down step by step:
- main: This is the entry point of our program. Execution starts here.
- mov rax, [gs:0x60]: This retrieves the pointer to the PEB (Process Environment Block) from the TEB (Thread Environment Block). On 64-bit Windows, GS points to the current thread’s TEB, and the PEB pointer sits at offset 0x60 within it. NASM expects the segment override inside the brackets; the MASM-style gs:[0x60] form does not assemble.
- mov rax, [rax + 0x18]: This reads the PEB->Ldr field, which points to the PEB_LDR_DATA structure. This structure contains information about all loaded modules (DLLs).
- mov rsi, [rax + 0x20]: This loads the Flink of InMemoryOrderModuleList, the head of a doubly linked list of loaded modules. Each node is the InMemoryOrderLinks field of an _LDR_DATA_TABLE_ENTRY, and the first node belongs to the executable itself, so RSI now points into the EXE’s entry.
- lodsq: This instruction:
  - Loads a 64-bit value from the address in RSI into RAX.
  - Increments RSI by 8 (the size of a quadword), assuming the direction flag is clear, as the Windows x64 calling convention requires on function entry.
  - Because Flink is the first field of a LIST_ENTRY, RAX now holds the EXE entry’s Flink, which points to the second module in the list (usually ntdll.dll).
- xchg rax, rsi: We swap RAX and RSI so that RSI points at ntdll.dll’s entry and we can use lodsq again.
- lodsq: This loads ntdll.dll’s Flink into RAX, which points to the third module in the list (usually kernel32.dll).
- mov rbx, [rax + 0x20]: Finally, we read the DllBase field from the _LDR_DATA_TABLE_ENTRY of kernel32.dll, which gives us the base address of kernel32.dll. This is often the first DLL we care about when resolving APIs manually in shellcode.
main.exe's LIST_ENTRY (RSI starts here)
└── Flink ─────► ntdll.dll's LIST_ENTRY (1st `lodsq` loads this into RAX)
└── Flink ─────► kernel32.dll's LIST_ENTRY (2nd `lodsq`)
└── Flink ─────► user32.dll's LIST_ENTRY (3rd `lodsq`, if done)
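The same walk can be traced in C against a mock list. This is a sketch under stated assumptions, not the loader's real data: MockLdrEntry, link_modules and dllbase_after_hops are hypothetical names, and the entries are ordinary local structs that the caller fills with made-up base addresses. The displacement is computed with offsetof rather than hardcoded, so the sketch also works on hosts where pointers are not 8 bytes.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ListEntry {
    struct ListEntry *Flink;
    struct ListEntry *Blink;
} ListEntry;

/* Mock module entry: only the leading fields that the walk touches. */
typedef struct {
    void     *Reserved1[2];
    ListEntry InMemoryOrderLinks;
    void     *Reserved2[2];
    void     *DllBase;
} MockLdrEntry;

/* Link `count` entries into a circular list headed by `head`, in array order. */
void link_modules(ListEntry *head, MockLdrEntry *mods, size_t count) {
    ListEntry *prev = head;
    for (size_t i = 0; i < count; i++) {
        ListEntry *cur = &mods[i].InMemoryOrderLinks;
        prev->Flink = cur;
        cur->Blink = prev;
        prev = cur;
    }
    prev->Flink = head; /* close the circle */
    head->Blink = prev;
}

/* Start at head->Flink (the first entry), follow Flink `hops` times, then read
   DllBase relative to the InMemoryOrderLinks field. Returns NULL if the walk
   wraps around to the list head before finishing. */
void *dllbase_after_hops(const ListEntry *head, unsigned hops) {
    assert(head != NULL);
    const ListEntry *link = head->Flink;   /* like: mov rsi, [rax + 0x20] */
    if (link == head) return NULL;         /* empty list */
    for (unsigned i = 0; i < hops; i++) {  /* one hop per lodsq */
        link = link->Flink;
        if (link == head) return NULL;
    }
    size_t delta = offsetof(MockLdrEntry, DllBase)
                 - offsetof(MockLdrEntry, InMemoryOrderLinks);
    return *(void *const *)((const char *)link + delta);
}
```

With three mock entries linked in the order EXE, ntdll.dll, kernel32.dll, calling dllbase_after_hops(&head, 2) returns the third entry's DllBase, matching the two lodsq instructions in the assembly. Unlike the assembly, the sketch also stops when it reaches the list head again.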
; Print the address
sub rsp, 40
mov rcx, msg_fmt
mov rdx, rbx
call printf
add rsp, 40
; Exit cleanly (reserve shadow space again so the stack is aligned at the call)
sub rsp, 40
mov ecx, 0
call ExitProcess
section .data
msg_fmt db "K32: %p", 10, 0
Let’s break down this part of the code:
- sub rsp, 40: This adjusts the stack pointer to allocate 40 bytes:
  - 32 bytes for shadow space, required by the Windows x64 calling convention.
  - 8 bytes extra to restore 16-byte alignment at the call instruction (on entry to main, the return address pushed by the caller leaves RSP 8 bytes off a 16-byte boundary).
- mov rcx, msg_fmt: Loads the address of the format string into the first argument register (RCX) for printf. The string is "K32: %p\n", which prints a pointer (address).
- mov rdx, rbx: Copies the base address of kernel32.dll (stored in RBX) into the second argument register (RDX). This is the value that gets printed by printf.
- call printf: Calls the printf function from the C runtime. At this point, all required arguments are in the correct registers, and shadow space is allocated properly.
- add rsp, 40: Releases the 40 bytes reserved above, restoring RSP to its value on entry to main.
- mov ecx, 0: Places the exit code 0 in the first argument register (ECX) for ExitProcess.
- call ExitProcess: Exits the program cleanly. ExitProcess does not return. Like any Win64 call, it expects 32 bytes of shadow space and a 16-byte-aligned stack at the call site.
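The reason for 40 rather than 32 is plain arithmetic, which the following sketch spells out. The function names and the sample RSP value used in the note below are hypothetical; the only facts taken from the text are the 8-byte return address, the 32-byte shadow space and the 16-byte alignment requirement.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when rsp sits on a 16-byte boundary. */
int is_aligned16(uint64_t rsp) {
    return (rsp & 0xF) == 0;
}

/* RSP at a function's first instruction: the caller's RSP was aligned at the
   call site, and `call` then pushed an 8-byte return address. */
uint64_t rsp_on_entry(uint64_t rsp_at_call_site) {
    assert(is_aligned16(rsp_at_call_site));
    return rsp_at_call_site - 8;
}

/* RSP after `sub rsp, bytes`. */
uint64_t rsp_after_sub(uint64_t rsp, uint64_t bytes) {
    return rsp - bytes;
}
```

Taking a made-up call-site RSP of 0x1000, the value on entry is 0xFF8. Subtracting 40 gives 0xFD0, which is 16-byte aligned, while subtracting only the 32 bytes of shadow space would give 0xFD8, which is not.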
Data Section
- section .data: Declares a data section in the binary.
- msg_fmt db "K32: %p", 10, 0: Defines a null-terminated ASCII string for use with printf.
  - %p: format specifier to print a pointer (memory address).
  - 10: newline character (\n).
  - 0: null terminator (\0).
Conclusion
You now know how to walk the PEB’s module list in x64 assembly to find the base address of a DLL, without relying on imports or hardcoded addresses. This is one of the foundational skills for writing shellcode, building custom loaders, or understanding how malware resolves functions dynamically.
We’ve manually located kernel32.dll, one of the most critical DLLs for Windows API access. From here, we can use this base address to parse the PE header and locate functions like LoadLibraryA and GetProcAddress.
But what happens when the module order differs from the one assumed here? In that case, walking the list by position won’t work.
What’s Next
In the next part of this series, we’ll tackle how to locate any DLL in memory using string comparison (e.g., matching “kernel32.dll” in the module name) instead of relying on order. This allows our shellcode to adapt to different environments and keep working even if the modules appear in a different order.
Stay tuned — and follow along if you’re serious about learning low-level evasion and shellcode internals from the ground up.