When a user runs a program, the image of the executable is loaded into memory. In this article, we will look at how the program's instructions and data are laid out in memory.
Memory management in a Linux 32-bit system
Let's imagine, for the purposes of this article, that we are working on a Linux 32-bit system. On a 32-bit system, an address is 32 bits wide, so at most 2^32 bytes (4GB) of memory can be addressed. The operating system typically reserves 1GB of this address space for itself, so user processes can use at most 3GB.
Let's see a representation of this 4GB memory:
                                        ┌──────────┐
               ┌──────────────────┐ ◄───┤0xFFFFFFFF│
               │                  │     └──────────┘
               │   Kernel Space   │
┌──────────┐   │                  │
│0xC0000000├──►├──────────────────┤
└──────────┘   │                  │
               │                  │
               │                  │
               │       User       │
               │       Space      │
               │                  │
               │                  │
               │                  │
               │                  │
               │                  │
┌──────────┐   │                  │
│0x00000000├──►└──────────────────┘
└──────────┘
The 4GB of memory in our example 32-bit system is typically divided into two main regions:
- Kernel space: addresses in the range `0xC0000000` to `0xFFFFFFFF` are reserved for the kernel. The kernel space generally occupies the uppermost 1GB of the 4GB addressable memory.
- User space: addresses in the range `0x00000000` to `0xBFFFFFFF` are used for user-level processes. The user space occupies the remaining 3GB of memory.
Memory layout of a program
The memory of a program is generally divided into two big sections, the static section and the dynamic section. The static section contains the text segment, the data segment and the bss segment. The dynamic section contains the stack and the heap.
            ┌────────────────────┐
     ┌─►    │       Stack        │
     │      │                    │      Functions and
     │      │         │          │ ◄─── variables inside
     │      │         │          │      functions
     │      │         ▼          │
┌────┴──┐   │--------------------│
│Dynamic│   │                    │
│memory │   │                    │
│layout │   │                    │
└────┬──┘   │                    │
     │      │                    │
     │      │--------------------│
     │      │         ▲          │      Variables allocated
     │      │         │          │ ◄─── by memory management
     │      │         │          │      functions (e.g. `malloc`)
     └─►    │        Heap        │
            ├────────────────────┤
     ┌─►    │        bss         │      "block starting symbol"
     │      │                    │ ◄─── variables allocated
     │      │(uninitialised data)│      but not yet initialised
 ┌───┴──┐   ├────────────────────┤
 │Static│   │        Data        │
 │memory│   │                    │ ◄─── Initialised variables
 │layout│   │ (initialised data) │
 └───┬──┘   ├────────────────────┤
     │      │                    │      Code segment: machine
     │      │        Text        │ ◄─── instructions for the
     └─►    │                    │      program
            └────────────────────┘
Memory layout of a C program
Let's use a program that adds two numbers as an example:
#include <stdio.h>

int a;

int main(void) {
    int x = 3, y = 4;
    int z = x + y;
    printf("%d\n", z);
    return 0;
}
This program has a global variable, `a`, declared outside any function and not initialised with a value. Inside the `main` function, there are three local variables: `x`, `y` and `z`.
The text segment
The text segment, also known as the code segment, is the memory section where executable instructions reside. In a previous article, we briefly mentioned ELF files when talking about the assembler. Without going into much detail, the ELF file is divided into sections, one of which is the `text` section.
The machine code that goes inside the text segment of the memory is usually either the same as the "text" section of the ELF file, or derived from it.
The text segment is typically read-only and has a fixed size.
The data segment
The data segment contains the static data of the program that has been initialised with a value. This includes initialised global variables and initialised local static variables.
The bss segment
The block starting symbol, or bss, segment contains statically allocated variables that have not been initialised with a value. The global variable `a` of our example program, declared with `int a;`, would live in this part of the memory.
The heap
The heap area is part of the dynamic memory layout. It usually starts at lower addresses and grows as needed towards higher addresses (see diagram). Variables can be allocated in the heap using memory management facilities, such as `malloc` in C or `new` in C++.
A variable stored in heap memory can be placed at any free address in that area, and the program cannot know this address in advance. To deallocate this memory, the program needs to keep track of the address it was given and release it when done.
Let's look at this example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char *str;
    str = malloc(24 * sizeof(char));
    if (str == NULL) {
        return 1;  /* allocation failed */
    }
    strcpy(str, "Hello world");
    printf("%s\n", str);
    free(str);
    return 0;
}
As you can see, this is a very convoluted way to print "Hello world" on the console.
- With `char *str;` we declare a pointer to `char`. The pointer variable `str` is on the stack (see below) and is, for now, uninitialised.
- With `str = malloc(24 * sizeof(char));` we allocate a contiguous block of memory on the heap, with a size of 24 bytes (as a `char` occupies 1 byte). `str` points to the address of the first byte of this block.
- Once the function `main` returns, all the local variables will be deallocated, so the pointer `str` will disappear. However, the 24-byte memory block on the heap will not be deallocated automatically. It would also be lost, as nothing would point to it anymore and there would be no way to know the address it is located at. For this reason, we have to call `free(str);`, otherwise we will cause a memory leak.
The stack
The stack is also part of the dynamic memory layout. On a typical x86 architecture, it starts at higher addresses and grows towards lower addresses, in the opposite direction to the heap (see diagram above).
The stack segment mainly contains variables that are defined inside functions.
The call stack
The call stack is a data structure that lives in the stack segment of the program. Its role is to store information about the active subroutines (functions) of the program.
The call stack is composed of one or more activation frames, one for each called function.
Activation frames
When the program calls the first function to execute (typically `main()`), it creates an activation frame (also called a "stack frame" or "activation record"). Activation frames are blocks of memory that contain information related to the function:
- Local variables of the function
- Parameters passed to the function
- Return address: the memory address of the instruction at which execution resumes once the function has finished executing
- Stack pointer: A pointer that points to the top of the stack
- Frame pointer: A pointer that points to the base of the activation frame.
Let's see an example:
#include <stdio.h>

int func(void) {
    int a = 2, b = 4;
    return a * b;
}

int main(void) {
    int x = 10;
    int y = x - func();
    printf("%d\n", y);
    return 0;
}
When we execute the program:

- Function `main`:
  - Function `main` creates an activation frame. Local variables `x` and `y` are declared but not yet initialised.
  - In the next instruction, variable `x` is initialised to `10`.
  - In the next instruction, function `func` is called. A new activation frame is created below the `main` activation frame for `func`, with local variables `a` and `b` declared, but not yet initialised.
  - The control passes to the first instruction of function `func`.
- Function `func`:
  - In the first instruction of the function `func`, local variables are initialised to their respective values.
  - In the next instruction, the result of `a * b` is returned. The control passes back to the `main` function. The activation frame for `func` is typically deallocated soon after.
- Function `main`:
  - The value of `y` can now be determined, as the function `func` has returned.
  - The next instruction (`printf`) executes.
  - The function `main` returns and its activation frame is deallocated.