Pointers in C - Use of pointers

Part I - The basics
☛ Part II - Use of pointers

The "point" of pointers

(sorry for the horrible pun)

Pointers in C allow for a few operations that are not otherwise possible, such as letting a function modify the variables of its caller, or dynamically allocating memory.

Contents

Function arguments
Dynamic memory allocation
- malloc
- calloc
- free
Handling files

Function arguments

In C, function arguments are passed by value. If the main function calls a function a with arguments a(x, y), the function a will receive a copy of x and y and manipulate its own copy, leaving x and y in the main function unchanged.

Let's have a look at the classic example from K&R's "The C Programming Language", where we try to swap the values of two numbers:

void swap(int px, int py) {
    int tmp;
    tmp = px;
    px = py;
    py = tmp;
}

This function takes two integers px and py and swaps their values with the help of a temporary variable. The value of px is assigned to tmp, the value of px is replaced with the value of py and, finally, the value of py is replaced with the value of tmp.

Simple, right? Now let's try to use it in the main function:

int main(void) {
    int x = 20, y = 10;
    swap(x, y);
}

Here is how the variables of the two functions are laid out in memory (the addresses are made up for illustration):

Function `main`
int variables	`x`	`y`
values	20	10
addresses	0x1004	0x1008

Function `swap`
int variables	`px`	`py`
values	20	10
addresses	0x2004	0x2008

Then swap is called in main. Here is what happens to the values of the variables:

Variable	Value before `swap`	Value after `swap`
x	20	20
y	10	10
px	20	10
py	10	20

This is because the function swap received copies of the values of x and y, stored them at addresses in its own activation frame, swapped those copies and passed control back to main. Since swap returns nothing, the swapped copies are simply discarded. The variables x and y of the main function were never affected by what happened to the copies, since they live at completely different addresses in memory.

The way to actually swap the values of our variables in the main function is the use of pointers. Here's how to achieve this:

void swap(int *px, int *py) {
    int tmp;
    tmp = *px;
    *px = *py;
    *py = tmp;
}

int main(void) {
    int x = 20, y = 10;
    swap(&x, &y);
}

So, what did we do here? Let's break it down.

Function main has two variables of type integer:

Variable x, with value 20, lives in the memory address 0x1004 (as an example).
Variable y, with value 10, lives in the memory address 0x1008.

Function swap takes two integer pointers as arguments and swaps the values they point to. Here we call the swap function passing the addresses of the variables x and y. So, in the swap function:

px points to the address 0x1004, so *px is 20.
py points to the address 0x1008, so *py is 10.
The variable tmp is also created in some memory location, let's say 0x200C.

Before swapping

      ┌────────────────────┐                
      │main                │                
      ├─────────┬──────────┤                
      │ address │ variable │                
      ├─────────┼──────────┤                
   ┌─►│ 0x1004  │ x = 20   │                
   │  │         │          │                
┌──┼─►│ 0x1008  │ y = 10   │                
│  │  └─────────┴──────────┘                
│  │                                        
│  │  ┌────────────────────────────────────┐
│  │  │swap                                │
│  │  ├───────────┬─────────────┬──────────┤
│  │  │ address   │  variable   │ pointer  │
│  │  ├───────────┼─────────────┼──────────┤
│  └──┤ 0x2004    │ px = 0x1004 │ *px = 20 │
│     │           │             │          │
└─────┤ 0x2008    │ py = 0x1008 │ *py = 10 │
      │           │             │          │
      │ 0x200C    │ tmp         │          │
      └───────────┴─────────────┴──────────┘

Now the swap function does not manipulate a copy of the values; it can act directly on the values of x and y because it knows the memory address where they are located.

After swapping

        ┌────────────────────────────────────┐
        │swap                                │
        ├───────────┬─────────────┬──────────┤
        │ address   │  variable   │ pointer  │
        ├───────────┼─────────────┼──────────┤
    ┌───┤ 0x2004    │ px = 0x1004 │ *px = 10 │
    │   │           │             │          │
┌───┼───┤ 0x2008    │ py = 0x1008 │ *py = 20 │
│   │   │           │             │          │
│   │   │ 0x200C    │ tmp         │ tmp = 20 │
│   │   └───────────┴─────────────┴──────────┘
│   │                                         
│   │   ┌────────────────────┐                
│   │   │main                │                
│   │   ├─────────┬──────────┤                
│   │   │ address │ variable │                
│   │   ├─────────┼──────────┤                
│   └──►│ 0x1004  │ x = 10   │                
│       │         │          │                
└──────►│ 0x1008  │ y = 20   │                
        └─────────┴──────────┘

So, when one function calls another and passes pointers to its own variables, the called function can modify those variables directly.

Dynamic memory allocation

In some programs, we don't know in advance how much memory we will need for certain variables, for example when the program processes user input or command-line arguments. In this case, we want our program to allocate memory dynamically at runtime. This memory comes from the heap. Ordinary local variables live on the stack, so the only way to reach a block of heap memory is through a pointer that holds its address.

`malloc`

To access the heap memory we use malloc(size), a standard library function that reserves a block of size bytes and returns a void pointer to it, or NULL if the allocation fails.

#include <stdlib.h>

int main(void) {
    int *p = malloc(4 * sizeof(int));

    return 0;
}

In this example, we allocated a memory block large enough for 4 integers in the heap memory, and the pointer p points to the first integer. This is suitable for an integer array of 4 elements. Note that malloc does not initialise the block, so its contents are indeterminate until we write to them.

`calloc`

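The function calloc(n, size) returns a pointer to enough memory for n objects of size bytes each, or NULL if the allocation fails. It also initialises all allocated bytes to zero. Conceptually it behaves like malloc(n * size) followed by zeroing the block, although the C standard does not require it to be implemented that way.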

#include <stdlib.h>

int main(void) {
    int *p = calloc(4, sizeof(int));

    return 0;
}

Again, we allocated one contiguous block large enough for 4 integers, suitable for an integer array of four elements. The difference here is that, if we examine the values before assigning anything to them, they are all guaranteed to be 0.

`free`

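Memory allocated on the stack is released automatically when the function returns and its activation frame is discarded. Heap memory is different: a block allocated with malloc or calloc stays allocated until we release it explicitly. If we never do, it remains reserved for as long as the program runs, even when nothing points to it any more (a memory leak). Most operating systems reclaim all of a process's memory when it exits, but a long-running program that keeps leaking can eventually exhaust its memory. Releasing a block is done with the free() function: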

#include <stdlib.h>
#include <stdio.h>

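int main(void) {
    int *p = malloc(4 * sizeof(int));
    if (p == NULL) {
        return 1;
    }

    p[0] = 4;
    p[1] = 8;
    p[2] = 5;
    p[3] = 11;

    printf("p[0]: %d, p points at: %d, address p: %p\n", p[0], *p, (void *)&p);

    free(p);

    /* Reading through p after free() is undefined behaviour.
       It is done here only to show that the data can no longer be relied on. */
    printf("p[0]: %d, p points at: %d, address p: %p\n", p[0], *p, (void *)&p);

    return 0;
}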

On one run, this printed the following (the second line will differ between systems and runs):

p[0]: 4, p points at: 4, address p: 0x7ffec841cac0
p[0]: 1531789535, p points at: 1531789535, address p: 0x7ffec841cac0

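After freeing the memory allocated with malloc, the pointer p still exists in the activation frame and still holds the address of the freed block. The last value printed is &p, the address of p itself on the stack, which is why it is the same on both lines. However, p is now a dangling pointer: reading through it is undefined behaviour. On this run it produced a garbage value, but it could just as well have printed 4 again or crashed. The block has been handed back to the allocator, which is free to reuse it for later allocations.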

Handling files

An important use of pointers is accessing files, devices, or other resources external to the program. The C standard library provides the FILE type for this, declared in stdio.h. In the glibc implementation, FILE is a typedef for struct _IO_FILE (see /usr/include/bits/types/FILE.h), which is defined in /usr/include/bits/types/struct_FILE.h.

We can open files by using a file pointer:

FILE *f = fopen("filename", "r");

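fopen returns NULL if the file cannot be opened, so the result should be checked before use. The file handling itself is then done with functions such as fgets and fclose, which take the FILE pointer as an argument. They are all provided by the standard library and, on systems like Linux, are buffered wrappers around kernel system calls such as open, read and close.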

Pointers are also closely related to arrays, a relationship we will look at in an upcoming article.