July 10, 2015

How memory is allocated

Posted in Software at 16:38 by graham

tl;dr man 2 brk

Last year, when I was learning assembler, I asked myself how to allocate memory without malloc. Usually memory is either allocated for us by our language, or we do it with new or malloc. But malloc is a library function, not a system call. How does malloc itself get memory from the kernel? To answer that, we need to look at the layout of a program in memory.

On Linux amd64, every process gets its own 128 TiB virtual address space. The program code, global data, debugging information and so on are loaded at the bottom of that space, working ‘upwards’ (bigger numeric addresses). Then comes the heap, where we are going to allocate some memory. The point where the heap ends is called the program break. Then there is a very large gap, which the heap will grow into. At the top of the address space (0x7fffffffffff) is the stack, which grows downwards, back towards the top of the heap. Here is a graphic of the virtual memory layout.

To allocate memory on the heap, we simply ask the kernel to move the program break up. The space between the old program break and the new program break is our memory. The system call is brk. First we have to find out where the break is now. On Linux, the raw brk system call returns the current break when it cannot satisfy a request, so we simply call it with 0, an invalid value that changes nothing.

    mov $12, %rax   # brk syscall number
    mov $0, %rdi    # 0 is invalid, want to get current position
    syscall

When that returns, the current position is in rax. Let’s allocate 4 bytes, by asking the kernel to move our break up by four bytes:

    mov %rax, %rsi  # save current break

    mov %rax, %rdi  # move top of heap to here ...
    add $4, %rdi    # .. plus 4 bytes we allocate
    mov $12, %rax   # brk, again
    syscall

We can now store anything we want at the address pointed at by rsi, where we saved the start of our allocated space. The full assembly program, alloc.s, puts “HI\n” into that space and prints it out. Compile, link, run:

    as -o alloc.o alloc.s
    ld -o alloc alloc.o
    ./alloc
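The alloc.s file itself is not reproduced here, so what follows is only a sketch of what such a program might look like, stitched together from the two snippets above. The write (1) and exit (60) syscall numbers are the standard Linux amd64 ones; the failure check and the `fail` label are my own additions for illustration, not necessarily what the original file does.

```asm
# alloc.s (sketch): grow the heap by 4 bytes with brk,
# store "HI\n" there, and print it.

    .global _start
    .text
_start:
    mov $12, %rax       # brk syscall number
    mov $0, %rdi        # 0 is invalid, so we just get the current break
    syscall

    mov %rax, %rsi      # save current break: start of our new memory

    mov %rax, %rdi      # move top of heap to here ...
    add $4, %rdi        # ... plus the 4 bytes we allocate
    mov $12, %rax       # brk, again
    syscall

    cmp %rdi, %rax      # on success the kernel returns the break we asked for
    jne fail            # anything else means the break did not move

    movb $0x48, 0(%rsi) # 'H'
    movb $0x49, 1(%rsi) # 'I'
    movb $0x0a, 2(%rsi) # '\n'

    mov $1, %rax        # write syscall number
    mov $1, %rdi        # file descriptor 1 is stdout
                        # rsi already points at our bytes
    mov $3, %rdx        # number of bytes to write
    syscall

    mov $60, %rax       # exit syscall number
    mov $0, %rdi        # exit status 0
    syscall

fail:
    mov $60, %rax       # exit syscall number
    mov $1, %rdi        # exit status 1: allocation failed
    syscall
```

This relies on the kernel preserving rsi and rdi across `syscall`; on Linux amd64 only rax, rcx and r11 are overwritten.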

To free memory, you do the opposite: move the break back down. That allows the kernel to re-use that space. Happy allocating!
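As a rough sketch of that last step, assuming rsi still holds the break we saved before allocating, freeing the 4 bytes is one more brk call with the old value:

```asm
    mov %rsi, %rdi      # the break we saved before allocating
    mov $12, %rax       # brk syscall number
    syscall             # rax holds the break after the call
```

After this, the address in rsi is past the end of the heap again, so nothing should read or write through it.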

2 Comments

  1. graham said,

    July 11, 2015 at 05:31

    @Fazal Good point. On Linux, glibc malloc switches from brk to mmap at around 128 KiB, although that is both configurable and adaptive, see man 3 mallopt. I should do a follow-up with mmap. The Go language, for example, I think only uses mmap.

  2. Fazal Majid said,

    July 11, 2015 at 04:19

    Not all malloc(3c) implementations use brk/sbrk under the hood, in fact I am surprised any still do, what with ASLR and all. Many (most?) are now based on anonymous mmap(2) instead, with no file specified (Mac OS X, iOS, FreeBSD, OpenBSD and Solaris).
