Heap and malloc

basaltc provides its own heap allocator in lib/basalt/c/src/malloc.rs. The implementation is intentionally simple: a singly linked free list, first-fit search, bidirectional coalescing on free, and a single futex-based mutex for thread safety. There is no jemalloc, no mimalloc, no per-thread arenas — the design optimizes for code clarity and correctness over throughput. This page documents the layout, the allocation/free algorithms, the realloc and aligned-alloc variants, and the design constraints that determine when this allocator is appropriate.

Block Layout

Every allocation carries a 16-byte header placed immediately before the user pointer:

const HEADER_SIZE: usize = 16;
const ALIGN: usize = 16;

#[repr(C)]
struct BlockHeader {
    /// Total block size in bytes, including this header.
    size: usize,
    /// Pointer to the next free block (only valid when on the free list).
    next: *mut BlockHeader,
}

Memory layout for an allocation of N bytes:

+----------------+----------------+----------------------------+
|     size       |     next       |       user data            |
| (8 bytes)      | (8 bytes)      |   (rounded up to 16)       |
+----------------+----------------+----------------------------+
^                                 ^
header pointer                    user pointer (returned to caller)

The user pointer is 16-byte aligned because:

  1. sbrk returns a 16-byte aligned address (basaltc requires page alignment from the kernel, which exceeds 16).

  2. The header is exactly 16 bytes.

  3. malloc rounds the requested size up to a 16-byte multiple before adding the header.

16-byte alignment is enough for SSE2 packed doubles, all standard scalar types up to long double, and max_align_t (defined by C11 and C++11).
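The rounding in step 3 relies on an align_up helper, which malloc_inner calls but this page does not define. A minimal definition consistent with how it is used (assuming align is a power of two, which holds for ALIGN = 16) would be:

```rust
/// Round `n` up to the next multiple of `align`.
/// Assumes `align` is a power of two (true for ALIGN = 16).
const fn align_up(n: usize, align: usize) -> usize {
    (n + align - 1) & !(align - 1)
}
```

For example, a 1-byte request becomes align_up(1 + 16, 16) = 32: one 16-byte header plus one 16-byte user chunk.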

The next field is only meaningful while the block sits on the free list. For allocated blocks, next holds whatever was last written to it and is never read; the user data begins at header + HEADER_SIZE, immediately after the next slot, so every allocated block carries exactly 16 bytes of header overhead.

Free list invariants:

  • Sorted by address (ascending). This is the essential precondition for bidirectional coalescing.

  • Singly linked through the next pointer.

  • The list head is static mut FREE_LIST: *mut BlockHeader.

  • null terminates the list.
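The first and last invariants can be verified with a small debug walk. The harness below simulates a tiny free list inside a usize-aligned buffer; check_sorted and the buffer layout are illustrative only, not part of basaltc:

```rust
#[repr(C)]
struct BlockHeader {
    /// Total block size in bytes, including the header.
    size: usize,
    /// Next free block, or null at the end of the list.
    next: *mut BlockHeader,
}

/// Walk the free list and confirm it is address-sorted (ascending)
/// and null-terminated. Returns false at the first out-of-order link,
/// which also prevents looping on a corrupted, cyclic list.
unsafe fn check_sorted(head: *mut BlockHeader) -> bool {
    let mut prev_addr = 0usize;
    let mut curr = head;
    while !curr.is_null() {
        let addr = curr as usize;
        if addr <= prev_addr {
            return false;
        }
        prev_addr = addr;
        curr = (*curr).next;
    }
    true
}
```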

Allocation Algorithm

malloc(size) runs in two stages:

  1. First-fit search — walk the free list from the head, returning the first block whose size is at least total = align_up(size + HEADER_SIZE, ALIGN).

  2. sbrk fallback — if no free block is large enough, request more virtual address space from the kernel and use the new region as a brand-new block.

unsafe fn malloc_inner(size: usize) -> *mut u8 {
    if size == 0 {
        return core::ptr::null_mut();
    }
    let total = align_up(size + HEADER_SIZE, ALIGN);

    // First-fit search
    let mut prev: *mut BlockHeader = core::ptr::null_mut();
    let mut curr = FREE_LIST;
    while !curr.is_null() {
        if (*curr).size >= total {
            // ... split if large enough, then unlink curr from the
            //     free list via prev (or FREE_LIST when prev is null) ...
            return (curr as *mut u8).add(HEADER_SIZE);
        }
        prev = curr;
        curr = (*curr).next;
    }

    // sbrk fallback
    let ptr = trona_posix::mm::posix_sbrk(total as i64);
    if ptr == u64::MAX {
        errno::set_errno(errno::ENOMEM);
        return core::ptr::null_mut();
    }
    let header = ptr as *mut BlockHeader;
    (*header).size = total;
    (*header).next = core::ptr::null_mut();
    (header as *mut u8).add(HEADER_SIZE)
}

Splitting

When the first-fit block is significantly larger than total, malloc splits it. The threshold is total + HEADER_SIZE + ALIGN (32 bytes of overhead): the remainder must be large enough to host another header plus at least one aligned chunk of user data. Otherwise the entire block is taken as-is and the small overage is internal fragmentation.

Splitting puts the smaller new free block at current + total and links it back into the free list at the position previously occupied by current.
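The exact split code is elided from malloc_inner above; the carve-out step could be sketched as below (BlockHeader is repeated so the sketch is self-contained). Note that the caller still has to relink whatever pointed at curr so it points at the remainder:

```rust
#[repr(C)]
struct BlockHeader {
    size: usize,
    next: *mut BlockHeader,
}

/// Carve `total` bytes off the front of `curr` and return the remainder
/// as a new free block at `curr + total`. The caller must have verified
/// that (*curr).size >= total + HEADER_SIZE + ALIGN before splitting.
unsafe fn split_block(curr: *mut BlockHeader, total: usize) -> *mut BlockHeader {
    let remainder = (curr as *mut u8).add(total) as *mut BlockHeader;
    (*remainder).size = (*curr).size - total; // everything past the carve-out
    (*remainder).next = (*curr).next;         // inherit curr's list position
    (*curr).size = total;                     // shrink the allocated block
    remainder
}
```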

sbrk Fallback

When the free list is exhausted, posix_sbrk(total) extends the heap by exactly total bytes — no over-allocation. The allocator does not maintain a "wilderness" pool the way classic dlmalloc does. This keeps the implementation simple but means each allocation that misses the free list takes one IPC round-trip to mmsrv.

posix_sbrk returns the old break (the start of the new region). basaltc treats this as a fresh BlockHeader with size = total, no next, and immediately hands the user pointer back to the caller. The block does not enter the free list until it is freed.

If posix_sbrk returns u64::MAX (the sentinel for failure), malloc sets errno to ENOMEM and returns NULL.

Free Algorithm

free(ptr) runs in three steps:

  1. Find the insertion point in the address-sorted free list.

  2. Coalesce forward: if the block being freed is immediately followed by a free block, merge them.

  3. Coalesce backward: if the block being freed is immediately preceded by a free block, merge them.

unsafe fn free_inner(ptr: *mut u8) {
    if ptr.is_null() { return; }
    let header = ptr.sub(HEADER_SIZE) as *mut BlockHeader;

    let mut prev: *mut BlockHeader = core::ptr::null_mut();
    let mut curr = FREE_LIST;
    while !curr.is_null() && (curr as usize) < (header as usize) {
        prev = curr;
        curr = (*curr).next;
    }

    // Coalesce forward
    let header_end = (header as *mut u8).add((*header).size) as *mut BlockHeader;
    if header_end == curr {
        (*header).size += (*curr).size;
        (*header).next = (*curr).next;
    } else {
        (*header).next = curr;
    }

    // Coalesce backward
    if !prev.is_null() {
        let prev_end = (prev as *mut u8).add((*prev).size) as *mut BlockHeader;
        if prev_end == header {
            (*prev).size += (*header).size;
            (*prev).next = (*header).next;
        } else {
            (*prev).next = header;
        }
    } else {
        FREE_LIST = header;
    }
}

The coalescing is exact: blocks merge only when contiguous in address space (prev_end == header, header_end == curr). There is no looser tolerance. This means heap fragmentation in the long-running case depends entirely on the application’s allocation pattern; basaltc does not run a background defragmentation pass.

Concurrency Model

A single futex-based mutex (HEAP_LOCK) serializes every public entry point:

static HEAP_LOCK: trona::sync::Mutex = trona::sync::Mutex::new();

#[unsafe(no_mangle)]
pub unsafe extern "C" fn malloc(size: usize) -> *mut u8 {
    HEAP_LOCK.lock();
    let result = unsafe { malloc_inner(size) };
    HEAP_LOCK.unlock();
    result
}

The trona::sync::Mutex type implements a three-phase acquire:

  1. CAS — try a single compare-and-swap on the futex word. Most uncontended cases finish here.

  2. Spin — if CAS fails, spin a small bounded number of iterations in the hope that the holder releases quickly.

  3. futex_wait — only after spinning fails does the thread make a Futex syscall to block. This is the only path that enters the kernel.

For uncontended single-threaded code (the common case for many ports), every malloc/free call avoids the kernel entirely and costs only one atomic operation per lock and unlock. Under heavy contention multiple threads will queue inside the kernel, but the queue is a per-mutex futex slot, not a global lock, so other libc subsystems with their own locks (pthread, errno, stdio) are not affected.
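The three-phase acquire can be sketched with std atomics. SketchMutex below is illustrative only: the word encoding, the spin count of 64, and the use of thread::yield_now in place of the Futex syscall are assumptions, not trona::sync::Mutex internals:

```rust
use std::hint;
use std::sync::atomic::{AtomicU32, Ordering};

/// Illustrative futex-style mutex: 0 = unlocked, 1 = locked.
struct SketchMutex {
    word: AtomicU32,
}

impl SketchMutex {
    const fn new() -> Self {
        SketchMutex { word: AtomicU32::new(0) }
    }

    fn lock(&self) {
        // Phase 1: one CAS — the uncontended fast path.
        if self.word.compare_exchange(0, 1, Ordering::Acquire, Ordering::Relaxed).is_ok() {
            return;
        }
        loop {
            // Phase 2: bounded spin, hoping the holder releases quickly.
            for _ in 0..64 {
                if self.word.compare_exchange(0, 1, Ordering::Acquire, Ordering::Relaxed).is_ok() {
                    return;
                }
                hint::spin_loop();
            }
            // Phase 3: block. The real mutex issues a Futex syscall here;
            // yielding the thread is a portable stand-in for this sketch.
            std::thread::yield_now();
        }
    }

    fn unlock(&self) {
        // The real implementation also wakes one futex waiter here.
        self.word.store(0, Ordering::Release);
    }
}
```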

realloc holds the lock across both the inner malloc and the inner free to keep the operation atomic from the caller’s perspective.

realloc

pub unsafe extern "C" fn realloc(ptr: *mut u8, size: usize) -> *mut u8 {
    if ptr.is_null() { return malloc(size); }
    if size == 0 { free(ptr); return null_mut(); }

    HEAP_LOCK.lock();
    let header = ptr.sub(HEADER_SIZE) as *const BlockHeader;
    let old_data_size = (*header).size - HEADER_SIZE;
    let new_ptr = malloc_inner(size);
    if new_ptr.is_null() { HEAP_LOCK.unlock(); return null_mut(); }
    let copy_size = old_data_size.min(size);
    core::ptr::copy_nonoverlapping(ptr, new_ptr, copy_size);
    free_inner(ptr);
    HEAP_LOCK.unlock();
    new_ptr
}

basaltc’s realloc always allocates a new block and copies, then frees the old block. There is no in-place expansion or shrinking, even when the existing block has free space immediately after it. This is a known optimization opportunity but has not been needed in practice — most ports do realloc-light workloads.

realloc(NULL, n) behaves as malloc(n); realloc(ptr, 0) frees ptr and returns NULL.

calloc and Overflow Checking

pub unsafe extern "C" fn calloc(nmemb: usize, size: usize) -> *mut u8 {
    let total = match nmemb.checked_mul(size) {
        Some(t) => t,
        None => {
            errno::set_errno(errno::ENOMEM);
            return core::ptr::null_mut();
        }
    };
    let ptr = malloc(total);
    if !ptr.is_null() {
        core::ptr::write_bytes(ptr, 0, total);
    }
    ptr
}

checked_mul rejects overflow with ENOMEM instead of wrapping silently. reallocarray (an OpenBSD/glibc extension) follows the same pattern.
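That reallocarray pattern can be sketched as follows, with an AtomicI32 and a stub standing in for basaltc's errno machinery and realloc (both stand-ins are assumptions for illustration):

```rust
use std::sync::atomic::{AtomicI32, Ordering};

/// Stand-in for the errno machinery.
static ERRNO: AtomicI32 = AtomicI32::new(0);
const ENOMEM: i32 = 12;

/// Stand-in for realloc: pretends every allocation succeeds.
/// The returned pointer is only a marker and is never dereferenced.
fn realloc_stub(_ptr: *mut u8, total: usize) -> *mut u8 {
    total.max(1) as *mut u8
}

/// reallocarray = realloc with a checked element-count multiply.
fn reallocarray_sketch(ptr: *mut u8, nmemb: usize, size: usize) -> *mut u8 {
    match nmemb.checked_mul(size) {
        Some(total) => realloc_stub(ptr, total),
        None => {
            // Overflow: report ENOMEM instead of wrapping silently.
            ERRNO.store(ENOMEM, Ordering::Relaxed);
            std::ptr::null_mut()
        }
    }
}
```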

Aligned Allocation

aligned_alloc(alignment, size) and posix_memalign(memptr, alignment, size) provide allocations with stronger alignment than the default 16 bytes. The implementation is straightforward but not space-efficient:

pub unsafe extern "C" fn aligned_alloc(alignment: usize, size: usize) -> *mut u8 {
    if alignment == 0 || !alignment.is_power_of_two() {
        return core::ptr::null_mut();
    }
    if alignment <= ALIGN {
        return malloc(size);
    }
    let ptr = malloc(size + alignment);
    if ptr.is_null() { return null_mut(); }
    ((ptr as usize + alignment - 1) & !(alignment - 1)) as *mut u8
}

The over-allocation by alignment bytes guarantees an aligned address exists somewhere inside the returned block. basaltc returns the aligned address directly to the caller and does not record the offset between it and the malloc-returned base. This means freeing an aligned_alloc result with free will pass the wrong header pointer to free_inner, corrupting the heap.

In practice this is a documented limitation: aligned_alloc is rare in the SaltyOS port set, and the few callers either use posix_memalign (which has the same limitation) or stay with the default 16-byte alignment. Fixing this requires either a per-allocation prefix that records the offset to the original header, or a side table mapping aligned pointers to their original blocks. Neither has been implemented.
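The per-allocation-prefix option could work as sketched below. Everything here is an assumption about a fix that has not been implemented: std::alloc serves as the stand-in base allocator, and the free side takes the layout back only because dealloc requires it — in basaltc, free would instead read the stored back-pointer and hand the original header to free_inner.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::mem::{align_of, size_of};

/// Over-allocate, leave one back-pointer slot below the aligned address,
/// and stash the base there so the free path can recover it.
/// Assumes `alignment` is a power of two larger than the default 16.
unsafe fn aligned_alloc_prefix(alignment: usize, size: usize) -> *mut u8 {
    let extra = alignment + size_of::<usize>();
    let layout = Layout::from_size_align(size + extra, align_of::<usize>()).unwrap();
    let base = alloc(layout);
    if base.is_null() {
        return base;
    }
    // Skip past the back-pointer slot, then round up to `alignment`.
    let aligned = ((base as usize + size_of::<usize>() + alignment - 1)
        & !(alignment - 1)) as *mut u8;
    // Store the base immediately below the aligned address.
    (aligned as *mut *mut u8).sub(1).write(base);
    aligned
}

unsafe fn aligned_free_prefix(ptr: *mut u8, alignment: usize, size: usize) {
    // Recover the original base from the prefix slot.
    let base = (ptr as *mut *mut u8).sub(1).read();
    let extra = alignment + size_of::<usize>();
    let layout = Layout::from_size_align(size + extra, align_of::<usize>()).unwrap();
    dealloc(base, layout);
}
```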

strdup, strndup

strdup and strndup are convenience wrappers built on top of malloc:

pub unsafe extern "C" fn strdup(s: *const u8) -> *mut u8 {
    if s.is_null() { return null_mut(); }
    let len = crate::string::strlen(s);
    let p = malloc(len + 1);
    if !p.is_null() {
        core::ptr::copy_nonoverlapping(s, p, len + 1);
    }
    p
}

strndup follows the same pattern but caps the length at n and explicitly NUL-terminates. Both functions exist here rather than in string.rs because they are the only string.h functions that need the heap.
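strndup's cap-and-terminate behavior can be sketched as follows, with std::alloc in place of basaltc's malloc and a local strnlen in place of string.rs (both substitutions are assumptions of the sketch):

```rust
use std::alloc::{alloc, Layout};

/// Length of the NUL-terminated string at `s`, capped at `n`.
unsafe fn strnlen(s: *const u8, n: usize) -> usize {
    let mut i = 0;
    while i < n && *s.add(i) != 0 {
        i += 1;
    }
    i
}

/// Copy at most `n` bytes of `s` into a fresh allocation and
/// explicitly NUL-terminate the result.
unsafe fn strndup_sketch(s: *const u8, n: usize) -> *mut u8 {
    if s.is_null() {
        return std::ptr::null_mut();
    }
    let len = strnlen(s, n);
    let p = alloc(Layout::from_size_align(len + 1, 1).unwrap());
    if !p.is_null() {
        std::ptr::copy_nonoverlapping(s, p, len);
        *p.add(len) = 0; // terminator, even when the copy was truncated
    }
    p
}
```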

Limitations

  • No size class binning. All allocations share one free list, so allocation latency is O(N) in the number of free blocks.

  • No thread-local arenas. Heavy multi-threaded workloads may contend on HEAP_LOCK.

  • No background coalescing. Heap fragmentation depends on the allocation pattern.

  • No wilderness pool. Each sbrk call requests exactly the bytes needed.

  • aligned_alloc wastes the prefix and its results cannot be safely freed. The unaligned head bytes are never reclaimed, and passing the aligned pointer to free corrupts the heap.

  • No malloc_usable_size. The header is private; basaltc does not expose it.

  • No mallopt/mallinfo. No tuning parameters or statistics.

  • realloc always copies. No in-place expansion when the next block is free and contiguous.

These limitations are acceptable today because basaltc serves a userland made up mostly of small daemons and ported FreeBSD utilities, where allocation rates and pressure are low. A future replacement with a size-class allocator would slot in cleanly because the public ABI (malloc, free, realloc, calloc, aligned_alloc, posix_memalign, strdup, strndup, reallocarray) is small and stable.

Pre-Heap Constraints

The heap is implicitly initialized on the first malloc call: there is no heap_init function, and FREE_LIST starts as null. The first call walks an empty list, falls through to posix_sbrk, and creates the first block. This means basaltc startup code that runs before init_mm_from_auxv (in CRT Startup) cannot allocate, because posix_sbrk will fail without the mmsrv handshake.

In practice this is invisible: nothing in crt_start.S or the early portion of __libc_start_main calls malloc. The first allocation happens inside env::init_environ or stdio::ensure_stdio_init, both of which run after init_mm_from_auxv completes.