POSIX Threads

trona_posix::pthread is the largest single POSIX module at 965 lines, but most of it is plumbing. The actual locking primitives live in substrate/sync.rs, thread lifecycle is delegated to procmgr, and per-thread storage uses the substrate ThreadDesc pool. What is left in pthread.rs is the POSIX-flavored entry points and the cancellation state machine.

This page is about thread lifecycle. pthread mutex, condvar, rwlock, once, and key handling are wrappers around trona::sync::*, and basaltc’s pthread C ABI is their primary user; for those, see basalt: Threads and Synchronization.

pthread_t encoding

A pthread_t is a 64-bit handle:

pub type PthreadT = u64;  // = (pool_index << 32) | generation

The pool index points into the substrate ThreadDesc pool (max 64 entries — see Threads, TLS, and Worker Pool). The generation increments every time a slot is reused. This lets pthread_join detect "stale handle pointing at a reaped slot" — if the generation in the handle does not match the generation in the slot, the join returns ESRCH instead of joining the wrong thread.
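The encoding and the stale-handle check can be sketched as follows. This is a minimal illustration, not the trona_posix source: the helper names (encode_handle, decode_handle, resolve_handle), the 32-bit width of the generation, and the numeric value of ESRCH are assumptions made for the example.

```rust
/// Mirror of the handle type described above.
pub type PthreadT = u64;

/// POSIX ESRCH; the value 3 is the common Linux/BSD number and is an
/// assumption here.
pub const ESRCH: i32 = 3;

/// Pack a pool index and a generation as (pool_index << 32) | generation.
pub fn encode_handle(pool_index: u32, generation: u32) -> PthreadT {
    ((pool_index as u64) << 32) | (generation as u64)
}

/// Split a handle back into (pool_index, generation).
pub fn decode_handle(handle: PthreadT) -> (u32, u32) {
    ((handle >> 32) as u32, handle as u32)
}

/// Resolve a handle against a table holding the current generation of each
/// slot (hypothetical stand-in for the ThreadDesc pool). Returns the pool
/// index, or ESRCH when the index is out of range or the slot was reused.
pub fn resolve_handle(slot_generations: &[u32], handle: PthreadT) -> Result<usize, i32> {
    let (index, generation) = decode_handle(handle);
    match slot_generations.get(index as usize) {
        Some(&current) if current == generation => Ok(index as usize),
        _ => Err(ESRCH),
    }
}
```

A handle minted for slot 5 at generation 2 resolves while the slot still holds generation 2, and is rejected with ESRCH once the slot has been reused.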

PthreadAttr

#[repr(C)]
pub struct PthreadAttr {
    pub stack_size: u64,
    pub detach_state: u32,
    _pad: u32,
}

Only two fields matter:

  • stack_size: bytes; 0 means use the default (DEFAULT_STACK_SIZE = 2 MiB).

  • detach_state: 0 for joinable (default), 1 for detached.

Other POSIX pthread attributes (scheduling policy, priority, scope, guard size, …) are accepted by the basaltc-side wrappers but stored in the C pthread_attr_t and not propagated through to trona_posix. trona_posix only honors stack size and detach state.
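How the two honored fields might be resolved into effective settings is sketched below. The resolve_attr helper, the PthreadAttr::new constructor, and the DETACH_* constant names are hypothetical; treating a missing attribute object as "all defaults" follows POSIX semantics for a NULL attr and is assumed to hold here as well.

```rust
#[repr(C)]
pub struct PthreadAttr {
    pub stack_size: u64,
    pub detach_state: u32,
    _pad: u32,
}

/// Default stack size named in the text: 2 MiB.
pub const DEFAULT_STACK_SIZE: u64 = 2 * 1024 * 1024;
/// Hypothetical names for the two detach_state values.
pub const DETACH_JOINABLE: u32 = 0;
pub const DETACH_DETACHED: u32 = 1;

impl PthreadAttr {
    /// Hypothetical constructor for the example.
    pub fn new(stack_size: u64, detach_state: u32) -> Self {
        PthreadAttr { stack_size, detach_state, _pad: 0 }
    }
}

/// Returns (effective stack size in bytes, detached?).
/// `None` models a null attr pointer and yields the defaults.
pub fn resolve_attr(attr: Option<&PthreadAttr>) -> (u64, bool) {
    match attr {
        None => (DEFAULT_STACK_SIZE, false),
        Some(a) => {
            // stack_size == 0 means "use the default".
            let size = if a.stack_size == 0 { DEFAULT_STACK_SIZE } else { a.stack_size };
            (size, a.detach_state == DETACH_DETACHED)
        }
    }
}
```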

pthread_create

pub unsafe fn pthread_create(
    thread_out: *mut PthreadT,
    attr: *const PthreadAttr,
    start_routine: extern "C" fn(*mut c_void) -> *mut c_void,
    arg: *mut c_void,
) -> i32;

The implementation orchestrates several layers:

  1. Allocate a thread descriptor. Call tls::allocate_thread(ThreadOwner::Posix) to reserve a slot in the substrate ThreadDesc pool. If full, return EAGAIN.

  2. Allocate stack and TLS block. Use posix_mmap (anonymous, RW) to allocate the requested stack size plus a guard page. Allocate the static TLS block separately and copy in the __trona_tls_template.

  3. Compute the IPC buffer vaddr. Each thread needs its own IPC buffer. The vaddr is reserved at thread creation and recorded in the ThreadDesc for later cleanup.

  4. Send PM_THREAD_CREATE to procmgr. The message carries: entry function (start_routine), arg, computed initial RSP (top of the new stack minus the trampoline frame), TLS base (top of the TLS block), and the IPC buffer vaddr. procmgr creates a new TCB, configures it, and resumes it; replies with the new TCB cap.

  5. Store the TCB cap and procmgr-assigned tid. Both go into a PosixThreadExt struct attached to the ThreadDesc.personality_data pointer.

  6. Build the PthreadT handle. (pool_index << 32) | generation is written to *thread_out.

  7. Return 0 to the caller.

The new thread runs concurrently with the parent’s return from pthread_create. It begins in a small assembly trampoline that loads its TLS pointer, runs start_routine(arg) through the Rust entry function described below, and calls pthread_exit(retval) when that returns.
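The address arithmetic behind steps 2 and 4 can be illustrated with a small planning function. Everything specific here is an assumption for the sketch: the 4 KiB page size, placing the guard page at the low end of the mapping (the usual choice for a downward-growing stack), the 16-byte stack pointer alignment, and the frame_size parameter standing in for the trampoline frame, whose real size the text does not give.

```rust
/// Assumed page size; the source does not state it.
pub const PAGE_SIZE: u64 = 4096;

#[derive(Debug, PartialEq, Eq)]
pub struct StackPlan {
    /// Total bytes to map: one guard page plus the rounded-up stack.
    pub map_len: u64,
    /// Address of the guard page (lowest page of the mapping).
    pub guard_base: u64,
    /// One past the highest usable stack byte.
    pub stack_top: u64,
    /// Initial stack pointer handed to the new thread.
    pub initial_sp: u64,
}

fn round_up(value: u64, align: u64) -> u64 {
    (value + align - 1) / align * align
}

/// Plan a stack mapping that starts at `map_base`.
/// `frame_size` is a hypothetical trampoline frame size in bytes.
pub fn plan_stack(map_base: u64, requested: u64, frame_size: u64) -> StackPlan {
    let stack_size = round_up(requested, PAGE_SIZE);
    let stack_top = map_base + PAGE_SIZE + stack_size;
    StackPlan {
        map_len: PAGE_SIZE + stack_size,
        guard_base: map_base,
        stack_top,
        // Top of stack minus the trampoline frame, aligned down to 16 bytes.
        initial_sp: (stack_top - frame_size) & !15,
    }
}
```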

The trampoline

The thread starts at a tiny per-arch trampoline that:

  1. Loads the thread pointer (FS base on x86_64, tpidr_el0 on aarch64) from the TLS base computed in step 4 above.

  2. Calls _pthread_thread_entry(start_routine, arg) — a Rust function that performs final per-thread setup (cancellation state, signal mask copy from parent) and then start_routine(arg).

  3. On return from _pthread_thread_entry, calls pthread_exit(retval).

The trampoline is only a few instructions long. It is generated by the substrate at the time of the tcb_configure call rather than being a fixed entry point.

pthread_exit

pub unsafe fn pthread_exit(retval: *mut c_void) -> !;

Marks the thread as exited and never returns. The implementation:

  1. Set the ThreadDesc.state to TD_EXITED.

  2. If the thread is detached, transition straight to TD_REAPING. If joinable, leave it for a future pthread_join to reap.

  3. Run any registered cleanup handlers from the thread’s cleanup stack (in reverse order).

  4. Run any TLS destructors registered through pthread_key_create.

  5. Send PM_THREAD_EXIT to procmgr with the retval. procmgr destroys the TCB and frees the SchedContext.

  6. The kernel never returns control to this TCB after the PM_THREAD_EXIT reply, so the function’s return type is !.

If the exiting thread is the main thread (pool index 0), pthread_exit is treated as a process exit — calling pthread_exit from main is equivalent to calling exit(retval as i32) and tears down the whole process.

pthread_join and pthread_detach

pub unsafe fn pthread_join(thread: PthreadT, retval: *mut *mut c_void) -> i32;
pub unsafe fn pthread_detach(thread: PthreadT) -> i32;

pthread_join blocks the caller until the target thread reaches TD_EXITED, copies its retval into *retval, and reaps the slot (transitioning TD_EXITED → TD_REAPING → TD_FREE). The blocking is implemented through the substrate Condvar on the target’s PosixThreadExt; pthread_exit signals it as part of its teardown.

If the target was already detached, pthread_join returns EINVAL. If the target’s generation does not match the handle’s, it returns ESRCH.

pthread_detach flips the detached flag in PosixThreadExt and sends PM_THREAD_DETACH to procmgr so that procmgr knows not to expect a join. After detach, calling pthread_join on the same handle returns EINVAL.
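The exit and join rules above can be modeled as a small, non-blocking state machine. This is a sketch only: the Slot type, on_exit, and try_join are hypothetical names, the errno values are the common Linux numbers, the real join blocks on a Condvar where this model returns Ok(None), and returning ESRCH for a slot that is already being reaped is an assumption.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ThreadState { Running, Exited, Reaping, Free }

pub const ESRCH: i32 = 3;   // assumed numeric value
pub const EINVAL: i32 = 22; // assumed numeric value

/// Hypothetical stand-in for a ThreadDesc plus its PosixThreadExt.
pub struct Slot {
    pub state: ThreadState,
    pub generation: u32,
    pub detached: bool,
    pub retval: usize,
}

impl Slot {
    pub fn new(generation: u32) -> Self {
        Slot { state: ThreadState::Running, generation, detached: false, retval: 0 }
    }
}

/// Models the state change made by pthread_exit: detached threads go
/// straight to Reaping, joinable ones wait in Exited for a joiner.
pub fn on_exit(slot: &mut Slot, retval: usize) {
    slot.retval = retval;
    slot.state = if slot.detached { ThreadState::Reaping } else { ThreadState::Exited };
}

/// Models the checks made by pthread_join without blocking.
/// Ok(Some(v)): joined and reaped. Ok(None): the caller would block.
pub fn try_join(slot: &mut Slot, handle_generation: u32) -> Result<Option<usize>, i32> {
    if slot.generation != handle_generation {
        return Err(ESRCH); // stale handle
    }
    if slot.detached {
        return Err(EINVAL); // detached threads cannot be joined
    }
    match slot.state {
        ThreadState::Running => Ok(None),
        ThreadState::Exited => {
            slot.state = ThreadState::Reaping;
            let retval = slot.retval;
            slot.state = ThreadState::Free;
            Ok(Some(retval))
        }
        // Assumption: a slot that is already being reaped or is free is gone.
        ThreadState::Reaping | ThreadState::Free => Err(ESRCH),
    }
}
```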

Cancellation

POSIX cancellation is the most subtle part of pthread, and trona_posix supports it through a small set of TLS-stored flags plus the cancellation hook installed on substrate’s sync primitives.

pub unsafe fn pthread_cancel(thread: PthreadT) -> i32;
pub unsafe fn pthread_setcancelstate(state: i32, oldstate: *mut i32) -> i32;
pub unsafe fn pthread_setcanceltype(type_: i32, oldtype: *mut i32) -> i32;
pub unsafe fn pthread_testcancel();

The state machine is:

  • PTHREAD_CANCEL_ENABLE / PTHREAD_CANCEL_DISABLE — controls whether the thread reacts to cancellation at all. Stored in the per-thread ThreadLocalBlock.

  • PTHREAD_CANCEL_DEFERRED / PTHREAD_CANCEL_ASYNCHRONOUS — controls when the cancellation takes effect. Only DEFERRED is implemented; setting ASYNCHRONOUS is accepted but treated as DEFERRED.

  • cancel_pending — set by pthread_cancel; checked by pthread_testcancel and by substrate’s sync primitives via the cancellation hook.

When pthread_cancel(other_thread) is called, the implementation finds the target’s ThreadLocalBlock (via the pool index), sets cancel_pending, and signals the target’s signal notification so that the target wakes up if it is blocked in a syscall.

When the target thread reaches a cancellation point (pthread_testcancel, or any blocking primitive in substrate sync that checks the cancellation hook), the cancellation runs:

  1. Pop and execute every cleanup handler in the thread’s cleanup stack.

  2. Call pthread_exit(PTHREAD_CANCELED), where PTHREAD_CANCELED is (void *) -1.

Asynchronous cancellation would require interrupting the thread mid-instruction, which trona_posix does not do — the only "asynchronous" event is the kernel-injected signal frame, and even that runs after returning from kernel mode rather than mid-instruction.
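The deferred-only model reduces to two pieces of per-thread state and one predicate evaluated at cancellation points. The sketch below assumes a hypothetical CancelState type, glibc-style numeric values for the PTHREAD_CANCEL_* constants, and an atomic flag for cancel_pending because it is written by a different thread; none of these details are confirmed by the text.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

pub const PTHREAD_CANCEL_ENABLE: i32 = 0;  // assumed value
pub const PTHREAD_CANCEL_DISABLE: i32 = 1; // assumed value
pub const EINVAL: i32 = 22;                // assumed value

/// Hypothetical per-thread cancellation state.
pub struct CancelState {
    state: i32,
    pending: AtomicBool,
}

impl CancelState {
    /// New threads start with cancellation enabled and nothing pending.
    pub fn new() -> Self {
        CancelState { state: PTHREAD_CANCEL_ENABLE, pending: AtomicBool::new(false) }
    }

    /// Models pthread_setcancelstate: returns the old state, or EINVAL.
    pub fn set_state(&mut self, new_state: i32) -> Result<i32, i32> {
        if new_state != PTHREAD_CANCEL_ENABLE && new_state != PTHREAD_CANCEL_DISABLE {
            return Err(EINVAL);
        }
        Ok(std::mem::replace(&mut self.state, new_state))
    }

    /// Models the flag write done by pthread_cancel from another thread.
    pub fn request_cancel(&self) {
        self.pending.store(true, Ordering::Release);
    }

    /// The check made at a cancellation point: act only if a request is
    /// pending and cancellation is currently enabled.
    pub fn should_cancel(&self) -> bool {
        self.state == PTHREAD_CANCEL_ENABLE && self.pending.load(Ordering::Acquire)
    }
}
```

A request made while cancellation is disabled stays pending and takes effect at the first cancellation point after the thread re-enables it.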

Cleanup handlers

pub unsafe fn pthread_cleanup_push_impl(routine: extern "C" fn(*mut c_void), arg: *mut c_void);
pub unsafe fn pthread_cleanup_pop_impl(execute: i32);

The cleanup handler stack is a small per-thread fixed-size array (default 32 entries) inside PosixThreadExt. push adds an entry; pop(execute=1) runs the entry; pop(execute=0) discards it.

Cleanup handlers run on cancellation, on pthread_exit, and on every explicit pop with execute=1. glibc implements the pthread_cleanup_push / pthread_cleanup_pop macros with a setjmp / longjmp pair; basaltc’s versions instead wrap these functions directly without any local trickery, because the trona_posix model already exposes them as plain function calls.
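A fixed-size cleanup stack with these push and pop semantics might look like the following. The CleanupStack type, its method names, and the choice to report overflow by returning false are assumptions for the sketch; the two append_* handlers exist only to make the execution order observable.

```rust
use std::ffi::c_void;

/// Capacity named in the text (default 32 entries).
pub const MAX_CLEANUP_HANDLERS: usize = 32;
pub type CleanupRoutine = extern "C" fn(*mut c_void);

/// Hypothetical per-thread cleanup stack.
pub struct CleanupStack {
    entries: [Option<(CleanupRoutine, *mut c_void)>; MAX_CLEANUP_HANDLERS],
    len: usize,
}

impl CleanupStack {
    pub fn new() -> Self {
        CleanupStack { entries: [None; MAX_CLEANUP_HANDLERS], len: 0 }
    }

    /// Push a handler; returns false when the stack is full (assumed policy).
    pub fn push(&mut self, routine: CleanupRoutine, arg: *mut c_void) -> bool {
        if self.len == MAX_CLEANUP_HANDLERS {
            return false;
        }
        self.entries[self.len] = Some((routine, arg));
        self.len += 1;
        true
    }

    /// Pop the most recent handler, running it only when `execute` is true.
    pub fn pop(&mut self, execute: bool) {
        if self.len == 0 {
            return;
        }
        self.len -= 1;
        if let Some((routine, arg)) = self.entries[self.len].take() {
            if execute {
                routine(arg);
            }
        }
    }

    /// Run every remaining handler, newest first (cancellation or exit path).
    pub fn run_all(&mut self) {
        while self.len > 0 {
            self.pop(true);
        }
    }
}

/// Example handlers: each appends a decimal digit to the u32 behind `arg`.
pub extern "C" fn append_1(arg: *mut c_void) {
    unsafe { let log = arg as *mut u32; *log = *log * 10 + 1; }
}
pub extern "C" fn append_2(arg: *mut c_void) {
    unsafe { let log = arg as *mut u32; *log = *log * 10 + 2; }
}
```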

pthread_self

pub fn pthread_self() -> PthreadT;

Returns the calling thread’s handle. Implementation:

let index = tls::current_thread_index();
let desc = unsafe { &*tls::current_thread_desc() };
((index as u64) << 32) | (desc.generation as u64)  // (pool_index << 32) | generation

This is one of the cheapest operations in trona_posix — it’s three loads and a shift.

pthread_key_* and TLS

pthread_key_create, pthread_key_delete, pthread_setspecific, and pthread_getspecific are implemented as thin wrappers over a per-process key table (max 128 keys) plus per-thread value arrays in the PosixThreadExt. The destructor for each key (passed to pthread_key_create) is called on thread exit for any key whose value is not null.

This is the same machinery basaltc’s pthread C ABI exposes; the C wrappers in basaltc just forward to these Rust functions.
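The split between a per-process key table and per-thread value arrays can be sketched as below. KeyTable, ThreadValues, and their methods are hypothetical names, and the sketch makes a single destructor pass; clearing the slot before calling the destructor follows the POSIX rule for pthread_key_create destructors.

```rust
use std::ffi::c_void;
use std::ptr;

/// Key limit named in the text.
pub const MAX_KEYS: usize = 128;
pub type Destructor = extern "C" fn(*mut c_void);

/// Hypothetical per-process key table.
pub struct KeyTable {
    in_use: [bool; MAX_KEYS],
    destructors: [Option<Destructor>; MAX_KEYS],
}

/// Hypothetical per-thread value array, indexed by key.
pub struct ThreadValues {
    values: [*mut c_void; MAX_KEYS],
}

impl ThreadValues {
    pub fn new() -> Self {
        ThreadValues { values: [ptr::null_mut(); MAX_KEYS] }
    }
    pub fn set(&mut self, key: usize, value: *mut c_void) {
        self.values[key] = value;
    }
    pub fn get(&self, key: usize) -> *mut c_void {
        self.values[key]
    }
}

impl KeyTable {
    pub fn new() -> Self {
        KeyTable { in_use: [false; MAX_KEYS], destructors: [None; MAX_KEYS] }
    }

    /// Allocate the lowest free key; None when all 128 keys are taken.
    pub fn create(&mut self, destructor: Option<Destructor>) -> Option<usize> {
        let key = self.in_use.iter().position(|used| !*used)?;
        self.in_use[key] = true;
        self.destructors[key] = destructor;
        Some(key)
    }

    /// Thread-exit path: call the destructor of every live key whose value
    /// is non-null, clearing the slot first. Returns the number of calls.
    pub fn run_destructors(&self, thread: &mut ThreadValues) -> usize {
        let mut called = 0;
        for key in 0..MAX_KEYS {
            let value = thread.values[key];
            if let (true, Some(dtor)) = (self.in_use[key], self.destructors[key]) {
                if !value.is_null() {
                    thread.values[key] = ptr::null_mut();
                    dtor(value);
                    called += 1;
                }
            }
        }
        called
    }
}

/// Example destructor: increments the u32 counter behind `arg`.
pub extern "C" fn count_destructor(arg: *mut c_void) {
    unsafe { *(arg as *mut u32) += 1; }
}
```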

tls.rs — the per-thread accessors

posix/tls.rs (110 lines) is small enough that it deserves a paragraph rather than its own page.

It exposes four functions used by basaltc and by other trona_posix modules:

  • current_tls() — returns a pointer to the current thread’s ThreadLocalBlock.

  • current_errno() — returns a pointer to the per-thread errno slot.

  • init_main_thread_tls() — initializes TLS for the main thread (called once during process startup, after rtld has set the thread pointer).

  • tls_addr(module_id, offset) — resolves a TLS variable address for the General Dynamic TLS access model. Static TLS is fast; dynamically-loaded modules with their own TLS would use this — though SaltyOS does not currently support dynamic loading after process start.

Both current_errno and current_tls return null if called before TLS is initialized; callers must tolerate that during very early startup.
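A caller that tolerates the early-startup null could be structured like this. The set_errno_if_ready helper is hypothetical; it takes the pointer as a parameter so the sketch stays self-contained, where real code would pass the result of current_errno().

```rust
/// Store `value` into the errno slot if TLS is already initialized.
/// Returns false (and does nothing) when `slot` is null.
///
/// # Safety
/// `slot` must be null or point to a valid, writable i32.
pub unsafe fn set_errno_if_ready(slot: *mut i32, value: i32) -> bool {
    if slot.is_null() {
        // TLS not set up yet: drop the error code rather than fault.
        return false;
    }
    unsafe { *slot = value; }
    true
}
```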