Page Fault Handling

When a thread accesses a virtual address that the hardware cannot translate, a page fault occurs. The kernel classifies the fault and resolves it through one of two paths: a kernel fast-path that requires no IPC, or a slow-path that delegates to the userspace memory manager server (mmsrv).

Fault Dispatch Overview

If a VmArea exists but its backing MO pointer is null, handle_cow_fault returns Err(VSpaceError::NotMapped) (H9 behavior change — previously this was a silent success). The fault escalates to the slow path, which delivers a VMFault IPC to mmsrv.

PageFaultInfo

Each architecture constructs a PageFaultInfo from its native fault registers, providing a uniform interface to the generic fault handler:

pub struct PageFaultInfo {
    pub present: bool,  // page was present (permission violation, not unmapped)
    pub write: bool,    // fault caused by a write access
    pub user: bool,     // fault occurred in user mode
}

x86_64

Constructed from the page fault error code pushed by the CPU: bit 0 = present, bit 1 = write, bit 2 = user. Fault address from CR2.
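A minimal sketch of the x86_64 decode, assuming the bit layout above; `from_x86_error_code` is an illustrative constructor name, not necessarily the kernel's:

```rust
pub struct PageFaultInfo {
    pub present: bool, // bit 0: page was present (permission violation)
    pub write: bool,   // bit 1: fault caused by a write access
    pub user: bool,    // bit 2: fault occurred in user mode
}

impl PageFaultInfo {
    /// Decode the CPU-pushed page fault error code (illustrative name).
    pub fn from_x86_error_code(code: u64) -> Self {
        Self {
            present: code & (1 << 0) != 0,
            write: code & (1 << 1) != 0,
            user: code & (1 << 2) != 0,
        }
    }
}

fn main() {
    // 0b011 = present + write in kernel mode, e.g. a kernel-side COW write.
    let info = PageFaultInfo::from_x86_error_code(0b011);
    assert!(info.present && info.write && !info.user);
}
```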

aarch64

Constructed from ESR_EL1 (Exception Syndrome Register): EC field identifies data abort vs instruction abort, ISS field provides WnR (write/read) and DFSC (fault status code). Fault address from FAR_EL1.

IPC Error Code Encoding

PageFaultInfo is encoded into a canonical error code for the VMFault IPC message:

Bit 0 — Present  (1 = permission fault, 0 = translation/not-present)
Bit 1 — Write    (1 = write access)
Bit 2 — User     (1 = user mode)
Bit 4 — I/D      (1 = instruction fetch)

Both architectures produce this same encoding, so mmsrv handles faults uniformly.
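The encoding amounts to a straightforward bit pack; this sketch uses an illustrative function name, and leaves bit 3 clear as in the table above:

```rust
/// Pack fault bits into the canonical VMFault error code.
/// Bit 3 is unused in the layout above and stays clear.
fn encode_fault(present: bool, write: bool, user: bool, instruction: bool) -> u64 {
    (present as u64)
        | (write as u64) << 1
        | (user as u64) << 2
        | (instruction as u64) << 4
}

fn main() {
    // User-mode write to a present page: a permission fault (bits 0..2).
    assert_eq!(encode_fault(true, true, true, false), 0b0_0111);
    // User-mode instruction fetch from an unmapped page (bits 2 and 4).
    assert_eq!(encode_fault(false, false, true, true), 0b1_0100);
}
```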

Kernel Fast-Path Faults

Fast-path faults are resolved entirely in the kernel without context switching to mmsrv. This is critical for performance — COW faults during fork and demand faults during heap growth are extremely frequent.

COW Write Fault

Triggered when a thread writes to a page marked copy-on-write.

Conditions
  • PTE is present (ENTRY_PRESENT set).

  • PTE is not writable (ENTRY_WRITABLE clear).

  • PTE has the COW marker (ENTRY_COW set, bit 9).

  • The VmArea permits writes.
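These conditions can be sketched as a predicate over the raw PTE. ENTRY_COW = bit 9 is stated above; the ENTRY_PRESENT and ENTRY_WRITABLE positions (bits 0 and 1) are assumed here to follow the usual x86_64 layout:

```rust
const ENTRY_PRESENT: u64 = 1 << 0;  // assumed bit position
const ENTRY_WRITABLE: u64 = 1 << 1; // assumed bit position
const ENTRY_COW: u64 = 1 << 9;      // stated above

/// True when a write fault on `pte` qualifies for the COW fast path.
fn is_cow_candidate(pte: u64, vma_allows_write: bool) -> bool {
    pte & ENTRY_PRESENT != 0
        && pte & ENTRY_WRITABLE == 0
        && pte & ENTRY_COW != 0
        && vma_allows_write
}

fn main() {
    let cow_pte = 0x1000 | ENTRY_PRESENT | ENTRY_COW; // present, read-only, COW-marked
    assert!(is_cow_candidate(cow_pte, true));
    // A plain read-only mapping without the COW marker must escalate.
    assert!(!is_cow_candidate(0x2000 | ENTRY_PRESENT, true));
}
```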

Resolution
  1. Identify the backing MemoryObject from the VmArea.

  2. Allocate a new frame tagged KernelPrivate{General} from the PMM.

  3. Copy 4,096 bytes from the original page to the new frame.

  4. Call cow_install_atomic:

    1. Reserve the radix slot with PHYS_TAG_BUSY under MO.commit_lock.

    2. Write the new PTE with write permission, issue TLB invalidation, call pmm_retain_mapping.

    3. Finalize the radix slot: replace PHYS_TAG_BUSY with the new physical address.

    4. Call pmm_set_owner(MoData) as the commit point.

  5. On failure: the frame is freed to the PMM; the PTE is never modified. A RaceLost return from cow_install_atomic means another CPU already resolved the fault — translate to Ok(true) and resume the thread.

  6. Resume the faulting thread.
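Steps 2 through 4 can be simulated on plain buffers and a raw PTE value. The real path operates on physical frames and commits through cow_install_atomic; here, bit positions other than the stated ENTRY_COW bit 9 are assumptions:

```rust
const PAGE_SIZE: usize = 4096;
const ENTRY_WRITABLE: u64 = 1 << 1; // assumed bit position
const ENTRY_COW: u64 = 1 << 9;      // stated above

/// Copy the faulted page and build the new, writable PTE.
fn resolve_cow(original: &[u8], old_pte: u64, new_frame: u64) -> (Vec<u8>, u64) {
    assert_eq!(original.len(), PAGE_SIZE);
    let copy = original.to_vec();               // step 3: copy 4,096 bytes
    let flags = (old_pte & 0xfff) & !ENTRY_COW; // keep low flags, drop COW marker
    (copy, new_frame | flags | ENTRY_WRITABLE)  // writable PTE for step 4
}

fn main() {
    let page = vec![0xAB; PAGE_SIZE];
    let old_pte = 0x5000 | 1 | ENTRY_COW; // present, read-only, COW
    let (copy, new_pte) = resolve_cow(&page, old_pte, 0x8000);
    assert_eq!(copy, page);
    assert_eq!(new_pte & !0xfff, 0x8000);
    assert!(new_pte & ENTRY_WRITABLE != 0 && new_pte & ENTRY_COW == 0);
}
```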

The fast path takes no locks shared with mmsrv, so there is no lock contention. Note that commit_lock is now held during all radix reads on the COW chain, not just during writes; in particular, self.cow_parent must only be read while holding self.commit_lock.

Demand Page Fault

Triggered when a thread accesses a page that has been mapped but not yet backed by a physical frame.

Conditions
  • PTE is not present (ENTRY_PRESENT clear).

  • PTE has the demand marker (ENTRY_DEMAND set, bit 10).

Resolution
  1. handle_demand_fault() looks up the VmArea and its backing MemoryObject.

  2. If the MO is anonymous, a zero-filled frame is allocated from the fault-path source (the untyped primary, with PMM fallback) and inserted into the MO’s radix tree.

  3. If the MO is pager-backed, the page is committed at the faulted page index using the MO’s normal commit path.

  4. The PTE is rewritten with the new physical address, ENTRY_PRESENT = 1, ENTRY_DEMAND = 0, and permissions drawn from the VmArea.

  5. VSpace.vm_demand_pages is decremented and the TLB entry is invalidated before resuming the thread.

Demand markers are installed by VSPACE_MAP_DEMAND (0x59), VSPACE_MAP_DEMAND_RANGE (0x5A), and by VSPACE_MAP_MO when the caller passes VSPACE_FLAG_DEMAND for MO pages that are not yet committed at map time.
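The PTE rewrite in step 4 can be sketched as bit manipulation. ENTRY_DEMAND = bit 10 is given above; ENTRY_PRESENT = bit 0 is assumed, and `perms` stands for the permission bits drawn from the VmArea:

```rust
const ENTRY_PRESENT: u64 = 1 << 0; // assumed bit position
const ENTRY_DEMAND: u64 = 1 << 10; // stated above

/// Rewrite a demand-marked PTE with a freshly committed frame.
fn resolve_demand(pte: u64, frame: u64, perms: u64) -> u64 {
    // Fast-path precondition: not present, demand marker set.
    assert!(pte & ENTRY_PRESENT == 0 && pte & ENTRY_DEMAND != 0);
    frame | perms | ENTRY_PRESENT // demand marker dropped, frame installed
}

fn main() {
    let new_pte = resolve_demand(ENTRY_DEMAND, 0x9000, 1 << 1);
    assert!(new_pte & ENTRY_PRESENT != 0 && new_pte & ENTRY_DEMAND == 0);
    assert_eq!(new_pte & !0xfff, 0x9000);
}
```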

Stack Growth

Triggered when a thread accesses an address just below the current stack region.

Conditions
  • Fault address is within one page below the bottom of an existing stack VmArea.

  • The VmArea is identified as a stack region.

Resolution
  1. Extend the stack VmArea downward by one page.

  2. Commit a frame for the new page.

  3. Install the PTE.

  4. Resume the faulting thread.
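The growth check above can be sketched as a predicate (names illustrative): the fault qualifies when its page is exactly the page below the stack VmArea's current bottom.

```rust
const PAGE_SIZE: u64 = 4096;

/// True when `fault_addr` lands in the page just below the stack bottom.
fn is_stack_growth(fault_addr: u64, stack_bottom: u64, is_stack_vma: bool) -> bool {
    let fault_page = fault_addr & !(PAGE_SIZE - 1); // page-align the fault address
    is_stack_vma && fault_page + PAGE_SIZE == stack_bottom
}

fn main() {
    let bottom = 0x8000_0000;
    assert!(is_stack_growth(bottom - 0x10, bottom, true));    // just below the stack
    assert!(!is_stack_growth(bottom - 0x2000, bottom, true)); // two pages below: slow path
    assert!(!is_stack_growth(bottom - 0x10, bottom, false));  // not a stack VmArea
}
```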

Pooled COW Fast-Path (handle_cow_fault_pooled)

When cow_pool_phys != 0 (the VSpace has a non-empty CowPool), the fault handler uses the pooled path instead of allocating from the PMM:

  1. Pop a frame from the CowPool ring buffer (head index). The frame is already tagged KernelPrivate{CowPool}.

  2. Call cow_install_atomic with the pool frame.

  3. On success: the frame transitions to MoData at the cow_install_atomic commit point and the PTE is live.

  4. On RaceLost: free the pool entry back to the PMM with pmm_free(KernelPrivate{CowPool}), then return Ok(false) so the fault escalates to mmsrv for pool replenishment.
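The pool pop in step 1 can be modeled as a ring buffer with a head index (struct and field names are illustrative; the real pool is addressed via cow_pool_phys):

```rust
/// Simplified model of the CowPool ring buffer.
struct CowPool {
    frames: Vec<u64>, // physical addresses, pre-tagged KernelPrivate{CowPool}
    head: usize,      // next slot to pop
    len: usize,       // live entries
}

impl CowPool {
    /// Pop the next pre-allocated frame, or None when the pool is drained.
    fn pop(&mut self) -> Option<u64> {
        if self.len == 0 {
            return None; // empty: the handler falls back to the non-pooled path
        }
        let frame = self.frames[self.head];
        self.head = (self.head + 1) % self.frames.len();
        self.len -= 1;
        Some(frame)
    }
}

fn main() {
    let mut pool = CowPool { frames: vec![0x1000, 0x2000], head: 0, len: 2 };
    assert_eq!(pool.pop(), Some(0x1000));
    assert_eq!(pool.pop(), Some(0x2000));
    assert_eq!(pool.pop(), None); // drained: mmsrv must replenish
}
```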

RaceLost Behavior Contract

The asymmetry between the two COW fault paths is load-bearing for correctness:

  • handle_cow_fault (non-pooled): RaceLost returns Ok(true). Fault is resolved — another CPU installed the page. Resume the faulting thread immediately.

  • handle_cow_fault_pooled: RaceLost returns Ok(false). Pool entry was wasted; escalate to mmsrv so it can replenish the pool before the next fault.

In both cases the frame is freed and the PTE is not modified. The difference is whether the fault handler considers the fault resolved (Ok(true)) or requests mmsrv intervention (Ok(false)). Using Ok(true) in the pooled path would silently drain the pool without triggering replenishment, eventually leaving the VSpace with no pool frames.
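The contract can be pinned down with a small sketch (enum and function names illustrative), where the returned boolean mirrors the Ok(true)/Ok(false) distinction:

```rust
#[derive(Clone, Copy)]
enum Install { Done, RaceLost }

/// Non-pooled path: RaceLost means another CPU already resolved the fault.
fn nonpooled_resolved(r: Install) -> bool {
    match r {
        Install::Done => true,
        Install::RaceLost => true, // resume immediately: Ok(true)
    }
}

/// Pooled path: RaceLost wasted a pool entry, so escalate for replenishment.
fn pooled_resolved(r: Install) -> bool {
    match r {
        Install::Done => true,
        Install::RaceLost => false, // escalate to mmsrv: Ok(false)
    }
}

fn main() {
    assert!(nonpooled_resolved(Install::RaceLost));
    assert!(!pooled_resolved(Install::RaceLost));
}
```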

Slow-Path Faults (mmsrv IPC)

If the fault does not match any fast-path pattern, the kernel sends a VM fault message to the faulting thread’s fault endpoint. The memory manager server (mmsrv) receives the message, determines the appropriate action (e.g., commit pages, extend mappings, report a segfault), and replies to resume the thread.

Fault Message Format

The fault is delivered as a standard IPC message on the thread’s fault endpoint:

  • label: 2 (FaultType::VMFault). Identifies this as a VM fault.

  • length: 4. Four message registers.

  • regs[0]: fault address. The virtual address that caused the fault (CR2 / FAR_EL1).

  • regs[1]: error code. Encoded PageFaultInfo (present, write, user, instruction bits).

  • regs[2]: instruction pointer. The faulting instruction address (RIP / ELR_EL1).

  • regs[3]: is_instruction_fault. 1 if the fault was caused by an instruction fetch, 0 otherwise.
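Assembling the message can be sketched as follows. The label value 2 and the register layout are as stated above; the struct name is illustrative:

```rust
const VM_FAULT_LABEL: u64 = 2; // FaultType::VMFault

/// The label and four message registers of a VMFault IPC.
struct FaultMsg {
    label: u64,
    regs: [u64; 4],
}

fn build_vm_fault(fault_addr: u64, error_code: u64, ip: u64, is_ifetch: bool) -> FaultMsg {
    FaultMsg {
        label: VM_FAULT_LABEL,
        regs: [fault_addr, error_code, ip, is_ifetch as u64],
    }
}

fn main() {
    let msg = build_vm_fault(0xdead_b000, 0b0111, 0x40_1000, false);
    assert_eq!(msg.label, 2);
    assert_eq!(msg.regs[0], 0xdead_b000);
    assert_eq!(msg.regs[3], 0); // not an instruction fetch
}
```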

Fault Handler Flow

  1. The faulting thread enters BlockedReason::FaultBlocked with the fault message.

  2. The kernel sends the message to the fault endpoint.

  3. mmsrv (or another fault handler) receives the message via Recv.

  4. The handler examines the fault address and error code, performs the necessary memory operations (e.g., MO_COMMIT, VSPACE_MAP_MO).

  5. The handler replies to the fault message.

  6. The kernel receives the reply and resumes the faulting thread.

Reply-to-Resume

Replying to a fault message resumes the faulted thread at the faulting instruction. The thread re-executes the instruction that caused the fault. If the handler has fixed the fault (committed the page, installed the mapping), the instruction succeeds. If the handler has not fixed the fault (e.g., segfault), the handler can kill the thread instead of replying.

Other Fault Types

User Exception (label 4)

Delivered when a thread triggers a CPU exception that is not a page fault (e.g., illegal instruction, alignment fault):

  • regs[0]: exception vector number.

  • regs[1]: error code.

  • regs[2]: faulting instruction pointer.

  • regs[3]: faulting stack pointer.

Capability Fault (label 1)

Delivered when a thread’s capability lookup fails during a syscall (e.g., the thread tries to invoke a null or invalid capability slot). This is informational — the fault handler can log it and decide whether to terminate the thread.

Performance Considerations

The fast-path / slow-path split is critical to microkernel performance:

COW fast-path

A fork followed by exec in a typical shell invocation can trigger hundreds of COW faults. If each fault required an IPC round-trip to mmsrv (send fault → mmsrv recv → mmsrv commit → mmsrv reply → thread resume), the overhead would dominate fork time. The kernel fast-path resolves each fault in a single trap without leaving the kernel.

Demand fast-path

Heap growth via mmap + first-touch triggers demand faults. The kernel fast-path avoids the IPC round-trip for each page.

Slow-path

Reserved for cases that require policy decisions: mapping new regions, handling unmapped addresses, extending non-stack regions, and reporting segfaults. mmsrv has full context about the process’s memory layout and can make informed decisions.