Page Fault Handling
When a thread accesses a virtual address that the hardware cannot translate, a page fault occurs. The kernel classifies the fault and resolves it through one of two paths: a kernel fast-path that requires no IPC, or a slow-path that delegates to the userspace memory manager server (mmsrv).
Fault Dispatch Overview
If a VmArea exists but its backing MO pointer is null, handle_cow_fault returns Err(VSpaceError::NotMapped) (H9 behavior change — previously this was a silent success). The fault escalates to the slow path, which delivers a VMFault IPC to mmsrv.
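The null-backing case can be illustrated with a minimal sketch. The reduced `VmArea`, the `MemoryObject` placeholder, and the `needs_slow_path` helper are hypothetical and exist only for illustration; only `handle_cow_fault` and `VSpaceError::NotMapped` are names taken from the text, and the real signature may differ.

```rust
/// Stand-in for the kernel's error type; only the variant named in the text is modeled.
#[derive(Debug, PartialEq, Eq)]
pub enum VSpaceError {
    NotMapped,
}

/// Hypothetical placeholder for a memory object.
#[derive(Debug)]
pub struct MemoryObject;

/// Hypothetical, reduced VmArea: only the backing-MO pointer matters here.
#[derive(Debug)]
pub struct VmArea {
    pub backing: Option<MemoryObject>,
}

/// Returns Ok(true) when the fault was resolved in the kernel.
pub fn handle_cow_fault(area: &VmArea) -> Result<bool, VSpaceError> {
    // A VmArea without a backing MO is an error, not a silent success.
    let _mo = area.backing.as_ref().ok_or(VSpaceError::NotMapped)?;
    // Frame allocation, page copy, and PTE install are omitted in this sketch.
    Ok(true)
}

/// Assumption: anything other than Ok(true) leaves the fault unresolved,
/// so the dispatcher falls through to the slow path (VMFault IPC to mmsrv).
pub fn needs_slow_path(result: &Result<bool, VSpaceError>) -> bool {
    !matches!(result, Ok(true))
}
```

With this shape, an area whose backing pointer is null yields `Err(VSpaceError::NotMapped)` and is routed to the slow path rather than being treated as resolved.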
PageFaultInfo
Each architecture constructs a PageFaultInfo from its native fault registers, providing a uniform interface to the generic fault handler:
pub struct PageFaultInfo {
    pub present: bool, // page was present (permission violation, not unmapped)
    pub write: bool,   // fault caused by a write access
    pub user: bool,    // fault occurred in user mode
}
- x86_64: Constructed from the page fault error code pushed by the CPU: bit 0 = present, bit 1 = write, bit 2 = user. The fault address is read from CR2.
- aarch64: Constructed from ESR_EL1 (Exception Syndrome Register): the EC field distinguishes a data abort from an instruction abort, and the ISS field provides WnR (write/read) and DFSC (fault status code). The fault address is read from FAR_EL1.
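As a rough sketch of the two constructors, the following decodes each architecture's raw fault value into a PageFaultInfo. The function names `from_x86_error_code` and `from_esr_el1` are hypothetical. The ESR_EL1 bit positions (EC in bits 31:26, WnR in bit 6, DFSC in bits 5:0, permission faults encoded as 0b0011xx) follow the Arm architecture; treating only permission faults as "present" is an assumption about the kernel's policy.

```rust
/// Uniform fault description, mirroring the struct shown above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PageFaultInfo {
    pub present: bool,
    pub write: bool,
    pub user: bool,
}

/// x86_64: decode the error code pushed by the CPU on a page fault.
pub fn from_x86_error_code(code: u64) -> PageFaultInfo {
    PageFaultInfo {
        present: (code & (1 << 0)) != 0,
        write: (code & (1 << 1)) != 0,
        user: (code & (1 << 2)) != 0,
    }
}

// Exception class values for aborts (Arm architecture).
const EC_INSN_ABORT_LOWER_EL: u64 = 0x20;
const EC_DATA_ABORT_LOWER_EL: u64 = 0x24;
const EC_DATA_ABORT_SAME_EL: u64 = 0x25;

/// aarch64: decode ESR_EL1 for a data or instruction abort.
pub fn from_esr_el1(esr: u64) -> PageFaultInfo {
    let ec = (esr >> 26) & 0x3f; // EC field, bits [31:26]
    let fsc = esr & 0x3f; // DFSC/IFSC, ISS bits [5:0]
    let is_data_abort = ec == EC_DATA_ABORT_LOWER_EL || ec == EC_DATA_ABORT_SAME_EL;
    PageFaultInfo {
        // Permission faults are 0b0011xx (assumed to be the only "present" case).
        present: (fsc & 0x3c) == 0x0c,
        // WnR (bit 6) is only meaningful for data aborts.
        write: is_data_abort && (esr & (1 << 6)) != 0,
        // Aborts taken from a lower exception level originate in user mode.
        user: ec == EC_INSN_ABORT_LOWER_EL || ec == EC_DATA_ABORT_LOWER_EL,
    }
}
```

Both functions return the same struct, so the generic fault handler does not need to know which architecture produced the fault.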
IPC Error Code Encoding
PageFaultInfo is encoded into a canonical error code for the VMFault IPC message:
- Bit 0: Present (1 = permission fault, 0 = translation/not-present)
- Bit 1: Write (1 = write access)
- Bit 2: User (1 = user mode)
- Bit 4: I/D (1 = instruction fetch)
Both architectures produce this same encoding, so mmsrv handles faults uniformly.
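The bit layout above can be sketched as a pair of encode/decode helpers. The constant and function names are hypothetical, and passing the instruction-fetch flag as a separate argument is an assumption, since PageFaultInfo as shown carries no such field.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PageFaultInfo {
    pub present: bool,
    pub write: bool,
    pub user: bool,
}

pub const ERR_PRESENT: u64 = 1 << 0;
pub const ERR_WRITE: u64 = 1 << 1;
pub const ERR_USER: u64 = 1 << 2;
pub const ERR_INSTRUCTION: u64 = 1 << 4; // bit 3 is unused in this encoding

/// Pack the fault description into the canonical VMFault error code.
pub fn encode_error_code(info: PageFaultInfo, instruction_fetch: bool) -> u64 {
    let mut code = 0;
    if info.present {
        code |= ERR_PRESENT;
    }
    if info.write {
        code |= ERR_WRITE;
    }
    if info.user {
        code |= ERR_USER;
    }
    if instruction_fetch {
        code |= ERR_INSTRUCTION;
    }
    code
}

/// Inverse of `encode_error_code`, as a handler such as mmsrv might use it.
pub fn decode_error_code(code: u64) -> (PageFaultInfo, bool) {
    let info = PageFaultInfo {
        present: (code & ERR_PRESENT) != 0,
        write: (code & ERR_WRITE) != 0,
        user: (code & ERR_USER) != 0,
    };
    (info, (code & ERR_INSTRUCTION) != 0)
}
```

For example, a user-mode write to a present page encodes to 0b00111, and a user-mode instruction fetch from an unmapped page encodes to 0b10100.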
Kernel Fast-Path Faults
Fast-path faults are resolved entirely in the kernel without context switching to mmsrv. This is critical for performance — COW faults during fork and demand faults during heap growth are extremely frequent.
COW Write Fault
Triggered when a thread writes to a page marked copy-on-write.
- Conditions
  - PTE is present (ENTRY_PRESENT set).
  - PTE is not writable (ENTRY_WRITABLE clear).
  - PTE has the COW marker (ENTRY_COW set, bit 9).
  - The VmArea permits writes.
- Resolution
  - Identify the backing MemoryObject from the VmArea.
  - Allocate a new frame tagged KernelPrivate{General} from the PMM.
  - Copy 4,096 bytes from the original page to the new frame.
  - Call cow_install_atomic:
    - Reserve the radix slot with PHYS_TAG_BUSY under MO.commit_lock.
    - Write the new PTE with write permission, issue TLB invalidation, and call pmm_retain_mapping.
    - Finalize the radix slot: replace PHYS_TAG_BUSY with the new physical address.
    - Call pmm_set_owner(MoData) as the commit point.
  - On failure: the frame is freed to the PMM; the PTE is never modified. A RaceLost return from cow_install_atomic means another CPU already resolved the fault; translate it to Ok(true) and resume the thread.
  - Resume the faulting thread.
There is no lock contention with mmsrv. commit_lock is now held during all radix reads on the COW chain, including reads of self.cow_parent, not just during writes. self.cow_parent must therefore be read while holding self.commit_lock.
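The ordering inside cow_install_atomic can be sketched with a toy model. Everything here is a simplification under stated assumptions: the radix tree is a flat `Vec` of slots guarded by a `Mutex` standing in for commit_lock, the `Slot`, `Pte`, and `InstallError` types are hypothetical, the lock is assumed to be released between the reserve and finalize steps, and TLB invalidation, pmm_retain_mapping, and pmm_set_owner are represented only by comments.

```rust
use std::sync::Mutex;

/// Hypothetical radix slot states; Busy stands in for PHYS_TAG_BUSY.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Slot {
    Empty,
    Busy,
    Phys(u64),
}

#[derive(Debug, PartialEq, Eq)]
pub enum InstallError {
    RaceLost,
}

/// Hypothetical, reduced PTE.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Pte {
    pub phys: u64,
    pub writable: bool,
}

/// Toy memory object: the mutex plays the role of commit_lock.
pub struct MemoryObject {
    commit_lock: Mutex<Vec<Slot>>,
}

impl MemoryObject {
    pub fn new(pages: usize) -> Self {
        MemoryObject { commit_lock: Mutex::new(vec![Slot::Empty; pages]) }
    }

    pub fn slot(&self, index: usize) -> Slot {
        self.commit_lock.lock().unwrap()[index]
    }
}

pub fn cow_install_atomic(
    mo: &MemoryObject,
    index: usize,
    pte: &mut Pte,
    new_phys: u64,
) -> Result<(), InstallError> {
    // Step 1: reserve the slot under commit_lock. A slot that is already
    // busy or populated means another CPU got there first.
    {
        let mut slots = mo.commit_lock.lock().unwrap();
        if slots[index] != Slot::Empty {
            return Err(InstallError::RaceLost); // PTE untouched
        }
        slots[index] = Slot::Busy;
    }
    // Step 2: write the PTE with write permission.
    // (TLB invalidation and pmm_retain_mapping would happen here.)
    pte.phys = new_phys;
    pte.writable = true;
    // Step 3: finalize the slot with the real physical address.
    {
        let mut slots = mo.commit_lock.lock().unwrap();
        slots[index] = Slot::Phys(new_phys);
    }
    // Step 4: pmm_set_owner(MoData) would be the commit point (not modeled).
    Ok(())
}
```

In this model a second install attempt on the same page index returns `RaceLost` before the PTE is touched, which matches the rule that a failed install never modifies the PTE.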
Demand Page Fault
Triggered when a thread accesses a page that has been mapped but not yet backed by a physical frame.
- Conditions
  - PTE is not present (ENTRY_PRESENT clear).
  - PTE has the demand marker (ENTRY_DEMAND set, bit 10).
- Resolution
  - handle_demand_fault() looks up the VmArea and its backing MemoryObject.
  - If the MO is anonymous, a zero-filled frame is allocated from the fault-path source (untyped primary, PMM fallback) and inserted into the MO's radix tree.
  - If the MO is pager-backed, the page is committed at the faulted page index using the MO's normal commit path.
  - The PTE is rewritten with the new physical address, ENTRY_PRESENT = 1, ENTRY_DEMAND = 0, and permissions drawn from the VmArea.
  - VSpace.vm_demand_pages is decremented and the TLB entry is invalidated before resuming the thread.
Demand markers are installed by VSPACE_MAP_DEMAND (0x59), VSPACE_MAP_DEMAND_RANGE (0x5A), and by VSPACE_MAP_MO when the caller passes VSPACE_FLAG_DEMAND for MO pages that are not yet committed at map time.
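The marker-bit conditions for the COW and demand fast paths, and the demand PTE rewrite, can be sketched as follows. ENTRY_COW (bit 9) and ENTRY_DEMAND (bit 10) come from the text; the positions of ENTRY_PRESENT and ENTRY_WRITABLE, the address mask, and the `FastPath`, `classify`, and `resolve_demand_pte` names are assumptions modeled on an x86_64-style PTE.

```rust
pub const ENTRY_PRESENT: u64 = 1 << 0; // assumed bit position
pub const ENTRY_WRITABLE: u64 = 1 << 1; // assumed bit position
pub const ENTRY_COW: u64 = 1 << 9; // from the text
pub const ENTRY_DEMAND: u64 = 1 << 10; // from the text
const ADDR_MASK: u64 = 0x000f_ffff_ffff_f000; // assumed x86_64-style frame bits

#[derive(Debug, PartialEq, Eq)]
pub enum FastPath {
    CowWrite,
    Demand,
    /// Not recognizable from PTE bits alone (stack growth or slow path).
    Other,
}

/// Classify a fault from the PTE contents and the access type.
pub fn classify(pte: u64, is_write: bool, area_writable: bool) -> FastPath {
    let present = (pte & ENTRY_PRESENT) != 0;
    let writable = (pte & ENTRY_WRITABLE) != 0;
    let cow = (pte & ENTRY_COW) != 0;
    let demand = (pte & ENTRY_DEMAND) != 0;
    if present && !writable && cow && is_write && area_writable {
        FastPath::CowWrite
    } else if !present && demand {
        FastPath::Demand
    } else {
        FastPath::Other
    }
}

/// Build the PTE installed after a demand fault: new frame address,
/// ENTRY_PRESENT set, ENTRY_DEMAND clear, permissions from the VmArea.
pub fn resolve_demand_pte(phys: u64, area_perms: u64) -> u64 {
    ((phys & ADDR_MASK) | area_perms | ENTRY_PRESENT) & !ENTRY_DEMAND
}
```

A PTE holding only the demand marker classifies as `Demand`; after resolution the marker is gone and the entry is present, so a retry of the same access no longer faults.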
Stack Growth
Triggered when a thread accesses an address just below the current stack region.
- Conditions
  - Fault address is within one page below the bottom of an existing stack VmArea.
  - The VmArea is identified as a stack region.
- Resolution
  - Extend the stack VmArea downward by one page.
  - Commit a frame for the new page.
  - Install the PTE.
  - Resume the faulting thread.
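The stack-growth check can be sketched as a small range test. The reduced `VmArea` (with an exclusive `end` and an `is_stack` flag) and the `try_grow_stack` function are hypothetical; frame commit and PTE installation are left out.

```rust
pub const PAGE_SIZE: u64 = 4096;

/// Hypothetical, reduced VmArea covering [start, end).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct VmArea {
    pub start: u64,
    pub end: u64,
    pub is_stack: bool,
}

/// Extend the stack area down by one page if the fault address lies in the
/// page directly below its current bottom. Returns true if the area grew.
pub fn try_grow_stack(area: &mut VmArea, fault_addr: u64) -> bool {
    if !area.is_stack {
        return false;
    }
    // Guard against underflow for an area that already starts at page 0.
    let new_start = match area.start.checked_sub(PAGE_SIZE) {
        Some(addr) => addr,
        None => return false,
    };
    if fault_addr >= new_start && fault_addr < area.start {
        area.start = new_start;
        // The kernel would now commit a frame and install the PTE.
        true
    } else {
        false
    }
}
```

An access two or more pages below the stack bottom is rejected by this check and would fall through to the slow path.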
Pooled COW Fast-Path (handle_cow_fault_pooled)
When cow_pool_phys != 0 (the VSpace has a non-empty CowPool), the fault handler uses the pooled path instead of allocating from the PMM:
- Pop a frame from the CowPool ring buffer (head index). The frame is already tagged KernelPrivate{CowPool}.
- Call cow_install_atomic with the pool frame.
- On success: the frame transitions to MoData at the cow_install_atomic commit point and the PTE is live.
- On RaceLost: free the pool entry back to the PMM with pmm_free(KernelPrivate{CowPool}), then return Ok(false) so the fault escalates to mmsrv for pool replenishment.
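The pool's ring-buffer behavior can be sketched as below. The capacity, field names, and the `push`/`pop` methods are hypothetical; the text only states that frames are popped at the head index.

```rust
pub const POOL_CAPACITY: usize = 8; // hypothetical capacity

/// Toy CowPool: a fixed-size ring of pre-allocated frame addresses.
pub struct CowPool {
    frames: [u64; POOL_CAPACITY],
    head: usize,
    count: usize,
}

impl CowPool {
    pub fn new() -> Self {
        CowPool { frames: [0; POOL_CAPACITY], head: 0, count: 0 }
    }

    pub fn is_empty(&self) -> bool {
        self.count == 0
    }

    /// Replenish: append a frame at the tail. Returns false when full.
    pub fn push(&mut self, phys: u64) -> bool {
        if self.count == POOL_CAPACITY {
            return false;
        }
        let tail = (self.head + self.count) % POOL_CAPACITY;
        self.frames[tail] = phys;
        self.count += 1;
        true
    }

    /// Fault path: take the frame at the head index, if any.
    pub fn pop(&mut self) -> Option<u64> {
        if self.count == 0 {
            return None;
        }
        let phys = self.frames[self.head];
        self.head = (self.head + 1) % POOL_CAPACITY;
        self.count -= 1;
        Some(phys)
    }
}
```

Popping is constant-time and allocation-free in this model, which is the property that makes a pre-filled pool attractive on the fault path.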
RaceLost Behavior Contract
The asymmetry between the two COW fault paths is load-bearing for correctness:
| Path | RaceLost Return | Meaning |
|---|---|---|
| handle_cow_fault | Ok(true) | Fault is resolved: another CPU installed the page. Resume the faulting thread immediately. |
| handle_cow_fault_pooled | Ok(false) | Pool entry was wasted; escalate to mmsrv so it can replenish the pool before the next fault. |
In both cases the frame is freed and the PTE is not modified. The difference is whether the fault handler considers the fault resolved (Ok(true)) or requests mmsrv intervention (Ok(false)). Using Ok(true) in the pooled path would silently drain the pool without triggering replenishment, eventually leaving the VSpace with no pool frames.
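The contract can be sketched as two small translation functions. The `InstallOutcome` enum and both function names are hypothetical; only the Ok(true)/Ok(false) meanings come from the text, and freeing the frame is represented by comments.

```rust
/// Stand-in for the kernel's error type.
#[derive(Debug, PartialEq, Eq)]
pub enum VSpaceError {
    NotMapped,
}

/// Hypothetical summary of what cow_install_atomic reported.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InstallOutcome {
    Installed,
    RaceLost,
}

/// Non-pooled path: a lost race still means the page is now mapped.
pub fn cow_fault_result(outcome: InstallOutcome) -> Result<bool, VSpaceError> {
    match outcome {
        InstallOutcome::Installed => Ok(true),
        // The unused frame goes back to the PMM; the fault is resolved.
        InstallOutcome::RaceLost => Ok(true),
    }
}

/// Pooled path: a lost race consumed a pool entry, so ask mmsrv to refill.
pub fn pooled_cow_fault_result(outcome: InstallOutcome) -> Result<bool, VSpaceError> {
    match outcome {
        InstallOutcome::Installed => Ok(true),
        // The pool frame is freed to the PMM; Ok(false) escalates to mmsrv.
        InstallOutcome::RaceLost => Ok(false),
    }
}
```

Keeping the two translations in separate functions makes the asymmetry explicit and harder to "fix" by accident.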
Slow-Path Faults (mmsrv IPC)
If the fault does not match any fast-path pattern, the kernel sends a VM fault message to the faulting thread’s fault endpoint. The memory manager server (mmsrv) receives the message, determines the appropriate action (e.g., commit pages, extend mappings, report a segfault), and replies to resume the thread.
Fault Message Format
The fault is delivered as a standard IPC message on the thread’s fault endpoint:
| Field | Value | Content |
|---|---|---|
| | | Identifies this as a VM fault. |
| | | Four message registers. |
| | fault address | The virtual address that caused the fault. |
| | error code | Encoded PageFaultInfo (see IPC Error Code Encoding). |
| | instruction pointer | The faulting instruction address. |
| | is_instruction_fault | |
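A handler-side view of the four message registers might look like the sketch below. The `VmFault` struct and its methods are hypothetical, the register order is assumed to follow the table rows, and encoding is_instruction_fault as 0 or 1 is also an assumption.

```rust
/// Hypothetical decoded form of the VM fault message payload.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct VmFault {
    pub fault_addr: u64,
    pub error_code: u64,
    pub instruction_pointer: u64,
    pub is_instruction_fault: bool,
}

impl VmFault {
    /// Pack into four message registers (order assumed from the table).
    pub fn to_regs(self) -> [u64; 4] {
        [
            self.fault_addr,
            self.error_code,
            self.instruction_pointer,
            self.is_instruction_fault as u64,
        ]
    }

    /// Unpack on the receiving side; any nonzero value counts as true.
    pub fn from_regs(regs: [u64; 4]) -> Self {
        VmFault {
            fault_addr: regs[0],
            error_code: regs[1],
            instruction_pointer: regs[2],
            is_instruction_fault: regs[3] != 0,
        }
    }
}
```

Decoding into a named struct at the receive site keeps register indices out of the rest of the handler.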
Fault Handler Flow
- The faulting thread enters BlockedReason::FaultBlocked with the fault message.
- The kernel sends the message to the fault endpoint.
- mmsrv (or another fault handler) receives the message via Recv.
- The handler examines the fault address and error code, then performs the necessary memory operations (e.g., MO_COMMIT, VSPACE_MAP_MO).
- The handler replies to the fault message.
- The kernel receives the reply and resumes the faulting thread.
Reply-to-Resume
Replying to a fault message resumes the faulted thread at the faulting instruction, so the thread re-executes the instruction that caused the fault. If the handler has fixed the fault (committed the page, installed the mapping), the instruction succeeds. If the fault cannot be fixed (e.g., a genuine segfault), the handler can kill the thread instead of replying.
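The reply-or-kill decision can be sketched as a pure function over a handler's view of the address space. This is not mmsrv's actual policy: the `Region` type, the `FaultAction` enum, and `decide` are hypothetical, and the only input taken from the text is bit 1 (write) of the canonical error code.

```rust
pub const ERR_WRITE: u64 = 1 << 1; // bit 1 of the canonical error code

/// Hypothetical record of a region the handler knows about, [start, end).
#[derive(Debug, Clone, Copy)]
pub struct Region {
    pub start: u64,
    pub end: u64,
    pub writable: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum FaultAction {
    /// Fix the mapping, then reply so the thread retries the instruction.
    ReplyAndResume,
    /// Unfixable fault: do not reply, terminate the thread instead.
    KillThread,
}

pub fn decide(regions: &[Region], fault_addr: u64, error_code: u64) -> FaultAction {
    let is_write = (error_code & ERR_WRITE) != 0;
    let hit = regions
        .iter()
        .find(|r| r.start <= fault_addr && fault_addr < r.end);
    match hit {
        // Known region and the access is permitted: the handler would
        // commit or map the page (not modeled) and then reply.
        Some(r) if r.writable || !is_write => FaultAction::ReplyAndResume,
        // Unknown address or a write to a read-only region.
        _ => FaultAction::KillThread,
    }
}
```

Because a reply always retries the faulting instruction, replying without fixing the mapping would simply fault again; that is why the unfixable case ends in `KillThread` rather than a reply.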
Other Fault Types
Performance Considerations
The fast-path / slow-path split is critical to microkernel performance:
- COW fast-path: A fork followed by exec in a typical shell invocation can trigger hundreds of COW faults. If each fault required an IPC round-trip to mmsrv (send fault → mmsrv recv → mmsrv commit → mmsrv reply → thread resume), the overhead would dominate fork time. The kernel fast-path resolves each fault in a single trap without leaving the kernel.
- Demand fast-path: Heap growth via mmap + first-touch triggers demand faults. The kernel fast-path avoids the IPC round-trip for each page.
- Slow-path: Reserved for cases that require policy decisions: mapping new regions, handling unmapped addresses, extending non-stack regions, and reporting segfaults. mmsrv has full context about the process's memory layout and can make informed decisions.
Related Pages
- Virtual Address Spaces: PTE flags, VmArea, MAP_MO
- Memory Objects: COW clone, commit/decommit, radix tree
- Physical Memory: frame allocation for fault resolution
- Endpoints: fault message delivery via fault endpoint