Endpoints

An endpoint is a synchronous IPC channel. Sender and receiver rendezvous at an endpoint: if no partner is waiting, the caller blocks until one arrives. Endpoints carry messages consisting of a label, up to 32 message registers, and optionally up to 4 capabilities.

Endpoint Structure

#[repr(C)]
pub struct Endpoint {
    pub header: KernelObject,
    lock: AtomicU8,               // per-endpoint spinlock
    state: EndpointState,
    send_queue: WaitQueue,        // blocked senders
    recv_queue: RecvWaitQueue,    // blocked receivers
}

Each endpoint has its own spinlock. Acquiring the lock saves the local CPU’s IRQ flag and disables interrupts; releasing it restores the saved flag. This prevents a same-CPU deadlock in which a timer tick handler (which calls check_wakeups) spins on an endpoint lock that an interrupted syscall on the same CPU already holds.

Lock ordering: CAP_LOCK → endpoint.lock → sched.lock_cpu. Context switches and reschedules always occur outside the endpoint lock.

Endpoint State

Diagram

Message Format

Message Structure

#[repr(C)]
pub struct Message {
    pub label: u64,        // operation identifier (msg_info bits 51:12)
    pub length: usize,     // valid message registers, 0-127 (msg_info bits 6:0)
    pub extra_caps: usize, // capabilities to transfer, 0-4 (msg_info bits 11:7)
    pub regs: [u64; 32],   // message registers MR0-MR31
    pub caps: [u64; 4],    // sender CNode slot indices for cap transfer
}

Message Info Word

The msg_info word is packed into a single register:

Bits  6:0  — length     (0-127 message registers)
Bits 11:7  — extra_caps (0-4 capabilities)
Bits 51:12 — label      (40-bit operation identifier)
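
The bit layout above can be sketched as a pair of pack/unpack helpers. This is a minimal illustration: the function names pack_msg_info and unpack_msg_info and the constant names are hypothetical, and masking out-of-range inputs (rather than rejecting them) is an assumption, not documented kernel behavior.

```rust
// Field widths and positions taken from the bit layout above.
const LENGTH_BITS: u32 = 7; // bits 6:0
const CAPS_BITS: u32 = 5; // bits 11:7
const LABEL_BITS: u32 = 40; // bits 51:12

const CAPS_SHIFT: u32 = LENGTH_BITS; // 7
const LABEL_SHIFT: u32 = LENGTH_BITS + CAPS_BITS; // 12

const LENGTH_MASK: u64 = (1 << LENGTH_BITS) - 1;
const CAPS_MASK: u64 = (1 << CAPS_BITS) - 1;
const LABEL_MASK: u64 = (1 << LABEL_BITS) - 1;

/// Hypothetical helper: pack the three fields into one msg_info word.
/// Assumption: out-of-range inputs are masked, not rejected.
pub fn pack_msg_info(label: u64, extra_caps: u64, length: u64) -> u64 {
    ((label & LABEL_MASK) << LABEL_SHIFT)
        | ((extra_caps & CAPS_MASK) << CAPS_SHIFT)
        | (length & LENGTH_MASK)
}

/// Hypothetical helper: returns (label, extra_caps, length).
pub fn unpack_msg_info(info: u64) -> (u64, u64, u64) {
    (
        (info >> LABEL_SHIFT) & LABEL_MASK,
        (info >> CAPS_SHIFT) & CAPS_MASK,
        info & LENGTH_MASK,
    )
}
```

Packing and then unpacking returns the original fields as long as each one fits its width. Bits 63:52 are left zero by this sketch.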

MR0-MR3 are passed in CPU registers (fastpath optimization). MR4-MR31 overflow into the IPC buffer mapped in the thread’s VSpace.

IPC Buffer

The IPC buffer is a 4,096-byte (one page) structure mapped into each thread’s VSpace:

#[repr(C)]
pub struct IpcBuffer {
    pub msg: [u64; 34],         // trona_msg overlay: [label, length, regs[0..31]]
    pub badge: u64,             // sender badge
    pub caps: [u64; 4],         // cap slot indices for transfer
    pub receive_cnode: u64,     // receiver's CNode for incoming caps
    pub receive_index: u64,     // starting slot in receive CNode
    pub receive_depth: u64,     // CNode depth for receive
    pub reserved: [u64; 466],   // extended payload area (3,728 bytes)
}

The reserved area is used for extended payloads (e.g., exec binary data passed from procmgr).
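
As a quick layout check, the sketch below restates the struct exactly as listed and measures it. With only the listed fields the struct occupies 4,064 bytes, which fits in the 4,096-byte page with 32 bytes to spare; whether the real definition carries additional fields or padding is not stated here. The helper names reserved_bytes and header_bytes are hypothetical.

```rust
use std::mem::size_of;

// Field list copied from the IpcBuffer listing above.
#[repr(C)]
pub struct IpcBuffer {
    pub msg: [u64; 34],
    pub badge: u64,
    pub caps: [u64; 4],
    pub receive_cnode: u64,
    pub receive_index: u64,
    pub receive_depth: u64,
    pub reserved: [u64; 466],
}

pub const PAGE_SIZE: usize = 4096;

// Compile-time check: the buffer must fit in the page it is mapped into.
const _: () = assert!(size_of::<IpcBuffer>() <= PAGE_SIZE);

/// Size of the extended payload area in bytes.
pub fn reserved_bytes() -> usize {
    size_of::<[u64; 466]>()
}

/// Size of everything that precedes the reserved area, in bytes.
pub fn header_bytes() -> usize {
    size_of::<IpcBuffer>() - reserved_bytes()
}
```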

Operations

Send (syscall 0)

Send a message to an endpoint. Blocks if no receiver is waiting.

  1. Acquire endpoint.lock.

  2. If a receiver is waiting (RecvBlocked): dequeue the receiver, transfer the message and badge, wake the receiver.

  3. If no receiver: enqueue the sender in the send queue, set state to SendBlocked, release lock, reschedule.

The sender’s rights on the endpoint capability must include SEND.
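
The Send steps can be sketched with a toy, single-threaded model. Everything here is illustrative: ToyEndpoint, Pending, and SendOutcome are hypothetical names, the spinlock and scheduler are omitted, and checking the SEND right before touching the queues is an assumption about ordering.

```rust
use std::collections::VecDeque;

pub type ThreadId = u32;

#[derive(Debug, PartialEq)]
pub enum SendOutcome {
    /// A receiver was waiting; it received the message and was woken.
    Delivered { receiver: ThreadId },
    /// No receiver was waiting; the sender is queued and must block.
    Blocked,
    /// The endpoint capability did not include the SEND right.
    MissingSendRight,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Pending {
    pub sender: ThreadId,
    pub msg: u64,
    pub badge: u64,
}

#[derive(Default)]
pub struct ToyEndpoint {
    pub send_queue: VecDeque<Pending>,   // blocked senders
    pub recv_queue: VecDeque<ThreadId>,  // blocked receivers
    pub delivered: Vec<(ThreadId, Pending)>, // log of completed transfers
}

impl ToyEndpoint {
    pub fn send(&mut self, sender: ThreadId, has_send_right: bool, msg: u64, badge: u64) -> SendOutcome {
        if !has_send_right {
            return SendOutcome::MissingSendRight;
        }
        // Step 1 (acquire endpoint.lock) would happen here in a real kernel.
        let pending = Pending { sender, msg, badge };
        match self.recv_queue.pop_front() {
            // Step 2: a receiver is waiting, so hand over message and badge.
            Some(receiver) => {
                self.delivered.push((receiver, pending));
                SendOutcome::Delivered { receiver }
            }
            // Step 3: nobody is waiting, so queue the sender.
            None => {
                self.send_queue.push_back(pending);
                SendOutcome::Blocked
            }
        }
    }
}
```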

Recv (syscall 1)

Receive a message from an endpoint. Blocks if no sender is waiting.

  1. If the thread has a bound notification with pending bits, consume the bits immediately and return (notification pre-check).

  2. Acquire endpoint.lock.

  3. If a sender is waiting (SendBlocked): dequeue the sender, transfer the message and badge, wake the sender.

  4. If no sender: cache the receiver’s IPC buffer receive-slot configuration into the TCB, enqueue in the recv queue, set state to RecvBlocked, release lock, reschedule.

The receive-slot configuration (receive_cnode, receive_index, receive_depth) is cached from the IPC buffer while the thread is current (its VSpace is active), because the receiver’s VSpace may not be active when a sender eventually arrives.

The receiver’s rights must include RECV.
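
The notification pre-check and the receive-slot caching (steps 1 and 4) can be sketched as follows. This is a toy model under stated assumptions: ToyTcb, ToyIpcBuffer, ReceiveSlots, and recv_no_sender are hypothetical names, and the sender-waiting path, locking, and scheduling are left out.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ReceiveSlots {
    pub cnode: u64,
    pub index: u64,
    pub depth: u64,
}

/// Only the three receive-slot fields of the IPC buffer are modeled.
pub struct ToyIpcBuffer {
    pub receive_cnode: u64,
    pub receive_index: u64,
    pub receive_depth: u64,
}

#[derive(Default)]
pub struct ToyTcb {
    pub pending_notification: u64,          // bits of the bound notification
    pub cached_slots: Option<ReceiveSlots>, // snapshot taken before blocking
    pub recv_blocked: bool,
}

#[derive(Debug, PartialEq)]
pub enum RecvOutcome {
    Notification(u64),
    Blocked,
}

/// Recv when no sender is waiting on the endpoint.
pub fn recv_no_sender(tcb: &mut ToyTcb, buf: &ToyIpcBuffer) -> RecvOutcome {
    // Step 1: pending notification bits are consumed and returned immediately.
    if tcb.pending_notification != 0 {
        return RecvOutcome::Notification(std::mem::take(&mut tcb.pending_notification));
    }
    // Step 4: snapshot the receive-slot configuration while the buffer is
    // still reachable, then mark the thread as blocked.
    tcb.cached_slots = Some(ReceiveSlots {
        cnode: buf.receive_cnode,
        index: buf.receive_index,
        depth: buf.receive_depth,
    });
    tcb.recv_blocked = true;
    RecvOutcome::Blocked
}
```

Because the snapshot lives in the TCB, later changes to the IPC buffer do not affect a receiver that is already blocked.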

Call (syscall 2)

Atomic send-then-receive. The primary client RPC pattern.

  1. Send a message to the endpoint (same as Send, but with CallSendBlocked reason).

  2. When a receiver picks up the message, the caller transitions to ReplyWait (blocked waiting for the reply, not in any endpoint queue).

  3. The kernel saves a reply capability referencing the caller’s TCB in the server’s reply_tcb field.

  4. The server processes the request and invokes ReplyRecv to reply and wait for the next client.

Diagram

During the Call send phase, the kernel performs priority inheritance: if the caller has an earlier deadline than the server, the server’s priority is temporarily boosted (see PIP).

The caller’s rights must include CALL.

ReplyRecv (syscall 3)

Reply to a previous caller and wait for the next message. The server-side counterpart to Call.

  1. Use reply_tcb to send the reply message to the original caller.

  2. Revert any PIP donation from the caller.

  3. Enter Recv on the endpoint, waiting for the next request.

The reply is one-shot: reply_tcb is cleared after sending.
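
The one-shot behavior of reply_tcb and the PIP revert can be illustrated with a small model. The names ToyServer, on_call_received, and reply are hypothetical. The numeric priority (higher means more urgent) stands in for the deadline comparison described above, and returning None when no reply is pending is an assumption.

```rust
pub type ThreadId = u32;

pub struct ToyServer {
    pub reply_tcb: Option<ThreadId>, // caller awaiting a reply, if any
    pub base_priority: u8,
    pub effective_priority: u8,      // base priority plus any donation
}

impl ToyServer {
    /// Models the point where the server picks up a Call message:
    /// record the caller and apply a priority donation if it is more urgent.
    pub fn on_call_received(&mut self, caller: ThreadId, caller_priority: u8) {
        self.reply_tcb = Some(caller);
        self.effective_priority = self.effective_priority.max(caller_priority);
    }

    /// Models the reply half of ReplyRecv. Returns the thread to resume.
    pub fn reply(&mut self) -> Option<ThreadId> {
        // take() clears the field, so a second reply finds nothing to answer.
        let caller = self.reply_tcb.take()?;
        // Revert any donation made on behalf of this caller.
        self.effective_priority = self.base_priority;
        Some(caller)
    }
}
```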

NBSend (syscall 4) — DEPRECATED

Syscall slot 4 is retired. The kernel returns InvalidArgument for any NBSend invocation. Console hot paths that previously used NBSend have migrated to shared-memory rings. Do not reuse slot 4.

Capability Transfer

Capabilities can be transferred through IPC alongside the message. The transfer is atomic: either all capabilities are transferred, or none are (with rollback on failure).

Sender Setup

The sender sets extra_caps in the message info (1-4) and places source CNode slot indices in the IPC buffer’s caps[0..3] array.

Receiver Setup

Before calling Recv, the receiver configures three fields in its IPC buffer:

  • receive_cnode: capability address of the CNode to receive into.

  • receive_index: starting slot index within that CNode.

  • receive_depth: CNode address resolution depth (0 for flat lookup).

These values are cached into the TCB when the thread enters RecvBlocked.

Transfer Algorithm

The transfer runs outside the endpoint lock but under CAP_LOCK:

  1. Validation pass (for each of extra_caps capabilities):

    1. Resolve the sender’s source slot in the sender’s CSpace.

    2. Verify the source capability has GRANT right.

    3. Resolve the receiver’s destination slot via the cached receive configuration.

    4. Verify the destination slot is empty.

    5. Record both source and destination for the copy pass.

  2. Copy pass (for each validated capability):

    1. Call CNode::copy_slot() on the receiver’s CNode, copying from the sender’s CSpace.

    2. This allocates a fresh global slot, copies the capability, increments the object refcount, and inserts into the CDT.

    3. On failure: roll back all previously copied capabilities by calling delete() on each.

  3. Extended payload: if the message label matches PM_EXEC_LABEL, the kernel also copies the reserved area from the sender’s IPC buffer to the receiver’s (used for binary data during exec).
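
The validate-then-copy structure with rollback can be sketched on plain slices. This is a simplified model, not the kernel code: Cap, TransferError, and transfer_caps are hypothetical names, destination slots are assumed to be consecutive starting at receive_index, and the fail_copy_at parameter exists only to simulate a copy-pass failure. CAP_LOCK, the CDT, and refcounts are not modeled.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Cap {
    pub object: u32,
    pub grant: bool, // whether the capability carries the GRANT right
}

#[derive(Debug, PartialEq)]
pub enum TransferError {
    InvalidSource,
    MissingGrant,
    InvalidDestination,
    SlotOccupied,
    CopyFailed,
}

pub fn transfer_caps(
    sender: &[Option<Cap>],
    sources: &[usize],
    receiver: &mut [Option<Cap>],
    receive_index: usize,
    fail_copy_at: Option<usize>, // test hook: force the n-th copy to fail
) -> Result<(), TransferError> {
    // Validation pass: nothing is modified until every cap has been checked.
    let mut plan = Vec::with_capacity(sources.len());
    for (i, &src) in sources.iter().enumerate() {
        let cap = sender
            .get(src)
            .copied()
            .flatten()
            .ok_or(TransferError::InvalidSource)?;
        if !cap.grant {
            return Err(TransferError::MissingGrant);
        }
        let dst = receive_index + i;
        match receiver.get(dst) {
            None => return Err(TransferError::InvalidDestination),
            Some(Some(_)) => return Err(TransferError::SlotOccupied),
            Some(None) => plan.push((cap, dst)),
        }
    }
    // Copy pass: on failure, undo every copy made so far.
    for (n, &(cap, dst)) in plan.iter().enumerate() {
        if fail_copy_at == Some(n) {
            for &(_, done) in &plan[..n] {
                receiver[done] = None;
            }
            return Err(TransferError::CopyFailed);
        }
        receiver[dst] = Some(cap);
    }
    Ok(())
}
```

Splitting validation from copying means most errors are caught before any slot changes, so rollback is only needed for failures that arise during the copy itself.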

Fault Delivery

When a thread faults (page fault, user exception), the kernel sends a fault message to the thread’s designated fault endpoint. The fault message uses the same Message format with a fault-specific label:

Fault Type       Label   Message Registers
VM Fault         2       MR0=fault address, MR1=error code, MR2=faulting IP, MR3=is_instruction_fault
User Exception   4       MR0=exception vector, MR1=error code, MR2=faulting IP, MR3=faulting SP
Cap Fault        1       Capability lookup failure details

The faulting thread blocks with BlockedReason::FaultBlocked. Replying to the fault message resumes the faulted thread (reply-to-resume).
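
As an illustration of the register assignments listed for VM faults and user exceptions, the sketch below fills a Message for each. The constant names and constructor functions are hypothetical, and setting length to 4 and extra_caps to 0 is an assumption based on the four registers listed; the Cap Fault payload is not detailed here, so it is not modeled.

```rust
// Labels from the fault table. Constant names are illustrative.
pub const FAULT_LABEL_CAP: u64 = 1;
pub const FAULT_LABEL_VM: u64 = 2;
pub const FAULT_LABEL_USER_EXCEPTION: u64 = 4;

// Same field list as the Message structure shown earlier.
#[repr(C)]
pub struct Message {
    pub label: u64,
    pub length: usize,
    pub extra_caps: usize,
    pub regs: [u64; 32],
    pub caps: [u64; 4],
}

/// Build a fault message carrying four message registers.
fn fault_message(label: u64, mrs: [u64; 4]) -> Message {
    let mut regs = [0u64; 32];
    regs[..4].copy_from_slice(&mrs);
    // Assumption: fault messages use MR0-MR3 only and transfer no caps.
    Message { label, length: 4, extra_caps: 0, regs, caps: [0; 4] }
}

pub fn vm_fault_message(addr: u64, error_code: u64, ip: u64, is_instruction_fault: bool) -> Message {
    fault_message(FAULT_LABEL_VM, [addr, error_code, ip, is_instruction_fault as u64])
}

pub fn user_exception_message(vector: u64, error_code: u64, ip: u64, sp: u64) -> Message {
    fault_message(FAULT_LABEL_USER_EXCEPTION, [vector, error_code, ip, sp])
}
```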

Timed IPC

SendTimed (syscall 21) and RecvTimed (syscall 22) accept a timeout in nanoseconds. If the operation does not complete within the timeout, the thread is woken from the endpoint queue and receives a timeout error.

The thread is inserted into both the endpoint wait queue and the sleep queue. Whichever fires first (partner arrival or timeout expiry) wakes the thread and removes it from the other queue.

Multi-Endpoint Receive

RecvAny (syscall 23) and ReplyRecvAny (syscall 24) allow a thread to wait on multiple endpoints simultaneously. The thread registers up to MAX_RECV_WAIT_ENDPOINTS (32) endpoints via RecvWaitLink structures embedded in the TCB.

On RecvAny:

  1. Acquire all registered endpoints' locks in address order (preventing deadlock).

  2. Check each endpoint for a ready sender.

  3. Check the bound notification for pending bits.

  4. If any source is ready, deliver the message, release all locks, and return with a badge identifying the source.

  5. If no source is ready: enqueue the thread in every registered endpoint’s recv queue, release all locks, and block.

When any endpoint receives a message or the bound notification fires:

  1. The thread’s recv_wait_selected is set to identify the source.

  2. The thread is dequeued from all other endpoints.

  3. The thread is woken.

Timed variants: RecvAnyTimed (syscall 25) and ReplyRecvAnyTimed (syscall 26) add sleep queue integration.
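
The address-ordered locking in step 1 of RecvAny can be sketched with std::sync::Mutex standing in for the endpoint spinlock. This is a user-space illustration only: recv_any is a hypothetical function, each mutex guards a single pending message instead of a send queue, the endpoints are assumed to be distinct, and scanning in registration order after locking is an assumption about which ready source wins.

```rust
use std::sync::{Mutex, MutexGuard};

/// Each "endpoint" is modeled as a mutex guarding one optional pending message.
/// Returns (index of the registered endpoint, message) for the first ready one.
/// Assumption: all endpoints in `eps` are distinct objects.
pub fn recv_any(eps: &[&Mutex<Option<u64>>]) -> Option<(usize, u64)> {
    // Step 1: take every lock in ascending address order, so two threads
    // with overlapping endpoint sets always lock in the same order.
    let mut order: Vec<usize> = (0..eps.len()).collect();
    order.sort_by_key(|&i| eps[i] as *const Mutex<Option<u64>> as usize);
    let mut guards: Vec<(usize, MutexGuard<'_, Option<u64>>)> = order
        .into_iter()
        .map(|i| (i, eps[i].lock().unwrap()))
        .collect();

    // Step 2: with all locks held, look for a ready sender.
    guards.sort_by_key(|(i, _)| *i);
    for (i, guard) in guards.iter_mut() {
        if let Some(msg) = guard.take() {
            // Step 4: returning drops the guards, releasing every lock.
            return Some((*i, msg));
        }
    }
    // Step 5 (enqueue on every endpoint and block) is not modeled.
    None
}
```

Sorting by address gives a single global order without a central registry, which is why it prevents deadlock between threads that register the same endpoints in different orders.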

Error Conditions and Edge Cases

Endpoint Destruction While Threads Are Blocked

When the last capability to an endpoint is deleted, Endpoint::cleanup() runs:

  • All threads in the send queue are woken with an error.

  • All threads in the recv queue are woken with an error.

Threads that were in CallSendBlocked or ReplyWait are also woken. PIP donations through the endpoint are reverted.

Capability Transfer Failure

If cap transfer fails partway through (e.g., the 2nd of 3 caps fails due to SlotOccupied):

  • All previously transferred caps in the current message are rolled back via delete().

  • The message is not delivered — the entire send fails.

  • The sender receives SyscallError::SlotOccupied (or the specific cap error).

RecvAny with All Endpoints Destroyed

If all registered endpoints are destroyed while the thread is blocked in RecvAny, the thread is woken as each endpoint runs cleanup. The thread sees recv_wait_selected = NONE and returns an error.

Timed IPC Timeout vs Partner Arrival Race

When a timeout and a partner arrival happen simultaneously:

  • The sleep queue and endpoint queue each attempt to wake the thread.

  • Whichever succeeds first (atomic state transition) wins.

  • The loser finds the thread already woken and does nothing (idempotent).

  • If the timeout wins, the sender’s message is not consumed — the sender remains blocked or is also timed out.
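
The "first atomic state transition wins" rule can be sketched with a compare-and-exchange on a single state byte. ToyWaiter, try_wake, and the state constants are hypothetical; the real kernel's thread state encoding and memory orderings are not specified here.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

pub const BLOCKED: u8 = 0;
pub const WOKEN_BY_PARTNER: u8 = 1;
pub const WOKEN_BY_TIMEOUT: u8 = 2;

pub struct ToyWaiter {
    state: AtomicU8,
}

impl ToyWaiter {
    pub const fn new() -> Self {
        ToyWaiter { state: AtomicU8::new(BLOCKED) }
    }

    /// Try to wake the thread for `reason`. Returns true only for the single
    /// caller that moves the state away from BLOCKED; every later caller sees
    /// a non-BLOCKED state and leaves it untouched.
    pub fn try_wake(&self, reason: u8) -> bool {
        self.state
            .compare_exchange(BLOCKED, reason, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    pub fn state(&self) -> u8 {
        self.state.load(Ordering::Acquire)
    }
}
```

The winner is then responsible for removing the thread from the other queue, while the loser can simply return.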

Reply to a Dead Thread

If a server holds a reply capability to a thread that has been destroyed (TCB deleted), the reply succeeds silently — the message is discarded because the TCB’s refcount reached zero and cleanup already ran.

See Also

  • Notifications — asynchronous signaling and bound notifications

  • IPC Fastpath — assembly-optimized fast paths for Call and ReplyRecv

  • Capabilities — capability transfer mechanics

  • Design Patterns — server loop and client RPC patterns

  • Syscall ABI — register conventions and syscall numbers