Endpoints
An endpoint is a synchronous IPC channel. Sender and receiver rendezvous at an endpoint: if no partner is waiting, the caller blocks until one arrives. Endpoints carry messages consisting of a label, up to 32 message registers, and optionally up to 4 capabilities.
Endpoint Structure
    #[repr(C)]
    pub struct Endpoint {
        pub header: KernelObject,
        lock: AtomicU8,             // per-endpoint spinlock
        state: EndpointState,
        send_queue: WaitQueue,      // blocked senders
        recv_queue: RecvWaitQueue,  // blocked receivers
    }
Each endpoint has its own spinlock. Acquiring it saves and disables the local CPU’s IRQ flag, and releasing it restores the saved flag.
This prevents a same-CPU deadlock in which a timer tick handler (which calls check_wakeups) needs the endpoint lock while a syscall on the same CPU already holds it.
Lock ordering: CAP_LOCK → endpoint.lock → sched.lock_cpu.
Context switches and reschedules always occur outside the endpoint lock.
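The IRQ save/restore discipline can be sketched in user-space Rust as follows. This is a minimal illustration, not the kernel's implementation: IrqSpinLock, IrqGuard, irq_save_and_disable, irq_restore, and the IRQ_ENABLED flag are hypothetical names, and the interrupt flag is simulated with an atomic boolean.

```rust
use std::sync::atomic::{AtomicBool, AtomicU8, Ordering};

// Stand-in for the CPU interrupt flag (assumption: a real kernel would
// read and write the hardware flag instead).
static IRQ_ENABLED: AtomicBool = AtomicBool::new(true);

// Hypothetical helper: returns the previous flag and disables interrupts.
fn irq_save_and_disable() -> bool {
    IRQ_ENABLED.swap(false, Ordering::SeqCst)
}

// Hypothetical helper: restores the flag saved at lock time.
fn irq_restore(was_enabled: bool) {
    IRQ_ENABLED.store(was_enabled, Ordering::SeqCst);
}

pub struct IrqSpinLock {
    lock: AtomicU8, // 0 = free, 1 = held
}

pub struct IrqGuard<'a> {
    owner: &'a IrqSpinLock,
    saved_irq: bool,
}

impl IrqSpinLock {
    pub const fn new() -> Self {
        IrqSpinLock { lock: AtomicU8::new(0) }
    }

    pub fn lock(&self) -> IrqGuard<'_> {
        // Mask interrupts before spinning, so a tick handler on this CPU
        // can never run while the lock is held here.
        let saved_irq = irq_save_and_disable();
        while self
            .lock
            .compare_exchange_weak(0, 1, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
        IrqGuard { owner: self, saved_irq }
    }

    pub fn is_held(&self) -> bool {
        self.lock.load(Ordering::Relaxed) == 1
    }
}

impl Drop for IrqGuard<'_> {
    fn drop(&mut self) {
        // Release the lock first, then restore the saved interrupt state.
        self.owner.lock.store(0, Ordering::Release);
        irq_restore(self.saved_irq);
    }
}
```

Tying the release to a guard's Drop keeps the restore paired with the acquire on every exit path, which is one common way to enforce this kind of discipline.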
Message Format
Message Structure
    #[repr(C)]
    pub struct Message {
        pub label: u64,        // operation identifier (msg_info bits 51:12)
        pub length: usize,     // valid message registers, 0-127 (msg_info bits 6:0)
        pub extra_caps: usize, // capabilities to transfer, 0-4 (msg_info bits 11:7)
        pub regs: [u64; 32],   // message registers MR0-MR31
        pub caps: [u64; 4],    // sender CNode slot indices for cap transfer
    }
Message Info Word
The msg_info word is packed into a single register:
| Bits | Field | Contents |
|---|---|---|
| 6:0 | length | 0-127 message registers |
| 11:7 | extra_caps | 0-4 capabilities |
| 51:12 | label | 40-bit operation identifier |
MR0-MR3 are passed in CPU registers (fastpath optimization). MR4-MR19 overflow into the IPC buffer mapped in the thread’s VSpace.
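The bit layout of msg_info can be illustrated with a pair of pack/unpack helpers. This is a sketch derived only from the bit positions listed above; the function and constant names are hypothetical and not taken from the kernel source.

```rust
const LENGTH_BITS: u32 = 7; // bits 6:0
const EXTRA_CAPS_BITS: u32 = 5; // bits 11:7
const LABEL_BITS: u32 = 40; // bits 51:12

const LENGTH_MASK: u64 = (1 << LENGTH_BITS) - 1;
const EXTRA_CAPS_MASK: u64 = (1 << EXTRA_CAPS_BITS) - 1;
const LABEL_MASK: u64 = (1 << LABEL_BITS) - 1;

const EXTRA_CAPS_SHIFT: u32 = 7;
const LABEL_SHIFT: u32 = 12;

/// Packs the three fields into one word; out-of-range bits are masked off.
pub fn pack_msg_info(label: u64, extra_caps: u64, length: u64) -> u64 {
    ((label & LABEL_MASK) << LABEL_SHIFT)
        | ((extra_caps & EXTRA_CAPS_MASK) << EXTRA_CAPS_SHIFT)
        | (length & LENGTH_MASK)
}

/// Returns (label, extra_caps, length) extracted from a packed word.
pub fn unpack_msg_info(info: u64) -> (u64, u64, u64) {
    let length = info & LENGTH_MASK;
    let extra_caps = (info >> EXTRA_CAPS_SHIFT) & EXTRA_CAPS_MASK;
    let label = (info >> LABEL_SHIFT) & LABEL_MASK;
    (label, extra_caps, length)
}
```

For example, label 0x2A with 2 extra capabilities and 5 message registers packs to 0x2A105.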
IPC Buffer
The IPC buffer is a 4,096-byte (one page) structure mapped into each thread’s VSpace:
    #[repr(C)]
    pub struct IpcBuffer {
        pub msg: [u64; 34],       // trona_msg overlay: [label, length, regs[0..31]]
        pub badge: u64,           // sender badge
        pub caps: [u64; 4],       // cap slot indices for transfer
        pub receive_cnode: u64,   // receiver's CNode for incoming caps
        pub receive_index: u64,   // starting slot in receive CNode
        pub receive_depth: u64,   // CNode depth for receive
        pub reserved: [u64; 466], // extended payload area (3,728 bytes)
    }
The reserved area is used for extended payloads (e.g., exec binary data passed from procmgr).
Operations
Send (syscall 0)
Send a message to an endpoint. Blocks if no receiver is waiting.
- Acquire endpoint.lock.
- If a receiver is waiting (RecvBlocked): dequeue the receiver, transfer the message and badge, wake the receiver.
- If no receiver: enqueue the sender in the send queue, set state to SendBlocked, release lock, reschedule.
The sender’s rights on the endpoint capability must include SEND.
Recv (syscall 1)
Receive a message from an endpoint. Blocks if no sender is waiting.
- If the thread has a bound notification with pending bits, consume the bits immediately and return (notification pre-check).
- Acquire endpoint.lock.
- If a sender is waiting (SendBlocked): dequeue the sender, transfer the message and badge, wake the sender.
- If no sender: cache the receiver’s IPC buffer receive-slot configuration into the TCB, enqueue in the recv queue, set state to RecvBlocked, release lock, reschedule.
The receive-slot configuration (receive_cnode, receive_index, receive_depth) is cached from the IPC buffer while the thread is current (its VSpace is active), because the receiver’s VSpace may not be active when a sender eventually arrives.
The receiver’s rights must include RECV.
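The rendezvous behaviour of Send and Recv can be modelled in a few lines. The sketch below is a single-threaded toy, not kernel code: EndpointModel, Outcome, and ThreadId are hypothetical names, FIFO queue order is an assumption, and locking, badges, rights checks, and the notification pre-check are left out.

```rust
use std::collections::VecDeque;

type ThreadId = u32; // hypothetical thread handle

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// A partner was waiting; the message moved and the partner is woken.
    Delivered { to: ThreadId, from: ThreadId, label: u64 },
    /// No partner was waiting; the caller was queued and would block.
    Blocked,
}

#[derive(Default)]
pub struct EndpointModel {
    send_queue: VecDeque<(ThreadId, u64)>, // blocked senders and their labels
    recv_queue: VecDeque<ThreadId>,        // blocked receivers
}

impl EndpointModel {
    pub fn send(&mut self, sender: ThreadId, label: u64) -> Outcome {
        match self.recv_queue.pop_front() {
            Some(receiver) => Outcome::Delivered { to: receiver, from: sender, label },
            None => {
                self.send_queue.push_back((sender, label));
                Outcome::Blocked
            }
        }
    }

    pub fn recv(&mut self, receiver: ThreadId) -> Outcome {
        match self.send_queue.pop_front() {
            Some((sender, label)) => Outcome::Delivered { to: receiver, from: sender, label },
            None => {
                self.recv_queue.push_back(receiver);
                Outcome::Blocked
            }
        }
    }
}
```

In this model, whichever side arrives first is queued, and the second arrival completes the transfer immediately, so at most one of the two queues is non-empty at any time.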
Call (syscall 2)
Atomic send-then-receive. The primary client RPC pattern.
- Send a message to the endpoint (same as Send, but with CallSendBlocked reason).
- When a receiver picks up the message, the caller transitions to ReplyWait (blocked waiting for the reply, not in any endpoint queue).
- The kernel saves a reply capability referencing the caller’s TCB in the server’s reply_tcb field.
- The server processes the request and invokes ReplyRecv to reply and wait for the next client.
During the Call send phase, the kernel performs priority inheritance: if the caller has an earlier deadline than the server, the server’s priority is temporarily boosted (see PIP).
The caller’s rights must include CALL.
ReplyRecv (syscall 3)
Reply to a previous caller and wait for the next message. The server-side counterpart to Call.
- Use reply_tcb to send the reply message to the original caller.
- Revert any PIP donation from the caller.
- Enter Recv on the endpoint, waiting for the next request.
The reply is one-shot: reply_tcb is cleared after sending.
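The one-shot property maps naturally onto an optional field that is taken when the reply is sent. The following is a minimal sketch under that assumption; ServerTcb, accept_call, and take_reply_target are hypothetical names, and only the reply_tcb field name comes from the text.

```rust
type ThreadId = u32; // hypothetical thread handle

#[derive(Default)]
pub struct ServerTcb {
    reply_tcb: Option<ThreadId>, // caller currently in ReplyWait, if any
}

impl ServerTcb {
    /// Records the caller when the server picks up a Call message.
    pub fn accept_call(&mut self, caller: ThreadId) {
        self.reply_tcb = Some(caller);
    }

    /// Reply phase of ReplyRecv: returns the caller to wake and clears the
    /// field, so a second reply without a new Call finds nothing to reply to.
    pub fn take_reply_target(&mut self) -> Option<ThreadId> {
        self.reply_tcb.take()
    }
}
```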
Capability Transfer
Capabilities can be transferred through IPC alongside the message. The transfer is atomic: either all capabilities are transferred, or none are (with rollback on failure).
Sender Setup
The sender sets extra_caps in the message info (1-4) and places source CNode slot indices in the IPC buffer’s caps[0..3] array.
Receiver Setup
Before calling Recv, the receiver configures three fields in its IPC buffer:
- receive_cnode: capability address of the CNode to receive into.
- receive_index: starting slot index within that CNode.
- receive_depth: CNode address resolution depth (0 for flat lookup).
These values are cached into the TCB when the thread enters RecvBlocked.
Transfer Algorithm
The transfer runs outside the endpoint lock but under CAP_LOCK:
- Validation pass (for each of the extra_caps capabilities):
  - Resolve the sender’s source slot in the sender’s CSpace.
  - Verify the source capability has the GRANT right.
  - Resolve the receiver’s destination slot via the cached receive configuration.
  - Verify the destination slot is empty.
  - Record both source and destination for the copy pass.
- Copy pass (for each validated capability):
  - Call CNode::copy_slot() on the receiver’s CNode, copying from the sender’s CSpace.
  - This allocates a fresh global slot, copies the capability, increments the object refcount, and inserts into the CDT.
  - On failure: roll back all previously copied capabilities by calling delete() on each.
- Extended payload: if the message label matches PM_EXEC_LABEL, the kernel also copies the reserved area from the sender’s IPC buffer to the receiver’s (used for binary data during exec).
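The validate-then-copy structure with rollback can be sketched over plain slot arrays. This is an illustrative model, not the kernel's code: Cap, TransferError, and transfer_caps are hypothetical, CSpaces are flattened to slices, destination slots are assumed to be consecutive from the starting index, and the fallible copy step is injected as a closure standing in for CNode::copy_slot().

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Cap {
    pub object: u32,
    pub grant: bool, // whether the capability carries the GRANT right
}

#[derive(Debug, PartialEq)]
pub enum TransferError {
    InvalidSource,
    NoGrant,
    InvalidDestination,
    SlotOccupied,
    CopyFailed,
}

pub fn transfer_caps<F>(
    src: &[Option<Cap>],
    src_slots: &[usize],
    dst: &mut [Option<Cap>],
    dst_start: usize,
    mut copy_slot: F,
) -> Result<(), TransferError>
where
    F: FnMut(Cap) -> Result<Cap, TransferError>,
{
    // Validation pass: nothing is written until every capability checks out.
    let mut plan: Vec<(Cap, usize)> = Vec::with_capacity(src_slots.len());
    for (i, &s) in src_slots.iter().enumerate() {
        let cap = src.get(s).copied().flatten().ok_or(TransferError::InvalidSource)?;
        if !cap.grant {
            return Err(TransferError::NoGrant);
        }
        let d = dst_start + i;
        match dst.get(d) {
            None => return Err(TransferError::InvalidDestination),
            Some(Some(_)) => return Err(TransferError::SlotOccupied),
            Some(None) => {}
        }
        plan.push((cap, d));
    }

    // Copy pass: on any failure, undo the slots already filled.
    let mut done: Vec<usize> = Vec::new();
    for (cap, d) in plan {
        match copy_slot(cap) {
            Ok(copy) => {
                dst[d] = Some(copy);
                done.push(d);
            }
            Err(e) => {
                for &r in &done {
                    dst[r] = None; // stands in for delete() on the copied cap
                }
                return Err(e);
            }
        }
    }
    Ok(())
}
```

Separating the passes means most failures are detected before any destination slot is touched, and the rollback loop only has to handle errors raised by the copy step itself.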
Fault Delivery
When a thread faults (page fault, user exception), the kernel sends a fault message to the thread’s designated fault endpoint.
The fault message uses the same Message format with a fault-specific label:
| Fault Type | Label | Message Registers |
|---|---|---|
| VM Fault | 2 | MR0=fault address, MR1=error code, MR2=faulting IP, MR3=is_instruction_fault |
| User Exception | 4 | MR0=exception vector, MR1=error code, MR2=faulting IP, MR3=faulting SP |
| Cap Fault | 1 | Capability lookup failure details |
The faulting thread blocks with BlockedReason::FaultBlocked.
Replying to the fault message resumes the faulted thread (reply-to-resume).
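As an illustration of the table, the sketch below encodes the two fault types whose register layout is fully specified into a label plus MR0-MR3. The Fault enum, constant names, and encode_fault are hypothetical; Cap Fault is omitted because its register contents are not detailed here.

```rust
pub const LABEL_CAP_FAULT: u64 = 1;
pub const LABEL_VM_FAULT: u64 = 2;
pub const LABEL_USER_EXCEPTION: u64 = 4;

pub enum Fault {
    Vm { addr: u64, error_code: u64, ip: u64, is_instruction: bool },
    UserException { vector: u64, error_code: u64, ip: u64, sp: u64 },
}

/// Returns (label, [MR0, MR1, MR2, MR3]) for the fault message.
pub fn encode_fault(fault: &Fault) -> (u64, [u64; 4]) {
    match *fault {
        Fault::Vm { addr, error_code, ip, is_instruction } => {
            (LABEL_VM_FAULT, [addr, error_code, ip, is_instruction as u64])
        }
        Fault::UserException { vector, error_code, ip, sp } => {
            (LABEL_USER_EXCEPTION, [vector, error_code, ip, sp])
        }
    }
}
```

A fault handler would dispatch on the label and read the registers in the same order before replying to resume the thread.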
Timed IPC
SendTimed (syscall 21) and RecvTimed (syscall 22) accept a timeout in nanoseconds.
If the operation does not complete within the timeout, the thread is woken from the endpoint queue and receives a timeout error.
The thread is inserted into both the endpoint wait queue and the sleep queue. Whichever fires first (partner arrival or timeout expiry) wakes the thread and removes it from the other queue.
Multi-Endpoint Receive
RecvAny (syscall 23) and ReplyRecvAny (syscall 24) allow a thread to wait on multiple endpoints simultaneously.
The thread registers up to MAX_RECV_WAIT_ENDPOINTS (32) endpoints via RecvWaitLink structures embedded in the TCB.
On RecvAny:
- Acquire all registered endpoints' locks in address order (preventing deadlock).
- Check each endpoint for a ready sender.
- Check the bound notification for pending bits.
- If any source is ready, deliver the message, release all locks, and return with a badge identifying the source.
- If no source is ready: enqueue the thread in every registered endpoint’s recv queue, release all locks, and block.
When any endpoint receives a message or the bound notification fires:
- The thread’s recv_wait_selected is set to identify the source.
- The thread is dequeued from all other endpoints.
- The thread is woken.
Timed variants: RecvAnyTimed (syscall 25) and ReplyRecvAnyTimed (syscall 26) add sleep queue integration.
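Acquiring several locks in address order is a standard way to avoid deadlock between threads that wait on overlapping endpoint sets. The sketch below shows the idea with std::sync::Mutex standing in for the endpoint spinlock; Ep, lock_order, and lock_all are hypothetical names, and removing duplicate registrations is an assumption of this sketch.

```rust
use std::sync::{Mutex, MutexGuard};

pub struct Ep {
    pub id: u32,
    pub lock: Mutex<()>, // stand-in for the per-endpoint spinlock
}

/// Returns the endpoints sorted by address, with duplicates removed.
/// Every caller derives the same order regardless of registration order.
pub fn lock_order<'a>(eps: &[&'a Ep]) -> Vec<&'a Ep> {
    let mut sorted: Vec<&'a Ep> = eps.to_vec();
    sorted.sort_by_key(|e| *e as *const Ep as usize);
    sorted.dedup_by_key(|e| *e as *const Ep as usize);
    sorted
}

/// Locks every endpoint in address order and returns the guards.
/// Dropping the returned vector releases all of the locks.
pub fn lock_all<'a>(eps: &[&'a Ep]) -> Vec<MutexGuard<'a, ()>> {
    lock_order(eps)
        .into_iter()
        .map(|e| e.lock.lock().unwrap())
        .collect()
}
```

Because two threads with overlapping sets always take the shared locks in the same relative order, neither can hold one lock while waiting for another that the other thread took earlier.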
Error Conditions and Edge Cases
Endpoint Destruction While Threads Are Blocked
When the last capability to an endpoint is deleted, Endpoint::cleanup() runs:
- All threads in the send queue are woken with an error.
- All threads in the recv queue are woken with an error.
Threads that were in CallSendBlocked or ReplyWait are also woken.
PIP donations through the endpoint are reverted.
Capability Transfer Failure
If cap transfer fails partway through (e.g., the 2nd of 3 caps fails due to SlotOccupied):
- All previously transferred caps in the current message are rolled back via delete().
- The message is not delivered; the entire send fails.
- The sender receives SyscallError::SlotOccupied (or the specific cap error).
RecvAny with All Endpoints Destroyed
If all registered endpoints are destroyed while the thread is blocked in RecvAny, the thread is woken as each endpoint runs cleanup.
The thread sees recv_wait_selected = NONE and returns an error.
Timed IPC Timeout vs Partner Arrival Race
When a timeout and a partner arrival happen simultaneously:
- The sleep queue and endpoint queue each attempt to wake the thread.
- Whichever succeeds first (atomic state transition) wins.
- The loser finds the thread already woken and does nothing (idempotent).
- If the timeout wins, the sender’s message is not consumed; the sender remains blocked or is also timed out.
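The "first atomic transition wins" rule can be sketched with a single compare-and-exchange on the thread's wait state. This is a minimal illustration; WaitState, try_wake, and the state constants are hypothetical names rather than the kernel's.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

pub const BLOCKED: u8 = 0;
pub const WOKEN_BY_PARTNER: u8 = 1;
pub const WOKEN_BY_TIMEOUT: u8 = 2;

pub struct WaitState(AtomicU8);

impl WaitState {
    pub fn new() -> Self {
        WaitState(AtomicU8::new(BLOCKED))
    }

    /// Attempts the BLOCKED -> `reason` transition.
    /// Returns true only for the waker that performed it; a later attempt
    /// sees a non-BLOCKED state and leaves it unchanged.
    pub fn try_wake(&self, reason: u8) -> bool {
        self.0
            .compare_exchange(BLOCKED, reason, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Reports why the thread was woken (or BLOCKED if it was not).
    pub fn reason(&self) -> u8 {
        self.0.load(Ordering::Acquire)
    }
}
```

Only the waker whose try_wake returns true goes on to remove the thread from the other queue and make it runnable; the loser simply returns.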
Related Pages
- Notifications: asynchronous signaling and bound notifications
- IPC Fastpath: assembly-optimized fast paths for Call and ReplyRecv
- Capabilities: capability transfer mechanics
- Design Patterns: server loop and client RPC patterns
- Syscall ABI: register conventions and syscall numbers