Strings and Memory

string.rs and mem.rs together cover the C string.h and the memory primitives that travel with it. The interesting parts are the architecture dispatch through the ArchString / ArchMem traits, the FreeBSD-style strlcpy / strlcat (which differ from the ISO C strncpy / strncat in subtle ways), the memmove overlap handling, and the strtok reentrancy split. This page covers the function inventory, the dispatch story, and the specific behaviors that ports need to know about.

Function Inventory

string.rs contains the byte/string manipulation functions; mem.rs contains the byte-block functions (some of which are also available under string.h aliases). Together they cover roughly the entire string.h surface plus the BSD extensions.

  • Length and search: strlen, strnlen, strchr, strrchr, strstr, strpbrk, strspn, strcspn, strchrnul

  • Comparison: strcmp, strncmp, strcasecmp, strncasecmp, strcoll

  • Copy and concatenate: strcpy, strncpy, stpcpy, stpncpy, strcat, strncat, strlcpy (BSD), strlcat (BSD)

  • Tokenize: strtok, strtok_r, strsep

  • Duplicate: strdup, strndup (declared in malloc.rs because they allocate)

  • Error and signal: strerror, strerror_r, strsignal

  • Memory operations: memcpy, memmove, memset, memcmp, memchr, memrchr, memmem

  • Bit and byte: bzero, bcopy, bcmp, ffs, fls

Architecture Dispatch

mem.rs and string.rs rely on the crate::arch::Arch type for the operations that benefit from SIMD. The pattern is consistent throughout:

#[unsafe(no_mangle)]
pub unsafe extern "C" fn memcpy(dst: *mut u8, src: *const u8, n: usize) -> *mut u8 {
    unsafe { crate::arch::Arch::memcpy(dst, src, n) }
}

#[unsafe(no_mangle)]
pub unsafe extern "C" fn strlen(s: *const u8) -> usize {
    unsafe { crate::arch::Arch::strlen(s) }
}

Arch is a compile-time alias for X86_64Arch or AArch64Arch, both of which implement ArchMem and ArchString. On x86_64 with basaltc_sse2 enabled, memcpy becomes mem_sse2::memcpy_sse2 (16-byte movdqu loop), strlen becomes string_sse2::strlen_sse2 (pcmpeqb + pmovmskb). On aarch64, both go through NEON. See Multi-Architecture Dispatch for the full trait surface and the per-architecture implementations.
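The compile-time alias pattern can be sketched as follows. This is a minimal illustration under stated assumptions, not basaltc’s actual code: the trait and type names come from the text above, but the single-method trait, the GenericArch fallback, the c_str_len helper, and the scalar bodies are all hypothetical and exist only to keep the example self-contained.

```rust
/// One method stands in for the full ArchString surface (assumption).
pub trait ArchString {
    /// Returns the length of the NUL-terminated string at `s`.
    unsafe fn strlen(s: *const u8) -> usize;
}

pub struct X86_64Arch;
pub struct AArch64Arch;
/// Hypothetical fallback so the sketch builds on any target.
pub struct GenericArch;

/// Scalar body shared by every impl in this sketch; a real backend
/// would put its SSE2 or NEON routine in its own impl instead.
unsafe fn strlen_scalar(s: *const u8) -> usize {
    let mut n = 0;
    unsafe {
        while *s.add(n) != 0 {
            n += 1;
        }
    }
    n
}

impl ArchString for X86_64Arch {
    unsafe fn strlen(s: *const u8) -> usize {
        unsafe { strlen_scalar(s) }
    }
}

impl ArchString for AArch64Arch {
    unsafe fn strlen(s: *const u8) -> usize {
        unsafe { strlen_scalar(s) }
    }
}

impl ArchString for GenericArch {
    unsafe fn strlen(s: *const u8) -> usize {
        unsafe { strlen_scalar(s) }
    }
}

// Exactly one alias is compiled in, so there is no runtime dispatch.
#[cfg(target_arch = "x86_64")]
pub type Arch = X86_64Arch;
#[cfg(target_arch = "aarch64")]
pub type Arch = AArch64Arch;
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
pub type Arch = GenericArch;

/// Call sites name only `Arch`; the alias selects the impl.
pub fn c_str_len(bytes_with_nul: &[u8]) -> usize {
    assert!(bytes_with_nul.contains(&0), "input must contain a NUL");
    unsafe { Arch::strlen(bytes_with_nul.as_ptr()) }
}
```

Because the alias resolves during compilation, the exported C wrapper and the selected backend can be inlined into one another.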

Functions that do not benefit from SIMD stay in string.rs/mem.rs as scalar Rust:

  • strcmp, strncmp, strcasecmp, strncasecmp: the loop must stop at the first differing byte, so SIMD gain is limited.

  • strcpy, strncpy, stpcpy, stpncpy — copy until NUL, no SIMD benefit until the length is known.

  • strchr, strrchr, strstr, strpbrk — pattern search, mostly bound by branch prediction.

  • bzero, bcmp, bcopy, ffs, fls — small fast paths.

The functions in this category use straightforward Rust loops that the compiler optimizes per target.
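As an illustration of what such a loop looks like, here is a hedged sketch of a strcmp-style comparison. The names strcmp_sketch and compare are hypothetical, and the body is an assumption about the general shape, not a copy of basaltc’s source.

```rust
/// Byte-at-a-time comparison in the style of strcmp (illustrative name).
pub unsafe fn strcmp_sketch(a: *const u8, b: *const u8) -> i32 {
    let mut i = 0;
    unsafe {
        loop {
            let x = *a.add(i);
            let y = *b.add(i);
            // Stop at the first difference or at the shared terminator.
            if x != y || x == 0 {
                // Bytes are compared as unsigned values, as C requires.
                return x as i32 - y as i32;
            }
            i += 1;
        }
    }
}

/// Safe helper for experimenting; both inputs must end in a NUL byte.
pub fn compare(a: &[u8], b: &[u8]) -> i32 {
    assert!(a.ends_with(&[0]) && b.ends_with(&[0]));
    unsafe { strcmp_sketch(a.as_ptr(), b.as_ptr()) }
}
```

The early exit on every iteration is what keeps a wide vector compare from paying off for short or quickly diverging strings.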

memmove Overlap Handling

memcpy requires non-overlapping dst and src; memmove accepts overlapping ranges. The implementation checks the relative position and dispatches:

#[unsafe(no_mangle)]
pub unsafe extern "C" fn memmove(dst: *mut u8, src: *const u8, n: usize) -> *mut u8 {
    unsafe { crate::arch::Arch::memmove(dst, src, n) }
}

Arch::memmove (in arch/x86_64/mem_sse2.rs or arch/aarch64/mod.rs) checks:

  • If dst < src, copy forward (low to high addresses). This is safe because every source byte is read before the advancing write position can overwrite it.

  • If dst > src, copy backward (high to low addresses). This is safe for the same reason in the opposite direction.

  • If dst == src, no copy is needed.

The forward and backward paths both use the SIMD primitives, so memmove is nearly as fast as memcpy even for overlapping ranges.

The scalar fallback in arch/x86_64/mod.rs follows the same pattern with byte loops.
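The direction choice can be sketched with plain byte loops. This is an illustrative version written for this page, assuming the three-way comparison described above; memmove_scalar and move_within are hypothetical names, and the real fallback may differ in detail.

```rust
/// Overlap-aware byte copy (illustrative; not basaltc's actual fallback).
pub unsafe fn memmove_scalar(dst: *mut u8, src: *const u8, n: usize) -> *mut u8 {
    let d = dst as usize;
    let s = src as usize;
    unsafe {
        if d < s {
            // Forward: the write position trails the read position.
            for i in 0..n {
                *dst.add(i) = *src.add(i);
            }
        } else if d > s {
            // Backward: start at the end so the tail of src is read
            // before the writes reach it.
            for i in (0..n).rev() {
                *dst.add(i) = *src.add(i);
            }
        }
        // d == s: source and destination coincide, nothing to move.
    }
    dst
}

/// Safe wrapper: moves `n` bytes from `src_off` to `dst_off` in one buffer.
pub fn move_within(buf: &mut [u8], src_off: usize, dst_off: usize, n: usize) {
    assert!(src_off + n <= buf.len() && dst_off + n <= buf.len());
    let base = buf.as_mut_ptr();
    unsafe {
        memmove_scalar(base.add(dst_off), base.add(src_off), n);
    }
}
```

Moving four bytes from offset 0 to offset 2 in an eight-byte buffer exercises the backward path; the reverse move exercises the forward path.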

FreeBSD strlcpy and strlcat

strlcpy and strlcat are the BSD-recommended replacements for strncpy and strncat. They differ from the ISO C functions in two important ways: the size parameter is the full buffer size (not the maximum copy length), and the result is always NUL-terminated as long as the buffer is non-empty.

#[unsafe(no_mangle)]
pub unsafe extern "C" fn strlcpy(dst: *mut u8, src: *const u8, size: usize) -> usize {
    unsafe {
        let src_len = crate::arch::Arch::strlen(src);
        if size > 0 {
            let copy = core::cmp::min(size - 1, src_len);
            core::ptr::copy_nonoverlapping(src, dst, copy);
            *dst.add(copy) = 0;
        }
        src_len  // Length of source, NOT length copied
    }
}

The return value is the length of the source string, not the number of bytes copied. This lets callers detect truncation:

if (strlcpy(buf, src, sizeof(buf)) >= sizeof(buf)) {
    // truncation occurred
}

strlcat follows the same convention: append src to dst, NUL-terminating, and return the total length the result would have had if the buffer were big enough.
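A hedged sketch of strlcat under the convention just described. The name strlcat_sketch, the append helper, and the exact loop structure are assumptions made for illustration; only the return-value rule is taken from the BSD definition.

```rust
use core::cmp::min;
use core::ptr;

/// Appends `src` to `dst` within a buffer of `size` bytes (illustrative).
pub unsafe fn strlcat_sketch(dst: *mut u8, src: *const u8, size: usize) -> usize {
    unsafe {
        // Length of dst, never scanning past the end of the buffer.
        let mut dst_len = 0;
        while dst_len < size && *dst.add(dst_len) != 0 {
            dst_len += 1;
        }
        let mut src_len = 0;
        while *src.add(src_len) != 0 {
            src_len += 1;
        }
        // No NUL inside the buffer: append nothing, report size + src_len.
        if dst_len == size {
            return size + src_len;
        }
        // Reserve one byte for the terminator.
        let copy = min(size - dst_len - 1, src_len);
        ptr::copy_nonoverlapping(src, dst.add(dst_len), copy);
        *dst.add(dst_len + copy) = 0;
        // Length the result would have had with unlimited space.
        dst_len + src_len
    }
}

/// Safe helper: `buf` holds a NUL-terminated string, `src` ends in NUL.
pub fn append(buf: &mut [u8], src: &[u8]) -> usize {
    assert!(src.ends_with(&[0]));
    unsafe { strlcat_sketch(buf.as_mut_ptr(), src.as_ptr(), buf.len()) }
}
```

As with strlcpy, a return value greater than or equal to the buffer size signals truncation.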

The reason these matter for ports: ported FreeBSD utilities use strlcpy / strlcat exclusively for buffer-safe string handling, and they expect the BSD return-value semantics. basaltc’s implementations follow the BSD semantics exactly, including the return value on truncation.

strtok and strtok_r

strtok(str, delim) is the classic tokenizer with internal state:

static mut STRTOK_STATE: *mut u8 = core::ptr::null_mut();

#[unsafe(no_mangle)]
pub unsafe extern "C" fn strtok(s: *mut u8, delim: *const u8) -> *mut u8 {
    unsafe { strtok_r(s, delim, &raw mut STRTOK_STATE) }
}

basaltc’s strtok is not thread-safe: it keeps its position in the global STRTOK_STATE, which matches the C standard’s model of strtok as a function with hidden state between calls. Threaded code should call strtok_r directly with a caller-owned state pointer:

char *saveptr;
char *tok = strtok_r(line, " \t", &saveptr);
while (tok) {
    process(tok);
    tok = strtok_r(NULL, " \t", &saveptr);
}
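The source above shows strtok forwarding to strtok_r but not strtok_r itself. The following is a minimal sketch of how such a function can work, assuming the usual skip-delimiters, scan-token, write-NUL sequence; strtok_r_sketch, is_delim, and tokens are hypothetical names, and basaltc’s real implementation may differ.

```rust
use core::ptr;

/// True if `c` occurs in the NUL-terminated delimiter set.
unsafe fn is_delim(c: u8, delim: *const u8) -> bool {
    let mut d = delim;
    unsafe {
        while *d != 0 {
            if *d == c {
                return true;
            }
            d = d.add(1);
        }
    }
    false
}

/// Reentrant tokenizer: all state lives in `*save` (illustrative).
pub unsafe fn strtok_r_sketch(s: *mut u8, delim: *const u8, save: *mut *mut u8) -> *mut u8 {
    unsafe {
        let mut p = if s.is_null() { *save } else { s };
        if p.is_null() {
            return ptr::null_mut();
        }
        // Skip leading delimiters, so empty tokens never appear.
        while *p != 0 && is_delim(*p, delim) {
            p = p.add(1);
        }
        if *p == 0 {
            *save = ptr::null_mut();
            return ptr::null_mut();
        }
        let tok = p;
        while *p != 0 && !is_delim(*p, delim) {
            p = p.add(1);
        }
        if *p != 0 {
            *p = 0; // terminate the token in place
            *save = p.add(1);
        } else {
            *save = ptr::null_mut(); // end of input reached
        }
        tok
    }
}

/// Safe helper that collects every token of `line`.
pub fn tokens(line: &str, delims: &str) -> Vec<String> {
    let mut buf: Vec<u8> = line.bytes().chain(Some(0)).collect();
    let delim: Vec<u8> = delims.bytes().chain(Some(0)).collect();
    let mut save: *mut u8 = ptr::null_mut();
    let mut out = Vec::new();
    unsafe {
        let mut tok = strtok_r_sketch(buf.as_mut_ptr(), delim.as_ptr(), &mut save);
        while !tok.is_null() {
            let c = std::ffi::CStr::from_ptr(tok as *const std::ffi::c_char);
            out.push(c.to_string_lossy().into_owned());
            tok = strtok_r_sketch(ptr::null_mut(), delim.as_ptr(), &mut save);
        }
    }
    out
}
```

Since the only state is the caller’s save pointer, two threads (or two nested loops) can tokenize independently.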

strsep(stringp, delim) is the BSD variant with slightly different empty-token handling: it returns empty strings for adjacent delimiters, while strtok_r skips them.
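The difference in empty-token handling can be modeled in a few lines of safe Rust. This is only a behavioral model built on the standard library’s str::split, not the C functions themselves, and both function names are hypothetical.

```rust
/// strtok_r-style: runs of delimiters collapse, so no empty tokens appear.
pub fn tokens_strtok_style<'a>(s: &'a str, delims: &str) -> Vec<&'a str> {
    s.split(|c: char| delims.contains(c))
        .filter(|t| !t.is_empty())
        .collect()
}

/// strsep-style: every delimiter ends a field, so adjacent delimiters
/// produce empty strings.
pub fn tokens_strsep_style<'a>(s: &'a str, delims: &str) -> Vec<&'a str> {
    s.split(|c: char| delims.contains(c)).collect()
}
```

For a colon-separated record such as "root::0", the strsep-style split keeps the empty second field, which matters when fields are identified by position.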

strstr Algorithm

basaltc’s strstr uses a simple O(N×M) two-loop search:

#[unsafe(no_mangle)]
pub unsafe extern "C" fn strstr(haystack: *const u8, needle: *const u8) -> *mut u8 {
    unsafe {
        if *needle == 0 { return haystack as *mut u8; }
        let mut h = haystack;
        while *h != 0 {
            let mut hh = h;
            let mut nn = needle;
            while *nn != 0 && *hh == *nn {
                hh = hh.add(1);
                nn = nn.add(1);
            }
            if *nn == 0 { return h as *mut u8; }
            h = h.add(1);
        }
        core::ptr::null_mut()
    }
}

There is no Boyer-Moore or KMP, no SIMD acceleration, and no length-check pre-pass. This is acceptable because strstr is rarely the bottleneck in realistic basaltc workloads; a faster implementation can be added if a port turns out to need one.

memmem

memmem(haystack, hlen, needle, nlen) is the binary-safe analogue of strstr (no NUL termination required). basaltc implements it the same way, using the explicit lengths as loop bounds instead of NUL checks.

memmem is a GNU extension (also provided by the BSD libcs) used by curl, openssl, and many cryptography libraries; it is exported from basaltc unconditionally.
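A length-bounded version of the two-loop search might look like the sketch below. The names memmem_sketch and find are hypothetical, and the empty-needle and needle-longer-than-haystack checks are assumptions added so the loop bounds cannot underflow; basaltc’s own handling of those cases is not shown in this page.

```rust
use core::ptr;

/// Two-loop search bounded by explicit lengths (illustrative).
pub unsafe fn memmem_sketch(
    haystack: *const u8,
    hlen: usize,
    needle: *const u8,
    nlen: usize,
) -> *mut u8 {
    // Assumed edge cases: an empty needle matches at the start.
    if nlen == 0 {
        return haystack as *mut u8;
    }
    if nlen > hlen {
        return ptr::null_mut();
    }
    unsafe {
        for start in 0..=(hlen - nlen) {
            let mut i = 0;
            // NUL bytes are ordinary data here; only the lengths stop the scan.
            while i < nlen && *haystack.add(start + i) == *needle.add(i) {
                i += 1;
            }
            if i == nlen {
                return haystack.add(start) as *mut u8;
            }
        }
    }
    ptr::null_mut()
}

/// Safe helper returning the match offset, if any.
pub fn find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    let p = unsafe {
        memmem_sketch(haystack.as_ptr(), haystack.len(), needle.as_ptr(), needle.len())
    };
    if p.is_null() {
        None
    } else {
        Some(p as usize - haystack.as_ptr() as usize)
    }
}
```

The helper makes it easy to check that a needle containing embedded NUL bytes is found, which strstr cannot do.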

strerror and strerror_r

strerror(errnum) returns a NUL-terminated string for the given errno value. basaltc has a static table of strings indexed by errno number:

#[unsafe(no_mangle)]
pub unsafe extern "C" fn strerror(errnum: i32) -> *mut u8 {
    if errnum >= 0 && (errnum as usize) < ERR_TABLE.len() {
        ERR_TABLE[errnum as usize].as_ptr() as *mut u8
    } else {
        b"Unknown error\0".as_ptr() as *mut u8
    }
}

The table includes strings for the Linux errno numbers basaltc supports (see trona Boundary). The returned pointer is into a static array, so it stays valid forever and is shared across threads.

strerror_r(errnum, buf, buflen) writes the string into the caller’s buffer. Two incompatible signatures exist: the POSIX (XSI) form, which returns int, and the GNU form, which returns char*. basaltc provides the XSI form by default and the GNU form when the corresponding compile-time symbol is defined.
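The XSI contract (0 on success, an error number on failure) can be sketched over a slice. Everything here is an assumption for illustration: the three-entry ERR_TABLE_SKETCH, the function name strerror_r_xsi, and the choice to leave the buffer untouched on failure. The EINVAL and ERANGE values are the Linux ones.

```rust
const EINVAL: i32 = 22; // Linux value
const ERANGE: i32 = 34; // Linux value

/// Hypothetical three-entry table; a real table covers every errno.
static ERR_TABLE_SKETCH: [&str; 3] = [
    "Success",
    "Operation not permitted",
    "No such file or directory",
];

/// XSI-style strerror_r over a slice: returns 0 or an error number.
pub fn strerror_r_xsi(errnum: i32, buf: &mut [u8]) -> i32 {
    if errnum < 0 || (errnum as usize) >= ERR_TABLE_SKETCH.len() {
        return EINVAL; // unknown error number
    }
    let msg = ERR_TABLE_SKETCH[errnum as usize].as_bytes();
    if buf.len() < msg.len() + 1 {
        return ERANGE; // no room for the message plus its NUL
    }
    buf[..msg.len()].copy_from_slice(msg);
    buf[msg.len()] = 0;
    0
}
```

The GNU form differs in that it returns a pointer to the message, which may or may not be the caller’s buffer, so code written against one signature does not work unchanged with the other.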

bzero, bcopy, bcmp

The BSD memory functions map directly onto memset, memmove, and memcmp. bzero drops the fill-value argument, bcopy swaps source and destination, and bcmp keeps the memcmp argument order:

void bzero(void *s, size_t n);              // memset(s, 0, n)
void bcopy(const void *src, void *dst, size_t n);  // memmove(dst, src, n) — note arg order!
int  bcmp(const void *s1, const void *s2, size_t n); // memcmp

basaltc implements them as one-line forwarders to the corresponding mem* functions. The source-first argument order of bcopy is the historical BSD signature; ports that use bcopy get the right behavior because basaltc forwards bcopy(src, dst, n) to memmove(dst, src, n).
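The forwarding shape can be sketched as follows. To stay self-contained, this version uses core::ptr and slice comparison as stand-ins for basaltc’s own memset, memmove, and memcmp, and the _sketch names are hypothetical.

```rust
use core::ptr;

/// bzero(s, n) behaves like memset(s, 0, n).
pub unsafe fn bzero_sketch(s: *mut u8, n: usize) {
    unsafe { ptr::write_bytes(s, 0, n) }
}

/// bcopy takes the source first and the destination second.
pub unsafe fn bcopy_sketch(src: *const u8, dst: *mut u8, n: usize) {
    // ptr::copy tolerates overlap, like memmove.
    unsafe { ptr::copy(src, dst, n) }
}

/// bcmp reports only equal (0) or different (non-zero).
pub unsafe fn bcmp_sketch(a: *const u8, b: *const u8, n: usize) -> i32 {
    unsafe {
        let x = core::slice::from_raw_parts(a, n);
        let y = core::slice::from_raw_parts(b, n);
        (x != y) as i32
    }
}
```

A caller that mistakes bcopy for memmove and passes the destination first will copy in the wrong direction without any compiler diagnostic, since both pointers are byte pointers.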

ffs and fls

ffs(i) returns the position of the first (least significant) set bit in i. Positions are 1-based: the LSB is position 1, and a return value of 0 means no bits are set. fls(i) returns the position of the last (most significant) set bit under the same numbering.

#[unsafe(no_mangle)]
pub extern "C" fn ffs(i: i32) -> i32 {
    if i == 0 { 0 } else { i.trailing_zeros() as i32 + 1 }
}

#[unsafe(no_mangle)]
pub extern "C" fn fls(i: i32) -> i32 {
    if i == 0 { 0 } else { 32 - i.leading_zeros() as i32 }
}

Both functions compile to a few instructions: trailing_zeros and leading_zeros map to bsf / bsr on x86_64 and to rbit + clz / clz on aarch64, so the Rust intrinsics generate code as efficient as a hand-written assembly version.
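A few worked values make the 1-based numbering concrete. The sketch reuses the logic shown above under hypothetical names (ffs_model, fls_model) so that it does not clash with the C library’s own symbols when built as an ordinary program.

```rust
/// Same logic as ffs above, under an illustrative name.
pub fn ffs_model(i: i32) -> i32 {
    if i == 0 { 0 } else { i.trailing_zeros() as i32 + 1 }
}

/// Same logic as fls above, under an illustrative name.
pub fn fls_model(i: i32) -> i32 {
    if i == 0 { 0 } else { 32 - i.leading_zeros() as i32 }
}

/// Returns (input, ffs, fls) triples for a handful of inputs.
pub fn worked_values() -> [(i32, i32, i32); 5] {
    [0, 1, 8, 0b0110_1000, -1].map(|i| (i, ffs_model(i), fls_model(i)))
}
```

For 0b0110_1000 the lowest set bit is at position 4 and the highest at position 7; for -1 every bit is set, so the results are 1 and 32.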

strchr, strrchr, strchrnul

strchr(s, c) returns a pointer to the first occurrence of c in s, or NULL if not found. strrchr(s, c) returns the last occurrence. strchrnul(s, c) (a GNU extension) returns the first occurrence, or a pointer to the terminating NUL if c is not found. This is useful for path parsing, where a missing character should fall through to the end of the string.

basaltc implements all three. The SIMD potential is limited because the NUL terminator is a separate condition from "did we find c", so the inner loop carries two checks per byte.
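The two-checks-per-byte loop can be sketched as below. The names are hypothetical, and layering strchr on top of strchrnul is an assumption chosen to keep the example short; it is not a statement about how basaltc structures these functions.

```rust
use core::ptr;

/// Returns a pointer to the first `c`, or to the terminating NUL.
pub unsafe fn strchrnul_sketch(s: *const u8, c: i32) -> *mut u8 {
    let target = c as u8; // C passes the character as an int
    let mut p = s;
    unsafe {
        // Two conditions per byte: end of string, or a match.
        while *p != 0 && *p != target {
            p = p.add(1);
        }
    }
    p as *mut u8
}

/// strchr differs only in returning NULL when the scan hit the NUL.
pub unsafe fn strchr_sketch(s: *const u8, c: i32) -> *mut u8 {
    unsafe {
        let p = strchrnul_sketch(s, c);
        // Searching for 0 itself matches the terminator, as in C.
        if *p == c as u8 { p } else { ptr::null_mut() }
    }
}

/// Safe helper: offset of `c` in `s`, or of the NUL if `c` is absent.
pub fn offset_or_end(s: &[u8], c: u8) -> usize {
    assert!(s.ends_with(&[0]));
    let p = unsafe { strchrnul_sketch(s.as_ptr(), c as i32) };
    p as usize - s.as_ptr() as usize
}
```

With strchrnul, a path-splitting loop can treat "no separator" and "separator at the end" uniformly, because the result is always a valid pointer into the string.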