Math Library

math.rs (~780 lines) implements <math.h> for basaltc. The architecture-independent functions (sign manipulation, classification, infinity / NaN handling) live directly in math.rs and operate on u64/u32 bit patterns. The architecture-dependent functions (sqrt, sin, cos, tan, log, exp, pow, atan2, fma, scalbn, etc.) dispatch through the ArchMath trait so x86_64 can use SSE2 / x87 and aarch64 can use NEON. This page covers the function inventory, the bit-level operations, the architecture dispatch, the precision guarantees, and the errno/errno-state semantics.

Function Inventory

Group Functions

Classification

fpclassify, isfinite, isnan, isinf, isnormal, signbit

Sign manipulation

copysign, fabs, nan

Rounding

floor, ceil, trunc, round, rint, nearbyint, lrint, llrint, lround, llround

Power and root

sqrt, cbrt, pow, hypot

Exponential and logarithm

exp, exp2, expm1, log, log2, log10, log1p, logb, ilogb

Trigonometry

sin, cos, tan, asin, acos, atan, atan2

Hyperbolic

sinh, cosh, tanh, asinh, acosh, atanh

Modulo and remainder

fmod, remainder, remquo, modf, frexp, ldexp, scalbn, scalbln

Combined operations

fma, fdim, fmax, fmin

Error and gamma

erf, erfc, tgamma, lgamma

Float versions

All of the above with f suffix for float (e.g., sqrtf, sinf)

Architecture-Independent Functions

The functions that operate on the bit pattern of a float — fpclassify, isnan, isinf, signbit, copysign, fabs — are implemented in pure Rust on u64 (for double) and u32 (for float). They never need hardware floating-point support and never depend on Arch::*.

#[unsafe(no_mangle)]
pub extern "C" fn fabs(x: f64) -> f64 {
    f64::from_bits(x.to_bits() & 0x7FFF_FFFF_FFFF_FFFF)
}

#[unsafe(no_mangle)]
pub extern "C" fn copysign(x: f64, y: f64) -> f64 {
    let mag = x.to_bits() & 0x7FFF_FFFF_FFFF_FFFF;
    let sign = y.to_bits() & 0x8000_0000_0000_0000;
    f64::from_bits(mag | sign)
}

#[unsafe(no_mangle)]
pub extern "C" fn isnan(x: f64) -> i32 {
    let bits = x.to_bits();
    let exp = (bits >> 52) & 0x7FF;
    let mant = bits & 0x000F_FFFF_FFFF_FFFF;
    (exp == 0x7FF && mant != 0) as i32
}

#[unsafe(no_mangle)]
pub extern "C" fn fpclassify(x: f64) -> i32 {
    let bits = x.to_bits();
    let exp = (bits >> 52) & 0x7FF;
    let mant = bits & 0x000F_FFFF_FFFF_FFFF;
    if exp == 0 && mant == 0 { return FP_ZERO; }
    if exp == 0 { return FP_SUBNORMAL; }
    if exp == 0x7FF && mant == 0 { return FP_INFINITE; }
    if exp == 0x7FF { return FP_NAN; }
    FP_NORMAL
}

These compile to a few integer instructions per function — faster than any hardware-classification path because they avoid loading the value into an SSE/NEON register at all.

ArchMath Dispatch

The architecture-dependent functions go through crate::arch::Arch:

#[unsafe(no_mangle)]
pub extern "C" fn sqrt(x: f64) -> f64 { crate::arch::Arch::sqrt(x) }
#[unsafe(no_mangle)]
pub extern "C" fn sin(x: f64)  -> f64 { crate::arch::Arch::sin(x) }
#[unsafe(no_mangle)]
pub extern "C" fn cos(x: f64)  -> f64 { crate::arch::Arch::cos(x) }
#[unsafe(no_mangle)]
pub extern "C" fn pow(x: f64, y: f64) -> f64 { crate::arch::Arch::pow(x, y) }

Arch::sqrt resolves at compile time to X86_64Arch::sqrt or AArch64Arch::sqrt, both of which implement the ArchMath trait. See Multi-Architecture Dispatch for the full trait surface.

x86_64 Backend

X86_64Arch provides two math implementations side by side:

  • math_sse2.rs — uses SSE2 sqrtsd / sqrtss for square roots.

  • math_x87.rs — uses the legacy x87 FPU for everything else (transcendentals, trig, log, exp).

The dispatch in arch/x86_64/mod.rs:

impl super::ArchMath for X86_64Arch {
    fn sqrt(x: f64) -> f64 {
        #[cfg(basaltc_sse2)]
        { unsafe { math_sse2::sqrt_sse2(x) } }
        #[cfg(not(basaltc_sse2))]
        { math_x87::sqrt_x87(x) }
    }
    fn sin(x: f64)   -> f64 { math_x87::sin_x87(x) }
    fn cos(x: f64)   -> f64 { math_x87::cos_x87(x) }
    fn log(x: f64)   -> f64 { math_x87::log_x87(x) }
    fn exp(x: f64)   -> f64 { math_x87::exp_x87(x) }
    fn pow(x: f64,y: f64) -> f64 { math_x87::pow_x87(x, y) }
    // ... and so on
}

Only sqrt and sqrtf use SSE2; everything else routes through x87. The reason: SSE2 has packed-double sqrtsd instruction but no fsin / fcos / flog equivalents. basaltc could implement those in software using SSE2 polynomial approximations, but the x87 transcendentals are precise enough and shorter to write, so basaltc uses them.

math_x87.rs is hand-written core::arch::asm! blocks invoking the f* instruction family:

pub fn sin_x87(x: f64) -> f64 {
    let result: f64;
    unsafe {
        asm!(
            "fsin",
            inout("st(0)") x => result,
            options(nomem, nostack, pure),
        );
    }
    result
}

Each x87 wrapper is 5–15 lines of assembly with register constraints. The functions are marked pure and nomem so the optimizer can hoist them out of loops.

aarch64 Backend

AArch64Arch (lib/basalt/c/src/arch/aarch64/mod.rs) is one ~17 KB file containing all math, memory, and string implementations. Math functions use NEON intrinsics directly:

  • sqrt uses the FSQRT instruction.

  • floor, ceil, trunc, rint use FRINTM, FRINTP, FRINTZ, FRINTX — dedicated rounding instructions.

  • Transcendentals (sin, cos, log, exp) use software approximations written in pure Rust on top of FMA primitives.

aarch64 does not have hardware transcendentals, so the implementations are explicit polynomial approximations with carefully chosen coefficients. The accuracy target is "approximately one ulp" for most functions, which is enough for typical port use cases but may differ from x86_64 by a few ulps for the same input.

Ports that depend on bit-exact reproducibility across architectures should not use the basaltc transcendentals — they should bring their own implementation (e.g., the SoftFloat library or a vendored copy of fdlibm).

Float vs Double

basaltc provides both double and float versions of every architecture-dependent function:

  • sqrt(double) and sqrtf(float)

  • sin(double) and sinf(float)

  • pow(double, double) and powf(float, float)

The f-suffixed versions either dispatch to a dedicated SIMD instruction (sqrtss for sqrtf) or convert to double, call the double version, and convert back. Because the x87 transcendentals operate on 80-bit extended precision, the round-trip through double does not lose precision relative to a hypothetical pure-float implementation.

long double

basaltc treats long double as double on both x86_64 and aarch64. This is the same choice Fuchsia and many other modern libcs make: 80-bit extended precision is x86-specific, painful to support across architectures, and rarely needed in modern C code.

logl, sinl, sqrtl, etc., are provided as one-line wrappers that cast long double to double, call the double version, and cast back. Programs that explicitly check LDBL_DIG > DBL_DIG (the documented way to detect extended precision) will get LDBL_DIG == DBL_DIG, which correctly tells them no extra precision is available.

errno and Range Errors

POSIX requires math functions to set errno to EDOM for domain errors (e.g., sqrt(-1)) and ERANGE for range errors (e.g., exp(1000) overflowing to infinity). basaltc honors these:

#[unsafe(no_mangle)]
pub extern "C" fn sqrt(x: f64) -> f64 {
    if x < 0.0 {
        crate::errno::set_errno(crate::errno::EDOM);
        return f64::NAN;
    }
    crate::arch::Arch::sqrt(x)
}

Functions that produce infinity on overflow (exp, pow, cosh) check the result and set ERANGE if the magnitude exceeds the representable range. Functions that produce zero on underflow do not set ERANGE (POSIX leaves underflow signaling implementation-defined).

math_errhandling constant in <math.h> is set to MATH_ERRNO, indicating that basaltc reports errors via errno (not via the IEEE 754 exception flags, which would require feexcept support that basaltc does not have).

fma — Fused Multiply-Add

fma(x, y, z) computes x * y + z with a single rounding step at the end, instead of two roundings (one after the multiply, one after the add). This is required by IEEE 754 and matters for numerical accuracy in matrix and dot-product code.

basaltc’s fma uses:

  • x86_64: hardware FMA via vfmadd* instructions if available, otherwise a software fallback.

  • aarch64: hardware FMADD instruction.

The hardware path is selected at runtime via cpuid on x86_64 (FMA3 feature flag). On older x86_64 CPUs without FMA, basaltc falls back to a software implementation that does the multiply in extended precision, then the add, then the round.

Rounding Mode

basaltc does not provide <fenv.h> rounding mode control. The rounding mode is fixed at "round to nearest, ties to even" (the IEEE 754 default). Programs that need to change the rounding mode can core::arch::asm! with ldmxcsr directly, but the consequences for basaltc internals are undefined — the math functions assume default rounding.

fenv.h is present as a stub with the constants but no functional fegetround / fesetround / feraiseexcept.

Constraints

  • No quad precision__float128 is not supported.

  • No bit-exact cross-architecture results — transcendentals may differ by a few ulps between x86_64 and aarch64.

  • No <fenv.h> rounding control — fixed at round-to-nearest-even.

  • No IEEE 754 exception flag supportmath_errhandling = MATH_ERRNO only, not MATH_ERREXCEPT.

These match the standard limitations of a freestanding libc and have not blocked practical use cases.