#arxiv


Quantum-assured magnetic navigation with higher positioning accuracy than GPS

arxiv.org/abs/2504.08167

arXiv.org

Quantum-assured magnetic navigation achieves positioning accuracy better than a strategic-grade INS in airborne and ground-based field trials

Modern navigation systems rely critically on GNSS, which in many cases is unavailable or unreliable (e.g. due to jamming or spoofing). For this reason there is great interest in augmenting backup navigation systems such as inertial navigation systems (INS) with additional modalities that reduce positioning error in the absence of reliable GNSS. Magnetic-anomaly navigation is one such approach, providing passive, non-jammable navigation through periodic position fixes obtained by comparing local measurements of Earth's crustal field against known anomaly maps. Despite its potential, existing MagNav efforts have been limited by magnetometer performance and platform noise; solutions addressing these problems have proven either too brittle or impractical for realistic deployment. Here we demonstrate a quantum-assured MagNav solution based on proprietary quantum magnetometers combined with novel denoising and map-matching algorithms. The system fits on fixed-wing drones or in the avionics bay of a commercial airliner. We present trials at altitudes up to 19000 feet, testing onboard and outboard quantum magnetometers and comparing against a strategic-grade INS. Our MagNav solution achieves superior performance, delivering up to 46x better positioning error than the velocity-aided INS; the best final positioning accuracy we achieve is 22 m, or 0.006% of the flight distance. Airborne trials consistently achieve at least an 11x advantage over the INS across varying conditions, altitudes, and flight patterns. The system learns model parameters online without special vehicle maneuvers, providing robustness to various configuration changes (e.g. changing payload or latitude). Our trials also include the first successful MagNav performed in a ground vehicle using publicly-available anomaly maps, delivering bounded positioning error 7x lower than the INS, with both systems in strapdown configuration.
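
As a quick sanity check on the two figures quoted for the best run, 22 m being 0.006% of the flight distance implies a flight of a few hundred kilometres. A minimal sketch of that arithmetic follows; the function name is illustrative and not from the paper, and since 0.006% is a rounded value the result is only approximate.

```python
def implied_distance_km(error_m: float, error_percent: float) -> float:
    """Distance (km) for which error_m equals error_percent of the distance."""
    return error_m / (error_percent / 100.0) / 1000.0


# Figures quoted in the abstract: 22 m final error, 0.006% of flight distance.
distance_km = implied_distance_km(22.0, 0.006)
print(f"Implied flight distance: about {distance_km:.0f} km")
```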

Pydrofoil: Accelerating Sail-based instruction set simulators

arxiv.org/abs/2503.04389

arXiv.org

Pydrofoil: accelerating Sail-based instruction set simulators

We present Pydrofoil, a multi-stage compiler that generates instruction set simulators (ISSs) from processor instruction set architectures (ISAs) expressed in the high-level, verification-oriented ISA specification language Sail. Pydrofoil shows a > 230x speedup over the C-based ISS generated by Sail on our benchmarks, and is based on the following insights. (i) An ISS is effectively an interpreter loop, and tracing just-in-time (JIT) compilers have proven effective at accelerating those, albeit mostly for dynamically typed languages. (ii) ISS workloads are highly atypical, dominated by intensive bit manipulation operations. Conventional compiler optimisations for general-purpose programming languages have limited impact for speeding up such workloads. We develop suitable domain-specific optimisations. (iii) Neither tracing JIT compilers, nor ahead-of-time (AOT) compilation alone, even with domain-specific optimisations, suffice for the generation of performant ISSs. Pydrofoil therefore implements a hybrid approach, pairing an AOT compiler with a tracing JIT built on the meta-tracing PyPy framework. AOT and JIT use domain-specific optimisations. Our benchmarks demonstrate that combining AOT and JIT compilers provides significantly greater performance gains than using either compiler alone.
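
To make insight (i) concrete, here is a minimal sketch of an instruction set simulator written as a fetch-decode-execute interpreter loop. The 16-bit toy instruction format and its four opcodes are invented for illustration; they are not Sail, a real ISA, or anything from Pydrofoil. Note how much of the loop body is shifting and masking, which is the kind of bit manipulation insight (ii) refers to.

```python
# Toy instruction word (hypothetical): [opcode:4][rd:4][rs:4][imm:4]
OP_HALT, OP_ADDI, OP_ADD, OP_SHL = 0, 1, 2, 3
MASK32 = 0xFFFFFFFF


def encode(op, rd=0, rs=0, imm=0):
    """Pack the four 4-bit fields into one 16-bit instruction word."""
    return (op << 12) | (rd << 8) | (rs << 4) | imm


def run(program, max_steps=1000):
    """Interpret the program and return the final register file."""
    regs = [0] * 16
    pc = 0
    for _ in range(max_steps):
        if pc >= len(program):
            break
        word = program[pc]           # fetch
        op = (word >> 12) & 0xF      # decode: extract bit fields
        rd = (word >> 8) & 0xF
        rs = (word >> 4) & 0xF
        imm = word & 0xF
        pc += 1
        if op == OP_HALT:            # execute
            break
        elif op == OP_ADDI:
            regs[rd] = (regs[rs] + imm) & MASK32
        elif op == OP_ADD:
            regs[rd] = (regs[rd] + regs[rs]) & MASK32
        elif op == OP_SHL:
            regs[rd] = (regs[rs] << imm) & MASK32
        else:
            raise ValueError(f"unknown opcode {op}")
    return regs


program = [
    encode(OP_ADDI, rd=1, rs=0, imm=5),  # r1 = r0 + 5
    encode(OP_SHL, rd=2, rs=1, imm=2),   # r2 = r1 << 2
    encode(OP_ADD, rd=2, rs=1),          # r2 = r2 + r1
    encode(OP_HALT),
]
regs = run(program)
print(regs[1], regs[2])  # prints: 5 25
```

A tracing JIT speeds up this kind of loop by recording the operations executed along a hot path of the guest program and compiling them, so the per-instruction fetch and decode work is no longer repeated on every iteration.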

Ultra-precision formation flying demonstration for space-based interferometry

arxiv.org/abs/2504.05001

arXiv.org

SILVIA: Ultra-precision formation flying demonstration for space-based interferometry

We propose SILVIA (Space Interferometer Laboratory Voyaging towards Innovative Applications), a mission concept designed to demonstrate ultra-precision formation flying between three spacecraft separated by 100 m. SILVIA aims to achieve sub-micrometer precision in relative distance control by integrating spacecraft sensors, laser interferometry, low-thrust and low-noise micro-propulsion for real-time measurement and control of distances and relative orientations between spacecraft. A 100-meter-scale mission in a near-circular low Earth orbit has been identified as an ideal, cost-effective setting for demonstrating SILVIA, as this configuration maintains a good balance between small relative perturbations and low risk for collision. This mission will fill the current technology gap towards future missions, including gravitational wave observatories such as DECIGO (DECihertz Interferometer Gravitational wave Observatory), designed to detect the primordial gravitational wave background, and high-contrast nulling infrared interferometers like LIFE (Large Interferometer for Exoplanets), designed for direct imaging of thermal emissions from nearby terrestrial planet candidates. The mission concept and its key technologies are outlined, paving the way for the next generation of high-precision space-based observatories.

Pushing the Limits of LLM Quantization via the Linearity Theorem

arxiv.org/abs/2411.17525

arXiv.org

Pushing the Limits of Large Language Model Quantization via the Linearity Theorem

Quantizing large language models has become a standard way to reduce their memory and computational costs. Typically, existing methods focus on breaking down the problem into individual layer-wise sub-problems, and minimizing per-layer error, measured via various metrics. Yet, this approach currently lacks theoretical justification and the metrics employed may be sub-optimal. In this paper, we present a "linearity theorem" establishing a direct relationship between the layer-wise $\ell_2$ reconstruction error and the model perplexity increase due to quantization. This insight enables two novel applications: (1) a simple data-free LLM quantization method using Hadamard rotations and MSE-optimal grids, dubbed HIGGS, which outperforms all prior data-free approaches such as the extremely popular NF4 quantized format, and (2) an optimal solution to the problem of finding non-uniform per-layer quantization levels which match a given compression constraint in the medium-bitwidth regime, obtained by reduction to dynamic programming. On the practical side, we demonstrate improved accuracy-compression trade-offs on Llama-3.1 and 3.2-family models, as well as on Qwen-family models. Further, we show that our method can be efficiently supported in terms of GPU kernels at various batch sizes, advancing both data-free and non-uniform quantization for LLMs.
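
The general idea behind rotating weights before quantizing them can be sketched in a few lines. This is an illustration under stated assumptions, not the HIGGS implementation: the weight vector is made up, and a plain uniform round-to-nearest grid stands in for the MSE-optimal grids the abstract mentions. An orthonormal Hadamard rotation preserves the $\ell_2$ norm, so the squared error measured after rotating back equals the error made in the rotated basis, while the rotation spreads a single outlier across all coordinates and lets the grid use a much smaller step.

```python
import math


def hadamard_rotate(x):
    """Orthonormal fast Walsh-Hadamard transform; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    y = list(x)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = y[j], y[j + h]
                y[j], y[j + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in y]


def quantize_uniform(x, levels=16):
    """Round to the nearest point of a uniform grid spanning [-peak, peak]."""
    peak = max(abs(v) for v in x)
    if peak == 0:
        return list(x)
    step = 2 * peak / (levels - 1)
    return [round((v + peak) / step) * step - peak for v in x]


def sq_error(a, b):
    """Squared l2 distance between two vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


# Hypothetical weights with one large outlier.
weights = [0.1, -0.2, 0.05, 8.0, -0.15, 0.3, -0.05, 0.2]

direct = quantize_uniform(weights)
rotated = hadamard_rotate(weights)
# The normalised Hadamard matrix is symmetric and orthogonal,
# so applying the same transform again undoes the rotation.
via_rotation = hadamard_rotate(quantize_uniform(rotated))

err_direct = sq_error(direct, weights)
err_rotated = sq_error(via_rotation, weights)
print(f"direct: {err_direct:.4f}  with rotation: {err_rotated:.4f}")
```

On this particular vector the rotated variant gives the smaller error, because the outlier no longer dictates the grid spacing for every other weight.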

Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents

arxiv.org/abs/2502.15840

arXiv.org

Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents

While Large Language Models (LLMs) can exhibit impressive proficiency in isolated, short-term tasks, they often fail to maintain coherent performance over longer time horizons. In this paper, we present Vending-Bench, a simulated environment designed to specifically test an LLM-based agent's ability to manage a straightforward, long-running business scenario: operating a vending machine. Agents must balance inventories, place orders, set prices, and handle daily fees - tasks that are each simple but collectively, over long horizons (>20M tokens per run) stress an LLM's capacity for sustained, coherent decision-making. Our experiments reveal high variance in performance across multiple LLMs: Claude 3.5 Sonnet and o3-mini manage the machine well in most runs and turn a profit, but all models have runs that derail, either through misinterpreting delivery schedules, forgetting orders, or descending into tangential "meltdown" loops from which they rarely recover. We find no clear correlation between failures and the point at which the model's context window becomes full, suggesting that these breakdowns do not stem from memory limits. Apart from highlighting the high variance in performance over long time horizons, Vending-Bench also tests models' ability to acquire capital, a necessity in many hypothetical dangerous AI scenarios. We hope the benchmark can help in preparing for the advent of stronger AI systems.