It is intuitively plausible that formalization can help detect errors in mathematical papers. I discovered today that more old-fashioned numerics can also achieve a similar goal.

In a paper I just uploaded at https://arxiv.org/abs/2503.20170, I claimed an upper bound on a certain number-theoretic function t(N), and gave a proof. But, inspired by a nascent crowdsourced effort at https://terrytao.wordpress.com/2025/03/26/decomposing-a-factorial-into-large-factors/ to obtain enough estimates on this quantity for medium values of N to verify some explicit conjectures of Guy and Selfridge, I decided to plot this upper bound against the known values of t(N), and found a few places where the upper bound was actually *smaller* than the known value (see first image, where the pink dots sometimes dip below the blue graph).

This was of course distressing, but by isolating the smallest counterexample and numerically verifying key intermediate claims in my proof, I found where the error was (I was "off by one" in a certain estimate involving the floor function). This will be patched in the next revision of the paper, of course, but it is another example of how computer assistance (this time in the form of traditional numerics) can help detect and fix errors in papers. (And after fixing this and recomputing the numerics, the upper bound reassuringly stayed just above the known lower bounds of t(N); see second graph.)
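The workflow described above can be sketched in a few lines. The floor-function identity below is purely illustrative (it is *not* the estimate from the paper, which is not given here): a superficially plausible claim is brute-forced over small cases and the smallest counterexample is isolated, which is exactly the kind of numerical spot-check that exposed the error.

```python
# Illustrative only: test the plausible-looking claim
#   floor((a+b)/c) == floor(a/c) + floor(b/c)
# which is in fact sometimes off by one, and report the
# smallest counterexample found by brute force.

def smallest_counterexample(limit=10):
    for c in range(1, limit):
        for a in range(limit):
            for b in range(limit):
                if (a + b) // c != a // c + b // c:
                    return a, b, c
    return None

print(smallest_counterexample())  # (1, 1, 2): floor(2/2) = 1 but 0 + 0 = 0
```

Once the smallest failing case is in hand, each intermediate inequality in the argument can be checked numerically at that case to locate the faulty step.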
@tao I think all mathematical papers have errors. But they are usually typos or insignificant errors that can easily be fixed.
Not just maths… The text of #RaymondQueneau's 1947 "Exercices de style" (translated as Exercises in Style), which showcases different writing constraints, among them the #lipogram (a text strictly avoiding one or more letters of the alphabet) …has an *e* right in the middle of its lipogram in e.
So, should the English translation have one too? Is there an implied warranty that a text that claims to be a lipogram is one? Of course, we expect that a (...)
@Thales_Curiosities @tao mathematical proof published as a mathematical proof is correct ;)
This wouldn't have happened in a publication on https://oulipo.social
@tao off-by-one errors are the worst. Remember when loads of cash registers and Windows machines around the world all stopped after a global update? That was an off-by-one error causing memory to be read just beyond an array's length.
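The bug class mentioned here (the poster's account of the specific incident is their own) is easy to reproduce: a loop bound that is one too large reads one element past the end of an array. In C that silently reads adjacent memory; Python at least fails loudly.

```python
# A minimal off-by-one out-of-bounds read.

def sum_with_off_by_one(data):
    total = 0
    for i in range(len(data) + 1):  # bug: should be range(len(data))
        total += data[i]            # i == len(data) is out of bounds
    return total

try:
    sum_with_off_by_one([10, 20, 30])
except IndexError:
    print("out-of-bounds read at index 3")
```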
I think the problem boils down to the labeling of unit intervals (0,1), (1,2), etc., rather than the labeling of integers (which is unambiguous).
Array entries, grid cells, ages, years, and pixels are all unit intervals, but are labelled using integers.
So do you label a baby in their first year as 0 or 1? Many say 0; in China it's 1.
Do you label the first year of Jesus's life as AD 1 or AD 0?
Should the first index of an array be 0 or 1? (C-family languages use 0; MATLAB and others use 1.)
It comes down to the fact that we're repurposing the integers to label unit intervals, which is ambiguous, and also causes problems with negatives: e.g. there is only one year's difference between 1 BC and AD 1.
@TomL Doesn't the problem you mention correspond to mixing up a bound vector with a free vector? And ultimately a 1-form measuring [0,n] on R with the integral of that form?
Then code could be corrected by stronger typing that carries that information — e.g. one type for a bound position (a label) and another for a free difference (a count) — so that you cannot convert implicitly from one to the other, e.g. when indexing arrays.
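A minimal sketch of such stronger typing, under my own hypothetical names — Label for a bound position, Count for a free difference. Wrapper types mean neither can be used directly as an array index or mixed with the other; only the explicitly defined operations relate them:

```python
# Distinct wrapper types for bound positions (Label) and free
# differences (Count); mixing them, or indexing with them directly,
# fails at runtime instead of silently going off by one.
from dataclasses import dataclass

@dataclass(frozen=True)
class Label:   # names a unit interval, e.g. "year 1", "index 0"
    value: int

@dataclass(frozen=True)
class Count:   # a difference between labels, e.g. "3 years"
    value: int

def difference(a: Label, b: Label) -> Count:
    return Count(b.value - a.value)

def shift(a: Label, d: Count) -> Label:
    return Label(a.value + d.value)

items = ["a", "b", "c"]
i = Label(0)
# items[i] raises TypeError; the explicit i.value is required.
print(items[i.value], difference(Label(0), Label(3)))
```

In a statically typed language (Rust or Haskell newtypes, say) the same mistake is rejected at compile time rather than at runtime.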