Follow

Going to start my Mastodon account with a result I put quite a bit of work into: Sorting 11 inputs using a sorting network requires 35 comparisons and sorting 12 inputs requires 39 comparisons. I still haven't finished writing the paper (and should be working on that instead), but all the code including a formal proof in Isabelle/HOL is on github: https://github.com/jix/sortnetopt

For 9 inputs they did a computer search on a cluster that took over a week: https://doi.org/10.1016/j.jcss.2015.11.014 That wasn't brute force search though, they came up with a nice way to prune large parts of the search space. Optimizing their algorithm allows the same computation to finish on a single server within a few hours (e.g. https://doi.org/10.1007/978-3-030-19212-9_19 or https://github.com/jix/sortnetopt-gnp).

For 10 inputs already, their approach would take way too long and require too much memory. Luckily, Van Voorhis's bound (https://doi.org/10.1109/TC.1972.5009021) which already solved the 8 inputs case, happens to match the known upper bound for 10 inputs again, based on the exact result for 9 inputs.

Van Voorhis's bound states that \(s_n \ge s_{n-1} + \lceil \log_2 n \rceil\) where \(s_n\) is the number of comparators required for a sorting netowrk on \(n\) inputs. We get an equality for 2,3,4,6,8,10 and it turns out 12 inputs. Given that computing the result for 9 inputs took so much effort, this seemed like a really powerful bound to me.

So I set out to find a way to use the idea behind Van Voorhis's bound as part of a computer search. For that I found a generalized version of it, that gives a lower bound for a comparator network required to sort only a given subset of the possible input sequences. You recover Van Voorhis's bound when that set contains all possible inputs. Maybe surprisingly, Huffman's algorithm makes an appearance in the generalized bound.

Based on that new bound, I came up with a search using dynamic programming and canonicalization. That allowed me to compute \(s_{11}\) and \(s_{12}\). It took 5 hours and 180GB of RAM on a big server with 24 cores. In comparison my search finds \(s_9\) in 5 seconds and 40MB (different units!) on a laptop, which shows how fast this gets difficult with growing \(n\).

To have a result that can be trusted, I also wanted to formally verify it. Formally verifying the whole algorithm doesn't seem feasible to me though. Instead I used the common strategy to emit a certificate during the search and wrote a much simpler checker and verified that. Problem the certificate would be _much_ larger than 180GB and processing it with a verified (and thus not as optimized) checker also isn't feasible.

I worked around that by using the ideas behind Codish, Cruz-Filipe, Frank & Schneider-Kamp's computer search to minimize the generated certificate down to 3GB. That takes another 72 hours of processing on a 24 core server, but the resulting certificate can be checked using a formally verified checker within 3 hours on a laptop.

As I already mentioned, I still need to finish writing the paper for this. There is a bit more detail in the slides https://files.jix.one/s/20200707.pdf of talk I gave, but it wasn't recorded and the slides aren't that self-contained. I'm always happy to answer any questions about this, though :)

@jix welcome to mathstodon, and thanks for that really clear description of your result!

Jannis Harder@jix@mathstodon.xyzThe problem of determining the minimal number of required comparators in a sorting network is also known as the Bose-Nelson Sorting problem. Floyd and Knuth first computed it exactly for up to 7 inputs. A bound by Van Voorhis extended that to 8 inputs. After that it took over 40 years for further progress: in 2014 Codish, Cruz-Filipe, Frank & Schneider-Kamp computed it for up to 10 inputs.