The term “experimental mathematics” refers to a contemporary practice in mathematics in which computers are used to inspire research. This can mean, among other things:
The last point sounds a bit vague. I hope to clarify it in the next section and then give examples of a strange new phenomenon in the use of the advanced mathematical software tools which we have today:
Sometimes the software is much smarter than the user: judging by the speed with which it refutes hand-crafted potential counterexamples, it apparently contains a fitting heuristic in its source code. This heuristic might translate into a proof of the conjecture — but the software doesn’t tell you what it is!
This is as marvellous an achievement of human engineering as it is frustrating if you can’t extract the heuristic from the source code and learn the missing part of math to prove your conjecture.
Note: I will generally assume that the software works correctly. Certifiability of yes/no answers found by software is certainly related but it is a bigger can of worms than what I want to point out here.
My colleague Andreas Kretschmer and I have been working on a question asked by Aida Maraj about two weeks ago. I don’t want to go into too many details but it concerns solutions to certain systems of polynomial equations over the positive-definite matrices.
One thing I like to use computers for is to assess which parts of a conjecture are necessary. That means: when we remove certain assumptions from the conjecture, do we find an easy counterexample? Such data points about the conjecture tell us which kinds of proof methods are doomed to fail. For example, is positive definiteness necessary? A real symmetric matrix is positive-definite if and only if all of its principal minors are positive. A relatively natural generalization requires only that all principal minors be non-zero. This condition is called principal regularity.
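To make the two conditions concrete, here is a small Python sketch (a toy illustration, not the code used in our experiments) that checks positive definiteness and principal regularity of a real symmetric matrix directly via its principal minors:

```python
from itertools import combinations

def det(m):
    # Laplace expansion along the first row; fine for small matrices.
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

def principal_minors(m):
    # Determinants of all submatrices on index sets I x I, for every
    # non-empty subset I of the rows/columns.
    n = len(m)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            yield det([[m[i][j] for j in idx] for i in idx])

def is_positive_definite(m):
    # A real symmetric matrix is positive-definite iff all of its
    # principal minors are positive.
    return all(d > 0 for d in principal_minors(m))

def is_principally_regular(m):
    # The relaxation: all principal minors are merely non-zero.
    return all(d != 0 for d in principal_minors(m))
```

For instance, `[[2, 1], [1, 2]]` passes both tests, while `[[-1, 0], [0, 3]]` is principally regular but not positive-definite, which is exactly the gap the relaxation opens up.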
The reason for this relaxation, which might not be obvious to the non-algebraist, is that we switch to an easier theory. With positive definiteness, the set of counterexamples to Aida’s conjecture is a semialgebraic set. These are theoretically tractable on a computer, but we have already ruled out all potential counterexamples which are small enough for these general methods to terminate in any reasonable amount of time. With principal regularity, on the other hand, all polynomial constraints are either equations or inequations; there are no inequalities anymore. Hence, we can pass to the algebraically closed complex numbers, where the set of counterexamples becomes constructible. Constructible sets tend to be somewhat easier to deal with on a computer than semialgebraic ones, and we can indeed explore some bigger examples, hoping that among the counterexamples we may find, there is one which transfers back to the real numbers of our original conjecture.
As it turns out, we can find counterexamples to the relaxed conjecture in which positive definiteness is replaced by principal regularity. Here is the most impressive counterexample I managed to find. After a long computation in Macaulay2, we find that the Zariski closure of the constructible set we are interested in, which contains the principally regular counterexamples to the conjecture, is described by the prime ideal generated by the following polynomials in six variables:
This might look scary, but Mathematica finds a generic point in this variety in about two seconds on my laptop, and this point is indeed principally regular. The solution coordinates look something like this, involving invocations of Mathematica’s Root function, which represents a real algebraic number exactly by a univariate polynomial and the index of the root to pick.
x1 -> -(
(274690496000177606838447793146325447500965895245837010731843554043039615306319
Root[
1775380433750000000 - 78478332364687500000 #1^2 +
891090422891055078125 #1^4 - 849232777556096718750 #1^6 +
871885597934472187500 #1^8 - 643355496779169825000 #1^10 +
96277889059372440000 #1^12 - 5847789209959296000 #1^14 +
235403285409331200 #1^16 - 6977138838405120 #1^18 +
95569451679744 #1^20 &, 1]) /
6728554820402784834716554295293520236450118616607950097000909229587745883956)
+ ...
There are ten such summands in this solution coordinate, and it looks similar for the other coordinates. Thus our system has a solution which is principally regular. If we replace the principal regularity constraints by positive definiteness constraints and call FindInstance, then Mathematica knows after five seconds that there exists no solution. So our search for a counterexample was not successful: we were able to check a promising example of larger size by relaxing positive definiteness to an algebraic condition, but the counterexamples to the relaxation yielded no counterexample to the original conjecture, in this case.
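For readers unfamiliar with Root: it encodes a real algebraic number exactly as “the k-th real root of this polynomial”. The following Python sketch mimics that representation numerically; the function names and the crude sampling-based root isolation are my own inventions for illustration, and Mathematica of course isolates roots exactly rather than by sampling.

```python
def poly_eval(coeffs, x):
    # coeffs in increasing degree: coeffs[i] is the coefficient of x**i.
    return sum(c * x ** i for i, c in enumerate(coeffs))

def real_root(coeffs, k, lo=-1e6, hi=1e6, samples=100000):
    # Toy analogue of Mathematica's Root[p, k]: approximate the k-th real
    # root (counted from the left, 1-based like Root) of the polynomial.
    # Sampling for sign changes is crude and can miss close root pairs.
    brackets = []
    step = (hi - lo) / samples
    x, fx = lo, poly_eval(coeffs, lo)
    for _ in range(samples):
        y = x + step
        fy = poly_eval(coeffs, y)
        if fx == 0:
            brackets.append((x, x))      # hit a root exactly
        elif fx * fy < 0:
            brackets.append((x, y))      # sign change: a root in (x, y)
        x, fx = y, fy
    a, b = brackets[k - 1]
    for _ in range(200):                  # refine by bisection
        m = (a + b) / 2
        if poly_eval(coeffs, a) * poly_eval(coeffs, m) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2
```

With this, `real_root([-2, 0, 1], 2)` plays the role of `Root[#^2 - 2 &, 2]`, i.e. the square root of two, while index 1 picks its negative.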
This tells the experimental mathematician two things. First, if the conjecture is true, then positive definiteness is necessary: its algebraic counterpart, principal regularity, does not suffice. The conjecture cannot be proved by algebraic manipulation of the equations alone; positivity has to be used at some point, because otherwise the conjecture would be true in the principally regular setting, too.
Second, experience tells me that a system of eleven equations of degree 2 in six variables plus positive definiteness constraints is never solved in five seconds by general methods like the Positivstellensatz, CAD or Gröbner bases. Mathematica must have a heuristic whose preconditions are quick to verify and which leads to an insolvability proof very quickly. And the same pattern should apply to the principally regular system as well: Mathematica must have had a systematic way of finding the insane algebraic numbers shown above.
This gives us hope that the conjecture might be true. The situation presents itself to me as if some mathematician or engineer at Wolfram Inc. once figured out a pattern in polynomial systems which implies structure on the solution set, structure which can be used to produce a solution quickly or to prove that none exists. This pattern “obviously” applies to our system, but we have no idea what it is! We only know that there is (proprietary!) software which knows. But of course, it could also be that our example was simply not big enough to yield a real counterexample, because whatever structure is exploited here is not always present in the situation of our conjecture. This would be easier to judge if we knew what Mathematica saw that made the computation so easy.
But this is not a nuisance of proprietary software alone. A while ago I computed the orientability of LUBF-gaussoids on ground sets of size up to eight. I didn’t talk about it in the article, because it was about the symmetry reduction, but the same phenomenon of software seeing things the human doesn’t occurred in the stage that actually decided orientability. At the end of the symmetry reduction, there were 222 LUBF-gaussoids, which means 222 Boolean formulas for orientability, with on average 546 variables and 33335 clauses. The venerable MiniSAT solved them in a total time of five seconds, including startup time, parsing the formulas and the output processing in my shell script, all done sequentially. MiniSAT can obviously tell extremely, extremely quickly which of these formulas have a solution.
Experience tells me that this cannot happen unless the SAT solver contains a fitting heuristic. The good news is that MiniSAT is open-source software, so we can look inside. The bad news is that it does not contain much more than a “metaheuristic”. MiniSAT is a rather clean solver, written for educational purposes and reusability, but it also happens to be really fast. It does not contain a library of special-case heuristics for formulas of a certain structure; instead it implements conflict-driven clause learning. According to the MiniSAT paper, the solver explores the space of all assignments to the variables and tries to deduce, or learn, more information about the formula from (partial) assignments which fail to satisfy it.
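For illustration, here is a minimal DPLL-style backtracking search in Python, the skeleton on which CDCL builds; a real CDCL solver like MiniSAT additionally learns a new clause from every conflict and uses clever decision heuristics, none of which this sketch attempts:

```python
def dpll(clauses, assignment=()):
    # Literals are non-zero ints in DIMACS style: -v means "not v".
    # Simplification under the partial assignment happens via recursion:
    # picking literal l drops satisfied clauses and deletes -l elsewhere.
    if not clauses:
        return list(assignment)              # all clauses satisfied
    if any(len(c) == 0 for c in clauses):
        return None                          # conflict: an empty clause
    # Unit propagation: a one-literal clause forces its literal.
    units = [c[0] for c in clauses if len(c) == 1]
    lit = units[0] if units else clauses[0][0]
    for choice in ([lit] if units else [lit, -lit]):
        reduced = [
            [l for l in c if l != -choice]
            for c in clauses if choice not in c
        ]
        result = dpll(reduced, assignment + (choice,))
        if result is not None:
            return result
    return None                              # both branches failed
```

Calling `dpll([[1, 2], [-1, 2], [-2, 3]])` returns a satisfying assignment, while `dpll([[1], [-1]])` returns `None`. A CDCL solver would, at the point where this sketch merely backtracks, analyze the conflict and add a learnt clause that prunes the rest of the search; those learnt clauses are exactly the database I mention below.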
Thus the source code is of no great help in finding out why LUBF leads to apparently easy orientability problems. The solver is intelligent enough on its own to discover the reason in real time by just trying out some assignments. I, for one, still don’t know the reason. I suppose that if I really wanted to, I could inspect the database of learnt clauses and try to infer which special structure the solver discovered. That’s a plus for free software.
In conclusion, I want to emphasize that, compared to mathematical practice at large, this is a completely new phenomenon brought to us by the extensive collaboration and expertise that goes into today’s advanced mathematical software.
Addendum (29 Sep 2021): The question I mentioned above turned out to have an affirmative answer for positive-definite matrices (although it is false for principally regular real matrices). See Theorem 3.23 of The geometry of Gaussian double Markovian distributions.
Addendum (28 Jan 2022): Zach Teitler kindly informs me that the situation is not hopeless all around. The effect I described above also struck Frank Sottile in the late 90s when investigating a conjecture of Shapiro and Shapiro in real Schubert calculus. The story has it that after setting up one instance of the problem, he did not even have time to leave his seat to get a computations-coffee. If I understand correctly, it turned out that the problem was benign enough to not lead to the worst-case complexity of Gröbner basis computations and that this fact could be understood after being discovered experimentally. A good starting point to learn more details seems to be Real Schubert Calculus: Polynomial Systems and a Conjecture of Shapiro and Shapiro.