Computing the characteristic set of a quasiaffine variety

A quasiaffine variety XX is the solution set of a system of polynomial equations fi(x)=0f_i(x) = 0 and inequations gj(x)0g_j(x) \neq 0 (the indices ii and jj range in finite sets [s][s] and [r][r], respectively, and x=(x1,,xn)x = (x_1, \dots, x_n)). Quasiaffine varieties are useful in describing geometric configurations: the equations specify special position relations among the coordinates and the inequations can be used to remove degenerate loci. For example, the realization spaces of matroids (and gaussoids in the algebraic setting) are quasiaffine varieties.

For this article we assume that the defining polynomials fif_i and gjg_j have integer coefficients and we ask which numbers pp (equal to either 00 or a prime) admit a field KK of characteristic pp over which XX is non-empty. Let X(K)X(K) be the set of KK-rational points in XX, i.e., take the images of the fif_i and gjg_j under the canonical homomorphism ZK\mathbb{Z} \to K and let X(K)X(K) be the corresponding quasiaffine variety. Then the characteristic set of XX is

char(X):={p:K field of char. p and X(K)}.\mathrm{char}(X) := \{\, p : \text{$K$ field of char.~$p$ and $X(K) \neq \emptyset$} \,\}.

A priori there is no reason why this set should be computable. However, results from model theory show that the situation is not as bad as it seems.

Quantifier elimination and computability

This setup and related questions were discussed a bit in the previous article on First-order objects in algebraic geometry. Let me recall from there that the model-theoretic Lefschetz principle asserts the following:

Hence, char(X)\mathrm{char}(X) either contains 00 and excludes finitely many primes, or it excludes 00 and consists of finitely many primes. Thus, it can be described by a boolean (is 00 in the set or not?) together with a finite list of primes. The Lefschetz principle also shows that in each fixed characteristic it is enough to consider solvability over a single field Fp\overline{\mathbb{F}_p} over which computations are possible!

There is a theorem in model theory which states that

The first-order theory of algebraically closed fields has quantifier elimination in the language of rings.

This means that to each first-order formula ϕ\phi in the language of rings, there exists a quantifier-free formula ψ\psi which is equivalent to ϕ\phi over all algebraically closed fields. A quasiaffine variety XX is a definable set, i.e., it can be described by a first-order formula ϕX\phi_X (since each polynomial with integer coefficients is a term in the language of rings). To express that XX is non-empty, we simply prefix ϕX\phi_X with existential quantifiers (x)ϕX(\exists x)\phi_X. Quantifier elimination produces a quantifier-free formula ψ\psi which is equivalent to (x)ϕX(\exists x)\phi_X over all algebraically closed fields. Now, ψ\psi has no quantifiers but also no free variables since it is equivalent to a formula with that property, so it cannot contain variables at all. This means that it must be a boolean combination of atomic statements of the form k=0k = 0. But a statement like that is true in a field if and only if its characteristic divides kk.

Quantifier elimination for algebraically closed fields can be done constructively. Hence, the formula ψ\psi is computable and from the boolean combination of the k=0k=0 statements, the relevant prime factors and information on whether they are inside or outside the characteristic set can be extracted. Model theory books usually include a description of the quantifier elimination process. It consists of successively viewing multivariate polynomials as univariate in a distinguished variable and extracting the coefficient polynomials to do further symbolic manipulations on them (whose validity in eliminating the existential quantifier require algebraic closedness). This process is rather finicky and it seems like it would be slow to run on a computer. In any case, the performance will heavily depend on the representation of polynomials on the computer and how fast they can be rearranged into univariate polynomials. These are implementations very deep down in a computer algebra system which I prefer not to touch for a mere blog post.

Using Gröbner bases

Instead of these low-level polynomial manipulations, we can use any of the widely available and well-engineered Gröbner bases algorithms as a blackbox to compute char(X)\mathrm{char}(X). This observation rests on the following series of observations:

  1. First compute a Gröbner basis GG of the folded saturation ideal I=((f1,,fs:g1)):grI = (\dots (\langle f_1, \dots, f_s\rangle : g_1^\infty)\dots) : g_r^\infty in the ring Z[x1,,xn]\mathbb{Z}[x_1, \dots, x_n].
  2. This computation is fraction-free, as it is carried out over Z\mathbb{Z}, so the resulting Gröbner basis elements are contained in the reduction of II modulo any prime.
  3. GG is a Gröbner basis for II over Q\mathbb{Q} but it may fail to be one modulo some primes. By Buchberger’s criterion, GG is for sure a Gröbner basis modulo pp when its lead terms have “good reduction”, i.e., they do not reduce to zero.
  4. Hence, one has to collect all prime divisors of lead terms in GG and run a dedicated Gröbner basis computation modulo these finitely many exceptional primes to see what really happens there.
  5. Using the Gröbner bases, one can decide if 11 is in the (reduced) ideal or not which decides whether or not the quasiaffine variety is empty over the algebraic closure of the prime field.

Note that this uses Gröbner bases over Z\mathbb{Z} as a blackbox algorithm. The computation does not have to specially instrumented or monitored (which would require engineering effort and could slow down performance). Moreover, any monomial ordering and any Gröbner basis algorithm is valid.

Here is a sample implementation in Macaulay2:

-- Check if the saturation ideal(F):G^\infty contains a non-zero constant
-- by computing a a Gröbner basis over the integers. Also determine all
-- characteristics over which the Gröbner basis does not necessarily have
-- the same lead monomials.
testSolvabilityGeneric = (F, G) -> (
  R := ZZ[gens ring F_0]; -- make sure we work over ZZ
  I := fold(saturate, ideal(F / (f -> sub(f,R))), G / (g -> sub(g,R)));
  -- If there is a (non-zero) scalar in the GB, the ideal is trivial
  -- for most primes: all those which do not divide the scalar.
  U := select(first entries gens gb I, isConstant);
  if #U > 0 then (
    exceptF := unique flatten apply(U, g ->
      apply(toList factor leadCoefficient g, f -> first toList f)
    );
    return { false, exceptF };
  );
  -- Otherwise we have a non-trivial ideal. Buchberger's criterion
  -- certifies that this GB computation has good reduction modulo
  -- all primes which do not divide the lead coefficients.
  exceptT := unique flatten apply(first entries gens gb I, g ->
    apply(toList factor leadCoefficient g, f -> first toList f)
  );
  { true, exceptT }
);

-- Check if the saturation ideal(F):G^\infty is trivial over GF(p).
testSolvabilityModP = (F, G, p) -> (
  R := GF(p)[gens ring F_0];
  I := fold(saturate, ideal(F / (f -> sub(f,R))), G / (g -> sub(g,R)));
  1 % I != 0
);

-- Compute the set of characteristics over which f == 0 for all f in F and
-- g != 0 for all g in G has a solution. The result is a boolean, which
-- is the "generic" answer to the feasibility of the system, and a finite
-- list of exceptional primes where the answer is opposite.
solvability = (F,G) -> (
  r := testSolvabilityGeneric(F,G);
  { r#0, select(r#1, p -> testSolvabilityModP(F,G,p) != r#0) }
);

To test this out, consider the following configuration of geometric objects: two points AA and BB which lie on a line CC and also on a parabola DD in the affine plane. In total this gives us nine coordinates and four equations among them. We also add inequations to ensure that the line is non-vertical and that the parabola does not degenerate to a line.

R = ZZ[a1,a2, b1,b2, c0,c1, d0,d1,d2];
F = { c1*a1+c0 - a2, c1*b1+c0 - b2, d2*a1^2+d1*a1+d0 - a2, d2*b1^2+d1*b1+d0 - b2 };
G = { a1-b1, d2 };
time print solvability(F, G);
--> {true, {}}
--> used 0.0065934 seconds

The computation concludes blazingly fast that this configuration is realizable over every characteristic. (In fact, it is easy to write down a rational parametrization from which this becomes obvious.)

Characteristic sets of matroids

More challenging examples come from matroids. The characteristic set of a matroid is a well-known object and it coincides with the characteristic set (as defined above) of its realization space (which is a quasiaffine variety). Oxley’s book, for example, contains a definition and the computability results mentioned above.

It is well-known that the Fano matroid can only be realized in characteristic 22. We can now check this entirely computationally. First, we need to turn a matroid into a description of its realization space:

needsPackage "Matroids";

realizationSpace = M -> (
  n := #groundSet M;
  r := rank M;
  C := (circuits M) / toList;
  B := (bases M) / toList;
  R := ZZ[a_(1,1) .. a_(n,r)];
  A := genericMatrix(R,r,n);
  F := flatten apply(C, x -> first entries gens minors(#x, submatrix(A,x)));
  G := apply(B, x -> det submatrix(A,x));
  (F, G)
);

Now the characteristic sets of the Fano and non-Fano matroid can be computed:

(F, G) = realizationSpace specificMatroid "fano";
time print solvability(F,G);
--> {false, {2}}
--> used 56.8058 seconds

Fano is generally non-realizable, except in characteristic 2. With non-Fano, it is exactly the opposite:

(F, G) = realizationSpace specificMatroid "nonfano";
time print solvability(F,G);
--> {true, {2}}
--> used 78.3184 seconds

The computation of characteristic sets takes a lot of time. I let the above procedure run for the Vámos matroid but terminated it after 5 hours with no result. Its realization space is described by too many equations and inequations in 32 variables. Even using more knowledge from matroid theory does not seem to help: since Vámos violates an instance of the Ingleton inequality, we can narrow down the equations and inequations to a smaller set which should still not admit a solution over any field. For this much smaller system I managed to get a Gröbner basis of the equations in 10 minutes, but the saturation seems hopeless.

Another thought: it would be nice if the Gröbner basis computations modulo exceptional primes could benefit from the previous “generic” computation over Z\mathbb{Z} in some way. In Macaulay2 I don’t know how to link the two computations as they happen in different rings. The Julia package Groebner.jl seems to facilitate the “learning” of a Gröbner basis computation in order to speed up subsequent similar computations. Unfortunately, it does not implement fraction-free Gröbner basis computation over Z\mathbb{Z}, so I haven’t tried it on the above problems.

I believe that one could compute a Gröbner basis over Q\mathbb{Q} and take the denominators of coefficients in the “change matrix” (i.e., the RR-linear map which shows how to obtain the Gröbner basis as linear combinations of the initial generators) and the lead terms of the Gröbner basis to be exceptional primes over which dedicated Gröbner basis computations have to be performed.