A rational realization of Šimeček’s № 85

In his article Gaussian Representation of Independence Models over Four Random Variables from 2006, Petr Šimeček lists all conditional independence structures (or “independence models”) which are realizable by Gaussian distributions, regular or not, in dimension 4. For each model he gives a concrete correlation matrix which realizes it. Only one of these examples, named M85, is displayed with irrational entries. Šimeček asks in his article whether all Gaussian CI structures can be realized with rational correlations.

When I was at the Prague Stochastics Workshop in memory of František Matúš last year, the same question came up and the remark quickly passed the room that Šimeček had left this model M85 unresolved — a concrete example to work on. This is how I learned about his interesting paper and the excesses of singular Gaussians. Today I allowed myself to get side-tracked and see if M85 was really inherently irrational.

The short answer is: no, it also has a rational correlation matrix. Consequently, as far as I’m aware, the conjecture about rationality of Gaussian CI remains open.

Gaussian conditional independence

We consider a random vector ξ\xi with entries ξ1,,ξn\xi_1, \dots, \xi_n which is distributed according to a Gaussian distribution with a positive-semidefinite covariance matrix Σ\Sigma. Of interest of this vector is its conditional independence (CI) structure, which is the collection of all statements ξiξj    ξKabbreviated as (ijK)\xi_i \CI \xi_j \; | \; \xi_K \quad \text{abbreviated as $(ij|K)$} for i,j[n]i, j \in [n] different and K[n]{i,j}K \subseteq [n] \setminus \{i,j\}, which hold for the distribution. This statement means that learning the value of the coordinate ξi\xi_i does not tell us anything about ξj\xi_j if we already know all the coordinates ξk\xi_k for kKk \in K.

What structure does this set of all CI statements have? For Gaussian distributions a particularly intriguing characterization is known. To decide whether (ijK)(ij|K) holds, determine an inclusion-maximal set LKL \subseteq K such that the principal minor detΣL\det \Sigma_{L} does not vanish. Then (ijK)(ij|K) holds if and only if the almost-principal minor detΣiL,jL\det \Sigma_{iL,jL} does vanish. In particular if the Gaussian is regular (equivalently, if Σ\Sigma is positive-definite) then we always choose L=KL = K in this test.

If we are given some CI structure A\mathcal{A} as a collection of symbols (ijK)(ij|K), then in order to decide if it has a regular Gaussian realization, we have to concatenate polynomial equations, inequations and inequalities in the (n+12)\binom{n+1}{2} unknowns which are the entries of the symmetric matrix Σ\Sigma:

  1. The first couple of inequalities detΣI>0\det \Sigma_I > 0, I[n]I \subseteq [n] state that the matrix be positive-definite,
  2. we add equations detΣiK,jK=0\det \Sigma_{iK,jK} = 0 for (ijK)A(ij|K) \in \mathcal{A} for the CI statements we wish to hold,
  3. and then inequations detΣiK,jK0\det \Sigma_{iK,jK} \ne 0 for (ijK)A(ij|K) \notin \mathcal{A} to exclude all other CI statements.

Notice that deciding whether the space of these matrices Σ\Sigma is empty or not is a semialgebraic decision problem. All polynomials are determinants of a symmetric matrix, which makes Gaussian CI particularly exciting.

The above only describes regular Gaussian realizations. For singular realizations we would have to pick in the first step which principal minors should vanish. These choices would affect which almost-principal minors to evaluate in steps 2 and 3 to model CI statements. Clearly, it gets a little more complicated overall as more choice is added. Of course, there are CI structures which are realizable by singular and not regular Gaussians, such as our № 85.

As a final remark to this section, there is a well-known lemma that allows us to pass to a special kind of covariance matrix which has all 11s on the diagonal (so is simultaneously the distribution’s correlation matrix) without changing the CI structure of the random vector. This trick reduces the number of variables in our semialgebraic system from (n+12)\binom{n+1}{2} to (n2)\binom{n}{2} as it eliminates the nn variances on the diagonal.

The model № 85 and (part of) its realization space

Below is the CI structure M85 with its irrational realization, copied straight from Šimeček’s paper.

Šimeček’s model № 85 as presented in the paper.

The style of diagram on the left was developed by Radim Lněnička to encode CI structures on four random variables. It reads

M85:={(124),(143),(1423),(243),(2413),(3412)}.\mathsf{M85} := \{ (12|4), (14|3), (14|23), (24|3), (24|13), (34|12) \}.

In the previous section, I have outlined an algorithm to describe all the positive-semidefinite matrices which realize a given CI structure, such as M85\mathsf{M85}. Since we want a non-regular realization (there is no regular realization because M85\mathsf{M85} is not a gaussoid), we have to make some decisions about which principal submatrices we want to be zero and which positive.

For the CI statements in M85\mathsf{M85}, we care about the principal minors of the sets 44, 33, 2323, 1313 and 1212. The minors of singleton sets are already equal to 11 when we work with correlation matrices. The only non-trivial assumptions we make are that Σ12\Sigma_{12}, Σ13\Sigma_{13} and Σ23\Sigma_{23} are positive. Then we can write down equations for the CI statements in M85\mathsf{M85} as if the matrix Σ\Sigma was regular. Suppose

Σ=(1abca1debd1fcef1),\Sigma = \begin{pmatrix} 1 & a & b & c \\ a & 1 & d & e \\ b & d & 1 & f \\ c & e & f & 1 \end{pmatrix},

then we obtain the following equalities:

Using the first three equalities and the requirement that all variables be non-zero, we see that (1423)(14|23) and (2413)(24|13) both reduce to the equality cd=becd = be which already follows from the simpler equalities: cd=(bf)d=b(df)=becd = (bf)d = b(df) = be.

Moreover notice that aa is determined by cc and ee, cc by bb and ff and ee by dd and ff, and even in a way that preserves rationality. If we can find rational b,d,fb,d,f which satisfy the equality of (3412)(34|12), then we have all rational entries which satisfy all the equations required.

Eliminating all a,c,ea, c, e from equation (3412)(34|12), we obtain equivalently

1=f4b2d2+b2+d22b2d2f2.1 = f^4 b^2 d^2 + b^2 + d^2 - 2 b^2 d^2 f^2.

I substituted b=p/qb = p/q, d=s/td = s/t, f=x/yf = x/y and asked Mathematica for integer solutions in p,q,s,t,x,yp,q,s,t,x,y to the above equality together with some of the inequality constraints like 0<(p/q)2<10 < (p/q)^2 < 1 etc. and it reported back a solution after maybe 10 seconds, which lifts to:

My Julia code indeed confirms that this matrix is a positive-semidefinite realization of M85\mathsf{M85}. The singular principal minors are 123123 and 12341234. My apologies to whoever expected clever arguments at this point. Mathematica simply beat me to it. It was very fortunate that all the inequalities and inequations seemed to fall into place once the equalities were satisfied rationally.

Rationality conjectures

The conjecture that all Gaussian CI structures are realizable by rational correlation matrices is still unresolved, even in the special case of regular Gaussians. By the way, there is an analogous conjecture by Matúš about discrete (as opposed to Gaussian) CI structures with an adapted notion of rationality on the last page of his paper Conditional independences among four random variables III: final conclusion. And finally, rationality in a somewhat different flavor is mentioned in Question 2 in Studený’s book, p. 190: is the conical closure of discrete multiinformation functions a rational polyhedron?

Likewise I am not aware of a solution to these conjectures. If you know of updates, please do let me know.

Addendum (26 May 2020): The conical closure of discrete multiinformations is a linear image of the entropy region and while I have not proved it in detail, I would suspect that Matúš’s breakthrough carries over to show that discrete multiinformations do not form a rational polyhedron, because they are not even a polyhedron.

Addendum (3 Mar 2021): I answer Šimeček’s question negatively in my recent preprint “Incidence geometry in the projective plane via almost-principal minors of symmetric matrices”.