The computation of symmetry orbits of LUBF-gaussoids in dimension 8

I learned something about parallel symmetry reduction that I want to share. It is too seldom that I can report on an enjoyable experience with computers, so that alone would be reason enough to write it up.

This article is a little less about mathematics than it is about engineering a parallel algorithm. The idea is really not deep: by precomputing some invariants of the symmetry classes, I can distribute related tasks to each worker so that (1) the workers do not have to communicate at all and (2) still none of them performs redundant work. For gaussoids, that helps tremendously, because the invariants are very cheap to compute and relatively precise.

Beware that this article can contain traces of scientific code!

Context

Note: If you do not know what a gaussoid is and feel that it hinders your understanding of the article, there is a Gaussoid primer in the appendix.

In our paper Construction Methods for Gaussoids, Thomas Kahle and I systematically explore specializations of the notion of gaussoid — those classes of gaussoids which have symmetric axiomatizations in dimension 3. There are exactly 32 of these classes. We found that fourteen of them are trivial, thirteen are combinatorial, four grow very fast, and one remains.

The last one is called LUBF in our taxonomy and is the subject of this article. We know almost nothing about this class, but what we know makes it all the more interesting.

The four fast-growing classes are those which lie above ELU. Since we know that there is a single exponential upper bound on the number of realizable gaussoids, these four classes contain mostly junk from the perspective of a statistician: gaussoids that do not come from a Gaussian. They still contain many realizable gaussoids.

The thirteen combinatorial classes of gaussoids correspond to things such as graphs, partitions and involutions. Remarkably, all of them in every dimension are realizable.

What is so interesting about LUBF is that its growth seems to be relatively tame: single exponential in the dimension (but we don’t know for sure). This is obviously better than the non-growth of the fourteen trivial classes and also better than all the junk in the four double exponential classes. At the same time, the realizability and orientability predicates are not trivial in LUBF, unlike in the combinatorial classes. This makes LUBF an attractive space to look for examples of infinite families of gaussoids to prove, for example, the non-axiomatizability or NP-hardness of orientability.

The problem

Since we had (and still have) no idea about the structure of LUBF, we started by counting how many members it has in each dimension. This is the table up to dimension 8:

|        n |  3 |   4 |     5 |      6 |       7 |         8 |
+----------+----+-----+-------+--------+---------+-----------+
|   LUBF_n | 10 | 142 | 1,166 | 12,796 | 183,772 | 3,221,660 |

These numbers are quite easy to compute with a #SAT solver because being LUBF is a property characterized by a relatively small Boolean formula. Then, because I’m interested in the orientability of LUBF-gaussoids, I set out to compute the numbers of non-orientable LUBF-gaussoids. This is much harder because no axioms for orientability are known beyond dimension 4, so SAT solvers are of limited use. By the end of this article, I will show you how I was able to compute the numbers in the following table nonetheless:

|        n | 3 |  4 |   5 |     6 |      7 |       8 |
+----------+---+----+-----+-------+--------+---------+
| LUBF^ø_n | 0 | 42 | 210 | 1,260 | 14,700 | 355,740 |

The principal trick in getting these numbers is symmetry reduction. Orientability is invariant under the action of the symmetric group on gaussoids (and LUBF is an invariant subset under this action as well). Consequently, orientability only has to be decided for one member of each symmetry orbit. One can first reduce the list of all LUBF-gaussoids modulo symmetry, then run orientability tests on the representatives, and finally sum over the sizes of the non-orientable orbits. For example, in dimension 6 there are 12,796 LUBF-gaussoids but only 47 orbits: a 272-fold speedup once the orbits are known.
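As a toy illustration of this reduction, here is a Python sketch in which $S_3$ permuting the positions of length-3 binary strings stands in for the real action on gaussoids (all names here are mine, not from the pipeline):

```python
from itertools import permutations

def orbit(word, group):
    """All images of a word under a group given as index arrays."""
    return {''.join(word[i] for i in g) for g in group}

def reduce_mod_symmetry(words, group):
    """Map each word to the lexicographically largest member of its
    orbit; return {representative: orbit size}.  Assumes the input
    list is closed under the group action, as LUBF is."""
    orbits, seen = {}, set()
    for w in words:
        if w in seen:
            continue
        orb = orbit(w, group)
        seen |= orb                 # never enumerate this orbit again
        orbits[max(orb)] = len(orb)
    return orbits

# Toy action: S_3 permuting the three positions of a binary string
group = list(permutations(range(3)))
print(reduce_mod_symmetry(['001', '010', '100', '111'], group))
# → {'100': 3, '111': 1}
```

The per-worker strategy later in this article is exactly this loop, except that the group consists of the 8! = 40,320 induced symbol permutations and the results go into a database.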

To reduce or not to reduce

And getting those orbits is exactly the problem. Dimensions 3 to 7 could be handled by more or less brute parallelization: spin off 50 workers, each takes 1/50th of the input list and computes for every gaussoid its orbit representative. The data is collected and summarized at the end.

But in the world of gaussoids, combinatorial explosion is a constant problem: the symmetric group grows superexponentially, the number of LUBF-gaussoids to reduce grows (probably) exponentially, and the length of each gaussoid we act on grows exponentially. Granting the single exponential growth of LUBF and ignoring constant factors, the product of all of these numbers is approximately $n^2 2^{2n} n! \approx \text{A LOT!!}$, and this is about the number of string indexing operations that have to be performed to compute all the orbits. We oppose this growth with 50 measly worker processes. To say it in contemporary terms: this doesn’t scale at all. The estimated running time for the reduction with 50-fold parallelization was a bit over 9 days. (I should mention that this was written in sagemath.)
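To put a rough number on “A LOT” for $n = 8$, here is a back-of-envelope sketch in Python (the symbol count per gaussoid, $\binom{n}{2}2^{n-2}$, is taken from the appendix; the crude operation count is the formula above):

```python
from math import comb, factorial

n = 8
length = comb(n, 2) * 2 ** (n - 2)      # symbols per 8-dimensional gaussoid
ops = n**2 * 2**(2 * n) * factorial(n)  # crude string-indexing count from the text
print(length)            # → 1792
print(f"{ops:.2e}")      # → 1.69e+11
```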

By that point, I was not too keen anymore on getting the dimension 8 number because it was just going to be a data point in a table. So I stopped there and continued to do what I always do when I want to know if someone knows a sequence of numbers I obtained experimentally: I asked the OEIS. In this case, as well as for the total numbers of LUBF-gaussoids, it didn’t have an answer. However, as the OEIS recommends in such cases, I cancelled common factors in my input sequence and found to my astonishment that all these numbers are divisible by 42, and all numbers from dimension 5 upwards are even divisible by 210. After cancelling common factors, the numbers for dimensions 5, 6, 7 were indeed found by the OEIS to be nothing but $\binom{4n}{2n}$.

Wouldn’t a combinatorial characterization of non-orientable LUBF-gaussoids be too beautiful to be true? I had to get dimension 8 and see if the pattern continues. But waiting 9 days…?

Of course, this is just the symmetry reduction, not the actual goal of counting orientable gaussoids. What if, instead of symmetry reduction, we wrote down the orientation formula for each gaussoid and ran a regular SAT solver to determine its orientability? Skip the symmetry reduction with all its benefits and drawbacks? — An average orientation formula for an 8-dimensional LUBF-gaussoid is 563 KiB in size and can be solved in essentially zero seconds. But multiplied by 3,221,660 gaussoids, this means 1.7 TiB worth of Boolean formulas. This much space might have been affordable for this task, and solvers might have chugged through all of it, perhaps, but even computing these formulas takes a lot of time. I estimated 384 days to generate all these files sequentially, which is still over a week with 50-fold parallelization — just for computing the formulas. No.

Symmetry reduction is absolutely necessary to crack this nut!

The solution: invariants

I already mentioned that the first algorithm was a very brute parallelization of the sequential task of determining for every LUBF-gaussoid its orbit representative. There are 50 workers and each gets a share of gaussoids to work on. All computations are independent. With a projected 9 days, this took too long.

My second attempt was based on the observation that the workers compute the same orbits over and over again, needlessly. Since the orbit of a gaussoid has to be enumerated to determine its representative, we have actually solved the problem for the whole orbit. In the first attempt, workers just threw the orbit away. I implemented a parallel algorithm, again in sage, which took advantage of this by using a shared dictionary of solved cases to which whole orbits were added at once. Performance turned out to be much worse. When I looked at htop, I saw that the processes spent most of their time contending for that shared dictionary. I declared that approach a dead end and turned to the parallel algorithm engineering aspect at which I had obviously failed.

The saving idea was suggested by Petra Schwer from across the hallway when I talked to her about this problem. For each number $k$, the number of symbols $(ij|K)$ inside the gaussoid with $|K| = k$ is an invariant of its symmetry class. This gives a whole array of invariants indexed by $k$, which I tentatively call the signature of the gaussoid. Having the same signature is a coarsening of the symmetry relation: each signature class decomposes as a union of symmetry classes.

This coarsening was a godsend: the signature is much cheaper to compute than the orbit, and a whole signature class can be sent to each worker. Since such a class is a union of symmetry orbits, each worker can follow the more efficient “compute each orbit just once” strategy, and no shared dictionary is needed for the workers to coordinate their work, as the orbits they work on are already disjoint. Since each orbit is computed just once, the algorithm becomes much faster.
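The dispatch pattern can be sketched like this in Python. The toy invariant here, the number of zeros, is permutation-invariant just like the real signature, only much coarser:

```python
from collections import defaultdict

def dispatch_by_signature(gaussoids, signature):
    """Partition the input into signature classes.  Each class is a
    union of whole symmetry orbits, so the classes can be handed to
    independent workers: no shared state, no redundant orbits."""
    classes = defaultdict(list)
    for g in gaussoids:
        classes[signature(g)].append(g)
    return dict(classes)

# Toy invariant standing in for the real signature
zeros = lambda g: g.count('0')
print(dispatch_by_signature(['001', '010', '100', '111'], zeros))
# → {2: ['001', '010', '100'], 0: ['111']}
```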

After the discussion, I made a plan and from the moment I started compiling postgresql on the institute’s computing server (as a prerequisite for my plan), over writing and babysitting the programs I needed, (having a beer with a friend) and getting the final result you see in the table above, about 11 hours passed. This is decidedly better than 9 days. Given all the setup and programs, the pipeline takes about 80 raw minutes to execute now. Everything went smoothly from start to finish. I think I speak for a lot of programmers when I remark that this never happens.

Computing signatures

Getting tired of sage/python, and in particular its multiprocessing module, I took the liberty of implementing my plan from scratch in Perl, which I know better. Besides, there isn’t really anything in sage that this plan relies on.

After writing that horrible shared dictionary spaghetti and witnessing its performance, I also decided that I should use a proper multi-client DBMS for data storage and concurrent access, in this case PostgreSQL. Let’s see some code:

use Modern::Perl;
use ntheory qw(binomial);
use Mojo::Pg;

my $pg = Mojo::Pg->new("postgresql://tboege@/lubf8modsn");
my $db = $pg->db;

$pg->migrations->from_data->migrate;

Mojo::Pg is a really nice Perl module in the Mojolicious framework, which according to its documentation “makes PostgreSQL a lot of fun to use”. That is indeed so. The lines above connect to my local postgresql instance and set up my schemas (omitted here) if they did not exist before.

my $n = 8;  # dimension of the gaussoids to process

my @sindex;
my $nf = $n - 2;
for my $k (0 .. $nf) {
    for (1 .. binomial($nf, $k)) {
        push @sindex, $k;
    }
}

# Chunk length: 2^(n-2) right-hand sides per pair ij
$nf = 0+ @sindex;
my $schunk = qr/.{$nf}/;

This piece of code constructs a lookup table @sindex which maps an index to the cardinality of the right-hand side $K$ of the symbol $(ij|K)$ at that index in the official ordering of symbols from gaussoids.de, and then a regex $schunk which extracts a chunk of a certain length from a string.

These two data structures exploit properties of the official symbol ordering. Symbols are ordered in chunks, so that all symbols $(12|K)$ come first, then all $(13|K)$, and so on. These chunks are what the regex extracts. Each chunk has the same ordering of right-hand sides: first $\emptyset$, then all singletons, then all pairs, and so on. This is the cardinality mapping that @sindex implements.
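In Python the same two structures might look as follows. This is a sketch: it assumes the chunked, cardinality-sorted ordering described above, but not necessarily the exact official gaussoids.de order within each cardinality, so only the signature (not individual positions) is meaningful here:

```python
from math import comb

n = 5                       # dimension, as in the appendix example
# Cardinality lookup: position inside a chunk -> |K|, following
# "first the empty set, then all singletons, then all pairs, ..."
sindex = []
for k in range(n - 1):      # |K| ranges over 0 .. n-2
    sindex += [k] * comb(n - 2, k)

chunk = len(sindex)         # each chunk holds 2^(n-2) symbols

def signature(gaussoid):
    """Count, for every k, the symbols (ij|K) with |K| = k that lie
    inside the gaussoid, i.e. the 0s of the co-indicator string."""
    sig = [0] * (n - 1)
    for start in range(0, len(gaussoid), chunk):
        for i, bit in enumerate(gaussoid[start:start + chunk]):
            sig[sindex[i]] += 1 - int(bit)
    return sig

print(signature('1' * 80))  # → [0, 0, 0, 0]   (the empty gaussoid)
print(signature('0' * 80))  # → [10, 30, 30, 10]
```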

my $pending;
while (<>) {
    chomp;

    my @signature;
    # Slice it up into chunks of (ij|K) where ij is constant
    for (m/$schunk/g) {
        for my $i (0 .. $#sindex) {
            # A 0 in the co-indicator means the symbol is inside
            $signature[$sindex[$i]] += 1 - substr($_, $i, 1);
        }
    }
    # Push the signature and the gaussoid into the database, after
    # waiting for the previous asynchronous insert to finish
    $pending->wait if $pending;
    $pending = $db->insert_p('lubf8' => {
            gaussoid  => $_,
            signature => \@signature,
    });
}
$pending->wait if $pending;

Then the main program reads gaussoids from stdin, iterates over the chunks and uses the cardinality lookup table to update the signature. Once the signature is completed, it is pushed to the database asynchronously.

Notice that computing the signature does scale a lot better than computing orbits. The number of additions is linear in the size of a gaussoid. This script took about 52 minutes to digest the 3,221,660 LUBF-gaussoids from a 5.4 GiB text file. I parallelized it in the shell so that it took 6 wallclock minutes.

Symmetry reduction

Now that all the gaussoids and their signatures are in the database (around 7.4 GiB), we can do the symmetry reduction.

use Modern::Perl;
use List::Util qw(maxstr);
use Mojo::Pg;
use Forks::Super;

my $pg = Mojo::Pg->new("postgresql://tboege@/lubf8modsn");
my $db = $pg->db;

sub symmetric_group { … }

my @group = symmetric_group 8;

The first step is again to connect to the database and also compute a presentation of the symmetric group that we want to reduce by. The implementation of that is omitted because it gets a bit intricate. It enumerates all permutations of the ground set (as arrays mapping index to image under the permutation) and upgrades each of them to a permutation of the gaussoid under our chosen ordering of symbols.

Effectively, if @g is one such permutation presented as an array and @G is a gaussoid as an array of 0s and 1s, then the image of the gaussoid will simply be the slice @G[@g].
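A Python sketch of that upgrade, under an illustrative symbol ordering built with itertools (not the official gaussoids.de one, so the resulting index arrays differ from the Perl program's):

```python
from itertools import combinations

def symbols(n):
    """A fixed chunked ordering of symbols (ij|K): chunks by the pair
    ij, right-hand sides sorted by cardinality within each chunk."""
    syms = []
    for ij in combinations(range(1, n + 1), 2):
        rest = [x for x in range(1, n + 1) if x not in ij]
        for k in range(len(rest) + 1):
            for K in combinations(rest, k):
                syms.append((frozenset(ij), frozenset(K)))
    return syms

def symbol_permutation(pi, n):
    """Upgrade a ground-set permutation pi (a dict x -> pi(x)) to an
    index array g such that slicing a gaussoid G with g, as in the
    Perl @G[@g], yields its image under pi."""
    syms = symbols(n)
    index = {s: i for i, s in enumerate(syms)}
    g = [0] * len(syms)
    for i, (ij, K) in enumerate(syms):
        # the image of (ij|K) is (pi(ij)|pi(K)); its slot in the
        # permuted string is filled from the old position i
        j = index[frozenset(pi[x] for x in ij),
                  frozenset(pi[x] for x in K)]
        g[j] = i
    return g

def act(gaussoid, g):
    return ''.join(gaussoid[i] for i in g)

n = 4
identity = symbol_permutation({x: x for x in range(1, n + 1)}, n)
assert identity == list(range(24))            # binom(4,2) * 2^2 symbols
swap12 = symbol_permutation({1: 2, 2: 1, 3: 3, 4: 4}, n)
G = '0' * 5 + '1' * 19
assert act(act(G, swap12), swap12) == G       # a transposition is an involution
```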

$Forks::Super::MAX_PROC = 50;
$Forks::Super::ON_BUSY = 'block';

my $res = $db->select('lubf8' =>
        ['signature', \'count(gaussoid)'] =>
        undef,
        { group_by => ['signature'] }
);

while (my $record = $res->hash) {
    fork \&worker_signature, args => [$record->{signature}];
}

waitall;

Another cool module comes into play here: Forks::Super. I tell it that I want to run 50 workers (the computing server has over 80 cores and I usually have it to myself). I also tell it that adding more jobs than that should work but block until workers are free. Afterwards it manages a worker pool for me by overloading the fork function in Perl.

Each worker gets a signature and works on all gaussoids of this signature. This has the aforementioned advantage that workers will automatically work on disjoint symmetry classes. No communication is necessary between the workers and no work is performed redundantly.

There are 165 signature classes of wildly different sizes. Here’s an excerpt:

|          signature          | count |
+-----------------------------+-------+
| {16,90,223,314,253,108,19}  | 20160 |
| {19,113,284,385,296,122,21} | 10080 |
| {19,107,250,313,226,92,16}  | 40320 |
| {28,168,420,560,420,168,28} |     1 |
| {13,98,280,406,322,134,23}  | 10080 |
| {26,156,390,520,390,156,26} |   210 |
| {11,76,219,326,262,110,19}  | 20160 |
| {27,162,405,540,405,162,27} |    28 |
| {22,136,346,464,346,136,22} |   630 |
| {22,127,309,406,304,123,21} |  7560 |

The workers getting large batches will take longer than the others, but the others exiting early make room for new workers which compute the next batch of orbits in parallel. This asymmetry in job sizes only really matters in the end when all except the long-running jobs are finished.

sub worker_signature {
    my $signature = shift;
    my $db = $pg->db;

    # XXX: Apparently Mojo::Pg does not handle arrays in WHERE clauses well
    my $batch = $db->select('lubf8' => 'gaussoid' => {
            signature => sprintf '{%s}', join(',', @$signature),
    });
    note "started and have @{[ $batch->rows ]} records to process";
    my %orbited;
    while (my $record = $batch->hash) {
        next if $orbited{$record->{gaussoid}};
        my @G = split //, $record->{gaussoid};
        my @orbit = map { join '', @G[@$_] } @group;
        my $rep = maxstr(@orbit);
        for my $H (@orbit) {
            $db->update('lubf8' =>
                    { representative => $rep },
                    { gaussoid       => $H   },
            );
            $orbited{$H}++;
        }
    }
    note "done";
}

Finally, this is the worker routine. It gets a signature, retrieves its job from the database, computes the orbits and updates the database with the findings. No orbit is ever computed twice. The whole deal takes 27 and a half wallclock minutes, and the 3,221,660 LUBF-gaussoids collapse to 222 symmetry classes.

Deciding orientability

A piece of sage code then generates orientation formulas for the 222 representatives extracted from the database — I have not found a replacement for sage’s propcalc module in Perl. This takes about 44 minutes sequentially.

MiniSAT then sifts through these formulas, and we add up the orbit sizes of the unsatisfiable ones, i.e. the non-orientable LUBF-gaussoids.

for f in LUBF-8-mod-Sn-*.cnf; do
  minisat_static "$f" | grep -q ^UNSATISFIABLE &&
    grep -e "$(head -1 "$f" | cut -d= -f2)" LUBF-8-list-mod-Sn.txt
done | sum-orbit.pl

The loop above terminates after no more than 5 seconds with the result 355,740 given in the table at the beginning.

All in all, this procedure takes 80 minutes instead of 9 days, thanks to invariants and thanks to lots of awesome software written by many hands. Thank you!


Appendix

Wait, what is a “gaussoid”?

Let me quickly recap some information from gaussoids.de about gaussoids and how we deal with them on a computer. This section exists until I get to write a proper introduction to gaussoids to link to instead.

Gaussoid land is built inside a finite universe or ground set, which we usually take to be $[n] = \{1, 2, \dots, n\}$. In this set we can form objects $(ij|K)$ that consist of a two-element set $ij$, whose members are $i$ and $j$, and a disjoint set $K$. Making $ij$ a set is a convenient way to express that we do not wish to distinguish between the symbols $(12|34)$ and $(21|34)$ or $(21|43)$ — but $(13|24)$ is a different symbol, and $(11|23)$ or $(12|23)$ are not symbols at all because they violate the disjointness criteria.

A gaussoid is a set of such symbols that satisfies certain closure axioms. The size of the ambient universe $[n]$ is called the dimension of the gaussoid. On a computer, we represent a gaussoid as a “co-indicator” vector, i.e. we fix some linear ordering of the symbols $(ij|K)$ and then write a string that has a 0 at the $k$-th position if the $k$-th symbol in our ordering is inside the gaussoid and a 1 if it is not. This might seem backwards, but trust me that it makes more sense this way. (We think of symbols inside a gaussoid as vanishing and things outside as non-vanishing.)

For example, in dimension 5 there are $80 = \binom{5}{2}2^{5-2}$ symbols. Under the official ordering of the symbols from gaussoids.de, this is one random 5-dimensional gaussoid as a computer sees it:

00101011001000001111111100010010111010001111111111111111110000001111111111010100

The closure axioms such a binary string has to satisfy to be called a gaussoid are not very important to this article and explaining them in a meaningful way takes a bit of time. Suffice it to say that for every dimension $n$ there are exactly $\binom{n}{3}2^{n-3}$ index arrays consisting of 6 elements, such that when you slice this binary string with the given indices, you always get one of the following eleven strings:

An LUBF-gaussoid is one where you allow only the Ls, the Us, the Bs and the Fs to appear in string slices. The difference to gaussoids is that 111111 does not appear as any slice.

Symmetry

There is an obvious action of the symmetric group $S_n$ that just permutes the universe. This action extends to symbols via $(ij|K)^\pi := (\pi(ij)|\pi(K))$, and further to sets of such symbols. Gaussoids are invariant under this action, meaning that if a string was a gaussoid, its images under the action will also be gaussoids. Here is another gaussoid in the same symmetry orbit as the one shown earlier:

11111111111111111101010011000000010011010100010001000000111111111110100011111111

It is obtained by a permutation of the previous binary string induced by some permutation of the set $\{1, 2, 3, 4, 5\}$. It should not be obvious to you which symbol goes where.

This action preserves important properties of a gaussoid, such as realizability and orientability. Interesting properties of gaussoids are often invariant under the action. In these cases it is advisable to decide the property only for representatives of the symmetry classes, as those are much fewer but yield the same information.