# `LUBF`-gaussoids in dimension 8

I learned something about parallel symmetry reduction that I want to share. It is too seldom that I can report on an enjoyable experience with computers, so that alone would be reason enough to write it up.

This article is a little less about mathematics than it is about engineering a parallel algorithm. The idea is really not deep: by precomputing some invariants of the symmetry classes, I can distribute *related* tasks to each worker so that (1) the workers do not have to communicate at all and (2) still none of them performs redundant work. For gaussoids, that helps tremendously, because the invariants are very cheap to compute and relatively precise.

**Beware that this article can contain traces of scientific code!**

*Note: If you do not know what a gaussoid is and feel that it hinders your understanding of the article, there is a Gaussoid primer in the appendix.*

In our paper *Construction Methods for Gaussoids*, Thomas Kahle and I systematically explore specializations of the notion of gaussoid: those classes of gaussoids which have symmetric axiomatizations in dimension 3. There are exactly 32 of these classes. We found that:

- **4** of them grow very fast, doubly exponential in the dimension,
- **14** have axioms which eventually (in dimension 5) become trivial,
- **13** correspond very clearly to combinatorial structures, and
- **1** is *none of the above* (as far as we know).

The last one is called `LUBF` in our taxonomy and is the subject of this article. We know almost nothing about this class, but what we know makes it all the more interesting.

The four fast-growing classes are those which lie above `ELU`. Since we know that there is a single exponential upper bound on the number of realizable gaussoids, these four classes contain mostly *junk* from the perspective of a statistician: gaussoids that do not come from a Gaussian. They still contain many realizable gaussoids.

The thirteen combinatorial classes of gaussoids correspond to things such as graphs, partitions and involutions. Remarkably, all of them in every dimension are realizable.

What is so interesting about `LUBF` is that its growth seems to be relatively tame: single exponential in the dimension (but we don’t know for sure). This is obviously better than the non-growth of the fourteen trivial classes and also better than all the junk in the four doubly exponential classes. At the same time, the realizability and orientability predicates are not trivial in `LUBF`, unlike in the combinatorial classes. This makes `LUBF` an attractive space in which to look for infinite families of gaussoids that could prove, for example, the non-axiomatizability or NP-hardness of orientability.

Since we had (and still have) no idea about the structure of `LUBF`, we started by counting how many members it has in each dimension. This is the table up to dimension 8:

| $n$ | $3$ | $4$ | $5$ | $6$ | $7$ | $8$ |
|---|---|---|---|---|---|---|
| $\texttt{LUBF}_n$ | $10$ | $142$ | $1\,166$ | $12\,796$ | $183\,772$ | $3\,221\,660$ |

These numbers are quite easy to compute with a #SAT solver because being `LUBF` is a property characterized by a relatively small Boolean formula. Then, because I’m interested in orientability of `LUBF`-gaussoids, I set out to compute the numbers of *non-orientable* `LUBF`-gaussoids. This is much harder because no axioms for orientability are known beyond dimension 4, so SAT solvers are of limited use. By the end of this article, I will show you how I was able to compute the numbers in the following table nonetheless:

| $n$ | $3$ | $4$ | $5$ | $6$ | $7$ | $8$ |
|---|---|---|---|---|---|---|
| $\texttt{LUBF}^{ø}_n$ | $0$ | $42$ | $210$ | $1\,260$ | $14\,700$ | $355\,740$ |

The principal trick in getting these numbers is *symmetry reduction*. Orientability is invariant under the action of the symmetric group on gaussoids (and `LUBF` is an invariant subset under this action as well). Consequently, orientability only has to be decided for one member of each symmetry orbit. One can first reduce the list of all `LUBF`-gaussoids modulo symmetry, then run orientability tests on representatives, and finally sum over the sizes of the non-orientable orbits. For example, in dimension 6 there are 12,796 `LUBF`-gaussoids but only 47 orbits. This is a 272-fold speedup *once the orbits are known*.

And getting those orbits is exactly the problem. The dimensions 3 to 7 could be handled by more or less brute parallelization: spin off 50 workers, each takes 1/50th of the input list and they compute for every gaussoid its orbit representative. The data is collected and summarized in the end.

But in the world of gaussoids, combinatorial explosion is a constant problem: the symmetric group grows superexponentially, the number of `LUBF`-gaussoids to reduce grows (*probably*) exponentially, and the length of each gaussoid we act on grows exponentially. Granting the exponential-growth assumption for `LUBF`, the product of all of these numbers is, up to constant factors, $n^2 2^{2n} n! \approx \text{A LOT!!}$, and this is about the number of string indexing operations that have to be performed to compute all the orbits. We oppose this growth with 50 measly worker processes. To say it in contemporary terms: *this doesn’t scale at all*. The estimated running time for the reduction with 50-fold parallelization is a bit over 9 days. (I should mention that this was written in sagemath.)

By that point, I was not too keen anymore on getting the dimension 8 number because it was just going to be a data point in a table. So I stopped there and continued to do what I always do when I want to know if someone knows a sequence of numbers I obtained experimentally: I asked the OEIS. In this case, as for the total numbers of `LUBF`-gaussoids, it didn’t have an answer. However, as the OEIS recommends in such cases, I cancelled common factors in my input sequence and found to my astonishment that all these numbers are divisible by 42, and all numbers from dimension 5 upwards are even divisible by 210. After cancelling common factors, the numbers for dimensions 5, 6 and 7 were indeed found by the OEIS to be nothing but $\binom{4n}{2n}$.
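As a quick sanity check, the divisibility claims and the binomial match can be verified in a few lines of Python. (The indexing $\binom{4k}{2k}$ with $k = n - 5$ is my reading of the OEIS match, so treat it as an assumption.)

```python
from math import comb

# Counts of non-orientable LUBF-gaussoids from the table above.
counts = {3: 0, 4: 42, 5: 210, 6: 1260, 7: 14700, 8: 355740}

# All counts are divisible by 42 ...
assert all(c % 42 == 0 for c in counts.values())
# ... and from dimension 5 upwards even by 210.
assert all(counts[n] % 210 == 0 for n in range(5, 9))

# Cancelling 210 in dimensions 5, 6, 7 leaves 1, 6, 70,
# which are comb(4k, 2k) for k = 0, 1, 2.
reduced = [counts[5 + k] // 210 for k in range(3)]
assert reduced == [comb(4 * k, 2 * k) for k in range(3)]
```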

Wouldn’t a combinatorial characterization of non-orientable `LUBF`-gaussoids be too beautiful to be true? I had to get dimension 8 and see if the pattern continues. But waiting 9 days…?

Of course, this is just the symmetry reduction, not the actual goal of counting orientable gaussoids. What if, instead of symmetry reduction, we wrote down the orientation formula for each gaussoid and ran a regular SAT solver to determine its orientability? Skip the symmetry reduction with all its benefits and drawbacks? An average orientation formula for an 8-dimensional `LUBF`-gaussoid is 563 KiB big and can be solved in zero seconds. But multiplied by 3,221,660, this means 1.7 TiB worth of Boolean formulas. This much space *might* have been affordable for this task and solvers *might* have chugged through all of it, perhaps, but even *computing* these formulas takes a lot of time. I estimated 384 days to generate all these files sequentially, which is still over a week with 50-fold parallelization, just for computing the formulas. No.

**Symmetry reduction is absolutely necessary to crack this nut!**

I already mentioned that the first algorithm was a very brute parallelization of the sequential task of determining for every `LUBF`-gaussoid its orbit representative. There are 50 workers and each gets a share of gaussoids to work on. All computations are independent. With 9 projected days, this took too long.

My second attempt was based on the observation that the workers compute the same orbits over and over again, needlessly. Since the orbit of a gaussoid has to be enumerated to determine its representative, we have actually solved the problem for the whole orbit. In the first attempt, workers just threw the orbit away. I implemented a parallel algorithm, again in sage, which took advantage of this using a shared dictionary of solved cases to which whole orbits were added at once. Performance turned out to be much worse. When I looked at `htop`, I saw that the processes spent most of their time contending for that shared dictionary. I declared that approach a dead end and focused on the parallel algorithm engineering aspect at which I had obviously failed.

The saving idea was suggested by Petra Schwer from across the hallway when I talked to her about this problem. For each number $k$, the number of symbols $(ij|K)$ inside the gaussoid with $|K| = k$ is an invariant of its symmetry class. This gives a whole array of invariants indexed by $k$ which I tentatively call the *signature* of the gaussoid. Having the same signature is a coarsening of the symmetry relation: each signature class decomposes as a union of symmetry classes.

This coarsening was a godsend as it is much cheaper to compute than the orbit and a whole signature class can be sent to each worker. Since it is a union of symmetry orbits, each worker can follow the more efficient “compute each orbit just once” strategy and no shared dictionary is needed for the workers to coordinate their work, as the orbits they work on are already disjoint. Since each orbit is computed just once, the algorithm becomes much faster.
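The signature is easy enough to compute that a self-contained Python sketch fits in a few lines. It only relies on what was said about the ordering of symbols — the string decomposes into chunks of fixed $ij$, and within a chunk the right-hand sides are sorted by cardinality. (This is my own illustration, not the production code.)

```python
from math import comb

def sindex(n):
    # |K| of the symbol at each position within one chunk of fixed ij:
    # first the empty set, then all singletons, then all pairs, and so on.
    nf = n - 2
    table = []
    for k in range(nf + 1):
        table.extend([k] * comb(nf, k))
    return table

def signature(G, n):
    # G is a co-indicator string: a '0' marks a symbol inside the gaussoid.
    si = sindex(n)
    sig = [0] * (n - 1)
    for c in range(0, len(G), len(si)):
        for i, bit in enumerate(G[c:c + len(si)]):
            sig[si[i]] += 1 - int(bit)
    return sig

# The full gaussoid in dimension 5 contains all 80 symbols: 10 chunks,
# each contributing (1, 3, 3, 1) symbols of cardinality 0, 1, 2, 3.
assert signature("0" * 80, 5) == [10, 30, 30, 10]
```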

After the discussion, I made a plan and from the moment I started compiling postgresql on the institute’s computing server (as a prerequisite for my plan), over writing and babysitting the programs I needed, (having a beer with a friend) and getting the final result you see in the table above, about 11 hours passed. This is decidedly better than 9 days. Given all the setup and programs, the pipeline takes about 80 raw minutes to execute now. Everything went smoothly from start to finish. I think I speak for a lot of programmers when I remark that this *never* happens.

Getting tired of sage/python and in particular its multiprocessing module, I took the liberty of implementing my plan from scratch in Perl, which I know better. There is nothing in sage that this plan really relies on anyway.

After writing the horrible shared-dictionary spaghetti and witnessing its performance, I also decided that I should use a proper multi-client DBMS for data storage and concurrent access, in this case PostgreSQL. Let’s see some code:

```perl
use Modern::Perl;
use ntheory qw(binomial);
use Mojo::Pg;

my $pg = Mojo::Pg->new("postgresql://tboege@/lubf8modsn");
my $db = $pg->db;
$pg->migrations->from_data->migrate;
```

Mojo::Pg is a really nice Perl module in the Mojolicious framework, which according to its documentation “makes PostgreSQL a lot of fun to use”. That is indeed so. The lines above connect to my local postgresql instance and set up my schemas (omitted here) if they did not exist before.

```perl
my @sindex;
my $nf = $n - 2;
for my $k (0 .. $nf) {
    for (1 .. binomial($nf, $k)) {
        push @sindex, $k;
    }
}
$nf = 0+ @sindex;
my $schunk = qr/.{$nf}/;
```

This piece of code constructs a lookup table `@sindex` which maps an index to the cardinality of the right-hand side $K$ of the symbol $(ij|K)$ at that index in the official ordering of symbols from gaussoids.de, and then a regex `$schunk` which extracts a chunk of a certain length from a string.

These two data structures exploit properties of the official symbol ordering. Symbols are ordered in chunks, so that all symbols $(12|...)$ come first, then all $(13|...)$, and so on. These chunks are what the regex extracts. Each chunk has the same ordering of right-hand sides: first $\emptyset$, then all singletons, then all pairs, and so on. This is the cardinality mapping that `@sindex` implements.

```perl
my $pending;
while (<>) {
    chomp;
    my @signature;
    # Slice it up into chunks of (ij|K) where ij is constant
    for (m/$schunk/g) {
        for my $i (0 .. $#sindex) {
            $signature[$sindex[$i]] += 1 - substr($_, $i, 1);
        }
    }
    # Wait for the previous insert, then push the signature
    # and the gaussoid into the database
    $pending->wait if $pending;
    $pending = $db->insert_p('lubf8' => {
        gaussoid  => $_,
        signature => \@signature,
    });
}
```

Then the main program reads gaussoids from stdin, iterates over the chunks and uses the cardinality lookup table to update the signature. Once the signature is completed, it is pushed to the database asynchronously.

Notice that computing the signature scales a lot better than computing orbits: the number of additions is linear in the size of a gaussoid. This script took about 52 minutes to digest the 3,221,660 `LUBF`-gaussoids from a 5.4 GiB text file. I parallelized it in the shell so that it took 6 wallclock minutes.

Now that all the gaussoids and their signatures are in the database (around 7.4 GiB), we can do the symmetry reduction.

```perl
use Modern::Perl;
use List::Util qw(maxstr);
use Mojo::Pg;
use Forks::Super;

my $pg = Mojo::Pg->new("postgresql://tboege@/lubf8modsn");
my $db = $pg->db;

sub symmetric_group { … }
my @group = symmetric_group 8;
```

The first step is again to connect to the database and also compute a presentation of the symmetric group that we want to reduce by. The implementation of that is omitted because it gets a bit intricate. It enumerates all permutations of the ground set (as arrays mapping index to image under the permutation) and upgrades each of them to a permutation of the gaussoid under our chosen ordering of symbols.

Effectively, if `@g` is one such permutation presented as an array and `@G` is a gaussoid as an array of `0`s and `1`s, then the image of the gaussoid will simply be the slice `@G[@g]`.
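The same mechanism can be sketched in Python, with list indexing playing the role of the Perl slice. The symbol ordering below is one I made up for illustration, not the official one, but the construction works for any fixed ordering:

```python
from itertools import combinations, permutations

def symbols(n):
    # All symbols (ij|K) in *some* fixed order (not the official one).
    syms = []
    for ij in combinations(range(1, n + 1), 2):
        rest = [x for x in range(1, n + 1) if x not in ij]
        for k in range(len(rest) + 1):
            for K in combinations(rest, k):
                syms.append((frozenset(ij), frozenset(K)))
    return syms

def position_permutation(n, pi):
    # Upgrade a ground-set permutation pi (a dict) to a permutation of
    # symbol positions, the analogue of one @g array.
    syms = symbols(n)
    index = {s: t for t, s in enumerate(syms)}
    return [index[(frozenset(pi[x] for x in ij), frozenset(pi[x] for x in K))]
            for ij, K in syms]

def act(G, g):
    # The analogue of the Perl slice @G[@g].
    return "".join(G[t] for t in g)

n = 4
m = len(symbols(n))  # binom(4, 2) * 2^2 = 24 positions
for p in permutations(range(1, n + 1)):
    g = position_permutation(n, dict(zip(range(1, n + 1), p)))
    assert sorted(g) == list(range(m))  # each @g is a genuine permutation
identity = position_permutation(n, {x: x for x in range(1, n + 1)})
assert act("01" * 12, identity) == "01" * 12
```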

```perl
$Forks::Super::MAX_PROC = 50;
$Forks::Super::ON_BUSY  = 'block';

my $res = $db->select('lubf8' =>
    ['signature', \'count(gaussoid)'] =>
    undef,
    { group_by => ['signature'] }
);
while (my $record = $res->hash) {
    fork \&worker_signature, args => [ $record->{signature} ];
}
waitall;
```

Another cool module comes into play here: Forks::Super. I tell it that I want to run 50 workers (having over 80 cores in the computing server and usually being alone on there). I also tell it that adding more jobs than that should work but block until workers are free. Afterwards it manages a worker pool for me by overloading the `fork` function in Perl.

Each worker gets a signature and works on all gaussoids of this signature. This has the aforementioned advantage that workers will automatically work on disjoint symmetry classes. No communication is necessary between the workers and no work is performed redundantly.

There are 165 signature classes of wildly different sizes. Here’s an excerpt:

```
| signature                   | count |
+-----------------------------+-------+
| {16,90,223,314,253,108,19}  | 20160 |
| {19,113,284,385,296,122,21} | 10080 |
| {19,107,250,313,226,92,16}  | 40320 |
| {28,168,420,560,420,168,28} |     1 |
| {13,98,280,406,322,134,23}  | 10080 |
| {26,156,390,520,390,156,26} |   210 |
| {11,76,219,326,262,110,19}  | 20160 |
| {27,162,405,540,405,162,27} |    28 |
| {22,136,346,464,346,136,22} |   630 |
| {22,127,309,406,304,123,21} |  7560 |
```

The workers getting large batches will take longer than the others, but the others exiting early make room for new workers which compute the next batch of orbits in parallel. This asymmetry in job sizes only really matters in the end when all except the long-running jobs are finished.

```perl
sub worker_signature {
    my $signature = shift;
    my $db = $pg->db;

    # XXX: Apparently Mojo::Pg does not handle arrays in WHERE clauses well
    my $batch = $db->select('lubf8' => 'gaussoid' => {
        signature => sprintf '{%s}', join(',', @$signature),
    });
    note "started and have @{[ $batch->rows ]} records to process";

    my %orbited;
    while (my $record = $batch->hash) {
        next if $orbited{$record->{gaussoid}};
        my @G = split //, $record->{gaussoid};
        my @orbit = map { join '', @G[@$_] } @group;
        my $rep = maxstr(@orbit);
        for my $H (@orbit) {
            $db->update('lubf8' =>
                { representative => $rep },
                { gaussoid => $H },
            );
            $orbited{$H}++;
        }
    }
    note "done";
}
```

Finally, this is the worker routine. It gets a signature, retrieves its job from the database, computes the orbits and updates the database with the findings. No orbit is ever computed twice. The whole deal takes 27 and a half wallclock minutes, and the 3,221,660 `LUBF`-gaussoids collapse to 222 symmetry classes.

A piece of sage code then generates orientation formulas for the 222 representatives extracted from the database — I have not found a replacement for sage’s propcalc module in Perl. This takes about 44 minutes sequentially.

MiniSAT then sifts through these formulas, and the orbit sizes of the unsatisfiable formulas, i.e. of the non-orientable `LUBF`-gaussoids, are added up.

```shell
for f in LUBF-8-mod-Sn-*.cnf
do minisat_static "$f" | grep -q ^UNSATISFIABLE &&
    grep -e $(head -1 "$f" | cut -d= -f2) LUBF-8-list-mod-Sn.txt
done | sum-orbit.pl
```

The line above terminates after no more than 5 seconds with the result 355,740 given in the table at the beginning.

All in all, this procedure takes 80 minutes instead of 9 days, thanks to invariants and thanks to lots of awesome software written by many hands. *Thank you!*

## Appendix: a Gaussoid primer

Let me quickly recap some information from gaussoids.de about gaussoids and how we deal with them on a computer. *This section exists until I get to write a proper introduction to gaussoids to link to instead.*

Gaussoid land is built inside a finite *universe* or *ground set* which we usually take to be $[n] = \{1, 2, \dots, n\}$. In this set we can form objects $(ij|K)$ that consist of a two-element set $ij$ whose members are $i$ and $j$ and a disjoint set $K$. Making $ij$ a set is a convenient way to express that we do not wish to distinguish between the symbols $(12|34)$ and $(21|34)$ or $(21|43)$ — but $(13|24)$ is a different symbol, and $(11|23)$ or $(12|23)$ are not symbols at all because they violate the disjointness criteria.
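A tiny Python sketch makes the bookkeeping concrete, modelling the pair $ij$ and the set $K$ as frozensets so that the order of their elements does not matter (my own illustration):

```python
from math import comb

def is_symbol(ij, K):
    # (ij|K) is a symbol iff i != j and {i, j} is disjoint from K.
    i, j = ij
    return i != j and not ({i, j} & set(K))

# (12|34) and (21|43) denote the same symbol ...
assert (frozenset({1, 2}), frozenset({3, 4})) == (frozenset({2, 1}), frozenset({4, 3}))
# ... while (11|23) and (12|23) are not symbols at all.
assert not is_symbol((1, 1), (2, 3))
assert not is_symbol((1, 2), (2, 3))
assert is_symbol((1, 3), (2, 4))

# In dimension n there are binom(n, 2) * 2^(n-2) symbols, e.g. 80 for n = 5.
assert comb(5, 2) * 2 ** (5 - 2) == 80
```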

A *gaussoid* is a set of such symbols that satisfies certain closure axioms. The size of the ambient universe $[n]$ is called the *dimension* of the gaussoid. On a computer, we represent a gaussoid as a “co-indicator” vector, i.e. we fix some linear ordering of the symbols $(ij|K)$ and then write a string that has a `0` at the $k$-th position if the $k$-th symbol in our ordering is *inside* the gaussoid and a `1` if it is not. This might seem backwards, but trust me that it makes more sense this way. (We think of symbols inside a gaussoid as vanishing and things outside as non-vanishing.)

For example, in dimension 5 there are $80 = \binom{5}{2}2^{5-2}$ symbols. Under the official ordering of the symbols from gaussoids.de, this is one random $5$-dimensional gaussoid as a computer sees it:

`00101011001000001111111100010010111010001111111111111111110000001111111111010100`

The closure axioms such a binary string has to satisfy to be called a gaussoid are not very important to this article and explaining them in a meaningful way takes a bit of time. Suffice it to say that for every dimension $n$ there are exactly $\binom{n}{3}2^{n-3}$ index arrays consisting of 6 elements such that, when you slice this binary string with the given indices, you always get one of the following eleven strings:

- `111111`, which is called `E` for “empty”,
- `011111`, `110111`, `111101`, which are `L` for “lower”,
- `101111`, `111011`, `111110`, which are `U` for “upper”,
- `110000`, `001100`, `000011`, which are `B` for “belt”,
- `000000`, which is `F` for “full”.

An `LUBF`-gaussoid is one where you allow only the `L`s, the `U`s, the `B`s and the `F`s to appear in string slices. The difference to gaussoids is that `111111` does not appear as any slice.
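Given the six-element index arrays, the membership test is a one-liner. The sketch below assumes the list `minors` of index arrays as given, since their construction is not described here:

```python
# The ten 6-bit slices allowed in an LUBF-gaussoid: three L, three U,
# three B patterns and the F pattern; E (111111) is excluded.
ALLOWED = {
    "011111", "110111", "111101",  # L
    "101111", "111011", "111110",  # U
    "110000", "001100", "000011",  # B
    "000000",                      # F
}

def is_lubf(G, minors):
    # G: co-indicator string; minors: the binom(n,3) * 2^(n-3)
    # six-element index arrays (assumed given, hypothetical input).
    return all("".join(G[i] for i in idx) in ALLOWED for idx in minors)

# With a single made-up index array: the all-zero slice is F (allowed),
# the all-one slice is E (forbidden).
assert is_lubf("000000", [[0, 1, 2, 3, 4, 5]])
assert not is_lubf("111111", [[0, 1, 2, 3, 4, 5]])
```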

There is an obvious action of the symmetric group $S_n$ that just permutes the universe. This action extends to symbols $(ij|K)$ via $(ij|K)^\pi := (\pi(ij)|\pi(K))$, and then naturally to sets of such symbols. Gaussoids are invariant under this action, meaning that if a string is a gaussoid, its images under the action are also gaussoids. Here is another gaussoid in the same symmetry orbit as the one shown earlier:

`11111111111111111101010011000000010011010100010001000000111111111110100011111111`

It is obtained by a permutation of the previous binary string induced by some permutation of the set $\{1, 2, 3, 4, 5\}$. It should not be obvious to you which symbol goes where.

This action preserves important properties of a gaussoid, such as:

- its cardinality,
- the number of contained symbols $(ij|K)$ where $K$ has a fixed size,
- whether or not it is `LUBF`, and
- whether or not it is realizable or orientable.
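The first two of these invariants can be checked directly on the two gaussoids displayed above. A short sketch; the per-cardinality count below uses only the fact, stated earlier, that each 8-symbol chunk of the official ordering starts with the symbol whose $K$ is empty:

```python
G1 = "00101011001000001111111100010010111010001111111111111111110000001111111111010100"
G2 = "11111111111111111101010011000000010011010100010001000000111111111110100011111111"

# The cardinality (number of contained symbols, i.e. '0' bits) agrees ...
assert G1.count("0") == G2.count("0") == 31

# ... and so does the number of contained symbols (ij|K) with K empty:
# these sit at the start of each chunk of 8 symbols.
empty_K = lambda G: sum(G[c] == "0" for c in range(0, 80, 8))
assert empty_K(G1) == empty_K(G2) == 3
```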

Interesting properties of gaussoids are often invariant under this action. In these cases it is advisable to decide the property for representatives of the symmetry classes only as those are much fewer but yield the same information.