Counting combinatorial objects using SAT solvers

Using SAT solvers

For a few years now I have been in the business of counting, enumerating and axiomatizing combinatorial objects using SAT solvers. Not every type of object is suitable for this; it needs to have a definition which can be easily translated to implications among boolean variables (more precisely, a conjunction of variables implying a disjunction of variables). Once an axiomatization is available, SAT solvers can be used to check if a given object satisfies the axioms or to count or list all extensions of a given partially defined object.

SAT solvers are truly marvellous pieces of software which get better and better every year, thanks to regular competitions organized by the research community. I have applied this scheme extensively to semigraphoids, gaussoids and variants of matroids. Although most of these computations have gone unused and unpublished, there is great value in the general idea of using SAT solvers to search, count and enumerate combinatorial objects.

Recently I looked over our preprint on Sign patterns of principal minors of real symmetric matrices and noticed that the number of admissible sign patterns on 6 items was only computed using the probabilistic model counter GANAK (code). Since the result carries uncertainty, we decided not to include it in our submission of the sequence to the OEIS. We tried the old exact model counter sharpSAT (code) on this problem and it ran out of memory. This prompted me to check out recent developments in model counting software and see if there are any new exact model counters which are strong enough for this problem. I found sharpSAT-TD (code) from the University of Helsinki which won some first prizes in the model counting competition earlier this decade. It confirmed GANAK’s probabilistic count of 3343368353433\,433\,683\,534 in just half a minute. I was very impressed and so I wanted to try it on something else.

Example: counting antimatroids

For different reasons but at the same time, I was looking at antimatroids. Like the structures I was experienced with, antimatroids have a simple propositional axiomatization in terms of feasible sets:

The first one can be written as [A][B]    [AB][A] \wedge [B] \implies [A\cup B], where the brackets denote a boolean variable corresponding to the set the enclose. The second axiom one is [A]    xA[A{x}][A] \implies \bigvee_{x \in A} [A \setminus \{x\}]. These axioms fit perfectly into the SAT solver framework and give a relatively small and straightforward boolean formula.

With these two axioms, I kept getting the wrong count, compared to the OEIS A119770. Turns out this is fixed by requiring also:

Perhaps this is omitted from or hidden in the descriptions I have looked at but I think it is reasonable and makes my code yield the correct results for all numbers known at the OEIS. The program is very simple:

#!/usr/bin/env perl

use Modern::Perl 2018;

use CInet::Base;
use CInet::ManySAT;
use Algorithm::Combinatorics qw(subsets);
use Array::Set qw(set_diff set_union);

my $n = shift // die 'need ground set size';

my $N = [1 .. $n];

# @PN maps a 0-based variable number to the corresponding subset of $N,
# which is a possible feasible set.
my @PN = subsets($N);

# %lut maps a set to its 1-based variable number, the inverse of @PN.
tie my %lut, 'CInet::Hash::FaceKey';
for (my $i = 0; $i < @PN; $i++) {
    $lut{[ $PN[$i], $N ]} = $i+1;
}

sub getvar {
    my $A = shift;
    $lut{[ $A, $N ]} // die "lookup failed on set @{[ join('', @$A) ]}"
}

sub getset {
    my $x = shift;
    $PN[$x-1] // die "lookup failed on variable number $x";
}

# Definition of antimatroid in terms of feasible sets:
my $solver = CInet::ManySAT->new;
$solver->add([ getvar($N) ]);
# Closure under unions
for my $AB (subsets(\@PN, 2)) {
    my ($A, $B) = @$AB;
    my $x = getvar($A);
    my $y = getvar($B);
    my $z = getvar(set_union($A, $B));
    next if $x == $z or $y == $z;
    $solver->add([ -$x, -$y, $z ]);
}
# Accessibility of feasible sets
for my $B (@PN) {
    next if not @$B;
    my @A = map set_diff($B, [$_]), @$B;
    $solver->add([ -getvar($B), map getvar($_), @A ]);
}
my $gen = $solver->dimacs;
while (defined(my $line = $gen->())) {
    print $line;
}

The OEIS data is generated by a hand-crafted C++ program due to Przemysław Uznański and Michał Bartoszkiewicz. This program, crucially, enumerates antimatroids and can even reduce them modulo isomorphy (yielding A224913). This only works up to n=7n = 7 items and the OEIS entry contains a provocative line:

The next item (for n=8) should be roughly 2^78 and seems hopeless without more mathematics.

While I have to agree that 2782^{78} is definitely a lot for a counting task, I would not call it hopeless. The loss of hope relies on the assumption that a program countig antimatroids needs to enumerate them. Model counters do not do that. If a satisfying assignment of the formula is reached using only kk out of NN variables. Then in fact 2Nk2^{N-k} satisfying assignments have been found. In this way, model counters are not bound to perform a number of steps which is bounded from below by to the number of models to count.

Challenges

I did not manage to compute the number of 8-antimatroids using sharpSAT-TD, but after some encouragement by Henrik Winther I decided to share the above little success I had with sharpSAT-TD and a few challenges. I should mention that I ran the largest problems in each category on the compute server of the math department at UiT for a straight month without getting an answer. But perhaps in a few years, SAT solvers will have advanced enough to complete some of these tasks.

These are just some objects I am familiar with. I am sure that there are more combinatorial structures of interest which can be modeled by boolean formulas and counted by SAT solvers. If you have some suggestions or if you compute any of these numbers or are interested in setting up a large-scale computation to really tackle one of these open problems, do let me know!