A revised and updated version is available here. See also new Arxiv paper.


We present here full details on the SAT encoding of the Erdős Discrepancy Problem used in

Background

In 1930s Paul Erdős conjectured that for any positive integer C in any infinite ±1 sequence (xn) there exists a subsequence xd , x2d , x3d , . . . , xkd , for some positive integers k and d, such that  | x1+...+xkd |> C. The conjecture has been referred to as one of the major open problems in combinatorial number theory and discrepancy theory. For the particular case of C = 1 a human proof of the conjecture exists; for C = 2 a bespoke computer program had generated sequences of length 1124 of discrepancy 2, but the status of the conjecture remained open even for such a small bound. We show that by encoding the problem into Boolean satisfiability and applying the state of the art SAT solver, one can obtain a discrepancy 2 sequence of length 1160 and a proof of the Erdős discrepancy conjecture for C = 2, claiming that no discrepancy 2 sequence of length 1161, or more, exists. We also present our partial results for the case of C = 3.

For further details see the paper.

C++ generator 

All SAT instances used in our experiments have been automatically generated by a C++ program that can be downloaded here. For any given values of LENGTH and DISCREPANCY the generator produces a SAT formula in such a way that the formula is satisfiable if, and only if, there exists a ±1 sequence of length LENGTH of discrepancy DISCREPANCY. The formula is in the DIMACS format and it can be directly fed to modern SAT tools.

Usage

The SAT generator is a short C++ program. We give brief instructions on how to use if under a Linux system.

  • Save the program as new.cc
  • Edit the new.cc program to change the following values defined in the beginning of the file:
    • DISCREPANCY this defines the particular value for which we are trying to generate the number sequence
    • LENGTH this is the desired length of the sequence
    • BITS the number of bits used to encode automaton states. Make sure that 2BITS ≥ 2*(DISCREPANCY+1)
  • Compile the program
g++ -g new.cc -o new
  • Run the program
./new > output.cnf
  • Run your favourite SAT solver on output.cnf
    • If your SAT solver complains that the first line of output.cnf is a comment, filter the comments and empty lines out
grep -v "^c" output.cnf | grep -v "^$" > output_no_comments.cnf
    • Run your favourite solver on output_no_comments.cnf
  • Interpreting the results: if the formula is satisfiable, the first LENGTH values in the satisfying assignment encode the sequence of discrepancy DISCREPANCY or less. If the formula is unsatisfiable, no sequence of length LENGTH and discrepancy DISCREPANCY exists.

Some precomputed encodings of the EDP

  • Discrepancy 2
    • A small size formula with comments for DISCREPANCY = 2 and LENGTH = 10 to see the encoding
    • The encoding of the EDP for DISCREPANCY = 2 and LENGTH = 1160 (the maximal length for which a sequence exists) with comments, without comments
    • The encoding of the EDP for DISCREPANCY = 2 and LENGTH = 1161 (the formula is unsatisfiable) with comments, without comments
  • Discrepancy 3
    • The encoding of the EDP for DISCREPANCY = 3 and LENGTH = 13000 without comments (bzipped).
    • The encoding of the EDP for DISCREPANCY = 3 and LENGTH = 13500 without comments (bzipped).
    • The encoding of the EDP for DISCREPANCY = 3 and LENGTH = 13700 without comments (bzipped).
    • The encoding of the EDP for DISCREPANCY = 3 and LENGTH = 13800 without comments (bzipped).
    • The encoding of the EDP for DISCREPANCY = 3 and LENGTH = 13900 without comments (bzipped).

Results

The following results have been obtained with the Plingeling SAT solver version ats.

  • Instructions on how to obtain and verify the DRUP certificate for DISCREPANCY = 2 and LENGTH = 1161

Although SAT solvers can establish the unsatisfiability of the EDP encoding for DISCREPANCY = 2 and LENGHT = 1161, the DRUP certificate is about 13GB so it is too large to put it online. We describe here the steps we followed in our experiments. We used the Glucose SAT solver version 3.0 for generating the certificate and the drup-trim tool for verifying it.

    • glucose -certified task_d2_l1161.cnf -certified-output=<path to storage>/cert_d2_l1161.txt
    • drup-trim task_d2_l1161.cnf <path to storage>/cert_d2_l1161.txt
  • We also used the drup-trim tool to identify a smaller size unsatisfiable subset (bzipped) of the encoding of the problem for DISCREPANCY = 2 and LENGTH = 1161

Discrepancy checkers

In order to check independently the discrepancy of generated sequences, we also implemented a simple Prolog program with design clarity rather than efficiency in mind. It can be downloaded here (contains the computed sequences). Instructions on how to use it are within.One can also use the Python script from the Polymath page verifying the discrepancy of ±1 sequences.