The following is a summary of a talk I gave at Dagstuhl on the 22nd November 2007. I mainly used the blackboard; this is a "cleaned-up" summary of the talk, in which I correct various errors I made during the talk, as pointed out by members of the audience, and I also insert additional clarificatory remarks. Further comments and corrections would be gratefully received!
PPAD is a class of total search problems, that is, problems for which solutions are guaranteed to exist, and the challenge is to actually exhibit a specific solution. The kind of total search problems we are interested in are ones where it is, for some reason or other, possible to prove that a solution exists without actually constructing one. A good starting-point for understanding these kinds of total search problems is the following. We are given a set of 10 numbers and we want to find two distinct subsets of those numbers that add up to the same total.
More generally, let us define the problem EQUAL SUBSETS as follows.
INPUT: a1,a2,...,an; 1 ≤ ai ≤ (2^n−1)/n
FIND: Two distinct subsets of the ai that add up to the same total.
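To make this concrete, here is a minimal brute-force sketch (my own illustrative code, not part of the talk): it enumerates subsets in order of size and keeps a dictionary of sums, so the pigeonhole guarantee ensures the search succeeds. Of course it takes exponential time in the worst case; the interesting question is whether one can do substantially better.

```python
from itertools import combinations

def equal_subsets(a):
    """Brute-force search for two distinct subsets of a with equal sums.
    Exponential time in the worst case; the open question is whether the
    pigeonhole guarantee can be exploited to do better."""
    seen = {}                              # sum -> first subset found
    for r in range(len(a) + 1):
        for subset in combinations(range(len(a)), r):
            s = sum(a[i] for i in subset)
            if s in seen:                  # pigeonhole: a repeated sum
                return seen[s], subset     # two distinct index tuples
            seen[s] = subset
    # Unreachable when the ai satisfy the bound in the problem statement.

x, y = equal_subsets([1, 2, 3, 3])         # subsets given as index tuples
```

Subsets are returned as tuples of indices, so the same value appearing twice in the input still yields two distinct subsets.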
Notice that EQUAL SUBSETS belongs to NP, or to be more precise, it belongs to TFNP, which is the class of all total search problems solvable in non-deterministic polynomial time. A total search problem is one where any instance has a solution, so the decision problem is trivial, and the interesting challenge is to actually find a solution. The obvious non-deterministic algorithm is:
Guess two distinct subsets
Test whether they add up to the same total.
So, should we expect EQUAL SUBSETS to be NP-hard? (Presumably not: Megiddo and Papadimitriou observed that a total search problem cannot be NP-hard unless NP = co-NP.)
We observe that solutions to EQUAL SUBSETS owe their existence to the pigeon-hole principle. Let us define a general class of problems for which the existence of solutions is promised by the pigeon-hole principle, and see if that class "captures the complexity" of EQUAL SUBSETS.
Define the problem PIGEONHOLE CIRCUIT as follows. An input to PIGEONHOLE CIRCUIT consists of a circuit having n inputs and outputs as shown in the diagram. (This problem is called EQUAL OUTPUTS in Papadimitriou's textbook.)
INPUT: Boolean circuit C, having the same number of inputs as outputs.
FIND: Either x such that C(x)=0^n (i.e. an input that gets mapped to the all-zeroes vector), or alternatively, distinct x and x' such that C(x)=C(x').
The challenge is to find a solution in time polynomial in the size of C; there is clearly a brute-force exponential-time algorithm.
(In my talk, I said that the size of C should be polynomially bounded in n, and then we seek a solution in time polynomial in n. In fact, it looks like a nicer definition is to say that we just seek an algorithm that is polynomial in the size of C (having the same number of inputs and outputs) without requiring the size of C to be polynomially related to the number n of inputs and outputs. I thank Nimrod Megiddo for pointing that out.)
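The brute-force algorithm can be sketched as follows. For illustration only, the circuit is represented as a black-box Python function on n-bit tuples rather than an actual Boolean circuit; the pigeonhole principle guarantees the search succeeds within 2^n evaluations.

```python
from itertools import product

def pigeonhole_circuit(C, n):
    """Brute-force solver: given C mapping n-bit tuples to n-bit tuples,
    return either an x with C(x) = 0^n, or a pair x != x' with
    C(x) = C(x'). One of the two must exist: if no input maps to the
    all-zeroes vector, then 2^n inputs share only 2^n - 1 outputs."""
    zero = (0,) * n
    seen = {}                       # output -> first input producing it
    for x in product((0, 1), repeat=n):
        y = C(x)
        if y == zero:
            return ("zero-preimage", x)
        if y in seen:
            return ("collision", seen[y], x)
        seen[y] = x

# Toy circuit on 3 bits: force the first output bit to 1, so no input
# reaches 0^3 and the solver is obliged to find a collision.
sol = pigeonhole_circuit(lambda x: (1,) + x[1:], 3)
```

The toy circuit is my own example; any function with the same number of inputs and outputs would do.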
This looks somehow like a very "generic" kind of problem for which solutions owe their existence to the pigeonhole principle. Let us define the complexity class PPP (short for "polynomial pigeonhole principle") to be the class of all problems reducible in polynomial time to PIGEONHOLE CIRCUIT. Furthermore, we will say that a problem X is complete for PPP provided that X is in PPP and PIGEONHOLE CIRCUIT is poly-time reducible to X.
The good news...
We can verify that EQUAL SUBSETS is a member of PPP.
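One way to see this membership is the following reduction sketch (my own illustrative code, under the simplifying assumption that the ai sum to at most 2^n − 2): view a subset as an n-bit vector x, and let the circuit output the n-bit encoding of 1 plus the sum of the chosen ai. The output is then never the zero vector, so any solution to the resulting PIGEONHOLE CIRCUIT instance must be a collision, and a collision is exactly a pair of distinct subsets with equal sums.

```python
def subsets_to_circuit(a):
    """Reduction sketch (illustrative): turn an EQUAL SUBSETS instance
    into a PIGEONHOLE CIRCUIT instance. Assumes sum(a) <= 2^n - 2, so
    1 + (any subset sum) fits in n bits and is never zero. Hence every
    solution of the circuit problem is a collision, i.e. two distinct
    subsets with the same sum."""
    n = len(a)
    def C(x):   # x is an n-bit tuple selecting a subset of a
        s = 1 + sum(ai for ai, xi in zip(a, x) if xi)
        return tuple((s >> k) & 1 for k in range(n))   # n-bit encoding
    return C

a = [1, 2, 3, 3]                      # sum(a) = 9 <= 2**4 - 2
C = subsets_to_circuit(a)
x, y = (0, 0, 1, 0), (0, 0, 0, 1)     # two different ways to choose a 3
```

Here x and y select the two equal values, so C maps them to the same output, witnessing the collision.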
The bad news...
It is not known if we can reduce in the opposite direction, from PIGEONHOLE CIRCUIT to EQUAL SUBSETS. Indeed, no problems in PPP are known to be complete for PPP, apart from PIGEONHOLE CIRCUIT and variants thereof. And that is unfortunate for the class PPP, since it has not succeeded in capturing the computational complexity of any problem, other than the one involved in its definition.
PPAD: A subclass of PPP
We define the class of problems PPAD in the same sort of way that we defined PPP. That is, in terms of a specific problem that has something to do with circuits, and is guaranteed to have a solution, but for which the solution is seemingly hard-to-find in general, despite the fact we know one must exist.
Define the problem END OF LINE as follows. We are given 2 circuits S and P, having the same number n of inputs and outputs, as shown in the diagram. Treat S as the "successor" circuit and P as the "predecessor" circuit. S and P define a directed graph on the set of length-n bit vectors as follows: there is an edge from x to y provided that S(x)=y and also P(y)=x, EXCEPT that the all-zeroes vector has no predecessor (any edge into it is disregarded, so it is a source).
The existence of a solution to END OF LINE is guaranteed by the observation that a directed graph in which every vertex has in-degree at most 1 and out-degree at most 1, and which has a source (here, the all-zeroes vector), must have some other vertex of total degree 1. Any such vertex, an end of some line other than the all-zeroes vector, is a solution.
Note that the problem is not to find the end of the line that begins at the zeroes vector. That problem is PSPACE-complete! You would essentially be trying to identify the final configuration of a polynomially space-bounded Turing machine, albeit one that is "reversible" (in that you can trace the computation backwards from any configuration). I thank Peter Bro Miltersen for assuring us that TM computations can be made reversible. Also, note that a solution is allowed to be a source other than the all-zeroes vector; it does not actually have to be a sink (even though there is of course no guarantee that another source exists). I thank Nimrod Megiddo for pointing that out. (In the DGP paper this problem gets called "END OF THE LINE"; in the CDT paper it is called END OF LINE. I think it was Mike Paterson who suggested that "END OF A LINE" would be a better name than "END OF THE LINE".)
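The obvious (exponential-time) algorithm for END OF LINE just follows the line that starts at the all-zeroes vector. The sketch below is illustrative: S and P are given as Python functions on bit tuples rather than circuits, and the toy instance is my own.

```python
def end_of_line(S, P, n):
    """Follow the line that starts at the all-zeroes vector. There is an
    edge x -> y exactly when S(x) = y and P(y) = x, so we stop at the
    first x with no outgoing edge. Up to 2^n steps in the worst case:
    the whole point is that no polynomial shortcut is known."""
    x = (0,) * n
    for _ in range(2 ** n):
        y = S(x)
        if y == x or P(y) != x:        # no outgoing edge from x
            return x
        x = y
    return x

# Toy instance on 3 bits: a single line 000 -> 001 -> 010 -> 011, with
# every other vertex isolated. S and P fix any vertex that has no
# successor/predecessor.
line = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]
succ = {line[i]: line[i + 1] for i in range(3)}
pred = {line[i + 1]: line[i] for i in range(3)}
S = lambda x: succ.get(x, x)
P = lambda x: pred.get(x, x)
end = end_of_line(S, P, 3)
```

On this instance the walk reaches the sink 011 after three steps; on a worst-case instance the line can have exponential length.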
The fact that it's actually complete for PPAD requires us to establish two things: that it's a member of PPAD (you can reduce it to END OF LINE) and that you can reduce END OF LINE to the problem of finding an approximate Nash equilibrium. I focus on the first of these two. Notice that a proof of membership of PPAD necessarily embodies a proof that Nash equilibria do in fact exist, since we have already seen that PPAD is a class of total search problems. We reduce NASH to SPERNER, and from there to END OF LINE. So, what is SPERNER? It's a total search problem based on Sperner's lemma. To find out about Sperner's lemma, consider the following sequence of diagrams.
Given a game for which we seek a Nash equilibrium, pretend that the domain of its mixed-strategy profiles is triangular. OK, so it's not really a triangle, but topologically it's a simplex, and our discussion with respect to triangles can be made to extend to simplices. We can divide it up into tiny triangles as shown in the picture. Then, define a "preference function" that takes any mixed-strategy profile, and maps it to a new one, constructed by letting players move slightly towards strategies that they would prefer over their current ones. Note that such a function has a fixpoint at a Nash equilibrium, but not elsewhere. If we are not at a Nash equilibrium, draw an arrow in the direction that the function takes the mixed-strategy profile.
We can colour the points where the function has been computed, according to the direction of the arrows. If we give the 3 outermost (extremal) points of the triangle the 3 colours red, green and blue, then each point with an arrow gets coloured with a colour of an extreme point that it is moving away from. Note that according to this rule, the points on the edge between the outermost red and green vertices will be either red or green. Similarly for points on edges between other vertices.
The icon at the top right-hand side of the picture indicates how the colours are chosen: a downward arrow is red, an up-left arrow is green and an up-right arrow is blue.
Intuitively, we would expect fixpoints of the preference function to lie in the vicinity of small triangles whose 3 vertices get 3 different colours. This is because the points are being dragged in 3 different directions, and by continuity, if we triangulate at a sufficiently fine resolution, we should converge to a fixpoint. In this diagram there are 3 small triangles with vertices of all colours, marked with black spots.
Let's try doing the triangulation on a finer scale. Here's a picture of what you might end up with when you compute the preference function at all points...
Notice that once again there are small triangles with vertices of all different colours! Call these the panchromatic triangles. Sperner's Lemma predicts that such triangles always exist.
Let's try again at a finer scale...
Again, we find panchromatic triangles — nine of them here!
Trying once more at an even finer scale, there still exist panchromatic triangles. You can count 17 in the picture if you look carefully.
Sperner's Lemma
Suppose a simplex has each of its extremal vertices coloured a different colour. Suppose that we subdivide it into smaller simplices and colour the vertices of the subsimplices subject to the following rule: if a vertex of a subsimplex lies in the convex hull of a subset of the extreme points, then its colour must be one of the colours of that subset. Then at least one subsimplex must be panchromatic, that is, its vertices have all different colours.
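The lemma is easy to check by brute force on small instances. The sketch below (with an example colouring of my own, chosen only to satisfy the boundary rule) triangulates the triangle with corners (0,0), (N,0), (0,N) and scans for panchromatic subtriangles; Sperner's Lemma says the scan must succeed, and in fact the number found is always odd.

```python
N = 8   # triangulation of the triangle with corners (0,0), (N,0), (0,N)

def colour(i, j):
    """An example colouring obeying Sperner's boundary rule: corner (0,0)
    gets 0, (N,0) gets 1, (0,N) gets 2, and each side uses only the
    colours of its two corners. Interior vertices may be coloured
    arbitrarily; this particular rule is just for illustration."""
    if j == 0:                      # bottom side: colours {0, 1}
        return 0 if i < N else 1
    if i == 0:                      # left side: colours {0, 2}
        return 0 if j < N else 2
    if i + j == N:                  # hypotenuse: colours {1, 2}
        return 1 if i > j else 2
    return (i + j) % 3              # interior: anything goes

def panchromatic_triangles():
    found = []
    for i in range(N):
        for j in range(N - i):
            up = [(i, j), (i + 1, j), (i, j + 1)]
            if {colour(*v) for v in up} == {0, 1, 2}:
                found.append(up)
            if i + j <= N - 2:
                down = [(i + 1, j), (i, j + 1), (i + 1, j + 1)]
                if {colour(*v) for v in down} == {0, 1, 2}:
                    found.append(down)
    return found

tris = panchromatic_triangles()
```

Changing the interior colouring rule changes where the panchromatic triangles sit, but never eliminates them all: that is exactly the content of the lemma.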
The proof of Sperner's Lemma is central to one's understanding of why Nash equilibria exist, and in particular, why approximate Nash equilibria owe their existence to the principle that every graph with in/out-degree 1 and a source, has a degree-1 vertex other than the source. So, what is the relationship between such graphs, and Brouwer functions? Read on...
Proof of Sperner's Lemma
Given a convex domain that has been triangulated as described in the statement, construct a graph as follows. Each triangle (or in higher dimension, subsimplex) is a vertex of the graph. If 2 of these subsimplices are adjacent and share a facet that itself has all different coloured vertices, then there is an edge between the corresponding vertices. The colouring can give the facet an orientation and this determines the direction of the edge. But, don't worry too much about the orientation.
Now, let's just stick to 2 dimensions here, and claim without proof that the idea generalises.
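In 2 dimensions the path-following argument can be carried out directly in code. The sketch below uses an example colouring of my own that satisfies the boundary rule: call an edge whose endpoints are coloured 0 and 1 a "door". A triangle containing colours 0 and 1 has exactly two doors unless it is panchromatic, in which case it has exactly one; so entering through a boundary door and repeatedly leaving by the other door traces a path that can only get stuck at a panchromatic triangle.

```python
N = 8   # triangulate the triangle with corners (0,0), (N,0), (0,N)

def colour(i, j):
    # Example colouring obeying Sperner's boundary rule: corner (0,0)
    # gets 0, (N,0) gets 1, (0,N) gets 2; each side uses only its two
    # corner colours; interior vertices are coloured arbitrarily.
    if j == 0:
        return 0 if i < N else 1
    if i == 0:
        return 0 if j < N else 2
    if i + j == N:
        return 1 if i > j else 2
    return (i + j) % 3

def triangles():
    for i in range(N):
        for j in range(N - i):
            yield ((i, j), (i + 1, j), (i, j + 1))              # upward
            if i + j <= N - 2:
                yield ((i + 1, j), (i, j + 1), (i + 1, j + 1))  # downward

# Map each door (a {0,1}-coloured edge) to the triangles containing it.
door_to_tris = {}
for t in triangles():
    for k in range(3):
        u, v = t[k], t[(k + 1) % 3]
        if {colour(*u), colour(*v)} == {0, 1}:
            door_to_tris.setdefault(frozenset((u, v)), []).append(t)

# With this colouring the bottom side reads 0,0,...,0,1, so it has exactly
# one boundary door. Enter there and keep leaving by the other door; the
# walk cannot revisit a triangle or escape through another boundary door,
# so it must halt at a panchromatic triangle.
entered = frozenset(((N - 1, 0), (N, 0)))
tri = door_to_tris[entered][0]
for _ in range(2 * N * N):              # safety bound: more than N^2 triangles
    doors = [e for e, ts in door_to_tris.items() if tri in ts]
    if len(doors) == 1:                 # only the entry door: panchromatic
        break
    exit_door = next(e for e in doors if e != entered)
    tri = next(t for t in door_to_tris[exit_door] if t != tri)
    entered = exit_door
```

This is the undirected version of the walk; the orientation mentioned above would make each step of it a directed edge, turning the walk into a line of the END OF LINE kind.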
As it happens, this proof is intimately related to Scarf's 1967 algorithm for computing approximate Nash equilibria; this algorithm and variants thereof have often been applied in practice. Scarf's algorithm works by following the same kind of paths through the domain that are promised in the above way by Sperner's lemma. It was motivated by the desire for computational efficiency, and while we now know it is exponential-time in the worst case, it is usually of course much better than brute-force search.
And Finally... Why do we believe that PPAD-complete problems are hard?
Last modified by Paul Goldberg, 2.1.12.