The standard textbook on NP-completeness is:
Michael Garey and David Johnson: Computers and Intractability - A Guide to the Theory of NP-completeness; Freeman, 1979.
David Johnson also runs a column in the journal Journal of Algorithms (in the HCL; there is an on-line bibliography of all issues)
On the Web the following sites may be of interest:
http://www.nada.kth.se/~viggo/problemlist/compendium.html
Or trying giving `NP-complete' or `NP and complete' as a search term to
http://liinwww.ira.uka.de/searchbib/index>
(This is basically a bibliography database, but, you can click on the `on-line papers' button to list electronically readable full texts).
The Center for Discrete Mathematics and Theoretical Computer Science (DIMACS) gives links to technical papers, abstracts, and other information concerning algorithms, approximation techniques and properties of NP-complete problems. Other general algorithmics and complexity related sites may be found at:
The Electronic Colloquium on Computational Complexity (ECCC)
A number of relevant journals are available on-line through the University Library Web Pages (these are only accessible to members of Liverpool University). The following journals are all available and publish research papers in the general areas of Algorithmics and Complexity Theory:
Number: 0
Name: Wombat Eating Assignment (WEA) [DU0] 2
Input: A set W of n wombats; a forest T of m eucalyptus trees (m>=n); a habitation mapping, mu:W-> subsets of T, such that for each wombat, w, mu (w) defines the set of eucalyptus trees, t in T, in which the wombat lives.
Question: Is there a mapping, E, from the set of wombats to the set of trees which satisfies both:
Comments: Can be solved by efficient methods using (0,1)-network flow maximisation methods applied to bipartite graphs, i.e. it's a matching problem.
The first line gives the unique identifying number for the problem (as described above). The second line gives the name of the problem and (if there is one) its usual abbreviation. These are how the problem is usually referred to in research papers, textbooks, etc. If you are looking for material on the Web, then these should give reasonable terms to supply to search engines. The code in square brackets is a reference to the classification in Garey and Johnson's book (with the exceptions of Problem 14, 86, 87, 88).
The next part of the description indicates what a typical instance (i.e. input) consists of: notice, as in the example above, that this will usually consist of a number of distinct objects which must be represented in trying to solve the problem: sometimes these can be quite involved structures, such as graphs, sets, mappings, logical expressions, etc; and, sometimes quite simple forms such as a single integer. The main part of a problem definition is always formulated as a question. This is the core of the decision problem description and defines precisely what property of the input instance must be determined in solving it. Thus, in the example given, one is trying to decide if a given input instance is such that a mapping from wombats to trees, satisfying certain conditions is possible. The final part (which is not always present) gives some (not necessarily useful) comments regarding the problem.
Notice that one way in which the WEA problem could be solved is by exhaustively considering each distinct way of mapping from single wombats to single trees until either one found an assignment that met the required conditions given in the Question; or one had no further mappings left to consider. Clearly the former case would lead to the answer true being returned; the latter to the answer false being given.
Name: 3-Satisfiability (3-SAT) [LO2] 2
Input: A set of m clauses - C_{1} ,C_{2},...,C_{m} - over a set of n Boolean valued variables X_{n}=< x_{1},x_{2},...,x_{n}>, such that each clause depends on exactly three distinct variables from X_{n}. A clause being a Boolean expression of the form y_{i}+y_{j}+y_{k} where each y is of the form x or -x (i.e. negation of x) with x being some variable in X_{n}. For example if n=4 and m=3 a possible instance could be the (set of) Boolean expressions: C_{1}=(x_{1}+(-x_{2})+(-x_{3})); C_{2}=(x_{2}+x_{3}+(-x_{4})); C_{3}=((-x_{1})+x_{3}+x_{4});
Question: Can each variable x_{i} of X_{n} be assigned a Boolean value alpha_{i} in such a way that every clause evaluates to the Boolean result true under the assignment < x_{i}:=alpha_{i}:1<=i<=n>?
In the example instance, the assignment < x_{1}:=true;x_{2}:=false;x_{3}:=true; x_{4}:=false> is such that each of the three clauses takes the value true.
Comments: For reasons that will become clearer at the end of the course, this problem had to be Number 1. Some relevant material concerning this problem might be found in the special issue (Volume 81, 1996) of the journal Artificial Intelligence that is available in the Library.
Name: Graph 3-Colourability (3-COL) [GT4] 1
Input: An n-node undirected graph G(V,E) with node set V and edge set E.
Question: Can each node of G(V,E) be assigned exactly one of three colours - Red, Blue, Green - in such a way that no two nodes which are joined by an edge, are assigned the same colour?
Name: Monochromatic triangle [GT6] 2
Input: An n-node undirected graph G(V,E) with node set V and edge set E.
Question: Can the edges, E, of G be partitioned into two disjoint sets E_{1} and E_{2}, in such a way that neither of the two graphs G_{1}(V,E_{1}) or G_{2}(V,E_{2}) contains a triangle, i.e. a set of three distinct nodes u,v,w such that {u,v}, {u,w}, {v,w} are all edges?
Name: Clique [GT19] 1
Input: An n-node undirected graph G(V,E) with node set V and edge set E; a positive integer k with k<=n.
Question: Does G contain a k-clique, i.e. a subset W of the nodes V such that W has size k and for each distinct pair of nodes u, v in W, {u,v} is an edge of G?
Name: Bipartite Subgraph [GT25] 2
Input: An n-node undirected graph G(V,E) with node set V and edge set E; a positive integer k with k<=|E|.
Question: Is there a subset, F of the edges of G, having size at least k and such that the graph H(V,F) is bipartite?
Comments: G(V,E) is bipartite if the nodes can be partitioned into two disjoint sets U and W such that every edge of G connects a node in U to a node in W, i.e. no two nodes in U (resp. W) form an edge of G.
Name: Vertex Cover [GT1] 1
Input: An n-node undirected graph G(V,E) with node set V and edge set E; a positive integer k with k<=n.
Question: Is there a subset W of V having size at most k and such that for every edge {u,v} in E at least one of u and v belongs to W?
Name: Chromatic Index (Edge Colouring) [OPEN5] 3
Input: An n-node undirected graph G(V,E) with node set V and edge set E; a positive integer k with k<=|E|.
Question: Can the edges of G be assigned exactly one of k colours in such a way that no two edges which have a common node as an endpoint are assigned the same colour?
Comments: Vizing's Theorem from Graph Theory shows that the only `hard' case for this problem is when k is equal to the maximum degree of G. The degree of a node, v, is the number of edges in the graph with v as an end-point; the degree of a graph is the maximum degree of any node in the graph. Chromatic Index was shown to be NP-complete soon after the 1979 edition of Garey and Johnson went to press, hence the OPEN categorisation.
The classical paper on sequential methods is:
Name: Multiprocessor Scheduling [SS8] 4
Input: A set, T, of tasks and for each task t in T a positive (integer) running time len(t). A positive integer D, called the deadline.
Question: Is there a 2-processor schedule for the tasks that completes within the deadline, D, i.e. is there a function sigma:T->N such that both of the following hold:
Comments: This is the simplest avatar of a very large number of NP-complete processor scheduling problems and, unsurprisingly given its practical applications in multiprogramming Operating Systems on small parallel systems, there is an enormous range of literature on approximation and heuristic techniques to tackle it, see e.g. relevant sections of this bibliography. The formal framework given by (a) and (b) can be interpreted as meaning: a) at any given time at most two tasks are being executed; b) every task has been completed by the deadline D.
Name: Comparative Divisibility [AN4] 3
Input: A (strictly increasing) sequence A=< a_{1},a_{2},...,a_{n}> and a (strictly increasing) sequence B=< b_{1},b_{2},...,b_{m}> of positive integers.
Question: Is there an integer, c, such that Divides(c,A)> Divides(c,B), where Divides(x,Y) (Y being a sequence of positive integers) is the number of elements, y in Y, for which x is an exact divisor of y?
Comments: You may think that this has an obvious fast algorithm, and, indeed the algorithm in question is obvious: what it is not is efficient. Consider: how many bits are needed to store the input data (assuming, without loss of generality, that a_{n}>=b_{m})? How many steps, however, is this `obvious method' taking in the worst-case? It is important to realise that representing integer values in unary is not considered to be a `reasonable' approach (the number 2^{50}-1 requires 2^{50} digits in unary but only 50 digits in binary).
Name: Cyclic Ordering [MS2] 3
Input: A finite set, A, and a collection, C, of ordered triples (a,b,c) of distinct elements from A.
Question: Is there a one-to-one mapping f:A->{1,2,3,...,|A|} (i.e. a function which maps each element of A to a number between 1 and |A| with no two distinct elements of A being mapped to the same number) such that for each (a,b,c) in C one of
f(a) < f(b) < f(c) ; f(b) < f(c) < f(a) ; f(c) < f(a) < f(b)
holds?
Comments: For any triple (a,b,c) there are 6 (six) possible orderings that could result for ((a),f(b),f(c)). The question being asked is whether there is a choice of function that forces every specified triple into one of three `legal' orderings. Garey and Johnson has a typographical error in describing this problem, whereby each triple `in A' is referred to. Obviously `in C' is intended.
Name: Quadratic Diophantine Equations [AN8] 3
Input: Positive integers a, b, and c.
Question: Are there two positive integers x and y such that (a*x*x)+(b*y)=c?
Comments: The comments regarding Problem 9 (Comparative Divisibility) are also pertinent with respect to this problem. Again, there is an `obvious' algorithm that, on the surface, appears to be efficient and is seen not to be so only once one compares the input size (space needed to represent the input data) to the actual computation time in the worst-case.
Name: Maximum 2-Satisfiability [LO5] 3
Input: A set of m clauses C_{1},C_{2},...,C_{m} over n Boolean valued variables X_{n}, where each clause depends on two distinct variables from X_{n}; a positive integer k with k<=m.
Question: Is there an assignment of Boolean values to the variables X_{n} such that at least k distinct clauses take the value true under the assignment?
Comments: For relevant definitions see under Problem 1. 2-SAT, a natural variant of the problem 3-SAT described in Problem 1, can be solved in O(m) steps. The simple variation described here (whereby one asks whether at least some number of clauses can be made true simultaneously) is much more difficult.
Name: Register Sufficiency [PO1] 5
Input: A directed, acyclic graph G(V,A) in which each node has at most two out-going edges; a positive integer k.
Question: Is there a k or fewer register `computation' for G, i.e. is there an ordering < v_{1},v_{2},...,v_{n}> of the (n) nodes of G and a sequence < S_{0};S_{1};...;S_{n}> of subsets of V which satisfy all of the following:
Comments: The background to this problem comes from developing Code Generation methods in compilers for High-Level Languages. On early architectures, performing arithmetic operations with both operands stored in registers was significantly faster than fetching an operand from main memory. Such machines, however, had only a small (16-32) number of fast registers. Thus in generating assembly code to represent the computation of a lengthy arithmetic expression, it was necessary to use registers efficiently. A common first stage in compiling such expressions was to represent the computation as a (so-called) `straight-line program' which in turn could be modelled as a directed acyclic graph. The object of the Register Sufficiency Problem is to determine if such a `straight-line program' can be evaluated using only the specified number of available registers.
Name: Central Slice of Half-Clique 2
Input: An 2n-node undirected graph G(V,E) with node set V and edge set E.
Question: Does either of the following hold true of G(V,E)
Comments: For the definition of `k-clique' see Problem 4 above. This specific variant of the CLIQUE problem post-dates Garey and Johnson's 1979 text by 5 years. Its classification as NP-complete is, originally, proved in:
Name: Decision Tree [MS15] 5
Input: A Boolean logic function, f, of n variables, X_{n}, described by its 1-points, i.e. the set of assignments, alpha=< a_{1},...,a_{n}> such that f(a_{1},...,a_{n})=1; a positive integer K.
Question: Is there a decision tree for f that has average path length at most K?
Comments: A decision tree is a binary tree in which each non-leaf node is labelled with a variable from X_{n}, each leaf node is labelled either 0 or 1, the edge from a non-leaf node to its left child is labelled 0, that to its right child is labelled 1. For a given assignment of Boolean values to the variables X_{n} a path can be traced starting from the root of the decision tree and following the edges marked with the value of the variable labelling each node encountered. The path terminates at a leaf node whose associated label gives the value of the function. The decision tree computes a Boolean function f(X_{n}) if for every assignment, alpha, to the variables the path traced from the root under alpha terminates in a leaf labelled f(alpha). The average path length of a binary tree with t leaf nodes and root v is: sum from {w:w is a leaf} |Path from x to w|/t.
This problem is a simpler (but still NP-complete) version of the form given in Garey and Johnson. For relevant variations and potential heuristic approaches, the papers
Name: Shortest Common Superstring [SR9] 3
Input: A finite set R={r_{1},r_{2},...,r_{m}} of binary strings (sequences of 0 and 1); positive integer k.
Question: Is there a binary string w of length at most k such that every string in R is a substring of w, i.e. for each r in R, w can be decomposed as w=w_{0}rw_{1} where w_{0}, w_{1} are (possibly empty) binary strings?
Comments: General problem allows more than two symbols (i.e. not just binary), but this simpler version remains NP-complete.
Name: Longest Circuit [ND28] 2
Input: n-node undirected graph G(V,E); positive integer k.
Question: Does G contain a simple cycle containing at least k nodes?
Comments: Special cases of this are the famous Travelling Salesman Problem and Hamiltonian Circuit Problem. The latter corresponds to the cases k=n; the former to the case with each graph edge being weighted and also having k=n.
Name: Kernel [GT57] 2
Input: n-node directed graph G(V,A).
Question: Does G possess a kernel, i.e. a subset W of the nodes V such that no two nodes in W are joined by an edge in A and such that for each node v in V-W there is a node w in W for which (w,v) is an edge in A?
Name: k-Closure [GT58] 2
Input: n-node directed graph G(V,A); positive integer k<=n.
Question: Is there a subset W of V having size at most k and such that for all edges (u,v) in A either u is in W or v is not in W.
Name: Bandwidth [GT40] 3
Input: n-node undirected graph G(V,E); positive integer k<=n.
Question: Is there a linear ordering of V with bandwidth at most k, i.e. a one-to-one function f:V->{1,2,...,n} such that for all edges {u,v} in G, |f(u)-f(v)|<=k?
Comments: If you need to be reminded of what a `one-to-one' function is, see Problem 10.
Name: Maximum Leaf Spanning Tree [ND2] 4
Input: n-node undirected graph G(V,E); positive integer k<=n.
Question: Does G have a spanning tree in which at least k nodes have degree 1.
Comments: A spanning tree of G(V,E) is a tree T(V,F) containing all the nodes of G and whose edges, F, are a subset of the edges in E. For the definition of `degree of a node' see Problem 7. The paper,
Other papers of interest are given here
Name: Independent Set [GT20] 1
Input: n-node undirected graph G(V,E); positive integer k<=n.
Question: Does G have an independent set of size at least k, i.e. a subset W of at least k nodes from V such that no pair of nodes in W is joined by an edge in E?
Name: Degree Constrained Spanning Tree [ND1] 4
Input: n-node undirected graph G(V,E); positive integer k<=n.
Question: Does G have a spanning tree in which no node has degree greater than k?
Comments: For definitions see Problem 21 and Problem 7.
Name: Hamiltonian Path [GT39] 1
Input: n-node undirected graph G(V,E).
Question: Is there a simple path of edges in G that contains every node in V, and thus contains exactly n-1 edges?
Comments: Material regarding the Hamiltonian circuit problem may be found to be relevant.
Name: Graph Partitioning [ND14] 2
Input: 2n-node undirected graph G(V,E); positive integer k<=|E|.
Question: Can the nodes of G be partitioned into 2 disjoint sets U and W each of size n and such that the total number of distinct edges in E that connect a node u in U to a node w in W is at most k?
Comments: There has long been interest in this problem from the area of printed circuit board (and latterly VLSI chip) design, in which areas this optimisation problem provides a natural formulation for the task of minimising the number of wires crossing between different boards (or chips). The famous approximating heuristic of Kernighan and Lin (Bell Systems Tech. Jnl., 49, (1970), pp. 291-307: BSTJ can found in the HCL) appeared in this context. A survey and bibliography of algorithms for special cases is available.
Name: Cubic Subgraph [GT32] 3
Input: n-node undirected graph G(V,E).
Question: Is there a (non-empty) subset F of the edges of G such that every node in the graph H(V,F) has either degree 3 or degree 0.
Comments: For the definition of `degree of a node' see Problem 7.
Name: Travelling Salesman [ND22] 3-4
Input: A set C of n cities {c_{1},...,c_{n}}; for each pair of cities (c_{i},c_{j}) (1<=i< j<=n) a positive integer distance d_{i,j}; a positive integer B.
Question: Is there an ordering < pi(1),pi(2),...,pi(n)> of the n cities such that the value sum from i=1 to n-1 d_{pi(i),pi(i+1)}+d_{pi(n),pi(1)} is no more than B?
Comments: In effect what this problem is asking is whether there is a tour of the given collection of cities that visits each city exactly once and takes up total distance no more than B. There is a huge volume of literature concerning approximation methods, search heuristics, special case methods, etc for this very well studied problem. A bibliography has been compiled.
Name: Steiner Tree in Bipartite Graphs [ND12a] 3
Input: Bipartite graph B(V,W,E); positive integer k.
Question: Is there a subtree of B(V,W,E) that includes all of the nodes of V and has at most k edges?
Comments: For the definition of bipartite graph (on two disjoint sets of nodes V and W) see Problem 5.
Name: Bounded Diameter Spanning Tree [ND4] 4
Input: n-node undirected graph G(V,E); for each edge, {u,v} a positive integer weight w(u,v); positive integer B.
Question: Does G have a spanning tree T such that the total sum of the weights of edges in T is at most B and no simple path in T contains more than 5 edges?
Comments: For definitions see Problem 21.
Name: Optimal Linear Arrangement [GT42] 3
Input: n-node undirected graph G(V,E); positive integer k<=n.
Question: Is there a one-to-one function f:V->{1,2,...,n} such that sum from {{u,v} in E} |f(u)-f(v)|<=K?
Comments: For definitions see Problem 10.
Name: Dominating Set [GT2] 2
Input: n-node undirected graph G(V,E); positive integer k<=n.
Question: Does G contain a dominating set of size at most k, i.e. a subset W of V containing at most k nodes and such that for every node u in V-W (i.e. in V but not in W) there is a node w in W such that {u,w} is an edge of G?
Name: Path with Forbidden Pairs [GT54] 4
Input: n-node directed graph G(V,A); two distinct nodes s and t belonging to V; finite collection C={(a_{1},b_{1}),...,(a_{r},b_{r})} of pairs of nodes from V.
Question: Is there a directed path from the node s to the node t in G that contains at most one node from each pair of nodes in the collection C?
Comments: Several simplifications of this problem remain NP-complete: requiring G to be acyclic; requiring the collection C to contain only directed edges from A (rather than arbitrary pairs of nodes); requiring all the pairs to be disjoint.
Name: Oriented Diameter [GT64] 4
Input: n-node undirected graph G(V,E); positive integer k<=n.
Question: Can a direction be placed on each edge {u,v} of E in such a way that the resulting directed graph H(V,A) is such that both of the following hold:
Name: Rural Postman [ND27] 3-4
Input: n-node undirected graph G(V,E); subset F of the edges E; positive integer k<=|E|
Question: Is there a (not necessarily simple) cycle of in G that contains every edge in F and has at most k edges in total?
Comments: A simple cycle in a graph is one in which each node visited in the cycle is visited exactly once. A non-simple cycle allows nodes to be visited more than once (although, obviously, each edge can only occur once). For example in the 5 node graph with edges ({1,2};{1,3};{1,4};{1,5};{2,3};{2,4};{2,5};{3,4}), the cycle 1-> 2-> 3-> 1 is a simple cycle; the cycle 1-> 2-> 4-> 3-> 2-> 5-> 1 is a non-simple cycle since 2 is visited twice (notice that in neither case is any edge used more than once).
Name: Longest Path [ND29] 2
Input: n-node undirected graph G(V,E); nodes s and t in V; positive integer k.
Question: Is there a simple path between s and t in G that contains at least k edges?
Comments: The similarity to Problem 17 should be noted.
Name: 3-Dimensional Matching (3DM) [SP1] 3
Input: 3 disjoint sets X, Y, and Z each comprising exactly n elements; a set M of m triples {(x_{i},y_{i},z_{i}):1<=i<=m} such that x_{i} is in X, y_{i} in Y, and z_{i} in Z, i.e. M is a subset of XxYxZ.
Question: Does M contain a matching, i.e. is there a subset Q of M such that |Q|=n and for all distinct pairs of triples (u,v,w) and (x,y,z) in Q it holds that u/=x and v/=y and w/=z.
Comments: The variant 2-dimensional matching in which 2 disjoint sets X and Y form the basis of a set of pairs, can be solved by a number of fast methods.
Name: Set Splitting [SP4] 3
Input: A finite set S; A collection C_{1},...,C_{m} of subsets of S.
Question: Can S be partitioned into two disjoint subsets - S_{1} and S_{2} - such that for each set C_{i} it holds that C_{i} is not a subset of S_{1} and C_{i} is not a subset of S_{2}?
Name: Set Packing [SP3] 3
Input: A collection C=(C_{1},...,C_{m}) of finite sets; a positive integer k<=m.
Question: Are there k sets - D_{1},...,D_{k} - from the collection C such that for all 1<=i< j<=k, D_{i} and D_{j} have no common elements?
Name: Exact Cover by 3-Sets (X3C) [SP2] 3
Input: A finite set X containing exactly 3n elements; a collection C of subsets of X each of which contains exactly 3 elements.
Question: Does C contain an exact cover for X, i.e. a sub-collection of 3-element sets D=(D_{1},...,D_{n}) such that each element of X occurs in exactly one subset in D?
Name: Minimum Cover [SP5] 3-4
Input: A finite set S; A collection C=(C_{1},...,C_{m}) of subsets of S; a positive integer k<=m.
Question: Does C contain a cover for S comprising at most k subsets, i.e. a collection D=(D_{1},...,D_{t}), where t<=k, each D_{i} is a set in C, and such that every element in S belongs to at least one set in D?
Name: Partition [SP12] 3
Input: Finite set A; for each element a in A a positive integer size s(a).
Question: Can A be partitioned into 2 disjoint sets A_{1} and A_{2} in a such a way that the sum of the sizes s(x) of elements x in A_{1} is exactly the same as the sum of the sizes s(y) of the elements y in A_{2}.
Comments: It should be noted that it is not required that A_{1} and A_{2} contain equal numbers of elements, although even with this condition the problem is still NP-complete.
Name: Subset Sum [SP13] 3
Input: Finite set A; for each element a in A a positive integer size s(a); a positive integer K.
Question: Is there a subset B of A such that the sum of the sizes, s(x), of the elements x in B is exactly equal to K?
Name: Comparative Containment [SP10] 4
Input: A finite set X; 2 collections R=(R_{1},...,R_{k}) and S=(S_{1},...,S_{m}) each of which is a set of subsets of X; for each R_{i} in R, a positive integer weight w(R_{i}); for each S_{i} in S a positive integer weight w(S_{i});
Question: Is there a subset Y of X such that: if R_{Y} is the set of subsets, R_{i} of R having Y as a subset of R_{i} and S_{Y} is the set of subsets, S_{i} of S having Y as a subset of S_{i} then the total weight of the sets in R_{Y} is at least the total weight of the sets in S_{Y}.
Name: Minimum Test Set [SP6] 3
Input: A finite set S; A collection C=(C_{1},...,C_{m}) of subsets of S; a positive integer k<=m.
Question: Is there a sub-collection D=(D_{1},...,D_{t}) of the sets in C such that: t<=k and for each distinct pair of elements u, v in S there is a set D_{u,v} in D that contains exactly one of u and v?
Name: Minimum Sum of Squares [SP19] 3-4
Input: A set A of n elements; for each element a in A a positive integer size s(a); positive integers k<=n and J.
Question: Can A be partitioned into k disjoint sets A_{1},...,A_{k} such that sum from i=1 to k ( sum from {x in A_{i}} s(x))^{2}<=J?
Name: 3-Partition [SP15] 3
Input: A set A of 3m elements; a positive integer bound B; for each element x in A a positive integer size s(x) that satisfies B/4< s(x)< B/2, and is such that the sum of the sizes of elements in A is exactly mB.
Question: Can A be partitioned into m disjoint sets A_{1},...,A_{m} such that each A_{i} contains exactly 3 elements of A and each A_{i} has total size equal to B?
Name: Subset Product [SP14] 3
Input: Finite set A; for each element a in A a positive integer size s(a); a positive integer K.
Question: Is there a subset B of A such that the product (i.e. result of multiplying together all) of the sizes, s(x), of the elements x in B is exactly equal to K?
Comments: There is a subtle technical distinction between this and Problem 42: the former case has a `pseudo-efficient' algorithm obtained by allowing numbers to be represented in unary; unless all NP-complete problems can be solved by fast algorithms, however, the Subset Product Problem, cannot be solved by `efficient' methods using even this unreasonable input representation.
Name: Bin Packing [SR1] 3
Input: A finite set U of m items; for each item u in U a positive integer size s(u); positive integers B (called the bin capacity) and k<=m.
Question: Can U be partitioned into k disjoint sets U_{1},...,U_{k} such that for each U_{i} (1<=i<=k) the total sum of the sizes of the items in U_{i} does not exceed B?
Name: Hitting String [SR12] 3
Input: Finite set S={s^{1},...,s^{m}} each s^{i} being a string of n symbols over {0,1,*}.
Question: Is there a binary string x=x_{1}x_{2}...x_{n} of length n such that for each s^{j} in S, s^{j} and x agree in at least one position.
Name: Rectilinear Picture Compression [SR25] 4-5
Input: An n by n matrix M of 0s and 1s; a positive integer K.
Question: Can all of the 1-valued entries in M be covered by a collection of K or fewer rectangles, i.e. is there a sequence of 4-tuples (a_{i},b_{i},c_{i},d_{i}) (where for all 1<=i<=K, a_{i}<=b_{i} and c_{i}<=d_{i}) such that:
Comments: Despite the apparently complicated definition, this is quite a well-motivated problem. Its background is, as its name suggests, from image compression and the task being set can be interpreted as representing a 2-dimensional image using at most K `blocks' of information.
Name: Sequencing with Release Times and Deadlines [SS1] 4-5
Input: A set T of tasks; for each task t in T: a positive integer length len(t); a positive integer release time r(t); and a positive integer deadline d(t).
Question: Is there a one-processor schedule for the tasks, T, that satisfies the release time constraints and meets all of the deadlines, i.e. a one-to-one function sigma from the set of tasks to positive integers such that all of the following hold:
Comments: Another scheduling problem in the style of Problem 8, but in a single processor environment. The requirements of the schedule, sigma, are that it must: 1) assign a unique starting time to each task in the set (sigma is a one-to-one function); 2) two tasks cannot be running simultaneously (this is condition (a) above, which says that if a task t is scheduled to start after a task w - sigma(t)> sigma(w) - then the earliest time at which t can be scheduled is after w has completed, i.e. sigma(w)+len(w); 3) no task can start before its specified `release time' (condition (b) above); 4) every task has finished no later than its allotted deadline (condition (c) which states that the scheduled starting time of a task t plus the total amount of time that t takes to complete - len(t) - must be no greater than the deadline that is assigned for t, i.e. d(t)).
In common with other scheduling and resource management problems there is a large volume of research into heuristics and approximation methods for this problem. Some relevant references may be found in sections of this bibliography.
For a reminder of what a `one-to-one' function is, see Problem 10.
Name: Precedence Constrained Scheduling [SS9] 3-4
Input: A set T of tasks each of which has length 1 or length 2; a partial ordering << on the set of tasks, T; a positive integer deadline D.
Question: Is there a 2-processor schedule, sigma, for T that meets the overall deadline D and obeys the `precedence constraints' of the partial order <<, i.e. if t<< w then sigma(w)>=sigma(t)+len(t)?
Comments: For a definition of `2-processor schedule' refer to Problem 8. Recall that a partial order << on a set R is an ordering relation which satisfies: for all distinct s and t at most one of s<< t or t<< s holds; and for all distinct s, t, w in R if s<< t and t<< w then s<< w. The bibliography mentioned above may contain useful references.
Name: Quadratic Congruences [AN1] 3
Input: Positive integers a, b, and c.
Question: Is there a positive integer x whose value is less than c and is such that x^{2}==a(mod b), i.e. the remainder when x^{2} is divided by b is equal to a?
Comments: The comments made with respect Problem 9 and Problem 11 are also relevant with respect to this problem.
Name: Square-Tiling [GP13] 5
Input: A set C of n `colours'; a collection of tiles, T, each tile t being a 4-tuple < a,b,c,d> of colours where a is the colour at the top of the tile; b that on the right-hand side; c that at the bottom; and d that on the left-hand side of the tile; a positive integer k<=n.
Question: Is there a proper tiling of a k by k square using the tiles in T, i.e. an assignment of a tile A(i,j) in T to each ordered pair (i,j) with 1<=i<=k, 1<=j<=k such that both of the following hold:
Comments: Informally, a proper tiling is one in which two tiles which are next to each other are required to have the same colour on touching sides, i.e. the right-hand side of one is coloured the same as the left-hand side of its neighbour and similarly with vertically adjacent tiles: the lower side of one is coloured the same as the top side of its neighbour. Decision problems involving questions about tiling patterns tend to be extremely difficult. For the most general form of the question it can be proven that no algorithm at all exists to solve it, i.e. not even an extremely inefficient one. A tiling problem also constitutes the only known `natural' `hard' member of the following class of problems: given a positive integer n as input, determine the number of `objects' of `size' n having a particular property, e.g. the number of n-node graphs, or the number of n-node graphs with a Hamiltonian circuit.
Name: Crossword Puzzle Construction [GP14] 5
Input: A finite set of characters SIGMA={sigma_{1},...,sigma_{k}}; a finite set, W={w_{1},...,w_{2n}} each w_{i} being a sequence of n characters from SIGMA, i.e. a string.
Question: Can an n by n crossword puzzle be built using all of the 2n words (strings) in W, i.e. if C is an n by n table of `blanks', is there an assignment, f, that maps each entry (i,j) of C to some character in SIGMA in such a way that the word formed by taking the n consecutive characters in a row corresponds to a word in W; the word formed by taking the n consecutive characters in a column (reading from top-to-bottom) corresponds to a word in W?
Comments: The definition may look complicated but it isn't. Here is a simple example instance for which a crossword can be constructed. Let SIGMA={A,B,C,D,E,G,O}, n=3, and W={AGE,AGO,BEG,CAB,CAD,DOG}. A 3 by 3 crossword puzzle for this is given by the assignment C(1,1)=C; C(1,2)=A; C(1,3)=B; C(2,1)=A; C(2,2)=G; C(2,3)=E; C(3,1)=D; C(3,2)=O; C(3,3)=G. This results in the grid
CAB AGE DOG
whose correctness is self-evident.
Name: Disjunctive non-tautology [LO8] 2
Input: A A set of m products - P_{1},P_{2},...,P_{m} - over a set of n Boolean valued variables X_{n}=< x_{1},x_{2},...,x_{n}>, such that each product depends on exactly three distinct variables from X_{n}. A product being a Boolean expression of the form y_{i} AND y_{j} AND y_{k} where each y is of the form x or -x (i.e. negation of x) with x being some variable in X_{n}.
Question: Is there an assignment of Boolean values to the variables in X_{n} that results in every product taking the value false (equivalently 0)?
Name: Simultaneous incongruences [AN2] 3
Input: A set of n ordered pairs of positive integers {(a_{1},b_{1}),...,(a_{n},b_{n})} where a_{i}<=b_{i} for each 1<=i<=n.
Question: Is there a positive integer x such that: for each i, a_{i} does not equal the remainder when dividing x by b_{i}?
Comments: As with most number-theoretic problems, the comments regarding Problems 9, 11, and 53 apply.
Name: Betweenness [MS1] 3
Input: A finite set of size n, A; a set C of ordered triples, (a,b,c), of distinct elements from A.
Question: Is there a one-to-one function, f:A->{1,2,...,n} such that for each triple (a,b,c) in C it holds that either f(a)< f(b)< f(c) or f(c)< f(b)< f(a)?
Comments: If you need to be reminded of what a one-to-one function is then see Problem 10 to which this problem may appear similar.
Name: Minimum weight and/or graph solution [MS16] 4-5
Input: A directed acyclic graph G(V,A) having a single node s, with no incoming edges; a labelling, f(v) of each node having at least one out-going edge in G, as either an and-node or an or-node; a positive integer K.
Question: Is there a sub-graph H(W,B) of G(V,A), i.e. W is a subset of V and B is a subset of A satisfying all of the following:
Name: Fault-detection in directed graphs. [MS18] 3-4
Input: A directed, acyclic graph G(V,A) such that G has a unique node t with no out-going edges; I the set of nodes in V having no incoming edges; a positive integer K.
Question: Is there a `test set' of size at most K that can detect every `single fault' in G (N.B. not `every single fault' but every `single fault'), i.e. a subset T of the nodes in I such that
Name: Minimum Broadcast Time. [ND49] 3-4
Input: n-node undirected graph G(V,E); a subset V_{0} of the nodes in V.
Question: Can a message be `broadcast' from the base set V_{0} to all of the nodes in V in at most 4 steps, i.e. is there a sequence
V_{0};E_{1};V_{1};E_{2};V_{2};E_{3};V_{3}; E_{4};V_{4}
which satisfies all of the following:
Name: Disjoint Connecting Paths [ND40] 3-4
Input: n-node undirected graph G(V,E); a set of disjoint pairs of nodes {(s_{1},t_{1});...;(s_{k},t_{k})}.
Question: Does G contain k mutually node disjoint paths, P_{i}, with P_{i} being a path from s_{i} to t_{i}?
Comments: Mutually disjoint means that no two paths have any common nodes.
Name: Shortest Weight-Constrained Path [ND30] 3
Input: n-node undirected graph G(V,E); for each edge e in E a positive integer length len(e) and a positive integer weight w(e); specified nodes s and t in V; positive integers K and W.
Question: Is there a path from s to t in G that has both:
Comments: If all of the edges have the same length or all of the edges have the same weight, then this is simply the normal shortest-path problem for which various efficient algorithms exist, e.g. the Dynamic Programming method of Floyd that is covered in the course. Minimising a single measure, in this context of path lengths, is `easy', but attempting simultaneously to minimise two (or more) distinct measures is not.
Name: Minimum Maximal Matching [GT10] 3
Input: n-node undirected graph G(V,E); positive integers k<=|E|.
Question: Is there a subset, F, of at most k edges from E that forms a maximal matching in G, i.e. no two edges in F have a common endpoint and every edge of G that is not in F has a common endpoint with at least one edge of F?
Name: Partition into Triangles [GT11] 3
Input: (3n)-node undirected graph G(V,E).
Question: Can the nodes of G be partitioned into n disjoint sets - V_{1},...,V_{n} - each of which contains exactly 3 nodes and is such that for each V_{i}={u_{i},v_{i},w_{i}}, all three of the edges {u_{i},v_{i}}, {u_{i},w_{i}} and {v_{i},w_{i}} belong to E?
Name: Partition into Forests [GT14] 3
Input: n-node undirected graph G(V,E); positive integer k<=n
Question: Can the nodes of G be partitioned into t<=k disjoint sets V_{1},...,V_{t} - in such a way that for each V_{i} (1<=i<=t), the subgraph G_{i}(V_{i},E_{i}) induced by V_{i} contains no cycles, i.e. is a forest (set of trees)?
Comments: If G(V,E) is a graph, then the subgraph induced by a subset W of V is the graph H(W,F) whose edges, F, are formed by all the edges in E that connect two nodes in W.
Name: Partition into Cliques [GT15] 3
Input: n-node undirected graph G(V,E); positive integer k<=n
Question: Can the nodes of G be partitioned into t<=k disjoint sets - V_{1},...,V_{t} - in such a way that for each V_{i} (1<=i<=t), the subgraph G_{i}(V_{i},E_{i}) induced by V_{i} is a clique, i.e. a graph in which every pair of nodes is connected by an edge?
Comments: For definition of induced subgraph see Problem 66.
Name: Partition into Perfect Matchings [GT16] 3
Input: n-node undirected graph G(V,E); positive integer k<=n
Question: Can the nodes of G be partitioned into t<=k disjoint sets - V_{1},...,V_{t} - in such a way that for each V_{i} (1<=i<=t), the subgraph G_{i}(V_{i},E_{i}) induced by V_{i} is a perfect matching, i.e. a graph in which every node is the endpoint of exactly one edge?
Comments: For definition of induced subgraph see Problem 66.
Name: Covering by cliques [GT17] 3
Input: n-node undirected graph G(V,E); positive integer k<=n
Question: Are there t<=k subsets - V_{1},...,V_{t} of V - such that both of the following hold:
Comments: For relevant definitions see Problems 66 and 67.
Name: Degree-bounded connected subgraph [GT26] 3
Input: n-node undirected graph G(V,E); non-negative integer d<=n; positive integer k<=|E|.
Question: Is there a subset F of the edges in E having size at least k and such that subgraph G(V,F) of G is connected and has no node of degree greater than d?
Comments: For the definition of `degree of a node' see Problem 7. A graph is connected if for every pair of nodes v and w there is a path connecting v and w in the graph.
Name: Uniconnected subgraph [GT30] 3
Input: n-node directed graph G(V,A); positive integer k<=|A|.
Question: Is there a subset B of the edges in A having size at least k and such that the subgraph H(V,B) of G has at most one directed path between any pair of nodes in V?
Name: Minimum k-connected subgraph [GT31] 3
Input: n-node undirected graph G(V,E); positive integers k<=n and B<=|E|.
Question: Is there a subset F of the edges E having size at most B and such that the subgraph H(V,F) of G is k-connected, i.e. remains connected after removing any set W of a most k-1 nodes and their adjacent edges?
Comments: For a reminder of what a `connected graph' is see Problem 70.
Name: Minimum cut linear arrangement [GT44] 3
Input: n-node undirected graph G(V,E); positive integer k.
Question: Is there a one-to-one function f:V->{1,2,...,n} such that for all i, with 1< i< n, the total number of edges {u,v} in E for which f(u)<=i< f(v) is at most k?
Comments: To rediscover what a `one-to-one' function is, see Problem 10.
Name: Consecutive Sets [SR18] 4
Input: Finite alphabet SIGMA of t symbols; collection C={SIGMA_{1},SIGMA_{2},...,SIGMA_{n}} of subsets of SIGMA; positive integer K.
Question: Is there a string, w, (sequence of symbols) over SIGMA such that |w|< =K and for each 1< =i< =n, all of the symbols in SIGMA_{i} appear in a consecutive block of |SIGMA_{i}| symbols in the string w?
Comments: N.B. minor typo. in Garey and Johnson, W written instead of w.
Name: Multiple choice matching [GT55] 3
Input: n-node undirected graph G(V,E); partition of E into t disjoint sets - E_{1},...,E_{t}; positive integer k.
Question: Is there a subset F of E having size at least k and satisfying both of the following:
Name: Path distinguishers [GT60] 3
Input: n-node directed, acyclic graph G(V,A); specified nodes s and t in V; positive integer k<=|A|.
Question: Is there a subset B of the edges A having size at most k and such that for any pair of distinct paths P and Q from s to t in G, there is an edge in B that is in exactly one of P and Q?
Name: Nestril-Rodl dimension [GT62] 5
Input: n-node undirected graph G(V,E); positive integer k<=n.
Question: Is there a one-to-one function
f:V-> {( a_{1}, a_{2},...,a_{k}) : 1 <= a_{i}<= n }
such that for all pairs of nodes u and v in V, {u,v} is an edge of G if and only if f(u) and f(v) disagree in all k components?
Name: Shortest total path length spanning tree [ND3] 4-5
Input: n-node undirected graph G(V,E); positive integer B.
Question: Is there a spanning tree T(V,F) of G such that the sum over all pairs of nodes u and v of the length of the path between u and v in T is no greater than B?
Comments: For the definition of spanning tree, see Problem 21.
Name: Induced Path [GT23] 3
Input: n-node undirected graph G(V,E); positive integer k<=n.
Question: Is there a subset W of the nodes V having size at least k and such that the subgraph H(W,F) of G induced by W is a simple path on |W| nodes?
Comments: For relevant definitions refer to Problem 66.
Name: Bounded Component Spanning Forest [ND10] 3
Input: n-node undirected graph G(V,E); positive integer k<=n.
Question: Can V be partitioned into t disjoint sets - V_{1},...,V_{t} - where t<=k, such that for each i (1<=i<=t) both of the following hold:
Comments: For relevant definitions refer to Problem 66 and Problem 70.
Name: Maximum subgraph matching [GT50] 3
Input: Directed graphs G(V,A), H(W,B); positive integer k.
Question: Is there a subset R of VxW, i.e. pairs of nodes where one node is from V and one from W, having size at least k and such that for every < u,v> and < x,y> in R the edge (u,x) belongs to A if and only if the edge (v,y) belongs to B?
Name: Numerical 3-dimensional matching [SP16] 3
Input: Disjoint sets W, X, and Y each containing exactly m elements; for each element u of W union X union Y, a positive integer size s(u); a positive integer bound B.
Question: Can the 3m element set W union X union Y be partitioned into m disjoint sets A_{1},...,A_{m} in such a way that:
Name: Open Hemisphere [MP6] 4-5
Input: Finite set X of m-tuples of integers; a positive integer k<=|X|.
Question: Is there an m-tuple y of rational numbers such that sum from i=1 to m x_{i}y_{i<}> 0 for at least k of the m-tuples in X, x_{i} (resp. y_{i}) being the ith component of the m-tuple x (resp. y)?
Name: Largest Common Subgraph [GT49] 3-4
Input: 2 n-node undirected graphs G(V,E), H(W,F); positive integer k.
Question: Do there exist subsets E_{1} of the edges E and F_{1} of the edges F that satisfy all of the following:
Name: Clustering [MS9] 3
Input: Finite set X; for each pair of elements x and y in X, a positive integer distance d(x,y); positive integer B.
Question: Is there a partition of X into 3 disjoint sets - X_{1},X_{2},X_{3} - with which: for each set X_{i} (1<=i<=3), for all pairs x and y in X_{i} it holds that d(x,y)<=B?
Name: Protein Folding 5
Input: Three positive integers, t, n, and k; a finite alphabet, SIGMA, of t symbols; a string S of length n.
Question: Is there an embedding of S into a two dimensional grid that has a bond-score of at least k? i.e. if S=s_{1}s_{2}...s_{n}, is there a one-to-one function FOLD:{1,...,n}->NxN such that:
Comments: The problem arises from an extremely simplified model of protein-folding of interest in computational biology. The NP-completeness proof is highly non-trivial (by a transformation from 3-SAT), is a recent result not mentioned in Garey and Johnson, and is due to Paterson and Przytycka (1996). It is important to note that the alphabet is part of the input. It is an open question as to whether the variant in which an alphabet of a fixed size, e.g. 2 as with binary, remains NP-complete. In should also be remembered that no limit is placed on the grid size, although trivially, an n x n grid will always suffice.
Number: 87
Name: Hamming Centre 3-4
Input: A set S of k binary strings, each of length n; a positive integer r.
Question: Is there an n-bit string y such that for every string x in S, the Hamming distance, H(x,y) is at most r?.
Comments: The Hamming distance, H(x,y), between two binary strings x and y of length n, is the number of bit positions in which x and y have differing values.
This is another recent result not given in Garey and Johnson, its NP-complete classification being given in Frances and Litman (1997).
M. Frances and A. Litman. On Covering Problems of Codes, Theory of Computing Systems, 30, 1997, 113-119
Number: 88
Name: 2-Spanner 3
Input: An undirected graph G(V,E); a positive integer s.
Question: Does G(V,E) contain a 2-spanner with at most s edges, i.e. subset F of the edges E having size at most s and such that for each edge {v,w} in E either {v,w} is in F or (for some node x in V) the edges {v,x} and {x,w} are both in F?
Comments: 2-Spanners (or more generally t-spanners for some positive integer t) are a natural extension of the idea of spanning tree (N.B. A spanner is required neither to be a tree nor, even, a connected graph), and were first considered in the context of problems in communications networks and parallel architectures by Peleg and Ullman (1989). Subsequently, Peleg and Schaffer (1989) showed this problem to be NP-complete, hence its absence from the appendix in Garey and Johnson.
D. Peleg and A.A. Schaffer. Graph Spanners. Jnl. of Graph Theory, 13, (1989), 99-116.
D. Peleg and J.D. Ullman. An Optimal Synchroniser for the Hypercube. SIAM Jnl. on Computing, 18, (1989), 740-747