Complexity Theory: NP and NP-completeness
The Complexity Class NP
Consider the following:
Composite
Input: n-bit number, x.
Output: 1 if x is a composite number; 0 otherwise.

Satisfiability (SAT)
Input: A Boolean expression, F, over n variables (x1,...,xn).
Output: 1 if the variables of F can be fixed to values which make F equal true; 0 otherwise.

Hamiltonian Cycle (HC)
Input: n-node graph, G(V,E).
Output: 1 if there is a cycle of edges in G which includes each of the n nodes exactly once; 0 otherwise.

CLIQUE
Input: n-node graph G(V,E); positive integer k.
Output: 1 if there is a set of k nodes, W, in V such that every pair of nodes in W is joined by an edge in E; 0 otherwise.

3-Colourability (3COL)
Input: n-node graph G(V,E).
Output: 1 if each node of G can be assigned one of three colours in such a way that no two nodes joined by an edge are given the same colour; 0 otherwise.
All 5 of these (and infinitely many other) decision problems share a certain property:
they are `checkable' by efficient (polynomial-time) algorithms.
What does this mean?
In order to solve each of the decision problems above we have to demonstrate the existence of a particular object which has a certain relationship to the input. That is, to show that:
- x is composite, we `only' have to find a factor of x.
- F(X) is satisfiable, we `only' have to find an assignment to X that makes F true.
- G(V,E) is Hamiltonian, we `only' have to find a suitable cycle in G.
- G(V,E) has a k-clique, we `only' have to find a suitable set of k nodes in V.
- G(V,E) is 3-colourable, we `only' have to find a suitable colouring of V.
So for the examples:
Given an input instance, I, (number, Boolean expression, graph) one has to find a witness, W, (number, assignment, cycle, set of nodes, colouring) such that a particular relationship holds between the witness and the input. For example:
- W divides I with no remainder.
- W makes I take the value true.
- W is a cycle in I containing each node of I exactly once.
- W is a set of k nodes in I, each pair of which is connected by an edge of I.
- W does not assign the same colour to two nodes in I which are joined by an edge of I.
How difficult is it to decide if something is indeed a valid witness for instances of these decision problems?
It is easy. One can check if:
- W is a factor of I (by division);
- W satisfies I (by evaluating the expression);
- W is a Hamiltonian cycle in I (by testing if I contains all the edges in W);
- W is a k-clique in I (by testing if each pair in W is joined by an edge of I);
- W is a colouring of I (by testing if each edge in I joins nodes with different colours in W).
All of these can be carried out by efficient algorithms.
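As an illustrative sketch (not part of the original notes), three of these checks can be written as polynomial-time functions. Graphs are represented here as a set of nodes plus a set of two-element frozensets for edges; all function names are hypothetical.

```python
# Hypothetical polynomial-time witness checkers, one per problem above.
# A graph is given as a set of nodes and a set of frozenset({u, v}) edges.

def is_factor(w, x):
    # CHECK(Composite): W is a genuine witness if it is a proper factor of x.
    return 1 < w < x and x % w == 0

def is_k_clique(witness, nodes, edges, k):
    # CHECK(Clique): the k chosen nodes must be pairwise joined by edges.
    if len(witness) != k or not witness <= nodes:
        return False
    return all(frozenset({u, v}) in edges
               for u in witness for v in witness if u != v)

def is_3_colouring(colouring, nodes, edges):
    # CHECK(3Col): every node coloured, at most 3 colours, no edge monochrome.
    if set(colouring) != nodes or len(set(colouring.values())) > 3:
        return False
    return all(colouring[u] != colouring[v] for u, v in map(tuple, edges))
```

Each check runs in time polynomial in the size of the instance, even though finding a witness in the first place may be hard.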
In summary:
- For any decision problem, f, with each possible input size, n, there is a set of potential witnesses, Wn.
- To solve a decision problem, `all' that is needed is an algorithm to determine whether there is a genuine witness, W in Wn, for any given input instance I of size n.
In principle, one could do this by going through each element, W of Wn,
in turn and checking if W is a genuine witness for I.
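This exhaustive strategy can be sketched for SAT (an illustrative Python fragment, with formulas represented as Boolean functions): checking each individual witness is fast, but there are 2^n potential witnesses to try.

```python
from itertools import product

def check_sat(formula, assignment):
    # CHECK(SAT): evaluating F on one fixed assignment is cheap.
    return formula(*assignment)

def brute_force_sat(formula, n):
    # SAT itself, solved naively: test every element of Wn,
    # the 2**n candidate assignments. Exponential in n.
    for assignment in product((False, True), repeat=n):
        if check_sat(formula, assignment):
            return 1   # found a genuine witness
    return 0           # no witness exists

# Example: F(x1, x2, x3) = (x1 OR x2) AND (NOT x1 OR x3)
f = lambda x1, x2, x3: (x1 or x2) and ((not x1) or x3)
```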
However, the question
`Is W a genuine witness for I?'
(for the decision problem f), e.g.
- Is y a factor of x? (for Composite)
- Does alpha make F(X) equal true? (for SAT)
- Is C a Hamiltonian cycle in G(V,E)? (for HC)
- Is K a k-clique in G(V,E)? (for Clique)
- Is chi a 3-colouring of G(V,E)? (for 3Col)
is itself a decision problem (based on f).
So for any decision problem, f, we can define another decision problem, CHECK(f):
Input: I, an input instance for f; W, a possible witness that f(I)=1.
Output: 1 if W is a genuine witness that f(I)=1; 0 otherwise.
Now we have seen that
- CHECK(Composite)
- CHECK(SAT)
- CHECK(HC)
- CHECK(Clique)
- CHECK(3Col)
are all polynomial-time decidable, i.e. all of these have efficient algorithms.
The notation NP denotes the class of all decision problems, f, with the property that CHECK(f) is polynomial-time decidable:
NP = { f : CHECK(f) is in P }
Thus for decision problems we have two `variants':
the finding version of f (f itself), which asks
`Is there any witness W for I (with f)?'
and the verifying version of f (CHECK(f)), which asks
`Is this specific W a witness for I (with f)?'
P is the class of problems with efficient `finding' algorithms.
NP is the class of problems with efficient `verifying' algorithms.
How are P and NP related?
If f is in P then certainly CHECK(f) is in P, and so f is in NP also. Thus,
P is a subset of NP.
Undoubtedly the most important open question in modern Computational Complexity
Theory is:
Is P a proper subset of NP?
i.e. are there decision problems f such that CHECK(f) is in P (i.e. f is in NP) but for which f is not in P?
Or, in informal terms: are there decision problems f such that checking a solution (witness) is much easier (polynomial time) than finding a solution?
The following problems are a fraction of the tens of thousands of problems which are known to be in NP but for which no efficient algorithm has been found (despite, in some cases, centuries of work on them):
- Hamiltonian Cycle
- Satisfiability
- 3-Colouring
- Clique
- Travelling Salesman Problem
- Integer Knapsack Problem
- Longest Path in a Graph
- Edge Colouring
- Optimal Scheduling
- Minimising Boolean Expressions
- Minimum Graph Partition
- Composite Number
- Primality
With the exception of the last two, it is generally believed that no efficient algorithm exists for any of these decision problems, i.e. that they are all intractable. This has yet to be proved.
The concept of NP-complete decision problems offers a possible route to resolving whether P = NP or P ≠ NP.
NP-completeness
The idea underlying the concept of NP-complete problems is to describe formally what can be understood as the `most difficult' problems in NP.
That NP-complete problems are (intuitively) the most difficult in NP follows from the fact that we may prove
P = NP
if and only if
some NP-complete problem, f, has a polynomial-time algorithm.
Suppose S and T are two decision problems, and suppose we can construct an Ada program (algorithm) to solve T that is of the form:

procedure T is
   x : T_input_type;
   y : S_input_type;

   function S (z : S_input_type) return boolean is
      -- Ada function for decision problem S goes in here.
   end S;

   function transform (t : T_input_type) return S_input_type is
      -- Ada function which transforms an input instance
      -- for T into an input instance for S.
   end transform;

begin
   get (x);
   y := transform (x);
   if S (y) then
      put (1);
   else
      put (0);
   end if;
end T;
Here, transform is a function which takes an input instance for T and changes it into an input instance for S. For example, if T is a decision problem on numbers and S is one on graphs, then transform, given a number, would construct a graph from that number.
Now suppose that both transform and S have efficient (polynomial-time) algorithms. Then the procedure above gives an efficient algorithm for T.
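The same skeleton can be written in Python with a runnable toy instantiation (S, transform, and the problem pairing here are all hypothetical, chosen only to make the template executable): T asks whether a number is odd, S asks whether a list is non-empty, and transform builds a list from the number.

```python
def solve_T(x, transform, solve_S):
    # Template from the procedure above: translate the instance for T,
    # then call the decision procedure for S on the translated instance.
    y = transform(x)
    return 1 if solve_S(y) else 0

# Toy instantiation (hypothetical):
solve_S = lambda lst: len(lst) > 0     # S: is the list non-empty?
transform = lambda x: [1] * (x % 2)    # odd x -> [1], even x -> []
# T(x) = 1 iff x is odd, computed entirely via S.
```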
It follows that (with the above procedure form)
There is an efficient algorithm for T
if
there is an efficient algorithm for S
But, more importantly,
If we can prove that there is
NO efficient algorithm for S
then
we have also proved that there is
NO efficient algorithm for T
This means that if we have an efficient procedure, tau, which
- transforms input instances x for T into input instances, tau(x), for S, and
- is such that T(x) = 1 if and only if S(tau(x)) = 1,
then we have established a relationship between the time complexity of S and the time complexity of T.
If S and T are decision problems for which such an efficient transformation procedure from inputs for T to inputs for S exists, then we say that
the decision problem T is polynomially reducible to the decision problem S.
This relationship is denoted
T <=p S
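As a concrete example of such a reduction (a classic one, not worked through in these notes): Hamiltonian Cycle <=p the decision version of the Travelling Salesman Problem. Give each pair of nodes distance 1 if it is an edge of G and distance 2 otherwise, and ask for a tour of total length at most n; then HC(G) = 1 if and only if TSP(tau(G)) = 1. The brute-force TSP decision procedure below is purely for illustration.

```python
from itertools import permutations

def hc_to_tsp(nodes, edges):
    # transform (tau): an edge of G costs 1, a non-edge costs 2.
    # G has a Hamiltonian cycle iff a tour of total cost <= n exists.
    dist = {(u, v): (1 if frozenset({u, v}) in edges else 2)
            for u in nodes for v in nodes if u != v}
    return dist, len(nodes)

def tsp_decision(nodes, dist, bound):
    # Brute-force decision-TSP (exponential; illustration only):
    # is there a closed tour through all nodes of length <= bound?
    first, *rest = sorted(nodes)
    for perm in permutations(rest):
        tour = (first, *perm)
        cost = sum(dist[(tour[i], tour[(i + 1) % len(tour)])]
                   for i in range(len(tour)))
        if cost <= bound:
            return 1
    return 0
```

A 4-cycle has a Hamiltonian cycle, so its transformed instance is accepted; a star on four nodes does not, and its transformed instance is rejected.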
How does the idea of `polynomial reducibility' help in addressing the question of whether P equals NP?
Suppose we have a decision problem, f, for which we can prove the following:
- f is in NP, i.e. there is an efficient checking algorithm for f.
- For every other decision problem, g, in NP, it is the case that g <=p f, i.e. for every decision problem g with an efficient checking algorithm there is an efficient procedure, tau_g, that transforms inputs for g into inputs for f.
Then, if there are decision problems in NP for which efficient algorithms do not exist, the decision problem f must be one such problem.
In other words, for such a decision problem f:
P = NP
if and only if
f has an efficient algorithm (i.e. f is in P).
A decision problem, f, which has these properties, i.e.
- f is in NP;
- for all g in NP, g <=p f
(f has an efficient checking algorithm and every problem with an efficient checking algorithm can be efficiently transformed to f)
is called an NP-complete decision problem.
The NP-complete decision problems are the `most difficult' problems in NP in the sense that if
any single NP-complete problem does not have an efficient algorithm
then
NO NP-complete problem has an efficient algorithm.
Conversely, if
we find an efficient algorithm for just one NP-complete problem
then
we can construct efficient algorithms for all decision problems in NP.
Problem:
The definition of NP-completeness requires the following to be done in order to show that a problem f is NP-complete:
- We have to find an efficient checking algorithm for f (i.e. prove f is in NP).
- We have to show that every problem with an efficient checking algorithm can be transformed into f (i.e. prove for all g in NP, g <=p f).
The first is usually (relatively) easy. But for the second, how can we reduce infinitely many decision problems to a single problem?
Before dealing with this we can note the following: suppose we have shown, for two decision problems, f and g say, that:
- f is NP-complete;
- g is in NP;
- f <=p g.
Then we have also shown that g is NP-complete.
Why? We have a polynomial reduction from every problem h in NP to g by:
- converting an input instance x for h to an input instance, tau_h(x), for f;
- converting the input tau_h(x) for f to an input instance for g.
The first step is possible since f is NP-complete; the second since there is a polynomial reduction from f to g.
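This chaining is just composition of transformations: if tau_h reduces h to f in time p(n) and tau reduces f to g in time q(n), their composition reduces h to g in time roughly p(n) + q(p(n)), still a polynomial. A minimal sketch with hypothetical toy problems on integers:

```python
def compose(tau_fg, tau_hf):
    # Composing two polynomial-time transforms gives a polynomial-time
    # transform: the output of the first is the input of the second.
    return lambda x: tau_fg(tau_hf(x))

# Toy stand-ins (hypothetical, for shape only):
h = lambda x: x % 6 == 0            # h: is x divisible by 6?
tau_hf = lambda x: x % 6            # instance for f: the residue of x
f = lambda r: r == 0                # f: is the residue zero?
tau_fg = lambda r: [0] * r          # instance for g: a list of length r
g = lambda lst: len(lst) == 0       # g: is the list empty?

tau_hg = compose(tau_fg, tau_hf)    # reduces h directly to g
```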
This argument shows that once we have identified a single NP-complete decision problem, f, then in order to prove that another decision problem, g, is NP-complete `all' we have to do is find an efficient algorithm for transforming input instances for f into input instances for g, i.e. prove
f <=p g
But this still leaves the problem of finding a `first' NP-complete problem, that is, of finding some way of transforming infinitely many decision problems into just one decision problem.
In 1971 Stephen Cook proved one of the most important results in modern computational complexity theory. This result is (now) known as
Cook's Theorem
SAT is NP-complete.
Proof: Omitted (not difficult, just long).
Within a few months of Cook's result, dozens of new NP-complete problems had been identified. By 1994 well over 10,000 basic NP-complete decision problems were known.
Thus it is known that the following problems are all NP-complete:
- CLIQUE
- Hamiltonian Cycle
- Travelling Salesman Problem
- 3-Colouring
- Set Partitioning
- Edge Colouring
- Longest Path
- Integer Knapsack
A proof that a decision problem is NP-complete is accepted as evidence that the problem is intractable, since a fast method of solving a single NP-complete problem would immediately give fast algorithms for all NP-complete problems.
It is generally thought to be extremely unlikely that any NP-complete problem has a fast algorithm.
It has still to be proved that some NP-complete problem does not have an efficient algorithm, i.e. the question of whether P = NP is still open.
While most of the classical decision problems have now been determined as being in P or being NP-complete, a number of important problems are still open. In particular:
Primality
Composite Number
Both are known to be in NP. Neither has been shown to be NP-complete. It is believed that both are in fact solvable by efficient algorithms.