The final part of the course deals with the issue of assessing how difficult specific computational problems are to solve. Given a description of a particular problem a number of questions arise:
are the fields of Computer Science concerned with the questions raised earlier.
Both areas require a formal definition of the concepts:
f : {0,1}^* -> {0,1}^*
({0,1}^* is the set of all finite strings (or words) containing only 0s and 1s.
It is only necessary to consider such functions as have range {0,1}. These are known as
is equivalent to
Any binary string can be viewed as the representation of some natural number. Thus for decision problems on binary strings we can concentrate on the set of functions of the form
f : N -> {0,1}
That is `problems' of the form
Input: n a natural number
Output: 1 if n satisfies some property; 0 if n does not satisfy it.
Examples:
In all programming systems
Suppose we can show that an ADA program cannot be written which solves some problem.
Why should one take this as evidence that
exists for the problem?
In other words
How can it be argued that results on computability apply to all types of computer? (Past, Present, and Future)
The answer is that such results are proved with respect to abstract models of computation that capture all the essential elements of a typical computer. A `typical computer' has
This is not an unrealistic assumption: we do not want to classify a problem as unsolvable just because on some real machines there may not be enough memory available.
The first abstract model of computation was defined by Alan Turing in 1936. This had:
The Church-Turing Hypothesis
The importance of the Church-Turing hypothesis is that it allows us, without any loss of generality, to consider computability results solely with respect to
Suppose P is some decision problem.
The Church-Turing hypothesis asserts
Because we can (in principle) write a Turing machine program which can exactly simulate the behaviour of any given ADA program.
Similarly, we can write an ADA program which can exactly simulate the behaviour of any given Turing machine program.
The discussion above has, casually, assumed that there are decision problems for which algorithms do not exist.
(nothing above has given any evidence for the assumption made).
We can prove that there exist decision problems for which no algorithm exists by employing a method of argument called
If there are `more' decision problems than programs then it must be the case that some decision problems cannot be solved by programs since since a single program can solve only a single decision problem.
Suppose we are given two sets of items, A and B, say, e.g.
A = Set of all students registered for the course 2CS46
B = Set of all students registered for the course 2CS42
How can we tell if there are more items in A than there are in B?
By `counting' the number of items in each set.
Thus there are an infinite number of distinct decision problems.
How many different ADA programs are there which take a single number, n, as input and return a 0 or 1 as their result?
Infinitely many. Since there are clearly infinitely many such programs which contain a loop of the form
for i in 1..m loop s := 0; end loop;
`Counting' is all right when the sets being compared are finite.
`Counting' does not appear to tell us anything when both sets are infinite.
Q: What do we actually do when we `count' the number of items in a set?
A: To each item we attach a unique `label': 1,2,3,4,...
Thus, in comparing the sizes of two finite sets, A and B, we attach `labels' to the items in A, and `labels' to the items in B and we say that
But why use `numbers' as the `labels' to count with?
We could use items in A to label items in B (i.e. count) and, if A is larger than B, then at least one item in A will not have been used to label any item in B.
And we can `count' B by using the `labels'
and now GARY has not been used to label any element of B and so we can say that
If it can be shown that for every way of labelling items in B with items in A there is at least one item of A which is not used as a label then, again, we can state that:
Let:
A = The set of all decision problems, f : N -> {0,1}.
B = The set of all ADA programs.
We can enumerate (i.e. list) every single ADA program. That is there is an algorithm which will output every single different ADA program in turn, e.g.
i := 1; loop if ADA_PROGRAM ( i ) = 1 then put ( Binary representation of i ); end if; i :=i + 1; end loop;(Recall that ADA_PROGRAM ( n ) is the decision problem which tests if n written in binary corresponds to the ASCII character representation of a compiling ADA program)
Thus we can talk about a
i.e. the 1st, 2nd, 3rd, ..., nth, ..., program that is output by the procedure above.
And so we can form an infinite table, T:
1 | 2 | 3 | .. | n | .. | |
P_{1} | *P_{1} (1) | P_{1} (2) | P_{1}(3) | .. | P_{1}(n) | .. |
P_{2} | P_{2} (1) | *P_{2} (2) | P_{2} (3) | .. | P_{2} (n) | .. |
P_{3} | P_{3} (1) | P_{3} (2) | *P_{3} (3) | .. | P_{3} (n) | .. |
.. | .. | .. | .. | .. | ||
.. | .. | .. | .. | .. | ||
P_{n} | P_{n} (1) | P_{n} (2) | P_{n} (3) | .. | *P_{n} (n) | .. |
.. | .. | .. | .. | .. | ||
.. | .. | .. | .. | .. |
If it is the case that every decision problem, f, is computed by at least one ADA program, then in the table above which `labels' each ADA program with a decision problem:
i.e. There is a row, Pf, of the table such that Pf( n ) = f( n ) for all n.
What about the following decision problem?
DIAG is well-defined, since our table, T, above exists.
P1 (DIAG(1) /= P1(1))
OR that labelling P2 (DIAG(2) /= P2(2))
OR that labelling P3 (DIAG(3) /= P3(3))
...
OR that labelling Pn (DIAG(n) /= Pn(n)).
And so, the decision problem, DIAG, does not label any Pn, in the table T.
Now matter how the ADA programs are ordered, we can always define a decision problem, DIAG, as above, which does not label any program in the ordering.
In consequence:
Decision problems for which no algorithm exists are called:
Diagonalisation indicates that undecidable decision problems
That is: It does not identify an undecidable decision problem whose input and output behaviour can be given with a finite description.
The decision problems
But, all three of these are obviously,
Alan Turing in 1936 exhibited the first explicit example of an undecidable decision problem:
The Halting Problem, HP, is the following decision problem:
Input: An ADA program, P, (which takes as input a single number, n) and an input w for P.
Output: 1 if P comes to a halt when run with input w ; 0 if P never halts when started with input w.
It may be objected that HP is not a decision problem in the sense that we have been using to date since:
HP2 takes as input a single number, m. This number is a valid input if
The Halting Problem (HP) is undecidable
Thus there is no algorithm which solves the Halting Problem.
There is an ADA program, HALT, which
Since HALT is assumed to exist we can write an ADA program, BAD say, which uses HALT as a sub-procedure.
procedure BAD is Q : integer; function HALT (P integer; w integer) return boolean is if ADA_PROGRAM(P) then --********************************** -- Halting problem program goes -- here. --********************************* end if; end HALT; begin get (Q); if HALT (Q,Q) then loop put ("NOT OK"); end loop; else put ("OK"); end if; end BAD;What does BAD do?
HP (which HALT solves) takes an ADA program and single number as input.
which when written in binary corresponds exactly to the ASCII representation of the ADA program BAD
BAD_NUMBER is therefore a valid input for the ADA program BAD.
as input?
Suppose
This can only happen if
But what does this mean? It means that The program corresponding to BAD_NUMBER does not halt on input BAD_NUMBER.
But `the program corresponding to BAD_NUMBER' is BAD i.e.
means that
On the other hand, suppose
This can only happen if
But what does this mean? It means that
The program corresponding to BAD_NUMBER halts on input BAD_NUMBER.
But `the program corresponding to BAD_NUMBER' is BAD i.e.
means that
Recall that a decision problem takes as input a finite length binary string and returns as output a 0 or a 1, hence a decision problem
We measure the running time of an algorithm as the number of steps taken on inputs of length n
That is, if A is an algorithm (ADA program, say) that solves a decision problem, f, the running time of A is a function R : N -> N, so that R( n ) is the worst-case number of steps taken on inputs of length n.
The Time Complexity of a decision problem, f, is the running time of the `best' algorithm A for f.
Thus we say that a decision problem, f, has
Thus we say that a decision problem, f, has
Thus we can define complexity classes, e.g.
TIME ( n ) is the set of decision problems for which an algorithm running in at most n steps on inputs of length n exists.
TIME ( n^2 ) is the set of decision problems for which an algorithm running in at most n^2 steps on inputs of length n exists.
TIME ( 2^n ) is the set of decision problems for which an algorithm running in at most 2^n steps on inputs of length n exists.
And in general,
Which complexity classes correspond to decision problems with `efficient' (for run-time) algorithms?
Intuitively we regard an algorithm with worst-case run-time of n, n^2, n^3, etc as being `efficient' since the time taken to return an answer for input of length n is `acceptable' even for large values of n; and the amount of additional computation needed for inputs of length n+1 does not `greatly exceed' the time on inputs of length n.
On the other hand, intuitively, we would regard an algorithm with worst-case run-time of 2^n, 3^n, n^n, n, n^{logn}, etc as being `inefficient' since even for moderate values of n the time taken to return an answer on input of length n is `unacceptable'; and the amount of additional computation needed for inputs of length n+1 considerably exceeds the time on inputs of length n.
How can we distinguish run-times in the second category from run-times in the first?
All of the run-times in the first category have the following property:
For each `efficient' run-time there is a constant, k, such that the running time is at most n^k when n is large enough.
A decision problem, f, for which there is an algorithm taking at most n^k steps on inputs of length n (i.e. f in TIME ( n^k )) is called a
For each `inefficient' run-time there is NO constant, k, such that the running time is at most n^k when n is large enough.
A decision problem, f, for which there is NO algorithm taking at most n^k steps on inputs of length n (i.e. f is not in TIME ( n^k )) is called a
There are very few `natural' examples of decision problems for which such lower bounds have been proved.
A major open problem in Computational Complexity Theory is to develop arguments by which important computational problems can have their time complexity described exactly.
For `almost all' `interesting' problems no lower bound on Time Complexity larger than OMEGA( n ) has been obtained.
Example Open Problems
We have seen that multiplication of 2 n-digit numbers has Time Complexity at worst n^{log3}. It is conjectured that
Multiplication of 2 n by n integer matrices can be done in time O(N^1.81) (where N = n^2). It is conjectured that
It is unknown whether there is a polynomial time algorithm that can determine if a given n-digit number is a Prime Number.
It is unknown whether there is a polynomial time algorithm that will factorise an n digit number.
Computational Complexity Theory involves a large number of subfields each of which is ultimately concerned with problems such as those above, e.g.
The study of the last field is over 50 years old.
It is convenient to use a single symbol to denote the class of all decision problems for which time efficient algorithms exist.
P = { f : f has a polynomial time algorithm } = Union over all k TIME ( n^k )
Composite:
Input: n-bit number, x
Output: 1 if x is a composite number; 0 otherwise.
Satisfiability: (SAT)
Input: A Boolean expression, F, over n variables (x1,. ..,xn)
Output: 1 if the variables of F can be fixed to values which make F equal true; 0 otherwise.
Hamiltonian Cycle (HC)
Input: n-node graph, G( V, E )
Output: 1 if there is a cycle of edges in G which includes each of the n nodes exactly once; 0 otherwise.
CLIQUE
Input: n-node graph G( V, E ); positive integer k
Output: 1 if there is a set of k nodes, W, in V such that every pair of nodes in W is joined by an edge in E; 0 otherwise.
3-Colourability (3-COL)
Input: n-node graph G(V,E)
Output: 1 if each node of G can be assigned one of three colours in such a way that no two nodes joined by an edge are given the same colour; 0 otherwise.
All 5 of these (and infinitely many other) decision problems share a certain property.
They are
In order to solve each of the decision problems above we have to demonstrate the existence of a particular object which has a certain relationship to the input.
That is, to show that:
x, is composite we `only' have to find a factor of x.
F(X) is satisfiable we `only' have to find an assignment to X that makes F true.
G(V,E) is hamiltonian we `only' have to find a suitable cycle in G.
G(V,E) has a k-clique we `only' have to find a suitable set of k nodes in V.
G(V,E) is 3-colourable we `only' have to find a suitable colouring of V
So for the examples:
Given an input instance, I, (number, Boolean expression, graph) one has to find a witness, W (number, assignment, cycle, set of nodes, colouring) such that a particular relationship holds between the witness and the input.
e.g.
How difficult is it to decide if something is indeed a valid witness for instances to these decision problems?
It is easy.
One can check if :
In summary:
In principle, one could do this by going through each element, W of Wn, in turn and checking if W is a genuine witness for I.
However, the question
So for any decision problem, f, we can define another decision problem, CHECK(f)
CHECK(f):
Input: I input instance for f; W possible witness that f(I)=1.
Output: 1 if W is a genuine witness that f(I)=1; 0 otherwise
Now we have seen that
The notation, NP, denotes the class of all decision problems, f, with the property that CHECK(f) is polynomial time decidable,
NP is the class of problems with efficient `verifying' algorithms.
Thus,
Or, in informal terms,
Are there decision problems f such that checking a solution (witness) is much easier (polynomial time) than finding a solution?
The following problems are a fraction of the tens of thousands of problems which are known to be in NP but for which no efficient algorithm has been found (despite, in some cases, centuries of work on them)
The idea underlying the concept of NP-complete problems is to describe formally what can be understood as the
Suppose S and T are two decision problems and suppose we can construct an ADA program (algorithm) to solve T that is of the form
procedure T is x : T_input_type; y : S_input_type; function S ( z : S_input_type) return boolean is --*************************************** -- ADA function for decision problem S -- goes in here. --*************************************** end S; function transform ( t : T_input_type) return S_input_type is --***************************************** -- ADA function which transforms an -- input instance for T into an input -- instance for S. --***************************************** end transform; begin get (x); y := transform (x); if S(y) then put (1); else put (0); end if; end T;In this transform is a function which takes an input instance for T and changes it into an input instance for S. For example if T is a decision problem on numbers and S is one on graphs, then transform given a number would construct a graph using that number.
Now suppose that both:
It follows that (with the above procedure form)
But, more importantly,
This means that if we have of an efficient procedure, tau which
If S and T are decision problems for which such an efficient transformation procedure from inputs for T to inputs for S exists then we say that
This relationship is denoted
Suppose we have a decision problem, f, for which we can prove the following:
Then if there are decision problems in NP for which efficient algorithms do not exist then
Problem: The definition of NP-completeness requires the following to be done in order to show that a problem f is NP-complete
But for the second, how can we reduce infinitely many decision problems to a single problem?
Before dealing with this we can note the following: suppose we have shown for two decision problems, f and g say, that:
Why?
So we have a polynomial reduction from every problem in NP to g by
This argument shows that once we have identified a single NP-complete decision problem, f, in order to prove that another decision problem, g, is NP-complete `all' we have to do is find an efficient algorithm for transforming input instances for f into input instances for g, i.e. prove
Within a few months of Cook's result dozens of new NP-complete problems had been identified. By 1994 well over 10,000 basic NP-complete decision problems were known.
Thus it is known that the following problems are all NP-complete
It generally though to be extremely unlikely that any NP-complete problem has a fast algorithm.
It has still to be proved that any NP-complete problem does not have an efficient algorithm, i.e. the question of whether P=NP is still open.
While most of the classical decision problems have now been determined as being in P or being NP-complete a number of important problems are still open. In particular
Both are known to be in NP. Neither has been shown to be NP-complete. It is believed that both are in fact solvable by efficient algorithms.
A program can alter one `bit' of memory at a time.
The current bit is indexed by an integer value: at_cell
The next bit is always adjacent to the previous one.
The program is in one of a finite number, s say, of states.
The next state entered is chosen non-deterministically from a choice of two.
A complete program is specified when the action for each state and bit value is defined.
State s is the unique accept state.
procedure nd_ada_f (T:integer) is type workspace is array (integer range <>) of 01B; n : integer; begin get(n); declare w : workspace( -T..T ):=(others=>B); procedure do_it_on ( w : in out workspace; position : in out integer; state : in out integer); begin case state is when k => if w(position)=B then w(position):=(0 or 1)_{k}; position:=position (+-1)_{k}; state:=choose(next_{1}, next_{2})_{k}; elsif w(position)=0 then w(position):=(0 or 1)_{k}; position:=position (+-1)_{k}; state:=choose(next_{1}, next_{2})_{k}; else w(position):=(0 or 1)_{k}; position:=position (+-1)_{k}; state:=choose(next_{1}, next_{2})_{k}; end if; end case; end do_it_on; begin at_cell:=1; in_state:=1; get(w(1..n)); for i in 1.. T loop do_it_on(w,at_cell,in_state); end loop; if in_state=accept then return true; end if; end; end nd_ada_f;
The decision problem NCOMP is:
Input: A non-deterministic program, AP;
An input x in <0,1>^{n} for AP;
A time bound T.
Question: Is there a `computation path' of AP on x that accepts in T iterations?
ADA_NP is the class of decision problems, f, for which there are:
Theorem: ADA_NP=NP
Proof:
a) ADA_NP is a subset of NP.
Suppose that f is in ADA_NP.
Let NA_{f} and p(n) be the program and polynomial such that for all x in {0,1}^{n}:
f(x)=1 if and only if NCOMP(NA_{f}, x, p(n))=1
A set of witnesses to f(x)=1 is the set of computation paths NACP, for NA_{f}.
Check_{f}(CP,x)=1 if CP is a valid accept computation for NA_{f} on input x.
Given NA_{f}, CP and x this can be checked by simulating NA_{f} on x using the computation path CP.
This test can be done in polynomial-time, so Check_{f} is in P, i.e. f is in NP.
b) NP is a subset of ADA_NP
Let f is in NP, so that Check_{f} is in P.
Let W_{n} be the witness set for x in {0,1}^{n}.
For any w in W_{n} it must hold that |w|<= p(n) for some polynomial p(n).
(otherwise Check_{f}(x,w) could not be carried out in P).
Let W_{n} be encoded in binary. Then
For all x in {0,1}^{n} there are at most 2^{p(n)} `witnesses to f(x)=1'.
NA_{f} is the program which:
iff NA_{f} constructs a witness w for x after p(n) moves which checks correctly in q(n) steps
iff Check_{f} is in P.
In summary, Check_{f} in P implies that f is in ADA_NP.
Hence (from (a) and (b)), ADA_NP=NP.
The Satisfiability Problem (SAT) is:
Input: A set of n Boolean variables X_{n}=<^x_{1},x_{2},...,x_{n}>; A Boolean formula F over X_{n}, (whose length is at most r(n) for some polynomial r)
Question: Is there an assignment, a of true/false to the variables X_{n} with the property that F(a)=true?
SAT is NP-complete.
Proof: We have to do 2 things:
That (a) holds has already been shown.
Now let f be any decision problem in NP.
We know that f also belongs to ADA_NP (since NP=ADA_NP).
So there is a non-deterministic Ada program NA_{f} for f and a polynomial p(n) such that, for any x in {0,1}^{n}, f(x)=1 iff NCOMP( NA_{f},x,p(n))=1
To show that SAT is NP-complete, we build an instance of SAT (i.e. a Boolean formula), H (NA_{f}) (X_{m(n)}) with the property
What do we know about NA_{f} on input x in {0,1}^{n}?
At any value i of its main for loop its status is fully described by
A1) The value of in_state.
A2) The value of at_cell.
A3) The content (0, 1, or B) of each `bit' w(k).
These can change only in accordance with what the procedure do_it_on allows, given the initial n-bit input x.
So, suppose we wish to know if a given sequence of p(n) configurations (i.e. state, position, memory) is a valid accepting computation path on NA_{f}.
Then such a sequence must embody all of the following characteristics at all times i:
C1) At time i=0: in_state=1, w(1..n)=x, all other bits equal B; at_cell=1.
C2) in_state has exactly one value at time i.
C3) at_cell has exactly one value at time i.
C4) w(k) has exactly one value at time i, (for all k)
C5) in_state=s (accept state) at i=p(n).
C6) w(-p(n)...p(n)), at_cell, and in_state must change from time i to time i+1 according to the case statement in do_it_on of NA_{f}.
We construct H(NA_{f}) with the Boolean variables: (below 0 <= i <= p(n))
V1) 3(2p(n)+1)(p(n)+1) variables
W(v,k,i) with v in {0,1,B}, -p(n) <= k <= p(n)
V2) s(p(n)+1) variables
S(st,i) 1 <= st <= s,
V3) (2p(n)+1)(p(n)+1) variables
A(k,i) -p(n) <= k <= p(n)
H(NA_{f}(x)) is built so that: in a satisfying assignment those variables which which are assigned the value true define a valid computation, CP of NA_{f} on x:
V1T) W(v,k,i)=1 iff in CP: at time i, `bit' w(k) contains v.
V2T) S(st,i)=1 iff in CP: at time i, in_state=st.
V3T) A(k,i)=1 iff in CP: at time i, at_cell=k.
Thus the whose only satisfying assignments of H(NA_{f})(V1,V2,V3) will be as given by (V1T, V2T, V3T) and define legal computations of NA_{f} on x, i.e. capture the conditions (C1) through (C6)
(Below & denotes logical and; + denotes logical or. For brevity we assume and is implied).
C1)
S(1,0)A(1,0) &_{k=1}^{n} W( x_{i},k,0) & &_{k=-p(n)}^{0} W(B,k,0) &_{k=n+1}^{p(n)} W(B,k,0)
C2) `exactly one value' == `at least one value' and at most one value.
&_{i=0}^{p(n)}( +_{st=1}^{s} S(st,i) ) & &_{1<=c< d<=s} (-S(c,i) + -S(d,i))
C3) Similarly,
&_{i=0}^{p(n)} ( +_{k=-p(n)}^{p(n)} A(k, i) ) & &_{-p(n) <=c< d<=p(n)} ( -A(c,i) + -A(d,i) )
C4) Again, similarly,
&_{i=0}^{p(n)} &_{k=-p(n)}^{p(n)} ( +_{{v in {0,1,B }} W(v,k,i) ) &_{v1=v2} ( -W( v_{1},k,i) + -W( v_{2},k,i) )
C5) S(s,p(n))
C6) is, superficially, more complicated, however at time i: from
in_state=st, at_cell=k, w(k) is one of {0,1,B}
the case statement determines the values at time i+1 of
the new value of at_cell
the content of w(k),
the two possible next states to jump to.
Thus each of the s states is described by a `move' function:
delta (st,v) = ( ( st_{1} , st_{2} ), nv, ch )
where v is in {0,1,B}, nv is in {0,1}, ch=(+or-)1
We thus have (C6) as
&_{i=1}^{p(n)-1} &_{j=-p(n)}^{p(n)} &_{{delta (st,v)}} MOVE(i,j,st,v, st_{1} , st_{2},nv,ch)
where MOVE(i,j,st,v, st_{1}, st_{2}, nv , ch) is,
S(st,i)& A(j,i)& W(v,j,i) -> ( S( st_{1},i+1) & A(j+ch,i+1) & W(nv,j,i+1) + S( st_{2},i+1) & A(j+ch,i+1) & W(nv,j,i+1) )
Thus,
If at time i
NA_{f} is in_state st
at_cell j which contains v then
at time i+1,
NA_{f} is in_state st_{1} or st_{2}
at_cell j+ch with cell j containing nv.
So the final formula H(NA_{f}) (V1,V2,V3) is
C1 & C2 & C3 & C4 & C5 & C6 ( V1,V2,V3 )
H(NA_{f})(V1,V2,V3) is satisfiable
iff
NCOMP (NA_{f},x,p(n))=1.
If NCOMP( NA_{f},x, p(n))=1 setting the variables of V1, V2, V3 that encode the accepting computation path of NA_{f} gives a satisfying assignment of H;
H guarantees that the true variables of any satisfying assignment must represent a valid computation of NA_{f} on input x that ends in accept after p(n) steps. i.e. a witness that NCOMP ( NA_{f}, x, p(n) )=1 PED Home Page