Chapter 11: Theory • Computer Science has identified a number of theoretical ideas regarding what computers can and cannot do • These concepts make up “Theory of Computation” • We examine these theories here, by considering a primitive computer and primitive programming language, which demonstrate these issues • We also examine a class of solvable problems which are too complex to solve in practice
A Bare Bones Language • A universal programming language, kept minimal (bare bones) to keep it simple • data are bit strings of any length • variable names consist only of letters (a-z, A-Z) • imperative statements consist of: clear (set variable to 0), incr (add 1 to variable), decr (subtract 1 from variable), and move var1 to var2 • 1 control statement: the while loop (while variable not 0 do … end;) • Some examples of code appear on p. 493-495 • Obviously this language lacks many features: arithmetic expressions other than i++ and i--, counter-controlled loops, if-then and if-then-else statements, procedure/function calls • In fact, all of these features can be implemented from the statements above • See figure 11.1 for multiplication and review question 3 on p. 497 (answer on p. 595)
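To see that the bare bones statements really suffice, here is multiplication built only from clear, incr, decr, and while-not-zero, modeled in Python. This is a sketch of the idea behind figure 11.1, not a transcription of it; the variable names (x, y, z, w) are our own.

```python
def bare_bones_multiply(x, y):
    """Compute x * y using only bare-bones-style operations:
    set to zero, add one, subtract one, and while-not-zero loops."""
    z = 0                      # clear Z;
    while x != 0:              # while X not 0 do
        w = y                  #   copy Y into W (a temporary counter)
        while w != 0:          #   while W not 0 do
            z = z + 1          #     incr Z;
            w = w - 1          #     decr W;
        x = x - 1              #   decr X;
    return z

print(bare_bones_multiply(3, 4))  # 12
```

The outer loop runs x times, and each pass adds y to z one increment at a time, so repeated increment implements addition and repeated addition implements multiplication.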
Turing Machines • To run the bare bones universal programming language, we need a universal machine • A machine that is capable of being implemented on any computer, and is thus universal • Alan Turing, a mathematician, invented the Turing Machine in 1936 as a conceptual device to test out computational theories • The machine itself • consists of a control unit that can read and write symbols onto an infinitely long tape (see 11.3 p. 498) • and a set of states that define what the machine will do given an input
Example • TM alphabet: 0, 1, * • TM states: START, ADD, CARRY, NO CARRY, OVERFLOW, RETURN and HALT • Figure 11.4 shows the TM program, or state transition sequences • Example rule: if currently in the state ADD and the symbol at the input is a 0, then write a 1 over the 0, move Left one position, and change to state NO CARRY • We see on p. 500-502 how this TM performs an increment from *101* to *110*, where the *'s denote the two ends of the string being operated on • Start at *101* with state=START: write an *, move Left and enter state=ADD • Now at *101* with state=ADD: write a 0, move Left and enter state=CARRY • Now at *100* with state=CARRY: write a 1, move Left and enter state=NO CARRY • Now at *110* with state=NO CARRY: write a 1, move Left and enter state=NO CARRY • Now at *110* with state=NO CARRY: write an *, move Right and enter state=RETURN • Move Right replacing each character with itself until the right * is reached, then HALT
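The trace above can be run mechanically. Below is a small simulator whose transition table is our own reconstruction of the increment program (the exact rules in figure 11.4 may differ in detail; in particular, the OVERFLOW handling for all-1 inputs is an assumption).

```python
# (state, symbol read) -> (symbol to write, move direction, next state)
RULES = {
    ("START", "*"):    ("*", "L", "ADD"),
    ("ADD", "0"):      ("1", "L", "NO CARRY"),
    ("ADD", "1"):      ("0", "L", "CARRY"),
    ("CARRY", "0"):    ("1", "L", "NO CARRY"),
    ("CARRY", "1"):    ("0", "L", "CARRY"),
    ("CARRY", "*"):    ("1", "L", "OVERFLOW"),   # assumed overflow rule
    ("OVERFLOW", " "): ("*", "R", "RETURN"),     # write a new left *
    ("NO CARRY", "0"): ("0", "L", "NO CARRY"),
    ("NO CARRY", "1"): ("1", "L", "NO CARRY"),
    ("NO CARRY", "*"): ("*", "R", "RETURN"),
    ("RETURN", "0"):   ("0", "R", "RETURN"),
    ("RETURN", "1"):   ("1", "R", "RETURN"),
    ("RETURN", "*"):   ("*", "R", "HALT"),
}

def run_tm(tape_str):
    tape = list(tape_str)
    head = len(tape) - 1          # start on the rightmost *
    state = "START"
    while state != "HALT":
        write, move, state = RULES[(state, tape[head])]
        tape[head] = write
        head += 1 if move == "R" else -1
        if head < 0:              # the tape is unbounded: extend on the left
            tape.insert(0, " ")
            head = 0
    return "".join(tape).strip()

print(run_tm("*101*"))  # *110*
```

Running it reproduces the book's trace (*101* becomes *110*), and an all-ones input such as *11* exercises the overflow path, producing *100*.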
Computable Functions • We can view a computer running a program as if it were computing a function: F(X) = Y • X is the starting state of the computer (the bits that represent the data and program) • Y represents the ending state of the computer (the bits that represent the data) • The program computes the function F • We can then view the computational power of a machine as the list of functions that it is able to compute • One type of machine is one that contains a table of inputs with their associated outputs • The successor machine is listed in figure 11.5 p. 504 • Such a machine's computability is easily determined: just look at the output table and it shows what this machine can do • Unfortunately, this machine has a major flaw: there is no way to enumerate all I/O pairs for successor (or addition or multiplication, …), so there are functions that this type of machine cannot compute
A better approach • If we describe how a machine computes its output, then the machine is not restricted as the table look-up machine is • Example: express the machine's function through mathematical formulae • V = P(1 + r)^n calculates compound interest • Output = Input + 1 calculates the successor • C = (F - 32) * 5 / 9 calculates Centigrade temperatures given Fahrenheit temperatures • However, there are some functions so complex that there is no formula (indeed, no algorithmic description) to compute them • Such functions are non-computable
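The three formulas above translate directly into short function definitions, which is exactly the "describe how the output is computed" approach. A minimal sketch:

```python
def compound_interest(principal, rate, years):
    """V = P(1 + r)^n"""
    return principal * (1 + rate) ** years

def successor(x):
    """Output = Input + 1"""
    return x + 1

def fahrenheit_to_centigrade(f):
    """C = (F - 32) * 5 / 9"""
    return (f - 32) * 5 / 9

print(successor(5))                   # 6
print(fahrenheit_to_centigrade(212))  # 100.0
```

Unlike a lookup table, each definition covers infinitely many inputs at once.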
The Church-Turing Thesis • Any function that can be computed on a TM is said to be Turing Computable • And since a TM is a universal computing device, any Turing Computable function can be solved on a computer by a computer program • Alonzo Church worked on this conjecture which is dubbed the Church-Turing thesis and is widely accepted as true • If we find a function that is not Turing Computable, then it is not solvable on any computer • this is an important idea that demonstrates machine limitations that we will now explore, but first two other ideas
Godel Numbering • Mathematician Kurt Godel came up with a technique for assigning a unique nonnegative integer to each object of a collection (he used these to number formulas and proofs) • We will adopt this idea for numbering all programs that solve a computable function • Our numbering system will use the ASCII values of the letters that make up the instructions in the program • See figure 11.6, where 2114075508630810683 encodes the bare bones program “Clear X;” • Thus, every program will have a unique Godel Number
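One way to build such a numbering is to read the program text's ASCII bytes as the digits of a base-256 integer. This is an illustrative scheme of our own; figure 11.6 uses its own encoding, so the numbers produced here will not match the book's example value.

```python
def godel_number(program_text):
    """Read the ASCII codes of the program as base-256 digits."""
    n = 0
    for ch in program_text:
        n = n * 256 + ord(ch)
    return n

def godel_decode(n):
    """Invert the numbering: peel off base-256 digits."""
    chars = []
    while n > 0:
        chars.append(chr(n % 256))
        n //= 256
    return "".join(reversed(chars))

code = "clear X;"
num = godel_number(code)
assert godel_decode(num) == code   # reversible, hence each program's number is unique
```

Because the mapping is reversible, distinct programs always get distinct numbers, which is all the halting-problem argument later in the chapter requires.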
Self-Terminating Programs • Infinite loops are possible, even in a bare bones program (see for example p. 508) • So, a bare bones program is either self-terminating (reaches the HALT state) or not • We formally define a self-terminating program as one which starts with itself as input and reaches the halt state • What does “start with itself as input” mean? • Using Godel numbers, all programs have a unique number • if the first variable in the program is assigned this Godel number initially, and the program goes on to terminate, then the program is self-terminating
A Non-computable Function • We now define the Halting Problem, a non-computable function • Consider a bare bones program, f, which takes as input a Godel number and outputs a 1 or a 0 • If the Godel number, which represents a bare bones program, is a self-terminating program, then f outputs a 1; otherwise it is a non-terminating program and f outputs a 0 • f would then solve the Halting Problem: it determines whether any given bare bones program will terminate or not • See the upper left hand box in figure 11.7 on p. 511, which represents our program f
More on the Halting Problem • Now modify f to g by adding the following loop, where X is the output (1 or 0) of f: while X not 0 do end; • The program g is given its own Godel number as input • If g were to terminate, then X is 1 and g enters the while loop but never terminates; that is, the terminating program does not terminate • If g does not terminate, then X is 0 and g terminates; that is, the non-terminating program terminates • A program cannot be both terminating and non-terminating, so g cannot exist • If g cannot exist, then there is no such program f that can solve the Halting Problem • Therefore, there is no solution to the Halting Problem, or by the Church-Turing Thesis, the Halting Problem is not computable (solvable)
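The argument can be made concrete: take any candidate decider f (here a trivial stub of our own that answers 0 for everything) and build g from it exactly as above. Running g shows the candidate's verdict about g is wrong, which is what the proof predicts for every possible f.

```python
def f_stub(program):
    """A candidate 'halting decider' of our own invention:
    it claims every program is non-terminating (returns 0).
    The proof says any real f must be wrong about some program."""
    return 0

def g():
    x = f_stub(g)        # g runs the decider on itself
    while x != 0:        # loops forever exactly when f says g terminates
        pass
    return "g terminated"

# f_stub claimed g does not terminate (output 0), yet g terminates:
print(g())               # g terminated
```

Had the stub answered 1 instead, g would loop forever, again contradicting the verdict; either way the candidate decider fails on g.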
Unsolvable Problems • The halting problem is just one of a class of problems that have been shown to be non-computable • By the Church-Turing thesis then, any problem that falls into this class is thought to be unsolvable • There is no program that can solve it • What is the significance of this finding for Computer Science? Are there things that we will never be able to solve? • Is intelligence unsolvable?
Complexity of Problems • In chapter 4 we analyzed algorithms by their computational complexity • Here, we will elaborate on complexity and find a class of problems that is too complex to solve by computer • That is, there are solvable problems which are too time consuming to solve • Notice that the term “complexity” is ambiguous • Is it the amount of branching involved in making a decision, or the complexity as seen by the programmer when developing the program (such as problems faced in AI)? This might also be indicated by the size of the program, but it does not imply complexity in execution time; we might think of these examples as space complexity • Or is it how much effort is involved in executing the solution? That is complexity from the computer's viewpoint, or time complexity • The complexity of a problem is taken to be the complexity of the best available (simplest) algorithm to solve the problem
Time Complexity • We define the time complexity of an algorithm to be O(f(n)), where f(n) is some mathematical expression in n and, for some m > 1, there exists a constant c such that c*f(n) is an upper bound on the number of instructions executed by the algorithm on any input of size n, where n > m • What? • Basically, it says that for any reasonably sized input, the algorithm takes at most a constant factor times f(n) instructions
Merge Sort Complexity • We have already analyzed 3 sorting algorithms: • Bubble Sort: best case O(n), worst case O(n^2) • Insertion Sort: best case O(n), worst case O(n^2) • Selection Sort: best case O(n^2), worst case O(n^2) • Merge sort uses recursion and is capable of solving the sorting problem in O(n log n) (best and worst case) • We save this analysis for 2380 or 3333 • See pages 516-518 if you want the details now • Consider the difference between the sorting algorithms when n = 100,000!
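A compact version of merge sort shows where the O(n log n) comes from: the list is halved log n times, and each level of merging touches all n items. This is a standard textbook implementation, not the book's own code.

```python
def merge_sort(items):
    """Sort by splitting in half, sorting each half recursively,
    then merging the two sorted halves: O(n log n) comparisons."""
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]     # append whichever half remains

print(merge_sort([5, 2, 8, 1, 9, 3]))  # [1, 2, 3, 5, 8, 9]
```

At n = 100,000 the difference is dramatic: n^2 is 10 billion steps, while n log2 n is under 2 million.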
Polynomial vs Non-polynomial • The problems we have analyzed all have polynomial complexities • f(n) is a polynomial such as n, n log n, n^2, n^3, etc • see figure 11.11 p. 519 for some graphs of polynomial functions • Some problems have non-polynomial complexities • an example of a non-polynomial function is 2^n • compare n^2 and 2^n when n=5, when n=10, when n=100! • Let's consider some examples
Non-polynomial Problems • Given a list of people on a committee, generate a list of all possible subcommittees • With n members of the committee, there are 2^n subcommittees • Example: {a, b, c, d} --> {}, {a}, {b}, {c}, {d}, {a, b}, {a, c}, {a, d}, {b, c}, {b, d}, {c, d}, {a, b, c}, {a, b, d}, {a, c, d}, {b, c, d}, {a, b, c, d} • An algorithm to perform this action must generate all 2^n combinations and so would have a complexity of at least O(2^n) • The committee-listing problem is complex because of the large number of outputs • A problem with a small output can also be complex • Example: find a combination of integers that, when multiplied together, equals the sum of those integers squared • This requires trying many combinations, so even though the output is small, the time it takes to find the output may be very large
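The subcommittee example can be generated directly, and counting the results confirms the 2^n growth:

```python
from itertools import combinations

def subcommittees(members):
    """Every subset of the committee, including the empty subcommittee."""
    subs = []
    for size in range(len(members) + 1):
        subs.extend(combinations(members, size))
    return subs

subs = subcommittees(["a", "b", "c", "d"])
print(len(subs))  # 16, i.e. 2^4
```

Adding one member doubles the output: 5 members give 32 subcommittees, 20 members give over a million, so no algorithm that must list them all can run in polynomial time.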
Traveling Salesman Problem • This is perhaps the most famous non-polynomial problem • A traveling salesman must travel from his/her current location to each of n cities and then return • There are paths between many of the cities • Each path has a cost • Find the route that takes the salesman to every city and back to the original city that costs the least • The number of possible routes grows even faster than 2^n (there are on the order of n! of them), and brute force must enumerate and compare them all, so the complexity is non-polynomial, at least O(2^n)
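Brute force on TSP means trying every ordering of the cities. The distance matrix below is a small example of our own, not from the textbook:

```python
from itertools import permutations

def tsp_brute_force(dist, start=0):
    """Enumerate every ordering of the other cities (n-1)! routes)
    and return the cheapest round trip. dist[i][j] is the cost of
    the path from city i to city j."""
    cities = [c for c in range(len(dist)) if c != start]
    best_cost, best_route = float("inf"), None
    for order in permutations(cities):
        route = (start,) + order + (start,)
        cost = sum(dist[a][b] for a, b in zip(route, route[1:]))
        if cost < best_cost:
            best_cost, best_route = cost, route
    return best_cost, best_route

# 4 hypothetical cities with symmetric path costs
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 8],
        [10, 4, 8, 0]]
print(tsp_brute_force(dist))  # (23, (0, 1, 3, 2, 0))
```

Four cities need only 6 routes, but 20 cities already need about 10^17, which is why brute force is hopeless in practice.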
A Polynomial Solution to TSP • A nondeterministic instruction is one that can provide the right answer to a question without having to search for it • it will answer the question in O(1) time rather than O(n) or greater • current = current_city; x = all_reachable_cities(current); for all cities do: nondeterministically find the city, c, that gives tsp the overall smallest value; tsp = tsp + distance(current, c); let current = c • The complexity of this solution is O(n), not O(2^n) • Unfortunately, we don't know how to build a non-deterministic instruction as implied by this pseudocode solution to TSP • But if we could, many non-polynomial problems could be solved as polynomial problems • Such an algorithm would be a non-deterministic algorithm
NP Problems and NP-Complete • Any problem that can be solved in polynomial time by using a non-deterministic algorithm is called a non-deterministic polynomial time problem • This set of problems is referred to as the NP problems, or the class NP • The class P is the set of problems solvable by deterministic polynomial time algorithms • P is a subset of NP (all P problems are also NP problems) • Efforts have been made to determine whether P = NP or not • That is, are all NP problems in P as well? • If P = NP, this would indicate that all NP problems can be solved in polynomial time, which would be a significant finding: none of these solvable problems would be too complex to solve in practice • A class within NP is known as NP-Complete • A problem is NP-Complete if it is in NP and every problem in NP can be reduced to it • TSP is NP-Complete; to show that another problem is NP-Complete, we can show that a known NP-Complete problem such as TSP can be converted into (reduced to) it • If any NP-Complete problem can be solved in P, then all NP-Complete problems can be solved in P
Knapsack Problem • Another NP-Complete problem • Given a knapsack that can hold a weight of up to N pounds and separate items weighing w1, w2, w3, …, wn pounds each, what combination of items 1..n can be selected to exactly fill the knapsack? • Example: the knapsack can hold 21 pounds, with items weighing 1, 4, 3, 8, 6, 11 and 5 pounds • which combination of items do you choose? • one combination is 4, 6 and 11 and another is 3, 4, 6 and 8, but to solve this problem you may need to generate all possible combinations and select one that sums to 21
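Brute force on the knapsack example means testing all 2^n subsets of the items, which for 7 items is only 128 checks but doubles with every item added:

```python
from itertools import combinations

def fill_knapsack(capacity, weights):
    """Try every combination of items and keep those whose
    weights sum to exactly the capacity."""
    solutions = []
    for size in range(1, len(weights) + 1):
        for combo in combinations(weights, size):
            if sum(combo) == capacity:
                solutions.append(combo)
    return solutions

print(fill_knapsack(21, [1, 4, 3, 8, 6, 11, 5]))
```

Running it confirms the two combinations mentioned on the slide ({4, 6, 11} and {3, 4, 6, 8}) and turns up a few more.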
Classes of Problems • We see at least two classes of problems: • Unsolvable • Solvable in P • Two further classes are Solvable in NP and Solvable in NP-Complete • What are their relationships to P? • This is an open question pursued in the study of algorithms and computer theory • [Diagram: solvable problems, divided into polynomial and nonpolynomial problems, alongside unsolvable problems; where does NP end?]
Encryption • Since packet-switched networks are not secure but are highly useful for commerce and business, we have a problem: how can people trust the Internet for business? • one answer is to provide security over the Internet without changing how it works • to do this, we turn to encryption • one problem is that both the sender and the receiver of the message must share the same code, and this is impractical for business over the Internet: if you knew the code, you could decode others' messages • A more recently discovered method of encryption is public key encryption, where all senders share the same public key to encode messages but only the receiver has the private key to decode messages • This technique permits secure information transfer across the Internet • The public key scheme described here is a variation of the knapsack problem, so decoding without the private key is NP-Complete • Internet security is maintained
Using Public Key Encryption • Start with a sequence of integer values • for instance, use the list on p. 523 • Take the message to be transmitted and encode it as a string of bits using ASCII or Unicode • Break the bit string into segments of bits • For instance, group the string of bits into 10 bits each • Now, represent each segment as a single number by adding together the integers in our original sequence with corresponding 1’s in the bit segment
Example • Using the sequence 191 691 573 337 365 730 651 493 177 354 and the bit segment 1001100001, we would get 191 + 337 + 365 + 354 = 1247 • Do this for each bit segment and transmit 1247 and the other sums to the recipient • To decode, the recipient must solve the knapsack problem for each sum transmitted: given 1247, what bit segment does it represent? • Since n=10, this could take 2^10 = 1024 different combinations to decipher each sum • Thus, decryption is NP! • If our bit segments were of length 100, then an eavesdropper could not decipher the message in any reasonable amount of time! • So, this encryption code works, but how does the intended recipient decode it?
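The encoding step itself is cheap: add up the sequence numbers whose positions line up with the 1 bits. A sketch using the slide's sequence:

```python
PUBLIC_SEQUENCE = [191, 691, 573, 337, 365, 730, 651, 493, 177, 354]

def encode_segment(bits, sequence):
    """Sum the sequence entries at positions where the bit is 1."""
    return sum(num for bit, num in zip(bits, sequence) if bit == "1")

print(encode_segment("1001100001", PUBLIC_SEQUENCE))  # 1247
```

Encoding is O(n) per segment; only decoding without the private key requires the exponential knapsack search.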
How to decode in P • Let's use a different list of numbers to encode the bit segments: • 1 4 6 12 24 51 105 210 421 850 • Notice that in this list each number is greater than the sum of all the numbers before it (e.g., 12 > 6 + 4 + 1) • Now it's easy to decode a number, say 995 • It must include 850, leaving 145; so it must include 105, leaving 40; so it must include 24, leaving 16; so it must include 12, leaving 4; so it must include 4 • 995 --> 0101101001 • But now it's too easy to decode
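The greedy walk described above works precisely because the sequence is superincreasing: at each step the largest remaining number either must be included (nothing smaller can add up to it) or cannot be. As a sketch:

```python
EASY_SEQUENCE = [1, 4, 6, 12, 24, 51, 105, 210, 421, 850]

def decode_segment(total, sequence):
    """Greedy knapsack decoding, valid only for superincreasing
    sequences: scan from the largest number down, taking each
    number that still fits in the remaining total."""
    bits = []
    for num in reversed(sequence):
        if num <= total:
            bits.append("1")
            total -= num
        else:
            bits.append("0")
    assert total == 0, "total is not a reachable sum"
    return "".join(reversed(bits))

print(decode_segment(995, EASY_SEQUENCE))  # 0101101001
```

One pass over the sequence suffices, so decoding with the easy list is O(n), which is exactly the "too easy" problem the next slide solves.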
Magic Numbers • What we need is an encoding sequence that forces outsiders to solve the hard knapsack problem, together with a related decoding sequence, known only to the recipient, that makes decoding easy • We can apply 3 magic numbers to the easy sequence to produce the hard sequence • For our example, our 3 numbers are 642, 2311 and 18 • We take each number of the easy sequence, multiply it by 642, divide by 2311, and use the remainder
Using the 3 magic numbers • Apply our magic numbers to our “easy number sequence”: • (1 * 642) mod 2311 = 642 • (4 * 642) mod 2311 = 257 • (6 * 642) mod 2311 = 1541 • (12 * 642) mod 2311 = 771 • etc • This gives us the “hard number sequence” 642 257 1541 771 1542 388 391 782 2206 304 • To encode a message, anybody on the Internet converts their bit segments into sums using the above numbers • To decode the message, take each encoded sum, multiply it by 18, and divide by 2311, keeping the remainder (18 is the multiplicative inverse of 642 mod 2311, so this undoes the disguise) • Next, use the easy sequence and its easy knapsack decoding to turn each value into a bit segment • Example: sum = 4895; (4895 * 18) mod 2311 = 292 = 1 + 6 + 24 + 51 + 210, or the bit segment 1010110100 • Group the bits together and then decode using ASCII or Unicode • See fig 11.14 p. 527 and an example in fig 11.15 p. 531
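The whole scheme fits in a few lines; the hard sequence is recomputed here directly from the easy one, which also checks the arithmetic (note that (4 * 642) mod 2311 is 257 and (24 * 642) mod 2311 is 1542).

```python
EASY = [1, 4, 6, 12, 24, 51, 105, 210, 421, 850]
MULT, MOD, INV = 642, 2311, 18            # the three magic numbers

HARD = [(n * MULT) % MOD for n in EASY]   # the public ("hard") sequence

def encode(bits):
    """Public encoding: sum the hard-sequence entries at the 1 bits."""
    return sum(h for bit, h in zip(bits, HARD) if bit == "1")

def decode(total):
    """Private decoding: undo the disguise, then greedy-decode
    against the superincreasing easy sequence."""
    value = (total * INV) % MOD           # back to an easy-knapsack sum
    bits = []
    for n in reversed(EASY):
        if n <= value:
            bits.append("1")
            value -= n
        else:
            bits.append("0")
    return "".join(reversed(bits))

assert (MULT * INV) % MOD == 1            # 18 really is the inverse of 642
print(decode(4895))                       # 1010110100
```

Anyone can run encode with the public hard sequence, but only someone who knows 18 and 2311 can reduce a sum back to the easy knapsack and decode it in linear time.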
The End Any questions?