200 likes | 222 Views
This module discusses the probabilistic method, a technique for providing non-constructive existence proofs. It explores the lower bounds for Ramsey numbers using probabilistic reasoning and showcases the application of the method in graph coloring.
E N D
Discrete MathCS 2800 Prof. Bart Selman selman@cs.cornell.edu Module Probability --- Part e) 1) The Probabilistic Method 2) Randomized Algorithms
The Probabilistic Method • Method for providing non-constructive existence proofs: • Thm. If the probability that a randomly selected element of the set S does • not have a particular property is less than 1, then there exists an • element in S with this property. • Alternatively: If the probability that a random element of S has a • particular property is larger than 0, then there exists at least one • element with that property in S. • Note: We saw an earlier example of the probabilistic method when discussing the 7/8 alg. for 3-CNF.
Example: Lower bound for Ramsey numbers • Recall the definition of Ramsey number R(k,k): • Let R(k,k) be the minimal n such that if the edges of the complete • graph on n nodes are colored Red and Blue, then there either is a • complete subgraph of k nodes with all edges Red or a complete • subgraph of k nodes with all edges Blue. • R(3,3) = 6. So, any complete 6 node graphs has either a Red or • a Blue triangle. (Proof: see “party problem”.)
Let’s say she knows 3 others. Consider one person. Reminder: “The party problem” Dinner party of six:Either there is a group of 3 who all know each other, or there is a group of 3 who are all strangers. By contradiction. Assume we have a party of six where no three people all know each other and no three people are all strangers. If any of those 3 know each other, we have a blue , which means 3 people know each other. Contradicts assumption. So they all must be strangers. But then we have three strangers. Contradicts assumption. She either knows or doesn’t know each other person. The case where she doesn’t know 3 others is similar. Also, leads to constradiction. So, such a party does not exist! QED But there are 5 other people! So, she knows, or doesn’t know, at least 3 others. (GPH)
How do we get a lower bound on R(k,k)? E.g., lower bound for R(3,3)? We need to find an n such thatthe complete graph on n nodes doesnotcontain a Red or a Blue triangle. E.g. So, R(3,3) > 5. I.e., we have a lower bound on the Ramsey Number for k=3. Can we do this for R(k,k) in general? Very difficult to construct the graphs, but… we can prove they exist for non-trivial k.
Unbiased coloring is not essential. Only matters that each possible coloring has some non-zero probability. • Thm. For k ≥ 4, R(k,k) ≥ 2k/2 • So, e.g., k = 20, then there exists a Red/Blue coloring of the complete graph with 2^10 – 1 = 1023 nodes that does not have any complete monochromatic sub graph of size 20. (But we have no idea of how to find such a coloring!) • Proof: Consider a sample space where each possible coloring of • the n-node complete graph is equally likely. A sample coloring • can be obtained by randomly coloring each edge. I.e., with • probability ½ set edge to Blue, otherwise Red. • Let n < 2k/2 • Aside: each particular coloring has probability (1/2) (n * (n-1) /2) We want to show that there is a larger than 0 probability of getting a coloring with no monochromatic k-clique. Hmm… Why?
Consider a subset of k nodes. There will be of such subsets, S_1, S_2, … S_ . Let E_i event that S_i is a monochromatic subgraph (a Red or a Blue clique). So, a sample is a randomly selected graph coloring. What is the probability of E_i? Why? Note: number of edges in clique of size k is:
So, the probability that the randomly selected coloring will have some monochromatic k-clique is: If we can show that this probability is strictly less than 1, then we know that there must exist some coloring that does not have any monochromatic k-clique! We’ll do this. It’s mainly a matter of rewriting the combinatorial expression for the probability and finding an upper bound. First step: Can we just add up the individual probabilities?
But, computing the exact value of is very tricky. Why?? Events are not disjoint and not independent! A coloring can have multiple monochromatic cliques. Also, having e.g. several Blue cliques in part of the graph, makes it more likely to have more Blue cliques on other cliques that share many edges with the Blue cliques. So, all kinds of subtle dependencies! Would need to use the inclusion-exclusion formula (many terms)! Fortunately, we “just” need to upper bound the probability. So, we can truncate the formula to the first set of terms. This gives us Boole’s Inequality: General form, see exercise 15, sect. 6.2. p(E1 U E2) ≤p(E1) + p(E2)
? So, we get: < 1 So, we have “upper bounded” our probability. What’s left? We need to show that the left hand side is strictly less than 1. “Just” combinatorics… O(nk) Note: We have many terms in the sum, but p(E_i) can be pretty small.
Ex. 17, 5.4 by assumption: by assumption: So,
Note: Proof is non-constructive. No effective method known for finding such coverings! So, when and So, is the probability of having a monochromatic k-clique is strictly less than 1. Therefore, there must exist edge colorings in our sample space that do not have such monochromatic k-cliques! So, R(k,k) ≥ 2k/2 QED
Monte Carlo Algorithm • Probabilistic Algorithms --- algorithms that make random choices at one • or more steps. • Decision problems --- problems for which the answer is True or False. • E.g., is this number composite? • Monte Carlo algorithms – there is a small probability that they will return • an incorrect answer.
Probabilistic Primality Test:Miller’s Test • Let n be a positive integer and let n-1 =2st (i.e. divide out the factors of 2), where s is a non-negative integer and t is an odd positive integer. • We say that n passes the Miller test for the base b if either • bt≡1 (mod n) • or It can be shown that a composite integer n passes the Miller’s test for fewer than n/4 bases b, with 1 < b < n.
Probabilistic Primality Testing • Goal of the algorithm is to decide the question – “Is n composite?” • The basic structure of randomized primality tests is as follows: • Randomly pick a number 1 < b < n. • Does n pass the Miller test for b? • If n fails the test, then n is a composite number, and the answer is True; b is known as a witness for the compositeness, and the test STOPS. • Otherwise the answer is “unknown” • Repeat step 1, k times until the required certainty is achieved. • After k iterations, if n is not found to be a composite number, then it can be declared probably prime.
Probabilistic primality testing • Note: • This algorithm can only make mistakes when the answer is unknown. • The probability that a composite integer n passes the Miller’s test for a • randomly selected base b is less than ¼. • By repeating it k times, given that the iterations are independent, the • probability that n is composite but the algorithm responds that is prime is less than ¼ k
Probabilistic primality testing • By taking k sufficient large, we can make the probability of error really, really small. • 10 iterations less than 1 in 106 • 30 iterations less than 1 in 1018 • If we use n as prime as one of the two primes used in the RSA • cryptosystem and n is actually a composite, the procedures used to • decrypt messages will not produce the original encrypted message. • They key is then discarded and two new possible primes are used.
Probability theory is a very rich topic of • increasing importance in computer science, but • for now, this ends our probabilistic adventures. And, our adventures in Discrete Mathematics!! Of course, you’re still left with the popular party question: How does discrete math differ from indiscretemath?