Probabilistic analysis • Wooram Heo
The birthday paradox • How many people must there be in a room before there is a 50% chance that two of them were born on the same day of the year? • Index the people with integers 1, 2, …, k • $b_i$: the day of the year on which person i’s birthday falls, where $1 \le b_i \le n$ • Birthdays are uniformly distributed across the n days: $\Pr\{b_i = r\} = 1/n$ for r = 1, 2, …, n
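The birthday paradox • Assuming birthdays are independent, the prob. that i’s birthday and j’s birthday both fall on day r is $\Pr\{b_i = r \text{ and } b_j = r\} = \Pr\{b_i = r\}\,\Pr\{b_j = r\} = 1/n^2$ • Thus, the prob. that they both fall on the same day is $\Pr\{b_i = b_j\} = \sum_{r=1}^{n} \Pr\{b_i = r \text{ and } b_j = r\} = \sum_{r=1}^{n} 1/n^2 = 1/n$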
The birthday paradox • Pr{at least 2 out of k people have matching birthdays} = 1 – Pr{all k people have distinct birthdays} • $A_i$: event that i’s birthday is different from j’s birthday for all j < i • $B_k$: event that k people have distinct birthdays, so $B_k = \bigcap_{i=1}^{k} A_i$
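The birthday paradox • If $B_{k-1}$ holds, the first k – 1 birthdays occupy k – 1 distinct days, so $\Pr\{A_k \mid B_{k-1}\} = (n - k + 1)/n$ • $\Pr\{B_k\} = \Pr\{B_{k-1}\}\,\Pr\{A_k \mid B_{k-1}\} = 1 \cdot \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right)$ • Since $1 + x \le e^{x}$, $\Pr\{B_k\} \le e^{-1/n} e^{-2/n} \cdots e^{-(k-1)/n} = e^{-k(k-1)/2n}$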
The birthday paradox • The prob. that all k birthdays are distinct is at most ½ when $k(k-1) \ge 2n \ln 2$, i.e. when $k \ge \left(1 + \sqrt{1 + (8 \ln 2)\,n}\right)/2$ • For n = 365, this holds for k ≥ 23 • Thus, if at least 23 people are in a room, the prob. is at least ½ that two people have the same birthday
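The threshold of 23 can be checked numerically. The following is a minimal sketch, assuming uniform and independent birthdays as on the slides; the function names `prob_all_distinct`, `smallest_k`, and `bound_k` are illustrative and not part of the original deck.

```python
import math

def prob_all_distinct(k: int, n: int = 365) -> float:
    """Exact probability that k people all have distinct birthdays."""
    p = 1.0
    for i in range(1, k):
        # Person i+1 must avoid the i days already taken.
        p *= (n - i) / n
    return p

def smallest_k(n: int = 365) -> int:
    """Smallest k for which the distinct-birthday probability is at most 1/2."""
    k = 1
    while prob_all_distinct(k, n) > 0.5:
        k += 1
    return k

def bound_k(n: int = 365) -> float:
    """Threshold from the exponential bound: k >= (1 + sqrt(1 + 8 n ln 2)) / 2."""
    return (1 + math.sqrt(1 + 8 * math.log(2) * n)) / 2

print(smallest_k(365), math.ceil(bound_k(365)))
```

For n = 365 the exact product and the exponential bound agree on the same integer threshold, although in general the bound is only a sufficient condition.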
Balls and bins • Consider the process of randomly tossing identical balls into b bins, numbered 1, 2, …, b • Tosses are independent • The prob. that a tossed ball lands in any given bin is 1/b • The ball-tossing process is a sequence of Bernoulli trials with success prob. 1/b, where success means the ball lands in the given bin • Useful for analyzing hashing
Balls and bins • How many balls must one toss until every bin contains at least one ball? • Call a toss in which a ball falls into an empty bin a “hit” • What is the expected number n of tosses required to get b hits? • The hits can be used to partition the n tosses into stages. The ith stage consists of the tosses after the (i – 1)st hit up to and including the ith hit.
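Balls and bins • [Figure: the n tosses partitioned into stage 1, stage 2, stage 3, …, stage b] • For each toss during the ith stage, the prob. of obtaining a hit is $(b - i + 1)/b$ • $n_i$: the number of tosses in the ith stage, so $n = \sum_{i=1}^{b} n_i$ • Each $n_i$ has a geometric distribution with success prob. $(b - i + 1)/b$, so $E[n_i] = b/(b - i + 1)$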
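Balls and bins • By linearity of expectation, $E[n] = E\left[\sum_{i=1}^{b} n_i\right] = \sum_{i=1}^{b} E[n_i] = \sum_{i=1}^{b} \frac{b}{b - i + 1} = b \sum_{i=1}^{b} \frac{1}{i} = b(\ln b + O(1))$ • So approximately $b \ln b$ tosses are expected before every bin contains a ball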
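The stage-by-stage sum can be evaluated exactly and compared against a simulated run. This is a minimal sketch under the slides' assumptions (independent, uniform tosses); `expected_tosses` and `simulate_tosses` are hypothetical helper names introduced for illustration.

```python
from fractions import Fraction
import random

def expected_tosses(b: int) -> Fraction:
    """Sum of E[n_i] = b / (b - i + 1) over the stages i = 1..b."""
    return sum(Fraction(b, b - i + 1) for i in range(1, b + 1))

def simulate_tosses(b: int, rng: random.Random) -> int:
    """Toss balls uniformly at random until every one of the b bins is hit."""
    filled = set()
    tosses = 0
    while len(filled) < b:
        filled.add(rng.randrange(b))  # the bin this ball lands in
        tosses += 1
    return tosses

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = [simulate_tosses(10, rng) for _ in range(2000)]
print(float(expected_tosses(10)), sum(trials) / len(trials))
```

Using `Fraction` keeps the expectation exact, so it can be compared directly with $b$ times the bth harmonic number.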
Streaks • Suppose you flip a fair coin n times. The longest streak of consecutive heads that you expect to see is $\Theta(\lg n)$ • The proof consists of two steps: showing $O(\lg n)$ and $\Omega(\lg n)$ • $A_{ik}$: the event that a streak of heads of length at least k begins with the ith coin flip, i.e. coin flips i, i + 1, …, i + k – 1 yield only heads, so $\Pr\{A_{ik}\} = 1/2^{k}$
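Streaks • For $k = 2\lceil \lg n \rceil$, $\Pr\{A_{i,2\lceil \lg n \rceil}\} = 1/2^{2\lceil \lg n \rceil} \le 1/2^{2 \lg n} = 1/n^2$ • The prob. that a streak of heads of length at least $2\lceil \lg n \rceil$ begins anywhere is $\Pr\left\{\bigcup_{i=1}^{n - 2\lceil \lg n \rceil + 1} A_{i,2\lceil \lg n \rceil}\right\} \le \sum_{i=1}^{n - 2\lceil \lg n \rceil + 1} \frac{1}{n^2} < \sum_{i=1}^{n} \frac{1}{n^2} = \frac{1}{n}$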
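Streaks • $L_j$: event that the longest streak of heads has length exactly j • L: the length of the longest streak • $E[L] = \sum_{j=0}^{n} j \Pr\{L_j\}$ • Events $L_j$ for j = 0, 1, …, n are disjoint, so the prob. that a streak of heads of length at least $2\lceil \lg n \rceil$ begins anywhere is $\sum_{j=2\lceil \lg n \rceil}^{n} \Pr\{L_j\} < 1/n$ • Splitting the sum for E[L] at $j = 2\lceil \lg n \rceil$ gives $E[L] < 2\lceil \lg n \rceil \cdot 1 + n \cdot (1/n) = O(\lg n)$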
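For small n, the expected longest streak can be computed exactly by enumerating all $2^n$ equally likely outcomes. The sketch below is illustrative only: `longest_streak` and `expected_longest_streak` are names introduced here, and brute-force enumeration is practical only for small n.

```python
from itertools import product
import math

def longest_streak(flips) -> int:
    """Length of the longest run of heads (truthy values) in a flip sequence."""
    best = current = 0
    for f in flips:
        current = current + 1 if f else 0  # extend the run or reset it
        best = max(best, current)
    return best

def expected_longest_streak(n: int) -> float:
    """E[L] for n fair flips, by brute force over all 2**n sequences."""
    total = sum(longest_streak(seq) for seq in product((0, 1), repeat=n))
    return total / 2 ** n

for n in (4, 8, 16):
    print(n, expected_longest_streak(n), math.log2(n))
```

Printing $\lg n$ next to the exact expectation gives a rough feel for the logarithmic growth the slides argue for.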
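The hiring problem • The manager interviews n candidates one at a time and hires a candidate whenever that candidate is better than the current office assistant • In the worst case every candidate is hired, for a total hiring cost of $O(c_h n)$, where $c_h$ denotes the cost of a single hire • What is the expected number of times that the manager hires a new office assistant?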
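The hiring problem • Assume the candidates arrive in uniformly random order • $X_i = I\{\text{candidate } i \text{ is hired}\}$ and $X = X_1 + X_2 + \cdots + X_n$ is the number of hires • $E[X_i] = \Pr\{\text{candidate } i \text{ is hired}\} = 1/i$, since candidate i is hired exactly when it is the best of the first i candidates • $E[X] = \sum_{i=1}^{n} E[X_i] = \sum_{i=1}^{n} 1/i = \ln n + O(1)$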
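The expected number of hires can be checked by brute force for small n. The sketch below assumes that candidates arrive in uniformly random order and that a candidate is hired whenever they beat everyone seen so far; `count_hires`, `average_hires`, and `harmonic` are illustrative names, not taken from the slides.

```python
from fractions import Fraction
from itertools import permutations

def count_hires(scores) -> int:
    """Number of hires when each new best-so-far candidate is hired."""
    hires = 0
    best = float("-inf")
    for s in scores:
        if s > best:  # strictly better than the current assistant
            best = s
            hires += 1
    return hires

def average_hires(n: int) -> Fraction:
    """Exact expected number of hires over all n! arrival orders."""
    perms = list(permutations(range(n)))
    return Fraction(sum(count_hires(p) for p in perms), len(perms))

def harmonic(n: int) -> Fraction:
    """The nth harmonic number 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

print(average_hires(5), harmonic(5))
```

The two printed values coincide, which matches an expected hire count equal to the nth harmonic number, roughly $\ln n$.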
The On-line hiring problem • The manager is willing to settle for a candidate who is close to the best, in exchange for hiring exactly once. • After each interview, the manager must either immediately offer the position to the applicant or immediately reject the applicant. • After the manager has seen j applicants, he knows which of the j has the highest score, but he does not know whether any of the remaining n – j applicants will receive a higher score.
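The On-line hiring problem • Strategy: select a positive integer k < n, interview and then reject the first k applicants, and hire the first applicant thereafter who has a higher score than all preceding applicants. If the best-qualified applicant was among the first k, hire the nth applicant. • We wish to determine, for each possible value of k, the probability that we hire the most qualified applicant.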
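A reject-the-first-k strategy of this kind can be sketched as follows. This is an assumed reading of the strategy (observe k applicants, then take the first one who beats all of them, falling back to the last applicant); the function name `online_maximum` and the 0-based return index are choices made for this example.

```python
def online_maximum(k: int, scores) -> int:
    """Return the 0-based index of the applicant hired.

    The first k applicants are only observed, never hired. Afterwards the
    first applicant who beats the best of those k is hired; if nobody does,
    the last applicant is hired by default.
    """
    n = len(scores)
    best_seen = max(scores[:k], default=float("-inf"))
    for i in range(k, n):
        if scores[i] > best_seen:
            return i
    return n - 1

print(online_maximum(2, [3, 1, 4, 2]))
```

In the call above, the best of the first two scores is 3, and the applicant with score 4 is the first to exceed it, so that applicant is hired.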
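The On-line hiring problem • $M(j) = \max_{1 \le i \le j} \{\text{score}(i)\}$: the maximum score among applicants 1 through j • S: event that we succeed in choosing the best-qualified applicant • $S_i$: event that we succeed when the best-qualified applicant is the ith one interviewed • Since the $S_i$ are disjoint, and we never succeed when the best-qualified applicant is among the first k, $\Pr\{S\} = \sum_{i=1}^{n} \Pr\{S_i\} = \sum_{i=k+1}^{n} \Pr\{S_i\}$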
The On-line hiring problem • $B_i$: event that the best-qualified applicant is in position i. • $O_i$: event that none of the applicants in positions k + 1 through i – 1 is chosen, i.e. all of the values score(k + 1) through score(i – 1) are less than M(k). • $B_i$ and $O_i$ are independent, and $S_i$ occurs exactly when both occur.
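The On-line hiring problem • $\Pr\{B_i\} = 1/n$, $\Pr\{O_i\} = k/(i - 1)$ • $\Pr\{S_i\} = \Pr\{B_i \cap O_i\} = \Pr\{B_i\}\,\Pr\{O_i\} = \frac{k}{n(i - 1)}$ • $\Pr\{S\} = \sum_{i=k+1}^{n} \frac{k}{n(i - 1)} = \frac{k}{n} \sum_{i=k}^{n-1} \frac{1}{i}$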
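The On-line hiring problem • Bound the sum by integrals: $\int_{k}^{n} \frac{1}{x}\,dx \le \sum_{i=k}^{n-1} \frac{1}{i} \le \int_{k-1}^{n-1} \frac{1}{x}\,dx$ • Evaluating these integrals gives us the bounds $\frac{k}{n}(\ln n - \ln k) \le \Pr\{S\} \le \frac{k}{n}(\ln(n - 1) - \ln(k - 1))$ • To maximize the probability of success, focus on choosing the value of k that maximizes the lower bound on Pr{S}. • Differentiating the expression $(k/n)(\ln n - \ln k)$ with respect to k gives $\frac{1}{n}(\ln n - \ln k - 1)$; setting this equal to 0 yields $\ln k = \ln n - 1$, i.e. $k = n/e$. • With k = n/e, the lower bound evaluates to 1/e, so we succeed in hiring our best-qualified applicant with prob. at least 1/e.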
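The choice of k near n/e can be checked against the exact success probability $\Pr\{S\} = \frac{k}{n} \sum_{i=k}^{n-1} \frac{1}{i}$. The sketch below is illustrative; `success_probability` and `best_k` are names introduced here, and exact rational arithmetic is used to avoid floating-point ties between neighbouring values of k.

```python
from fractions import Fraction
import math

def success_probability(k: int, n: int) -> Fraction:
    """Exact Pr{S} = (k/n) * sum_{i=k}^{n-1} 1/i for 1 <= k < n."""
    return Fraction(k, n) * sum(Fraction(1, i) for i in range(k, n))

def best_k(n: int) -> int:
    """The k in 1..n-1 that maximizes the exact success probability."""
    return max(range(1, n), key=lambda k: success_probability(k, n))

n = 100
k = best_k(n)
print(k, n / math.e, float(success_probability(k, n)), 1 / math.e)
```

For n = 100 the maximizing k sits next to n/e, and the exact success probability stays above 1/e, consistent with the lower bound.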