Estimating Maximal Information Amount under Interval Uncertainty
Gang Xiang
Philips Electronics, Healthcare Informatics Business Group
gxiang@acm.org

Measure of Uncertainty
• Uncertainty: we know n possible states.
• Measure of uncertainty:
- A natural measure of uncertainty is the average number of binary (“yes” – “no”) questions we need to ask to find the actual state.
- In information theory, the average number of binary questions is called the amount of information.
• Simple situation:
- With K binary questions, we have $2^K$ possible combinations of K answers.
- So, for $n = 2^K$, we need $K = \log_2 n$ questions.

Entropy for a Probability Distribution
• Situation:
- We know n states.
- We know the probabilities $p_1, \dots, p_n$ of the states.
- What is the average number of binary questions needed to determine the state?
• Answer: We need $S = -\sum_{i=1}^{n} p_i \log_2 p_i$ questions (Shannon’s entropy).
• Example: When $p_1 = \dots = p_n = 1/n$, we need $S = \log_2 n$ questions (as before).
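
As a small illustration of Shannon's entropy, the sketch below evaluates it for a uniform and a non-uniform distribution. The function name `shannon_entropy` is made up for this example and does not come from the slides.

```python
import math

def shannon_entropy(probs):
    """Average number of binary questions: S = -sum(p_i * log2(p_i))."""
    # States with p_i = 0 are skipped, since p * log2(p) tends to 0 as p -> 0.
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([1 / 8] * 8))         # uniform over 8 states: 3.0
print(shannon_entropy([0.5, 0.25, 0.25]))   # 1.5
```

The uniform case reproduces the $\log_2 n$ count from the simple situation, here $\log_2 8 = 3$.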
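
Entropy of the Empirical Distribution
• Probabilities can be estimated as frequencies.
• Given: n values $x_1, \dots, x_n$, where $x_i$ is the observed number of cases with state i.
• Estimate: $p_i = \dfrac{x_i}{\sum_{j=1}^{n} x_j}$
• Entropy: $S = -\sum_{i=1}^{n} \dfrac{x_i}{\sum_{j=1}^{n} x_j} \log_2 \dfrac{x_i}{\sum_{j=1}^{n} x_j}$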
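
A possible way to compute the entropy directly from observed counts is sketched below. It uses the algebraically equivalent form $S = \log_2 X - \frac{1}{X}\sum_i x_i \log_2 x_i$ with $X = \sum_j x_j$. The name `entropy_from_counts` is an assumption of this example.

```python
import math

def entropy_from_counts(counts):
    """Entropy of the empirical distribution p_i = x_i / sum(x)."""
    total = sum(counts)
    # Equivalent form: S = log2(total) - (1/total) * sum(x_i * log2(x_i)).
    weighted = sum(x * math.log2(x) for x in counts if x > 0)
    return math.log2(total) - weighted / total

print(entropy_from_counts([2, 2, 4]))   # frequencies 1/4, 1/4, 1/2
```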
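
Formulation of the Problem
• In practice, we sometimes only know bounds on $x_i$: $\underline{x}_i \le x_i \le \overline{x}_i$.
• We want to know the largest possible number of questions (maximal information amount), i.e., $S_{\max} = \max \{ S(x_1, \dots, x_n) : x_i \in [\underline{x}_i, \overline{x}_i] \}$.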
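
To make the problem statement concrete, here is a brute-force reference that simply tries a grid of values inside every interval. It is not one of the algorithms from the slides. It is exponential in n, only approximate when the maximum lies between grid points, and assumes strictly positive bounds. The names `brute_force_max_entropy` and `steps` are illustrative.

```python
import itertools
import math

def entropy_from_counts(counts):
    total = sum(counts)
    return math.log2(total) - sum(x * math.log2(x) for x in counts) / total

def brute_force_max_entropy(intervals, steps=20):
    """Approximate S_max by evaluating S on a grid inside each [lo, hi]."""
    grids = [
        [lo + (hi - lo) * j / steps for j in range(steps + 1)]
        for lo, hi in intervals
    ]
    # Every combination of grid values is a feasible tuple (x_1, ..., x_n).
    return max(entropy_from_counts(xs) for xs in itertools.product(*grids))

# x_1 is known exactly, x_2 is only known to lie in [2, 4].
print(brute_force_max_entropy([(1, 1), (2, 4)]))
```

In this tiny example the maximum is reached at $x_2 = 2$, the value that brings the two frequencies closest together.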
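
Towards an $O(n^2)$ Time Algorithm
• We have $\dfrac{\partial S}{\partial x_i} = \dfrac{A - \log_2 x_i}{\sum_j x_j}$, where $A = \dfrac{\sum_j x_j \log_2 x_j}{\sum_j x_j}$.
- If $\log_2 x_i > A$ then $\partial S / \partial x_i < 0$, i.e., $x_i \to x_i - h$ ($h > 0$) increases S.
- If $\log_2 x_i < A$ then $\partial S / \partial x_i > 0$, i.e., $x_i \to x_i + h$ ($h > 0$) increases S.
• So, considering each interval $[\underline{x}_i, \overline{x}_i]$:
- If $A \le \log_2 \underline{x}_i$ then $x_i = \underline{x}_i$ to maximize S. (Selecting any other value of $x_i$ leads to the contradiction that $x_i - h$ gives a greater S.)
- If $\log_2 \overline{x}_i \le A$ then $x_i = \overline{x}_i$ to maximize S. (Selecting any other value of $x_i$ leads to the contradiction that $x_i + h$ gives a greater S.)
- If $\log_2 \underline{x}_i < A < \log_2 \overline{x}_i$ then $\log_2 x_i = A$, i.e., $x_i = 2^A$, to maximize S. (Selecting any other value of $x_i$ leads to the contradiction that $x_i - h$ or $x_i + h$ gives a greater S.)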
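
The sign rule above can be checked by differentiating the empirical entropy. The derivation below is a reconstruction from the entropy formula, not a copy of the slide. The shorthand $X = \sum_j x_j$ is introduced here, and $A$ is taken to be the weighted mean of $\log_2 x_j$.

```latex
\begin{align*}
S &= -\sum_j \frac{x_j}{X}\log_2\frac{x_j}{X}
   = \log_2 X - \frac{1}{X}\sum_j x_j \log_2 x_j,
   \qquad X = \sum_j x_j, \\
\frac{\partial S}{\partial x_i}
  &= \frac{1}{X\ln 2}
   - \frac{1}{X}\Bigl(\log_2 x_i + \frac{1}{\ln 2}\Bigr)
   + \frac{1}{X^2}\sum_j x_j \log_2 x_j \\
  &= \frac{A - \log_2 x_i}{X},
   \qquad A = \frac{1}{X}\sum_j x_j \log_2 x_j .
\end{align*}
```

Since $X > 0$, the derivative is negative exactly when $\log_2 x_i > A$ and positive exactly when $\log_2 x_i < A$.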

Towards an $O(n^2)$ Time Algorithm
• This means that, to maximize S, each $x_i$ should be chosen within its interval so that $\log_2 x_i$ is as close as possible to the threshold $A$ from the previous slide.

Resulting $O(n^2)$ Time Algorithm
• Step 1: Sort the 2n endpoints of the intervals $[\underline{x}_i, \overline{x}_i]$ into $x_{(1)} \le x_{(2)} \le \dots \le x_{(2n)}$. Also, denote $x_{(0)} = 0$ and $x_{(2n+1)} = +\infty$.
• Step 2: For each of the 2n+1 intervals $[x_{(k)}, x_{(k+1)}]$, $k = 0, \dots, 2n$, assume that $\log_2 x_{(k)} \le A \le \log_2 x_{(k+1)}$. Choose the $x_i$ based on this assumption (the resulting tuple is denoted $X_k$). Verify the assumption by computing A for this tuple (denoted $A_k$) and checking whether $\log_2 x_{(k)} \le A_k \le \log_2 x_{(k+1)}$. If the assumption is verified, compute S (denoted $S_k$) as a candidate for $S_{\max}$.
• Step 3: Pick the largest of the computed candidates as $S_{\max}$.
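
The three steps can be sketched in Python as follows. This is a minimal sketch, not the author's code, and it makes several assumptions: all bounds are strictly positive; A is the weighted mean $\sum_j x_j \log_2 x_j / \sum_j x_j$ obtained by differentiating the entropy formula; and, because inner points satisfy $\log_2 x_i = A$, they cancel out, so $A_k$ is computed from the endpoint-fixed values alone. The names `max_entropy_quadratic` and `eps` are illustrative.

```python
import math

def entropy_from_counts(xs):
    total = sum(xs)
    return math.log2(total) - sum(x * math.log2(x) for x in xs) / total

def max_entropy_quadratic(intervals, eps=1e-12):
    n = len(intervals)
    # Step 1: sorted endpoints, padded with x(0) = 0 and x(2n+1) = +inf.
    pts = [0.0] + sorted(p for iv in intervals for p in iv) + [math.inf]
    best = -math.inf
    # Step 2: try each zone [x(k), x(k+1)] as the location of 2**A.
    for k in range(2 * n + 1):
        a, b = pts[k], pts[k + 1]
        m_sum = n_sum = 0.0          # sums of x and x*log2(x) over fixed x_i
        fixed = 0
        for lo, hi in intervals:
            if hi <= a:
                v = hi               # whole interval left of the zone
            elif lo >= b:
                v = lo               # whole interval right of the zone
            else:
                continue             # interval covers the zone: inner point
            m_sum += v
            n_sum += v * math.log2(v)
            fixed += 1
        if fixed == 0:
            # All intervals share the zone, so all x_i can be equal.
            best = max(best, math.log2(n))
            continue
        A = n_sum / m_sum
        log_a = math.log2(a) if a > 0 else -math.inf
        log_b = math.log2(b)
        if log_a - eps <= A <= log_b + eps:
            xs = []
            for lo, hi in intervals:
                xs.append(hi if hi <= a else lo if lo >= b else 2.0 ** A)
            best = max(best, entropy_from_counts(xs))
    # Step 3: the largest verified candidate.
    return best

print(max_entropy_quadratic([(1, 1), (2, 4)]))
```

The small tolerance `eps` only guards against floating-point rounding when $A_k$ falls exactly on a zone boundary.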

$O(n^2)$ Time Complexity
• Step 1 (sorting) requires O(n·log(n)) time.
• In Step 2, for each of the 2n+1 iterations:
- Choosing the values of $x_i$ requires O(n) time.
- Computing $A_k$ requires O(n) time.
- Computing $S_k$ requires O(n) time.
Therefore, each iteration requires O(n) + O(n) + O(n) = O(n) time, and the 2n+1 iterations together require $O(n^2)$ time.
• Step 3 (finding the maximum) requires O(n) time.
• In total, the algorithm requires O(n·log(n)) + $O(n^2)$ + O(n) = $O(n^2)$ time.
• Can we speed it up?

Towards an O(n·log(n)) Time Algorithm
• Idea to speed up: we do not actually need O(n) time to obtain $X_k$, $A_k$ and $S_k$ for every k.
• Once $X_k$, $A_k$ and $S_k$ are obtained, choosing $X_{k+1}$ and computing $A_{k+1}$ and $S_{k+1}$ require only O(1) time, since $X_k$ and $X_{k+1}$ differ only in the choice of one $x_i$ (the one with an endpoint at $x_{(k+1)}$).
• Therefore, all $X_k$, $A_k$ and $S_k$ can be obtained in O(n) time: O(n) time to obtain $X_0$, $A_0$ and $S_0$, and 2n·O(1) time to obtain the others.
• In total, the improved algorithm requires O(n·log(n)) (sorting) + O(n) (computing the candidates for $S_{\max}$) + O(n) (finding the maximum) = O(n·log(n)) time.
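
One way to realize the O(1) update is to keep running sums over the endpoint-fixed values and adjust them as each sorted endpoint is crossed. The sketch below is an illustration under assumptions, not the author's implementation: bounds are strictly positive, $A_k = N/M$ with $M = \sum x_i$ and $N = \sum x_i \log_2 x_i$ taken over the fixed values only, and the candidate is evaluated as $S_k = \log_2(M + m \cdot 2^{A_k}) - A_k$, where m is the number of inner points. That closed form follows from the entropy formula when all inner points equal $2^{A_k}$. All identifiers are hypothetical.

```python
import math

def max_entropy_sorted(intervals, eps=1e-12):
    n = len(intervals)
    # kind 0 = lower endpoint, kind 1 = upper endpoint; on ties a lower
    # endpoint sorts first, so an interval is entered before it is left.
    events = sorted((v, kind) for lo, hi in intervals
                    for kind, v in ((0, lo), (1, hi)))
    bounds = [0.0] + [v for v, _ in events] + [math.inf]
    # Zone 0 = [0, x(1)]: every x_i sits at its lower endpoint.
    m_sum = sum(lo for lo, _ in intervals)
    n_sum = sum(lo * math.log2(lo) for lo, _ in intervals)
    inner = 0
    best = -math.inf
    for k in range(2 * n + 1):
        if k > 0:
            v, kind = events[k - 1]          # endpoint x(k) is crossed
            if kind == 0:                    # lower-fixed -> inner point
                m_sum -= v
                n_sum -= v * math.log2(v)
                inner += 1
            else:                            # inner point -> upper-fixed
                m_sum += v
                n_sum += v * math.log2(v)
                inner -= 1
        if inner == n:
            # Every interval covers this zone, so all x_i can be equal.
            m_sum = n_sum = 0.0              # clear rounding residue
            best = max(best, math.log2(n))
            continue
        A = n_sum / m_sum
        a, b = bounds[k], bounds[k + 1]
        log_a = math.log2(a) if a > 0 else -math.inf
        log_b = math.log2(b)
        if log_a - eps <= A <= log_b + eps:
            best = max(best, math.log2(m_sum + inner * 2.0 ** A) - A)
    return best

print(max_entropy_sorted([(1, 2), (8, 16), (8, 16)]))
```

After the initial sort, each of the 2n+1 zones costs a constant number of arithmetic operations, which matches the O(n·log(n)) total stated above.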

Towards a Linear Time Algorithm
• First idea to speed up: we do not actually need to obtain $X_k$, $A_k$ and $S_k$ for every k.
• We proved that, as k increases, first $A_k > \log_2 x_{(k+1)}$, then $\log_2 x_{(k)} \le A_k \le \log_2 x_{(k+1)}$ for just one case, then $A_k < \log_2 x_{(k)}$.
• Therefore, we can use binary search to find the maximizing k. We only need to obtain O(log(n)) of the $X_k$'s and $A_k$'s, and to compute $S_k$ only once.
• However, since we still need O(n·log(n)) time for sorting, this speed-up does not reduce the total time complexity of O(n·log(n)).
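
The binary search over k could look like the sketch below. It still sorts the endpoints, so it stays O(n·log(n)) overall, and it relies on the monotonicity property stated above. Prefix sums are an implementation choice made here so that any $A_k$ can be evaluated in O(1). They are not mentioned on the slides. As in the earlier sketches, bounds are assumed strictly positive, $A_k$ is computed from the endpoint-fixed values only, and all names are hypothetical.

```python
import math

def max_entropy_bisect(intervals, eps=1e-12):
    n = len(intervals)
    events = sorted((v, kind) for lo, hi in intervals
                    for kind, v in ((0, lo), (1, hi)))
    bounds = [0.0] + [v for v, _ in events] + [math.inf]
    # pre[kind][k] = (count, sum of x, sum of x*log2(x)) over the endpoints
    # of that kind among the first k sorted endpoints.
    pre = [[(0, 0.0, 0.0)], [(0, 0.0, 0.0)]]
    for v, kind in events:
        for kd in (0, 1):
            c, s, e = pre[kd][-1]
            if kd == kind:
                c, s, e = c + 1, s + v, e + v * math.log2(v)
            pre[kd].append((c, s, e))
    _, lo_total_s, lo_total_e = pre[0][2 * n]
    first, last = 0, 2 * n
    while first <= last:
        k = (first + last) // 2
        lo_c, lo_s, lo_e = pre[0][k]         # lower endpoints already crossed
        up_c, up_s, up_e = pre[1][k]         # upper endpoints already crossed
        inner = lo_c - up_c
        if inner == n:
            return math.log2(n)              # all intervals share this zone
        m_sum = (lo_total_s - lo_s) + up_s   # x_i still at lower + x_i at upper
        n_sum = (lo_total_e - lo_e) + up_e
        A = n_sum / m_sum
        log_a = math.log2(bounds[k]) if bounds[k] > 0 else -math.inf
        log_b = math.log2(bounds[k + 1])
        if A > log_b + eps:
            first = k + 1                    # maximizing zone lies to the right
        elif A < log_a - eps:
            last = k - 1                     # maximizing zone lies to the left
        else:
            return math.log2(m_sum + inner * 2.0 ** A) - A
    raise ValueError("no consistent zone found")

print(max_entropy_bisect([(1, 2), (8, 16), (8, 16)]))
```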

Towards a Linear Time Algorithm
• Second approach to speed up: we do not actually need to sort all 2n endpoints of the intervals $[\underline{x}_i, \overline{x}_i]$.
• To obtain $X_k$ and $A_k$, we only need to know the values of $x_{(k)}$ and $x_{(k+1)}$, and then classify the $x_i$'s into a “lower endpoint group”, an “upper endpoint group” and an “inner point group”.
• Classifying $x_i$ amounts to classifying the two endpoints of its interval: both are to the left of $[x_{(k)}, x_{(k+1)}]$ (upper endpoint group), both are to the right of $[x_{(k)}, x_{(k+1)}]$ (lower endpoint group), or one is to the left and the other to the right (inner point group).

Towards a Linear Time Algorithm
• In each iteration of binary search, we have three sets of endpoints.
[Figure: Left-side Endpoints | Undecided Endpoints | Right-side Endpoints]

Towards a Linear Time Algorithm
• In each iteration of binary search: find the median (attempted $x_{(k)}$) and the next greater endpoint (attempted $x_{(k+1)}$) of the undecided group, and use this median to divide the undecided group.
[Figure: undecided group divided into Left-side Endpoints | Right-side Endpoints]

Towards a Linear Time Algorithm
• In each iteration of binary search: compute $A_k$ and compare it with $\log_2 x_{(k)}$ and $\log_2 x_{(k+1)}$.
- Assume these k+1 endpoints are all left-side endpoints.
- Assume these 2n-k endpoints are all right-side endpoints.
[Figure: Left-side Endpoints | Right-side Endpoints]

Towards a Linear Time Algorithm
• In each iteration of binary search: update the sets of endpoints based on the result of the comparison. For example, if $A_k > \log_2 x_{(k+1)}$, the maximizing k lies further to the right, so the endpoints assumed to be left-side are confirmed as left-side and only the remaining ones stay undecided.
[Figure: Left-side Endpoints | Undecided Endpoints | Right-side Endpoints]

Towards a Linear Time Algorithm
• In the first iteration, the sets of left-side endpoints and right-side endpoints are empty, and $A_k$ is computed directly. The time complexity of this iteration is O(2n+1).
• In every other iteration of binary search, $A_k$ is computed from the value of $A_k$ in the previous iteration. The time complexity of the iteration is only O(U), where U is the size of the undecided group.
• In total: O(2n+1) + O((2n+1)/2) + O((2n+1)/4) + … (finding the maximizing k) + O(n) (computing $S_k$ as $S_{\max}$) = O(2n+1) + O(n) = O(n).
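
The total above is a geometric series. Written out, assuming the undecided group halves in every iteration as described:

```latex
\[
(2n+1) + \frac{2n+1}{2} + \frac{2n+1}{4} + \dots
\;\le\; (2n+1)\sum_{j=0}^{\infty} 2^{-j}
\;=\; 2\,(2n+1) \;=\; O(n).
\]
```

The halving is what keeps the sum linear: each iteration touches only the undecided endpoints, never all 2n of them again.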

O(n·log(n)) Time Algorithm vs. Linear Time Algorithm
• The constants in the corresponding asymptotics:
- 2 for the O(n·log(n)) time algorithm (sorting 2n endpoints).
- 2 · 20 = 40 for the linear time algorithm (finding the median among the 2n endpoints).
• The linear time algorithm is better when 2·n·log2(n) > 40·n, i.e., when $\log_2 n > 20$, i.e., $n > 2^{20} \approx 10^6$.
• For the smaller values of n met in practice, the O(n·log(n)) time algorithm is better.