Explore the concepts of online decision problems, predicting based on expert advice, and solving shortest path challenges efficiently. Learn about the tree update problem in binary search trees. This seminar provides insights into minimizing regret and optimizing decision-making in dynamic environments using innovative algorithms.
Efficient algorithms for online decision problems Adam Kalai, Santosh Vempala Seminar on Experts and Bandits, Fall 17/18 Ran Hochshtet
Contents • Online decision problem • N Experts • Online shortest paths • Tree update problem
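Introduction • Online decision problem • No knowledge of the future • Each period $t$ we pick a choice $d_t$ • Pay the cost of that choice, revealed only after we pick • Goal: Minimize the regret, i.e. our total cost minus the cost of the best single choice in hindsight
Linear generalization • Series of decisions $d_1, d_2, \ldots$ from a possibly infinite set $\mathcal{D} \subset \mathbb{R}^n$ • Each period $t$ leads to a state $s_t \in \mathcal{S} \subset \mathbb{R}^n$ • Making decision $d$ in state $s$ costs $d \cdot s$ • Total cost = $\sum_t d_t \cdot s_t$
Linear generalization • $s_t \in \mathcal{S}$ - state • $d_t \in \mathcal{D}$ - decision • $M(s) = \arg\min_{d \in \mathcal{D}} d \cdot s$ computes the best single decision in hindsight when applied to the total $s_{1:T} = s_1 + \cdots + s_T$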
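Predicting from expert advice • $n$ experts • Each period $t$ we pick an expert • Pay the cost that expert incurs in period $t$ • Goal: Minimize the regret relative to the best single expert in hindsight
Predicting from expert advice • The $n$-experts problem as a linear problem • $s_t \in \mathbb{R}^n$ - the costs vector (one cost per expert in period $t$)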
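Motivation • Consider an example with two experts • The costs are: $(0, \tfrac12), (1, 0), (0, 1), (1, 0), \ldots$ • Follow the leader incurs a cost of 1 in every period after the first • Its total cost is about $t$, while each expert incurs only about $t/2$ • Using perturbations we can achieve an expected cost close to that of the best expert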
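The failure of follow the leader on two experts can be checked directly. The sketch below assumes the alternating cost sequence (0, 1/2), (1, 0), (0, 1), (1, 0), ... and breaks ties in favour of the first expert; the function names are illustrative only.

```python
from fractions import Fraction

def alternating_costs(periods):
    """Assumed cost vectors (0, 1/2), (1, 0), (0, 1), (1, 0), ... for two experts."""
    costs = [(Fraction(0), Fraction(1, 2))]
    for t in range(1, periods):
        if t % 2 == 1:
            costs.append((Fraction(1), Fraction(0)))
        else:
            costs.append((Fraction(0), Fraction(1)))
    return costs

def follow_the_leader(costs):
    """Each period pick the expert with the lowest total cost so far.

    Ties go to expert 0. Returns (cost paid, cost of the best expert in hindsight).
    """
    totals = [Fraction(0), Fraction(0)]
    paid = Fraction(0)
    for cost in costs:
        leader = 0 if totals[0] <= totals[1] else 1
        paid += cost[leader]
        totals = [totals[0] + cost[0], totals[1] + cost[1]]
    return paid, min(totals)

ftl_cost, best_expert_cost = follow_the_leader(alternating_costs(100))
print(ftl_cost, best_expert_cost)
```

Over 100 periods the leader is wrong in every period but the first, so the cost paid is roughly twice that of either expert.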
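FPL($\epsilon$): On each period $t$: • Choose $p_t$ uniformly at random from the cube $[0, 1/\epsilon]^n$ • Use $M(s_{1:t-1} + p_t)$, where $s_{1:t-1} = s_1 + \cdots + s_{t-1}$
FPL*($\epsilon$): On each period $t$: • Choose $p_t$ at random according to the exponential distribution with density $d\mu(x) \propto e^{-\epsilon |x|_1}$ • Use $M(s_{1:t-1} + p_t)$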
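A minimal sketch of both perturbation schemes for the experts setting, where the oracle simply returns the index of the smallest coordinate. The cube side 1/eps, the two-sided reading of the exponential density, and all function names are assumptions made for illustration.

```python
import random

def best_decision(costs):
    """Oracle M for the experts setting: index of the smallest coordinate."""
    return min(range(len(costs)), key=lambda i: costs[i])

def cube_perturbation(n, eps, rng):
    """Assumed FPL perturbation: uniform over the cube [0, 1/eps]^n."""
    return [rng.uniform(0.0, 1.0 / eps) for _ in range(n)]

def exponential_perturbation(n, eps, rng):
    """Assumed FPL* perturbation: density proportional to exp(-eps * |x|_1),
    sampled per coordinate as an exponential magnitude with a random sign."""
    return [rng.choice((-1.0, 1.0)) * rng.expovariate(eps) for _ in range(n)]

def follow_perturbed_leader(states, eps, perturb, rng):
    """Play the leader of the perturbed cumulative costs; return the total cost paid."""
    n = len(states[0])
    cumulative = [0.0] * n
    total = 0.0
    for s in states:
        p = perturb(n, eps, rng)  # fresh perturbation every period
        choice = best_decision([c + x for c, x in zip(cumulative, p)])
        total += s[choice]
        cumulative = [c + x for c, x in zip(cumulative, s)]
    return total

states = [(0.0, 0.5)] + [(1.0, 0.0) if t % 2 else (0.0, 1.0) for t in range(1, 100)]
print(follow_perturbed_leader(states, 0.1, cube_perturbation, random.Random(0)))
print(follow_perturbed_leader(states, 0.1, exponential_perturbation, random.Random(0)))
```

Any other decision set works the same way once `best_decision` is replaced by an oracle for that set.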
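Notations • $|x|_1 = \sum_{i=1}^n |x_i|$ for $x \in \mathbb{R}^n$ • $D \ge |d - d'|_1$ for all $d, d' \in \mathcal{D}$ • $R \ge |d \cdot s|$ for all $d \in \mathcal{D}, s \in \mathcal{S}$ • $A \ge |s|_1$ for all $s \in \mathcal{S}$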
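Theorem 1.1 • Let $s_1, s_2, \ldots \in \mathcal{S}$ be a state sequence • (a) Running FPL($\epsilon$) with $\epsilon \le 1$ gives, for all $T > 0$: $E[\text{cost of FPL}(\epsilon)] \le \text{min-cost}_T + \epsilon R A T + D/\epsilon$
Theorem 1.1 • Let $s_1, s_2, \ldots \in \mathcal{S}$ be a state sequence • (b) For nonnegative $\mathcal{D}, \mathcal{S} \subset \mathbb{R}^n_+$, FPL* gives: $E[\text{cost of FPL}^*(\epsilon/2A)] \le (1+\epsilon)\,\text{min-cost}_T + 4AD(1+\ln n)/\epsilon$
Theorem 1.1 • If $T$ or $\text{min-cost}_T$ are known, $\epsilon$ can be tuned in advance • For FPL: with $\epsilon = \sqrt{D/(RAT)}$, $E[\text{cost}] \le \text{min-cost}_T + 2\sqrt{DRAT}$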
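The tuned bound for FPL follows from balancing the two regret terms. The derivation below assumes a bound of the form $\text{min-cost}_T + \epsilon RAT + D/\epsilon$ and that the resulting $\epsilon$ does not exceed 1.

```latex
\frac{d}{d\epsilon}\left(\epsilon RAT + \frac{D}{\epsilon}\right)
  = RAT - \frac{D}{\epsilon^{2}} = 0
  \quad\Longrightarrow\quad
  \epsilon = \sqrt{\frac{D}{RAT}},
\qquad
\epsilon RAT + \frac{D}{\epsilon}
  = \sqrt{DRAT} + \sqrt{DRAT}
  = 2\sqrt{DRAT}.
```

The regret therefore grows like $\sqrt{T}$, so the average regret per period vanishes as $T$ grows.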
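Experts problem • It seems that $A = n$, since every expert could incur cost 1 in the same period • In our algorithm the worst case is when each period only one expert incurs cost • (Figure: the min-cost expert is b; after we choose b, there is a chance we choose c)
Experts problem • Since the worst case has only one expert incurring cost each period, we may take $A = 1$ • By Theorem 1.1(b): $E[\text{cost}] \le (1+\epsilon)\,\text{min-cost}_T + O(\log n)/\epsilon$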
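Online shortest paths • Input • Directed graph $G = (V, E)$ • Pair of nodes $(s, t)$ • Each period pick a path from $s$ to $t$ • Then the times on all edges are revealed • The cost is the sum of the times on the chosen path
Online shortest paths • $m$ is the number of edges • A decision is the 0/1 vector in $\mathbb{R}^m$ marking the edges of the chosen path • The state is the times vector in $\mathbb{R}^m$ (one time per edge)
Online shortest paths • Use FPL* • On each period $t$: • For each edge $e$ pick $p_t^e$ from the exponential distribution (same as FPL*) • $c_e$ = the total time on edge $e$ so far • Use the shortest path in the graph with weights $c_e + p_t^e$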
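The per-period step for shortest paths can be sketched with Dijkstra's algorithm as the oracle. This is an illustration under stated assumptions: perturbations are one-sided exponentials (so edge weights stay nonnegative and Dijkstra applies), the target is reachable from the source, and the graph encoding and function names are hypothetical.

```python
import heapq
import random

def shortest_path(graph, weights, source, target):
    """Dijkstra over nonnegative weights; graph maps node -> list of successors.

    Assumes target is reachable from source.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v in graph.get(u, []):
            nd = d + weights[(u, v)]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def online_shortest_paths(graph, time_sequence, source, target, eps, rng):
    """Each period, route along the shortest path under perturbed total edge times."""
    edges = [(u, v) for u in graph for v in graph[u]]
    totals = {e: 0.0 for e in edges}  # total time on each edge so far
    total_time = 0.0
    for times in time_sequence:
        weights = {e: totals[e] + rng.expovariate(eps) for e in edges}
        path = shortest_path(graph, weights, source, target)
        total_time += sum(times[(u, v)] for u, v in zip(path, path[1:]))
        for e in edges:
            totals[e] += times[e]  # all edge times are revealed after the choice
    return total_time

graph = {"s": ["a", "b"], "a": ["t"], "b": ["t"], "t": []}
slow_a = {("s", "a"): 1.0, ("a", "t"): 1.0, ("s", "b"): 0.0, ("b", "t"): 0.0}
print(online_shortest_paths(graph, [slow_a] * 200, "s", "t", 0.5, random.Random(0)))
```

Only the oracle depends on the graph structure; the perturb-then-optimize loop is the same as in the experts case.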
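Online shortest paths • With $n$ nodes and edge times in $[0, 1]$: $D \le 2n$ (a path has at most $n - 1$ edges), $A \le m$ • By Theorem 1.1(b): $E[\text{time}] \le (1+\epsilon)\,(\text{best time in hindsight}) + O(mn \log n)/\epsilon$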
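Proof of Theorem 1.1 - • “Be the leader” – use $M(s_{1:t})$ instead of $M(s_{1:t-1})$, where $s_{1:t} = s_1 + \cdots + s_t$ • “Be the leader” has no regret • Prove by induction
Proof of Theorem 1.1 - • We show that: $\sum_{t=1}^{T} M(s_{1:t}) \cdot s_t \le M(s_{1:T}) \cdot s_{1:T}$ • For $T = 1$ – trivial • Induction step from $T$ to $T+1$: $\sum_{t=1}^{T+1} M(s_{1:t}) \cdot s_t \le M(s_{1:T}) \cdot s_{1:T} + M(s_{1:T+1}) \cdot s_{T+1} \le M(s_{1:T+1}) \cdot s_{1:T} + M(s_{1:T+1}) \cdot s_{T+1} = M(s_{1:T+1}) \cdot s_{1:T+1}$, where the second inequality holds because $M(s_{1:T})$ is optimal for $s_{1:T}$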
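The no-regret claim for "be the leader" is easy to sanity-check numerically in the experts setting. The sketch below is illustrative only: it plays the expert that is best after including the current period's costs and compares the result with the best expert in hindsight.

```python
import random

def be_the_leader_gap(states):
    """Cost of 'be the leader' minus the cost of the best expert in hindsight."""
    n = len(states[0])
    cumulative = [0.0] * n
    paid = 0.0
    for s in states:
        # Include the current period before choosing (not possible online).
        cumulative = [c + x for c, x in zip(cumulative, s)]
        leader = min(range(n), key=lambda i: cumulative[i])
        paid += s[leader]
    return paid - min(cumulative)

rng = random.Random(0)
states = [[rng.random() for _ in range(4)] for _ in range(500)]
print(be_the_leader_gap(states))  # never positive, up to floating-point error
```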
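Lemma 3.1 • We want to show that perturbations do not hurt too much • Still the “be the leader” algorithm • For any state sequence $s_1, s_2, \ldots$, any $T > 0$ and any vectors $p_0 = 0, p_1, \ldots, p_T \in \mathbb{R}^n$: $\sum_{t=1}^{T} M(s_{1:t} + p_t) \cdot s_t \le M(s_{1:T}) \cdot s_{1:T} + D \sum_{t=1}^{T} |p_t - p_{t-1}|_\infty$
Proof of Lemma 3.1 • Pretend the cost vector on period $t$ is $s_t + p_t - p_{t-1}$ instead of $s_t$, so the cumulative state is $s_{1:t} + p_t$ • Apply the “be the leader” inequality to this modified sequence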
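Proof of Theorem 1.1 - • Use $p_1 = p_2 = \cdots = p_T$, i.e. $p_t = p_1$ for all $t$ • No need to choose a new $p_t$ each period, since the expected cost is the same • Applying Lemma 3.1: $E\left[\sum_{t=1}^{T} M(s_{1:t} + p_t) \cdot s_t\right] \le M(s_{1:T}) \cdot s_{1:T} + D\,|p_1|_\infty \le \text{min-cost}_T + D/\epsilon$
Proof of Theorem 1.1 - • Now we return to use $M(s_{1:t-1} + p_t)$ instead of $M(s_{1:t} + p_t)$ • We need to show that: $E[M(s_{1:t-1} + p_t) \cdot s_t] \le E[M(s_{1:t} + p_t) \cdot s_t] + \epsilon R\,|s_t|_1$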
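Proof of Theorem 1.1 - • Key idea: the distributions over $s_{1:t-1} + p_t$ and $s_{1:t} + p_t$ are similar • If the cubes are identical, i.e. $s_{1:t-1} = s_{1:t}$, then $E[M(s_{1:t-1} + p_t) \cdot s_t] = E[M(s_{1:t} + p_t) \cdot s_t]$ • If they overlap on a fraction $f$ of their volume: $E[M(s_{1:t-1} + p_t) \cdot s_t] \le E[M(s_{1:t} + p_t) \cdot s_t] + (1 - f) R$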
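Lemma 3.2 • For any $v \in \mathbb{R}^n$, the cubes $[0, 1/\epsilon]^n$ and $v + [0, 1/\epsilon]^n$ overlap in at least a $(1 - \epsilon |v|_1)$ fraction
Proof of Lemma 3.2 • Take a random point $x \in [0, 1/\epsilon]^n$; if $x \notin v + [0, 1/\epsilon]^n$, then for some $i$, $x_i \notin v_i + [0, 1/\epsilon]$, which happens with probability at most $\epsilon |v_i|$ • With the union bound we get: $\Pr[x \notin v + [0, 1/\epsilon]^n] \le \epsilon \sum_i |v_i| = \epsilon |v|_1$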
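The overlap of two axis-aligned cubes has a closed form, which makes the union bound easy to check. The sketch below assumes cubes of side 1/eps, one of them shifted by a vector v, and compares the exact overlap fraction with the lower bound 1 - eps * |v|_1.

```python
def overlap_fraction(v, eps):
    """Exact fraction of the cube [0, 1/eps]^n that also lies in v + [0, 1/eps]^n."""
    side = 1.0 / eps
    fraction = 1.0
    for vi in v:
        # Along each axis the two intervals share a length of max(0, side - |v_i|).
        fraction *= max(0.0, side - abs(vi)) / side
    return fraction

def union_bound(v, eps):
    """Lower bound from the union bound: 1 - eps * |v|_1."""
    return 1.0 - eps * sum(abs(vi) for vi in v)

print(overlap_fraction([0.5, 0.25], 0.5), union_bound([0.5, 0.25], 0.5))
```

The exact value is a product of per-axis terms $1 - \epsilon|v_i|$, and the bound drops the cross terms of that product.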
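Proof of Theorem 1.1 - • By Lemma 3.2 with $v = s_t$: the cubes overlap in at least a $1 - \epsilon |s_t|_1 \ge 1 - \epsilon A$ fraction • Each period the difference between using $M(s_{1:t-1} + p_t)$ and $M(s_{1:t} + p_t)$ is at most $\epsilon R A$ • We get: $E[\text{cost of FPL}(\epsilon)] \le \text{min-cost}_T + \epsilon R A T + D/\epsilon$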
The tree update problem • Maintain a binary search tree over $n$ items • There is an unknown sequence of accesses to the items • The cost of an access is the number of comparisons • This equals the depth of the accessed item in the tree
The tree update problem • We can solve the problem with FPL • Each period we find the best tree so far, and use it • The problem: • For each access we do an expensive computation
The tree update problem • Follow the lazy leading tree ($N$): • For $1 \le i \le n$, let $v_i := 0$ and choose $r_i$ randomly from $\{1, 2, \ldots, N\}$ • Start with the best tree as if there were $r_i$ accesses to node $i$ • After each access to item $i$: (a) $v_i := v_i + 1$ (b) if $v_i \ge r_i$ then i. $r_i := r_i + N$ ii. Compute the best tree as if there were $r_i$ accesses to node $i$
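The lazy update rule can be sketched with a textbook cubic-time dynamic program for the optimal static binary search tree standing in for the "best tree" computation. The thresholds are assumed to be drawn from {1, ..., N}, the root is assumed to have depth 1, and all function names are hypothetical.

```python
import random

def best_tree_depths(weights):
    """Depth (root = 1) of every key in an optimal static BST for the given counts."""
    n = len(weights)
    prefix = [0]
    for w in weights:
        prefix.append(prefix[-1] + w)
    # cost[i][j]: minimum weighted depth for keys i..j-1; root[i][j]: the root used.
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    root = [[None] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            best = min(range(i, j), key=lambda r: cost[i][r] + cost[r + 1][j])
            root[i][j] = best
            cost[i][j] = cost[i][best] + cost[best + 1][j] + prefix[j] - prefix[i]
    depths = [0] * n
    stack = [(0, n, 1)]
    while stack:
        i, j, depth = stack.pop()
        if i < j:
            r = root[i][j]
            depths[r] = depth
            stack.append((i, r, depth + 1))
            stack.append((r + 1, j, depth + 1))
    return depths

def lazy_leading_tree(accesses, n, N, rng):
    """Return (total comparisons, number of tree recomputations)."""
    v = [0] * n                                 # true access counts
    r = [rng.randint(1, N) for _ in range(n)]   # random thresholds in {1, ..., N}
    depths = best_tree_depths(r)
    total_cost, updates = 0, 0
    for item in accesses:
        total_cost += depths[item]
        v[item] += 1
        if v[item] >= r[item]:
            r[item] += N
            depths = best_tree_depths(r)        # the only expensive step
            updates += 1
    return total_cost, updates

rng = random.Random(0)
accesses = [rng.randrange(5) for _ in range(200)]
print(lazy_leading_tree(accesses, 5, 10, random.Random(1)))
```

Each item triggers a recomputation at most once per $N$ of its own accesses (plus once initially), so the tree is rebuilt at most $T/N + n$ times over $T$ accesses.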
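Calling the oracle $M$ can be computationally expensive • We want to minimize the number of times we call $M$ • Trick: reuse the previous decision as often as possible
FLL (follow the lazy leader) is equivalent to FPL in terms of expected cost • FLL rarely calls the oracle $M$ • FLL rarely changes decision from one period to the next
FLL($\epsilon$): Once, choose $p \in [0, 1/\epsilon)^n$ uniformly • Determine a grid: $G = \{p + z/\epsilon \mid z \in \mathbb{Z}^n\}$ • On period $t$: Use $M(g_{t-1})$, where $g_{t-1}$ is the unique point in $G \cap (s_{1:t-1} + [0, 1/\epsilon)^n)$ • If $g_t = g_{t-1}$ - no need to re-evaluate $M$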
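A minimal sketch of the lazy variant for the experts setting, assuming a grid of spacing 1/eps with a single random offset drawn once. The oracle is called only when the grid point selected by the cumulative state changes; the function names are illustrative.

```python
import math
import random

def grid_point(cumulative, offset, eps):
    """Unique point of {offset + z/eps : z integer vector} in cumulative + [0, 1/eps)^n."""
    side = 1.0 / eps
    return [p + side * math.ceil((s - p) / side) for s, p in zip(cumulative, offset)]

def follow_the_lazy_leader(states, eps, rng):
    """Return (total cost paid, number of oracle calls)."""
    n = len(states[0])
    offset = [rng.uniform(0.0, 1.0 / eps) for _ in range(n)]  # chosen once
    cumulative = [0.0] * n
    g = grid_point(cumulative, offset, eps)
    choice = min(range(n), key=lambda i: g[i])   # oracle call on the grid point
    total, oracle_calls = 0.0, 1
    for s in states:
        total += s[choice]
        cumulative = [c + x for c, x in zip(cumulative, s)]
        new_g = grid_point(cumulative, offset, eps)
        if new_g != g:                           # re-evaluate only when the grid point moves
            g = new_g
            choice = min(range(n), key=lambda i: g[i])
            oracle_calls += 1
    return total, oracle_calls

states = [(0.0, 0.5)] + [(1.0, 0.0) if t % 2 else (0.0, 1.0) for t in range(1, 1000)]
print(follow_the_lazy_leader(states, 0.05, random.Random(0)))
```

With a small `eps` the grid is coarse, so the grid point, and with it the decision, changes only occasionally.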
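Lemma • For any fixed sequence of states $s_1, s_2, \ldots$, FPL($\epsilon$) and FLL($\epsilon$) (also FPL*($\epsilon$) and FLL*($\epsilon$)) have identical expected costs on each period $t$. • The probability of FLL($\epsilon$) (or FLL*($\epsilon$)) performing an update on period $t$ is at most $\epsilon |s_t|_1$.
Proof of Lemma • FLL($\epsilon$) chooses a uniformly random grid of spacing $1/\epsilon$ • There is exactly one grid point inside $s_{1:t-1} + [0, 1/\epsilon)^n$ • By symmetry $g_{t-1}$ is uniformly distributed over $s_{1:t-1} + [0, 1/\epsilon)^n$ • Same as FPL($\epsilon$) - $s_{1:t-1} + p_t$ is uniform over $s_{1:t-1} + [0, 1/\epsilon]^n$
Lemma 3.2: For any $v \in \mathbb{R}^n$, the cubes $[0, 1/\epsilon]^n$ and $v + [0, 1/\epsilon]^n$ overlap in at least a $(1 - \epsilon |v|_1)$ fraction. Proof of Lemma (continued) • In each update: $g_t \ne g_{t-1}$ • The grid point $g_{t-1}$ is not in the cube $s_{1:t} + [0, 1/\epsilon)^n$ • By Lemma 3.2: $\Pr[\text{update on period } t] \le \epsilon |s_t|_1$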
Summary • Online decision problem • N Experts • Online shortest paths • Tree update problem
THANKS! ANY QUESTIONS?