Amortized Analysis
The average cost of a sequence of n operations on a given data structure.
• Aggregate Analysis
• Accounting Method
CS333 / Cutler Amortized Analysis
Amortized Analysis
• Amortized analysis computes the average time required to perform a sequence of n operations on a data structure
• Often worst-case analysis is not tight, and the amortized cost of an operation is less than its worst case
• Analogy (making coffee)
Applications of amortized analysis
• Vectors / tables
• Disjoint sets
• Priority queues
• Heaps, binomial heaps, Fibonacci heaps
• Hashing
Difference between amortized and average cost
• Average-case analysis needs probabilistic assumptions about the inputs
• Amortized analysis needs no such assumptions
• We compute the average cost per operation over any sequence of n operations, so the bound holds in the worst case
Operations on Data Structures
• A data structure has a set of operations associated with it.
• Example: a stack with push(), pop() and MultiPop(k).
• Often some operations are slow while others are fast.
  • push() and pop() are fast.
  • MultiPop(k) may be slow.
• Sometimes the time of a single operation can vary
Methods
• Aggregate analysis - the total amount of time needed for the n operations is computed and divided by n
• Accounting - operations are assigned an amortized cost. Objects of the data structure are assigned a credit
• Potential - the prepaid work (money in the “bank”) is represented as “potential” energy that can be released to pay for future operations
Aggregate analysis
• n operations take T(n) time
• Amortized cost of an operation is T(n)/n
Stack - aggregate analysis
• A stack with operations Push, Pop and Multipop.
Multipop(S, k)
    while notempty(S) and k > 0 do
        Pop(S);
        k := k - 1
    endwhile
Stack - aggregate analysis
• Push and Pop are O(1) (to move 1 data element)
• Multipop is O(min(s, k)) where s is the size of the stack and k the number of elements to pop.
• Assume a sequence of n Push, Pop and Multipop operations
Stack - aggregate analysis
• Each object can be popped only once for each time it is pushed
• So the total number of times Pop can be called (directly or from Multipop) is bounded by the number of Pushes, which is at most n.
Stack - aggregate analysis
• A sequence of n Push, Pop and Multipop operations therefore takes O(n) time in total, and the amortized cost of each operation is O(n)/n = O(1)
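The aggregate argument above can be checked with a small sketch. This is an illustration only: the class name CountingStack, the moves counter and the run helper are made up for this example and do not come from the slides. It counts one move per element pushed or popped and compares the total with 2n.

```python
class CountingStack:
    """Stack that counts how many single-element moves it performs."""

    def __init__(self):
        self.items = []
        self.moves = 0  # one move per element pushed or popped

    def push(self, x):
        self.items.append(x)
        self.moves += 1

    def pop(self):
        # Assumes the stack is not empty.
        self.moves += 1
        return self.items.pop()

    def multipop(self, k):
        # Pops min(len(items), k) elements, following the Multipop pseudocode.
        popped = []
        while self.items and k > 0:
            popped.append(self.pop())
            k -= 1
        return popped


def run(ops):
    """Run a list of (name, arg) operations and return the total number of moves."""
    s = CountingStack()
    for name, arg in ops:
        if name == "push":
            s.push(arg)
        elif name == "pop":
            s.pop()
        else:
            s.multipop(arg)
    return s.moves


ops = [("push", "a"), ("push", "b"), ("push", "c"), ("multipop", 3)]
print(run(ops), "moves for", len(ops), "operations")  # never more than 2 * len(ops)
```

Every element is moved once when pushed and at most once when popped, so the total can never exceed twice the number of operations.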
Stack example: moves per operation <= 2
• Sequence 1: push a, push b, push c, pop c, pop b, pop a (6 operations, 6 moves)
• Sequence 2: push a, push b, push c, Multipop(3) (4 operations, 6 moves)
Accounting Method
• Charge each operation an (invented) amortized cost.
• Often different from the actual run-time cost. Some operations may have an amortized cost larger than their actual cost, others smaller.
• Unlike businesses we do not want to make a profit
• We want to cover the actual cost
• The amount charged but not used in performing an operation is stored as credit with objects of the data structure
Accounting method
• Later operations can use the stored amount to pay for their actual cost
• The credit balance must never go negative (there must always be enough to pay for performing future operations)
Stack - amortized analysis
• We assign the amortized costs:
  • $2 for Push
  • $0 for both Pop and Multipop
• For a sequence of n Push, Pop and Multipop operations the total amortized cost is at most 2n, or O(n)
Stack - amortized analysis
• Each time we do a Push we pay $1 for the actual cost of the Push and the element has a credit of $1.
• Each time an element is popped we take the $1 credit to pay for it
• Thus the balance is always nonnegative
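A minimal sketch of this bookkeeping, assuming a random mix of operations. The function simulate, its parameters, and the choice of Multipop sizes between 1 and 5 are hypothetical and only serve the illustration. It charges $2 per Push and $0 otherwise, pays $1 per element actually moved, and checks that the balance always equals the number of elements on the stack.

```python
import random

PUSH_CHARGE = 2  # amortized cost charged for a Push, in dollars
POP_CHARGE = 0   # amortized cost charged for Pop and for Multipop


def simulate(n, seed=0):
    """Run n random stack operations; return (charged, actual, min_balance)."""
    rng = random.Random(seed)
    size = 0                 # current number of elements on the stack
    charged = actual = 0
    min_balance = 0
    for _ in range(n):
        op = rng.choice(["push", "pop", "multipop"])
        if op == "push":
            size += 1
            charged += PUSH_CHARGE
            actual += 1
        else:
            k = 1 if op == "pop" else rng.randint(1, 5)
            popped = min(size, k)   # popping an empty stack is treated as a no-op
            size -= popped
            charged += POP_CHARGE
            actual += popped
        balance = charged - actual
        assert balance == size      # each element on the stack holds exactly $1
        min_balance = min(min_balance, balance)
    return charged, actual, min_balance


charged, actual, lowest = simulate(1000)
print(charged, actual, lowest)
```

Because the balance equals the stack size, it can never be negative, and the total amortized charge of at most 2n is an upper bound on the actual cost.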
k-bit binary counter
• The counter is an array A of k bits, indexed k-1, ..., 1, 0 (A[0] is the lowest-order bit)
Increment(A)
    i = 0
    while i < length[A] and A[i] = 1
        A[i] = 0
        i++
    if i < length[A]
        A[i] = 1
• Initially the counter contains 0
• Eventually it becomes 2^k - 1
• The next Increment resets it to 0
Aggregate analysis: 3-bit counter (k = 3)
Decimal   Bit 2   Bit 1   Bit 0
0         0       0       0
1         0       0       1
2         0       1       0
3         0       1       1
4         1       0       0
5         1       0       1
6         1       1       0
7         1       1       1
Flips     2       4       8
• Count the number of times a bit is flipped.
• Let the number of increments be n = 2^k (if n < 2^k the analysis is similar)
• A[0] is flipped n times
• A[1] is flipped n/2^1 times
• …
• A[k-1] is flipped n/2^(k-1) times
• Total flips: n + n/2 + … + n/2^(k-1) < 2n, so the amortized cost of an Increment is O(1)
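The flip counts can be reproduced with a short sketch. It follows the Increment pseudocode but stores the bits in a Python list with index 0 as the lowest-order bit; the function names increment and total_flips are chosen here for illustration.

```python
def increment(bits):
    """Increment a bit list in place (bits[0] is the lowest-order bit).

    Returns the number of bits flipped by this call.
    """
    flips = 0
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0          # reset a trailing 1
        flips += 1
        i += 1
    if i < len(bits):
        bits[i] = 1          # set the first 0 bit
        flips += 1
    return flips


def total_flips(k, n):
    """Total bit flips for n increments of a k-bit counter that starts at 0."""
    bits = [0] * k
    return sum(increment(bits) for _ in range(n))


print(total_flips(3, 8))  # 14 = 2 + 4 + 8, which is less than 2 * 8
```

A single Increment can flip all k bits, yet the total over n increments stays below 2n.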
Accounting method
• Charge an amortized cost of $2 to set a bit to 1
• When a bit is set to 1, pay $1 for the actual cost and store $1 with the bit
• Note: at all times a bit with value 1 has $1
• When a bit is reset to 0, use its stored $1 to pay for the actual cost
• Each Increment sets at most one bit to 1, so it is charged at most $2
Accounting method
• Let the value stored in the counter be:
    Credit:  $1 $1 $0 $0 $1 $0 $0 $1 $1 $1
    Bits:     1  1  0  0  1  0  0  1  1  1
• After an increment and a payment of $2 = $1 + $1:
    Credit:  $1 $1 $0 $0 $1 $0 $1 $0 $0 $0
    Bits:     1  1  0  0  1  0  1  0  0  0
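The credit picture above can be traced in code. The sketch below is an illustration under assumptions: the function increment_with_credit and the separate credit list are invented for this example, and the lists are stored with index 0 as the lowest-order bit, so the slide's counter 1100100111 appears reversed.

```python
def increment_with_credit(bits, credit):
    """Increment `bits` in place (bits[0] is the lowest-order bit).

    credit[i] is the number of dollars stored with bit i.
    Returns the amortized charge for this call (2 or 0).
    """
    charge = 0
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0
        credit[i] -= 1       # the stored $1 pays for the reset
        i += 1
    if i < len(bits):
        bits[i] = 1
        charge = 2           # $1 pays for setting the bit...
        credit[i] += 1       # ...and $1 is stored with it
    return charge


# The slide's example 1100100111, written lowest-order bit first.
bits = [1, 1, 1, 0, 0, 1, 0, 0, 1, 1]
credit = list(bits)          # every 1 bit holds $1
charge = increment_with_credit(bits, credit)
print(charge, bits, credit)  # credit still matches bits after the increment
```

The credit list equals the bit list before and after every call, which is the invariant that a bit with value 1 always holds $1.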
Dynamic table (object table, hash table, vector, etc.)
• The table is dynamic
• We can’t predict its maximum size
• We would like to avoid allocating a lot of unused space (reasonable load factor)
• We may not be able to avoid table overflow.
• Overflow should not cause a run-time failure
java.util.Vector
• A built-in class which is a “growable” array.
• The user can set:
  • initialCapacity - the initial capacity of the vector.
  • capacityIncrement - the amount by which the capacity is increased when the vector overflows.
• If capacityIncrement is not set (zero or less), the capacity doubles each time the vector overflows
Dynamic table
• Idea: allocate more memory as needed.
• Double the size of the table after each overflow.
• After each doubling we must copy the elements from the old table to the new one
• We assume for now that the only operation is insert, and calculate its amortized cost
• Table sizes: 1, 2, 4, 8, …, 2^k
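Aggregate method
Op.   Size before   Size after   Cost
1     0             1            1
2     1             2            1+1
3     2             4            2+1
4     4             4            1
5     4             8            4+1
6     8             8            1
7     8             8            1
8     8             8            1
9     8             16           8+1
• A cost of m+1 means m elements copied plus 1 for the insert
• Total cost of the 9 inserts is 24, an average of 24/9 per insert
• (Figure: the table contents after each insert; on overflow the elements are copied into a table of twice the size)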
Aggregate Analysis
• Let the number of inserts n satisfy 2^k < n <= 2^(k+1)
• (Note: 2 * 2^k = 2^(k+1) < 2n)
• At this point the size of the table is 2^(k+1)
• After n inserts:
  • The total cost for copy operations only is 1 + 2 + 4 + … + 2^k = 2^(k+1) - 1 < 2n
  • The total cost for n inserts (without copy) is n.
  • Total < 3n
• Therefore the amortized time is O(1).
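The 3n bound can be checked with a small simulation. This is a sketch under the same cost model as the slides (1 per insert, 1 per element copied); the function name insert_cost_doubling is made up for the example, and an empty table is assumed to grow to size 1 on the first insert.

```python
def insert_cost_doubling(n):
    """Return (total_cost, final_capacity) for n inserts into a table
    that doubles its size whenever it is full."""
    capacity = 0
    size = 0
    total = 0
    for _ in range(n):
        if size == capacity:
            total += size                    # copy every existing element
            capacity = max(1, 2 * capacity)  # 0 -> 1, then double
        total += 1                           # the insert itself
        size += 1
    return total, capacity


print(insert_cost_doubling(9))  # (24, 16), matching the 9-insert table
```

For n = 9 this reproduces the total of 24 from the aggregate table, and 24 < 3 * 9.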
Accounting analysis
• Charge each insert $3.
• When the table is not full, use $1 for the cost of the insert, and store $2 with the element
• When the table doubles from m to 2m:
  • m/2 elements that never moved before have $2 credit,
  • m/2 elements which already moved have $0
  • The total credit of $m pays for copying the m elements
• After the copy all m elements have $0 credit
Accounting analysis
• Size = 4 (full): 2 copied elements with $0, 2 new elements with $2
    $0 $0 $2 $2
• Size = 8 (after doubling): 4 copied elements with $0
    $0 $0 $0 $0
• Size = 8 (full): 4 copied elements with $0, 4 new elements with $2
    $0 $0 $0 $0 $2 $2 $2 $2
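To see why $3 is the right charge, the sketch below tracks one pooled bank balance instead of per-element credits, which is a simplification of the slide's picture. The function min_bank_balance and its charge parameter are hypothetical names for this illustration.

```python
def min_bank_balance(n, charge=3):
    """Insert n elements into a doubling table, charging `charge` dollars
    per insert; return the lowest bank balance observed (0 if never negative)."""
    capacity = size = 0
    bank = 0
    lowest = 0
    for _ in range(n):
        bank += charge
        if size == capacity:
            bank -= size                     # $1 per element copied
            capacity = max(1, 2 * capacity)
        bank -= 1                            # $1 for the insert itself
        size += 1
        lowest = min(lowest, bank)
    return lowest


print(min_bank_balance(1000, charge=3))  # 0: the balance never goes negative
print(min_bank_balance(1000, charge=2))  # negative: $2 per insert is not enough
```

With $3 per insert the balance stays nonnegative for every prefix of the sequence, while a charge of $2 eventually fails to cover a copy.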
java.util.Vector fixed increment
• Let the number of inserts n satisfy c0 + (m-1)c < n <= c0 + mc
• So (n - c0)/c <= m < 1 + (n - c0)/c and m = Θ(n)
• At this point the size of the array is c0 + mc
• After the nth insert:
  • The total time for copy operations is ?
  • The total time for n inserts (without copy) is n.
java.util.Vector fixed increment
• Initial capacity c0, capacity increment c
• (Figure: the copies numbered 0, 1, 2, …, m-1; copy j moves c0 + jc elements)
• Cost for m vector copies = m*c0 + c(1 + 2 + 3 + … + (m-1)) = m*c0 + c*m(m-1)/2 = Θ(m^2) = Θ(n^2)
java.util.Vector fixed increment
• After the nth insert:
  • The total time for copy operations is Θ(n^2)
  • The total time for n inserts (without copy) is n.
• Average insert time is Θ(n)
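The quadratic growth can be observed with a small simulation. This is a Python sketch of the cost model only, not of the Java class itself; the function name copy_cost_fixed_increment and the sample values c0 = 10 and c = 10 are assumptions for the illustration.

```python
def copy_cost_fixed_increment(n, c0, c):
    """Total number of elements copied during n inserts when the capacity
    starts at c0 and grows by a fixed increment c (c >= 1) on overflow."""
    capacity, size, copied = c0, 0, 0
    for _ in range(n):
        if size == capacity:
            copied += size       # move every element to the larger array
            capacity += c
        size += 1
    return copied


for n in (1000, 2000, 4000):
    print(n, copy_cost_fixed_increment(n, c0=10, c=10))
```

Doubling n roughly quadruples the copy cost, which is the Θ(n^2) behavior derived above, in contrast to the linear total obtained with doubling.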