1 / 14

Amortized Analysis of Rehashing

Amortized Analysis of Rehashing. What is rehashing. Hash table too full  spend a lot of time looking in buckets Solution: rehash make hash table twice the size for each item in original hash table, hash to location in bigger table. Assumptions for Thursday, Feb. 10, 2000.

lipscomb
Download Presentation

Amortized Analysis of Rehashing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Amortized Analysis of Rehashing

  2. What is rehashing • Hash table too full  spend a lot of time looking in buckets • Solution: rehash • make hash table twice the size • for each item in original hash table, hash to location in bigger table

  3. Assumptions for Thursday, Feb. 10, 2000 • Rehash whenever table is 50% full or more • Just a sequence of inserts • (can be generalized for other operations) • We never get a collision • (once dealing with collisions, we’re in average-case analysis territory) • Hash table starts as size 2

  4. Observations • How expensive is an insert? • How expensive is a rehash? • When will we need to rehash?

  5. Observations • How expensive is an insert? • O(1) - assuming no collisions. Say it’s 1. • How expensive is a rehash? • O(N) - where N is current size of table. Say it’s N. • When will we need to rehash? • whenever size of table is power of 2 (1,2,4,8,16,…)

  6. Amortized analysisStrategy 1: add up operations Note: I’ve made assumptions about the constants, but the analysis could be done for any constants. Note 2: This is sometimes called the “Aggregate Method”

  7. Amortized analysisStrategy 2: Accounting Method • Charge the cost of some operations to other operations • Each operation gives us some “tokens”, which we can spend on future operations

  8. Accounting Method analysis • Each insert gives us 3 tokens • 1 token for that insert • 1 token for rehashing this item the first time • 1 token for rehashing another item that already got rehashed once or more(since we rehash on double the size, # of items never hashed = # of items already hashed once or more)

  9. Potential function • Potential function: more sophisticated tokens • Potential function is a function • When operation costs less, potential goes up • When operation costs more, take from potential • Always positive – or we’re taking too long

  10. Tokens as a potential function • potential function at operation #i P(i)= 1 + 2F - S/2 • F = # of filled (non-empty) hash table slots • S = total # of hash table slots • Actual cost of operation of operation #iC(i) • Amortized cost of operation #(i+1):CA(i+1)=C(i+1) + ( P(i+1) – P(i) )

  11. The Math • Beginning: • F=0, S=2. P(i)=0 • For insert with no rehashing • C(i+1)=1 • P(i+1)-P(i)=2 (added non-empty slot) • For insert with rehashing of N items • C(i+1)=1+N • P(i+1)-P(i)=2-N (before, F=N, S=2N. After, S=4N)

  12. Potential method analysis • P(i) is always positive. • Yes. We rehash when F  1/2S, at which point, 4F=S. Etc… • Amortized cost is constant • CA(i+1)=C(i+1)+P(i+1)-P(i)= 3, in both cases (previous slide)

More Related