
Locality Optimizations in OceanStore



Presentation Transcript


  1. Locality Optimizations in OceanStore: An introduction to introspective techniques for exploiting locality in wide area storage utilities. Patrick R. Eaton, Dennis Geels

  2. Agenda • OceanStore Review • Problem Overview • Previous Work • Proposed Solution • Prefetching Algorithm • Preliminary Results • Future Work

  3. OceanStore Review • Properties of OceanStore relevant to introspective locality optimizations • implemented in the extremely wide area • has many places to put any single piece of data • cannot rely on users to make relationships among data explicit • depends on effective locality optimizations for improved performance • There is no way to solve the problem exactly

  4. Problem Overview • Passively observe data accesses • data shared among multiple users • single users accessing the network from different physical locations • data is replicated across the network • Optimize the location of data to provide quicker access to users • cluster semantically related data • replicate data to move it closer to consumers • migrate primary replicas toward the source of updates

  5. Measurable Attributes • File Temperature • A measure that indicates the frequency of access to the file • A hot file is frequently accessed • Semantic Distance (Kuenning) • Any measure that can quantify relationships between files on the range [0, ∞) • Local distance relates one instance of a file access to another • Reference distance is an aggregate measure that summarizes all local distances for a pair of files • Typical measures use access order or timing information
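The attributes on this slide can be made concrete with a small sketch. The definitions below are assumptions chosen for illustration, not the authors' exact measures: temperature is a raw access count, local distance is the number of accesses separating two file references (capped by a hypothetical `window` parameter), and reference distance is the arithmetic mean of the local distances for a pair.

```python
from collections import Counter, defaultdict


def temperature(trace):
    """Access count per file; a 'hot' file has a high count (assumed measure)."""
    return Counter(trace)


def local_distances(trace, window=4):
    """Map (a, b) -> list of local distances.

    A local distance here is the number of accesses separating one access
    to `a` from a later access to `b`, for separations up to `window`.
    """
    dists = defaultdict(list)
    for i, a in enumerate(trace):
        for j in range(i + 1, min(i + 1 + window, len(trace))):
            b = trace[j]
            if b != a:
                dists[(a, b)].append(j - i)
    return dists


def reference_distance(dists):
    """Summarize each pair's local distances with an arithmetic mean (assumption)."""
    return {pair: sum(d) / len(d) for pair, d in dists.items()}


trace = list("ABCAB")
ref = reference_distance(local_distances(trace, window=2))
print(ref[("A", "B")], ref[("A", "C")])
```

An order-based measure like this one needs no timestamps; a timing-based variant would replace `j - i` with the elapsed time between the two accesses.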

  6. Prefetching Techniques • Automatic Prefetching (Griffioen and Appleton) • construct a probability graph that records accesses which follow within a lookahead period • predict a prefetch when the chance of an access is above a tunable parameter • Context Modeling (Kroeger and Long) • record in a trie all access sequences which have been observed • maintain pointers to all nodes which represent current contexts • predict a prefetch when the chance of an access to a child of a current context is above a probability threshold
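A minimal sketch of the probability-graph idea described for automatic prefetching follows. The class name, the `lookahead` and `min_chance` parameters, and the definition of chance as an edge count divided by the total outgoing count are assumptions made for illustration, not details taken from the slides.

```python
from collections import defaultdict, deque


class ProbabilityGraph:
    """Counts which files follow each file within a lookahead window."""

    def __init__(self, lookahead=2, min_chance=0.5):
        self.min_chance = min_chance
        # edges[a][b] = number of times b was accessed within `lookahead` of a
        self.edges = defaultdict(lambda: defaultdict(int))
        self.recent = deque(maxlen=lookahead)

    def access(self, f):
        """Record an access to f and return the files worth prefetching."""
        for prev in self.recent:
            if prev != f:
                self.edges[prev][f] += 1
        self.recent.append(f)
        return self.predict(f)

    def predict(self, f):
        """Followers of f whose estimated chance meets the tunable threshold."""
        out = self.edges.get(f)
        if not out:
            return []
        total = sum(out.values())
        return sorted(t for t, n in out.items() if n / total >= self.min_chance)


graph = ProbabilityGraph(lookahead=1, min_chance=0.5)
for name in "ABABAC":
    graph.access(name)
print(graph.predict("A"))
```

This graph is first-order: each prediction depends on a single preceding file, which is the limitation that context modeling addresses by tracking longer sequences.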

  7. Our Approach • Exploit the ideas of semantic distance to compute relationships among data • Cluster data based on the observed relationships • Store a summary of these relationships with the data • Migrate (prefetch) files based on familiar patterns in the access stream • recognize higher order correlations as in context modeling • tolerate noise in the access stream

  8. Motivation for Prefetching Algorithm • Many patterns can be predicted only by observation of higher-order correlation, that is, by combining several pieces of past history. • Other patterns can only be detected through identification and filtering of noise. • [Figure: example access patterns over files A, B, C, K, Y, and Z]

  9. General Prefetching Algorithm • Update • Record the most recent file accesses in the file history buffer (FHB) • Each time a new file S is accessed, extract all triples of the form (FHB(i), FHB(j)) → S from the FHB and update them in the second-order distance table • Predict • Each time a new file S is accessed, examine the distance table entries for (FHB(i), S) • Prefetch files that are predicted with confidence above a certain threshold • Problems • O(k^2) work to update the distance table • Noise infects the distance table • [Figure: an FHB containing y B w g o F w K, with distance table entries (B,F) → w K; (y,B) → w g o F K; (o,w) → K]
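The update and predict steps can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: the class name, the parameter `k` (assumed here to be the FHB length), and the definition of confidence as a follower's count divided by the pair's total count are all assumptions.

```python
from collections import defaultdict, deque
from itertools import combinations


class SecondOrderPrefetcher:
    """Second-order distance table keyed by ordered pairs drawn from the FHB."""

    def __init__(self, k=8, threshold=0.5):
        self.fhb = deque(maxlen=k)  # file history buffer
        self.threshold = threshold
        # table[(a, b)][s] = times s was accessed while a, then b, sat in the FHB
        self.table = defaultdict(lambda: defaultdict(int))

    def access(self, s):
        # Update: every ordered pair in the FHB forms a triple with s.
        for a, b in combinations(self.fhb, 2):
            self.table[(a, b)][s] += 1
        # Predict: look up the entries for (FHB(i), s).
        predictions = set()
        for a in self.fhb:
            followers = self.table.get((a, s))
            if not followers:
                continue
            total = sum(followers.values())
            predictions.update(
                f for f, n in followers.items() if n / total >= self.threshold
            )
        self.fhb.append(s)
        return sorted(predictions)


prefetcher = SecondOrderPrefetcher(k=8)
for name in "yBwgoFwK":
    prefetcher.access(name)
print(dict(prefetcher.table[("B", "F")]))
```

With a buffer of k files, `combinations` yields k(k-1)/2 pairs per access, which is the quadratic update cost listed under Problems. Every pair is recorded whether or not the files are related, so unrelated accesses also end up in the table.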

  10. Optimizations to the Prefetching Algorithm • First-order distance table • Records files that are close, as measured by semantic distance • Allows reverse lookup • Use first-order distance tables to filter out irrelevant file relationships • Update only relevant entries in the second-order distance table • Search for predictions based on only relevant access pairs • [Figure: indicative FHBs alongside a distance table]

  11. Prefetching Algorithm Example • Update • Extract relevant triples by intersecting the FHB with the results from the reverse lookup in the first-order tables • Predict • Extract relevant doubles by intersecting the FHB with the results from the reverse lookup in the first-order tables • Prefetch if the second-order table predicts a future access with sufficient confidence • [Figure: an FHB, a 1st Order Table, and a 2nd Order Table, annotated with the steps "Find parents of S", "Find parents of R", "Update table", and "Check table for prediction"]
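The filtered update and predict steps in this example might look like the sketch below. It assumes the first-order table is supplied as a plain mapping from each file to the set of files considered close to it; that mapping, the class name, and the confidence rule (count divided by the pair's total) are hypothetical. As a simplification, the sketch does not check that Q precedes R in the FHB.

```python
from collections import defaultdict, deque


class FilteredPrefetcher:
    """Second-order prefetcher that uses a first-order table as a noise filter."""

    def __init__(self, first_order, k=8, threshold=0.5):
        # Reverse lookup: parents[child] = files whose first-order entry lists child.
        self.parents = defaultdict(set)
        for parent, children in first_order.items():
            for child in children:
                self.parents[child].add(parent)
        self.fhb = deque(maxlen=k)
        self.threshold = threshold
        self.second = defaultdict(lambda: defaultdict(int))

    def _relevant(self, f):
        """Intersect the FHB with the parents of f (duplicates removed)."""
        ps = self.parents.get(f, set())
        return [x for x in dict.fromkeys(self.fhb) if x in ps]

    def access(self, s):
        # Update: only triples (Q, R) -> S with R a parent of S and Q a parent of R.
        for r in self._relevant(s):
            for q in self._relevant(r):
                if q != r:
                    self.second[(q, r)][s] += 1
        # Predict: only doubles (Q, S) with Q a parent of S.
        predictions = set()
        for q in self._relevant(s):
            followers = self.second.get((q, s))
            if not followers:
                continue
            total = sum(followers.values())
            predictions.update(
                f for f, n in followers.items() if n / total >= self.threshold
            )
        self.fhb.append(s)
        return sorted(predictions)


prefetcher = FilteredPrefetcher({"Q": {"R"}, "R": {"S"}})
for name in ["Q", "x", "R", "y", "S"]:
    prefetcher.access(name)
print({pair: dict(counts) for pair, counts in prefetcher.second.items()})
```

In this run the unrelated files `x` and `y` never reach the second-order table, because the first-order table does not list them as parents of anything. Only the entry for (Q, R) → S is created.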

  12. Preliminary Results (Local System)

  13. Future Work • Retarget the simulations to model OceanStore • Continue to refine the prefetching algorithm • Examine the potential of higher order prefetching • Combine prefetching and clustering • Look for opportunities to test the ideas on different workloads
