Data Grid Technologies

Data Grid Technologies • Sathish Vadhiyar • Sources/Credits: Technical papers listed in references

Replica Strategies

Problem Motivation • Replication is used to deal with faults and to provide scheduling flexibility. • Given a file that is partitioned into blocks that are replicated throughout a wide-area file system, how can a client retrieve the file with the best performance? • Various algorithms address this question

Basic Downloading Algorithm • The client opens a thread to each server containing the file • A block size is chosen • Each thread selects a different block to download and all threads start downloading • When a thread finishes its block, it chooses a new block that is not currently being downloaded by any other thread • Adaptive: servers with higher bandwidth to the client end up serving more blocks • Selecting the block size is tricky
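The greedy behaviour described on this slide can be sketched as a small event-driven simulation. This is only an illustration under stated assumptions: the function name `simulate_basic_download`, the server names, and the fixed per-server bandwidths are hypothetical, and real downloads would see varying bandwidth.

```python
import heapq

def simulate_basic_download(num_blocks, block_size, bandwidths):
    """Simulate one thread per server, each claiming the next unclaimed block.

    bandwidths maps a (hypothetical) server name to units of data per second.
    Returns ({server: blocks served}, time at which the last block arrives).
    """
    next_block = 0
    served = {server: 0 for server in bandwidths}
    events = []  # min-heap of (finish_time, server)

    # Every thread starts on a different block at time 0.
    for server in sorted(bandwidths):
        if next_block < num_blocks:
            heapq.heappush(events, (block_size / bandwidths[server], server))
            next_block += 1

    finish = 0.0
    while events:
        finish, server = heapq.heappop(events)
        served[server] += 1
        if next_block < num_blocks:
            # The thread that just finished claims the next unclaimed block.
            heapq.heappush(events, (finish + block_size / bandwidths[server], server))
            next_block += 1
    return served, finish

served, finish = simulate_basic_download(12, 1.0, {"fast": 4.0, "slow": 1.0})
print(served, finish)
```

In this toy run the faster server ends up serving most of the blocks, which is the adaptive effect the slide mentions. Changing `block_size` or `num_blocks` shows how a slow server holding the last block can dominate the total time.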

Aggressive Redundancy • Used to provide fault tolerance and to improve download time • A redundancy factor, R • The client downloads each block simultaneously from R servers • Only one copy is kept: whichever returns first
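A minimal sketch of the "first of R wins" idea, using Python threads. The `fetch` function only sleeps to stand in for a network transfer, and the server names, delays, and the choice of the first R servers in the dictionary are all assumptions made for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def fetch(server, block_id, delay):
    """Stand-in for a real block transfer: just sleeps for `delay` seconds."""
    time.sleep(delay)
    return server, block_id

def download_redundant(block_id, servers, r):
    """Request the same block from r servers and keep the first response.

    servers maps a hypothetical server name to a simulated transfer delay.
    """
    chosen = list(servers.items())[:r]  # assumption: take the first r servers
    pool = ThreadPoolExecutor(max_workers=r)
    futures = [pool.submit(fetch, name, block_id, delay) for name, delay in chosen]
    done, _ = wait(futures, return_when=FIRST_COMPLETED)
    pool.shutdown(wait=False)  # do not block on the slower duplicates
    return next(iter(done)).result()

winner, block = download_redundant(7, {"a": 0.3, "b": 0.01, "c": 0.3}, r=2)
print(winner, block)
```

The slower duplicate transfers are simply abandoned here; a real client would also want to close those connections.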

Progress-Driven Redundancy • Retry a download only when it is progressing slowly • Progress number - P, redundancy factor – R • Each block assigned a download number initialized to 0 • When a thread attempts to download a block, it increments the block’s download number

Progress-Driven Redundancy (Continued) • To select a new block to download: • If there is a block B whose download number is less than R, and there are P blocks after B whose downloads have completed, then select B • Otherwise, select the next block whose download number is zero
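The selection rule above can be sketched as a single function. The function name and the list-based bookkeeping are hypothetical, and the sketch assumes that only blocks that have not yet completed are candidates for a retry, which the slide implies but does not state.

```python
def select_block(download_num, completed, p, r):
    """Pick the next block index for a free thread, or None if nothing is left.

    download_num[i] is how many times block i has been attempted;
    completed[i] is True once block i has arrived.
    """
    n = len(download_num)

    # Retry rule: a lagging block B with fewer than r attempts, where at
    # least p blocks after B have already completed.
    for b in range(n):
        if completed[b] or download_num[b] >= r:
            continue
        if sum(completed[b + 1:]) >= p:
            return b

    # Otherwise take the next block that nobody has attempted yet.
    for b in range(n):
        if download_num[b] == 0:
            return b
    return None

download_num = [1, 1, 1, 1, 0, 0]
completed = [False, True, True, True, False, False]
block = select_block(download_num, completed, p=3, r=2)
download_num[block] += 1  # the thread increments the download number
print(block)
```

Here block 0 is still outstanding while three later blocks have finished, so with P = 3 and R = 2 it is retried before any fresh block is started.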

Fastest1 • Another approach • For downloading a block, choose a server that has minimum value of time*(l+1) • time – predicted time to download a block when there is no contention. Obtained from NWS numbers before download is initiated. • l – number of threads currently downloading from the server
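The server choice described on this slide amounts to minimising time*(l+1). A minimal sketch follows, assuming the predicted times are already available as a dictionary; the function name and the server names are hypothetical, and no NWS API is used here.

```python
def fastest1(predicted_time, active_threads):
    """Return the server with the smallest predicted_time * (l + 1).

    predicted_time: server -> predicted contention-free download time for a block
    active_threads: server -> number of threads currently downloading from it (l)
    """
    return min(
        predicted_time,
        key=lambda s: predicted_time[s] * (active_threads.get(s, 0) + 1),
    )

predicted = {"a": 2.0, "b": 3.0}
load = {"a": 1, "b": 0}
choice = fastest1(predicted, load)  # a scores 4.0, b scores 3.0
load[choice] += 1
print(choice, fastest1(predicted, load))
```

Once a thread is assigned to the chosen server its load goes up, so the next decision can shift to a different server even though the predictions have not changed.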

Multiple clients • This situation arises when parallel clients must fetch the data for a parallel computation from the available replica server locations • More challenges: a download decision by one client can affect download performance on other clients, so this impact must be predicted • Periodic network monitoring has to be augmented by measurements of the downloads currently in progress

Collective Download algorithm • Each client connects to a server only once, even if some of the data it fetches belongs to other clients (download phase) • The clients then redistribute data among themselves (redistribution phase) • Widely used in parallel I/O • Especially useful when clients and servers are on opposite sides of a WAN: multiple WAN latencies are avoided at the cost of a less expensive redistribution phase
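The two phases can be sketched as a planning step that decides who fetches what and which blocks must then be forwarded. This is a simplification under loud assumptions: each server is paired with exactly one client (`server_client`), each block has already been mapped to one replica server, and all names are hypothetical.

```python
def collective_plan(block_server, block_owner, server_client):
    """Plan a collective download.

    block_server: block -> server chosen to supply it
    block_owner:  block -> client that needs the block
    server_client: server -> the single client that connects to it (assumption)
    Returns (download, redistribute) where download maps each client to the
    blocks it fetches over the WAN, and redistribute lists
    (fetching_client, owning_client, block) local transfers.
    """
    download = {}
    for block, server in block_server.items():
        download.setdefault(server_client[server], []).append(block)

    redistribute = [
        (fetcher, block_owner[block], block)
        for fetcher, blocks in download.items()
        for block in blocks
        if block_owner[block] != fetcher
    ]
    return download, redistribute

download, redistribute = collective_plan(
    block_server={0: "s1", 1: "s2", 2: "s1", 3: "s2"},
    block_owner={0: "c1", 1: "c1", 2: "c2", 3: "c2"},
    server_client={"s1": "c1", "s2": "c2"},
)
print(download)
print(redistribute)
```

In this example each client opens one WAN connection instead of two, and two blocks are exchanged between the clients afterwards.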

Replica Placement Strategies • Replica placement questions • When should replicas be created? • Which files should be replicated? • Where should replicas be placed? • The model assumes that data is produced in tier-1 (root) and there are storage spaces at various tiers (levels of hierarchy) • Clients that request data form the leaves of the hierarchy

Placement strategies • Best client • Each storage node maintains a history of the number of requests for the files it contains • If the number of requests for a file exceeds a threshold, the node creates a replica of the file at the client node that has generated the most requests for that file (the best client) • The request details for the file are then cleared.
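A minimal sketch of the best-client bookkeeping at a single storage node. The class and method names are hypothetical, and the sketch only reports where a replica should be created; it does not move any data.

```python
from collections import Counter, defaultdict

class StorageNode:
    """Tracks per-file request counts and picks the best client."""

    def __init__(self, threshold):
        self.threshold = threshold
        # file name -> Counter mapping client -> number of requests
        self.history = defaultdict(Counter)

    def record_request(self, file, client):
        """Record one request; return the best client if a replica is due."""
        self.history[file][client] += 1
        counts = self.history[file]
        if sum(counts.values()) > self.threshold:
            best_client = counts.most_common(1)[0][0]
            del self.history[file]  # request details for the file are cleared
            return best_client
        return None

node = StorageNode(threshold=3)
for client in ["c1", "c2", "c1", "c1"]:
    target = node.record_request("f", client)
print(target)
```

The fourth request pushes the count past the threshold, so the node reports the client with the most requests and starts counting afresh for that file.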

Strategies … • Cascading replication • Analogy to a 3-tiered fountain • Once a threshold for a file is exceeded at the root, a replica is created at the next level on the path to the best client, and so on down the hierarchy • Geographical locality is exploited • Plain caching: done at the client • Caching plus Cascading Replication
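The cascading step can be sketched as "find the deepest node on the path that already holds the file and replicate one level further down". The function name and node names are hypothetical, and the sketch assumes the root always holds the file, as in the model where data is produced at the root.

```python
def cascade_target(path, holders):
    """Return the next node that should receive a replica, or None.

    path: nodes from the root down towards the best client
    holders: set of nodes that already hold the file (must include the root)
    """
    deepest = max(i for i, node in enumerate(path) if node in holders)
    if deepest + 1 < len(path):
        return path[deepest + 1]
    return None  # the file has already cascaded to the end of the path

path = ["root", "region", "site"]
print(cascade_target(path, {"root"}))
print(cascade_target(path, {"root", "region"}))
```

Each time the threshold is exceeded at the current deepest holder, calling the function again moves the file one tier closer to the clients that ask for it.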

Strategies… • Fast Spread • A replica of the file is stored at each node along its path to the client • Replica selection – closest replica • Replica replacement – least popular file with oldest age is replaced. Popularity logs are cleared periodically
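Fast spread and its replacement rule can be sketched together. The names below are hypothetical, the per-node capacity is counted in files rather than bytes, and "least popular file with oldest age" is interpreted as: lowest popularity first, with ties broken by the earliest creation time.

```python
def choose_victim(replicas, popularity):
    """Pick the file to evict: least popular, oldest among ties.

    replicas: file -> creation time of the replica at this node
    popularity: file -> request count from the popularity log
    """
    return min(replicas, key=lambda f: (popularity.get(f, 0), replicas[f]))

def fast_spread(path, stores, capacity, file, popularity, now):
    """Store a replica of `file` at every node on the path to the client.

    stores: node -> {file: creation time}; capacity is assumed to be >= 1.
    """
    for node in path:
        replicas = stores.setdefault(node, {})
        if file in replicas:
            continue
        if len(replicas) >= capacity:
            del replicas[choose_victim(replicas, popularity)]
        replicas[file] = now

stores = {"region": {"a": 1, "b": 2}, "site": {}}
fast_spread(["region", "site"], stores, 2, "c", {"a": 5, "b": 5}, now=10)
print(stores)
```

In this run the full `region` node evicts `a`, the older of two equally popular files, while `site` has room and simply stores the new replica.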

Findings • Best client performs worst for random access patterns and improves for access patterns with some geographical locality • Fast spread works much better than cascading for random data access • Bandwidth savings are greater with fast spread than with cascading • Fast spread has high storage requirements

Sources / References / Credits • Algorithms for High Performance, Wide-Area Distributed File Downloads. J.S. Plank, S. Atchley, Y. Ding and M. Beck. Parallel Processing Letters, vol. 13, no. 2, pp. 207-224, June 2003. • Downloading Replicated Wide-Area Files: a Framework and Empirical Evaluation. R.L. Collins and J.S. Plank. NCA 2004. • Identifying Dynamic Replication Strategies for a High-Performance Data Grid. K. Ranganathan and I. Foster. Grid 2002.

Sources / References / Credits • Grid-Based Galaxy Morphology Analysis for the National Virtual Observatory. Ewa Deelman, Raymond Plante, Carl Kesselman, Gurmeet Singh, Mei-Hui Su, Gretchen Greene, Robert Hanisch, Niall Gaffney, Antonio Volpicelli, James Annis, Vijay Sekhri, Fermi Tamas Budavari, Maria Nieto-Santisteban, William O'Mullane, David Bohlender, Tom McGlynn, Arnold Rots, Olga Pevunova, Supercomputing 2003. • Applying Chimera virtual data concepts to cluster finding in the Sloan Sky Survey. James Annis , Yong Zhao, Jens Voeckler, Michael Wilde, Steve Kent, Ian Foster. SC 2002.

Sources / References / Credits • Kavitha Ranganathan and Ian Foster, Decoupling Computation and Data Scheduling in Distributed Data Intensive Applications, Proceedings of the 11th International Symposium for High Performance Distributed Computing (HPDC-11), Edinburgh, July 2002.
