Optimizing of data access using replication technique
Renata Słota (1), Darin Nikolow (1), Łukasz Skitał (2), Jacek Kitowski (1,2)
(1) Institute of Computer Science AGH-UST, Cracow
(2) ACC CYFRONET AGH, Cracow
Agenda
• Motivation of the work
  • Why does today's grid computing need replication?
• Replication basics
• Clusterix Data Management System
  • Architecture, optimization and replication algorithms
• Optimization Example
• Replication Example
• Summary, conclusions
Site-level vs. Grid-level replication
• Site-level replication
  • Replicas in one site
  • Implementation examples:
    • RAID
    • HSM
• Grid-level replication
  • Data management systems
  • Replicas spread over many sites
Motivation of the work: Why does today's grid computing need replication?
• Data protection and availability
  • Malfunction of one storage system does not affect the data itself; only performance is affected
• Performance
  • Low-level optimization and replication (RAID, HSM) are not sufficient
  • Limited network bandwidth
  • Limited storage performance
Replication scenarios
• Static replication
  • Decision made by the system administrator or user
  • Limited system support: replica selection, replica coherency, replica ordering
• Dynamic replication
  • Decision made by a dedicated grid component, based on the users' current data access pattern
  • Full system support
Replication consequences
• Optimal replica selection algorithm
• Replica creation and removal algorithm
• Cost of replica creation, update and storage
• Replica coherency
Clusterix: National Cluster of Linux Systems
• Project aim:
  • To develop a set of tools and procedures for building a productive Grid environment based on local PC clusters spread across independent supercomputing centers
• Network layer:
  • Pionier – Polish optical network
Optimization Algorithm
• Selects the optimal storage element for:
  • data access
  • replica creation
• Takes the current state of the system into account
• The optimal storage element is the one with the maximal weight W(s,d):
  W(s,d) = min((1 - NetLoad(s)) · bandwidth(s,d), (1 - Sload(s)) · Sbandwidth(s))
  s – storage element
  d – destination node
  NetLoad(s) – network interface load of s
  bandwidth(s,d) – available bandwidth between s and d
  Sload(s) – storage system load
  Sbandwidth(s) – storage system bandwidth
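The weight formula can be sketched in a few lines of Python. This is only an illustration, not the CDMS Optimizer itself (which the slides describe as a C/C++ module): the `StorageElement` record, its field names, the MB/s units and the assumption that both loads are fractions between 0 and 1 are all hypothetical, and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class StorageElement:
    """Hypothetical snapshot of one storage element s, seen from destination d."""
    name: str
    net_load: float        # NetLoad(s), assumed to be a fraction in [0, 1]
    bandwidth_to_d: float  # bandwidth(s, d), assumed to be in MB/s
    s_load: float          # Sload(s), assumed to be a fraction in [0, 1]
    s_bandwidth: float     # Sbandwidth(s), assumed to be in MB/s


def weight(se: StorageElement) -> float:
    """W(s, d): the smaller of the unused network share and the unused storage share."""
    network = (1.0 - se.net_load) * se.bandwidth_to_d
    storage = (1.0 - se.s_load) * se.s_bandwidth
    return min(network, storage)


def select_optimal(candidates):
    """Return the candidate storage element with the maximal weight."""
    return max(candidates, key=weight)


# Illustrative values only, not measurements from Clusterix.
candidates = [
    StorageElement("SE1", net_load=0.50, bandwidth_to_d=100.0, s_load=0.20, s_bandwidth=80.0),
    StorageElement("SE2", net_load=0.10, bandwidth_to_d=40.0, s_load=0.10, s_bandwidth=120.0),
    StorageElement("SE3", net_load=0.25, bandwidth_to_d=80.0, s_load=0.50, s_bandwidth=200.0),
]
best = select_optimal(candidates)
print(best.name)
```

Taking the minimum means a storage element is rated by its bottleneck: a fast disk array behind a saturated link scores no better than the link allows, and vice versa.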
Automatic replication algorithm
• Takes into account the gain from replication G(), the cost of replica creation C(), the cost of replica updates U() and an administrative factor A().
• Replication profit:
  P(d,R,S,f) = G(d,R,S,f) + C(d,R,f) + U(d,R,S,f) + A(d,f)
  d – storage element for which the profit is computed
  R – set of storage elements containing replicas of f
  S – statistical data (history of file usage)
  f – considered file
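A minimal sketch of how the profit could drive a replication decision is given below. The slide does not say how G, C, U and A are computed, so they are passed in as callables that already close over R, S and f. The sign convention (costs entering as non-positive numbers), the `threshold` parameter, the rule "replicate to the most profitable candidate if its profit exceeds the threshold" and the names `SE_a` and `SE_b` are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Optional

# Each term maps a candidate storage element d to a number.
Term = Callable[[str], float]


def profit(d: str, gain: Term, creation_cost: Term,
           update_cost: Term, admin_factor: Term) -> float:
    """P = G + C + U + A, summed as on the slide (costs assumed to be <= 0)."""
    return gain(d) + creation_cost(d) + update_cost(d) + admin_factor(d)


def choose_target(candidates: Iterable[str], gain: Term, creation_cost: Term,
                  update_cost: Term, admin_factor: Term,
                  threshold: float = 0.0) -> Optional[str]:
    """Return the most profitable candidate, or None if no profit exceeds threshold."""
    scores: Dict[str, float] = {
        d: profit(d, gain, creation_cost, update_cost, admin_factor)
        for d in candidates
    }
    if not scores:
        return None
    best = max(scores, key=scores.__getitem__)
    return best if scores[best] > threshold else None


# Made-up term values for two hypothetical candidates.
G = {"SE_a": 12.0, "SE_b": 9.0}
C = {"SE_a": -3.0, "SE_b": -3.0}
U = {"SE_a": -2.0, "SE_b": -1.0}
A = {"SE_a": 0.0, "SE_b": 1.0}

target = choose_target(G, G.__getitem__, C.__getitem__, U.__getitem__, A.__getitem__)
print(target)
```

Raising `threshold` makes the sketch more conservative: a replica is created only when the expected gain clearly outweighs the creation and update costs.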
Storage-oriented problems: data-intensive applications for Clusterix
• Simulation of transonic flow past wing tips
• Visualization of complex multidimensional structures
• Ecosystem modeling and simulation
Optimization Example
• Node A needs file F, stored on SE1, SE2 and SE3
[Diagram: Node A; CDMS with Optimizer; SE1, SE2 and SE3 holding F; NMS and JIMS monitors]
Optimization Example
• Node A sends a request to CDMS
Optimization Example
• CDMS uses the Optimizer to choose the optimal SE
Optimization Example
• Optimizer is working…
  W(s1,d) = min((1 - NetLoad(s1)) · bandwidth(s1,d), (1 - Sload(s1)) · Sbandwidth(s1))
  W(s2,d) = min((1 - NetLoad(s2)) · bandwidth(s2,d), (1 - Sload(s2)) · Sbandwidth(s2))
  W(s3,d) = min((1 - NetLoad(s3)) · bandwidth(s3,d), (1 - Sload(s3)) · Sbandwidth(s3))
Automatic replication example: Situation
• 3 clusters
• 4 storage elements
  • 2 contain a replica of F
• Set of applications running on these clusters and accessing file F
[Diagram: storage elements SE1, SE2, SE3 and SE4, with copies of F]
Automatic replication example
[Diagram: CDMS with Optimizer, Replication Module and Statistic Module; storage elements SE1, SE2, SE3 and SE4. Replication Module state: Sleeping… then Working…; inputs shown: Gain, Cost of rep., Cost of update, Adm. factor]
Automatic replication example
• Decision: SE2 SE4
[Diagram: CDMS with Optimizer, Replication Module (Working…) and Statistic Module; storage elements SE1, SE2, SE3 and SE4, with copies of F]
Automatic replication example
[Diagram: CDMS with Optimizer, Replication Module (Sleeping…) and Statistic Module; storage elements SE1, SE2, SE3 and SE4, with copies of F]
Summary
• Architecture of CDMS with Optimization and Replication modules has been designed
• Replication and optimization algorithms have been specified
• Module interfaces have been specified
Future work
• Integration and tests
Conclusions
• Simulation of replication vs. real system implementation
• Replication should be designed to meet the profile of specific Clusterix applications
• Data availability
• Replication drawbacks
Publications
• Extended functionality of Virtual Storage System for grid. Renata Słota, Darin Nikolow, Łukasz Skitał, Jacek Kitowski. Cracow Grid Workshop 2004, poster no. 13
• Application of data replication methods in Clusterix project (in Polish). Renata Słota, Darin Nikolow, Łukasz Skitał, Jacek Kitowski. Pionier 2004, 19-20 May, Poznań, electronic publication
• Implementation of replication methods in the Grid Environment. Renata Słota, Darin Nikolow, Łukasz Skitał, Jacek Kitowski. Submitted to European Grid Conference
Clusterix Data Management System: Architecture
• Replication module
  • Responsible for:
    • Automatic replica creation/removal
  • Implementation:
    • Java
    • Apache SOAP
  • Cooperates with:
    • Optimization module
    • Statistic module
Clusterix Data Management System: Architecture
• Optimization module
  • Responsible for:
    • storage element selection for a newly created replica
    • optimal replica selection
  • Implementation:
    • C/C++
    • gSOAP
  • Cooperates with:
    • Network Monitoring System (NMS)
    • Information System: JMX-based Infrastructure Monitoring System (JIMS)
Clusterix Data Management System: Architecture
• Information System (JIMS)
  • Department of Computer Science, AGH University of Science & Technology
  • Provides the following information for a selected node:
    • Available storage capacity
    • Total storage capacity
    • Network interface load
    • Network interface bandwidth
    • Storage system load
    • Average storage system load
    • Maximal measured storage bandwidth
Clusterix Data Management System: Architecture
• Network Monitoring System (NMS)
  • Poznan Supercomputing and Networking Center
  • Provides the following information:
    • Maximum bandwidth between two network nodes
    • Current load between two network nodes
    • Node availability
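The values listed for JIMS and NMS can be pictured as two small records that an optimizer would read. The sketch below is hypothetical throughout: the class and field names, the units, and the idea of estimating available bandwidth as the unused share of the maximum bandwidth are assumptions, not the real JIMS or NMS interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeInfo:
    """Per-node values listed for JIMS (names and units are hypothetical)."""
    available_capacity_gb: float
    total_capacity_gb: float
    net_interface_load: float       # assumed fraction in [0, 1]
    net_interface_bandwidth: float  # assumed MB/s
    storage_load: float             # assumed fraction in [0, 1]
    avg_storage_load: float         # assumed fraction in [0, 1]
    max_storage_bandwidth: float    # assumed MB/s


@dataclass(frozen=True)
class LinkInfo:
    """Per-node-pair values listed for NMS (names and units are hypothetical)."""
    max_bandwidth: float    # assumed MB/s
    current_load: float     # assumed fraction in [0, 1]
    nodes_available: bool   # both endpoints reported as available


def available_bandwidth(link: LinkInfo) -> float:
    """Assumed estimate: zero if a node is down, else the unused share of the link."""
    if not link.nodes_available:
        return 0.0
    return (1.0 - link.current_load) * link.max_bandwidth


link = LinkInfo(max_bandwidth=100.0, current_load=0.25, nodes_available=True)
print(available_bandwidth(link))
```

Under these assumptions, the NMS record supplies the network side of a storage element's rating, while the JIMS record supplies the interface load and the storage load and bandwidth.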
Clusterix Data Management System: Architecture
• Statistic module
  • Białystok Technical University
  • Responsible for gathering information about past data usage