Explore the distributed data storage project at Masaryk University in collaboration with CESNET. Learn about the motivation, infrastructure, applications, and future extensions of this innovative network storage solution.
DiDaS: Distributed Data Storage
Ludek Matyska
Masaryk University, Institute of Computer Science, and CESNET, z.s.p.o.
Ludek.Matyska@muni.cz
APAN, Logistical Networking WS

Outline
• Motivation
• Infrastructure
• Applications
• Future extensions

Motivation
• Increased need for network storage
  • Computational Grids
  • Data Grids
  • Temporary data deposits
  • Transient caches
  • Video deposits
• National Library requirements
  • Distribution of digitized content

Requirements
• Transparent
• Location independent
• Good geographical distribution
• Providing support for
  • Access quality (e.g. streaming)
  • Reliability (no single point of failure)

Infrastructure
• Data depots
  • Control: personal computer
  • Storage: RAID of IDE disks
  • Capacity: 1.5 TB each
  • Number: 7 (total capacity 10 TB)
• Connectivity
  • Directly to the backbone
  • 100 Mb/s or 1 Gb/s

Data Layer
• IBP (70% of capacity)
  • General use
• GridFTP servers (30% of capacity)
  • Grid support
  • Computer-independent temporary data storage
  • Comparison with the IBP-based solution

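The capacity figures on the Infrastructure and Data Layer slides can be checked with a little arithmetic. The sketch below is only an illustration: it assumes decimal units (1 TB = 1000 GB) and assumes the 70/30 split applies to the nominal total, neither of which the slides state. The nominal total comes out at 10.5 TB, so the quoted 10 TB is presumably a rounded or usable figure.

```python
# Figures taken from the slides; units and rounding are assumptions.
DEPOTS = 7
CAPACITY_PER_DEPOT_GB = 1500   # 1.5 TB each, assuming 1 TB = 1000 GB
IBP_PERCENT = 70               # share served through IBP
GRIDFTP_PERCENT = 30           # share served through GridFTP


def share_gb(total_gb: int, percent: int) -> int:
    """Return the given percentage of total_gb, in whole GB."""
    return total_gb * percent // 100


nominal_total_gb = DEPOTS * CAPACITY_PER_DEPOT_GB
ibp_gb = share_gb(nominal_total_gb, IBP_PERCENT)
gridftp_gb = share_gb(nominal_total_gb, GRIDFTP_PERCENT)

print(f"nominal total: {nominal_total_gb / 1000} TB")
print(f"IBP share:     {ibp_gb / 1000} TB")
print(f"GridFTP share: {gridftp_gb / 1000} TB")
```
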
Traffic optimisation
• Network traffic cost function
  • Inter-depot topology known
• Instrumented clients
  • Measurement from depot to client
  • Simultaneous data transfer and measurements
  • Real-time transfer rate prediction
• Choose depot
  • Decision between point and multipoint transfers

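The slide does not say how the cost function or the rate prediction is defined. As a minimal sketch of the idea, assume the client keeps an exponentially weighted moving average (EWMA) of the throughput it measures from each depot, uses the predicted transfer time as the cost, and switches to a multipoint transfer only when the combined predicted rate is clearly higher than the best single depot. All names, the smoothing factor, and the `multipoint_gain` threshold are hypothetical, and summing rates ignores shared bottlenecks.

```python
from dataclasses import dataclass


@dataclass
class DepotStats:
    """Throughput prediction for one depot, as seen by one client."""
    name: str
    predicted_mbps: float = 0.0   # current prediction in Mb/s
    samples: int = 0

    def observe(self, measured_mbps: float, alpha: float = 0.3) -> None:
        """Fold a new measurement into the prediction (EWMA, assumed)."""
        if self.samples == 0:
            self.predicted_mbps = measured_mbps
        else:
            self.predicted_mbps = (alpha * measured_mbps
                                   + (1 - alpha) * self.predicted_mbps)
        self.samples += 1


def transfer_cost(size_mb: float, depot: DepotStats) -> float:
    """Hypothetical cost: predicted transfer time in seconds."""
    if depot.predicted_mbps <= 0:
        return float("inf")       # no usable measurement yet
    return size_mb * 8 / depot.predicted_mbps


def choose_depots(size_mb, depots, multipoint_gain=1.5):
    """Return one depot (point transfer) or several (multipoint)."""
    if not depots:
        raise ValueError("no depots to choose from")
    ranked = sorted(depots, key=lambda d: transfer_cost(size_mb, d))
    best = ranked[0]
    usable = [d for d in ranked if d.predicted_mbps > 0]
    combined = sum(d.predicted_mbps for d in usable)
    # Go multipoint only if the combined rate beats the best depot
    # by the assumed gain factor.
    if len(usable) > 1 and combined >= multipoint_gain * best.predicted_mbps:
        return usable
    return [best]
```

A client would call `observe` after each measured transfer and `choose_depots` before the next one, so that measurement and data transfer proceed together as the slide describes.
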
Applications
• National Technical Library
• Video Streaming
• Nonspecific Users

National Technical Library
• Requirements
  • Program of content digitisation
  • Data stored on the central tape robot
  • Not optimised for distribution
  • Danger of overload
• Model data: old cartographic maps

National Technical Library
• DiDaS role
  • Cache-like storage
  • Load balancing optimisation
  • Data transfer reliability (multistreaming)

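The slide does not define "multistreaming". One possible reading is that a file is fetched as several parallel streams, each from a different depot holding a replica. Under that assumption, the sketch below splits a file of a given size into contiguous byte ranges, one per source; the function name and the even split are hypothetical.

```python
def split_ranges(size: int, sources: list[str]) -> list[tuple[str, int, int]]:
    """Assign half-open byte ranges [start, end) of a file to sources.

    The ranges are contiguous, cover the whole file, and differ in
    length by at most one byte. Sources left with no bytes are omitted.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    if not sources:
        raise ValueError("at least one source is required")
    base, extra = divmod(size, len(sources))
    plan = []
    start = 0
    for index, source in enumerate(sources):
        # The first `extra` sources take one additional byte each.
        length = base + (1 if index < extra else 0)
        if length == 0:
            continue
        plan.append((source, start, start + length))
        start += length
    return plan
```

If one stream fails, only its range has to be requested again from another depot, which is where the reliability gain would come from.
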
Video Streaming
• Permanent storage
• Specific clients
• QoS requirements (pre-caching)
• Replica management
• Not yet implemented

Nonspecific Users
• Temporary data deposits
• Provide data for load balancing
• Transfer outside of the DiDaS core
• Access reliability
  • Automatic replica generation
  • Transparent multi-access
  • Ability to react to connectivity loss

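Reacting to connectivity loss with transparent multi-access can be pictured as a client that tries each replica in turn and hides individual failures from the caller. The sketch below is not the DiDaS client; `fetch` stands in for whatever call actually reads from a depot, and the depot names passed in are placeholders.

```python
from typing import Callable, Iterable, Tuple


def fetch_with_failover(
    replicas: Iterable[str],
    fetch: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Try replicas in order; return (depot, data) from the first that works.

    `fetch` is a hypothetical callable that reads the data from one depot
    and raises ConnectionError or TimeoutError when the depot is unreachable.
    """
    failed = []
    for depot in replicas:
        try:
            return depot, fetch(depot)
        except (ConnectionError, TimeoutError):
            failed.append(depot)      # remember the failure, try the next one
    raise ConnectionError(f"all replicas failed: {failed}")
```

Ordering `replicas` by predicted transfer rate would combine this with the depot selection described under traffic optimisation.
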
Future work
• New client development
  • Support for new application areas
• Extended and transparent replica management
• Full instrumentation
  • Data for
    • Load balancing
    • Replica creation/deletion
    • User access optimisation

Thank you for your interest