A Distributed Data Architecture
Mark Jessop, University of York

Grid Enabled Swans
[Diagram: London, Tokyo, Cape Town, Mexico City]

How Big is that Lake?
• Heathrow capped at 36 landings per hour.
• If half have 4 engines and half have 2, average aircraft carries 3 engines.
• Each engine generates around 1 GB of data per flight.
• 36 x 3 x 1 = 108 GB raw engine data per hour.
• Factor in the working day and the rest of the world…
• …Terabytes and up!

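The estimate above can be reproduced as a quick calculation. The sketch below uses only the figures on the slide, except for the 16-hour operating day, which is an illustrative assumption and not a figure from the talk.

```python
# Figures taken from the slide.
LANDINGS_PER_HOUR = 36
ENGINES_PER_AIRCRAFT = (4 + 2) / 2   # half four-engined, half two-engined
GB_PER_ENGINE_FLIGHT = 1

# Assumption for illustration only: hours of operation per day.
ASSUMED_OPERATING_HOURS = 16


def raw_gb_per_hour(landings=LANDINGS_PER_HOUR,
                    engines=ENGINES_PER_AIRCRAFT,
                    gb_per_engine=GB_PER_ENGINE_FLIGHT):
    """Raw engine data arriving at one airport per hour, in GB."""
    return landings * engines * gb_per_engine


def raw_tb_per_day(hours=ASSUMED_OPERATING_HOURS):
    """Raw engine data per day at one airport, in TB (1 TB = 1000 GB)."""
    return raw_gb_per_hour() * hours / 1000


if __name__ == "__main__":
    print(f"{raw_gb_per_hour():.0f} GB per hour")
    print(f"{raw_tb_per_day():.2f} TB per day "
          f"(assuming {ASSUMED_OPERATING_HOURS} h of operation)")
```

Under that assumption a single airport already exceeds a terabyte per day, before any other airport is counted.
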
Managing the Flow of Water
[Diagram: London, Tokyo, Cape Town, Mexico City]

Plumbing Toolkit
• Data Repository
• Catalogue
• Pattern Match Engine

Pattern Match Engine
• Pattern Match Control
• Data Extractor/Encoder
• AURA Encoder
• AURA-G
• Back Check

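One way to read the component list is as a two-stage search: encode the query, shortlist candidates with a fast approximate match, then back-check the shortlist against the original data. The sketch below illustrates that flow only. The threshold encoding, the bit-overlap score, the distance check, and every function name are assumptions made for illustration; this is not the AURA or AURA-G implementation.

```python
import math


def encode(signal, threshold=0.5):
    """Hypothetical encoder: one bit per sample, set when the sample exceeds threshold."""
    return [1 if x > threshold else 0 for x in signal]


def overlap(a, b):
    """Number of positions where both binary codes have a set bit."""
    return sum(x & y for x, y in zip(a, b))


def candidate_search(query_code, stored_codes, min_overlap):
    """Approximate stage: keep stored items whose code overlaps the query enough."""
    return [name for name, code in stored_codes.items()
            if overlap(query_code, code) >= min_overlap]


def back_check(query, candidates, raw_store, max_distance):
    """Exact stage: confirm candidates by Euclidean distance on the raw signals."""
    confirmed = []
    for name in candidates:
        dist = math.dist(query, raw_store[name])
        if dist <= max_distance:
            confirmed.append((name, dist))
    return sorted(confirmed, key=lambda item: item[1])


def pattern_match(query, raw_store, min_overlap=2, max_distance=0.5):
    """Control step: encode, shortlist, then back-check."""
    stored_codes = {name: encode(sig) for name, sig in raw_store.items()}
    candidates = candidate_search(encode(query), stored_codes, min_overlap)
    return back_check(query, candidates, raw_store, max_distance)
```

The point of the split is that the cheap stage may return false positives, and the back check removes them by comparing against the raw data.
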
Data Repository
[Diagram: DATA stores and MCAT servers]
• SDSC Storage Resource Broker (SRB).
• Manages distributed storage resources.
• Metadata Catalogue (MCAT).
• Many configurations.
• Heterogeneous.
• Efficient data delivery.
• C++ and Java APIs.

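The role of a metadata catalogue in front of distributed storage can be sketched as a mapping from logical names to physical copies plus searchable attributes. The classes, field names, and paths below are hypothetical and do not reflect the actual storage broker or MCAT interfaces; they only illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Where one physical copy of a data object lives (hypothetical fields)."""
    resource: str   # name of a storage resource at some site
    path: str       # location on that resource


@dataclass
class Catalogue:
    """Toy metadata catalogue: logical name -> replicas and descriptive attributes."""
    replicas: dict = field(default_factory=dict)
    attributes: dict = field(default_factory=dict)

    def register(self, logical_name, replica, **attrs):
        """Record a physical copy and any attributes for a logical object."""
        self.replicas.setdefault(logical_name, []).append(replica)
        self.attributes.setdefault(logical_name, {}).update(attrs)

    def locate(self, logical_name):
        """Return every known physical copy of a logical object."""
        return list(self.replicas.get(logical_name, []))

    def query(self, **attrs):
        """Return logical names whose attributes match all given key/value pairs."""
        return sorted(name for name, meta in self.attributes.items()
                      if all(meta.get(k) == v for k, v in attrs.items()))
```

A client asks the catalogue by logical name or attribute and never needs to know which storage resource holds the data.
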
A Distributed Architecture
[Diagram: MCAT]
• One node per airport.
• Single global MCAT.
• Stream engine data.
• Global Parallel Search.
• Present Results.
• Scalable.
• Robust.

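The global parallel search step can be sketched as a fan-out of one query to every airport node, followed by a merge of the results. In the sketch below the node contents, flight identifiers, scores, and function names are all invented for illustration, and a thread pool stands in for whatever remote invocation the real system uses.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-airport stores: flight id -> match score for some query.
NODES = {
    "London": {"LHR-001": 0.91, "LHR-002": 0.40},
    "Tokyo": {"NRT-001": 0.85},
    "Cape Town": {"CPT-001": 0.10},
    "Mexico City": {"MEX-001": 0.95, "MEX-002": 0.30},
}


def search_node(airport, store, threshold):
    """Search one node's local data; stands in for a remote pattern match call."""
    return [(airport, flight, score)
            for flight, score in store.items() if score >= threshold]


def global_search(nodes, threshold):
    """Send the same query to every node in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        futures = [pool.submit(search_node, airport, store, threshold)
                   for airport, store in nodes.items()]
        hits = [hit for future in futures for hit in future.result()]
    # Present the best matches first.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    for airport, flight, score in global_search(NODES, threshold=0.8):
        print(f"{airport:12s} {flight} {score:.2f}")
```

Because each node searches only its own data, adding an airport adds a worker rather than lengthening any single search, which is the sense in which the design scales.
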
Summary
• Large quantities of data arriving globally.
• Distributed architecture for data management and search.
• Scalable and Robust.