A European High-Throughput Data Analysis Infrastructure
Peter Solagna, EGI.eu Operations Manager

Target Group, Problems, Pains
• Target group:
  • Research communities that need to analyse or produce large datasets by executing large ensembles (hundreds to thousands) of loosely coupled computational tasks, possibly combined with parallel ones
• Problems:
  • Within the same community, researchers may have access to local resources, but these are not globally integrated (pain: they cannot easily share local resources, they cannot easily use non-local resources, and it requires substantial effort and skills to integrate and use them together)
  • Managing a huge amount of data within a collaboration is time-consuming and error-prone (pain: lack of capabilities and resources to manage data in a collaborative and distributed environment)
  • Researchers do not have access to enough local capacity for their needs (pain: lack of resources)

EGI Solution
• Pan-European federated high-throughput data analysis infrastructure composed of independent resource centres
• Needed services:
  • From EGI.eu: repository of validated software, federated operations, helpdesk support
  • From NGIs/resource providers: grid compute, grid storage, workflow management, uniform VO management

Value Proposition
Easy access to shared computing and data services from independent resource providers, where communities can provision their own resources or access unused ones in a uniform way, preventing single-vendor lock-in while optimising utilisation.
• How does this solution solve the problems?
  • They cannot easily share local resources or use non-local resources -> we provide middleware components, based on open standards, that give uniform interfaces to heterogeneous resources
  • It requires substantial effort and skills to integrate and use them together -> we provide reusable workflow management tools
  • Lack of capabilities and resources to manage data in a collaborative and distributed environment -> we provide data management and transfer tools that support VO-based authentication and authorisation
  • Lack of resources -> by enabling the sharing of resources, we support opportunistic access to unused capacity

Strategic Impact & Return on Investment
• What is the strategic impact of providing this solution?
  • On EGI2020:
    • VREs: provides a large-scale high-throughput data analysis platform on which research communities can build their own VREs
  • On EU2020:
    • Pooling of resources
    • Supporting the ERA
• How can this solution be sustained?
  • Coordination: membership fees from resource providers
  • Resources: research community contributions or public funding
  • Software: in-kind contributions from technology providers or research communities, EC project funding for innovation