Generic e-Science Research Henri Bal bal@cs.vu.nl Vrije Universiteit Amsterdam
Workflow • e-Science • Virtual Laboratory for e-Science (VL-e) • New e-Science plans (FES 2009) • Generic e-Science research in FES 2009
Introduction • System level science • the integration of diverse sources of knowledge about the constituent parts of a complex system with the goal of obtaining an understanding of the system's properties as a whole [Ian Foster] • Multidisciplinary research • Each discipline can solve only part of a problem • Collaborations between distributed research groups • Research driven by (distributed) data • Data explosion, both volume and complexity
Examples • Functioning of the cell for systems biology • Cognition • Cancer research • Cohort studies in medicine (biobanking) • Discovery of biomarkers for drug design • Ecosystems/biodiversity • Studies of water/air pollution • Study of dark matter
e-Science • Goal: allow scientists to collaborate in experiments and integration of research • Enable system level science • Design methods to optimally exploit underlying infrastructure • Hardware (network, computing, data storage) • Software (web, grid middleware)
e-Science in context • System level experiments • e-Science • Web/grid software • Infrastructure
Virtual Laboratory for e-Science (VL-e) • Generic application support • Application cases are drivers for computer science • Rationalization of experimental process • Midterm review: • “e-Science that works; gives Grid its correct role”
Some VL-e results • Virtual Resource Browser: • Integration platform for different applications • Mirage & Virtual Lab for Medical Imaging: • Used autonomously at AMC for large-scale experiments • EcoGRID: • Species observation records from many organizations • Virtual Lab for Bird Migration Modeling: • Access military/meteorological radars & weather forecasts • Ibis: • Programming/deployment of large-scale grid applications
Bird behaviour in relation to weather and landscape • [Figure labels: RADAR; MODELS; Calibration and Data assimilation; Bird distributions; Ensembles; Dynamic bird behaviour; Predictions and on-line warnings]
Example of generic approach • [Figure labels: Astronomy; Multimedia Computing; Semantic Web; SCALE 2008; DACH 2008 - BS; DACH 2008 - FT; AAAI-VC 2007; ISWC 2008]
Multimedia Content Analysis • Runs simultaneously on clusters (DAS-3, Japan, Australia), Desktop Grid, Amazon EC2 Cloud • Connectivity problems solved automatically by Ibis SmartSockets • [Figure labels: Ibis (Java); Client; Servers; Broker]
eyeDentify • Object recognition on an Android smartphone • Smartphone is a limited device: • Can run only 64 x 48 pixels (memory bound) • 1024 x 768 pixels would take 5 minutes • Distributed Ibis version: • 1024 x 768 pixels in 2.0 seconds
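The eyeDentify figures imply a rough speedup. A back-of-the-envelope check in Python, using only the numbers quoted on the slide (5 minutes versus 2.0 seconds for a 1024 x 768 image); the variable names are illustrative and no measurement beyond the slide is assumed:

```python
# Figures taken from the slide; variable names are illustrative only.
local_seconds = 5 * 60        # quoted estimate for 1024 x 768 on the phone
distributed_seconds = 2.0     # quoted time for the distributed Ibis version

speedup = local_seconds / distributed_seconds

# How much larger the full image is than the 64 x 48 image the phone can hold
pixel_ratio = (1024 * 768) / (64 * 48)

print(f"speedup: {speedup:.0f}x, pixel ratio: {pixel_ratio:.0f}x")
```

So the distributed version handles an image with 256 times as many pixels, roughly 150 times faster than the quoted on-device estimate.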
Outline • e-Science • Virtual Laboratory for e-Science (VL-e) • New e-Science plans (FES 2009) • Part of ICTregie FES proposal COMICT • Connecting, Mastering complexity, and Innovating by Cooperation • Generic e-Science research in FES 2009
Research questions • How can we design, develop and build an adaptive e-Science environment that in a flexible way enables global collaboration in key areas of science? • How can we establish an e-Science environment that is capable of handling the data explosion? • How can we manage complexity via integration, at the application level and the generic e-Science level?
e-Science application projects • e-Food & Flowers: WUR/VU (Top), TIFN, TIGG • e-BioScience & Life Sciences: UvA (Breit), RIVM • e-Biobanks: LUMC (Kok), AMOLF, AMC, Philips, Schering-Plough • e-COAST & analytical science: TI-COAST (vd Brink) • e-Ecology: UvA (Bouten), GAN • e-Data-intensive sciences: Nikhef (Templon), RUG
Two “real-life” environments • e-Food • With TI Food and Nutrition (TIFN), TI Green Genetics (TIGG), NBIC • e-Biobanking • With Pearl String Initiative (Parelsnoer), Biobanking and Bio-molecular Resources Research Infrastructure (BBMRI-NL) and NBIC • Under the umbrella of the Netherlands Genomics Initiative (NGI) • FES 2009 application Life Sciences & Health
Generic e-Science Panel VL-e workshop (29 Oct 2008)
Generic e-Science projects • Scientific data management • Information- and knowledge-management • Visualization • Computing and resource management • e-Science infrastructure engineering • Workflow management & application integration • Reliability and security
Scientific data management CWI/UvA (Kersten), RUG (Valentijn) • Support large array-data (using MonetDB) • Multi-scale query execution • With increasing precision more and more data in the warehouse is used to answer queries • Astrosensor warehouse • Data lineage (back tracing to origin of data)
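The multi-scale idea (use more of the warehouse only when more precision is needed) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MonetDB's mechanism: the helper `multiscale_mean`, its parameters, and the stopping rule are all hypothetical.

```python
import random
import statistics

def multiscale_mean(values, start=100, growth=4, tolerance=0.01, seed=0):
    """Estimate a mean from progressively larger samples (hypothetical helper).

    Stops when two successive estimates differ by at most `tolerance`
    (relative to the previous estimate), or when all data has been used.
    Returns (estimate, number_of_rows_used).
    """
    if not values:
        raise ValueError("values must not be empty")
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)  # a prefix is then a random sample

    size = min(start, len(shuffled))
    previous = None
    while True:
        estimate = statistics.fmean(shuffled[:size])
        converged = (
            previous is not None
            and abs(estimate - previous) <= tolerance * abs(previous)
        )
        if converged or size == len(shuffled):
            return estimate, size
        previous = estimate
        size = min(size * growth, len(shuffled))

data = [float(i % 1000) for i in range(100_000)]
estimate, rows_used = multiscale_mean(data)
print(f"estimate={estimate:.1f} using {rows_used} of {len(data)} rows")
```

A tighter `tolerance` makes the loop read more rows before it stops, which is the precision-for-data trade-off the slide describes.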
Information- and knowledge-management VU (van Harmelen, Schreiber), UvA (Adriaanse, Marshall) • Robust & large scale techniques for accessing & reasoning over distributed data-sources • Tools for integration of data with scientific publications; data provenance, lineage, trust • Tools for data-sharing: entity naming, semantic enrichment, interlinking across semantically heterogeneous vocabularies
Large Scale Data Visualization CWI (van Liere), UvA (Belleman), SARA (Berg) • Knowledge Assisted Feature Visualization • How to provide semantically meaningful interactive visualizations for very large and complex data? • User Driven Exploratory Visual Analysis • If automatic analysis fails • Applications: mass spectrometry, CFD, HEP, CosmoGrid, IJkdijk, …
Computing & resource management VU (Bal, Seinstra), TU Delft (Epema) • Map e-Science applications onto hybrid systems, optimize performance & energy • DAS-4: Multicores, GPUs, FPGAs, MPSoCs (Cell/BE) • Scheduling algorithms supporting co-allocation of compute, data, and network resources • Builds on Ibis & KOALA software • Many applications (VUMC, AMOLF, MultimediaN, astronomy)
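Co-allocation means claiming resources at several sites for one job at the same time. The sketch below shows the idea for compute nodes only, with a simple greedy policy; it is not KOALA's placement algorithm, and the function `co_allocate` and the cluster names are hypothetical.

```python
def co_allocate(free_nodes, needed):
    """Pick nodes from several clusters that together cover `needed`.

    free_nodes: dict mapping cluster name -> number of idle nodes.
    Returns a dict cluster -> nodes claimed, or None if the request cannot
    be satisfied right now. Greedy policy (an assumption of this sketch):
    take from the clusters with the most idle nodes first, so that few
    clusters take part in the job.
    """
    if needed <= 0:
        return {}
    if sum(free_nodes.values()) < needed:
        return None

    allocation = {}
    remaining = needed
    by_idle = sorted(free_nodes.items(), key=lambda item: item[1], reverse=True)
    for cluster, idle in by_idle:
        if remaining == 0:
            break
        take = min(idle, remaining)
        if take > 0:
            allocation[cluster] = take
            remaining -= take
    return allocation

free = {"site-a": 40, "site-b": 20, "site-c": 32}  # made-up numbers
print(co_allocate(free, 60))
print(co_allocate(free, 100))
```

A real scheduler would also weigh data location and network capacity between sites, as the slide indicates; the sketch leaves those out.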
e-Science infrastructure engineering UvA (de Laat), VU (Bal), TUD (Langendoen), TNO (Meijer) • Resource information system based on Semantic Web and RDF (Resource Description Framework) • Highly mobile data sensors • User Programmable Networks
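RDF describes resources as subject-predicate-object triples. A minimal Python sketch of how resource descriptions could be stored and queried in that form; the `ex:` names and values are invented for illustration, and a real system would use an RDF library and agreed vocabularies instead of plain tuples.

```python
# Each fact is a (subject, predicate, object) triple, as in RDF.
# All "ex:" identifiers and values below are made up for this sketch.
triples = {
    ("ex:cluster1", "ex:type", "ex:ComputeCluster"),
    ("ex:cluster1", "ex:cpuCores", 272),
    ("ex:cluster1", "ex:connectedTo", "ex:display1"),
    ("ex:display1", "ex:type", "ex:TiledDisplay"),
    ("ex:store1", "ex:type", "ex:StorageServer"),
    ("ex:store1", "ex:connectedTo", "ex:cluster1"),
}

def match(pattern, store=triples):
    """Return all triples matching `pattern`; None acts as a wildcard."""
    hits = (
        triple for triple in store
        if all(want is None or want == got for want, got in zip(pattern, triple))
    )
    return sorted(hits, key=str)  # stable order, safe for mixed value types

# Which resources are tiled displays?
print(match((None, "ex:type", "ex:TiledDisplay")))
# Everything known about cluster1
print(match(("ex:cluster1", None, None)))
```

Because compute, storage, network and display resources share one triple format, a single query mechanism can span all of them.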
“I want” approach • Application: find video containing x, then transcode it to view on Tiled Display • [Figure labels: RDF/CG; RDF/VIZ; RDF/ST; RDF/NDL; RDF/CPU; content; credit: PG&CdL]
Workflow management & application integration UvA (Bubak, Belloum), VU (Kielmann) • Improve interoperability, sustainability and platform convergence in Scientific Workflow • Define a shared “standard” for workflow metadata • Workflow provenance models • Make middleware-independent APIs for applications & programming environments • Cf. JavaGAT, SAGA, XtreemOS
Reliability and security NIKHEF (Groep), UvA (van ’t Noordende), VU (Fokkink), TU Twente (van de Pol), Logica (Mulder), SARA (Bouwhuis) • Improve availability, stability, and reliability of the infrastructure • Monitoring, failure analysis • Self-healing • Use formal verification techniques • Large-scale Model Checking becomes feasible on grids (as shown on wide-area DAS-3) • Provide security • Security policy enforcement, auditing
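Model checking, as mentioned under formal verification, explores the reachable states of a system and reports any state that violates a property. A minimal sequential sketch in Python follows; the toy model and all names are hypothetical, and the grid-scale work referred to on the slide distributes this exploration over many machines, which is not shown here.

```python
from collections import deque

def check_safety(initial, successors, is_bad):
    """Breadth-first search over the reachable state space.

    Returns a shortest path from `initial` to a state where `is_bad` holds,
    or None if no reachable state violates the property.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if is_bad(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

def successors(state):
    # Toy model: two processes enter and leave a critical section with no lock.
    step = {"idle": "critical", "critical": "idle"}
    a, b = state
    return [(step[a], b), (a, step[b])]

def both_critical(state):
    return state == ("critical", "critical")

trace = check_safety(("idle", "idle"), successors, both_critical)
print(trace)
```

The returned trace is a counterexample: a sequence of steps that ends with both processes in the critical section at once.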
Summary • Real-life application environments for e-Food and e-Biobanking • New partners (TI-COAST, RUG, UT, …) and new groups (Kersten, van Harmelen, Langendoen, Fokkink, vd Pol, …) • DAS-4 • Many new research topics