Integrative Biology BOF - Usable Systems in the Global Environment All Hands 2006 Thursday 21st September
Agenda • What is Integrative Biology? – a quick recap! • Who are the IB users? • Challenges in developing solutions for a diverse community • The IB technology to date
Integrative Biology - Project Rationale • To leverage the global Grid infrastructure to build an international “collaboratory” which places the applications scientist “within” the Grid allowing fully integrated and collaborative use of: • HPC resources (capacity and capability) • Computational steering, performance control and visualisation • Storage and data-mining of very large data sets • Easy incorporation of experimental data • User- and science-friendly access • => Predictive in-silico models to guide experiment and, ultimately, design of novel drugs and treatment regimes
What are our objectives? • Hide the complexity from the users through the use of an IB portal or client
[Diagram: Integrative Biology global users connected to TeraGrid, EU Grids, your own cluster, HPCx (parallel optimal codes), UCL Altix test machine, Atlas Data Store, NGS Ox (compute), NGS Man, NGS RAL, NGS Leeds (compute)]
The Integrative Biology Scientific Users Typically… • Degree and/or postgraduate qualification in industrial engineering, maths, biology, physiology • Computing skills developed over time to allow them to develop models • Not computer scientists • Not grid savvy • Based in Oxford, Nottingham, Birmingham, Auckland, Tulane, Washington Lee, Calgary, Baltimore, Sheffield, Utrecht, Graz… • Keen to use and adapt other scientists' work
Determining requirements • Evolving users, disparate needs; identify current pains • Evolving knowledge driving new requirements • Users don’t know what they want until they see and refine it • Grid is not something they want to know about; consideration of language • Initial interviews assessed the as-is situation, constraints and security requirements for competitive research • Concept of collaboration varied • Do they need a grid? An exploratory journey for users
Key problems identified • Data management is problematic: too much data is generated, and tying information together is an art • Current simulations tie up desktops for many hours • Visualising results on the desktop is limited by local facilities and ad hoc development of suitable tools • Research is sensitive: an experiment may belong to an individual or to a collaborative group • Laptop to HPC migration is, for most users, a huge leap not a small step • Collaboration and communication require tools, e.g. Oxford/Tulane • Cannot exclude members of the scientific community who have not progressed to computational models (digital pens)
The ‘collaboratory’ - What have we developed? • Facilities for submission of compute jobs to NGS and HPCx via portal, command line or Matlab. Extension to users' own clusters is in development • Comprehensive data management and metadata management facilities, including federation of catalogues with Auckland and the UK • Advanced visualisation techniques, including movie generation, utilising Meshalyzer and Cool Graphics to date. A major revamp of these facilities is due in the next 12-18 months for remote geometry generation and steering • Phase space exploration for multivariable visualisation in Leeds • A new VRE project developing usable interfaces to a digital research domain for IB through proofs of concept. Also exploring the digital world through a trial of digital pens for life scientists.
Job submission and management via the IB Portal
This portal allows users to submit their jobs to these compute facilities, monitor their progress, and automatically pull input files from, and store results in, the project's secure repository, the Storage Resource Broker (SRB). Users are able to select the compute resource to be used, manage their data in their own SRB space, and set up and manage their experiments through a metadata editor. Users can link files and simulation information to the studies they create, which simplifies the process of managing their scientific information.
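The idea of linking jobs and their files to a study can be pictured with a small data model. This is only an illustrative sketch, not the IB portal's actual schema or the SRB API: the `Job` and `Study` classes, their fields, and the file names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    """One simulation run (hypothetical record, not the portal's schema)."""
    job_id: str
    resource: str              # compute resource chosen by the user, e.g. "HPCx"
    input_paths: List[str]     # files pulled from the repository before the run
    output_path: str           # where results are stored after the run
    status: str = "submitted"


@dataclass
class Study:
    """A user-created study that groups related jobs and their files."""
    name: str
    owner: str
    jobs: List[Job] = field(default_factory=list)

    def link_job(self, job: Job) -> None:
        """Attach a job, and therefore its files, to this study."""
        self.jobs.append(job)

    def files(self) -> List[str]:
        """All files linked to the study, without duplicates, in first-seen order."""
        paths: List[str] = []
        for job in self.jobs:
            paths.extend(job.input_paths)
            paths.append(job.output_path)
        return list(dict.fromkeys(paths))
```

With a model like this, a user who reruns a simulation on a different resource with the same input mesh would see that mesh listed once under the study, alongside both result files.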
Data Management and the Metadata editor
The data storage facility allows users to store any associated files, including input files, codes and output results. Provenance data is automatically captured from a simulation run and stored alongside the results for later use. These facilities are designed to offer large-scale, secure storage for the individual researcher as well as for those who want to work more collaboratively with colleagues by sharing information.
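One way to picture provenance stored alongside results is a small sidecar record written next to each output file. The sketch below is an assumption about what such a record could contain, not a description of what the IB system captures; `build_provenance`, `sidecar_name` and all field names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Dict


def file_digest(content: bytes) -> str:
    """SHA-256 digest identifying the exact input content used in a run."""
    return hashlib.sha256(content).hexdigest()


def build_provenance(code_version: str, resource: str,
                     parameters: Dict[str, float],
                     inputs: Dict[str, bytes]) -> dict:
    """Assemble a provenance record for one simulation run (hypothetical fields)."""
    return {
        "code_version": code_version,
        "resource": resource,
        "parameters": dict(parameters),
        # Store digests rather than contents so the record stays small.
        "inputs": {name: file_digest(data) for name, data in sorted(inputs.items())},
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def sidecar_name(result_name: str) -> str:
    """Name of the provenance file kept next to a result file."""
    return result_name + ".provenance.json"


record = build_provenance("v1.2", "HPCx", {"dt": 0.01}, {"mesh.dat": b"abc"})
serialised = json.dumps(record, indent=2)  # text that would be stored with the results
```

Recording input digests means a later reader can check whether a shared result was produced from the same mesh and parameters as their own run.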
Visualization • Cool Graphics/Meshalyzer (developed by Dr. J. Eason and Dr. E. Vigmond) • IB tools link to SRB and NGS • Issues: visualisation can only be done on the local machine, a problem for low-bandwidth users … hence the revised architecture planned over the next 12-18 months
Usable solutions or lead weight? • Early releases have required tame users to deal with less elegant means of submitting and managing jobs • Constrained by infrastructure and agility of change • VRE project aims to pull together multifaceted aspects • Generic tools versus bespoke prototypes for selected groups, e.g. Washington Lee parameter sweep • Benefits for scientists have outweighed pains (certificates, varied rules re job queues, libraries and licensing), but • Far from an ideal solution: constraints still exist (bandwidth, monitoring, security)
Scientific users are customers of technology … but the technology team are users of provided infrastructure: • NGS • HPCx • (CSAR) • SRB • 3rd party tools …
Benefits and challenges for users
Benefits • Access to powerful compute resources • Access to vast file store facility • Prompt, efficient support structures • New science evolving and publication rate for scientists faster!
Challenges • Need to apply for and manage certificates • Code development for optimal use of facilities still a challenge • Legacy code hurdle
Summary • Integrative Biology has had to act as a bridge as well as a provider of interfaces and services • Starting small and iterating with users patient enough to stick with it has enabled both teams to progress • Security comes at a price • Usable or tolerable? …. But we have managed to increase publications for user community!