Cyberinfrastructure for Data Intensive Science (DIS)
Follow-on panel to DIS session at Internet2/ESCC Joint Techs Conference, Baton Rouge – January 24, 2012
Joint Techs Winter 2012 Focus
• Data intensive science focus session
• Input from many groups in the community
  • Multiple science disciplines
  • Multiple infrastructure areas (networks, supercomputers, laboratory environments, mission agencies)
• Success stories illustrated effective DIS support
• The intent was to integrate the needs, context, and commonalities in a white paper
DIS Focus Area Presenters
• Bill St. Arnaud, Green IT
• Matthew Trunnell, Broad Institute
• Don Middleton, NCAR
• Rich Carlson, DOE Office of Science
• Kevin Thompson, NSF OCI
• Mike Ackerman, NIH NLM
• Gary Jung, LBNL
• Gwen Jacobs, Montana State/Hawai’i
• Ruth Marinshaw, UNC-Chapel Hill
• Eli Dart, ESnet
• Brent Draney, NERSC
• Ron Hutchins, Georgia Tech
• Joe Breen, Utah
• Tad Reynales, Calit2-UCSD
• Jim Bottum, Clemson
DIS Steering Committee: Scott Brim, Eric Boyd, Steve Corbató, Eli Dart, Susan Evett, Kate Mace, Jim Pepin, Dan Schmiedt, Steve Wolff
Joint Techs 2012 – What We Heard
• Need for effective cyberinfrastructure voiced by multiple communities and disciplines
  • Genomics
  • Climate
  • Supercomputer centers
• Success stories outlined the path forward
  • Science DMZ model
  • Effective communication between cyberinfrastructure providers, science disciplines, and funding agencies
Rapidly Evolving Context
• Things are moving quickly now
  • NSF CC-NIE call focused on improving campus networks
  • Federal Big Data initiative
• This stuff is for real – it’s not just talk
  • Infrastructure funding
  • Grant funding
• The direction is not in doubt – the only thing left to decide is which actions to take
  • Institutions that are aggressive in this space are likely to acquire first-mover advantage
  • The wide area infrastructure is available now
• The need for a white paper has passed
Solutions Required for Research Institutions
• Means by which campuses can connect to science services outside their borders
  • Collaboration
  • Computation
  • Data sources and services
• Support data-intensive collaboration
• Foster environment for grants, projects
• Attract new faculty, new programs
• Refresh science infrastructure
Science Infrastructure Refresh
• NSF call: reinvestment in the foundations of data-intensive science
• Architecture that has been shown to work: Science DMZ
• In addition to technology, people and processes must be included in the refresh
  • Science programs, infrastructure providers, and security officers must all be on board
  • Communication and a common vision are very important
  • Staff need the skills to manage high-performance science flows and the infrastructure to support them
The Science DMZ – Refresher
• The Science DMZ is two things
  • An element of network architecture
  • A model for supporting data-intensive science at a research institution
• Architecture
  • Portion of the network, at or near the site perimeter
  • Devoted exclusively to science support
  • Built with capable hardware
  • Dedicated resources for data transfer and network measurement
  • Appropriate security applied, with the application set restricted so that security controls, risk, and science mission are all aligned
• http://fasterdata.es.net/science-dmz/science-dmz-architecture/
The Science DMZ Model
• In general, the Science DMZ model is a framework for cyberinfrastructure
  • Explicitly accommodates science mission
  • Builds in flexibility to adopt tools and technologies for science support
  • Establishes appropriate security infrastructure to both enable and protect science
• Must balance security, usability, and performance
  • The science mission is given what it needs to succeed
Integration of Campus with Wider Infrastructure
• Science DMZ enables a campus to connect local scientists and resources in a frictionless manner to other sites and services
  • Science networks
  • Advanced services
    • Virtual circuit services, network overlays
    • Internet2 Innovation Platform
    • http://fasterdata.es.net/science-dmz/advanced-services/
  • Science DMZ resources at other campuses
    • This is a critical point – remember Metcalfe’s Law
    • Value of a Science DMZ increases as others deploy them
• The data-intensive era is upon us – the infrastructure must evolve to keep pace
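The Metcalfe's Law point can be made concrete with a little arithmetic. The sketch below is only an illustration, not something from the panel: it assumes that the "value" of Science DMZ deployments tracks the number of distinct site pairs that can exchange data, which grows roughly with the square of the number of sites. The function name `site_pairs` and the site counts are hypothetical.

```python
def site_pairs(n_sites: int) -> int:
    """Count the distinct site-to-site pairings among n_sites deployments."""
    if n_sites < 0:
        raise ValueError("n_sites must be non-negative")
    # Each pair of sites is counted once: n * (n - 1) / 2.
    return n_sites * (n_sites - 1) // 2


# Hypothetical deployment counts, chosen only for illustration.
for n in (2, 10, 50):
    print(f"{n} sites -> {site_pairs(n)} possible site pairs")
```

Under this assumption, going from 10 to 50 deployed sites (5 times as many) raises the number of possible pairings from 45 to 1225, which is why each additional campus deployment benefits the sites already participating.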
Conclusions
• The time to act is now
  • Lots of movement in this space – dynamic, evolving
• Create a coalition of the willing
  • Set of universities and national labs of sufficient critical mass to create a transformative environment to support DIS
  • Must create an environment that encourages both innovation and coherence, to support scientific disciplines scattered across the globe
• Infrastructure pieces are well understood
  • Hence the NSF call for campus activities
  • Get these deployed now