Building a Community Through GEON
Dr. Fran Berman
Director, San Diego Supercomputer Center
Professor and High Performance Computing Endowed Chair, UCSD
GEON Advisory Committee
Redefining Computer
• Today’s “computer” is a coordinated set of hardware, software, and services providing an “end-to-end” resource.
• Cyberinfrastructure captures how the S&E community has redefined “computer”.
[Figure: The “computer” as an integrated set of resources. Labels: wireless sensors, field computer, field instrument, computer, network, data, data storage, viz]
Cyberinfrastructure
Cyberinfrastructure is the coordinated aggregate of software, hardware and other technologies, as well as human expertise, required to support current and future discoveries in science and engineering.
National Science Foundation’s Cyberinfrastructure
• The NSF Blue Ribbon Panel (Atkins) Report provided a compelling and comprehensive vision of an integrated Cyberinfrastructure.
“Thanks to Cyberinfrastructure and information systems, today’s scientific tool kit includes distributed systems of hardware, software, databases and expertise that can be accessed in person or remotely.”
Arden Bement, NSF Director, February 2005
Bootstrapping as an Enabling Paradigm for Cyberinfrastructure
[Figure: a cycle in which application goals motivate new infrastructure capabilities, which enable new application goals, which in turn motivate new infrastructure capabilities]
Community Projects: Focused Efforts in Developing Cyberinfrastructure
• PRAGMA: Pacific Rim Grid Middleware Consortium
• NEES: Earthquake Engineering Cyberinfrastructure
• BIRN: Biomedical Informatics Research Network
• GEON: Geosciences Grid infrastructure
• TeraGrid: National Grid infrastructure with Science Gateways
• NVO: Data Cyberinfrastructure for Astronomy
Implementing CI: 5 Basic Principles
• Science and engineering research and education must be the drivers
  • Community needs and requirements must directly drive the development, deployment, and use of CI
• Useful and usable Cyberinfrastructure requires “bootstrapping”
  • Targeted “project-driven” tools/technologies drive the development of common infrastructure, which enables new project tools/technologies
• The “customers” should evaluate the “products”
  • Independent evaluation is important
• Cyberinfrastructure should be treated as infrastructure
  • Usefulness, usability, reliability, and responsiveness to customer needs are critical
• A functional organizational framework is key for success
Measuring Success
• Broadening use of Cyberinfrastructure resources, tools, and technologies
  • measured by the number of new CI users who utilize resources, tools, and technologies more than an (introductory) threshold number of times
• Software coordination
  • measured by the number or percentage of times that software packages are used together, etc.
• Tech transfer, to gauge the evolution of CI’s technological infrastructure
  • measured by the evolutionary path from the academic community to the commercial sector, etc.
• Workforce evolution, to gauge the development of CI’s human infrastructure
  • measured by the number of individuals involved in CI-related professions, and other workforce metrics
• Number of users and other usage metrics
• Broad research impact enabled by the use of Cyberinfrastructure
  • measured by publications and the number of distinct research disciplines, conferences, and journals spanned by users
• Deep research impact enabled by the use of Cyberinfrastructure
  • measured by community awards and recognition, and landmark publications with a very large number of citations
• Educational impact enabled by the use of Cyberinfrastructure
  • measured by the breadth and depth of courses, training efforts, and other educational vehicles using CI
• User satisfaction with Cyberinfrastructure tools and technologies
  • measured by independent user surveys, feedback from user advisory committees, projects during site reviews, etc.
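The first two metrics on this slide (new users above an introductory usage threshold, and how often software packages are used together) can be made concrete with a small sketch. Everything below is an assumption for illustration only: the usage-log layout, the user and package names, the threshold value, and the helper names `users_above_threshold`, `packages_by_user`, and `co_usage_fraction` are hypothetical and do not come from GEON or any real CI accounting system.

```python
from collections import Counter, defaultdict

# Hypothetical usage log: one (user, package) record per use of a CI resource.
# The record layout and all names here are assumptions for illustration.
USAGE_LOG = [
    ("alice", "portal"), ("alice", "portal"), ("alice", "workflow"),
    ("bob", "portal"),
    ("carol", "workflow"), ("carol", "gis"), ("carol", "gis"), ("carol", "portal"),
]


def users_above_threshold(log, new_users, threshold):
    """Count new users whose total number of uses exceeds the threshold."""
    uses = Counter(user for user, _ in log)
    return sum(1 for user in new_users if uses[user] > threshold)


def packages_by_user(log):
    """Group the log into the set of packages each user has used."""
    grouped = defaultdict(set)
    for user, package in log:
        grouped[user].add(package)
    return dict(grouped)


def co_usage_fraction(package_sets, pkg_a, pkg_b):
    """Fraction of users of pkg_a who also used pkg_b (0.0 if nobody used pkg_a)."""
    with_a = [pkgs for pkgs in package_sets if pkg_a in pkgs]
    if not with_a:
        return 0.0
    return sum(1 for pkgs in with_a if pkg_b in pkgs) / len(with_a)


if __name__ == "__main__":
    new_users = {"alice", "bob", "carol"}
    print(users_above_threshold(USAGE_LOG, new_users, threshold=2))
    per_user = packages_by_user(USAGE_LOG)
    print(co_usage_fraction(per_user.values(), "portal", "workflow"))
```

Counting per user rather than per session is one possible reading of "used together"; a real measurement would need the community to agree on what counts as a use and as joint use.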
Building Cyberinfrastructure Communities
• Community building is a social enabler in the same way that technology is a discovery enabler: both allow researchers to do more than they can accomplish on their own
• Community success is greatly enabled by a focus on well-articulated goals
  • High energy physicists want to search for new particles such as the Higgs boson
  • Astrophysicists want to develop models of the origin of the universe
  • Preservationists want to ensure long-term sustainability of valued digital assets
  • Computer science theorists want to prove or disprove P=NP
The Gretzky Rule: “Skate to where the puck will be”
Key Questions for Cyberinfrastructure Communities
• What should the key contributions be in 10 years?
• What are the research goals?
• What technologies and tools are needed to get there?
• Who should be involved?
Sustaining Community Efforts: 10-Year Issues
• Data preservation
  • What data needs to be preserved over the long term?
  • How will the community support and sustain key collections?
• Tool maintenance and evolution
  • What tools and technologies are critical?
  • What is the plan for deploying, maintaining, evolving and retargeting these tools over the long term?
• Community evolution
  • What will keep the community together?
  • How will the next generation of participants be engaged?
The GEON Project
NSF-funded IT Research Project, 2002-2007, $11.6M
PI Institutions: Arizona State University; Bryn Mawr College; Penn State University; Rice University; San Diego State University; SDSC/UCSD; University of Arizona; University of Idaho; University of Missouri, Columbia; University of Texas at El Paso; University of Utah; Virginia Tech; UNAVCO; DLESE
Partners: ESRI; Cal-(IT)2; Chronos; CUAHSI-HIS; Geological Survey of Canada; Georeference Online; HP; IBM; Kansas Geological Survey; LLNL; NASA Goddard, Earth System Division; SCEC; U.S. Geological Survey (USGS); Purdue University; EarthScope; IRIS; VSTO
Some Current GEON Innovations
• Making technology easier
  • Extension of ROCKS to a distributed environment, i.e. the ability to "bootstrap" Linux clusters using reference software images from a remote server
  • Training of Geo PIs to create ROCKS "rolls", enabling them to contribute their own GEON-compliant software to the rest of the GEON network
  • Reference portals which can be customized by GEON PIs
  • "Smart job routers", enhanced performance prediction capability, and on-demand functionality for GEON jobs
  • Innovative knowledge-based integration of semantically different GIS maps
• Enhancing data management and capabilities
  • Integration shopping carts for on-the-fly data integration
  • Technologies for augmented reality in the field, e.g. the ability to wear goggles and overlay database information on top of field observational data
  • Development of workflow systems for ingesting and serving large "point-cloud" data sets, e.g. LiDAR or hyperspectral imagery
  • 2 TB allocation of database space from SDSC
Building a Community Within GEON
[Photo: GEON AHM 2003]
Project Communications and Outreach
• Website provides an active site for project information (new design enhancing functionality)
• Weekly project meetings at SDSC are webcast and archived
• Weekly project reports (“blogs”) from C. Baru (archived on website)
• One-page monthly newsletter
• Recent project meetings, talks, and workshops
  • 2004 Cyberinfrastructure Summer Institute
  • PI meeting in Idaho, 2004
  • GSA in Denver (talks, posters, booth, Pardee Session)
  • AGU in San Francisco (talks, posters, sessions, booth)
  • Hydrology ontology meeting at SDSC
  • Workshop on sample IDs at SDSC
  • Workshop on Visualization at SDSC/Cal-IT2 Synthesis Center
  • Workshop on Geo-ontology at SDSC
  • European Geophysical Union
Cyberinfrastructure Leverage and Synthesis
• GEON and BIRN developing common grid software stack
• GEON and SEEK developing common semantic integration services
• GEON is an application driver for OptIPuter
  • Will field a common GIS and Viz/Synthesis center
• GEON in active discussions with EarthScope
• GEON participating in National Center for Hydrology Synthesis Computational Hydrology and Informatics Working Group
• USGS is a major partner of GEON
• GEON is working with PRAGMA on education projects
• Collaboration between SIO, WHOI, LDEO and GEON
• GEON working with advisory committee of Long-Term Ecological Research Network
• GEON working with advisory committees of 3 different CLEANER planning grants
• GEON participating in renewal proposal of Chronos
• GEON is a partner of CUAHSI (hydrologic information systems)
• GEON working with DOE Earth System Grid (ES Grid)
• GEON and LEAD coordinating efforts on education and outreach
Cyberinfrastructure at Scale – Community Building Challenges and Opportunities
Ensuring a “safe” Cyberinfrastructure for At-Scale Communities
• In 2003, the Slammer worm exploited a weakness in Microsoft SQL Server software to launch a “denial of service” attack which
  • Shut down over 13,000 Bank of America ATMs
  • Caused difficulties in Continental Airlines’ electronic reservation and ticketing systems, leading to the cancellation of some regional flights
  • Caused failure of Korea Telecom Freetel and SK Telecom service, stranding millions of South Korean Internet users
• When the worm hit, operations centers were seeing between 200,000 and 300,000 attacks per hour
At-Scale Community Social Dynamics
• Cyberinfrastructure technologies are providing new venues for communication, interaction, collaboration, and competition
• What is the impact of Cyberinfrastructure on community development and evolution?
• How can Cyberinfrastructure be structured to facilitate productive interactions?
Community Resource Allocation at Scale: Cyberinfrastructure Economics
• How to allocate resources so that
  • Aggregate user behavior does not destabilize the system
  • Individuals can optimize for performance
Community Organizations
• What organizational frameworks best promote efficient and integrated Cyberinfrastructure?
• What are useful ways to resolve conflicts?
• How should decisions be made?
• How do we promote integration and coordination?
[Figure: The North American Power Grid]
Translating Tools to Useful Infrastructure: Integrating the “Data Stack”
• Applications (Medical informatics, Biosciences, Ecoinformatics, …): How do we combine data, knowledge and information management with simulation and modeling?
• Visualization: How do we represent data, information and knowledge to the user?
• Data Mining, Simulation Modeling, Analysis, Data Fusion: How do we detect trends and relationships in data?
• Knowledge-Based Integration, Advanced Query Processing: How do we obtain usable information from data?
• Grid Storage, Filesystems, Database Systems: How do we collect, access and organize data?
• High speed networking, Networked Storage (SAN), Storage hardware, instruments, sensornets: How do we configure computer architectures to optimally support data-oriented computing?
Data Integration in the Geosciences
• Example question: What is the distribution and U/Pb zircon ages of A-type plutons in VA? How does it relate to host rock structures?
• Data Integration: complex “multiple-worlds” mediation
• Data sources: Geo-Physical, Geo-Chronologic, Geo-Chemical, Foliation Map, Geologic Map
Reliability
• Infrastructure must be there when you need it.
• How can communities ensure that data, tools, networks, software and other resources are in good working order, and continue to enable new discovery?
Planning Ahead for Sustainable Cyberinfrastructure-enabled Communities
• From the beginning, Cyberinfrastructure must be designed with its beneficiaries in mind.
• Attention must be paid to social dynamics, organization, social impact, etc., as well as technical issues
• Long-term and strategic planning is critical
[Figure: Borromean Rings represent 3 key components of Cyberinfrastructure: Social Science, Computer Science and Engineering, and Domain and Application Science (Dan Atkins)]