Knowledge Environments for Science: Representative Projects
Ian Foster
Argonne National Laboratory
University of Chicago
http://www.mcs.anl.gov/~foster
Symposium on Knowledge Environments for Science, November 26, 2002
Comments Informed By Participation in …
• E-science/Grid application projects, e.g.
  • Earth System Grid: environmental science
  • GriPhyN, PPDG, EU DataGrid: physics
  • NEESgrid: earthquake engineering
• Grid technology R&D projects
  • Globus Project and the Globus Toolkit
  • NSF Middleware Initiative
• Grid infrastructure deployment projects
  • Alliance, TeraGrid, DOE Sci. Grid, NASA IPG
  • Intl. Virtual Data Grid Laboratory (iVDGL)
• Global Grid Forum: community & standards
Data Grids for High Energy Physics
• Enable community to access & analyze petabytes of data
• Coordinated intl projects
  • GriPhyN, PPDG, iVDGL, EU DataGrid, DataTAG
• Challenging computer science research
• Real deployments and applications
• Defining analysis architecture for LHC
NEESgrid Earthquake Engineering Collaboratory
[Figure label: U. Nevada, Reno]
www.neesgrid.org
Communities Need Not be Large: E.g., Astronomical Data Analysis
• Size distribution of galaxy clusters?
• Chimera Virtual Data System + GriPhyN Virtual Data Toolkit + iVDGL Data Grid (many CPUs)
[Figure: Galaxy cluster size distribution]
www.griphyn.org/chimera
A “Knowledge Environment” is a System For …
• “Small teams”
• “Accessing specialized devices”
• “Interpersonal collaboration”
• “Sharing information”
• “Accessing services”
• “Enabling large-scale computation”
• “Integrating data”
• “Large communities”
It’s All of the Above: Enabling “Post-Internet Science”
• Pre-Internet science
  • Theorize &/or experiment, in small teams
• Post-Internet science
  • Construct and mine very large databases
  • Develop computer simulations & analyses
  • Access specialized devices remotely
  • Exchange information within distributed multidisciplinary teams
• Need to manage dynamic, distributed infrastructures, services, and applications
Enabling Infrastructure for Knowledge Environments for Science (aka “The Grid”)
“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”
Grid Infrastructure
• What?
  • Broadly deployed services in support of fundamental collaborative activities
  • Services, software, and policies enabling on-demand access to critical resources
• Open standards, software, infrastructure
  • Open Grid Services Architecture (GGF)
  • Globus Toolkit (Globus Project: ANL, USC/ISI)
  • NMI, iVDGL, TeraGrid
• Grid infrastructure R&D&ops is itself a distributed & international community
Lessons Learned (1)
• Importance of standard infrastructure
  • Software: facilitate construction of systems, and construction of interoperable systems
  • Services: authentication, discovery, …, …
  • Needs investment in research, development, deployment, operations, training
• Building & operating infrastructure is hard
  • Challenging technical & policy issues
  • Requisite skills not always available
  • Can challenge existing organizations
Lessons Learned (2)
• Importance of community engagement
  • “Maine and Texas must have something to communicate”
  • Big science traditions seem to help
  • Discipline champions certainly help
  • Effective projects are often true collaborations between disciplines and computer scientists
• Importance of international cooperation
  • Science is international, so is expertise
  • Challenging, requires incentives & support
Lessons Learned (3)
• Collaborative science/Grids are a wonderful source of computer science problems
  • E.g., “virtual data grid” (GriPhyN): data, programs, derivations as community resources
  • E.g., security within virtual organizations
• Work in this space can be of intense interest to industry
  • E.g., current rapid uptake of Grid technologies
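
As an illustration of the “virtual data grid” idea in Lessons Learned (3), the sketch below treats data, programs, and derivations as three shared tables, so that a dataset can be re-derived on demand from its recorded recipe. It is a toy model only: the class `VirtualDataCatalog`, its method names, and the sample numbers are all hypothetical and are not taken from Chimera, GriPhyN, or any Grid toolkit.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class VirtualDataCatalog:
    """Toy catalog; every name here is hypothetical, not a real Grid API."""
    # Shared programs (transformations), keyed by name.
    programs: Dict[str, Callable] = field(default_factory=dict)
    # Derivations: output dataset -> (program name, input dataset names).
    derivations: Dict[str, Tuple[str, Tuple[str, ...]]] = field(default_factory=dict)
    # Datasets that already exist in materialized form.
    data: Dict[str, object] = field(default_factory=dict)

    def register_program(self, name, fn):
        self.programs[name] = fn

    def register_derivation(self, output, program, inputs):
        self.derivations[output] = (program, tuple(inputs))

    def materialize(self, name):
        """Return a dataset, deriving it from its recorded recipe if needed."""
        if name in self.data:
            return self.data[name]
        if name not in self.derivations:
            raise KeyError(f"no data and no derivation recorded for {name!r}")
        program, inputs = self.derivations[name]
        args = [self.materialize(i) for i in inputs]
        result = self.programs[program](*args)
        self.data[name] = result  # cache so later requests reuse the result
        return result

    def provenance(self, name):
        """List the derivation steps behind a dataset, inputs first."""
        if name not in self.derivations:
            return []
        program, inputs = self.derivations[name]
        steps = []
        for i in inputs:
            steps.extend(self.provenance(i))
        steps.append((name, program, inputs))
        return steps


catalog = VirtualDataCatalog()
catalog.data["cluster_sizes"] = [3, 5, 3, 8, 5, 3]  # made-up sample values
catalog.register_program(
    "histogram",
    lambda sizes: {s: sizes.count(s) for s in sorted(set(sizes))},
)
catalog.register_derivation("size_distribution", "histogram", ["cluster_sizes"])

distribution = catalog.materialize("size_distribution")
print(distribution)
print(catalog.provenance("size_distribution"))
```

The point of the design is that a request names a result rather than a file: the catalog either returns stored data or runs the recorded program on its inputs, and the same derivation records double as a provenance trail.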