This article discusses the importance of following operational standards based on real applications in order to achieve the vision of grid computing. It looks at past visions and challenges, the growth of the internet, and the need for increased capacity and new services. The article also explores the use of smart clients and the National Virtual Observatory in scientific data analysis.
Grid Challenges: It’s the vision, stupid… but it NEEDS TO be followed by operational standards based on real applications… The Global Grid Forum, 25 June 2003. Gordon Bell, Microsoft Corporation
A quick look at some past visions and a challenge • NREN >> Internet • WWW • Challenge: Will match any Grid-enabled application that wins a Gordon Bell Prize for parallelism
FCCSET NREN Plan, 11/1987 [Chart: bandwidth, 10K to 10G (log scale), versus year, 1988 to 2000. Labelled points: 56K; 1.5 M (Phase 1); 45 M (Phase 2); 3 G (Optical). Annotation: a factor of 1000 makes a difference.]
Originating Bandwidth (Gb/s), U.S. Interstate Comm. traffic (L. Roberts ’92). ARPAnet Goals c1972 = Grid Goals [Chart: originating bandwidth, 1 to 10,000 Gb/s (log scale), versus year, 1990 to 2020. Series: Video Conf., Voice, Video on Demand, Email, NSF bb, FAX, Broadcast TV.]
Growth in hype vs reality [Chart, c1995; data from Gordon’s WAG. Curves: WWW books, newspapers; Infoway regulation; Infoway speculation, “how great it’ll be” (politicians, telecoms & futurists); Infoway addiction; conferences; lawsuits.]
Articles per newspaper versus orders per second sent via Internet [Chart, c1995; data from Gordon’s WAG. Curves: orders per second; articles per newspaper.]
Articles about security, privacy, & fraud versus commerce ($M) [Chart, c1995; data from Gordon’s WAG. Curves: actual commerce; articles about risk and NOT doing commerce; organized crime on Internet.]
The virtuous cycle of bandwidth supply and demand [Diagram. Cycle stages: Increased Demand; Increase Capacity (circuits & bw); Create new service; Lower response time. Services: WWW, Audio, Video, Grids, Voice! Standards: IP, Telnet & FTP, EMAIL, Video Conf., FTP, Web Svcs.]
For More Information • Grid Book, c1998 (from 1996): 651 pp., 22 chapters, 41 authors; www.mkp.com/grids • The Globus Project™: www.globus.org • OGSA: www.globus.org/ogsa • Global Grid Forum: www.gridforum.org • Grid Computing, 2003: 1080 pages, 43 chapters, O(100) authors
Progress... a review • Grid started out with great promise… c1998. Interesting use at NASA for coupled programs • NMI (National Middleware Infrastructure)… State_Tools.gov, funded by NSF.gov: clearly open, clearly not “free”, not IETF model • Tools vs. standards & evolving working code • Some examples: • c1980: Seti@home, folding@home, >> Napster p2p • 2001: 15 TB Terraserver > Terraservice w/Web Services • 2003: Alex Szalay & Jim Gray: Skyserver/skyservice • Cornell Theory Center Web Services based apps • NEES: good poster child. An XML task • GrADS and Teragrid… dream or research or just $$s?
To the rescue! TerraServer Experience, c2001 • Successful Web Site • 50,000 daily users satisfied with “human-accessible” data • 59 GB imagery transmitted daily • New Feature Requests • Programmable access to meta-data • User selectable image sizes, i.e. “a map server” • Permission to use TerraServer data within server applications
TerraService Architecture (.NET) [Diagram. Clients: Smart Clients (WindowsForms, .NET Framework, ADO.NET); Standard Browsers (HTML Map UI, Web Forms). Services: Map Server (Http Handler, image/jpeg); TerraServer Web Service (XML, image/jpeg). Storage: Existing DB Server, three SQL 2000 databases of 1.0 TB each, 668 M rows, accessed via OLEDB.]
Data Intensive Science: the Next Frontier. The W.M. Keck Fellows in Advanced Scientific Data Analysis. Alex Szalay, The Johns Hopkins University, Department of Physics and Astronomy
National Virtual Observatory • NSF ITR project, “Building the Framework for the National Virtual Observatory” is a collaboration of 17 funded and 3 unfunded organizations • Astronomy data centers • National observatories • Supercomputer centers • University departments • Computer science/information technology specialists • PI and project director: Alex Szalay (JHU) • CoPI: Roy Williams (Caltech/CACR)
Scientific Data Exploration • Thousand years ago: science was empirical • describing natural phenomena • Last few hundred years: theoretical branch • using models, generalizations • Last few decades: a computational branch • simulating complex phenomena • Today: data exploration is emerging • synthesizing theory, experiment and computation with advanced data management and statistics
Living in an Exponential World • Astronomers have a few hundred TB now • 1 pixel (byte) / sq arc second ~ 4TB • Multi-spectral, temporal, … → 1PB • They mine it looking for new (kinds of) objects, more of interesting ones (quasars), density variations in 400-D space, correlations in 400-D space • Data doubles every year • Data is public after 1 year • So, 50% of the data is public • Same trend appears in all sciences
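The “50% of the data is public” bullet follows directly from the two bullets before it. A minimal sketch of the arithmetic, assuming idealized annual doubling; the function name, the `embargo_years` parameter and the starting volume are illustrative and do not come from the slides:

```python
def public_fraction(years: int, embargo_years: int = 1, initial: float = 1.0) -> float:
    """Fraction of all data collected so far that is older than the embargo.

    Assumes the total archive doubles every year (an idealization).
    """
    total_now = initial * 2 ** years
    # Everything that already existed `embargo_years` ago is public today.
    total_at_cutoff = initial * 2 ** (years - embargo_years)
    return total_at_cutoff / total_now


for year in (1, 5, 10):
    print(year, public_fraction(year))
```

Under this assumption the fraction is independent of the year: with a one-year embargo it is always one half, and a hypothetical two-year embargo would leave only a quarter of the data public.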
Why Is Astronomy Special? [Image panels: ROSAT ~keV; DSS Optical; IRAS 25µm; 2MASS 2µm; GB 6cm; WENSS 92cm; NVSS 20cm; IRAS 100µm] • It has no commercial value • No privacy concerns, freely share results with others • Great for experimenting with algorithms • It is real and well documented • High-dimensional (with confidence intervals) • Spatial, temporal • Diverse and distributed • Many different instruments from many different places and many different times • The questions are interesting • There is a lot of it (soon petabytes) • GB: It is not over-funded aka it’s poor
Making Discoveries • When and where are discoveries made? • Always at the edges and boundaries • Going deeper, collecting more data, using more colors… • Metcalfe’s law • Utility of computer networks grows as the number of possible connections: O(N²) • VO: Federation of N archives • Possibilities for new discoveries grow as O(N²) • Current sky surveys have proven this • Very early discoveries from SDSS, 2MASS, DPOSS
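The quadratic growth in the Metcalfe’s-law bullets can be made concrete by counting distinct pairs of archives. A small sketch, assuming every pair of federated archives can be cross-matched; the function name is illustrative:

```python
def possible_links(n: int) -> int:
    """Number of distinct pairs among n archives: n * (n - 1) / 2, which is O(N^2)."""
    return n * (n - 1) // 2


for n in (2, 4, 8, 16):
    print(n, "archives ->", possible_links(n), "pairwise combinations")
```

Each doubling of the number of archives roughly quadruples the number of pairings, which is the sense in which federation grows the space of possible discoveries faster than it grows the data.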
What can be learned from Sky Server? • It’s about data, not about harvesting flops • 1-2 hr. query programs versus 1 wk programs based on grep • 10 minute runs versus 3 day compute & searches • Database viewpoint. 100x speed-ups • Avoid costly re-computation and searches • Use indices and PARALLEL I/O. Read / Write >>1. • Parallelism is automatic, transparent, and just depends on the number of computers/disks. • Limited experience and talent to use dbases.
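The “database viewpoint” can be illustrated on a toy scale with Python’s built-in `sqlite3` module. This is only a sketch under stated assumptions: the table, the column names and the synthetic magnitudes are invented for illustration and have nothing to do with the actual SkyServer schema or its SQL Server back end.

```python
import random
import sqlite3


def build_catalog(n_rows: int = 20_000, seed: int = 0) -> sqlite3.Connection:
    """Create an in-memory table of synthetic objects with a magnitude column."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY, mag REAL)")
    conn.executemany(
        "INSERT INTO objects (id, mag) VALUES (?, ?)",
        ((i, rng.uniform(10.0, 25.0)) for i in range(n_rows)),
    )
    return conn


QUERY = "SELECT id FROM objects WHERE mag BETWEEN 14.0 AND 14.1 ORDER BY id"


def run_query(conn: sqlite3.Connection) -> list:
    """Return the ids of all objects in the magnitude window."""
    return [row[0] for row in conn.execute(QUERY)]


def query_plan(conn: sqlite3.Connection) -> str:
    """Return SQLite's description of how it intends to execute QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY)
    return " | ".join(row[-1] for row in rows)


conn = build_catalog()
before_rows, before_plan = run_query(conn), query_plan(conn)

# Adding an index changes how the answer is found, not what the answer is.
conn.execute("CREATE INDEX idx_objects_mag ON objects (mag)")
after_rows, after_plan = run_query(conn), query_plan(conn)

print("same answer:", before_rows == after_rows)
print("plan without index:", before_plan)
print("plan with index:   ", after_plan)
```

The query text never changes; only the access path does. That is the contrast the slide draws with grep-style programs, which must rescan all of the data for every new question.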
Soon: The Virtual Observatory • Many new surveys are coming • SDSS is a dry run for the next ones • LSST will be 5TB/night • All the data will be on the Internet • ftp, web services… • Data and applications will be associated with the instruments • Distributed world wide, cross-indexed • Federation is a must • Will be the best telescope in the world • World Wide Telescope • Finds the “needle in the haystack” • Successful demonstrations in Jan’03
Emerging Concepts • Standardizing distributed data access • Web Services, supported on all platforms • XML: Extensible Markup Language • SOAP: Simple Object Access Protocol • WSDL: Web Services Description Language • Standardizing distributed computing • Grid Services • Custom configure remote computing dynamically • Build your own remote computer, and discard • Virtual Data: new data sets on demand • Both needed for Data Exploration
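To show how XML and SOAP fit together, here is a minimal sketch that builds and parses a SOAP 1.1 style envelope using only Python’s standard library. The service namespace, the operation name `GetObjectsNear` and its parameters are hypothetical; they are not taken from any real SkyServer or TerraService interface.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "http://example.org/skyservice"               # hypothetical service namespace


def build_request(operation: str, params: dict) -> bytes:
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SVC_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_request(message: bytes):
    """Recover the operation name and string-valued parameters from an envelope."""
    envelope = ET.fromstring(message)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    call = body[0]  # first child of Body is the operation element
    operation = call.tag.split("}", 1)[1]
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


message = build_request("GetObjectsNear", {"ra": 185.0, "dec": 15.5, "radius": 1})
print(message.decode("utf-8"))
print(parse_request(message))
```

In practice a WSDL document would describe the operation and its parameter types so that client code like this can be generated rather than written by hand.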
Computational Science Simulations based on Web Services Gerd Heber Cornell Theory Center heber@tc.cornell.edu
Three Flavors of Adaptivity • Application-level • Mathematical model • High/low confidence • Algorithm-level • Discretization method • Solution technique • System-level • Resource availability • Fault tolerance
The Problem • Do distributed, coupled and adaptive multi-physics simulations of • Mechanics of chemically-reacting flows • (Damage) Thermo-Mechanics of solids • Components provided as Web Services
Geography • Cornell University • Theory Center • Department of Computer Science • Department of Civil Engineering • University of Alabama • Mississippi State University • College of William and Mary
Components • MiniCAD • Meshers • Surface (Delaunay, quality guarantees) • Volume (Dmesh, Jmesh, Gmesh) • Fluid/Thermal simulation (Loci, CHEM) • Thermo-mechanical component (CPTC) • Fracture mechanics • Visualization (OpenDX + SQL Server)
Web Services • “Web Services are self-contained, modular applications that can be described, published, located, and invoked over a network, …” (IBM) • Service oriented architecture: Publish, find, bind • XML, SOAP, UDDI, WSDL
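The publish, find, bind pattern can be sketched with a toy in-process registry. This is an illustration of the pattern only: the `Registry` class, the `ServiceDescription` fields and the example mesher are hypothetical, and a real deployment would use UDDI for the registry, WSDL for the description and SOAP for the invocation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceDescription:
    """Stand-in for a published description (name, keywords, callable endpoint)."""
    name: str
    keywords: List[str]
    endpoint: Callable[..., object]


class Registry:
    """Toy registry: providers publish descriptions, consumers find them."""

    def __init__(self) -> None:
        self._services: Dict[str, ServiceDescription] = {}

    def publish(self, description: ServiceDescription) -> None:
        self._services[description.name] = description

    def find(self, keyword: str) -> List[ServiceDescription]:
        return [d for d in self._services.values() if keyword in d.keywords]


def bind(description: ServiceDescription) -> Callable[..., object]:
    """Return something the consumer can invoke; here, just the local callable."""
    return description.endpoint


def unit_square_triangle_count(n: int) -> int:
    """Triangles in an n-by-n grid of squares, each split into two triangles."""
    return 2 * n * n


registry = Registry()
registry.publish(
    ServiceDescription("toy-mesher", ["mesh", "surface"], unit_square_triangle_count)
)

found = registry.find("mesh")
mesher = bind(found[0])
print(found[0].name, mesher(4))
```

The consumer never names the provider directly; it asks the registry for a capability and binds to whatever is returned, which is what lets components such as meshers or solvers be swapped without porting.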
Features and Requirements • Distributed expertise • No porting • Network accessibility (“firewall compliant”) • Platform and language neutrality • Security • Industry standards • Metadata • State • Students shouldn’t waste too much time with coding!
GrADS Vision • Build a National Problem-Solving System on the Grid • Transparent to the user, who sees a problem-solving system • Software Support for Application Development on Grids • Goal: Design and build programming systems for the Grid that broaden the community of users who can develop and run applications in this complex environment • Challenges: • Presenting a high-level application development interface* • If programming is hard, the Grid will not reach its potential • Designing and constructing applications for adaptability • Late mapping of applications to Grid resources • Monitoring and control of performance • When should the application be interrupted and remapped? *GB note: This is a superset of the previously unsolved clusters programming problem!
GrADSoft Architecture [Diagram. Components: Source Application; Whole-Program Compiler; Libraries; Software Components; Configurable Object Program; Scheduler; Resource Negotiator; Negotiation; Binder; Grid Runtime System; Real-time Performance Monitor; Performance Problem; Performance Feedback.]
Network for Earthquake Eng. Simulation • NEESgrid: US national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other • On-demand access to experiments, data streams, computing, archives, collaboration NEESgrid: Argonne, Michigan, NCSA, UIUC, USC www.neesgrid.org
A Universal Architecture for Web Services… Microsoft Vision • “Scales Away” spans organizations & geographies • “Scales Out” by adding machines • “Scales Up” on large systems • “Scales In” on a machine • “Scales Down” to devices [Diagram. Services: Security, Reliable Messaging, Transactions, Routing, … Base: Messaging Infrastructure. Targets: Distributed applications, Vertical processes, Embedded systems, Network equipment, …]
Web Services: Level I, Foundation to Build Upon • Basic profile • Defined by WS-I • XML, SOAP, WSDL, UDDI • Broad vendor support • WS-I assures widespread compatibility
Level II: Secure, Reliable, Transacted [Diagram stack: Connected Applications; Business Process Management, …; Secure, Reliable, Transacted; Metadata; Messaging; XML; Transports.]
Level III: From Infrastructure to Solutions • Application schemas • Domain specific profiles • Vertical industry services
Vision: Community/Data-Centric Computing Versus Machine-Centered Centers • Goal: Enable technical communities to create and take responsibility for their own computing environments of personal, data, and program collaboration and distribution • Design based on technology and cost, e.g. networking, apps programs maintenance, databases, and providing 24x7 web and other services • Many alternative styles and locations are possible • Service from existing centers, including many state centers • Software vendors could be encouraged to supply apps web services • NCAR style center based on shared data and apps • Instrument- and model-based databases. Both central & distributed when multiple viewpoints create the whole. • Wholly distributed services supplied by many individual groups
Community/Data Centric: “web service” • Community is responsible • Planned & budgeted as resources • Responsible for its infrastructure • Apps are from community • Computing is integral to work • In sync with technologies • 1-3 Tflops/$M; 1-3 PBytes/$M to buy smallish Tflops & PBytes. • New scalables are “centers” • Community can afford and evolve • Dedicated to a community • Program, data & database centric • May be aligned with instruments or other community activities • Output = web service; Can communities form that can supply services?
Commitment to standards • A general architecture comes largely from understanding the problems • Understanding the problems comes from actually solving such problems • This is bottom-up, based on experience • Microsoft is committed to developing community-wide web services standards… • Is the Grid Forum equally committed?
The End • How can GRIDs become a real, useful computer structure? • Get a life. • Use the standards and tools. • Adopt an application and/or community… now!