Explore the world of computational grids: their goals, example projects like CERN's Large Hadron Collider, infrastructure like Condor and Globus, deployed grids like NASA's Information Power Grid, and the latest grid news including NSF's Distributed Terascale Facility.
Object Web Architectures Portals P2P XML
ERDC Gateway Tutorial
Geoffrey Fox
IPCRES Laboratory for Grid Technology
Computer Science, Informatics, Physics
Indiana University, Bloomington IN
fox@csit.fsu.edu
erdcgridportalaug01
Computational Grids Survey
A brief introduction to computational grid projects and goals.
What Is a Computational Grid?
• Grids link distributed scientific resources.
• Resources can be geographically and politically distributed.
• Goal: provide a means for sharing resources between organizations.
• Example “high-end” resources:
  • Supercomputers and clusters
  • Mass storage
  • Advanced visualization (CAVEs) and collaboration (Access Grid)
  • Particle colliders, telescopes, earthquake detectors
• www.globus.org/research/papers/anatomy.pdf
What Does a Grid Need?
• Multi-institutional security
  • PKI or Kerberos
• Information services
  • Manage, store, and deliver information about resources.
  • Use information to make decisions.
• Scheduling and queuing
  • Advance reservation
  • Meta-queuing
• Remote execution, file transfer, monitoring
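The link between information services and meta-queuing can be illustrated with a small sketch. This is not the design of any particular grid toolkit: the `Resource` fields, the `InformationService` class, and the "shortest local queue wins" rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """One entry in a hypothetical grid information service."""
    name: str
    site: str
    free_cpus: int
    queued_jobs: int


class InformationService:
    """Stores resource descriptions and answers simple queries."""

    def __init__(self):
        self._resources = {}

    def register(self, resource):
        # A later registration with the same name replaces the earlier one.
        self._resources[resource.name] = resource

    def candidates(self, cpus_needed):
        # Only resources with enough free CPUs are eligible.
        return [r for r in self._resources.values() if r.free_cpus >= cpus_needed]


def meta_queue(info, cpus_needed):
    """Pick the eligible resource with the shortest local queue, or None."""
    eligible = info.candidates(cpus_needed)
    if not eligible:
        return None
    # Ties on queue length are broken by name so the choice is deterministic.
    return min(eligible, key=lambda r: (r.queued_jobs, r.name))


info = InformationService()
info.register(Resource("cluster-a", "site-1", free_cpus=64, queued_jobs=12))
info.register(Resource("cluster-b", "site-2", free_cpus=128, queued_jobs=3))
info.register(Resource("smp-c", "site-3", free_cpus=8, queued_jobs=0))

choice = meta_queue(info, cpus_needed=32)
print(choice.name if choice else "no resource available")  # prints "cluster-b"
```

A real meta-queue would also have to handle security credentials, advance reservations, and stale information, which this sketch ignores.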
Example of a Grid Problem: CERN’s Large Hadron Collider
• Goes on-line in 2005
• Will generate petabytes of raw, distributed data and terabytes of event summary data.
• Computing resources for data analysis will be distributed between CERN and regional centers spread all over the world.
• 1500-2000 people will collaborate on experiments.
Grid Projects
• Grid Infrastructure
  • Condor: www.cs.wisc.edu
  • Globus: www.globus.org
  • Legion: www.cs.virginia.edu/~legion
• Grid Applications
  • NetSolve: www.cs.utk.edu/netsolve
  • Ninf: www.etl.go.jp
• Global Grid Forum: www.gridforum.org
Examples of Deployed Grids
• NASA’s Information Power Grid
  • Links NASA’s Ames, Glenn, and Langley Centers.
  • LaunchPad currently available
  • www.ipg.nasa.gov
• DOE’s ASCI Distributed Resource Management
  • Links classified computing resources at Lawrence Livermore, Los Alamos, and Sandia National Labs.
  • Full deployment scheduled by Nov 2001.
Latest Grid News
• NSF will spend $53 million on the Distributed Terascale Facility (DTF)
  • 13.6 teraflops, 600 terabytes, 40 Gigabit/sec
  • DTF sites: NCSA, SDSC, Argonne, Caltech
  • Industry partners: IBM, Intel, Qwest
• See www.ncsa.uiuc.edu/News/Access/Releases for more information (August 9).
Distributed Objects
• Examples of current object technologies
  • Documents -- URL
  • "General Programs including database invocations"
  • Old Style Web -- CGI
  • New Style Web -- XML
  • CORBA and COM -- a special "interface definition language" (IDL) defines invocation in C++-like syntax
  • RMI uses the Java language as its IDL
• Benefits of distributed objects
  • Allow objects written in different languages to communicate seamlessly via standardized messaging protocols embodied by middleware
  • Higher levels of transparency and interoperability
  • Objects can be “self-managing” of resources
  • Provide a flexible grain of decomposition for building complex systems
Distributed Object Web Technology Model
• Basic Vision: Merge Web and Distributed Objects
• E.g. need to abstract entities (Web pages, database entries, simulations) and services as objects with methods (interfaces)
• CORBA .. XML is “just” CGI done right
• COM (Microsoft) and CORBA (world) are competing cross-platform and cross-language object technologies
• JavaBeans plus RMI and perhaps JINI is the 100% pure Java distributed object technology
• W3C says you should use XML, which defines a better IDL, with Schema as an object specification model and SOAP as an object access model
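To make the idea of treating entities as objects with methods invoked through XML messages concrete, here is a minimal sketch. The `<call>`/`<param>` message layout, the `Simulation` class, and the registry are hypothetical and chosen only for illustration; this is not SOAP or any W3C format.

```python
import xml.etree.ElementTree as ET


def encode_call(object_name, method, params):
    """Serialize a method invocation as a small XML document."""
    call = ET.Element("call", {"object": object_name, "method": method})
    for key, value in params.items():
        param = ET.SubElement(call, "param", {"name": key})
        param.text = str(value)
    return ET.tostring(call, encoding="unicode")


class Simulation:
    """A stand-in backend object exposing one method."""

    def run(self, steps, dt):
        # Arguments arrive as strings from the XML message.
        return float(steps) * float(dt)


def dispatch(xml_text, registry):
    """Middle-tier step: parse the message and invoke the named method.

    A real system would restrict which methods may be called; this
    sketch trusts the message.
    """
    call = ET.fromstring(xml_text)
    target = registry[call.get("object")]
    kwargs = {p.get("name"): p.text for p in call.findall("param")}
    return getattr(target, call.get("method"))(**kwargs)


message = encode_call("sim1", "run", {"steps": 100, "dt": 0.5})
result = dispatch(message, {"sim1": Simulation()})
print(message)
print(result)  # prints 50.0
```

The client only needs the message format, not the language or location of the backend object, which is the property the slide attributes to distributed object middleware.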
3-Tier Architecture and Different Object Models
• There are several important Object Models: COM, CORBA, Java, Web, Oracle Database ……
• But it doesn’t matter!!
[Diagram labels: Object Repository; Database; XML; File System (Web Site); Request or Export/Import Information; Middle Tier “Business Logic” dissociates User and Back End]
Emerging Object Web Multi-Server Model
[Diagram labels: Clients and their servers; Middle Tier Custom Servers; Back End Servers and their services]
Computational Science Grid: Multi-Server Web Computing System
• Portals are user interfaces to a Grid
• The World Wide Web is a big Grid
• P2P Networks include Grids
[Diagram labels: Multidisciplinary Control (WebFlow); Portal Control; Parallel DB Proxy; Database; NEOS Control Optimization; Optimization Service; Origin 2000 Proxy; MPP; NetSolve Linear Alg. Server; Matrix Solver; Agent-based Choice of Compute Engine; IBM SP2 Proxy; Data Analysis Server; Portals; MPP; The Grid]
Global Grid Forum
Computational Grids
• Exploit the analogy with electricity – make using a computer as natural as plugging an appliance (PDA, PC) into a wall socket
• Make the ensemble of computers, storage devices, and scientific instruments on the web “seamlessly accessible”
• Link components of the grid together to solve a single problem
  • Clusters, metacomputers
• There are computational grids, education grids, information grids, shopping grids etc.
• The web is an (information) grid
• Everything is an object
• Generic access implies standards for APIs, protocols, and services
• USC (ISI, Carl Kesselman) and Argonne (Ian Foster) pioneered grids
Issues for Grids and hence Portals
• Are the grid components pretty much fixed, such as giant ASCI supercomputers?
• Are they fleeting and mobile, such as Internet-connected cell phones?
• The set of IP-enabled home sensors, appliances, and controllers is a grid
• What are the requirements?
  • Anonymity, performance, security, ease of use …
• Different components and requirements imply that there is not likely to be just one grid but a federation of interoperable grids
• What are the “standards” and who sets them?
• How do universities build grids they care about on graduate time while industry builds and abandons remarkable technologies on Internet time?
Foster’s Grid architecture
• What is the difference between a protocol (SOAP, HTTP) and an application interface (HTML, MIME)?
ASCI Grid
• Link the multi-teraflop computers of ASCI together: today 12, 3, and 2 teraflops; by 2005, 100, 60, and 20 teraflops
IPG Architecture
Information Power Grid
Led by NASA Ames
Experimental Particle Physics Grid
Earthquake Engineering Grid
• Links experimental facilities, compute resources, and people
Commodity Portals are Web Interfaces for Consumers
Yahoo, NetCenter, Amazon.com, Ebay.com etc. are portals for e-commerce, news etc.
We want to use these ideas in building computer interfaces
Hierarchy of Portals and Their Technology
[Diagram: layered hierarchy of portals; grouping below is inferred from the surviving labels]
• Portal Building Tools and Frameworks (XML, iPlanet, Portlets, www.desktop.com)
• Generic Portals
  • Generic Services: Collaboration, Universal Access, Security, user customization, component libraries, fixed channels …
• Enterprise Portals
  • Information Services, Databases, Grid Services …
• Science Portals
  • Compute Services: Visualization, MathML etc.
  • Biology, Chem, Egy …
• Education and Training Portals
  • Education Services: Quizzes, Grading …
  • K-12, University …
Services in Any Grid Application
• Security
• Fault Tolerance
• Object Lookup and Registration
• Object Persistence and Database support
• Event and Transaction Services
• Information Services
• Collaboration among users
  • Teachers and Students (Centra)
  • Market lead and Salespeople (WebEx)
Further Services in Computational Grids
• Job Status
• File Services (as in NPACI Storage Resource Broker)
• Support for (XML-based) computational-science-specific metadata like MathML, XSIL
• Visualization
• Programming, Debugging, Performance Monitoring
• Application Integration (chaining services viewed as backend compute filters), which can be called Workflow
• “Seamless Access” and integration of resources between different users/application domains
• Job Scheduling (Condor) and special operating modes such as a multitude of parameter search jobs
• Parameter Specification Service (get data from a Web form into a Fortran program wrapped as a backend object)
• High Performance for general services
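The parameter specification service listed above (moving Web form data into a wrapped Fortran program) can be sketched as follows. The field names, the `schema` mapping, and the choice of a Fortran namelist as the input format are assumptions made for illustration; the slides do not say how the backend program reads its input.

```python
def parse_form(raw, schema):
    """Convert string-valued form fields to typed values using a schema."""
    return {name: convert(raw[name]) for name, convert in schema.items()}


def to_namelist(group, params):
    """Render a dict of parameters as Fortran namelist text."""
    entries = []
    for name, value in params.items():
        if isinstance(value, bool):
            text = ".true." if value else ".false."
        elif isinstance(value, str):
            # Fortran escapes a single quote by doubling it.
            text = "'" + value.replace("'", "''") + "'"
        else:
            text = repr(value)
        entries.append(f"  {name} = {text}")
    return f"&{group}\n" + ",\n".join(entries) + "\n/"


# Hypothetical form submission: every value arrives as a string.
schema = {"nsteps": int, "dt": float, "label": str}
form = {"nsteps": "200", "dt": "0.01", "label": "run-a"}

namelist = to_namelist("sim", parse_form(form, schema))
print(namelist)
```

A wrapper around the Fortran executable could write this text to an input file before launching the job; validation of ranges and missing fields is left out of the sketch.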
Web Computing and P2P
• Pleasingly (embarrassingly) parallel applications involve the management of multiple jobs running on separate, largely independent parts
  • Some Monte Carlo calculations and parameter searches
  • Also fancy number theoretic applications such as cracking of RSA security
• Here we see “use of idle cycles” and similar job scheduling issues
• Many have noticed the value of the Web for this, and it is sometimes called P2P or peer-to-peer computing as it involves peers on the edge of the Internet – not monster servers in the middle
• Note the total power of the Web is around one thousand times that of the most powerful supercomputer, but how much can be harnessed?
P2P for Distributed Computing or Web Computing I
• The P2P applications are highlighted by the use of millions of Internet clients to analyze data looking for extraterrestrial life (SETI@home, http://setiathome.ssl.berkeley.edu/) and the newer project examining the folding of proteins (Folding@home, http://www.stanford.edu/group/pandegroup/Cosm/).
• These are building distributed computing solutions for a special class of pleasingly or embarrassingly parallel applications:
  • Those that can be divided into a huge number of essentially independent computations, where a central server system doles out separate work chunks to each participating client.
• This approach is called P2P because the computing is peer based, even though it does not have the "peer only communication" characteristic of P2P information systems like Gnutella and Napster.
• SETI@home and Folding@home are elegantly implemented as screen savers that you download.
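The pattern described here, a central server handing independent work chunks to clients and combining the results, can be sketched in a few lines. This is only an illustration of the pattern, not how SETI@home or Folding@home are implemented: local threads stand in for remote clients, and counting primes stands in for the real computation.

```python
from concurrent.futures import ThreadPoolExecutor


def make_chunks(start, stop, size):
    """Split the range [start, stop) into independent work units."""
    return [(lo, min(lo + size, stop)) for lo in range(start, stop, size)]


def count_primes(chunk):
    """Work unit run by one client: count primes in [lo, hi)."""
    lo, hi = chunk
    count = 0
    for n in range(max(lo, 2), hi):
        # Trial division up to the square root of n.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_server(chunks, n_clients=4):
    """Central server: hand out chunks, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=n_clients) as clients:
        return sum(clients.map(count_primes, chunks))


chunks = make_chunks(0, 1000, 100)
total = run_server(chunks)
print(total)  # prints 168, the number of primes below 1000
```

Because no chunk depends on another, clients never talk to each other; all communication goes through the server, which is the sense in which this differs from "peer only" systems.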
Parabon
• Pure Java model
• Ensures security
Entropia Financial Modeling I
Entropia Financial Modeling II
• Each basic financial instrument can be calculated independently
• Central server interprets the total simulation
• Make money or learn what causes market swings or ….
Drug Structure Simulations
United Devices also does Drug Simulation
• Parameter study: do billions of simulations, each with different parameters
• Search-engine-like interface to simulation
• Works because each calculation fits in a PC; a detailed molecular model usually would not
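A parameter study of this kind amounts to running the same small calculation over every combination of parameter values and ranking the results. The sketch below assumes a made-up `score` function and a two-parameter grid; it is not United Devices' software, only an illustration of the structure.

```python
from itertools import product


def score(temperature, concentration):
    """Hypothetical stand-in for one small, independent simulation."""
    return -((temperature - 310) ** 2) - 100 * (concentration - 0.5) ** 2


def parameter_study(grid):
    """Evaluate every parameter combination and return the runs ranked."""
    names = list(grid)
    runs = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        runs.append((score(**params), params))
    # Sort on the score only, best first.
    return sorted(runs, key=lambda run: run[0], reverse=True)


grid = {
    "temperature": [290, 300, 310, 320],
    "concentration": [0.25, 0.5, 0.75],
}
ranked = parameter_study(grid)
best_score, best_params = ranked[0]
print(len(ranked), best_params)
```

Each call to `score` is independent, so the loop body is exactly the kind of work unit that could be shipped to a separate PC.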
Performance of Entropia Network
P2P for Distributed Computing or Web Computing II
• Other projects of this type include:
  • United Devices (http://www.ud.com/home.htm, based on SETI@home),
  • AppliedMeta (http://www.appliedmeta.com, based on the well-known Legion project from the University of Virginia),
  • Parabon Computation (http://www.parabon.com),
  • Condor (from Wisconsin, http://www.cs.wisc.edu/condor/) and
  • Entropia (http://www.entropia.com/).
• Other applications for this type of system include financial modeling, bio-informatics, measurement of web server performance, and the scheduling of different jobs to use idle time on a network of workstations.
• Ian Foster has given a more detailed review of these activities at http://www.nature.com/nature/webmatters/grid/grid.html and related them to computational grids (http://www.gridforum.org).
Learning Management Grid from DoD ADL
ADL = Advanced Distributed Learning
[Diagram labels: Learning Server; Content Server(s); External systems: HR, E-Commerce, ERP...; “Learning Management System” (LMS); Course Interchange: Course Structure Format (CSF), Metadata; Migration Adapter; Common Grid Services & Objects; Services or Adapter; Server Adapter; Server Side; Runtime Environment: Launch, API, Data Model; Client Side: Client, Browser, API Adapter, Application, HTML+]
www.adlnet.org
Properties of Educational Objects
• Metadata from IEEE and IMS
  • Roughly, the properties of educational objects thought of as “documents” (author, title …)
• Course Packaging from ADL and IMS
  • How to form bigger (educational) objects from smaller objects
• Enterprise Properties from IMS
  • Link to people (users) and organization databases (rather incomplete at present, but important, as we can probably agree)
• Tests and Quizzes from IMS
• Specialized descriptors from ADL
  • Such as objectives, prerequisites, completion requirements
[Figure label: All Grids]
Education Specific Portal Services
• Administrative Structure
  • Degrees, departments, lecturers, Deans ...
• Performance (grading) information
• Homework submission
• Quizzes of various types (multiple choice, random parameters)
• Assessment data and analysis
• Hierarchical curriculum structure from document fragment to page to lecture to course
• Napster/Gnutella type P2P distributed information system with personalized dynamic collections (analogy between CDROM of pirated music and dynamic lectures/personal info resource as in RealJukebox)
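The hierarchical curriculum structure mentioned above (fragment, page, lecture, course) maps naturally onto a tree. The following sketch uses a hypothetical `Node` class and made-up titles; it is one possible representation, not a format defined by ADL or IMS.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One level of the curriculum: course, lecture, page, or fragment."""
    kind: str
    title: str
    children: list = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child

    def walk(self, depth=0):
        """Yield (depth, node) pairs in document order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


course = Node("course", "Grid Computing")
lecture = course.add(Node("lecture", "Portals"))
page = lecture.add(Node("page", "Commodity Portals"))
page.add(Node("fragment", "Definition"))
page.add(Node("fragment", "Examples"))

outline = [(depth, node.kind) for depth, node in course.walk()]
for depth, node in course.walk():
    print("  " * depth + f"{node.kind}: {node.title}")
```

A personalized collection of the kind the last bullet describes could then be a new tree that reuses existing page or fragment nodes.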
Some Science Portals and Services: Gannon
Legend of service abbreviations:
• JS: Job Submission
• JM: Job Management, e.g. File Staging
• IS: Information Services
• FM: File Management
• AA: Authorization and Accounting
• CT: Composition
• SC: Scripting
• EJ: Job Journaling