EarthScope CSIT Workshop, March 25 2002
Grids for GeoSensors, GeoScience and GeoScientists
PTLIU Laboratory for Community Grids
Geoffrey Fox
Computer Science, Informatics, Physics
Indiana University, Bloomington IN 47404
http://grids.ucs.indiana.edu/ptliupages/presentations/earthscopesmallmar02
gcf@indiana.edu
uri="http://grids.ucs.indiana.edu/ptliupages/presentations/earthscopemar02" email="gcf@indiana.edu"

Trends of Importance
• Resources of increasing performance or functionality
• Computers (ASCI, Earth Simulator to TeraGrid), storage, sensors, networks, PDAs
• Applications of increasing sophistication
• Size, multi-scales, multi-disciplines
• New algorithms and mathematical techniques
• Computer science
• Compilers, Parallelism, Objects, Components
• Grid and Internet Concepts and Technologies
• Enabling new applications
• XML, Web Services, Portals, Collaboration

Projected Top 500 Until Year 2009
• First, Tenth, 100th, 500th, SUM of all 500 Projected in Time
[Figure: projected Top 500 performance chart. Annotation: Earth Simulator from Japan, http://geofem.tokyo.rist.or.jp/]

PACI 13.6 TF Linux TeraGrid
[Figure: TeraGrid network and hardware diagram. Sites shown: NCSA (500 nodes, 8 TF, 4 TB memory, 240 TB disk); SDSC (256 nodes, 4.1 TF, 2 TB memory, 225 TB disk); Caltech (32 nodes, 0.5 TF, 0.4 TB memory, 86 TB disk); Argonne (64 nodes, 1 TF, 0.25 TB memory, 25 TB disk). The sites connect through the Chicago & LA DTF core switch/routers, with external links to vBNS, Abilene, MREN, Calren, ESnet, NTON and Starlight.]

The HPCC Track
• The 1990 HPCC 10 year initiative was largely aimed at enabling large scale simulations for a broad range of computational science and engineering problems
• It was in many ways a success and we have methods and machines that can (begin to) tackle most 3D simulations
• ASCI simulations are particularly impressive
• DoE is still putting substantial resources into basic software and algorithms, from adaptive meshes to PDE solver libraries
• Machines are still increasing in performance exponentially and should achieve petaflops in the next 7-10 years
• EarthScope community needs to harness these capabilities
• Japan’s Earth Simulator activity is a major effort with large hardware and software (GEOFEM) components

Some HPCC Advice to EarthScope
• Important to build sustainable modular software
• Use MPI and OpenMP if needed for performance on shared memory nodes
• Adaptive Meshes
• Load Balancing
• PDE Solvers including fast multipoles
• Particle dynamics
• The items above are well understood as ways to get high performance parallel simulations: use broad community expertise
• Other areas such as datamining, visualization and data assimilation are quite advanced but still need significant research

Use of Object Technologies
• There is an emerging HPCC component architecture allowing production of more modern libraries (integration infrastructure)
• DoE has a very large CCA – Common Component Architecture – effort
• Package software (“system and applications”) as distributed objects – not as traditional libraries
• CORBA, Java and Web Services are not naturally high performance as component models but are OK for coarse grain objects (“full programs”)
• As a language, C++ can be high performance but Java is not uniformly so (it is improving)
• Fortran (including Fortran90) will continue to decline in importance and interest – the community should prefer not to use it
• Not essential to write modules in an object oriented language
• It is essential to package modules in an object framework

What is a Web Service I
• A web service is a computer program running on either the local or a remote machine with a set of well defined interfaces (ports) specified in XML (WSDL)
• In principle, the computer program can be in any language (Fortran .. Java .. Perl .. Python) and the interfaces can be implemented in any way whatsoever
• Interfaces can be method calls, Java RMI messages, CGI Web invocations, or totally compiled away (inlining) but
• The simplest implementations involve XML messages (SOAP) and programs written in net friendly languages like Java and Python
• Web Services separate the meaning of a port (message) interface from its implementation so one CAN get high performance in spite of the voluminous XML format
• Enhances/enables a re-usable component model of ANY electronic resource

What is a Web Service II
[Figure: Payment/Credit Card, Security, Catalog, and Warehouse shipping services connected through WSDL interfaces]
• Web Services have the important implication that ALL interfaces are XML message based. In contrast
• Web Services in some sense replace distributed object paradigms such as CORBA and Java but can wrap these other technologies as Web Services
• We wrapped our CORBA + Java Computing Portal Gateway as Web Services straightforwardly

[Figure: raw resources and raw data feed Web Services (WS) through a (virtual) XML data interface; WS to WS interfaces are XML; a (virtual) XML knowledge (user) interface is rendered to XML display format through a (virtual) XML rendering interface for clients]

Classic Grid Architecture
[Figure labels: Database, Resources, Content Access, Composition, Middle Tier Brokers, Service Providers, Netsolve, Security, Collaboration, Computing, Clients, Users and Devices. Annotation: Middle Tier becomes Web Services]

Examples of System Web Services I
• OGSA (Open Grid Service Architecture)
• Integrates Web Service and Grid concepts and allows Globus to be implemented as Web Services
• Audio-Video Conferencing as a Web Service
• Integrates H323, SIP, JXTA (etc.) protocols by mapping to a single XML interface
• Provides VRVS reflector model from Messaging Web Service
• Messaging or Event Web Service provides intelligent routing and buffering of messages
• Computing as a Web Service
• Job submittal, status, composition, data services, visualization
• Performance WS allows access to distributed monitoring data, analysis, models, and final benchmarks with interoperable XML interfaces

EarthScope Peer to Peer Grid Community
[Figure: “Everything” (people/sensors/applications) connected by XML messages; distributed scientists using Collaboration Web Service to access/use Application Web Services]

Gateway and Web Services
• We can use the Gateway Computing Portal as an example (http://www.gatewayportal.org)
• It is largely built using CORBA with a Java Server Pages front end
• Several capabilities have been interfaced using WSDL
• Job Submission (11 Methods including execute local and remote command, copy files etc. as well as Submit Job)
• Manage WebFlow Session (67 Methods)
• Generate Batch Script (just 1 method but two implementations developed – one at SDSC and one at Indiana – with UDDI to manage)
• Each is one service – could have used finer grain services
• Sample files are at http://grids.ucs.indiana.edu/ptliupages/presentations/ggf4feb02

WSDL Abstractions
• WSDL abstracts a program as an entity that does something given one or more inputs, with its results defined by streams on one or more outputs.
• Functions are defined by method name and parameters: methodname(parm1, parm2, … parmN)
• Where parameters are “Input”, “Output” or both
• In WSDL, we will have a Web Service which (like a Java or CORBA program) can be thought of as a (distributed) object with many methods
• Instead of a function call, the “calling routine” sends an XML message to the Web Service specifying methodname and values of the parameters
• Note the name of the function is just another parameter

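The idea that a function call becomes an XML message, with the method name carried as just another field, can be sketched as follows. This is a minimal illustration and not real WSDL or SOAP: the element names `message`, `methodname` and `part`, and the `submit` method, are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET


def call_to_message(methodname, **params):
    """Encode a call as XML; the method name is just another field."""
    msg = ET.Element("message")  # hypothetical element names, not WSDL/SOAP
    ET.SubElement(msg, "methodname").text = methodname
    for name, value in params.items():
        part = ET.SubElement(msg, "part", name=name)
        part.text = str(value)
    return ET.tostring(msg, encoding="unicode")


def dispatch(xml_text, methods):
    """Service side: read the method name and parameters, then call the method."""
    msg = ET.fromstring(xml_text)
    name = msg.findtext("methodname")
    args = {p.get("name"): p.text for p in msg.findall("part")}
    return methods[name](**args)


# A toy service with a single hypothetical method.
methods = {"submit": lambda xmljob: "script for " + xmljob}
request = call_to_message("submit", xmljob="job-description")
reply = dispatch(request, methods)
```

The caller never touches the service's code directly; it only needs to agree on the message layout, which is the role WSDL plays for real Web Services.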
WSDL Message Example
  <message name="submitRequest">
    <part name="xmljob" type="xsd:string"/>
  </message>
  <message name="submitResponse">
    <part name="response" type="xsd:string"/>
  </message>
• For the batch script service, we pass the XML description of the job as a string and get back the script as a string.
• In general, any XML primitive or complex types can be used in messages.
• We could improve our service by defining a BatchScript complex type.

SOAP and Gateway Portal I
• Having specified the service in WSDL, the run-time is implemented in SOAP, which is “just” an XML header (info needed by transport – empty here) and body
• Here is SOAP transported by an HTTP message
• This is the execLocalCommand WSDL operation, used to run one particular command (ls) on the current WebFlow directory
[Figure: HTTP message listing with callouts: HTTP Header; SOAP Envelope and body; Argument of operation (specifies ls)]

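Since the original listing survives only as a figure, here is a rough sketch of what such a request could look like when built in Python. The SOAP 1.1 envelope namespace is the standard one. The service namespace `urn:example-gateway`, the `command` element name, and the host and path are assumptions for illustration and are not taken from the Gateway portal.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example-gateway"  # hypothetical service namespace


def build_soap_request(operation, argument):
    """Build an envelope with an empty header and one operation in the body."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")  # empty, as on the slide
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    op = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    ET.SubElement(op, "command").text = argument  # hypothetical parameter name
    return ET.tostring(envelope, encoding="unicode")


def build_http_post(host, path, soap_xml):
    """Wrap the SOAP text in the HTTP header lines that would carry it."""
    payload = soap_xml.encode("utf-8")
    headers = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        'Content-Type: text/xml; charset="utf-8"',
        f"Content-Length: {len(payload)}",
        'SOAPAction: ""',
    ]
    return "\r\n".join(headers) + "\r\n\r\n" + soap_xml


soap_xml = build_soap_request("execLocalCommand", "ls")
http_message = build_http_post("example.org", "/gateway", soap_xml)
```

Printing `http_message` shows the three parts called out on the slide: the HTTP header, the SOAP envelope and body, and `ls` as the argument of the operation.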
Examples of System Web Services II
• Education as a Web Service
• One of the easiest to do as object standards are well defined (IMS) and there are few performance issues
• Grading, Homework submission, registration, assessment etc.
• Universal Access and Web Services
• As Web Services allow multiple implementations of a particular interface, one can adjust to the needs of particular clients (PDA users, users with impaired sight etc.)
• Can build custom implementations of certain web services for particular communities but re-use others
• Collaborative Web Services
• As interfaces are all message based, it is much easier to share Web Services than other applications (the PowerPoint interface is NOT message based and is harder to share than a server app)

Education as a Web Service
• Can link to Science as a Web Service and substitute educational modules
• “Learning Object” XML standards already exist from IMS/ADL http://www.adlnet.org – need to update architecture
• Web Services for virtual university include:
• Registration
• Performance (grading)
• Authoring of Curriculum
• Online laboratories for real and virtual instruments
• Homework submission
• Quizzes of various types (multiple choice, random parameters)
• Assessment data access and analysis
• Synchronous Delivery of Curricula
• Scheduling of courses and mentoring sessions
• Asynchronous access, data-mining and knowledge discovery
• Learning Plan agents to guide students and teachers

Sensor Web Service
[Figure: Distributed Sensor Web Service. In Web Service ports: sensor data in different formats. Out Web Service ports: universal sensor access for people/computers]

Application Web Services
[Figure: Sensor Data as a Web Service (WS) and Sensor Management WS feed Simulation WS, Data Analysis WS and Visualization WS. Services can be built as multiple interdisciplinary programs (Prog1 WS, Prog2 WS) or as multiple Filter Web Services (Filter1 WS, Filter2 WS, Filter3 WS). SLE (Space Link Extension) as a WS]
• Note the Service model integrates sensors, sensor analysis, simulations and people
• An Application Web Service is a capability used either by another service or by a user
• It has input and output ports – data is from users, sensors or other services
• Big services are built hierarchically from “basic” services

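Building a big service from “basic” filter services, each with an input and an output port, can be sketched as plain function composition. The three filters and the sensor readings below are hypothetical stand-ins, and a real deployment would exchange XML messages rather than Python lists.

```python
from functools import reduce


def compose(*services):
    """Chain services so each one's output port feeds the next one's input port."""
    return lambda data: reduce(lambda value, service: service(value), services, data)


# Hypothetical basic filter services acting on a list of sensor readings.
def drop_missing(readings):
    return [r for r in readings if r is not None]


def calibrate(readings, offset=0.5):
    return [r - offset for r in readings]


def summarize(readings):
    return {"count": len(readings), "mean": sum(readings) / len(readings)}


# The bigger service is itself usable as a building block for larger ones.
sensor_analysis = compose(drop_missing, calibrate, summarize)
result = sensor_analysis([1.5, None, 2.5, 3.5])
```

Because `sensor_analysis` has the same shape as the filters it was built from, it can be passed to `compose` again, which is the hierarchical construction the slide describes.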
e-Science is XML Specified Resources connected by XML specified messages
[Figure: a database and software resources, each with an XML skin, linked by a message or event based interconnection. The implementation of resource and connection may or may not be XML]

e-Science is just a pile of XML
• Each leaf is a piece of XML either defining a nugget of information and/or containing links to other XML or “raw resources”
[Figure label: Database]

XML (RSS) Specification of Information Nuggets
  <item rdf:about="http://xml.com/pub/2000/08/09/xslt/xslt.html">
    <title> Processing Inclusions with XSLT </title>
    <link>http://xml.com/pub/2000/08/09/xslt/xslt.html</link>
    <description>
      Processing document inclusions with general XML tools can be
      problematic. This article proposes a way of preserving inclusion
      information through SAX-based processing.
    </description>
  </item>
  <item rdf:about="http://xml.com/pub/2000/08/09/rdfdb/index.html">
    <title> Putting RDF to Work </title>
    <link>http://xml.com/pub/2000/08/09/rdfdb/index.html</link>
    <description>
      Tool and API support for the Resource Description Framework
      is slowly coming of age. Edd Dumbill takes a look at RDFDB,
      one of the most exciting new RDF toolkits.
    </description>
  </item>
  </rdf:RDF>
Example of XML meta-data in the “pile” pointing to other (outside) resources

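A program can pull the nuggets and their outward links from such a file with a standard XML parser. The sketch below assumes the excerpt comes from an RSS 1.0 document, so it supplies the opening `rdf:RDF` element and the namespace declarations that the slide leaves out, and it shortens the items to title and link.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"  # assumed: RSS 1.0 default namespace

# Opening tag and namespaces are assumptions; the slide shows only the items.
document = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns="{RSS}">
  <item rdf:about="http://xml.com/pub/2000/08/09/xslt/xslt.html">
    <title> Processing Inclusions with XSLT </title>
    <link>http://xml.com/pub/2000/08/09/xslt/xslt.html</link>
  </item>
  <item rdf:about="http://xml.com/pub/2000/08/09/rdfdb/index.html">
    <title> Putting RDF to Work </title>
    <link>http://xml.com/pub/2000/08/09/rdfdb/index.html</link>
  </item>
</rdf:RDF>"""


def list_nuggets(xml_text):
    """Return (title, resource URI) pairs for every item in the document."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext(f"{{{RSS}}}title").strip(), item.get(f"{{{RDF}}}about"))
        for item in root.iter(f"{{{RSS}}}item")
    ]


nuggets = list_nuggets(document)
```

Each pair couples a human-readable title with the URI of the outside resource it points to, which is all a consumer needs to follow the link.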
Distributed Information
Actually the XML is distributed all around in a dynamic Grid

Structured (XML) Information
Note XML specifies both internal and external nodes of the tree
[Figure: tree with nodes root, one, two, bottom; the leaf is addressed as earthscope://root/one/two/bottom]

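A URI of the form earthscope://root/one/two/bottom can be read as a path from the root of the XML tree down to a node. The sketch below is one assumed way to resolve such a URI. The `earthscope` scheme is not a registered scheme, and the tree content (`leaf data`) is invented for the example.

```python
import xml.etree.ElementTree as ET

# Toy tree mirroring the node names on the slide; the leaf text is made up.
tree = ET.fromstring("<root><one><two><bottom>leaf data</bottom></two></one></root>")


def resolve(uri, root, scheme="earthscope"):
    """Walk from the root element down the path named in the URI."""
    prefix = scheme + "://"
    if not uri.startswith(prefix):
        raise ValueError("expected a URI starting with " + prefix)
    names = [n for n in uri[len(prefix):].split("/") if n]
    if not names or names[0] != root.tag:
        return None
    node = root
    for name in names[1:]:
        node = node.find(name)  # direct child with this tag, or None
        if node is None:
            return None
    return node


leaf = resolve("earthscope://root/one/two/bottom", tree)
internal = resolve("earthscope://root/one/two", tree)
```

The same call addresses both kinds of node: `leaf` is an external node holding data, while `internal` is an internal node whose children can be listed.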
Matching Information/Service Providers and Consumers I
• Classic Centralized Approach
• Those with services publish information as to location – this is percolated up and down the tree of brokers
• At simplest, publish location; better publish location and meta-data allowing easier discovery of value
• Those wanting a service look it up using either
• Some search of information registered with brokers
• A search using a system like Google
• Because they were told some key
• Like using an encyclopedia; very reliable and fast for well established information

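The centralized approach amounts to a registry: providers publish a location plus optional meta-data, and consumers look services up by the meta-data they care about. A minimal single-broker sketch follows. The `Broker` class, the meta-data keys and the URLs are all hypothetical, and a real system would have a tree of brokers rather than one.

```python
class Broker:
    """A single registry; real deployments would use a tree of brokers."""

    def __init__(self):
        self._entries = []

    def publish(self, location, **metadata):
        """Register a service location together with descriptive meta-data."""
        self._entries.append({"location": location, **metadata})

    def lookup(self, **wanted):
        """Return locations whose meta-data matches every requested field."""
        return [
            entry["location"]
            for entry in self._entries
            if all(entry.get(key) == value for key, value in wanted.items())
        ]


broker = Broker()
broker.publish("http://example.org/fem", kind="solver", method="FEM")
broker.publish("http://example.org/viz", kind="visualization")
solvers = broker.lookup(kind="solver")
```

Publishing only a location still works (it matches an empty query), but richer meta-data is what lets `lookup` narrow the answer, as the second bullet notes.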
Unstructured and Structured XML
[Figure: photograph of Hoosier National Forest showing structured trees and a gallimaufry of unstructured leaves (fall 2001), beside a tree with nodes root, one, two, mess, addressed as earthscope://root/one/two/mess. “mess” can be multiple levels of tree]

Integrate P2P and Grid/WS
[Figure: two Peer to Peer Grids (JXTA), each linked through Web Service interfaces and event/message brokers to databases]

Matching Information/Service Providers and Consumers II
• Peer-to-peer Approach (or how to search the “mess”)
• Those with services publish XML advertisements to their friends; their friends may forward them to other friends
• Those wanting a service publish an XML request to a chosen set of friends
• Friends use their personal idiosyncratic approach to matching requests with advertisements and to choosing who else should be asked
• Analogous to the way communities exchange information, as in a meeting like this
• Uncertain reliability but scales well (communities intra-exchange information independently) and supports rapidly varying information (Web Services)
• Allows many different approaches – EarthScope imposes interfaces NOT analysis methods

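The friend-to-friend search can be sketched as a request forwarded through a small network with a hop limit. This is an illustrative toy and not the JXTA protocol: the `Peer` class, the exact-match rule and the service name are assumptions, and requests are plain strings where the slide has XML.

```python
class Peer:
    def __init__(self, name, services=()):
        self.name = name
        self.services = set(services)
        self.friends = []

    def matches(self, request):
        # Each peer may use its own idiosyncratic rule; this default is exact match.
        return request in self.services

    def ask(self, request, ttl=2, seen=None):
        """Answer if possible, then forward to friends while hops remain."""
        seen = set() if seen is None else seen
        if self.name in seen:
            return []  # already asked along this search
        seen.add(self.name)
        found = [self.name] if self.matches(request) else []
        if ttl > 0:
            for friend in self.friends:
                found += friend.ask(request, ttl - 1, seen)
        return found


# A chain of friends: a knows b, b knows a and c; only c offers the service.
a, b, c = Peer("a"), Peer("b"), Peer("c", services={"fault-simulator"})
a.friends = [b]
b.friends = [a, c]
two_hops = a.ask("fault-simulator", ttl=2)
one_hop = a.ask("fault-simulator", ttl=1)
```

With two hops the request reaches `c`; with one hop it stops at `b` and finds nothing, which illustrates the “uncertain reliability” noted above. Subclasses can override `matches` to model each friend's own matching approach.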
Grid/P2P Use of Internet I
[Figure labels: Cohen’s; Rival Estimate, Mainly Digital Video]
Source: Robert B. Cohen, Ph.D., Cohen Communications Group, bcohen@bway.net, 212-986-7720. Global Grid Forum, Toronto, Feb 18 2002

Grid/P2P Use of Internet II
[Figure labels: S2S (Server to Server); Digital Video “on demand”]

Semantic Grid & Digital Brilliance I
• The (XML) advertisement-request matching provides a publish-subscribe linkage between resources – these are people, computers and raw/processed data
• The richer the meta-data, the more precise the linkage
• This is the spirit of the Semantic Web – RDF/DAML/OIL metadata enables meaningful linkage
• In a physics analogy, resources can be thought of as spins and the meta-data induced linkage as forces or interactions
• Phase transitions will occur when “enough” resources are linked – one will get associated spins to align in the direction of new knowledge
• Term this digital brilliance

Semantic Grid & Digital Brilliance II
• This suggests ways of quantifying the value of metadata induced linkages and ways of identifying where one “should” add more resource specifications
• Note that related resources are not necessarily directly connected but rather messages are forwarded through friends
• Study of Peer to Peer networks teaches us that we can build “small worlds” where the distance between resources is logarithmic in the number of nodes
• This physics based picture provides an interesting underlying formalism to give a theory of e-Science ….
• All you need to do is to build a lot of XML Meta-data specification wizards

Semantic Grid & Digital Brilliance III
• EarthScope Collaboratory consists of a set of connected “spins” (as I am a physicist; “resources” if I were W3C)
• Resources are anything with a digital signature
• Raw data, Analysers, Simulators, Simulations, Processed Information, Extracted Knowledge, Scientists ….
• The linkage of the Earthquake Fault Simulator Web Service to the Green’s Function Solver Web Service is as program to subroutine; there must be agreement on both syntax and semantics
• The linkage of a Granular Physics model to (my) remark that Los Alamos has interesting new simulation technology is less precise
• So linkages with very precise ontologies and those which are more qualitative are both part of the Semantic Grid

Portals and Web Services
• Web Services allow us to build a component model (see CCA) for resources.
• Each resource naturally has a user interface (which might be customized for the user)
• Web Service <--> Portlet
• Natural to use a component model for portals, building the displayed web page from a collection of portlets
• So can customize each portlet and customize which portlets you want
• Apache Jetspeed seems a good open source technology supporting this model
• The JSP model is better than, say, a client-side Java integration in that it is also message based, so this is “Portal as a Web Service”

Jetspeed Computing Portal: Choose Portlets
[Figure: screenshot with 4 available portlets linking to Web Services; I choose two]

Choose Portlet Layout
[Figure: screenshot with callouts: Original 2-column Layout; Choose 1-column Layout]

Two Computing Portlets

EarthScope CSIT Strategy
• Make a list of resources with a hierarchical arrangement
• People, Places, Results (Publications, meeting archives, Simulation Output), Activities, Sensors (Instruments), Data (raw and processed), Earth features, Computers, Software
• Decide on component (Web Service) model and URI labeling (earthscope://devices/satellites/year/label …)
• Respect performance requirements
• Design so modules can be re-used, re-arranged and replaced for outreach (education)
• Study related CSIT architectures of other fields
• Grid Forum, PACI, ASCI for computing issues
• W3C Web Consortium for basic IT infrastructure
• openGIS XMML for related fields
• IMS for Education

EarthScope HPCC Strategy
• Decide what services are well enough understood and useful enough to be encapsulated as application Web Services
• Parallel FEM Solvers
• Visualization
• Parallel Particle Dynamics
• Access to Sensor Data
• Image Processing
• Make services as small as possible – smaller is simpler and more sustainable but with higher communication needs
• Compose large services from smaller ones
• Design Portals and portal components that allow one to manipulate services – set parameters, compose, invoke
• Install chosen System Web Services (job submit, performance, queue) on central machines and local clusters
• Make certain infrastructure supports compute, data, middleware needs
• Set necessary hardware/software meta-data

EarthScope IT Strategy
• Design an internal EIF (EarthScope Internal Framework) defining architecture and interface standards of internal Web Services and data structures
• Design EEF (EarthScope External Framework) which maps external raw data into sensor web services
• Support a diverse set of explorations, as many new approaches to Earth Science are enabled by EarthScope
• Choose some appropriate (mix of) middleware frameworks
• .net, IBM, BEA, Sun, Oracle
• Look at special requirements for key system services
• Hardware/Data systems (new and legacy issues)
• Security
• Collaboration including Audio/Video conferencing
• Peer-to-peer networking
• Develop necessary meta-data wizards