Grids and the Harmony and Prosperity of Civilizations. “Beijing Forum” (2004) The Harmony and Prosperity of Civilizations http://www.beijingforum.org/english/index.htm Geoffrey Fox Professor of Computer Science, Informatics, Physics Pervasive Technology Laboratories
Grids and the Harmony and Prosperity of Civilizations “Beijing Forum” (2004) The Harmony and Prosperity of Civilizations http://www.beijingforum.org/english/index.htm Geoffrey Fox, Professor of Computer Science, Informatics, Physics, Pervasive Technology Laboratories, Indiana University, Bloomington IN 47401 gcf@indiana.edu http://www.infomall.org
CPU and Network Infrastructure • Moore’s law predicts that electronic components will improve in performance by a factor of 100 or so every ten years (double every 18 months) • Networks are increasing in performance every year much faster than this as more and better technology is deployed (Gilder’s law) • Last-mile versus backbone performance • Latency versus bandwidth • Cable, DSL, Satellite, Optical fiber, wireless are competing to provide high speed connectivity to the citizens of the world • By 2006, GTRN (Global Terabit Research Network) aims at a 1000:1000:100:10:1 gigabit performance ratio representing international backbone: national: organization: optical desktop: Copper desktop links.
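The two figures quoted for Moore's law (a factor of about 100 per decade, doubling every 18 months) can be checked against each other with a line of arithmetic. The sketch below is only an illustration; the function name `growth_factor` is hypothetical and not from the slides.

```python
def growth_factor(years: float, doubling_months: float = 18.0) -> float:
    """Performance multiplier after `years` if performance doubles
    every `doubling_months` months."""
    doublings = years * 12.0 / doubling_months
    return 2.0 ** doublings


if __name__ == "__main__":
    # Ten years at one doubling per 18 months is 120/18 (about 6.67) doublings,
    # which comes out close to the "factor of 100 or so" on the slide.
    print(f"10-year factor: {growth_factor(10):.1f}")
```

The same function can be called with a shorter doubling time to see how quickly a faster-improving technology, such as the networks described under Gilder's law, would pull ahead.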
Global Enterprises • As communication improves, activities are spread more and more across the globe. • Faster physical transportation (cars, trains, aircraft) enabled • Increasing international tourism • Separation of manufacturing, design and sales of vehicles, consumer electronics, clothes • Universal networking is allowing instant global information • The latest event at the Olympic Games or • The latest terrorist event • e-Infrastructure is allowing more and more sophisticated activities to become distributed • Scientific research, Business and for this meeting Civilization
e-Infrastructure • e-Infrastructure builds on the inevitably increasing performance of networks and computers, linking them together to support new flexible linkages between computers, data systems and people • Grids and peer-to-peer networks are the technologies that build e-Infrastructure • e-Infrastructure is called CyberInfrastructure in the USA • We imagine billions of conventional local or global connections • Phones, web page accesses, plane trips, hallway conversations • On this we superimpose high value multi-way linkages • Such as the collection of people at this meeting • If N items are joined to M others, added value goes like N × M for small M but in broadcast limit M ≈ N, the value decreases to a constant × N. (A Complex System theorem) • Conventional Internet technology manages billions of low (2-way) or broadcast links • Grids superimpose multiple M-way overlaid organizations with optimized resources and system support
On Complex Systems Language • Web and Grid resources (people, pages, databases, computers) are “just spins” • Local Interactions are terms in an energy function • E = sum( nearest neighbor i,j) weight(i,j).s(i).s(j) • “Internet Communication” corresponds to a long range force with • E= sum(all spins i) H . s(i) • And behaves like a magnetic field aligning spins in physics (complex systems) analogy • Aligning is harmonizing • Maximizing Prosperity is minimizing “Complex Systems Energy” • Abrupt Social changes are phase transitions • In this language, Grids provide different local energy functions (enhanced interaction) and harmonizing forces through community shared resources
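The two energy terms on this slide can be evaluated directly for a small set of spins. This is a minimal sketch, assuming spins take the values +1 or -1 and that the weights and the field H are negative, so that aligned spins lower the energy as the "aligning is harmonizing" bullet suggests. The slide does not state a sign convention, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple


def local_energy(spins: List[int], weights: Dict[Tuple[int, int], float]) -> float:
    """E = sum over neighbor pairs (i, j) of weight(i, j) * s(i) * s(j)."""
    return sum(w * spins[i] * spins[j] for (i, j), w in weights.items())


def field_energy(spins: List[int], H: float) -> float:
    """E = sum over all spins i of H * s(i): the long-range 'Internet' term."""
    return sum(H * s for s in spins)


if __name__ == "__main__":
    # Four resources on a ring; each nearest-neighbor pair has weight -1
    # (assumption: a negative weight rewards agreement).
    ring = {(0, 1): -1.0, (1, 2): -1.0, (2, 3): -1.0, (3, 0): -1.0}
    aligned = [1, 1, 1, 1]
    mixed = [1, -1, 1, -1]
    H = -0.5  # assumption: a negative field favors s = +1

    for name, spins in (("aligned", aligned), ("mixed", mixed)):
        total = local_energy(spins, ring) + field_energy(spins, H)
        print(name, total)
```

Under these assumptions the aligned configuration has the lower total energy, which is the sense in which a shared resource acting like the field H "harmonizes" the community.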
4×N Interactions • In days gone by people communicated with their local community • Nearest neighbour communications in a physics analogy with communication = force
N plus N Interactions • Television and the Web allow individuals to communicate instantly with each other via Web Pages and Headline News acting as proxies • N resources deposit information and N can view it; call this N plus N
M2 Interactions • Superimpose M-way “Grids” on the sea (heatbath) of “2 by N” or N plus N ordinary interactions • Implement Grids as a software overlay network
4 Overlay Networks With a 5th superimposed [Figure labels: R1, R2, Enterprise Grid, Dynamic light-weight Peer-to-peer Collaboration Training Grid, Students, Information Grid, Compute Grid, Campus Grid, Teacher]
Large and Small Grids • N resources in a community (N is billions for the world and 1000-10000 for many scientific fields) • Communities are arranged hierarchically with real work being done in “groups” of M resources – M could be 10-100 in e-Science • Metcalfe’s law: value of a network grows like the square of the number of nodes M – we call Grids where this is true Metcalfe or M2 (M squared) Grids • Nature of Interaction depends on size of M or N • N plus N Shared Information Grids for large N • M2 Metcalfe Grids for smaller M • Technology support depends on M – might use a relatively static DHT (Distributed Hash Table) for large M and a distributed shared memory for small M • Grids must merge with peer-to-peer networks to support both N plus N and M2 Grids
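A rough count shows why the slide reserves M2 (Metcalfe) Grids for small groups. The sketch below assumes one common reading of Metcalfe's law, counting distinct pairs of members, and treats "N plus N" as N deposits plus N views. Both counting rules and all function names are illustrative assumptions, not definitions from the talk.

```python
def n_plus_n_links(n: int) -> int:
    """Shared-information model: N resources deposit and N view."""
    return n + n


def metcalfe_links(m: int) -> int:
    """Pairwise links among m members; grows like m squared."""
    return m * (m - 1) // 2


def grouped_links(n: int, m: int) -> int:
    """Community of n split into n // m groups, each fully linked internally."""
    return (n // m) * metcalfe_links(m)


if __name__ == "__main__":
    n, m = 10_000, 10  # a scientific field with working groups of 10
    print("N plus N:            ", n_plus_n_links(n))
    print("M2 in groups of M:   ", grouped_links(n, m))
    print("M2 over the whole N: ", metcalfe_links(n))
```

With these numbers, full pairwise linkage of the whole community needs roughly a thousand times more links than linkage within groups of ten, which is consistent with pairing N plus N support for large N with M2 support for small M.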
Architecture of (Web Service) Grids • We view the “ordinary” Internet as providing support for the huge number of low-complexity interactions which are the dominant traffic • We superimpose multiple Grids on top of these; each Grid supports a high value, high complexity interaction • Grids are built from Web Services communicating through an overlay network • Grids provide the special quality of service (security, performance, fault-tolerance) and customized services needed for “distributed complex enterprises” • We need to work with the Web Service community as they debate the 60 or so proposed Web Service specifications • Use Web Service Interoperability WS-I as “best practice” • Must add further specifications to support high performance • Database “Grid Services” for N plus N case • Streaming support for M2 case
Layered Architecture for Web Services and Grids [Figure: layers listed top to bottom] • Application Specific Grids • Generally Useful Services and Grids • Workflow: WSFL/BPEL • Service Management (“Context etc.”) • Service Discovery (UDDI) / Information • Service Internet Transport Protocol • Service Interfaces: WSDL • Base Hosting Environment • Protocol: HTTP, FTP, DNS … • Presentation: XDR … • Session: SSH … • Transport: TCP, UDP … • Network: IP … • Data Link / Physical • Grouping labels in the figure: Higher Level Services, Service Context, Service Internet, Bit level Internet (OSI Stack)
Working up from the Bottom • We have the classic (CISCO, Juniper ….) Internet routing the flood of ordinary packets in OSI stack architecture • Web Services build the “Service Internet” or IOI (Internet on Internet) with • Routing via WS-Addressing not IP header • Fault Tolerance (WS-RM not TCP) • Security (WS-Security/SecureConversation not IPSec/SSL) • Information Services (UDDI/WS-Context not DNS/Configuration files) • At message/web service level and not packet/IP address level • Software-based Service Internet possible as computers “fast” • Familiar from Peer-to-peer networks and built as a software overlay network defining Grid (analogy is VPN) • SOAP Header contains all information needed for the “Service Internet” (Grid Operating System) with SOAP Body containing information for Grid application service
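The split between SOAP Header (information for the "Service Internet") and SOAP Body (information for the application service) can be sketched with the Python standard library. This is an illustration only: it assumes the SOAP 1.2 envelope namespace and the WS-Addressing 1.0 namespace, which may differ from the specification versions in use at the time of the talk, and the `AppData` element, the `urn:example:` addresses, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces: SOAP 1.2 envelope and WS-Addressing 1.0.
SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"
WSA_NS = "http://www.w3.org/2005/08/addressing"


def build_message(to: str, action: str, payload: str) -> ET.Element:
    """Build an envelope whose header carries routing information and
    whose body carries application data."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{WSA_NS}}}To").text = to
    ET.SubElement(header, f"{{{WSA_NS}}}Action").text = action
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, "AppData").text = payload  # hypothetical payload element
    return envelope


def route(envelope: ET.Element) -> str:
    """An overlay node decides where to send a message from the header
    alone; it never needs to look inside the body."""
    return envelope.find(f"{{{SOAP_NS}}}Header/{{{WSA_NS}}}To").text


if __name__ == "__main__":
    msg = build_message("urn:example:grid:service-a", "urn:example:annotate", "hello")
    print(route(msg))
    print(ET.tostring(msg, encoding="unicode"))
```

Keeping routing in the header is what lets the overlay behave like a network in software: intermediaries act on messages and service addresses rather than on packets and IP addresses.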
Service Context • On top of “Service Internet”, one supports dynamic context or the “shared memory” supporting groups (M from 2 to more) of services that are inevitable for Grids • Context information defines “state” (a token linking messages and services together), policy/implementation for security, fault tolerance, lifetime etc. • Includes generalization of “environment” and “configuration” variables • This context can be implemented as a Service itself – using SOAP message interactions with a database • This is a lightweight, highly dynamic database • Interesting debate between shared memory (a single service) or distributed memory (collection of messages with context in header) architectures • Familiar from parallel computing with “distributed shared memory” a natural solution • Note this can only be done dynamically if Grids are small – the full Internet case needs larger but less dynamic context support
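The "shared memory" variant of service context, a single lightweight service holding state keyed by a context token with a lifetime, can be sketched as an in-memory store. This is a minimal illustration under stated assumptions: the class `ContextStore`, its methods, and the expiry rule are hypothetical, and a real deployment would expose them through SOAP messages rather than direct calls.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ContextStore:
    """Hypothetical shared context service: values are grouped under a
    context token and expire after a stated lifetime."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[Tuple[str, str], Tuple[Any, float]] = {}

    def put(self, context_id: str, key: str, value: Any, lifetime_s: float) -> None:
        """Store a value under (context token, key) until its lifetime ends."""
        self._data[(context_id, key)] = (value, self._clock() + lifetime_s)

    def get(self, context_id: str, key: str) -> Optional[Any]:
        """Return the value, or None if it is missing or has expired."""
        entry = self._data.get((context_id, key))
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[(context_id, key)]
            return None
        return value


if __name__ == "__main__":
    store = ContextStore()
    # Two services in the same session share the token "session-42".
    store.put("session-42", "security-policy", "signed-messages", lifetime_s=60.0)
    print(store.get("session-42", "security-policy"))
    print(store.get("session-99", "security-policy"))  # unknown token -> None
```

The distributed-memory alternative mentioned on the slide would instead copy these key/value pairs into each message header, trading a central lookup for larger messages.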
Alternative definitions of a Grid • Supporting human decision making with a network of at least four large computers, perhaps six or eight small computers, and a great assortment of disc files and magnetic tape units - not to mention remote consoles and teletype stations - all churning away. (Licklider 1960) • Coordinated resource sharing and problem solving in dynamic multi-institutional virtual organizations • Infrastructure that will provide us with the ability to dynamically link together resources as an ensemble to support the execution of large-scale, resource-intensive, and distributed applications. • Realizing thirty year dream of science fiction writers that have spun yarns featuring worldwide networks of interconnected computers that behave as a single entity.
e-Business e-Science and the Grid • e-Business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world. • The growing use of outsourcing is one example • e-Science is the similar vision for scientific research with international participation in large accelerators, satellites or distributed gene analyses. • The Grid integrates the best of the Web, traditional enterprise software, high performance computing and Peer-to-peer systems to provide the information technology infrastructure for e-moreorlessanything. • A deluge of data of unprecedented and inevitable size must be managed and understood. • People, computers, data and instruments must be linked. • On demand assignment of experts, computers, networks and storage resources must be supported
e-Defense and e-Crisis • Grids support Command and Control and provide Global Situational Awareness • Link commanders and frontline troops to themselves and to archival and real-time data; link to what-if simulations • Dynamic heterogeneous wired and wireless networks • Security and fault tolerance essential • System of Systems; Grid of Grids • The command and information infrastructure of each ship is a Grid; each fleet is linked together by a Grid; the President is informed by and informs the national defense Grid • Grids must be heterogeneous and federated • Crisis Management and Response enabled by a Grid linking sensors, disaster managers, and first responders with decision support
e-Business and (Virtual) Organizations • Enterprise Grid supports information system for an organization; includes “university computer center”, “(digital) library”, sales, marketing, manufacturing … • Outsourcing Grid links different parts of an enterprise together (Gridsourcing) • Manufacturing plants with designers • Animators with electronic game or film designers and producers • Coaches with aspiring players (e-NCAA or e-NFL etc.) • Customer Grid links businesses and their customers as in many web sites such as amazon.com • e-Multimedia can use secure peer-to-peer Grids to link creators, distributors and consumers of digital music, games and films respecting rights • Distance education Grid links teacher at one place, students all over the place, mentors and graders; shared curriculum, homework, live classes …
Information/Knowledge Grids • Distributed data sources, from 10’s to 1000’s (instruments, file systems, curated databases …) • Data Deluge: 1 petabyte/year (now) to 100’s of petabytes/year (2012) • Moore’s law for Sensors • Possible filters assigned dynamically (on-demand) • Run image processing algorithm on telescope image • Run gene sequencing algorithm on compiled data • Good example of N plus N Grid • Metadata (provenance) critical to annotate data • Integrate across experiments as in multi-wavelength astronomy • Data Deluge comes from pixels/year available
Virtual Observatory Astronomy: N plus N Grid that Integrates Experiments [Figure panels: Radio, Far-Infrared, Visible, Dust Map, Visible + X-ray, Galaxy Density Map]
CERN LHC Data Analysis Grid • Typical experiment at LHC has 2000 physicists • Analyzing data from LHC is an “N plus N Grid” with huge scale • 30,000 CPUs simultaneously processing LHC data • In a few years, over 100 Petabytes of data • Physics discovery is an M2 Grid with perhaps M=10 • Lots of such groups working simultaneously • Note hierarchical structure • M=10 in Physics analysis • M=2,000 in one LHC Experiment • M=10,000 physicists in particle physics • M=100,000 total physicists • M=? Scientists • M=Billions People
Rolls Royce and UK e-Science Program: Distributed Aircraft Maintenance Environment (DAME) • In-flight data from ~5000 engines: ~ Gigabyte per aircraft per engine per transatlantic flight • Several small M2 Grids – one for each aircraft – back-ended by an N plus N Grid of reference data for all engines [Figure labels: In flight data; Global Network such as SITA; Ground Station; Airline; Engine Health (Data) Center; Maintenance Centre; Internet, e-mail, pager; DAME]
Information Complexity I • Consider a community of N resources with groups of size M with each group complexity C • N/M Groups • Information in systems varies from coherent (harmonious) to incoherent limits • Web and Grid data resources supply coherence as in curated astronomy or bioinformatics database • Can consider N plus N Grids as Coherent or Harmonious Grids • $I = (NM)^{0.5} \cdot (C/M)$ (incoherent) to $I = N \cdot (C/M)$ (coherent) • In this language Grids do one or both of • Coherence/Harmony – common shared asynchronous resources • Interactivity – Increase complexity to $M^2$ with real-time linkage of interacting resources
Information Complexity II • N plus N Community database has $I = N$ (coherent) • Improving on the $N^{0.5}$ incoherent case • Nearest Neighbor groups have $I = (NM)^{0.5}$ • Becoming $I = N$ in limit $M = N$ • M is correlation length in Complex Systems approach • M-ary Interactive group (M2 Metcalfe Grids) has $C = M^2$ and $I = (NM^3)^{0.5}$ (incoherent) to $I = NM$ (coherent) • Coherent case most natural in science due to synergy between Metcalfe and Coherence Grids • “Small World (logarithmic) networks” and hierarchical group structure require more discussion
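The special cases on the two Information Complexity slides all follow from the same pair of limits, I = sqrt(N·M)·(C/M) in the incoherent case and I = N·(C/M) in the coherent case. The sketch below evaluates them numerically. The function names are hypothetical, and the group complexities C = 1, C = M and C = M² are the values implied by the quoted results rather than stated definitions.

```python
import math


def incoherent_information(n: float, m: float, c: float) -> float:
    """Incoherent limit: I = sqrt(N * M) * (C / M)."""
    return math.sqrt(n * m) * (c / m)


def coherent_information(n: float, m: float, c: float) -> float:
    """Coherent limit: I = N * (C / M)."""
    return n * (c / m)


if __name__ == "__main__":
    n, m = 10_000, 100

    # N plus N community database (M = 1, C = 1): sqrt(N) versus N.
    print(incoherent_information(n, 1, 1), coherent_information(n, 1, 1))

    # Nearest-neighbor groups (C = M): I = sqrt(N * M).
    print(incoherent_information(n, m, m))

    # M-ary interactive (Metcalfe) groups (C = M^2):
    # sqrt(N * M^3) incoherent versus N * M coherent.
    print(incoherent_information(n, m, m**2), coherent_information(n, m, m**2))
```

Setting M = N in the nearest-neighbor case reproduces the I = N limit quoted on the slide, which is a quick consistency check on the formulas.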
Grids and e-globalcommunity • Peer-to-peer networks already are a good example of value of Information Technology supporting broad global communities • File sharing, text chats, bulletin boards • Grids must include these capabilities and extend in terms of increased functionality and quality of service • This will support business and cultural interactions between nations • Several interesting applications can be supported by • Replacing files by multi-media streams so can collaborate in real-time • Adding traditional tools like audio-video conferencing and shared applications to P2P set • This integration of P2P and Grid to give M2 Grids impacts e-Business as well as e-globalcommunity
Outsourcing or Not? • In the USA, over last 30 years people worried about loss of manufacturing jobs from the first wave of enterprise distribution created by “physical communication” • Now they worry about the next wave of outsourcing seen in areas like software, and movie/game animation created by e-Infrastructure – electronic communication • Probably this globalization of enterprises will increase not decrease as it allows one to tap the cheapest and best expertise for a particular task • Further the core software and electronic infrastructure will continue dramatic improvements • Assuming global enterprises are inevitable each community should identify its expertise and enhance its ability to work in a distributed fashion • Suggests increasing specialization within communities
Streaming M2 Grids • e-Textile manufacturing involves clothes designers in the USA and manufacturers in Hong Kong exchanging designs, which are streams of images • e-Sports is a possible collaboration between Indiana University and Beijing Sport University • Basketball coaches (teachers) interact with aspiring NBA players in China • Martial Arts masters in China train neophytes in Indiana • Faculty recreational sports adviser works from university with faculty exercising at home • Hope to have this working incredibly well by the 2008 Olympics • Interactive TV Grid: allows anybody to discuss professional or home video (of sports or other events) within a custom Grid • Multi-player distributed games, which should be supported with exactly the same overlay Grid • Video Game Production Grid links artistic direction (design) in one country with digital animation (manufacturing) in another • e-Science: Physics and Environmental Science Sensors • Surveillance Grid enables security personnel to annotate and discuss suspicious remote camera images/streams
Some Technology for Streaming M2 Grids • Basic capability is collaborative annotatable multimedia tool for images, sensors and real-time video streams • Allow Grid participants to view real-time streams, rewind on the fly and add text and graphical comments • Similar to instant replay on TV but far more flexible • Need rich metadata system to label and correlate streams, images and annotations • Extend Grid and P2P file access paradigms to stream storage, browsing and access • Core Technologies shared with distance education • Using http://www.globalmmcs.org for multimedia services and http://www.naradabrokering.org for overlay network
P2P and Server based solutions [Figure labels: Grid Farm in the Sky (clouds), Grid Servers, P2P] • Peer-to-peer architectures have the advantage that they can be deployed just using client resources and no system commitment is needed • Typically clients do not have good network QoS and it is hard, for example, to support rich multi-point audio video conferencing in this way • M2 Grids typically require multicast so average load in P2P case on client legs goes like O(M) • Server-side multicast puts O(M) load on backbone and O(1) load on clients and can lead to much better scaling and performance • N plus N Grids may not see such large improvements with server side support • So Grids should support initial P2P deployment with a seamless upgrade to add better QoS using Servers. • Extend familiar P2P paradigms like BitTorrent to Grids and Streaming • Grid and peer-to-peer linkage combines scalable performance with ease of deployment
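The O(M) versus O(1) client load quoted above can be reproduced with a simple counting model. The sketch assumes a single stream from one sender to the other M − 1 members of a group and counts copies of that stream per link. It ignores bandwidth, latency and tree-based P2P multicast, and the function names are hypothetical.

```python
from typing import Dict


def p2p_multicast_load(m: int) -> Dict[str, int]:
    """Sender delivers its stream directly to each of the other m - 1 peers."""
    return {"sender_uplink": m - 1, "server_fanout": 0}


def server_multicast_load(m: int) -> Dict[str, int]:
    """Sender uploads once; a server on the backbone fans the stream out."""
    return {"sender_uplink": 1, "server_fanout": m - 1}


if __name__ == "__main__":
    for m in (2, 10, 100):
        print(m, p2p_multicast_load(m), server_multicast_load(m))
```

In this model the sender's own link carries M − 1 copies in the P2P case but only one copy when a server does the fan-out, which matches the slide's argument for starting with P2P deployment and adding servers where client links become the bottleneck.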