Introduction to DISTRIBUTED COMPUTING

Tran, Van Hoai
Department of Systems & Networking
Faculty of Computer Science & Engineering
HCMC University of Technology
2009-2010

Outline
• Why is distributed computing needed?
  • performed by distributed systems
• Examples
• Definitions
• Goals in building distributed systems

Why are distributed systems needed? (1)
• Functional distribution: computers have different functional capabilities
  • client/server
  • host/terminal
  • data gathering/data processing
  • sharing of resources with specific functionalities
• Inherent distribution: stemming from the application domain, e.g.,
  • cash register and inventory systems for supermarket chains
  • computer-supported collaborative work

Why are distributed systems needed? (2)
• Load distribution/balancing: assign tasks to computers such that overall performance is optimized
• Replication of processing power: independent computers working on the same task
  • a collection of microcomputers may have processing power that no supercomputer will ever achieve
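A minimal sketch of load distribution, assuming a simple greedy policy that the slides do not prescribe: each task goes to the machine with the smallest load so far. The names `assign_tasks` and `makespan`, and the numeric task costs, are hypothetical and chosen for illustration only.

```python
import heapq


def assign_tasks(task_costs, num_machines):
    """Greedy load balancing: give each task to the least-loaded machine."""
    # Heap of (current_load, machine_id); the least-loaded machine is on top.
    heap = [(0, m) for m in range(num_machines)]
    heapq.heapify(heap)
    assignment = {m: [] for m in range(num_machines)}
    # Placing the largest tasks first usually gives a more even spread.
    for cost in sorted(task_costs, reverse=True):
        load, machine = heapq.heappop(heap)
        assignment[machine].append(cost)
        heapq.heappush(heap, (load + cost, machine))
    return assignment


def makespan(assignment):
    """Finishing time of the busiest machine (lower is better)."""
    return max(sum(tasks) for tasks in assignment.values())


if __name__ == "__main__":
    plan = assign_tasks([7, 5, 4, 3, 2, 1], num_machines=3)
    print(plan, makespan(plan))
```

The greedy rule is only a heuristic; a real system would also have to account for communication cost and for machines with different speeds.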

Why are distributed systems needed? (3)
• Physical separation: relying on the fact that computers are physically separated (e.g., to satisfy reliability requirements)
• Economics: collections of microprocessors offer a better price/performance ratio than large mainframes
  • mainframes: 10 times faster, 1000 times as expensive
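The economics argument can be checked with a few lines of arithmetic. The sketch below takes the slide's figures at face value (a mainframe 10 times faster and 1000 times as expensive) and normalizes a single microprocessor to speed 1 and price 1; that normalization and the function name `price_per_speed` are assumptions made for illustration.

```python
def price_per_speed(speed, price):
    """Cost of one unit of computing speed; lower is better."""
    return price / speed


# Normalized, assumed units: one microprocessor has speed 1 and price 1.
micro = price_per_speed(speed=1, price=1)
# Figures from the slide: 10 times faster, 1000 times as expensive.
mainframe = price_per_speed(speed=10, price=1000)

# How many times more the mainframe costs per unit of speed.
ratio = mainframe / micro

if __name__ == "__main__":
    print(ratio)
```

Under these figures the mainframe pays 100 times more per unit of speed. The comparison ignores coordination overhead, so it is an upper bound on the advantage of the microprocessor collection.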

Examples (1)
• Network of workstations
  • all files are accessible from all machines in the same way and using the same path name
  • the system looks for the best place to execute a command
  • distributed system
• Workflow information system: automatic order processing
  • people from several departments at different locations
  • users are unaware of how an order is processed
  • distributed system

Examples (2)
• World Wide Web: offering a uniform model of distributed documents
  • in theory, there is no need to know where a document is fetched from
  • in practice, users need to be aware of the location

Examples (3)
• Internet
  • interconnected collection of computer networks of many different types
  • computers interact by passing messages using a common means of communication
[Figure: a portion of the Internet with intranets, ISPs, a backbone and a satellite link; legend: desktop computer, server, network link]

Examples (4)
• Intranet
  • resources are shared among different computers

Definitions (1)
• “A system in which hardware or software located at networked computers communicate and coordinate their actions only by message passing”. [Coulouris]
• “A system that consists of a collection of two or more independent computers which coordinate their processing through exchange of synchronous or asynchronous message passing”.
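Coordination "only by message passing" can be illustrated on a single machine. The sketch below is a simulation under stated assumptions: two threads stand in for two computers, `queue.Queue` objects stand in for network links, and the squaring service, the `"stop"` message and the names `server` and `run_client` are all hypothetical.

```python
import queue
import threading


def server(inbox, outbox):
    """Hypothetical node: squares every number it receives until told to stop."""
    while True:
        msg = inbox.get()              # block until a message arrives
        if msg == "stop":
            break
        outbox.put(("result", msg * msg))


def run_client(numbers):
    """Send each number to the server and collect the replies in order."""
    to_server, to_client = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=server, args=(to_server, to_client))
    worker.start()
    results = []
    for n in numbers:
        to_server.put(n)                        # send a request message
        _tag, value = to_client.get(timeout=5)  # wait for the reply message
        results.append(value)
    to_server.put("stop")
    worker.join()
    return results


if __name__ == "__main__":
    print(run_client([1, 2, 3]))
```

The client and the server share no variables; everything one side learns about the other arrives as a message. Here the client waits for each reply before sending the next request, which corresponds to the synchronous case in the second definition.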

Definitions (2)
• “A distributed system is a collection of independent computers that appear to the users of the system as a single computer”. [Tanenbaum]
• “A distributed system is a collection of autonomous computers linked by a network with software designed to produce an integrated computing facility”.

Definitions (3)
• There are several autonomous computational entities, each of which has its own local memory. [Andrews et al.]

Computer networks vs. distributed systems
• Computer network: autonomous computers are explicitly visible (have to be explicitly addressed)
• Distributed system: the existence of multiple computers is transparent
• However,
  • many problems in common
  • in some sense networks (or parts of them, e.g. name services) are also distributed systems
  • normally, every distributed system relies on services provided by a computer network

Which examples are distributed systems?
• Network of workstations
  • distributed system
• Workflow information system: automatic order processing
  • distributed system
• World Wide Web
  • not fully qualified as a distributed system (Tanenbaum)
  • distributed system (Coulouris)

Middleware service
• To guarantee
  • supporting heterogeneous computers
  • providing a single view to users
[Figure: layered diagram with distributed applications on top, a middleware service layer spanning Machine A, Machine B and Machine C, and a local OS on each machine]

Goals in building distributed systems (1)
• Connecting users and resources
  • sharing resources
  • easier to collaborate and exchange information
  • disadvantages: security (intrusion), privacy violation (communication tracking)

Goals in building distributed systems (2)
• Transparency
  • trade-off between a high degree of transparency and the performance of the system

Goals in building distributed systems (3)
• Openness
  • offering services according to standard rules that describe the syntax and semantics of those services
    • syntax specification: in an interface definition language
    • semantic specification: in natural language
  • interoperability and portability
  • flexibility: using different components from different developers

Goals in building distributed systems (4)
• Scalability
  • measured in three dimensions
    • size: more users and resources can be added easily
    • geography: users and resources may lie far apart
    • administration: still easy to manage even when spanning many independent administrative organizations
  • some problems must be solved
    • size: centralization
      • centralized service: single server for all users
      • centralized data: single online telephone book
      • centralized algorithm: routing based on complete information

Goals in building distributed systems (5)
• Scalability problems (continued)
  • size: centralization (centralized services, data and algorithms)
  • geography: synchronous and unreliable communication
    • some systems are designed only for LANs (blocking communication depends strongly on a quick response)
  • administration: conflicting policies with respect to resource usage, management and security
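The geographic problem arises when a client blocks on every reply. As a hedged sketch of the alternative, the example below issues several requests before waiting for any of them, using Python's `asyncio`. The function `remote_call`, the request names and the delays are hypothetical; `asyncio.sleep` stands in for wide-area network latency.

```python
import asyncio


async def remote_call(name, delay):
    """Hypothetical remote request; the sleep stands in for network latency."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def fetch_all(requests):
    # Start every request first, then wait for all replies together, so the
    # total waiting time is close to the slowest reply, not the sum of all.
    tasks = [asyncio.create_task(remote_call(n, d)) for n, d in requests]
    return await asyncio.gather(*tasks)


def run(requests):
    """Synchronous entry point for the sketch."""
    return asyncio.run(fetch_all(requests))


if __name__ == "__main__":
    print(run([("inventory", 0.02), ("orders", 0.01)]))
```

`asyncio.gather` returns the results in the order the requests were given, regardless of which reply arrives first.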

Scaling techniques
• Asynchronous communication
• Distribution
• Replication, caching
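Caching, the last technique listed, can be sketched as a client that keeps local copies of remote answers. This is an illustration under assumptions, not a design from the slides: the class `CachingClient`, its `fetch_remote` callback and the `remote_calls` counter are hypothetical names.

```python
class CachingClient:
    """Read-through cache in front of a (hypothetical) remote lookup."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # function that contacts the server
        self._cache = {}                   # local copies of earlier answers
        self.remote_calls = 0              # how often the server was contacted

    def get(self, key):
        if key not in self._cache:
            # Cache miss: go to the server once and remember the answer.
            self.remote_calls += 1
            self._cache[key] = self._fetch_remote(key)
        return self._cache[key]


if __name__ == "__main__":
    # A stand-in "server": a local function instead of a network request.
    client = CachingClient(lambda name: name.upper())
    print(client.get("alice"), client.get("alice"), client.remote_calls)
```

The sketch never invalidates entries, so a cached copy can become stale when the server's data changes; keeping copies consistent is the price of caching and replication.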

Typical properties
• The system has to tolerate failures in individual computers
• The structure of the system (network topology, network latency, number of computers) is not known in advance
• Each computer has only a limited, incomplete view of the system

Architectures
• Client-server:
  • permanent data on server
• 3-tier architecture:
  • stateless client
• N-tier: web applications
• Tightly-coupled (clustered):
  • NOW (network of workstations), cluster of machines
• Peer-to-peer
• Grid computing (VO level)
• Space-based
  • virtualization as one single address space

[Figure not preserved in the transcript; source: wikipedia.org]

Some numbers (1)
• Computers in the Internet

Date          Computers      Web servers
1979, Dec.          188                0
1989, July      130,000                0
1999, July   56,218,000        5,560,866
2003, Jan.  171,638,297       35,424,956

Some numbers (2)
• Computers vs. Web servers in the Internet

Date          Computers      Web servers    Percentage
1993, July    1,776,000              130         0.008
1995, July    6,642,000           23,500         0.4
1997, July   19,540,000        1,203,096         6
1999, July   56,218,000        6,598,697        12
2001, July  125,888,197       31,299,592        25
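The percentage column appears to be the number of Web servers divided by the number of computers, times 100. The sketch below recomputes it from the table's own figures under that assumption; the names `ROWS` and `web_server_share` are hypothetical.

```python
# (date, computers, web servers), copied from the table above.
ROWS = [
    ("1993, July", 1_776_000, 130),
    ("1995, July", 6_642_000, 23_500),
    ("1997, July", 19_540_000, 1_203_096),
    ("1999, July", 56_218_000, 6_598_697),
    ("2001, July", 125_888_197, 31_299_592),
]


def web_server_share(computers, web_servers):
    """Web servers as a percentage of all computers on the Internet."""
    return 100 * web_servers / computers


if __name__ == "__main__":
    for date, computers, web_servers in ROWS:
        print(f"{date}: {web_server_share(computers, web_servers):.3f}%")
```

The recomputed values agree with the slide after rounding for 1995 through 2001. The 1993 row comes out near 0.007, a little under the 0.008 shown.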

Textbooks & materials
• Andrew S. Tanenbaum, Maarten van Steen, Distributed Systems: Principles and Paradigms, Prentice Hall, Second Edition, 2007
• George Coulouris, Jean Dollimore, Tim Kindberg, Distributed Systems: Concepts and Design, Addison Wesley, Fourth Edition, 2005
• Google
