
Distributed Systems

Learn about distributed systems, their main characteristics, resources, communication methods, design, and implementation. Explore the significance of datacenters, internet usage, and the differences between parallel and distributed computing.


Presentation Transcript


  1. Distributed Systems – Lecture 1: Introduction to distributed systems

  2. Distributed systems • “A collection of (probably heterogeneous) automata whose distribution is transparent to the user so that the system appears as one local machine. This is in contrast to a network, where the user is aware that there are several machines, and their location, storage replication, load balancing and functionality is not transparent. Distributed systems usually use some kind of client-server organization.” – FOLDOC • “A Distributed System comprises several single components on different computers, which normally do not operate using shared memory and as a consequence communicate via the exchange of messages. The various components involved cooperate to achieve a common objective such as the performing of a business process.” – Schill & Springer • Main characteristics • Components: • Multiple spatially separated individual components • Components possess their own memory • Cooperation towards a common objective • Resources: • Access to common resources (e.g., databases, file systems) • Communication: • Communication via messages • Infrastructure: • Heterogeneous hardware infrastructure & software middleware
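The message-passing style in these definitions (components with their own memory, cooperating only by exchanging messages) can be sketched in Python. This is a minimal illustration: the component logic and payloads are invented, and the two components run as threads over a socket pair only for brevity; in a real DS they would be separate processes on separate machines.

```python
import socket
import threading

def component(sock):
    # A component keeps its own local state and cooperates with its peer
    # only by exchanging messages on the socket, not via shared variables.
    data = sock.recv(1024)
    sock.sendall(b"ack:" + data)

# A connected socket pair stands in for the network between two components.
a, b = socket.socketpair()
t = threading.Thread(target=component, args=(b,))
t.start()
a.sendall(b"hello")
reply = a.recv(1024)
t.join()
print(reply)  # b'ack:hello'
```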

  3. Distributed systems • These definitions say nothing about the internals of a distributed system • Design and implementation • Maintenance • Algorithmics (i.e., protocols) Figures: Facebook social network graph among humans; the Internet color-coded by ISPs.

  4. A working definition • “A distributed system (DS) is a collection of entities, each of which is autonomous, programmable, asynchronous and failure-prone, and which communicate through an unreliable communication medium.” • Terms • Entity = process on a device (PC, server, tablet, smartphone) • Communication medium = wired or wireless network • Course objective: • Design and implementation of distributed systems Source: https://courses.engr.illinois.edu/cs425/fa2013/lectures.html
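The "unreliable communication medium" in this working definition is typically tolerated by retransmission. A toy sketch of the idea, with an invented loss rate and function names (a real protocol would wait for an acknowledgement between attempts):

```python
import random

random.seed(42)  # fixed seed so the example is deterministic

def unreliable_send(msg, loss_rate=0.3):
    # Simulated lossy channel: drops the message with probability loss_rate.
    return None if random.random() < loss_rate else msg

def send_with_retry(msg, max_attempts=10):
    # The classic remedy for an unreliable medium: retransmit until delivered.
    for attempt in range(1, max_attempts + 1):
        if unreliable_send(msg) is not None:
            return attempt
    raise TimeoutError("peer unreachable after retries")

attempts = send_with_retry("hello")
```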

  5. The datacenter • The datacenter lies at the foundation of many DSs • Amazon Web Services, Google Cloud, Microsoft Azure • However, DSs can also be composed of PCs • P2P file sharing systems (e.g., Gnutella) Facebook’s Forest City Datacenter.

  6. Example – Gnutella P2P • What are the entities and communication medium?

  7. Example – web domains • What are the entities and communication medium?

  8. The Internet • Used by many distributed systems • Vast collection of heterogeneous computer networks • ISPs – companies that provide services for accessing and using the Internet • Intranets – subnetworks operated by companies and organizations • Offer services that are unavailable to the public Internet • Linked by backbones • High-bandwidth network links • Can be the ISPs’ core routers

  9. Example – Intranet • What are the entities and communication medium?

  10. Parallel vs. distributed computing • Parallelism • Perform multiple tasks at the same time • True parallelism requires distribution over multiple processors/cores/machines • Can range from many-core to multi-processor to many-computer systems, on shared or distributed memory • Concurrency • Computations with multiple threads • Can exploit hardware parallelism, but is inherently driven by software needs (i.e., reacting to different asynchronous events) • Concurrency becomes parallelism when the parallelism is true (one thread per processor/core/machine), not virtual • Distributed computing • Related to where the computation physically resides • A distributed algorithm is executed on multiple CPUs connected by networks, buses, or any other data communication channel • Computers with distributed memories are connected by communication links • Relies fundamentally on message passing • Distribution is usually part of the goal • If resources are geographically spread, then the system is inherently distributed
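The concurrency/parallelism distinction can be made concrete in Python. This sketch (task sizes and names are illustrative) runs coarse-grained tasks on a thread pool, which in CPython is concurrency without true parallelism because the GIL interleaves CPU-bound threads on one core:

```python
import concurrent.futures
import math

def cpu_task(n):
    # A coarse-grained, independent task: a good fit for parallel execution.
    return sum(math.isqrt(i) for i in range(n))

# Concurrency: four tasks in flight at once on a thread pool; the GIL
# interleaves them, so no true parallelism is achieved for CPU-bound work.
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(cpu_task, [50_000] * 4))

# Swapping in ProcessPoolExecutor would give true parallelism: one OS process
# per task, each with its own memory, scheduled on separate cores. Running
# the tasks on separate machines with message passing would make it
# distributed computing.
assert results[0] == results[3]
```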

  11. Parallel vs. distributed computing • Is distributed computing a subset of parallel computing? • Not an easy answer • In favor • Distributed computing is parallel computing on geographically spread machines • distributed ⊂ parallel ⊂ concurrent computing • Against • They address different issues • Distributed computing is focused on issues related to computation and data distribution • Parallel computing does not address problems such as partial failures • Parallel computing focuses on tightly coupled applications

  12. Parallel vs. distributed systems Source: courses.washington.edu/css434/slides/w03w04/Fundamentals.ppt

  13. Reasons for DS • Inherently distributed applications • Distributed DBs, worldwide airline reservation, banking systems • Information sharing among distributed users • CSCW or groupware • Resource sharing • Sharing DBs/expensive hardware and controlling remote lab devices • Better cost-performance ratio / Performance • Emergence of Gbit networks and high-speed/cheap MPUs • Effective for coarse-grained or embarrassingly parallel applications • MapReduce • Reliability • Non-stop operation (availability) and voting features • Scalability • Loosely coupled connections and hot plug-in • Flexibility • Reconfigure the system to meet users’ requirements
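MapReduce, mentioned above as a fit for embarrassingly parallel work, can be illustrated with a single-machine word-count toy. The real framework distributes the map and reduce phases across machines and handles shuffling for you; the function names below are illustrative, not the framework's API:

```python
from collections import defaultdict
from itertools import chain

def map_phase(doc):
    # map: emit a (word, 1) pair for every word in the document
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    # shuffle: group all values by key, as the framework does between phases
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # reduce: aggregate the values for each key
    return {key: sum(values) for key, values in groups.items()}

docs = ["to be or not to be", "to do is to be"]
pairs = chain.from_iterable(map_phase(d) for d in docs)
counts = reduce_phase(shuffle(pairs))
print(counts["to"], counts["be"])  # 4 3
```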

  14. DS layered architecture • Application → application-layer protocol → underlying transport protocol: • e-mail → smtp [RFC 821] → TCP • remote terminal access → telnet [RFC 854] → TCP • Web → http [RFC 2068] → TCP • file transfer → ftp [RFC 959] → TCP • streaming multimedia → proprietary (e.g., RealNetworks) → TCP or UDP • remote file server → NFS → TCP or UDP • Internet telephony → proprietary (e.g., Skype) → typically UDP • Distributed system protocols run on top of networking protocols • Implemented via network “sockets”: the basic primitive that allows machines to send messages to each other • TCP = Transmission Control Protocol, UDP = User Datagram Protocol Source: https://courses.engr.illinois.edu/cs425/fa2013/lectures.html
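The socket primitive named on this slide can be demonstrated with a minimal UDP exchange in Python; the addresses and payloads are illustrative. Unlike TCP, UDP needs no connection setup and gives no delivery guarantee (over the loopback interface the datagrams do arrive in practice):

```python
import socket

# "Server" socket bound to an ephemeral localhost port.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
addr = server.getsockname()

# "Client" sends a single datagram: no handshake, fire and forget.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"ping", addr)

data, client_addr = server.recvfrom(1024)
server.sendto(b"pong", client_addr)
reply, _ = client.recvfrom(1024)
client.close()
server.close()
print(data, reply)  # b'ping' b'pong'
```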

  15. Main issues of DS • No global clock • No single global notion of the correct time (asynchrony) • Unpredictable failures of components • Lack of response may be due to failure of a network component, a network path being down, or a computer crash (failure-prone, unreliable) • Highly variable bandwidth • From 16 Kbps (slow modems or Google Balloon) to Gbps (Internet2) to Tbps (between DCs of the same big company) • Large and variable latency • From a few ms to several seconds • Large numbers of hosts • Up to several million • Security and privacy • Due to geographical and political spread • Interoperability • Due to various standards and protocols

  16. DS design goals • Heterogeneity – can the system handle a large variety of types of hardware and software (interoperability)? • Robustness – is the system resilient to hardware and software crashes and failures, and to the network dropping messages? • Availability – are data & services always available to clients? • Transparency – can the system hide its internal workings from users? • Concurrency – can the server handle multiple clients simultaneously? • Efficiency – is the service fast enough? Does it utilize 100% of all resources? • Scalability – can it handle 100 million nodes without degrading service? (nodes = clients and/or servers) How about 6 billion? More? • Security – can the system withstand hacker attacks? • Privacy – is the user data safely stored? • Openness – is the system extensible?

  17. History of distributed computing

  18. DS system models • Minicomputer model • Workstation model • Workstation-server model • Processor-pool model • Cluster model • Grid computing

  19. Minicomputer Model • Extension of the time-sharing system • Users must first log on to their home minicomputer • Thereafter, they can log on to a remote machine via telnet • Resource sharing • Databases • High-performance devices (Diagram: minicomputers linked via the ARPAnet)

  20. Workstation Model • Process migration • Users first log on to their personal workstation • If there are idle remote workstations, a heavy job may migrate to one of them • Problems: • How to find an idle workstation • How to migrate a job • What if a user logs on to the remote machine? (Diagram: workstations on a 100 Gbps LAN)

  21. Workstation-Server Model • Client workstations • Diskless • Graphic/interactive applications processed locally • All file, print, http, and even cycle-computation requests are sent to servers • Server minicomputers • Each minicomputer is dedicated to one or more types of services • Client-server model of communication • RPC (Remote Procedure Call) • RMI (Remote Method Invocation) • A client process calls a server process’ function • No process migration involved • Example: NFS (Diagram: workstations on a 100 Gbps LAN connected to minicomputer file, http, and cycle servers)
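The RPC style named on this slide can be sketched with Python's standard xmlrpc module: the client calls add() as if it were a local function, while the library marshals the call into a request message and the result back. The add function, the loopback address, and the use of a background thread are illustrative choices, not part of the original slide:

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# The "server process" exposes a procedure over the network.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(lambda a, b: a + b, "add")
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The "client process" calls the remote procedure as if it were local;
# the xmlrpc library handles marshalling and the request/reply messages.
client = ServerProxy(f"http://127.0.0.1:{port}")
result = client.add(2, 3)
server.shutdown()
print(result)  # 5
```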

  22. Processor-Pool Model • Clients • Users log in at one of the terminals (diskless workstations or X terminals) • All services are dispatched to servers • Servers • The necessary number of processors is allocated to each user from the pool • Better utilization but less interactivity (Diagram: terminals and servers 1..N on a 100 Gbps LAN)

  23. Cluster Model • Client • Follows a client-server model • Server • Consists of many PCs/workstations connected to a high-speed network • Puts more focus on performance: • Serves requests in parallel (Diagram: workstations on a 100 Gbps LAN in front of http servers 1..N; a master node and slave nodes 1..N on a 1 Gbps SAN)

  24. Grid Computing • Goal • Collect the computing power of supercomputers and clusters sparsely located across the nation and make it available as if it were the electric grid • Distributed supercomputing • Very large problems needing lots of CPU, memory, etc. • High-throughput computing • Harnessing many idle resources • On-demand computing • Remote resources integrated with local computation • Data-intensive computing • Using distributed data • Collaborative computing • Support communication among multiple parties (Diagram: workstations, minicomputers, clusters, and supercomputers linked by a high-speed information highway)

  25. Cloud Computing • Goal • On-demand virtualized access to hardware infrastructure • “pay per use” model for public clouds • “as a service” paradigm • Several models • Infrastructure as a Service • Clients manage virtualized resources • Amazon EC2, Google Cloud • Platform as a Service • Clients have access to various platform services to develop, run, and manage applications without dealing with the infrastructure • Microsoft Azure • Software as a Service • Clients have access only to specific software tools • GMail, Dropbox • Data as a Service • Clients can access remotely stored data • Amazon Public Data Sets: sciences, economics • … (Diagram: workstations accessing VMs, a database, and specific services over the Internet)

  26. What will you learn? • Real distributed systems • Cloud computing • Lectures 2 and 3 • All labs (Google Cloud) • Hadoop • MapReduce (lecture 2) • Key-value stores • Lab 5 • Apache Storm • Lecture 14 • Lab 7 • P2P systems • Lecture 10 • Classical problems • Failure detection (lecture 4) • Time and synchronization (lecture 5) • Global states (lecture 6) • Multicast (lecture 7) • Leader election (lecture 9) • Networking and routing (lecture 11) • Gossiping (lecture 13) • Concurrency • RPC and Web Services (lecture 8) • Replication control (lecture 12)

  27. Grading • Scientific paper analysis (20%) • Students will have to pick a paper from a top conference (published in the last 3 years) and present it • Lab assignments (80%) • Assignments given during lab hours • Documentation • Lecture slides, references inside the slides, Google Cloud, ScienceDirect, IEEE Xplore, ResearchGate.

  28. Next lecture • Introduction to cloud computing
