Lecture XVIII: Concluding Remarks
CMPT 401, Summer 2007
Dr. Alexandra Fedorova
Outline
• Discuss A Note on Distributed Computing by Jim Waldo et al.
• Jim Waldo:
  • Distinguished Engineer at Sun Microsystems
  • Chief architect of Jini
  • Adjunct professor at Harvard
A Note on Distributed Computing
• Distributed computing is fundamentally different from local computing
• The two paradigms are so different that it would be very inefficient to try to make them look the same
  • You’d end up with distributed applications that aren’t robust to failures
  • Or with local applications that are more complex than they need to be
• Most programming environments for DS attempt to mask the difference between local and remote invocation
• But this is not what’s hard about distributed computing…
Key Argument
• Achieving interface transparency in distributed systems is unreasonable
• Distributed systems have different failure modes than local systems
• Handling those failures properly requires a certain kind of interface
• Therefore, distributed systems must be accessed via different interfaces
• Those interfaces would be overkill for local systems
Differences Between Local and Distributed Applications
• Latency
• Memory access
• Partial failure and concurrency
Latency
• A remote method call takes longer to execute than a local method call
• If you build your application without taking this into account, you are doomed to have performance problems
• Suppose you disregard local/remote differences:
  • You build/test your application using local objects
  • You decide later which objects are local and which are remote
  • You find out that if frequently accessed objects are remote, your performance suffers badly (see the sketch below)
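To make the latency point concrete, here is a minimal Java sketch (the AddressBook interface and client loop are illustrative, not taken from Waldo’s paper). The interface is perfectly reasonable for a local object, but if the implementation behind it turns out to be remote, the loop becomes one network round trip per entry:

    import java.util.List;

    // An interface designed as if the object were local.
    interface AddressBook {
        List<String> listNames();          // one call
        String lookupPhone(String name);   // one call per name
    }

    class AddressBookClient {
        // Local implementation: N cheap in-memory calls.
        // Remote implementation: N+1 network round trips, each several
        // orders of magnitude slower than a local method call.
        static void printAll(AddressBook book) {
            for (String name : book.listNames()) {
                System.out.println(name + ": " + book.lookupPhone(name));
            }
        }
    }

A design that took location into account would expose a bulk operation (for example, a single call returning names and phone numbers together) so that a remote implementation could answer in one round trip.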
Latency (cont.)
• One way to overcome the latency problem:
  • Make available tools that allow the developer to debug performance
  • Understand which components are slowing down the system
  • Make recommendations about which components should be local
• But can we be sure that such tools will be available? (Do you know of a good one?) This is an active research area – which means it is hard!
Memory Access
• A local pointer does not make sense in a remote address space (illustrated in the sketch below)
• What are the solutions?
  • Create a language where all memory access is managed by a runtime system (e.g., Java) – everything is a reference
    • But not everyone uses Java
  • Force the programmer to access memory in a way that does not use pointers (in C++ you can do both)
    • But not all programmers are well behaved
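The following Java sketch (a hypothetical Counter class, not from the paper) shows why “everything is a reference” only masks part of the problem. Locally, caller and callee share an address space, so a mutation made through the reference is visible to the caller; across address spaces the reference is meaningless, and middleware such as RMI instead marshals a copy of any non-remote argument, so the same code silently changes meaning:

    import java.io.Serializable;

    class Counter implements Serializable {
        int value;
    }

    class CounterService {
        // Local call: 'c' is the caller's own object; the increment is visible.
        // Remote call: 'c' would be a deserialized copy in the server's address
        // space, and the caller's Counter would remain unchanged.
        static void increment(Counter c) {
            c.value++;
        }
    }

    class CounterDemo {
        public static void main(String[] args) {
            Counter c = new Counter();
            CounterService.increment(c);
            System.out.println(c.value); // prints 1 locally; the caller would
                                         // still see 0 after a copy-by-value
                                         // remote invocation
        }
    }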
Memory Access and Latency: The Verdict
• Conceptually, it is possible to mask the difference between local and distributed computing w.r.t. memory access and latency
• Latency:
  • Develop your application without consideration for object locations
  • Decide on object locations later
  • Rely on good debugging tools to determine the right locations
• Memory access:
  • Enforce memory access through the underlying management system
• But masking this difference is difficult, so it’s not clear whether we can realistically expect it to be masked
Partial Failure
• One component has failed while others keep operating
• You don’t know how much of the computation has actually completed – this is unique to distributed systems
  • Has the server failed, or is it just slow?
  • Did it update my bank account before it failed? (see the sketch below)
• With local computing, a function can also fail, or a system may block or deadlock, but
  • You can always find out what’s happening by asking the operating system or the application
• In distributed computing, you cannot always find out what happened, because you may be unable to communicate with the entity in question
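The bank-account example can be written down directly. In this minimal RMI-flavoured sketch (the Bank interface is hypothetical), a failed deposit leaves the client with several indistinguishable possibilities, none of which it can resolve by asking a local operating system:

    import java.rmi.Remote;
    import java.rmi.RemoteException;

    interface Bank extends Remote {
        void deposit(String account, long cents) throws RemoteException;
    }

    class BankClient {
        static void pay(Bank bank) {
            try {
                bank.deposit("alice", 10_00);
            } catch (RemoteException e) {
                // Indistinguishable possibilities:
                //  1. the request never reached the server;
                //  2. the server updated the account, then crashed before replying;
                //  3. the server is alive but slow, and the reply was lost.
                // Blindly retrying risks depositing twice; a robust protocol has
                // to make the operation idempotent (e.g., tag it with a request id).
            }
        }
    }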
Concurrency
• Aren’t local multithreaded applications subject to the same issues as distributed applications?
• Not quite:
  • In local programming, a programmer can always force a certain order of operations (see the sketch below)
  • In distributed computing this cannot be done
  • In local programming, the underlying system provides synchronization primitives and mechanisms
  • In distributed systems, this is not easily available, and the system providing the synchronization infrastructure may itself fail
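A small local example of the control a single-machine programmer takes for granted (assumptions: a single JVM and ordinary threads). The guarantee below comes from shared memory and a shared scheduler; no built-in primitive provides the same guarantee across machines, and a distributed lock service that approximates it can itself fail or become unreachable, which is exactly the partial-failure problem above:

    class SharedLog {
        private final StringBuilder log = new StringBuilder();

        // 'synchronized' forces an order: the two appends of one call are never
        // interleaved with another thread's call. Between the nodes of a
        // distributed system there is no shared memory or shared scheduler to
        // back such a primitive.
        synchronized void appendEntry(String who, String what) {
            log.append(who).append(": ");
            log.append(what).append('\n');
        }
    }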
So What Do We Do?
• Design the right interfaces
• Interfaces must allow the programmer to handle errors that are unique to distributed systems
• For example, a read() system call:
  • Local interface: int read(int fd, char *buf, int size)
  • Remote interface: int read(int fd, char *buf, int size, long timeout)
  • Error codes are expanded to indicate timeout or network failure (a Java rendering of the same idea is sketched below)
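The same idea rendered in Java (the interfaces and exception types below are illustrative, not a real library API): the remote variant makes the caller state how long it is willing to wait and forces it to confront failure modes that simply do not exist locally:

    import java.io.IOException;

    // Failure modes that only exist once a network is involved.
    class RemoteTimeoutException extends IOException {}
    class NetworkFailureException extends IOException {}

    interface LocalFile {
        // Local read: only the usual local failure modes.
        int read(byte[] buf, int size) throws IOException;
    }

    interface RemoteFile {
        // Remote read: an explicit timeout, and an expanded set of errors.
        int read(byte[] buf, int size, long timeoutMillis)
                throws RemoteTimeoutException, NetworkFailureException, IOException;
    }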
But Wait… Can’t You Unify Interfaces?
• Can’t you use the beefed-up remote interface even when programming local applications?
  • Then you don’t need to have different sets of interfaces
• You could, but
  • Local programming would become a nightmare
  • This defeats the purpose of unifying local and distributed paradigms: instead of making distributed programming simpler, you’d be making local programming more complex
So What Does Jim Suggest?
• Design objects with local interfaces
• Add an extension to the interface if the object is to be distributed
• The programmer will be aware of the object’s location
• How is this actually done? Recall RMI:
  • A remote object must implement an interface that extends Remote
  • A caller invoking a method on a remote object must catch (or declare) RemoteException
  • But the same object can be used locally, without specifying that it implements Remote (see the sketch below)
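A minimal sketch of that pattern (names are illustrative; RMI registry set-up and exporting of the object are omitted): the local interface stays simple, distribution is an explicit extension, and only callers who go through the remote interface are forced to deal with RemoteException:

    import java.rmi.Remote;
    import java.rmi.RemoteException;

    // Plain local interface: no distributed machinery at all.
    interface Dictionary {
        String define(String word);
    }

    // Remote extension: the same operation, but the signature now admits
    // RemoteException, so remote callers cannot ignore the possibility of failure.
    interface RemoteDictionary extends Remote {
        String define(String word) throws RemoteException;
    }

    // One implementation satisfies both: used locally it is an ordinary object;
    // exported via RMI it is reached through a stub that can throw RemoteException.
    class InMemoryDictionary implements Dictionary, RemoteDictionary {
        public String define(String word) {
            return word + ": (definition omitted in this sketch)";
        }
    }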
Summary
• Distributed computing is fundamentally different from local computing because of different failure modes
• By making distributed interfaces look like local interfaces, we are diminishing our ability to properly handle those failures – this results in brittle applications
• To handle those failures properly, interfaces must be designed in a certain way
• Therefore, remote interfaces must be different from local interfaces (unless you want to make local interfaces unnecessarily complicated)