CSE 291: System Services for the World Wide Web • Winter 2000 • Geoffrey M. Voelker
Class Goal • Provide background for doing experimental systems research in wide-area systems such as the Web • Architectures for wide-area distributed systems • What are good models for structuring systems to support apps? • Understanding system behavior • What are the workloads? • Enhancing performance • Caching, prefetching • New opportunities • Multimedia, XML • Read, evaluate, present a variety of research papers • Do projects on interesting problems in area CSE 291 Intro
Today • Course structure • Presentations • Evaluations • Projects • Course contents • Web intro • Topics and papers
Presentations • We will present papers in class for discussion • Roughly 2 papers per 1.5-hour class (sometimes 3) • Everyone presents (including me) • How many depends on # students registered • I will present Thursday’s papers and do my share during the rest of the quarter • Look over schedule between now and Thursday • We’ll allocate papers on Thursday
Evaluations • You must submit evaluations of papers • Email them to me by noon of day of class • No evals if you have to present • Brief (½ page) • Summary of paper (research problem, conclusions) • What you learned • Any ideas that occurred to you • Your frank opinion of topic and/or work • If this gets to be too burdensome, might notch it down to one evaluation per day
Class Participation • The presentations are for fostering discussion …so I expect you to participate in discussions • Presenters • Come prepared with discussion questions • Rest of us • Use your evaluations as a basis for discussion
Class Project • For those signed up for four units • Work in pairs • Schedule in handout, on the Web • Roughly 5 weeks setup, 5 weeks working • Start thinking about what you might want to work on, who you might want to work with • I’ll have a list of topics, but I also encourage you to use your own • Your final will be a class presentation on your project
Course Contents • Topics • Wide-area system architectures • Naming • Scalable servers • Workload characterizations • Caching, prefetching • Protocols • Security • Emerging applications • Overview of Web to help put things into context
How does the Web work? • The canonical example in your Web browser: “Click here” • “here” is a Uniform Resource Locator (URL): http://www-cse.ucsd.edu • It names the location of an object on a server
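To make "a URL names the location of an object on a server" concrete, here is a minimal Python sketch that splits the slide's URL into the pieces a client needs. The helper name `describe_url` and the default-port table are assumptions for illustration, not part of the original slides.

```python
from urllib.parse import urlsplit

# Assumed defaults for illustration: the well-known ports for each scheme.
DEFAULT_PORTS = {"http": 80, "https": 443}

def describe_url(url):
    """Split a URL into the server location and the object name on that server."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,                      # protocol used to fetch the object
        "host": parts.hostname,                      # server that holds the object
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",                   # empty path means the root object
    }
```

Calling `describe_url("http://www-cse.ucsd.edu")` yields the host `www-cse.ucsd.edu`, port 80, and the root path `/`, which shows why the name breaks if the object moves to another host or path.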
Client Server In Action… • Client resolves name of server (www-cse.ucsd.edu) • Establishes a connection with the server • Sends the server the name of the object (null in this example, since the URL has no path) • Server returns the object [Figure: client and server; labels: http://www-cse.ucsd.edu, HTTP]
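The four steps on this slide can be sketched in Python with plain sockets. This is a rough illustration, assuming an HTTP/1.0-style GET request; the function names `build_request` and `fetch` are hypothetical, and the `Host` header is included only because most servers today expect it.

```python
import socket
from urllib.parse import urlsplit

def build_request(host, path="/"):
    """Format a minimal HTTP/1.0 GET request naming the object `path` on `host`."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")

def fetch(url, timeout=5.0):
    """Fetch `url` following the slide's four steps; returns the raw response bytes."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80
    path = parts.path or "/"        # a "null" object name maps to the root path

    # Step 1: resolve the name of the server to an address.
    address = socket.gethostbyname(host)

    # Step 2: establish a TCP connection with the server.
    with socket.create_connection((address, port), timeout=timeout) as conn:
        # Step 3: send the server the name of the object.
        conn.sendall(build_request(host, path))

        # Step 4: the server returns the object; read until it closes the connection.
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

With network access, `fetch("http://www-cse.ucsd.edu")` would return the status line, headers, and body as one byte string; whether that particular host still answers is not something the slides guarantee.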
Naming • How should objects be named? • URLs name locations…if an object moves, the URL breaks • Location-independent names seem like the obvious way to go • Why don’t we use them (e.g., URIs)? • How do we make them work, esp. in the face of mobility? • How it works now, how it might work in the future • DNS [Mockapetris88] • DNS for URIs [Daniel96] • Names as programs [Vahdat99] • Finding replicas [Guyton95], [vanSteen98]
Client Server Communication • Communication between the client and server is done via HTTP over TCP/IP [Figure: client and server; labels: http://www-cse.ucsd.edu, HTTP]
Protocols • What kind of transport protocol should the Web use? • HTTP 1.0 • One TCP connection/object • Complaints: inefficient, slow, burdensome… • HTTP 1.1 • One TCP connection/many objects (persistent connections) • Solves all problems, right? Huge amount of complexity • Clients, proxies, servers • How do they compare? • Protocol differences [Krishnamurthy99], performance comparison [Nielsen97], effects on servers [Manley97], overhead of TCP connections [Caceres98]
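A back-of-the-envelope Python sketch of why one connection per object is considered inefficient. The cost model is a deliberate simplification and an assumption of this example: one round trip to set up each TCP connection, one round trip per request/response, strictly sequential fetches, and no slow start, pipelining, or parallel connections.

```python
def round_trips(num_objects, persistent):
    """Estimate round trips to fetch `num_objects` under a simplified cost model.

    persistent=False models HTTP 1.0 (one TCP connection per object);
    persistent=True models HTTP 1.1 persistent connections (one connection reused).
    """
    if num_objects == 0:
        return 0
    connections = 1 if persistent else num_objects
    # One round trip per connection setup plus one per request/response exchange.
    return connections + num_objects
```

Under these assumptions a page with 10 objects costs 20 round trips with one connection per object and 11 with a persistent connection. Real comparisons, such as those in the papers cited on the slide, depend on many factors this model leaves out.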
Scalable Servers • Of course, you are not the only person accessing the server… [Figure label: Server]
Scalable Servers • How do you build servers to handle millions of hits a day? • Web servers: Flash [Pai99], scheduling [Crovella99] • Mail servers: EarthLink [Christenson97, Saito99] • Principles: Transcend, HotBot [Fox97] • Techniques: Load balancing [Pai98]
Web Caching • Gee, is there some way to offload those busy servers? • Use caches to exploit reference locality among clients [Figure: clients, proxy cache, servers]
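As a rough illustration of a proxy cache exploiting reference locality, the Python sketch below counts hits and misses for a shared cache. The class name `ProxyCache`, the least-recently-used eviction policy, and the `fetch_from_server` callback are all assumptions of this example; the slides do not prescribe a particular policy.

```python
from collections import OrderedDict

class ProxyCache:
    """Toy shared cache: serves repeated requests locally, forwards misses to the server."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of cached objects
        self.store = OrderedDict()    # url -> body, ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, url, fetch_from_server):
        if url in self.store:
            # Hit: serve from the cache and mark the object as recently used.
            self.store.move_to_end(url)
            self.hits += 1
            return self.store[url]
        # Miss: go to the origin server, then cache the result.
        self.misses += 1
        body = fetch_from_server(url)
        self.store[url] = body
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)   # evict the least recently used object
        return body
```

Every hit is a request the busy server never sees, so the hit rate (`hits / (hits + misses)`) is a direct measure of how much load the cache offloads.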
Caching • How should we build caching systems for the Web? • Seminal paper [Chankhunthod96] • Proxy caches [Duska97] • Akamai hack [Karger99] • Cooperative caching [Tewari99, Fan98, Wolman99] • Popularity distributions [Breslau99]
Prefetching • The fastest way to download a page is to fetch it before it is accessed • How do you know what will be accessed? • How much bandwidth can you afford for mistakes? • Performance bounds [Kroeger97] • Survey paper w/ practical approach [Duchamp99]
Security • We can’t just assume we’re in a back room anymore • How do we secure access to resources? • Infrastructure: SDSI [Rivest96] • Wide-area service: CRISIS [Belani98] • Computational grids: Globus [Foster98] • E-commerce: SSL [Wagner96] • Downloaded code: Java [Wallach97]
Workload Characterizations • How can you fix it if you don’t look inside? • Is the Web slow because of the network, the server, CPU-hogging browsers? • What is the behavior of clients, proxies, and servers? • Golden fleece: Invariants across populations and time • E.g., Zipf-like distribution of object popularities • How can we use workloads to shape the systems we design? • Characterization survey [Pitkow98] • Rate of change [Douglis97] • Key to data dissemination (caching, prefetching, etc.)
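To show what a Zipf-like popularity distribution implies for caching, here is a small Python sketch in which the request probability of the object at popularity rank i is proportional to 1/i^alpha. The function names and the default `alpha=1.0` are assumptions for illustration; measured Web workloads may call for a different exponent.

```python
def zipf_probabilities(num_objects, alpha=1.0):
    """Request probability for each object, most popular first (rank 1, 2, ...)."""
    weights = [1.0 / rank ** alpha for rank in range(1, num_objects + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def top_k_share(num_objects, k, alpha=1.0):
    """Fraction of all requests that go to the k most popular objects."""
    return sum(zipf_probabilities(num_objects, alpha)[:k])
```

For example, `top_k_share(1000, 100)` gives the share of requests that a cache holding only the 100 most popular of 1000 objects could serve in this model, which is the kind of invariant that can shape cache sizing.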
Emerging Applications • HTML, gif, jpeg, etc. are all old news • What’s the new, cool stuff? • Multimedia • Streaming multimedia is the next big thing (20% of bandwidth in UW traces) • Workloads [Mena00], tools [Caceres99], delivery [Eager00] • XML • I don’t know what it is, so I’d like to learn • Papers TBD
Architectures • The Web is basically a simple read-only data access system • Click, fetch, click, fetch, click, fetch… • Why not fully generalize it into a universal wide-area distributed system? • The Web as an operating system: WebOS [Vahdat98] • Wide-area computational grids: Legion [Lewis96, Grimshaw98], Globe [vanSteen97], Globus [Foster97, Foster98]
[Figure: clients, proxy cache, servers; labels: hits, misses]