Untangling the Web from DNS. Scott Shenker UC Berkeley/ICSI. Michael Walfish Hari Balakrishnan Massachusetts Institute of Technology. IRIS Project 30 March 2004. Introduction. The Web depends on linking; links contain references. < A HREF= http://domain_name/path_name > click here </A>.
Untangling the Web from DNS Scott Shenker UC Berkeley/ICSI Michael Walfish Hari Balakrishnan Massachusetts Institute of Technology IRIS Project 30 March 2004
Introduction • The Web depends on linking; links contain references <A HREF=http://domain_name/path_name>click here</A> • Properties of DNS-based references • encode administrative domain • human-friendly • These properties are problems!
Web Links Should Use Flat Identifiers • Current: <A HREF=http://isp.com/dog.jpg>my friend’s dog</A> • Proposed: <A HREF=http://f0120123112/>my friend’s dog</A> • This talk: • That we should build this • That we can build this
Outline • Argue for flat tags instead of DNS-based URLs • Resolution service for flat tags
Status Quo • Web page: <A HREF=http://a.com/dog.jpg>Spot</A> • Browser sends a.com to DNS, the “Reference Resolution Service”, and gets back an IP addr • Browser sends HTTP GET: /dog.jpg to that IP addr • Why not DNS?
Goal #1: Stable References Stable=“reference is invariant when object moves” • In other words, links shouldn’t break • DNS-based URLs are not stable . . .
Object Movement Breaks Links • URLs hard-code a domain and a path • <A HREF=http://isp.com/dog.jpg>Spot</A> • Browser sends HTTP GET: /dog.jpg to isp.com, where “dog.jpg” used to live, and gets “HTTP 404” • The object is now “spot.jpg” at isp-2.com
Object Movement Breaks Links, Cont’d • Today’s solutions not stable: • HTTP redirects • need cooperation of original host • Vanity domains, e.g.: internetjoe.org • now owner can’t change
Goal #2: Supporting Object Replication • Host replication relatively easy today • But per-object replication requires: • separate DNS name for each object • virtual hosting so replica servers recognize names • configuring DNS to refer to replica servers • Example: http://object26.org • isp.com receives HTTP “GET /” with host: object26.org and serves “/docs/foo.ps” • mit.edu receives HTTP “GET /” with host: object26.org and serves “~joe/foo.ps”
What Should References Encode? • Observe: if the object is allowed to change administrative domains, then the reference can’t encode an administrative domain • What can the reference encode? • Nothing about the object that might change! • Especially not the object’s whereabouts! • What kind of namespace should we use?
Goal #3: Automate Namespace Management • Automated management implies no fighting over references • DNS-based URLs do not satisfy this . . .
DNS is a Locus of Contention • Used as a branding mechanism • tremendous legal combat • “name squatting”, “typo squatting”, “reverse hijacking”, . . . • ICANN and WIPO politics • technical coordinator inventing naming rights • set-asides for misspelled trademarks • Humans will always fight over names . . .
Separate References and User-level Handles • Layers: User Handles (AOL Keywords, New Services, etc.), then Human-unfriendly References, then Object Location • “So aren’t you just moving the problem?” • Yes. • But: let people fight over handles, not references • Handles are the tussle space [Clark et al., 2002]
Two Principles for References • References should not embed object or location semantics • References should be human-unfriendly • Natural choice: flat tags, minimal interface
Outline • Argue for flat tags instead of DNS-based URLs • Resolution service for flat tags: SFR User Handle (AOL Keyword, New Handle, etc.) Flat Tag Object Location
SFR in a Nutshell GET(0xf012c1d) <A HREF= http://f012c1d/ >Spot</A> Managed DHT-based Infrastructure o-record (10.1.2.3, 80, /pics/dog.gif) orec API • orec = get(tag); • put(tag, orec); Anyone can put() or get() HTTP GET: /pics/dog.gif 10.1.2.3 /pics/dog.gif Web Server
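The get()/put() interface on this slide can be sketched in a few lines of Python. This is only an illustration: `ToySFR`, `ORecord`, and `resolve` are hypothetical names, and a plain dictionary stands in for the managed DHT. The tag and o-record values are the ones shown on the slide.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ORecord:
    """Hypothetical o-record holding the (IP, port, path) triple from the slide."""
    ip: str
    port: int
    path: str


class ToySFR:
    """In-memory stand-in for the managed DHT (an assumption, not the real system)."""

    def __init__(self) -> None:
        self._store: Dict[int, ORecord] = {}

    def put(self, tag: int, orec: ORecord) -> None:
        # put(tag, orec): bind a flat tag to an o-record.
        self._store[tag] = orec

    def get(self, tag: int) -> Optional[ORecord]:
        # orec = get(tag): look up the o-record for a flat tag.
        return self._store.get(tag)


def resolve(sfr: ToySFR, tag: int) -> Optional[str]:
    """Turn a flat tag into the URL the browser would actually fetch."""
    orec = sfr.get(tag)
    if orec is None:
        return None
    return f"http://{orec.ip}:{orec.port}{orec.path}"


sfr = ToySFR()
sfr.put(0xF012C1D, ORecord("10.1.2.3", 80, "/pics/dog.gif"))
print(resolve(sfr, 0xF012C1D))
```

The browser-side step after `resolve` is then an ordinary HTTP GET for /pics/dog.gif at 10.1.2.3.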
Resilient Linking • tag abstracts all object reachability information • objects: any granularity (files, directories, hosts) HTTP GET: /docs/pub.pdf 10.1.2.3 <A HREF= http://f012012/pub.pdf >here is a paper</A> /docs/ HTTP GET: /~user/pubs/pub.pdf 20.2.4.6 (10.1.2.3,80, /docs/) /~user/pubs/ (20.2.4.6,80, /~user/pubs/) SFR
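A small Python sketch of why the link survives a move, assuming the tag names a directory and the rest of the link is appended to the path stored in the o-record. The helper names `split_link` and `resolve_link` and the dictionary used as the resolution table are hypothetical; the addresses and paths are the ones on the slide.

```python
from typing import Dict, Tuple

# (IP, port, path prefix), as in the o-records on the slide.
Location = Tuple[str, int, str]


def split_link(href: str) -> Tuple[str, str]:
    """Split 'http://<tag>/<suffix>' into (tag, suffix)."""
    scheme = "http://"
    if not href.startswith(scheme):
        raise ValueError(f"not an http link: {href!r}")
    tag, _, suffix = href[len(scheme):].partition("/")
    return tag, suffix


def resolve_link(table: Dict[str, Location], href: str) -> str:
    """Resolve the tag, then append the link's suffix to the stored prefix."""
    tag, suffix = split_link(href)
    ip, port, prefix = table[tag]
    return f"http://{ip}:{port}{prefix}{suffix}"


link = "http://f012012/pub.pdf"

# Before the move: the directory lives under /docs/ on 10.1.2.3.
table: Dict[str, Location] = {"f012012": ("10.1.2.3", 80, "/docs/")}
before = resolve_link(table, link)

# After the move: only the o-record changes; the link text is untouched.
table["f012012"] = ("20.2.4.6", 80, "/~user/pubs/")
after = resolve_link(table, link)

print(before)
print(after)
```

The point of the sketch is that `link` never changes; only the entry behind the tag does.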
Flexible Object Replication • SFR maps tag 0xf012012 to an o-record listing several locations: (IP1, port1, path1), (IP2, port2, path2), (IP3, port3, path3), . . . • (Doesn’t address massive replication) • Grass-roots Replication • People replicate each other’s content • Does not require control over Web servers
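The slide does not say how a client chooses among the locations in an o-record. One simple possibility, sketched below in Python purely as an assumption, is to try them in order and take the first one that answers. `pick_replica` and the `is_reachable` callback are hypothetical, and the addresses are illustrative.

```python
from typing import Callable, List, Optional, Tuple

# (IP, port, path), one entry per replica in the o-record.
Location = Tuple[str, int, str]


def pick_replica(
    locations: List[Location],
    is_reachable: Callable[[Location], bool],
) -> Optional[Location]:
    """Return the first reachable replica, or None if none responds.

    First-reachable-wins is an assumed policy, not one stated in the talk.
    """
    for loc in locations:
        if is_reachable(loc):
            return loc
    return None


# An o-record for one tag with two replicas (illustrative values).
orec: List[Location] = [
    ("10.1.2.3", 80, "/docs/foo.ps"),
    ("20.2.4.6", 80, "/~joe/foo.ps"),
]

# Pretend the first server is down; a real client would attempt a connection.
down = {"10.1.2.3"}
chosen = pick_replica(orec, lambda loc: loc[0] not in down)
print(chosen)
```

Because anyone can add a location to the record, the replica at the second address needs no cooperation from the first server.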
Reference Management • Requirements • No collisions, even under network partition • References must be human-unfriendly • Only authorized updates to o-records • Approach: randomness and self-certification • tag = hash(pubkey, salt) • o-record has pubkey, salt, signature • anyone can check if tag and o-record match
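The tag check can be illustrated with Python's standard library. The slide only says tag = hash(pubkey, salt); the choice of SHA-256, the length-prefixed encoding, and the names `make_tag` and `tag_matches` are assumptions made for this sketch. Verifying the o-record's signature is left out because it needs a public-key library that the standard library does not provide.

```python
import hashlib
import os


def make_tag(pubkey: bytes, salt: bytes) -> str:
    """tag = hash(pubkey, salt), here with SHA-256 (an assumed hash function)."""
    h = hashlib.sha256()
    # Length-prefix the key so the pubkey/salt boundary is unambiguous.
    h.update(len(pubkey).to_bytes(4, "big"))
    h.update(pubkey)
    h.update(salt)
    return h.hexdigest()


def tag_matches(tag: str, pubkey: bytes, salt: bytes) -> bool:
    """Anyone holding the o-record's pubkey and salt can recompute the tag."""
    return tag == make_tag(pubkey, salt)


# Placeholder key bytes; a real o-record would carry an actual public key.
pubkey = b"example-public-key-bytes"
salt = os.urandom(16)  # random salt: tags are unguessable and human-unfriendly

tag = make_tag(pubkey, salt)
print(tag_matches(tag, pubkey, salt))           # the owner's record matches
print(tag_matches(tag, b"someone-else", salt))  # a different key does not
```

Since the tag is derived from random input rather than assigned, two parties can mint tags independently without coordinating, which is what makes collisions unlikely even under partition.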
Latency • Problem: Lookups should be fast • Solution: lots of TTL-based caching • Clients and DHT nodes cache o-records • DHT nodes cache each other’s locations • In Chord with aggressive location caching: 2 or 3 hops per lookup • Could also use “one-hop” or Beehive
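A minimal Python sketch of TTL-based caching of o-records, as a client or DHT node might do it. `TTLCache` is a hypothetical class, and the injectable clock is only there so the example runs deterministically; none of this is taken from the actual implementation.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class TTLCache:
    """Cache whose entries expire after a per-entry time-to-live (in seconds)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[Any, float]] = {}

    def put(self, key: Hashable, value: Any, ttl: float) -> None:
        # Store the value together with its absolute expiry time.
        self._entries[key] = (value, self._clock() + ttl)

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the caller falls back to a real lookup.
            del self._entries[key]
            return None
        return value


# Fake clock so the example does not need to sleep.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])

cache.put(0xF012C1D, ("10.1.2.3", 80, "/pics/dog.gif"), ttl=30.0)
fresh = cache.get(0xF012C1D)   # served from the cache
now[0] = 31.0
stale = cache.get(0xF012C1D)   # TTL elapsed, so this is a miss
print(fresh, stale)
```

A cache miss is where the DHT lookup, with its 2 or 3 hops, would actually be paid.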
Outline • Argue for flat tags instead of DNS-based URLs • Resolution service for flat tags: SFR • Related Work / Summary / Conclusion
Related Work • URN (Universal Resource Name) • DOI: an existing URN implementation • PURL (Permanent URL) • Globe • Open Network Handles • DNS over Chord
Summary • Should we build flat references for the Web? • Yes! • Can we build flat references for the Web? • Yes! (Our implementation is usable.) • Lots of future work . . .
Conclusion • DNS-based URLs certainly convenient! • But flat tags better for linking • Future: type DNS names, link with flat tags?
Implementation • [Figure: Web Client, SFR Web Proxy with SFR Client, SFR Portal, and SFR Server over DHash and Chord, on PlanetLab; labels: HTTP, http://, GetRequest, GetResponse, orec = get(tag), put(tag, orec)] • Proxy allows: • End-users to experience SFR latency • Dynamic population of SFR infrastructure with o-records
Evaluation • SFR data: • eight-day trace • 390 virtual hosts; 130 nodes in Chord ring on PlanetLab • latency seen by SFR portal • most queries are two hops • informal feedback: generally indistinguishable from DNS • DNS data: • collected at MIT CSAIL, Feb. 2004 • Comparison meant to be suggestive, not conclusive
SFR Components • [Figure: Organization, SFR Client applications, Portal, Org Store, Relays, SFR Infrastructure] • Infrastructure stores (tag, o-record) pairs • Caching throughout; o-record has TTL field
Fate Sharing • [Figure: Organization, SFR Client applications, Portal, Org Store, Relays, SFR Infrastructure; operations put(tag,orec) and get(tag)] • Fate sharing via write-locality • Simple case . . .
Alternate SFR Design: SFR-- • SFR stores only pointers to organizations • Analogous to NS records in DNS • [Figure: User sends GET(0xf012120) to SFR Infrastructure, which holds org ptr: x; GET(0xf012120) to Organization x returns meta-data (IP addr, etc.)]
When Files Separate From Directories • <A HREF=http://f012012/pub3.ps>here is a paper</A> • [Figure: GET(0xf012012) to SFR returns (10.1.2.3,80, /doc/); 10.1.2.3 holds /doc/pub1.ps and /doc/pub2.ps; for /doc/pub3.ps there is a redirect ptr: x; x is 20.2.4.6, holding /abc/pub3.ps; labels: HTTP GET: /doc/pub3.ps, HTTP GET: /abc/pub3.ps]
Location Caching • Simulate effect of location caching • 1000 nodes • 20 lookups/sec; one failure and one birth per 10 sec. • Timeout rate: 4%