Determining the Geographic Location of Internet Hosts

Venkata N. Padmanabhan (Microsoft Research) and Lakshminarayanan Subramanian (University of California at Berkeley). SIGMETRICS 2001.


Presentation Transcript


  1. Determining the Geographic Location of Internet Hosts
     Venkata N. Padmanabhan, Microsoft Research
     Lakshminarayanan Subramanian, University of California at Berkeley
     SIGMETRICS 2001

  2. Background
     • Location-aware services are relevant in the Internet context too
       • targeted advertising
       • event notification
       • territorial rights management
     • Existing approaches:
       • user input: burdensome, error-prone
       • whois: manual updates, host may not be at registered location
     • Goal: estimate location based on client IP address
       • challenging problem because an IP address does not inherently indicate location

  3. IP2Geo
     Multi-pronged approach that exploits various “properties” of the Internet
     • DNS names of router interfaces often indicate location
     • Network delay tends to correlate with geographic distance
     • Hosts that are aggregated for the purposes of Internet routing also tend to be clustered geographically
     • GeoTrack
       • determine location of closest router with recognizable DNS name
     • GeoPing
       • use delay measurements to triangulate location
     • GeoCluster
       • extrapolate partial IP-to-location mapping information using cluster information derived from BGP routing data
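
The GeoTrack bullet above can be illustrated with a minimal Python sketch. It assumes the router names along the path to the target are already known (for example from a traceroute run) and that a table of location codes exists. The `CITY_CODES` table, the function names, and the `example.net` host names are all hypothetical and are not taken from the talk.

```python
import re

# Hypothetical code-to-city table (an assumption for illustration only);
# real router naming schemes vary from one ISP to another.
CITY_CODES = {
    "sea": "Seattle, WA",
    "sfo": "San Francisco, CA",
    "nyc": "New York, NY",
    "chi": "Chicago, IL",
}

def location_from_dns_name(name):
    """Return a city if any token of the router name matches a known code."""
    # Split on dots, hyphens, underscores and digits: "chi2" -> "chi".
    for token in re.split(r"[.\-_0-9]+", name.lower()):
        if token in CITY_CODES:
            return CITY_CODES[token]
    return None

def geotrack(path_hops):
    """Scan the path from the target backwards and return the location of
    the closest hop whose name is recognizable, or None if there is none."""
    for name in reversed(path_hops):
        city = location_from_dns_name(name)
        if city is not None:
            return city
    return None

# Made-up path, ordered from the measurement source to the target.
hops = [
    "core1-sea.example.net",
    "ge-0-1.chi2.example.net",
    "unknown-router.example.org",
    "host-42.example.com",
]
print(geotrack(hops))  # prints "Chicago, IL"
```

The last two names carry no recognizable code, so the estimate falls back to the nearest upstream router that does.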

  4. GeoPing
     • Delay-based triangulation is conceptually simple
       • delay → distance
       • distance from 3 or more non-collinear points → location
     • But there are practical difficulties
       • network path may be circuitous
       • transmission and queuing delays may corrupt delay estimate
       • one-way delay is hard to measure
     • GeoPing
       • delay is measured from several distributed probes
       • minimum delay among several samples is picked
       • Nearest Neighbor in Delay Space (NNDS) algorithm
         • construct a delay map containing (delay vector, location) tuples
         • given a delay vector, search through the delay map for closest match
         • location corresponding to the closest match is our location estimate
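
The NNDS steps listed above can be sketched in a few lines of Python. The slide does not say how the "closest match" is measured, so Euclidean distance between delay vectors is an assumption here. The delay values, city names, and function names are made up for illustration.

```python
import math

def min_delay_vector(samples_per_probe):
    """Keep the minimum delay seen from each probe, which limits the effect
    of transient queuing delay on the estimate."""
    return [min(samples) for samples in samples_per_probe]

def delay_distance(u, v):
    """Euclidean distance between two equal-length delay vectors
    (the choice of metric is an assumption, not stated on the slide)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nnds(delay_map, target_vector):
    """Return the location whose delay vector is closest to target_vector."""
    _, location = min(
        delay_map, key=lambda entry: delay_distance(entry[0], target_vector)
    )
    return location

# Made-up delay map: (delay vector in ms from three probes, known location).
delay_map = [
    ([10.0, 70.0, 40.0], "Seattle"),
    ([70.0, 12.0, 45.0], "New York"),
    ([40.0, 42.0, 9.0], "Chicago"),
]

# Three delay samples per probe for the target host (also made up).
samples = [[48.0, 41.5, 60.2], [44.0, 51.3, 43.1], [15.9, 11.2, 30.4]]
target = min_delay_vector(samples)  # [41.5, 43.1, 11.2]
print(nnds(delay_map, target))  # prints "Chicago"
```

The estimate can only be one of the locations already present in the delay map, so its accuracy depends on how densely the map covers the region of interest.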

  5. Validation of Delay-based Approach
     Delay tends to increase with geographic distance

  6. Impact of the Number of Probes
     Highest accuracy when 7-9 probes are used

  7. GeoCluster
     • Basic idea
       • divide up the space of IP addresses into clusters using BGP prefixes
       • use partial IP-to-location mapping data to infer location of each cluster
       • given target IP address, find matching cluster via longest-prefix match
       • location of the matching cluster is our estimate of host location
     • Issues
       • partial IP-to-location mapping information may not be entirely accurate
       • BGP prefixes might not correspond to geographic clusters
     • Sub-clustering algorithm
       • use partial IP-to-location mapping information to test whether a BGP prefix is likely to correspond to a geographic cluster
       • if the test is negative, divide the prefix into two and recursively apply the test to each half
       • in the end we are only left with geographically clustered prefixes
       • dispersion offers an indication of the accuracy of a location estimate
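
The sub-clustering and lookup steps above might look roughly like the following Python sketch, built on the standard `ipaddress` module. The slide does not specify the clustering test, so the "one location holds at least 80% of the known points" rule, the /24 stopping point, the function names, and the sample addresses are all assumptions made for illustration.

```python
import ipaddress
from collections import Counter

def is_geo_clustered(locations, min_share=0.8):
    """Hypothetical test: the most common location must account for at
    least min_share of the known points inside the prefix."""
    top_count = Counter(locations).most_common(1)[0][1]
    return top_count / len(locations) >= min_share

def sub_cluster(prefix, known, out, max_prefixlen=24):
    """Record prefix in out if it passes the test; otherwise split it into
    two halves and recurse. Prefixes with no known points are dropped."""
    locations = [loc for ip, loc in known if ip in prefix]
    if not locations:
        return
    if is_geo_clustered(locations):
        out[prefix] = Counter(locations).most_common(1)[0][0]
    elif prefix.prefixlen < max_prefixlen:  # assumed stopping point
        for half in prefix.subnets(prefixlen_diff=1):
            sub_cluster(half, known, out, max_prefixlen)

def locate(ip, clusters):
    """Longest-prefix match of ip against the cluster table."""
    matches = [p for p in clusters if ip in p]
    if not matches:
        return None
    return clusters[max(matches, key=lambda p: p.prefixlen)]

# Made-up partial IP-to-location data inside one made-up BGP prefix.
known = [
    (ipaddress.ip_address("10.1.10.5"), "Seattle"),
    (ipaddress.ip_address("10.1.20.7"), "Seattle"),
    (ipaddress.ip_address("10.1.30.9"), "Seattle"),
    (ipaddress.ip_address("10.1.200.1"), "Boston"),
    (ipaddress.ip_address("10.1.210.2"), "Boston"),
]
clusters = {}
sub_cluster(ipaddress.ip_network("10.1.0.0/16"), known, clusters)
print(locate(ipaddress.ip_address("10.1.50.50"), clusters))  # prints "Seattle"
```

In this example the /16 fails the test (3 of 5 points agree), so it is split once, and each /17 half then passes on its own.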

  8. Performance of IP2Geo
     Median error: GeoCluster: 28 km, GeoTrack: 102 km, GeoPing: 382 km

  9. Summary
     • IP2Geo combines several techniques that leverage different sources of information
       • GeoTrack: DNS names
       • GeoPing: network delay
       • GeoCluster: address aggregates used for routing
     • Median error varies between 20 and 400 km
     • Even a 30% success rate is useful especially since we can tell when the estimate is likely to be accurate
     • Forthcoming paper at SIGCOMM 2001
     • For more information visit: http://www.research.microsoft.com/~padmanab/
