Explore distance metrics, Euclidean & rectilinear, for better problem-solving in large location datasets. Learn to estimate parameters k and s using samples and real-world examples.
Management Science 461 Lecture 1b - Distance Metrics September 9, 2008
Distance Metrics • Without distances, DM problems usually aren’t DM problems at all • If not distances, then metrics based on distance • Time • Dependencies
Location Problems • In large-scale location problems, it may be hard to obtain all distances • Consider a problem with 1000 nodes: we need a 1000x1000 distance matrix (or do we?) • Distance metrics allow us to estimate distances with reasonable accuracy, without resorting to more complicated methods
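As a rough illustration of the "(or do we?)" question, the sketch below counts how many distances a 1000-node problem really involves. It assumes distances are symmetric (A to B equals B to A) and that the distance from a node to itself is zero; neither assumption is stated on the slide, and the function name is illustrative.

```python
def distinct_pairs(n: int) -> int:
    """Number of unordered node pairs, assuming symmetric distances
    and zero distance from a node to itself (assumptions, not from the slide)."""
    return n * (n - 1) // 2


n = 1000
full_matrix_entries = n * n          # every cell of the 1000x1000 matrix
needed_distances = distinct_pairs(n)  # only the upper triangle

print(full_matrix_entries)  # 1000000
print(needed_distances)     # 499500
```

Even with symmetry, that is still close to half a million distances to collect, which is the motivation for estimating them from coordinates instead.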
Basic Metrics • Two fundamental metrics: Euclidean and rectilinear • Rectilinear or right-angle distance metric • Euclidean or straight-line distance metric
Do maps help us? • Google Earth • Pick two points: is the path between them a grid, a straight line, or some combination?
Fit a Distance Metric [Figure: two points, (x1, y1) and (x2, y2), plotted on x and y axes]
Distances • Rectilinear: $|x_1 - x_2| + |y_1 - y_2|$ • Euclidean: $\left[(x_1 - x_2)^2 + (y_1 - y_2)^2\right]^{1/2}$ • Can we combine these two into a single formula?
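The two formulas can be sketched in a few lines of Python. The function names and the sample points are illustrative only; points are assumed to be (x, y) tuples on a common scale.

```python
import math


def rectilinear(p, q):
    """Right-angle distance: |x1 - x2| + |y1 - y2|."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def euclidean(p, q):
    """Straight-line distance: [(x1 - x2)^2 + (y1 - y2)^2]^(1/2)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


a, b = (0, 0), (3, 4)      # illustrative points
print(rectilinear(a, b))   # 7
print(euclidean(a, b))     # 5.0
```

The rectilinear distance is never shorter than the Euclidean one for the same pair of points, so the two metrics bracket the behaviour of grid-like and direct routes.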
k and s Distance Metric • $d = k\left[|x_1 - x_2|^s + |y_1 - y_2|^s\right]^{1/s}$ • If s=1: Rectilinear metric • If s=2: Euclidean metric • k is a scaling factor
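A minimal sketch of the combined metric follows, assuming it takes the common form d = k(|x1 - x2|^s + |y1 - y2|^s)^(1/s), which reduces to the rectilinear metric at s = 1 and the Euclidean metric at s = 2 when k = 1. The function name and sample points are illustrative.

```python
def ks_distance(p, q, k=1.0, s=2.0):
    """Assumed form of the k-and-s metric:
    k * (|x1 - x2|^s + |y1 - y2|^s)^(1/s)."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    return k * (dx ** s + dy ** s) ** (1.0 / s)


a, b = (0, 0), (3, 4)                  # illustrative points
print(ks_distance(a, b, k=1.0, s=1.0))  # rectilinear case: 3 + 4
print(ks_distance(a, b, k=1.0, s=2.0))  # Euclidean case: sqrt(9 + 16)
print(ks_distance(a, b, k=1.2, s=2.0))  # Euclidean distance scaled up by 20%
```

Values of s between 1 and 2 give distances between the two basic metrics, which is what makes s worth estimating rather than fixing in advance.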
Fit a Distance Metric • Determine the actual distances for a subset and estimate parameters k and s • Estimate k and s by minimizing the sum of squared differences between actual and estimated distances • Choose as large and diverse a sample as possible; bigger sample means better fit • Be careful of overselecting one type!
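One way the fitting step could be sketched, under stated assumptions: the metric has the form k(|dx|^s + |dy|^s)^(1/s); s is searched over a simple grid from 1.00 to 2.00 (a choice made here for illustration, not the lecture's method); and for each candidate s the least-squares k has the closed form k = sum(a_i g_i) / sum(g_i^2), where a_i is the actual distance and g_i the unscaled metric value. The sample data are synthetic, generated from a rectilinear metric with k = 1.2, purely to show that the procedure recovers known parameters.

```python
def unscaled_distance(p, q, s):
    """(|dx|^s + |dy|^s)^(1/s), i.e. the assumed metric with k = 1."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    return (dx ** s + dy ** s) ** (1.0 / s)


def fit_k_and_s(pairs, actual, s_grid=None):
    """Return (k, s, sse) minimizing the sum of squared differences
    between actual and estimated distances over a grid of s values."""
    if s_grid is None:
        s_grid = [1.0 + 0.01 * i for i in range(101)]  # 1.00, 1.01, ..., 2.00
    best = None
    for s in s_grid:
        g = [unscaled_distance(p, q, s) for p, q in pairs]
        # Closed-form least-squares scaling factor for this s.
        k = sum(a * gi for a, gi in zip(actual, g)) / sum(gi * gi for gi in g)
        sse = sum((a - k * gi) ** 2 for a, gi in zip(actual, g))
        if best is None or sse < best[2]:
            best = (k, s, sse)
    return best


# Synthetic sample (not real road data): rectilinear distances inflated by 20%.
pairs = [((0, 0), (3, 4)), ((0, 0), (5, 1)), ((1, 2), (4, 2)), ((2, 7), (6, 1))]
actual = [1.2 * (abs(p[0] - q[0]) + abs(p[1] - q[1])) for p, q in pairs]

k_hat, s_hat, sse = fit_k_and_s(pairs, actual)
print(k_hat, s_hat, sse)
```

A general-purpose nonlinear optimizer could replace the grid search; the grid is used here only to keep the sketch dependency-free. Note that the sample mixes axis-aligned and diagonal pairs, since s cannot be identified from pairs that all point in the same direction.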
Straight to the answer • It may seem that the straight-line metric would be a poor approximation in most cases … • … but the straight-line metric provides a surprisingly good approximation of the total distance between many pairs of points
Some real-world examples • Within the state of Wisconsin, road distances between cities are, on average, 18% longer than the straight-line distance • In Ontario, they are about 30% longer • So when we set s = 2, the optimal k comes out to be 1.18 for Wisconsin and 1.30 for Ontario, assuming the x and y coordinates are generated on the same scale as the map
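To show how these scaling factors would be used, the sketch below applies k = 1.18 (Wisconsin) and k = 1.30 (Ontario) to a straight-line distance with s = 2. The coordinates are hypothetical and assumed to be in kilometres on the same scale as the map; the function and constant names are illustrative.

```python
import math

K_WISCONSIN = 1.18  # from the slide: roads about 18% longer than straight line
K_ONTARIO = 1.30    # from the slide: roads about 30% longer


def estimated_road_distance(p, q, k):
    """Estimate road distance as k times the straight-line distance (s = 2)."""
    return k * math.hypot(p[0] - q[0], p[1] - q[1])


# Hypothetical city coordinates in km; straight-line distance is 100 km.
city_a, city_b = (0, 0), (60, 80)
print(estimated_road_distance(city_a, city_b, K_WISCONSIN))  # about 118 km
print(estimated_road_distance(city_a, city_b, K_ONTARIO))    # about 130 km
```

These are averages, so an individual pair of cities can deviate noticeably from the estimate.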
Distances in Alberta • To model highway distances in Alberta, it is a good idea to use the rectilinear metric (s = 1). • Rural road network is a grid (two miles between E-W roads and one mile between N-S roads, with corrections) • Travel distances can be approximated quite accurately using the rectilinear metric
Distance Metrics – Final Note • In some instances, actual distances will be longer than the estimate (due to rivers, mountains), and in other instances, actual distances will be shorter (interprovincial highways) • Note the difference between distance and time travelled