Geography 625. Intermediate Geographic Information Science. Week4: Point Pattern Analysis. Instructor : Changshan Wu Department of Geography The University of Wisconsin-Milwaukee Fall 2006. Outline. Revisit IRP/CSR First- and second order effects Introduction to point pattern analysis
Geography 625 Intermediate Geographic Information Science Week4: Point Pattern Analysis Instructor: Changshan Wu Department of Geography The University of Wisconsin-Milwaukee Fall 2006
Outline • Revisit IRP/CSR • First- and second order effects • Introduction to point pattern analysis • Describing a point pattern • Density-based point pattern measures • Distance-based point pattern measures • Assessing point patterns statistically
1. Revisit IRP/CSR Independent random process (IRP) Complete spatial randomness (CSR) • Equal probability: any point has equal probability of being in any position or, equivalently, each small sub-area of the map has an equal chance of receiving a point. • Independence: the positioning of any point is independent of the positioning of any other point.
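The two IRP/CSR conditions can be illustrated with a small simulation. This is a minimal sketch, assuming a rectangular study region and a fixed number of events n; the function name `simulate_csr` and its parameters are illustrative, not from the lecture.

```python
import random

def simulate_csr(n, width=1.0, height=1.0, seed=None):
    """Place n events uniformly (equal probability) and independently
    of one another in the rectangle [0, width] x [0, height]."""
    rng = random.Random(seed)
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]

# Assumed example: 100 events in a 10 x 5 study region.
events = simulate_csr(100, width=10.0, height=5.0, seed=42)
```

Each coordinate is drawn without reference to any other event, which is what the independence condition requires.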
2. First- and second order effects IRP/CSR is not realistic • The independent random process is mathematically elegant and forms a useful starting point for spatial analysis, but its use is often exceedingly naive and unrealistic. • If real-world spatial patterns were indeed generated by unconstrained randomness, geography would have little meaning or interest, and most GIS operations would be pointless.
2. First- and second order effects 1. First-order effect • The assumption of equal probability cannot be satisfied • The locations of disease cases tend to cluster in more densely populated areas • Plants tend to be clustered in areas with favorable soils. From (http://www.crimereduction.gov.uk/toolkits/fa020203.htm)
2. First- and second order effects 2. Second-order effect • The assumption of independence cannot be satisfied • Newly developed residential areas tend to be near existing residential areas • McDonald's stores tend to be far away from each other.
2. First- and second order effects In a point process the basic properties of the process are set by a single parameter, the probability that any small area will receive a point – the intensity of the process. First-order stationary: no variation in its intensity over space. Second-order stationary: no interaction between events.
3. Introduction to point pattern analysis Point patterns, where the only data are the locations of a set of point objects, represent the simplest possible spatial data. • Examples • Hot-spot analysis for crime locations • Disease analysis (patterns and environmental relations) • Freeway accident pattern analysis
3. Introduction to point pattern analysis Requirements for a set of events to constitute a point pattern • The pattern should be mapped on the plane (preferably in a way that preserves distances between points) • The study area should be determined objectively. • The pattern should be an enumeration or census of the entities of interest, not a sample • There should be a one-to-one correspondence between objects in the study area and events in the pattern • Event locations must be proper (they should not be the centroids of polygons)
4. Describing a Point Pattern Point density (first-order or second-order?) Point separation (first-order or second-order?) When first-order effects are marked, absolute location is an important determinant of observations, and in a point pattern clear variations across space in the number of events per unit area are observed. When second-order effects are strong, there is interaction between locations, depending on the distance between them, and relative location is important.
4. Describing a Point Pattern First-order or second order?
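4. Describing a Point Pattern A set of locations S with n events, S = {s_1, s_2, …, s_n}, where each event s_i has coordinates (x_i, y_i). The study region A has an area a. Mean center: $\bar{s} = (\mu_x, \mu_y) = \left( \frac{1}{n}\sum_{i=1}^{n} x_i, \; \frac{1}{n}\sum_{i=1}^{n} y_i \right)$ Standard distance: a measure of how dispersed the events are around their mean center, $d = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left[ (x_i - \mu_x)^2 + (y_i - \mu_y)^2 \right] }$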
4. Describing a Point Pattern A summary circle can be plotted for the point pattern, centered at the mean center with radius d. If the standard distance is computed separately for each axis, a summary ellipse can be obtained. Summary circle Summary ellipse
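The mean center and standard distance can be computed directly from the event coordinates. The sketch below assumes the standard distance divides by n (some texts use a different denominator), and the sample coordinates are made up for illustration.

```python
import math

def mean_center(events):
    """Average x and average y of the events."""
    n = len(events)
    return (sum(x for x, _ in events) / n, sum(y for _, y in events) / n)

def standard_distance(events):
    """Root mean squared distance of the events from their mean center."""
    mx, my = mean_center(events)
    n = len(events)
    return math.sqrt(sum((x - mx) ** 2 + (y - my) ** 2 for x, y in events) / n)

# Hypothetical pattern: the four corners of a 2 x 2 square.
events = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
center = mean_center(events)       # center of the summary circle
radius = standard_distance(events) # radius of the summary circle
```

For this square the mean center is (1, 1) and every event lies at distance sqrt(2) from it, so the summary circle passes through all four corners.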
5. Density-based point pattern measures Crude density/Overall intensity: $\lambda = n / a$, the number of events n in the study region divided by its area a. The crude density changes depending on the study area
5. Density-based point pattern measures -Quadrat Count Methods • Exhaustive census of quadrats that completely fill the study region with no overlaps • The choice of origin, quadrat orientation, and quadrat size affects the observed frequency distribution • If quadrat size is too large, then ? • If quadrat size is too small, then?
5. Density-based point pattern measures -Quadrat Count Methods • 2. Random sampling approach is more frequently applied in fieldwork. • It is possible to increase the sample size simply by adding more quadrats (for sparse patterns) • May describe a point pattern without having complete data on the entire pattern.
5. Density-based point pattern measures -Quadrat Count Methods Other shapes of quadrats
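An exhaustive quadrat census can be sketched as follows. This assumes square-grid quadrats over a rectangular study region with its origin at (0, 0); the function name, grid parameters, and sample events are illustrative only.

```python
from collections import Counter

def quadrat_counts(events, width, height, nx, ny):
    """Count events in each cell of an nx-by-ny grid covering
    [0, width] x [0, height], with no gaps and no overlaps."""
    counts = [[0] * nx for _ in range(ny)]
    for x, y in events:
        col = min(int(x / width * nx), nx - 1)   # clamp events on the far edge
        row = min(int(y / height * ny), ny - 1)
        counts[row][col] += 1
    return counts

# Hypothetical pattern in a unit square, censused with a 2 x 2 grid.
events = [(0.1, 0.1), (0.2, 0.3), (0.9, 0.9), (0.6, 0.2)]
counts = quadrat_counts(events, 1.0, 1.0, 2, 2)
# Frequency distribution: how many quadrats hold 0, 1, 2, ... events.
frequency = Counter(c for row in counts for c in row)
```

Changing `nx` and `ny` (the quadrat size) or shifting the grid origin changes `frequency`, which is the sensitivity noted in the slides.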
5. Density-based point pattern measures -Density Estimation The pattern has a density at any location in the study region, not just at locations where there is an event. This density is estimated by counting the number of events in a region, or kernel, centered at the location where the estimate is to be made. Simple density estimation: $\hat{\lambda}_p = \frac{\#(S \in C(p, r))}{\pi r^2}$, where C(p, r) is a circle of radius r centered at the location of interest p
5. Density-based point pattern measures -Density Estimation Bandwidth r If r is too large, then ? If r is too small, then?
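The simple density estimate, and its dependence on the bandwidth r, can be tried out with a few lines of code. This is a sketch under the assumption that events exactly on the circle boundary count as inside; the sample events are invented.

```python
import math

def simple_density(p, events, r):
    """Number of events inside the circle C(p, r), divided by the circle's area."""
    px, py = p
    inside = sum(1 for x, y in events if math.hypot(x - px, y - py) <= r)
    return inside / (math.pi * r ** 2)

# Hypothetical pattern: three events near the origin and one far away.
events = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
small_r = simple_density((0.0, 0.0), events, 0.5)  # sees only the event at p
medium_r = simple_density((0.0, 0.0), events, 1.5) # sees the local cluster
large_r = simple_density((0.0, 0.0), events, 5.0)  # approaches the crude density
```

Evaluating the function on a grid of locations p with different values of r is one way to explore the two bandwidth questions posed above.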
5. Density-based point pattern measures -Density Estimation Density transformation 1) visualize a point pattern to detect hot spots 2) check whether or not that process is first-order stationary from the local intensity variations 3) Link point objects to other geographic data (e.g. disease and pollution)
5. Density-based point pattern measures -Density Estimation Kernel-density estimation (KDE) Kernel functions: weight nearby events more heavily than distant ones in estimating the local density • IDW • Spline • Kriging
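Kernel-density estimation replaces the all-or-nothing circle with a weight that decays with distance. The slides do not name a specific kernel, so the sketch below assumes a bivariate Gaussian kernel with bandwidth h; other kernels (for example, quartic) follow the same pattern.

```python
import math

def gaussian_kde(p, events, h):
    """Estimate density at p as a sum of Gaussian kernels, one per event.
    Nearby events contribute more than distant ones."""
    px, py = p
    total = 0.0
    for x, y in events:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2 * h * h))
    # Normalizing constant of a 2-D Gaussian, so each kernel integrates to 1.
    return total / (2 * math.pi * h * h)

# Hypothetical pattern: a tight cluster near the origin.
events = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3)]
near = gaussian_kde((0.1, 0.1), events, h=0.5)
far = gaussian_kde((3.0, 3.0), events, h=0.5)
```

Because each kernel integrates to 1, the estimated surface integrates to the number of events n over the whole plane.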
6. Distance-based point pattern measures • Look at the distances between events in a point pattern • More direct description of the second-order properties
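6. Distance-based point pattern measures -Nearest-Neighbor Distance Euclidean distance: $d(s_i, s_j) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$
6. Distance-based point pattern measures -Nearest-Neighbor Distance Mean nearest-neighbor distance: $\bar{d}_{min} = \frac{1}{n} \sum_{i=1}^{n} d_{min}(s_i)$, where $d_{min}(s_i)$ is the distance from event s_i to its nearest neighbor. If the pattern is clustered, does $\bar{d}_{min}$ have a higher or lower value?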
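The mean nearest-neighbor distance can be computed by brute force for small patterns. This sketch compares every pair of events, which is fine for a few hundred events but would need a spatial index for large data sets; the sample coordinates are made up.

```python
import math

def nearest_neighbor_distances(events):
    """Euclidean distance from each event to its closest other event."""
    return [
        min(math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(events) if j != i)
        for i, (xi, yi) in enumerate(events)
    ]

def mean_nn_distance(events):
    """Average of the nearest-neighbor distances over all events."""
    distances = nearest_neighbor_distances(events)
    return sum(distances) / len(distances)

# Hypothetical pattern: two close events and one isolated event on a line.
events = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
nn = nearest_neighbor_distances(events)
mean_nn = mean_nn_distance(events)
```

Here the nearest-neighbor distances are 1, 1, and 4, giving a mean of 2.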
6. Distance-based point pattern measures -Distance Functions: G function $G(d) = \frac{\#(d_{min}(s_i) < d)}{n}$: the fraction of events in the pattern whose nearest-neighbor distance is less than d.
6. Distance-based point pattern measures -Distance Functions: G function The shape of the G function tells us how the events are spaced in a point pattern. • If events are closely clustered together, G increases rapidly at short distances • If events tend to be evenly spaced, then G increases slowly up to the distance at which most events are spaced, and only then increases rapidly.
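The G function is the cumulative frequency distribution of nearest-neighbor distances. A minimal sketch, assuming the strict "less than d" convention and no edge correction; the sample events are invented.

```python
import math

def nearest_neighbor_distances(events):
    """Distance from each event to its closest other event."""
    return [
        min(math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(events) if j != i)
        for i, (xi, yi) in enumerate(events)
    ]

def g_function(events, d):
    """Fraction of events whose nearest-neighbor distance is less than d."""
    nn = nearest_neighbor_distances(events)
    return sum(1 for dist in nn if dist < d) / len(nn)

# Hypothetical pattern with nearest-neighbor distances 1, 1, and 4.
events = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
g_values = [g_function(events, d) for d in (0.5, 1.5, 4.5)]
```

Plotting `g_function` over a range of d values gives the curve whose shape is interpreted in the slide above.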
6. Distance-based point pattern measures -Distance Functions: F function • Three steps • 1) Randomly select m locations {p1, p2, …, pm} • 2) Calculate dmin(pi, S) as the minimum distance from location pi to any event in the point pattern S • 3) Calculate F(d): $F(d) = \frac{\#(d_{min}(p_i, S) < d)}{m}$
6. Distance-based point pattern measures -Distance Functions: F function • For clustered events, the F function rises slowly at first, but more rapidly at longer distances, because a good proportion of the study area is fairly empty. • For evenly distributed events, the F function rises rapidly at first, then slowly at longer distances.
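The three steps of the F function can be sketched as below. The code assumes a rectangular study region and draws the m random locations uniformly inside it; the function name, default m, and seed handling are illustrative choices.

```python
import math
import random

def f_function(events, d, m=1000, width=1.0, height=1.0, seed=0):
    """Fraction of m random locations whose nearest event is closer than d."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(m):
        # Step 1: pick a random location p in the study region.
        px, py = rng.uniform(0, width), rng.uniform(0, height)
        # Step 2: minimum distance from p to any event.
        dmin = min(math.hypot(px - x, py - y) for x, y in events)
        # Step 3: tally locations with dmin below the threshold d.
        if dmin < d:
            hits += 1
    return hits / m

# Hypothetical pattern: a single event in the middle of a unit square.
events = [(0.5, 0.5)]
f_small = f_function(events, 0.1)
f_large = f_function(events, 0.3)
```

Because the same seed is reused, both calls evaluate the same m locations, so the estimated F is non-decreasing in d.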
6. Distance-based point pattern measures -Comparisons between G and F functions
6. Distance-based point pattern measures -Distance Functions: K Function The nearest-neighbor distance and the G and F functions only make use of the nearest neighbor for each event or point in a pattern. This can be a major drawback, especially with clustered patterns where nearest-neighbor distances are very short relative to other distances in the pattern. K functions (Ripley 1976) are based on all the distances between events in S.
6. Distance-based point pattern measures -Distance Functions: K Function • Four steps • For a particular event, draw a circle centered at the event (si) and with a radius of d • Count the number of other events within the circle • Calculate the mean count of all events • This mean count is divided by the overall study area event density
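6. Distance-based point pattern measures -Distance Functions: K Function $K(d) = \frac{1}{\lambda} \cdot \frac{1}{n} \sum_{i=1}^{n} \#(S \in C(s_i, d))$, where C(s_i, d) is the circle of radius d centered at event s_i (the count excludes s_i itself) and $\lambda = n / a$ is the study area event density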
6. Distance-based point pattern measures -Distance Functions: K Function Clustered? Evenly distributed?
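The four steps of the K function translate into a short brute-force routine. This is a naive sketch with no edge correction, and it assumes events exactly at distance d count as inside the circle; the sample events and study area are invented.

```python
import math

def k_function(events, d, area):
    """Mean number of other events within distance d of an event,
    divided by the overall event density n / area."""
    n = len(events)
    density = n / area
    total = 0
    for i, (xi, yi) in enumerate(events):
        # Count the other events inside the circle of radius d around event i.
        total += sum(
            1 for j, (xj, yj) in enumerate(events)
            if j != i and math.hypot(xi - xj, yi - yj) <= d
        )
    return (total / n) / density

# Hypothetical pattern: three events in a study region of area 10.
events = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
k_short = k_function(events, 1.5, area=10.0)
k_long = k_function(events, 10.0, area=10.0)
```

Unlike G and F, every pairwise distance contributes here, so K keeps changing at distances well beyond the nearest-neighbor range.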
6. Distance-based point pattern measures -Edge effects Edge effects arise from the fact that events near the edge of the study area tend to have higher nearest-neighbor distances, even though they might have neighbors outside of the study area that are closer than any inside it.
7. Assessing Point Patterns Statistically A clustered pattern is likely to have a peaky density pattern, which will be evident in either the quadrat counts or in strong peaks on a kernel-density estimated surface. An evenly distributed pattern exhibits the opposite: an even distribution of quadrat counts or a flat kernel-density estimated surface, and relatively long nearest-neighbor distances. But how clustered? How dispersed?
7. Assessing Point Patterns Statistically -Quadrat Counts Independent random process (IRP) Complete spatial randomness (CSR) Under IRP/CSR, quadrat counts are expected to follow a Poisson distribution, for which the mean and the variance are equal. The variance/mean ratio (VMR) is therefore expected to be 1.0 if the distribution is Poisson. How about mean > variance? mean < variance?
7. Assessing Point Patterns Statistically -Quadrat Counts For a particular observation, Mean = number of events / number of quadrats = n / x, where n is the number of events and x is the number of quadrats (in the example, 10 / 8 = 1.25).
7. Assessing Point Patterns Statistically -Quadrat Counts Variance: for each count k, (number of quadrats with k events) * (k - mean)^2
k = 0: 2 * (0 - 1.25)^2 = 3.125
k = 1: 3 * (1 - 1.25)^2 = 0.1875
k = 2: 2 * (2 - 1.25)^2 = 1.125
k = 3: 1 * (3 - 1.25)^2 = 3.0625
Sum = 7.5; Variance = 7.5 / 8 = 0.9375
7. Assessing Point Patterns Statistically -Quadrat Counts VMR = Variance/Mean = 0.9375/1.25 = 0.75 Clustered? Random? Dispersed?
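The worked example can be reproduced from the quadrat frequency table (2 quadrats with 0 events, 3 with 1, 2 with 2, and 1 with 3). The sketch follows the slides in dividing the sum of squared deviations by the number of quadrats x; some texts divide by x - 1 instead, which would give a slightly different VMR.

```python
def variance_mean_ratio(frequency):
    """frequency maps k (events in a quadrat) to the number of quadrats
    containing exactly k events. Returns (mean, variance, VMR)."""
    x = sum(frequency.values())                    # number of quadrats
    n = sum(k * c for k, c in frequency.items())   # number of events
    mean = n / x
    variance = sum(c * (k - mean) ** 2 for k, c in frequency.items()) / x
    return mean, variance, variance / mean

# Frequency table from the slides.
mean, variance, vmr = variance_mean_ratio({0: 2, 1: 3, 2: 2, 3: 1})
```

A VMR near 1 is what a Poisson distribution of counts would produce, so the result is read against that benchmark.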
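7. Assessing Point Patterns Statistically -Nearest Neighbor Distances The expected value of the mean nearest-neighbor distance for an IRP/CSR is $E(d) = \frac{1}{2\sqrt{\lambda}}$, where $\lambda = n / a$ is the event density. The ratio R of the observed mean nearest-neighbor distance to this value, $R = \bar{d}_{min} / E(d) = 2\,\bar{d}_{min}\sqrt{\lambda}$, is used to assess the pattern. If R > 1 then dispersed, else if R < 1 then clustered?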
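The ratio R compares the observed mean nearest-neighbor distance with the IRP/CSR expectation 1 / (2 * sqrt(n / a)). The sketch below uses brute-force distances and no edge correction; the two sample patterns and their study areas are made up.

```python
import math

def nearest_neighbor_ratio(events, area):
    """Observed mean nearest-neighbor distance divided by its
    expected value under IRP/CSR for the same event density."""
    n = len(events)
    observed = sum(
        min(math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(events) if j != i)
        for i, (xi, yi) in enumerate(events)
    ) / n
    expected = 1 / (2 * math.sqrt(n / area))
    return observed / expected

# Hypothetical spread-out pattern in a region of area 12.
r_spread = nearest_neighbor_ratio([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)], area=12.0)
# Hypothetical tight cluster in a region of area 100.
r_cluster = nearest_neighbor_ratio(
    [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)], area=100.0
)
```

Note that R depends on the study area through the density, so the same events can yield different R values if the region boundary is drawn differently.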
7. Assessing Point Patterns Statistically -G and F Functions Clustered Evenly Spaced
7. Assessing Point Patterns Statistically -K Functions Under IRP/CSR, the expected value of the K function is $E[K(d)] = \pi d^2$