Spatial Analysis of Crime Data: A Case Study Mike Tischler Presented by Arnold Boedihardjo
Outline • Motivation • Spatial autocorrelation • Approach • Issues • Data sets
Motivation • Goal: reduce crime activity • Develop a tool to extract crime patterns • Allow visualization of patterns • Ultimately, predict crime occurrences
Spatial Autocorrelation • Tobler’s first law of geography: “everything is related to everything else, but near things are more related than distant things” • Possible causes of spatial dependency • Spatial causality: an object (event) is a direct cause of nearby objects (events) • Spatial correlation: nearby objects (events) behave similarly • Spatial interaction: movements of objects induce a relationship between objects in different locations
Approach • Provide a spatially based model to describe the density of incident objects (e.g., crime locations) within a given set of spatial objects • The density values are essentially probability values, and hence can be used as a predictive metric for future occurrences of incident objects
Example: When will the next crime happen? [Figure: map of Bank A, Bank B, Bank C, and two stores, with crime incidents marked "C"]
How to formalize our intuition in a probabilistic framework? • The probability of a crime occurring at bank C is higher than at the stores • Furthermore, that probability is equal to the probability at bank A and at bank B • How do we define the probabilities? • Kernel Density Estimation (KDE)
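As a minimal sketch of what a kernel density estimate computes, the snippet below evaluates a one-dimensional Gaussian KDE by hand. The Gaussian kernel, the fixed bandwidth, and the sample values are assumptions chosen for illustration; the slides do not say which kernel or bandwidth was used.

```python
import math

def gaussian_kde_1d(samples, x, bandwidth):
    """Gaussian kernel density estimate at x from 1-D samples."""
    total = 0.0
    for s in samples:
        # Scaled distance between the query point and this sample.
        u = (x - s) / bandwidth
        total += math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    # Average the kernels and rescale by the bandwidth.
    return total / (len(samples) * bandwidth)

# Hypothetical 1-D incident positions: three close together, one far away.
incidents = [0.8, 1.0, 1.2, 5.0]

dense = gaussian_kde_1d(incidents, 1.0, bandwidth=0.5)   # inside the cluster
sparse = gaussian_kde_1d(incidents, 3.0, bandwidth=0.5)  # between clusters
```

The estimate is larger where samples cluster and integrates to one over the whole line, which is what lets the density be read as a probability.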
Applying the KDE • Suppose that our sample set, S, is not the incident points but the pairwise distances from the incidents to their nearest-neighbor (NN) non-incident objects (e.g., banks and stores) • If we apply the KDE to S, the kernel functions will be centered at these pairwise distances and our query points will be transformed to the NN of the non-incident spatial objects • Formally, we have the following multivariate KDE
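The formula from this slide is not preserved in the transcript. For reference, the textbook multivariate KDE with a single bandwidth has the form below; it may differ from the exact expression on the original slide. The symbols are assumptions: $\mathbf{s}_i \in S$ are the $n$ samples (here, $d$-dimensional vectors of NN distances), $\mathbf{x}$ is a transformed query point, $h$ is the bandwidth, and $K$ is a kernel that integrates to one.

```latex
\hat{f}(\mathbf{x}) = \frac{1}{n h^{d}} \sum_{i=1}^{n} K\!\left( \frac{\mathbf{x} - \mathbf{s}_i}{h} \right)
```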
After applying the KDE, we have the following… [Figure: map of Bank A, Bank B, Bank C, and the two stores]
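The transformation described on the previous slides can be sketched as follows, under one possible reading: each incident is mapped to a vector holding its distance to the nearest bank and its distance to the nearest store, and a query location is mapped the same way before the KDE is evaluated. All coordinates, the store names, the Gaussian product kernel, and the bandwidth are hypothetical and are not taken from the slides.

```python
import math

def nn_distance(point, objects):
    """Distance from a point to its nearest neighbor (NN) among objects."""
    return min(math.dist(point, obj) for obj in objects)

def to_nn_distances(point, feature_layers):
    """Map a location to a vector of NN distances, one entry per feature layer."""
    return [nn_distance(point, layer) for layer in feature_layers]

def gaussian_kde(samples, query, bandwidth):
    """Multivariate KDE with a product Gaussian kernel and a single bandwidth."""
    dims = len(query)
    norm = len(samples) * (bandwidth * math.sqrt(2.0 * math.pi)) ** dims
    total = 0.0
    for sample in samples:
        sq = sum(((q - s) / bandwidth) ** 2 for q, s in zip(query, sample))
        total += math.exp(-0.5 * sq)
    return total / norm

# Hypothetical coordinates, loosely modeled on the example map.
banks = {"Bank A": (0.0, 0.0), "Bank B": (10.0, 0.0), "Bank C": (5.0, 8.0)}
stores = {"Store 1": (5.0, 3.0), "Store 2": (2.0, 6.0)}
incidents = [(0.3, 0.2), (-0.2, 0.4), (10.1, 0.3),
             (9.7, -0.2), (5.2, 8.1), (4.9, 7.6)]

layers = [list(banks.values()), list(stores.values())]

# Sample set S: each incident becomes (distance to NN bank, distance to NN store).
S = [to_nn_distances(c, layers) for c in incidents]

# Query each bank and store after applying the same transformation.
density = {
    name: gaussian_kde(S, to_nn_distances(xy, layers), bandwidth=1.0)
    for name, xy in {**banks, **stores}.items()
}
```

With these made-up incidents, all of which lie close to a bank, every bank receives a higher density than either store, matching the intuition from the example.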
Research Issues • How to select the features (e.g., banks, stores)? Employ notions of density attractors and repellers. • If the above is solved, how to improve the quality of the density estimates? Currently, an adaptive KDE approach is being tested. • How to incorporate temporal correlation? • Producing this model is computationally intensive: feature selection, NN search for every feature, and multiple queries on KDE
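The slides do not say which adaptive KDE is being tested. One common variant, shown here only as an illustration, is the sample-point estimator: each sample gets its own bandwidth, set to the distance to its k-th nearest other sample, so kernels are narrow in dense regions and wide in sparse ones. The function names, the choice of k, and the data are hypothetical.

```python
import math

def knn_bandwidths(samples, k, floor=1e-3):
    """Per-sample bandwidth: distance to the k-th nearest other sample."""
    bandwidths = []
    for i, s in enumerate(samples):
        dists = sorted(abs(s - t) for j, t in enumerate(samples) if j != i)
        # The floor guards against a zero bandwidth when samples coincide.
        bandwidths.append(max(dists[k - 1], floor))
    return bandwidths

def adaptive_kde_1d(samples, x, bandwidths):
    """Sample-point adaptive KDE: each sample carries its own bandwidth."""
    total = 0.0
    for s, h in zip(samples, bandwidths):
        u = (x - s) / h
        total += math.exp(-0.5 * u * u) / (h * math.sqrt(2.0 * math.pi))
    return total / len(samples)

# Hypothetical 1-D samples: a tight cluster plus one isolated point.
samples = [0.8, 1.0, 1.2, 5.0]
bandwidths = knn_bandwidths(samples, k=2)
value = adaptive_kde_1d(samples, 1.0, bandwidths)
```

The k-NN bandwidth pass is itself a nearest-neighbor search over the samples, so it adds to the computational cost noted on the slide.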
Data Set • Washington DC crime data • Crime incident reports in parseable formats: XML, Text/CSV, KML, or ESRI • Geographic feature layers are also available for download (could not verify, but was told by a very reliable source) • Other regional information is available (e.g., census tract) • http://data.octo.dc.gov