Map-Based Interactive Crime Data Exploitation Crimes in Washington D.C.
CS5524 Introduction to Data Mining
Naser Hdieb, Qiben Yan, Kaiqun Fu
December 13, 2012
Agenda
• INTRODUCTION
• PROPOSED APPROACH
• SYSTEM DESIGN
• IMPLEMENTATION
• CONCLUSION
• REFERENCES
• QUESTIONS
INTRODUCTION
[Figure: heat map of crimes in Washington D.C.]
INTRODUCTION
• Crime is a major challenge facing law enforcement agencies and the public
• Crime patterns are important for predicting future occurrences of offenses
• Some crimes are location related (e.g., around bars and banks)
• Data mining techniques are powerful tools for finding such patterns
• Goal: advising law enforcement and the public
• Association rule mining discovers frequently co-occurring items in a database and presents the associations between events
INTRODUCTION
• The association rule technique is used to determine crime patterns from spatio-temporal crime data
• The occurrence of some crimes can also be related to specific facilities and specific time periods
• For example, theft or robbery can be common and frequent around banks
• We implement an integrated data mining web application that helps identify crime patterns and extract useful information
Proposed Approach
• The data analysis engine consists of three major parts:
  • Data preprocessing
  • Ranking service
  • Association service
• Data sets used:
  • Crime data sets for the DC area
  • Spatial data sets for different facilities in the same DC area
Proposed Approach
• Crime Ranking Service:
  • Rank by crime type
  • Rank by TI (Time Interval):
    • Day: 8am to 4pm
    • Eve: 4pm to 12am (midnight)
    • Mid: 12am (midnight) to 8am
    • Unk: unknown
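A minimal sketch of the two rankings, assuming the Eve interval ends at midnight and the Mid interval runs from midnight to 8am. The function names and the sample incidents are hypothetical and only illustrate the idea; they are not taken from the project.

```python
from collections import Counter
from datetime import datetime
from typing import Optional


def time_interval(ts: Optional[datetime]) -> str:
    """Map a timestamp to Day, Eve, or Mid; a missing timestamp maps to Unk."""
    if ts is None:
        return "Unk"
    if 8 <= ts.hour < 16:   # 8am up to (not including) 4pm
        return "Day"
    if ts.hour >= 16:       # 4pm up to midnight
        return "Eve"
    return "Mid"            # midnight up to 8am


def rank(values):
    """Return (value, count) pairs, most frequent first."""
    return Counter(values).most_common()


# Hypothetical sample data: (offense, timestamp or None).
incidents = [
    ("theft", datetime(2012, 3, 1, 9, 30)),
    ("theft", datetime(2012, 3, 1, 17, 5)),
    ("robbery", datetime(2012, 3, 2, 23, 45)),
    ("theft", None),
    ("homicide", datetime(2012, 3, 3, 2, 10)),
]

by_type = rank(offense for offense, _ in incidents)
by_interval = rank(time_interval(ts) for _, ts in incidents)
```

Keeping the interval mapping in a single function means both rankings share one definition of the boundaries, so a change to the boundaries only has to be made in one place.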
Proposed Approach
• Multiple-Level Spatial Association Rules Discovery
  • e.g. close_to(shopping center) → has_crime(theft)
  • close_to(bank) → has_crime(robbery)
  • close_to(bank) ∧ is_in(mid) → has_crime(robbery)
• A concept hierarchy for facility:
  • (facility(hospital, school, bar, shopping center…))
• A concept hierarchy for crime:
  • (crime(theft, robbery, homicide…))
• A concept hierarchy for TI:
  • (TI(daytime(day), nighttime(evening, mid)))
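One way to derive predicates such as close_to(bank) from raw coordinates is a distance test. The sketch below assumes a great-circle (haversine) distance and a 200 m radius; the slides do not state how closeness was defined, so the radius, the field names, and the sample coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def predicates(incident, facilities, radius_m=200.0):
    """Build the predicate set for one incident (radius_m is an assumed threshold)."""
    preds = {f"has_crime({incident['offense']})", f"is_in({incident['shift']})"}
    for fac in facilities:
        d = haversine_m(incident["lat"], incident["lon"], fac["lat"], fac["lon"])
        if d <= radius_m:
            preds.add(f"close_to({fac['kind']})")
    return preds


# Hypothetical facilities and incident.
facilities = [
    {"kind": "bank", "lat": 38.9000, "lon": -77.0300},
    {"kind": "bar", "lat": 38.9200, "lon": -77.0300},
]
incident = {"offense": "robbery", "shift": "mid", "lat": 38.9005, "lon": -77.0300}
incident_predicates = predicates(incident, facilities)
```

Each incident then becomes one "transaction" of predicates, which is the input shape that association rule mining expects.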
Proposed Approach
[Figure: Coarse Level vs. Fine Level]
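The concept hierarchies listed earlier can be held as plain lookup tables that lift a fine-level predicate to its coarse-level parent. This is a sketch only: the tuple representation and the `generalize` helper are assumptions, and the tables contain just the values named on the slides plus "bank" from the rule examples.

```python
# Fine-level value -> coarse-level parent, one table per predicate name.
HIERARCHY = {
    "close_to": {
        "hospital": "facility",
        "school": "facility",
        "bar": "facility",
        "shopping center": "facility",
        "bank": "facility",
    },
    "has_crime": {"theft": "crime", "robbery": "crime", "homicide": "crime"},
    "is_in": {"day": "daytime", "evening": "nighttime", "mid": "nighttime"},
}


def generalize(predicate):
    """Lift a (name, value) predicate to the coarse level; unknown values pass through."""
    name, value = predicate
    return (name, HIERARCHY.get(name, {}).get(value, value))


fine_rule = [("close_to", "bank"), ("is_in", "mid"), ("has_crime", "robbery")]
coarse_rule = [generalize(p) for p in fine_rule]
```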
Proposed Approach
• Step 1: Prepare the spatial relationships for the two concept levels
• Step 2: Compute the counts of k-predicates at the coarse concept level (Apriori algorithm)
• Step 3: Filter out the predicates whose support or confidence is below the minimum thresholds at the first level
• Step 4: Compute the counts of k-predicates at the refined concept level (Apriori algorithm)
• Step 5: Find the large (frequent) predicates at the refined concept level and derive the association rules
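The five steps can be sketched in a few dozen lines. This is a simplified illustration, not the project's code: it filters on support at the coarse level, applies the confidence threshold only when rules are derived at the fine level, and uses a hypothetical `COARSE` table, thresholds, and transactions.

```python
from itertools import combinations

# Assumed fine -> coarse mapping (Step 1); values follow the concept hierarchies.
COARSE = {"bank": "facility", "bar": "facility", "theft": "crime", "robbery": "crime",
          "day": "daytime", "evening": "nighttime", "mid": "nighttime"}


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori-style search; returns {itemset: support count}."""
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent, k = dict(current), 2
    while current:
        items = sorted({i for s in current for i in s})
        # Keep a candidate only if all of its (k-1)-subsets are frequent.
        candidates = [frozenset(c) for c in combinations(items, k)
                      if all(frozenset(sub) in current for sub in combinations(c, k - 1))]
        current = {}
        for cand in candidates:
            n = sum(1 for t in transactions if cand <= t)
            if n >= min_support:
                current[cand] = n
        frequent.update(current)
        k += 1
    return frequent


def rules(frequent, min_confidence):
    """Derive rules of the form antecedent -> has_crime(x) from frequent itemsets."""
    found = []
    for itemset, count in frequent.items():
        for item in itemset:
            if item[0] != "has_crime" or len(itemset) < 2:
                continue
            antecedent = itemset - {item}
            confidence = count / frequent[antecedent]
            if confidence >= min_confidence:
                found.append((antecedent, item, count, confidence))
    return found


def mine_two_levels(transactions, min_sup_coarse, min_sup_fine, min_conf):
    def lift(p):
        return (p[0], COARSE.get(p[1], p[1]))
    # Steps 2-3: count at the coarse level and keep only frequent coarse predicates.
    coarse = frequent_itemsets([frozenset(lift(p) for p in t) for t in transactions],
                               min_sup_coarse)
    kept = {next(iter(s)) for s in coarse if len(s) == 1}
    # Steps 4-5: count at the fine level, using only predicates whose parent survived.
    refined = [frozenset(p for p in t if lift(p) in kept) for t in transactions]
    return rules(frequent_itemsets(refined, min_sup_fine), min_conf)


# Hypothetical transactions: one set of (name, value) predicates per incident.
T = [frozenset(t) for t in (
    [("close_to", "bank"), ("is_in", "mid"), ("has_crime", "robbery")],
    [("close_to", "bank"), ("is_in", "mid"), ("has_crime", "robbery")],
    [("close_to", "bank"), ("is_in", "day"), ("has_crime", "theft")],
    [("close_to", "bar"), ("is_in", "evening"), ("has_crime", "theft")],
)]
found = mine_two_levels(T, min_sup_coarse=2, min_sup_fine=2, min_conf=0.6)
```

On this toy data the daytime predicate is dropped at the coarse level, so no fine-level candidate involving is_in(day) is ever counted; pruning early in this way is the reason for mining the coarse level first.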
Layout Design
[Figure: page layout with a Control Panel Tab (Facility & Association Rule), an Ad Area, Control Buttons, and a Map Panel (Output)]
System Design
• Crime and Facility Data Importation
• Data Set Preprocessing
  • Filter out the incomplete data
  • Track obvious flaws in the data
• Visualization Component
  • JavaScript application layout
  • Google Maps interface
• Data Interaction
  • Ajax technology provided by jQuery & Ext JS
  • Data format transformation
Implementation
• Crime and Facility Data Importation
[Figure: raw data (.xml files) and the data after importation, inserted into the database (SQL Server)]
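A sketch of the importation step. Two things here are assumptions made so the example is self-contained: the XML element names (`incident`, `offense`, `shift`, `lat`, `lon`) are hypothetical, and an in-memory SQLite database stands in for the SQL Server database the project used.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample; the real .xml files may use a different schema.
SAMPLE_XML = """<incidents>
  <incident><offense>THEFT</offense><shift>DAY</shift><lat>38.9001</lat><lon>-77.0301</lon></incident>
  <incident><offense>ROBBERY</offense><shift>MID</shift><lat>38.9105</lat><lon>-77.0412</lon></incident>
</incidents>"""


def import_incidents(xml_text, conn):
    """Parse incident elements and insert them into a crime table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS crime (offense TEXT, shift TEXT, lat REAL, lon REAL)"
    )
    rows = []
    for node in ET.fromstring(xml_text).iter("incident"):
        rows.append((
            node.findtext("offense"),
            node.findtext("shift"),
            float(node.findtext("lat")),
            float(node.findtext("lon")),
        ))
    # Parameterized insert: values are bound, never spliced into the SQL text.
    conn.executemany("INSERT INTO crime VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for SQL Server
imported = import_incidents(SAMPLE_XML, conn)
```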
Implementation
• Data Set Preprocessing
  • Filter out the incomplete data
  • Track obvious flaws in the data
• Association Rule Database Generation
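The two preprocessing bullets might look like the following. The required fields and the rough bounding box around Washington D.C. are assumptions chosen for illustration; the slides do not say which checks the project applied.

```python
REQUIRED = ("offense", "shift", "lat", "lon")  # assumed mandatory fields

# Approximate bounding box for the DC area (assumed values, for illustration only).
DC_BOUNDS = {"lat": (38.79, 39.00), "lon": (-77.12, -76.90)}


def preprocess(records):
    """Drop incomplete records; separate records with implausible coordinates."""
    clean, flagged = [], []
    for rec in records:
        # Incomplete: a required field is missing or empty.
        if any(rec.get(field) in (None, "") for field in REQUIRED):
            continue
        # Obvious flaw: coordinates fall outside the study area.
        in_bounds = all(lo <= rec[key] <= hi for key, (lo, hi) in DC_BOUNDS.items())
        (clean if in_bounds else flagged).append(rec)
    return clean, flagged


# Hypothetical records.
records = [
    {"offense": "theft", "shift": "day", "lat": 38.90, "lon": -77.03},
    {"offense": "robbery", "shift": "", "lat": 38.91, "lon": -77.04},   # incomplete
    {"offense": "theft", "shift": "mid", "lat": 0.0, "lon": 0.0},       # bad coordinates
]
clean, flagged = preprocess(records)
```

Flagged records are returned rather than discarded so that they can be inspected before a decision is made about them.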
Implementation
• Visualization Component
  • JavaScript application layout
  • Google Maps interface
Implementation
• Data Interaction
  • Ajax technology provided by jQuery & Ext JS
  • Data format transformation
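As an illustration of the data format transformation, the sketch below turns database rows into a JSON payload that an Ajax request could return to the map. The slides do not name the wire format; GeoJSON is an assumption here, and `to_geojson` and the row layout are hypothetical.

```python
import json


def to_geojson(rows):
    """Convert (offense, shift, lat, lon) rows into a GeoJSON FeatureCollection string."""
    features = [
        {
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"offense": offense, "shift": shift},
        }
        for offense, shift, lat, lon in rows
    ]
    return json.dumps({"type": "FeatureCollection", "features": features})


payload = to_geojson([("THEFT", "DAY", 38.9001, -77.0301)])
```

The longitude-first ordering is easy to get wrong, since most map APIs take latitude first; converting in one place on the server keeps the client code simple.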
Results
Our website address: http://8.31.187.116
Welcome!
Results
Association Rule from Crime to Facilities
[Figure: support count result and confidence result]
Results
Association Rule from Facilities to Crimes
[Figure: support count result and confidence result]
Conclusion
• Our main objective is to analyze the given crime data set and present the information to users in a meaningful and more understandable form, so that law enforcement and the general public can use it to solve problems and make decisions concerning safety.
• To achieve this objective, we implemented a web-based application.
• We analyzed the data and used an association rule algorithm to discover spatio-temporal patterns, which are presented to the user through a visualization component.
References
• P. Mohan, S. Shekhar, J.A. Shine, and J.P. Rogers, "Cascading Spatio-temporal Pattern Discovery: A Summary of Results," Proceedings of the 10th SIAM Data Mining (SDM 2010), pp. 327-338, Columbus, Ohio, Apr. 29 - May 1, 2010.
• K. Koperski, J. Han, "Discovery of Spatial Association Rules in Geographic Information System," Proceedings of the 4th International Symposium on Advances in Spatial Databases (SSD 95), pp. 44-76, 1995.
• Y. Huang, S. Shekhar, H. Xiong, "Discovering Colocation Patterns from Spatial Data Sets: A General Approach," IEEE Transactions on Knowledge and Data Engineering, Vol. 16(12), pp. 1472-1485, Dec. 2004.
• E. Clementini, P.D. Felice, K. Koperski, "Mining Multiple-level Spatial Association Rules for Objects with A Broad Boundary," Journal of Data and Knowledge Engineering, Vol. 34(3), pp. 251-270, Sept. 2000.
• V. Bogorny, B. Kuijpers, L. Alvares, "Reducing Uninteresting Spatial Association Rules in Geographic Databases Using Background Knowledge: A Summary of Results," International Journal of Geographical Information Science, Vol. 22(4), pp. 361-386, April 2008.
• V. Bogorny, B. Kuijpers, L. Alvares, "Semantic-based Pruning of Redundant and Uninteresting Frequent Geographic Patterns," GeoInformatica, Vol. 14(2), pp. 201-220, April 2010.
• R. Agrawal and R. Srikant, "Fast Algorithms for Mining Association Rules in Large Databases," Proceedings of the 20th International Conference on Very Large Data Bases (VLDB 1994), Morgan Kaufmann Publishers Inc., pp. 487-499, 1994.
• X. Liu, C. Jian, C.T. Lu, "Demo Paper: A Spatio-Temporal Crime Search Engine," Proceedings of the 18th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM GIS 2010), pp. 528-529, San Jose, California, Nov. 2-5, 2010.
• H. Chen, J. Jie, Y. Qin, and M. Chau, "Crime Data Mining: A General Framework and Some Examples," IEEE Computer, Vol. 37(4), pp. 50-56, 2004.
• S. Shah, F. Bao, C.T. Lu, I.R. Chen, "CROWDSAFE: Crowd Sourcing of Crime Incidents and Safe Routing on Mobile Devices (Demo Paper)," Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM GIS 2011), pp. 521-524, Chicago, Illinois, Nov. 1-4, 2011.