Crime Forecasting Using Boosted Ensemble Classifiers

Presentation Transcript


  1. Crime Forecasting Using Boosted Ensemble Classifiers Presented by: Chung-Hsien Yu Advisor: Prof. Wei Ding Department of Computer Science, University of Massachusetts Boston 2012 GRADUATE STUDENTS SYMPOSIUM

  2. Abstract • Retaining spatiotemporal knowledge by applying multi-clustering to monthly aggregated crime data. • Training baseline learners on the clusters obtained from this step. • Adapting a greedy algorithm to find a rule-based ensemble classifier during each boosting round. • Pruning the ensemble classifier to prevent it from overfitting. • Constructing a strong hypothesis from the ensemble classifiers obtained in each round (see the round-structure sketch below).
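The five abstract bullets repeat the same per-round structure. As orientation, here is a minimal Python skeleton of that loop; every helper name and signature below is a placeholder of mine (the actual procedures appear on slides 8-17), not the authors' code.

    def boost(data, n_rounds, split, grow, prune, reweight, init_weights):
        # Round structure from the abstract: split the data into two
        # subsets, greedily grow ("chain") a rule-based ensemble on one,
        # prune it on the other, then reweight the entire data set.
        weights = init_weights(data)
        chains = []
        for _ in range(n_rounds):
            grow_set, prune_set = split(data)
            chain = grow(grow_set, weights)           # BuildChain (slide 14)
            chain = prune(chain, prune_set, weights)  # PruneChain (slide 15)
            weights = reweight(data, weights, chain)  # Update Weights (slide 16)
            chains.append(chain)
        return chains  # combined into the strong hypothesis (slide 17)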

  3. Original Data

  4. Aggregated Data [figure: crime counts aggregated into grid cells]

  5. Monthly Data [figure: monthly crime counts for each grid cell]

  6. Monthly Clusters (k=3)

  7. Monthly Clusters (k=4)
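As a rough illustration of the clustering step on the two slides above, the sketch below runs k-means with k = 3 over monthly grid-cell counts using scikit-learn. The grid size, the Poisson-sampled counts, and the choice of (row, col, count) features are assumptions for illustration only, not the authors' setup.

    import numpy as np
    from sklearn.cluster import KMeans

    # Hypothetical monthly aggregate: crime counts on a 10x10 grid of cells.
    rng = np.random.default_rng(0)
    counts = rng.poisson(lam=3.0, size=(10, 10))

    # One feature vector per cell: (row, col, count), so cells with similar
    # counts that are also spatially close tend to land in the same cluster.
    rows, cols = np.indices(counts.shape)
    X = np.column_stack([rows.ravel(), cols.ravel(), counts.ravel()])

    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
    print(labels.reshape(counts.shape))  # cluster label per grid cell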

  8. Flow Chart

  9. Algorithm (Part I)

  10. Algorithm (Part II)

  11. Confidence Value From AdaBoost (Schapire & Singer, 1998) we have Z_t = Σ_i D_t(i) · exp(−y_i · h_t(x_i)). Let h_t(x) = C_R for every x that satisfies the rule R, and ignore the boosting round t. C_R is defined as the confidence value for the rule R, and h(x) = 0 if x does not satisfy R.

  12. Objective Function Therefore, Z = W_0 + W_+ · e^(−C_R) + W_− · e^(C_R), where W_0 is the total weight of the examples not covered by R, and W_+ / W_− are the total weights of the covered examples that R classifies correctly / incorrectly.

  13. Minimum Z Value Z has the minimum value Z = W_0 + 2 · sqrt(W_+ · W_−) when C_R = (1/2) · ln(W_+ / W_−).
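A minimal numeric check of slides 11-13, assuming the standard confidence-rated formulation: W0 is the weight a rule R leaves uncovered, and W+ / W− are the weights of the covered examples R gets right / wrong. Function and variable names are mine.

    import math

    def rule_confidence(w_plus, w_minus):
        # C_R = (1/2) * ln(W+ / W-), the confidence value of rule R.
        return 0.5 * math.log(w_plus / w_minus)

    def min_z(w0, w_plus, w_minus):
        # At C_R = (1/2) ln(W+/W-), Z = W0 + W+ e^(-C_R) + W- e^(C_R)
        # reaches its minimum W0 + 2 * sqrt(W+ * W-).
        return w0 + 2.0 * math.sqrt(w_plus * w_minus)

    # Example: 20% of the weight uncovered, 70% covered and correct,
    # 10% covered and wrong.
    print(rule_confidence(0.7, 0.1))  # ~0.973
    print(min_z(0.2, 0.7, 0.1))       # ~0.729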

  14. BuildChain Function Repeatedly add a classifier to R until it maximizes Z~ = sqrt(W_+) − sqrt(W_−). This will minimize Z as well.
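A sketch of that greedy loop, assuming (as in SLIPPER-style rule boosting) that the quantity maximized is Z~ = sqrt(W+) − sqrt(W−); since Z = 1 − (sqrt(W+) − sqrt(W−))^2 when the weights are normalized, maximizing Z~ minimizes Z. The w_plus_of / w_minus_of callbacks, which score a candidate chain on GrowSet, are hypothetical.

    import math

    def build_chain(candidates, w_plus_of, w_minus_of):
        # Greedily extend the chain R with whichever candidate classifier
        # gives the largest Z_tilde = sqrt(W+) - sqrt(W-); stop when no
        # candidate improves it any further.
        chain, best = [], -math.inf
        while candidates:
            pick = None
            for c in candidates:
                z_tilde = (math.sqrt(w_plus_of(chain + [c]))
                           - math.sqrt(w_minus_of(chain + [c])))
                if z_tilde > best:
                    best, pick = z_tilde, c
            if pick is None:  # no improvement: the chain is finished
                break
            chain.append(pick)
            candidates.remove(pick)
        return chain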

  15. PruneChain Function Loss Function: L = (1 − V_+ − V_−) + V_+ · e^(−C_R) + V_− · e^(C_R). C_R is obtained from GrowSet; V_+ and V_− are obtained from applying R to PruneSet. Minimize L by removing the last classifier from R.
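A sketch of the pruning step under the same assumptions: C_R is fixed from GrowSet, V+ / V− are the weights of PruneSet examples the chain classifies correctly / incorrectly, and loss_of is a hypothetical callback that evaluates prune_loss for a given chain.

    import math

    def prune_loss(c_r, v_plus, v_minus):
        # L = (1 - V+ - V-) + V+ e^(-C_R) + V- e^(C_R), measured on PruneSet.
        return ((1.0 - v_plus - v_minus)
                + v_plus * math.exp(-c_r)
                + v_minus * math.exp(c_r))

    def prune_chain(chain, loss_of):
        # Remove trailing classifiers from R while doing so lowers the loss.
        while len(chain) > 1 and loss_of(chain[:-1]) <= loss_of(chain):
            chain = chain[:-1]
        return chain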

  16. Update Weights Calculate D_{t+1}(i) with the ensemble classifier R on the entire data set: D_{t+1}(i) = D_t(i) · exp(−y_i · h_t(x_i)) / Z_t, where h_t(x_i) = C_R if x_i satisfies R and h_t(x_i) = 0 otherwise.
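A sketch of this update with NumPy, assuming the standard confidence-rated rule; `covered` marks the examples the chain R fires on, and normalizing by the sum plays the role of dividing by Z_t.

    import numpy as np

    def update_weights(D, y, covered, c_r):
        # D_{t+1}(i) = D_t(i) * exp(-y_i h(x_i)) / Z, with h(x_i) = C_R
        # when R covers example i and h(x_i) = 0 otherwise.
        h = np.where(covered, c_r, 0.0)
        D_next = D * np.exp(-y * h)
        return D_next / D_next.sum()  # the sum is exactly Z

    D = np.full(4, 0.25)
    y = np.array([1.0, -1.0, 1.0, 1.0])
    covered = np.array([True, True, False, True])
    print(update_weights(D, y, covered, 0.973))  # the misclassified example gains weight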

  17. Strong Hypothesis At the end of boosting, there are T chains R_1, …, R_T, and the strong hypothesis is H(x) = sign( Σ_{t=1}^{T} h_t(x) ).
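A sketch of the final vote: one (covers, c_r) pair per boosting round, where covers is a predicate and c_r is that chain's confidence value; both names are mine. Each chain contributes its confidence when it covers x and zero otherwise.

    def strong_hypothesis(x, chains):
        # H(x) = sign(sum_t h_t(x)), with h_t(x) = c_r if chain t
        # covers x and h_t(x) = 0 otherwise.
        score = sum(c_r for covers, c_r in chains if covers(x))
        return 1 if score >= 0 else -1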

  18. SUMMARY Grid cells with similar crime counts that are clustered together are also geographically close to each other on the map. Moreover, the high-crime-rate and low-crime-rate areas are separated by the clustering. Each round, the original data set is randomly divided into two subsets. The greedy weak-learning algorithm uses confidence-rated evaluation to "chain" the baseline classifiers on one subset, and then "trims" the chain on the other. The strong hypothesis is easy to calculate.

  19. Q & A THANK YOU!!
