
Data Mining on New Road Prediction



Presentation Transcript


  1. Data Mining on New Road Prediction. By Qing Liu, Dec. 9, 2004

  2. Agenda
     • Introduction
     • Purpose
     • Input
     • Output - Data Mart
     • Techniques Applied
     • Result
     • Uncompleted
     • Learned From This Project
     • Questions?

  3. Introduction
     Caltrans’ interagency tracking system covers seven agencies:
     • ACOE, Army Corps of Engineers
     • CCC, CA Coastal Commission
     • DFG, CA Department of Fish and Game
     • EPA, Environmental Protection Agency
     • FWS, Fish and Wildlife Service
     • NOAA, National Oceanic & Atmospheric Administration
     • OHP, Office of Historic Preservation

  4. Purpose
     Only ACOE, CCC, FWS, and OHP applied for projects involving new road construction in the past few years. This project predicts the budget these agencies will need for road building over the next three years.

  5. Input

  6. Output - Data Mart (Star Schema)
     Fact table:
     • Project: autoID (PK), District, County, Route, Postmile…
     Dimension tables:
     • ACOE: IDNUM (FK), Office, App_date, Resp_date…
     • CCC: IDNUM (FK), Office, App_date, Resp_date…
     • FWS: IDNUM (FK), Office, App_date, Resp_date…
     • OHP: IDNUM (FK), Office, App_date, Resp_date…
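The star schema on this slide can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the column types, the inserted rows, and the helper name `build_data_mart` are hypothetical, and the fields elided with "…" on the slide are left out.

```python
import sqlite3


def build_data_mart():
    """Create an in-memory star schema: a Project fact table plus four agency tables."""
    conn = sqlite3.connect(":memory:")
    # Fact table; column types are assumptions, the slide only lists names.
    conn.execute(
        "CREATE TABLE Project ("
        "autoID INTEGER PRIMARY KEY, District TEXT, County TEXT, "
        "Route TEXT, Postmile REAL)"
    )
    # One table per agency, each pointing back to Project through IDNUM.
    for agency in ("ACOE", "CCC", "FWS", "OHP"):
        conn.execute(
            f"CREATE TABLE {agency} ("
            "IDNUM INTEGER REFERENCES Project(autoID), "
            "Office TEXT, App_date TEXT, Resp_date TEXT)"
        )
    return conn


conn = build_data_mart()
# Hypothetical rows, for illustration only.
conn.execute("INSERT INTO Project VALUES (1, '04', 'Alameda', '880', 12.5)")
conn.execute("INSERT INTO ACOE VALUES (1, 'San Francisco', '2003-01-15', '2003-03-01')")

# Join the fact table to one agency table on the foreign key.
rows = conn.execute(
    "SELECT p.District, p.Route, a.Office, a.App_date "
    "FROM Project p JOIN ACOE a ON a.IDNUM = p.autoID"
).fetchall()
print(rows)
```

Keeping one table per agency mirrors the slide; a single agency table with an extra agency-name column would be an equally reasonable layout.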

  7. Techniques Applied
     • Data source: PD_test.mdb (636 cases)
     • Transformation 1: PD_test.xls
     • Transformation 2: Project.xls, Acoe.xls, ccc.xls, Fws.xls, Ohp.xls (428 cases)
     • Transport and convert: Agencies.csv, District.csv
     • Load: WEKA
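The convert step can be illustrated with a short sketch that writes records to CSV text with a header row, the layout a CSV loader such as WEKA's expects. The field names, the sample records, and the rule of skipping rows with missing values are assumptions for illustration; the slides do not say how the case count was reduced.

```python
import csv
import io


def to_csv_text(records, fields):
    """Write complete records as CSV text with a header row.

    Rows with a missing or empty value in any field are skipped
    (an assumed cleaning rule, not one stated in the slides).
    Returns the CSV text and the number of rows kept.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\r\n")  # CRLF, as in MS-DOS CSV
    writer.writerow(fields)
    kept = 0
    for rec in records:
        values = [rec.get(f) for f in fields]
        if any(v is None or v == "" for v in values):
            continue
        writer.writerow(values)
        kept += 1
    return buf.getvalue(), kept


# Hypothetical records, for illustration only.
records = [
    {"District": "04", "Agency": "ACOE", "Year": 2003},
    {"District": "07", "Agency": "", "Year": 2002},  # incomplete, skipped
    {"District": "11", "Agency": "OHP", "Year": 2004},
]
text, kept = to_csv_text(records, ["District", "Agency", "Year"])
print(text)
```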

  8. Techniques Applied (continued)
     • Access query – cleaning and calculation
     • Convert – CSV (MS-DOS)
     • Filters – supervised → attribute → ClassOrder
     • Classifier – LinearRegression & J48
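To make the regression step concrete, here is a minimal sketch of ordinary least squares with a single predictor, used to extrapolate three years ahead. It is a stand-in written in plain Python, not WEKA's LinearRegression implementation, and the yearly budget figures are hypothetical values chosen for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical yearly budgets (in millions), for illustration only.
years = [2001, 2002, 2003, 2004]
budgets = [10.0, 12.0, 14.0, 16.0]

slope, intercept = fit_line(years, budgets)
# Extrapolate the fitted line to the next three years.
forecast = {year: slope * year + intercept for year in (2005, 2006, 2007)}
print(forecast)
```

A one-variable fit like this only captures a trend over time; the slides suggest the real model was trained on the project and agency attributes loaded into WEKA.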

  9. Result

  10. Result (continued)

  11. Result (continued)

  12. Uncompleted
      • Get the cost-per-mile information
      • Add a new field, “predict_cost”
      • Run the data through WEKA and get the final result
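The unfinished step of adding a “predict_cost” field might look like the sketch below. The slides do not say how the field would be computed, so the formula (road length in miles times a per-district cost per mile), the field names `miles` and `district`, and all figures are assumptions for illustration.

```python
def add_predict_cost(projects, cost_per_mile):
    """Return copies of the project records with a 'predict_cost' field added.

    Assumes predict_cost = miles * cost per mile for the project's district;
    this formula is an illustration, not taken from the slides.
    """
    result = []
    for proj in projects:
        updated = dict(proj)  # copy so the input records are left unchanged
        updated["predict_cost"] = proj["miles"] * cost_per_mile[proj["district"]]
        result.append(updated)
    return result


# Hypothetical projects and rates, for illustration only.
projects = [
    {"autoID": 1, "district": "04", "miles": 2.0},
    {"autoID": 2, "district": "07", "miles": 0.5},
]
cost_per_mile = {"04": 1_500_000.0, "07": 2_000_000.0}

with_cost = add_predict_cost(projects, cost_per_mile)
print(with_cost)
```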

  13. Learned From This Project
      • Applying data mining techniques to a real database
      • How to find the right algorithm and model
      • The power of data mining in prediction
      • How to use WEKA

  14. Reference
      • http://prdownloads.sourceforge.net/weka/weka.ppt
      • “Data Mining: Concepts and Techniques” by Jiawei Han and Micheline Kamber, Morgan Kaufmann, 2001.
      • “Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations” by Ian H. Witten and Eibe Frank, Morgan Kaufmann, 2000.

  15. Questions?
