Data Mining on New Road Prediction
By Qing Liu
Dec. 9, 2004
Agenda
• Introduction
• Purpose
• Input
• Output - Data Mart
• Techniques Applied
• Result
• Uncompleted
• Learned From This Project
• Questions?
Introduction
Caltrans' interagency tracking system covers seven agencies:
• ACOE, Army Corps of Engineers
• CCC, CA Coastal Commission
• DFG, CA Department of Fish and Game
• EPA, Environmental Protection Agency
• FWS, Fish and Wildlife Service
• NOAA, National Oceanic & Atmospheric Administration
• OHP, Office of Historic Preservation
Purpose
Only ACOE, CCC, FWS, and OHP applied for projects involving new road construction in the past few years. This project predicts the budget these agencies will need for building roads over the next three years.
Output - Data Mart
Star schema
Fact table
• Project: autoID (PK), District, County, Route, Postmile…
Dimension tables
• ACOE: IDNUM (FK), Office, App_date, Resp_date…
• FWS: IDNUM (FK), Office, App_date, Resp_date…
• CCC: IDNUM (FK), Office, App_date, Resp_date…
• OHP: IDNUM (FK), Office, App_date, Resp_date…
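The relationship between the Project table and one agency table can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table and column names come from the slide, while the column types, the in-memory database, and the sample rows are assumptions (the original data mart was built from Access and Excel files).

```python
import sqlite3

# In-memory database. Table and column names follow the slide;
# column types and sample rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Project (
    autoID   INTEGER PRIMARY KEY,
    District INTEGER,
    County   TEXT,
    Route    TEXT,
    Postmile REAL
);
CREATE TABLE ACOE (
    IDNUM     INTEGER REFERENCES Project(autoID),
    Office    TEXT,
    App_date  TEXT,
    Resp_date TEXT
);
""")

# Made-up placeholder rows, not values from the real database.
conn.execute("INSERT INTO Project VALUES (1, 4, 'County A', 'Route 1', 12.5)")
conn.execute(
    "INSERT INTO ACOE VALUES (1, 'Office A', '2003-01-15', '2003-03-01')"
)

# Join the agency table to the Project table through IDNUM -> autoID.
rows = conn.execute("""
    SELECT p.autoID, p.District, a.Office, a.App_date, a.Resp_date
    FROM Project AS p
    JOIN ACOE AS a ON a.IDNUM = p.autoID
""").fetchall()
print(rows)
```

The FWS, CCC, and OHP tables would join to Project in the same way through their own IDNUM columns.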
Techniques Applied
• Data source: PD_test.mdb (636 cases)
• Transformation 1: PD_test.xls
• Transformation 2: Project.xls, Acoe.xls, ccc.xls, Fws.xls, Ohp.xls (428 cases)
• Transport and convert: Agencies.csv, District.csv
• Load: WEKA
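The final "convert" step, producing a CSV file with a header row that a tool such as WEKA can load, might look like the sketch below. The helper name `rows_to_csv`, the column names, and the sample rows are hypothetical; the project itself exported its CSV files from Excel.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dict rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample rows; the real data came from PD_test.mdb.
sample = [
    {"District": 1, "Agency": "ACOE"},
    {"District": 2, "Agency": "CCC"},
]
csv_text = rows_to_csv(sample, ["District", "Agency"])
print(csv_text)
```

The header row matters because CSV loaders typically take attribute names from the first line of the file.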
Techniques Applied (continued)
• Access queries: cleaning and calculation
• Convert: CSV (MS-DOS)
• Filter: supervised attribute filter ClassOrder
• Classifiers: LinearRegression and J48
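To illustrate the idea behind a linear-regression prediction, here is a minimal single-variable least-squares fit in plain Python. It is not WEKA's LinearRegression implementation, which handles multiple attributes; the function name `fit_line` and the data points are made up for the example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up yearly figures, not project data.
years = [1, 2, 3, 4]
costs = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(years, costs)
# Extrapolate one step beyond the observed years.
predicted = slope * 5 + intercept
print(slope, intercept, predicted)
```

Predicting a budget for future years is an extrapolation of this kind, so the fitted line is only as trustworthy as the assumption that past trends continue.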
Uncompleted
• Get the cost-per-mile information
• Add a new field, "predict_cost"
• Run the data through WEKA and get the final result
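One possible way to fill a "predict_cost" field once a cost-per-mile figure is available is sketched below. The slides do not say how the field would be computed, so the formula (length in miles times cost per mile), the `miles` field, the helper name, and all numbers are assumptions for illustration only.

```python
def add_predict_cost(projects, cost_per_mile):
    """Return copies of the project rows with a predict_cost field added.

    Assumes each row has a hypothetical 'miles' field giving road length.
    """
    return [
        {**project, "predict_cost": project["miles"] * cost_per_mile}
        for project in projects
    ]


# Made-up rows and a made-up cost-per-mile figure.
projects = [
    {"autoID": 1, "miles": 2.0},
    {"autoID": 2, "miles": 0.5},
]
with_cost = add_predict_cost(projects, cost_per_mile=100.0)
print(with_cost)
```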
Learned From This Project
• How to apply data mining techniques to a real database
• How to find the right algorithm and model
• The power of data mining in prediction
• How to use WEKA
References
• http://prdownloads.sourceforge.net/weka/weka.ppt
• "Data Mining: Concepts and Techniques" by Jiawei Han and Micheline Kamber, Morgan Kaufmann, 2001.
• "Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations" by Ian H. Witten and Eibe Frank, Morgan Kaufmann, 2000.
Questions?