SUM 2012, Marburg, Germany. Sixth International Conference on Scalable Uncertainty Management. An Attempt to Employ Genetic Fuzzy Systems to Predict from a Data Stream of Premises Transactions. Research team: Bogdan Trawiński, Tadeusz Lasota, Magdalena Smętek, Grzegorz Trawiński
SUM 2012, Marburg, Germany. Sixth International Conference on Scalable Uncertainty Management
An Attempt to Employ Genetic Fuzzy Systems to Predict from a Data Stream of Premises Transactions
Research team: Bogdan Trawiński, Tadeusz Lasota, Magdalena Smętek, Grzegorz Trawiński
Wrocław University of Environmental and Life Sciences, Poland; Wrocław University of Technology, Poland
Agenda
• Motivation & goals
• Architecture of the prospect system
• Outline of the method
• Experimental setup
• Results of experiments
• Conclusions
• Future work
Motivation & goals
• General goal: which machine learning algorithms are most appropriate for an internet system that aids real estate appraisal?
• We attempt to work out an intermediate method between evolving fuzzy systems and static fuzzy systems.
• We apply nonincremental genetic fuzzy systems to build reliable predictive models from a data stream.
• The presented method uses aged models to compose ensembles and corrects the output of the component models by means of trend functions that reflect changes in market prices over time.
Architecture of the prospect system
Fig. 1. Schema of automated data-driven system for property valuation
Outline of the Method
Fig. 2. Change trend of average transactional prices per square metre over time
Outline of the Method
Fig. 3. GFS ensemble approach to predict from a data stream
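The ensemble idea named in Fig. 3 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `train_model` is a hypothetical stand-in (a mean-price-per-square-metre predictor rather than a genetic fuzzy system), the windows contain made-up numbers, and the plain arithmetic mean is an assumed aggregation rule.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, float]      # (area, price): hypothetical sample layout
Model = Callable[[float], float]  # maps area to a predicted price


def train_model(window: Sequence[Sample]) -> Model:
    """Hypothetical stand-in for GFS training: area times mean price per unit area."""
    rate = mean(price / area for area, price in window)
    return lambda area: area * rate


def build_ensemble(windows: Sequence[Sequence[Sample]]) -> List[Model]:
    """Train one model per successive data window; older (aged) models are kept."""
    return [train_model(w) for w in windows]


def predict(ensemble: Sequence[Model], area: float) -> float:
    """Aggregate the component outputs with an unweighted mean (assumed rule)."""
    return mean(model(area) for model in ensemble)


# Made-up data: two successive windows with different price levels.
windows = [
    [(50.0, 100_000.0), (40.0, 80_000.0)],
    [(50.0, 150_000.0), (60.0, 180_000.0)],
]
ensemble = build_ensemble(windows)
estimate = predict(ensemble, 50.0)
```

In this toy data the older model pulls the estimate towards the earlier price level, which is the kind of ageing effect that a trend-based correction of component outputs is meant to counter.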
Outline of the Method
Fig. 4. The idea of correcting the output of aged models ("delta" method)
Outline of the Method
Fig. 5. The idea of correcting the output of aged models ("ratio" method)
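One plausible reading of the "delta" and "ratio" corrections, given only the figure captions, is sketched below. It assumes a trend function `trend(t)` expressed in the same units as the model output (here price per square metre): the delta variant shifts the aged model's output by the change in the trend since the model was built, and the ratio variant scales it by the relative change. The function names, the linear trend, and all numbers are illustrative assumptions, not values from the slides.

```python
from typing import Callable

Trend = Callable[[float], float]  # time (in months) -> trend value


def correct_delta(pred: float, trend: Trend, t_model: float, t_now: float) -> float:
    """Add the trend change between model creation and the current time."""
    return pred + (trend(t_now) - trend(t_model))


def correct_ratio(pred: float, trend: Trend, t_model: float, t_now: float) -> float:
    """Multiply by the ratio of the current trend value to the one at creation."""
    return pred * (trend(t_now) / trend(t_model))


def linear_trend(t: float) -> float:
    """Made-up linear trend: 2000 per square metre at t=0, rising by 100 a month."""
    return 2000.0 + 100.0 * t


aged_prediction = 2100.0  # hypothetical output of a model built at t=0
delta_corrected = correct_delta(aged_prediction, linear_trend, 0.0, 12.0)
ratio_corrected = correct_ratio(aged_prediction, linear_trend, 0.0, 12.0)
```

The two variants agree only when the aged prediction equals the trend value at model creation; otherwise the ratio variant amplifies or dampens the adjustment in proportion to the prediction.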
Datasets used
• Dataset: real-world data from a cadastral system
• 5213 samples (transactions made during 1998-2008)
• Input variables: the following five attributes were pointed out as the main price drivers by professional appraisers:
  • usable area of a flat (Area),
  • age of a building construction (Age),
  • number of storeys in the building (Storeys),
  • number of rooms in the flat including a kitchen (Rooms),
  • the distance of the building from the city centre (Centre).
• Output variable: price of premises (Price).
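The record layout described above could be represented as in the sketch below. The field names follow the attribute labels on the slide; the field types, the `sold_on` date field, and the `select_window` helper are hypothetical additions for illustration, since the slides do not specify units or how the data windows are cut.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Sequence


@dataclass(frozen=True)
class Transaction:
    area: float     # usable area of the flat (Area)
    age: int        # age of the building construction (Age)
    storeys: int    # number of storeys in the building (Storeys)
    rooms: int      # number of rooms including a kitchen (Rooms)
    centre: float   # distance from the city centre (Centre)
    price: float    # price of the premises (Price), the output variable
    sold_on: date   # transaction date (hypothetical field, used for windowing)


def month_index(d: date) -> int:
    """Number of months since year 0, so that months can be compared as integers."""
    return d.year * 12 + (d.month - 1)


def select_window(data: Sequence[Transaction], start: date, months: int) -> List[Transaction]:
    """Return the transactions sold within `months` calendar months from `start`."""
    first = month_index(start)
    return [t for t in data if first <= month_index(t.sold_on) < first + months]


# Made-up records, for illustration only.
data = [
    Transaction(50.0, 30, 4, 3, 2.5, 100_000.0, date(2000, 1, 15)),
    Transaction(42.0, 12, 10, 2, 4.0, 95_000.0, date(2000, 11, 3)),
    Transaction(65.0, 5, 5, 4, 6.1, 180_000.0, date(2001, 2, 20)),
]
year_window = select_window(data, date(2000, 1, 1), 12)
quarter_window = select_window(data, date(2000, 1, 1), 3)
```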
Experimental setup (1st series)
Fig. 6. Schema of experiments ("delta" method)
Results of experiment (1st series)
Fig. 7. Performance of ageing single models trained over 12-month data windows
Results of experiment (1st series)
Fig. 8. Performance of ensembles comprising GFSs trained over 12-month data windows
Results of experiment (1st series)
Fig. 9. Performance comparison of ensembles trained over 3- and 12-month data windows
Experimental setup (2nd series)
Fig. 10. Schema of experiments ("delta" method, Age Ti, Beg Ti, where i is the polynomial degree, 1-4)
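A polynomial trend of a given degree can be fitted by ordinary least squares, as in the dependency-free sketch below. The slides do not state how the trend functions were computed or how the shorter (Age) and longer (Beg) intervals are delimited, so the fitting procedure, the helper names `polyfit` and `polyval`, the split into "last six months" versus "whole series", and the price series itself are all assumptions made for illustration.

```python
from typing import List, Sequence


def polyfit(xs: Sequence[float], ys: Sequence[float], degree: int) -> List[float]:
    """Least-squares coefficients c[0] + c[1]*x + ... solved via the normal equations."""
    n = degree + 1
    # Augmented matrix [A^T A | A^T y] for the Vandermonde design matrix A.
    a = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def polyval(coeffs: Sequence[float], x: float) -> float:
    """Evaluate the polynomial with Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


# Made-up monthly average prices: flat for half a year, then rising by 100 a month.
months = [float(t) for t in range(12)]
prices = [2000.0 if t <= 6 else 2000.0 + 100.0 * (t - 6) for t in range(12)]

long_trend = polyfit(months, prices, 1)            # fitted over the whole series
short_trend = polyfit(months[6:], prices[6:], 1)   # fitted over the last six months
```

On this series the short-interval fit recovers the recent slope of 100 per month, while the long-interval fit averages in the flat period and yields a smaller slope. For higher degrees or long time axes the normal equations become ill-conditioned, so a library routine would be the safer choice in practice.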
Results of experiment (2nd series)
Fig. 11. Performance of GFS ensembles for trends over shorter intervals (Age Ti)
Results of experiment (2nd series)
Fig. 12. Performance of GFS ensembles for trends over longer intervals (Beg Ti)
Results of experiment (2nd series)
Fig. 13. Performance comparison of GFS ensembles for trends over shorter (AgeT1) and longer (BegT3) intervals and without trend update (NoT)
Experimental setup (3rd series)
Fig. 14. Schema of experiments ("delta" and "ratio" methods, Age Ti, Beg Ti, where i is the polynomial degree, 1-4)
Results of experiment (3rd series)
Fig. 15. Performance of ensembles comprising 12 models
Results of experiment (3rd series): Friedman nonparametric tests
Fig. 16. Average rank positions of ensembles for individual methods determined during Friedman test
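For readers unfamiliar with the test, average rank positions and the Friedman chi-square statistic can be computed as in the sketch below. The error table is made up purely to exercise the code and does not reproduce any result from the experiments; rank 1 is assumed to mean the lowest error, and the column meanings are hypothetical.

```python
from typing import List, Sequence


def ranks(row: Sequence[float]) -> List[float]:
    """Rank the values in one row (1 = smallest); ties share the average rank."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    result = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def average_ranks(errors: Sequence[Sequence[float]]) -> List[float]:
    """Mean rank of each method (column) over all test cases (rows)."""
    per_row = [ranks(row) for row in errors]
    return [sum(r[j] for r in per_row) / len(errors) for j in range(len(errors[0]))]


def friedman_statistic(errors: Sequence[Sequence[float]]) -> float:
    """Friedman chi-square: 12N/(k(k+1)) * (sum of squared mean ranks - k(k+1)^2/4)."""
    n, k = len(errors), len(errors[0])
    avg = average_ranks(errors)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in avg) - k * (k + 1) ** 2 / 4)


# Hypothetical errors: rows are test cases, columns are three compared methods.
errors = [
    [0.10, 0.20, 0.30],
    [0.12, 0.18, 0.25],
    [0.11, 0.21, 0.19],
    [0.09, 0.15, 0.22],
]
mean_ranks = average_ranks(errors)
statistic = friedman_statistic(errors)
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom; a lower average rank indicates a method that was more often among the most accurate.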
Conclusions
• An approach that applies ensembles of genetic fuzzy systems to aid residential premises valuation was proposed.
• The approach consists in incrementally expanding an ensemble with models generated systematically over time.
• The output that aged component models produce for current data is updated according to a trend function reflecting the changes in premises prices since the moment each model was generated.
• An experimental evaluation of the proposed method, using real-world data taken from a dynamically changing real estate market, revealed its advantage in terms of predictive accuracy.
Future work
• Further investigation is planned to explore the intrinsic structure of component models, i.e. their knowledge and rule bases, as well as their generation efficiency, interpretability, overfitting, and outlier issues.
• Moreover, the weighting of component models requires more thorough investigation. We will consider weights proportional to the ageing time and the accuracy of component models.
Thank you very much for your attention