1 / 36

Over-parameterisation and model reduction

Over-parameterisation and model reduction. Neil Crout Environmental Science School of Biosciences University of Nottingham. Co-workers. Davide Tarsitano Glen Cox James Gibbons Andy Wood Jim Craigon Steve Ramsden Yan Jiao Tim Reid with thanks to BBSRC and Leverhulme. Motivation.

tricia
Download Presentation

Over-parameterisation and model reduction

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Over-parameterisation and model reduction Neil Crout Environmental Science School of Biosciences University of Nottingham

  2. Co-workers • Davide Tarsitano • Glen Cox • James Gibbons • Andy Wood • Jim Craigon • Steve Ramsden • Yan Jiao • Tim Reid • with thanks to BBSRC and Leverhulme

  3. Motivation

  4. Over-parameterisation • Increase the complexity of the model to improve agreement with observations • Can continue this until ALL the variation in the data is accounted for • Including the noise! • Models can become overparameterised

  5. Over-parameterisation • Judgements have to be made about the level of complexity in a model (parsimony) • In a statistical context methods for model selection exist • Key feature - comparing models • Key feature - comparing models

  6. So what! • Our models are not statistical • Mechanistic • No fitting involved • Predicting lots of different things (multi-variate) • Have to account for lots of different cases • Did we mention they’re mechanistic • Newton • Thermodynamics • etc all on our side

  7. Overparameterisation and mechanistic models? • Mechanistically inspired – but empirically implemented • So normal statistical rules of business apply • Model parameters are often not formally fitted • worse they may be ‘tuned’ • might be measured (but model structure may not be consistent with real value) • Often (maybe always) models are implicitly fitted • comparisons are made with the real world • model changes • development iterates • But normal rules of business apply – greater complexity may not equal greater realism

  8. Overparameterisation and mechanistic models? • Mechanistic models have the potential for over-parameterisation • This is a source of uncertainty • Perhaps of equal importance • Mechanistic models are cited as tools to test understanding • For this we do need to know whether a model’s formulation really can be supported by observations

  9. Model Reduction – Why? • Judgements have to be made about the level of complexity in a model • Mostly this is subjective • supported by tools such as sensitivity analysis etc • sometimes by consideration of alternative model formulations • but this is generally difficult…. • Want some statistical support for model selection • But that requires a set of alternative models to compare

  10. Model Reduction – Why? • Want to assess whether a model has the most appropriate level of detail • We can get some idea on this by comparing models of the same system which have different levels of detail • But, we don’t have lots of different models • So we consider reducing the model we have to create alternative model formulations which can be compared • Comparison is restricted • reduced models are drawn from the same source • but hopefully better than no comparison at all

  11. Some illustrations

  12. Example: radioacaesium uptake

  13. ‘Absalom’ radiocaesium uptake model % clay K+ pH % C θclay RIPclay(2) θhumus Kx soil CEChumus M_camg CECclay Kx humus(2) Kdhumus NH4 Kdclay(1) mK(1) Kdr Kdl CF(2) D factor Cssol Csplant Cssoil

  14. Model Reduction – How? • Start with a ‘full’ model of a system • Reduce it (systematically, automatically) • Producing a ‘set’ of alternative model formulations for the same system • Reduction? • Inter-connected nature of typical models mean can’t simply leave things out • We replace variables with a constant • e.g. the mean or median that the variable attains in the full model

  15. Reduction by variable replacement • Suppose the full model has a variable Vi • Which is a function of the model variables, inputs, and parameters • Arrange so that the model has • Where  is either 0 (no replacement) or 1 (replaced) • Systematically reducing the model then involves setting the values of the vector 

  16. Search the model replacement space • Search the combination of possible models (i.e. varying ) • exhaustive if computationally feasible • stochastic search (e.g. MCMC) can be more efficient • discrete space • Calculate a posterior probability for each model configuration • This is based on comparison with observations • So this is a data-led approach • Data has to be representative for analysis to be representative • Certainly need to be aware of the data’s limitations

  17. 10 candidate variables – 210 combinations % clay K+ pH % C θclay RIPclay(2) θhumus Kx soil CEChumus M_camg CECclay Kx humus(2) Kdhumus NH4 Kdclay(1) mK(1) Kdr Kdl CF(2) D factor Cssol Csplant Cssoil

  18. Posterior probability for top 10 models Full Model

  19. Full model (as published) % clay K+ pH % C θclay RIPclay(2) θhumus Kx soil CEChumus M_camg CECclay Kx humus(2) Kdhumus NH4 Kdclay(1) mK(1) Kdr Kdl CF(2) D factor Cssol Csplant Cssoil

  20. this one works better… % clay K+ pH % C θclay RIPclay(2) θhumus Kx soil CEChumus M_camg CECclay Kx humus(2) Kdhumus NH4 Kdclay(1) mK(1) Kdr Kdl CF(2) D factor Cssol Csplant Cssoil

  21. Better? • Reduced uncertainty, better predicting models • Tested with independent data • Requiring fewer inputs • Specifically soil pH • Model is used spatially, so this is a good thing operationally • Greater confidence that the model is not over-parameterised….? • Diagnostic to guide/support model development

  22. Methane Emissions from Wetlands

  23. Methane emission from wetlands

  24. Top ‘performing’ models Full Model

  25. Mechanistic Interpretation • Model diagnostics – present results as probability of replacing specific variables • Summing over the model combinations explored • seasonal transfer of plant matter to soil organic matter (100%) • reducing methane oxidation from mechalis-menten to first order (39%) • temperature dependency of methane oxidation (15%) • seasonal dependency of plant growth (99%).

  26. C & N dynamics in the arctic • N15 applied to field plots • 3 years subsequent measurement • used to parameterise a model based on MBL-GEM

  27. C & N dynamics in the arctic

  28. 14 candidate variables (16k combinations)

  29. Ensemble prediction over set of models

  30. Sirius – Wheat Crop Growth Model

  31. Roots Vernalisation EVAPO CANOPY SOIL.AW[1..30] TempMax TempMin exw[1..30] Rootlen Potlfno TTOPT TTFIX TTOUT SLOSL ECUM TTOUT Amnlfno Soilmax aw[1..30 AVWATER WATCON SHOUT CanTemp PrimordNo Vprog TTROOT SoilMin Def Leafnumber HSLOP(Tmean) TTCAN LAIStage PHENO TTOUT DrFacLAI Rootlength Areamax PTSOIL RN Rad PTAY WINTER Leafarea Alpha Reduc CROPUP PENMAN TAU GAKLIR Areaopt Lastleaf FleafNo SF Anthesis PotTrans Deadleaves Maxgaklir leafarea IPHASE Anthesis TTOUT Transp AreaANTH BIOANTH EVSOIL GRAIN Therm Biomass Reduc EVAP ENAV Grainstart CropN CONDUC Hcrop BiomassBGF BiomassBGF Demand Hsoil Grainend BIOANTH TCMax TTOUT TCmin Biomass TDeepsoil DrFacLai Inputs SF DrFacgrowth Soilmin GrainDM Temp Min Temp Max GrainNoverGrainDm tadj Tmean Soilmax LeafNc TTOUT TDeepsoil MinGrainDemand Temp Max Temp Max Water balance Rain GrainDemand UptakeN AvN VARIETY Si Soil leafarea Radiation AreaANTH leafarea Dry matter SoilN DeadLeafN GrainN PoolN Lastleaf EarWt Therm RUE ExStemNc StemNc ExLeafNc Anthesis SF Biomass Tau PAR drfacgrowth rad CropN Equilibrium leafarea uw[1..30] exw[1..30] aw[1..30] Naw[1..30] Nex[1..30] biomass dn stembiomass MaxStemdemand AvN Rootlen Naw[1..30] Nuw[1..30] Nex[1..30] Soil moisture SF Percolation StemN N’Pulses anforP Wateruptake Rain+Irr exw[1] SumP Fert minstemDemand UptakeN Rootlen Norganic Tmean aw[1] Q Nf Q exw[1..30] StemNc AW[1..30] Waterbalance.Si Ta Nm Ts Ni aw[1..30] Ndp uw[1] CropN si Naw[1] Qwp LeafNc WP(Pottrans) RZexw uw[1..8] Soil evap aw[1..30] Qr x[1..8] WP (Rain +irrigation) x Leafdemand StemExN si aw[1..8] Qwp Qfc Q EVAPO.evsoil exw[1..30] x AW[1] Qfc TransP EVAPO.pottrans leafarea leafarea Nuw[1..8] Naw[1..8] aw[1..5] SOIL.EXW[1..30] exw[1..5] evsoil Nex[1..30] SF Results

  32. Top 75% of the distribution/ensemble

  33. Interpretation.... Vernalisation 99% N-mineralisation Temp/moisture adjustment~99% Crop N physiology 50% Canopy Temperature 96% Nitrogen Leaching 50% Diurnal adjustment of thermal time 99% 40 other variables <<50%

  34. Other applications… • Applying the same/similar approach to other models, e.g. • Marine Ecology Models (carbon cycling) • JULES – land surface scheme for GCM • FARM-ADAPT – very large farm management model • Linear programming based model • V. large number of variables (10 000s) • More challenging

  35. Some conclusions • Whether a model’s formulation is appropriate is uncertain • Model reduction provides a way to test (challenge) model formulation (perhaps a brutal test) • May give us some insurance against over-parameterisation (reduce uncertainty) • Does test the understanding built into our models • All models investigated so far can be reduced by variable replacement • Most reduced models have some advantage

  36. Useful references? • NMJ Crout, D Tarsitano, AT Wood. Is my model too complex? Evaluating model formulation using model reduction.Environmental Modelling & Software, in press. • JM Gibbons, GM Cox, ATA Wood, J Craigon, SJ Ramsden, D Tarsitano, NMJ Crout (2008). Applying Bayesian Model Averaging to mechanistic models: an example and comparison of methods. Environmental Modelling & Software, 23:973-985. • Cox GM, Gibbons JM, Wood ATA, Craigon J, Ramsden SJ, Crout NMJ (2006). Towards the systematic simplification of mechanistic models. Ecological Modelling 198:240-246. • Bernhardt, K. 2008. Finding alternatives and reduced formulations for process based models. Evolutionary Computation 16:1-16. Alternative approach. • Asgharbeygi, N., Langley, P., Bay, S., Arrigo, 2006. Inductive revision of quantitative process models. Ecol. Mod., 194:70-79. Alternative approach to model reduction, albeit without a statistical framework • Brooks RJ, Tobias AM (1996) Choosing the best model: level of detail, complexity, & model... Mathl. Comput. Mod. 24:1-14. Interesting article on model utility • Anderson, T.R., 2005. Plankton functional type modelling: running before we can walk? J. Plankton Research 27:1073-1081. Nice discussion of model complexity in context of marine systems. Quite controversial, replies to the replies to the replies. • Jakeman, A.J, Letcher, R.A., Norton, J.P., 2006 Ten iterative steps in development and evaluation of environmental models. Environmental Modelling & Software 21, 602-614. Very nice discussion on what is good practice in model development • Myung, J., Pitt, M. A., 2002. When a good fit can be bad. Trends. Cogn. Sci., 6:421-425. Good introduction to model selection criteria • Burnham, K.P., Anderson, D.R., 2002. Model Selection and Multimodel Inference: A Practical Information-Theoretical Approach. 2nd ed. New York: Springer-Verlag. Ultimate reference, but not without critics.

More Related