Modeling

Modeling • Create an abstraction of something in the real world • Can be parameterized • Should be validated against real-world data • Types: • Interpolation • Correlation • Simulation

Correlation Models • Predicting values of a dependent variable from one or more independent variables • Remember: correlation does not imply causation Wikipedia
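As a minimal sketch of a correlation model, the snippet below fits a straight line by ordinary least squares and uses it to predict the dependent variable from one independent variable. The function name `fit_line` and the data are hypothetical, chosen only for illustration; a good fit here says nothing about whether x causes y.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data in which y is exactly 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(xs, ys)
predicted = intercept + slope * 5.0  # predict y at a new x value
```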

Parametric Methods • Typical Probability Distribution Functions • Gaussian, Negative Exponential, Binomial, Gamma, Poisson, … • Generalized Linear Models • Linearize data • Polynomial • Linear • Nth-Order Polynomials • Generalized Additive Models • Box-Models (BioClim) • Logistic…

Gaussian (Normal) Function Wikipedia

Exponential (negative) Wikipedia

Binomial • Number of successes in a series of yes/no trials Wikipedia
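To make the "number of successes" definition concrete, here is a small sketch of the binomial probability mass function using only the Python standard library. The function name `binomial_pmf` and the coin-flip numbers are illustrative assumptions.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent yes/no trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: 4 fair coin flips
two_heads = binomial_pmf(2, 4, 0.5)
all_outcomes = [binomial_pmf(k, 4, 0.5) for k in range(5)]
```

The probabilities over all possible success counts (0 through n) sum to 1.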

Gamma function Wikipedia

Poisson distribution • Probability of a given number of events occurring in a fixed interval of time Wikipedia
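A similar sketch for the Poisson distribution, assuming events occur independently at a constant average rate. The name `poisson_pmf` and the rate of 3 events per interval are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k, rate):
    """Probability of exactly k events in an interval when the expected
    number of events per interval is `rate`."""
    return exp(-rate) * rate**k / factorial(k)

# Hypothetical example: on average 3 events per interval
rate = 3.0
probabilities = [poisson_pmf(k, rate) for k in range(6)]
```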

Generalized Linear Models

Polynomial • Flexible and adaptable • Notorious for oscillations between exact-fit values
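The oscillation problem can be demonstrated with a short sketch. It fits a 10th-order polynomial exactly through 11 equally spaced samples of a smooth test function (the Runge function, an assumption chosen for illustration) using Lagrange interpolation, then measures how far the polynomial strays between the exact-fit points.

```python
def runge(x):
    """Smooth test function often used to show polynomial oscillation."""
    return 1.0 / (1.0 + 25.0 * x * x)

def lagrange_eval(nodes, values, x):
    """Evaluate the polynomial that passes exactly through (nodes, values)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(nodes, values)):
        term = yi
        for j, xj in enumerate(nodes):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# 11 equally spaced exact-fit points on [-1, 1] -> 10th-order polynomial
nodes = [-1.0 + 2.0 * i / 10 for i in range(11)]
values = [runge(x) for x in nodes]

# Compare the polynomial with the true function on a finer grid
grid = [-1.0 + 2.0 * i / 200 for i in range(201)]
max_error = max(abs(lagrange_eval(nodes, values, x) - runge(x)) for x in grid)
```

The polynomial matches every sample exactly, yet `max_error` is larger than the full range of the function itself, with the worst oscillations near the ends of the interval.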

Non-Parametric Methods • Piece-wise Regression (MaxEnt) • Kernel Smoothing (NPMR) • Neural Nets • Regression/Decision Trees • Multivariate Adaptive Regression Splines (MARS) • Genetic Algorithms

Generalized Additive Models • Can use parametric or non-parametric functions

Specific Methods/Software • MaxEnt: Species Distribution/Habitat Suitability Models • Non-Parametric Multiplicative Regression (NPMR) • Genetic Algorithm for Rule Set Production (GARP) • Others…

Trees A tree showing survival of passengers on the Titanic ("sibsp" is the number of spouses or siblings aboard). The figures under the leaves show the probability of survival and the percentage of observations in the leaf. Wikipedia

Trees • Classification Trees • Predicted outcome is a class (e.g., sex) • Regression Trees • Predicted outcome is a value (e.g., percent) • Boosted Trees • Build an ensemble of trees sequentially, each new tree fitted to the errors of the previous ones • Random Forests • Combine many trees, each grown on a random sample of the data, to improve fit
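The simplest regression tree has a single split (a "stump"). The sketch below, with hypothetical names `fit_stump` and `predict_stump` and made-up data, tries every midpoint between sorted x values and keeps the split that minimizes the summed squared error; real tree software repeats this search recursively on each branch.

```python
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def fit_stump(xs, ys):
    """Find the single split on x that minimizes squared error.
    Assumes at least two distinct x values."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, mean(left), mean(right))
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean

def predict_stump(stump, x):
    """Predict the mean of the leaf that x falls into."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

# Hypothetical data with two obvious groups
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]
stump = fit_stump(xs, ys)
```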

Model Selection • Type of model should be selected based on what is known about the phenomenon being modeled • Given: • A set of data from “tests” • A set of models where we can compute the probability of each test • We can compute the “best” model based on its fit to the data and number of parameters (if we can compute a probability for each “test” and the data are independent and identically distributed)

Parsimony • “…too few parameters and the model will be so unrealistic as to make prediction unreliable, but too many parameters and the model will be so specific to the particular data set so to make prediction unreliable.” Edwards, A. W. F. (2001). Occam’s bonus. p. 128–139; in Zellner, A., Keuzenkamp, H. A., and McAleer, M. Simplicity, inference and modelling. Cambridge University Press, Cambridge, UK.

Likelihood • Likelihood of a set of parameter values given some observed data = probability of the observed data given those parameter values: L(θ | x) = P(x | θ)
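As an illustrative sketch, the likelihood of a rate parameter for the negative exponential distribution can be written directly from the data. For independent observations the probabilities multiply, so it is usual to work with the log-likelihood, which adds instead. The data and the name `exp_log_likelihood` are assumptions for the example.

```python
from math import log

def exp_log_likelihood(rate, data):
    """Log-likelihood of an exponential rate given independent observations.
    Each observation x contributes log(rate) - rate * x."""
    return len(data) * log(rate) - rate * sum(data)

# Hypothetical waiting times between events
data = [0.5, 1.0, 1.5, 2.0]

# For this distribution the maximum-likelihood rate is 1 / (sample mean)
mle_rate = len(data) / sum(data)

best = exp_log_likelihood(mle_rate, data)
lower = exp_log_likelihood(0.5, data)   # a rate that is too small
higher = exp_log_likelihood(1.2, data)  # a rate that is too large
```

Any other rate gives a lower log-likelihood than `mle_rate`, which is what "maximized likelihood" means for this model.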

Likelihood

Akaike Information Criterion • AIC = 2K - 2 ln(L) • K = number of estimated parameters in the model • L = maximized value of the likelihood function for the estimated model
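The standard definition, AIC = 2K - 2 ln(L), is short enough to sketch directly. The function name `aic` and the two sets of log-likelihood values below are hypothetical; in practice ln(L) comes from fitting each model to the same data.

```python
def aic(num_parameters, max_log_likelihood):
    """Akaike Information Criterion: 2K - 2 ln(L)."""
    return 2 * num_parameters - 2 * max_log_likelihood

# Hypothetical fits to the same data set
simple_model = aic(num_parameters=2, max_log_likelihood=-52.0)
complex_model = aic(num_parameters=5, max_log_likelihood=-50.5)

# The complex model fits slightly better but pays for 3 extra parameters
preferred = "simple" if simple_model < complex_model else "complex"
```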

Parsimony Model Based Inference in the Life Sciences, Anderson

AIC

AIC • Only a relative meaning • Smaller is “better” • Balance between complexity (overfitting, lots of parameters) and bias (underfitting, too few parameters)
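Because only differences between AIC values matter, models are commonly compared by subtracting the smallest AIC in the set. The sketch below uses made-up scores and a hypothetical helper `aic_differences` to show that shifting every AIC by the same constant leaves the comparison unchanged.

```python
def aic_differences(aic_by_model):
    """Difference between each model's AIC and the smallest AIC in the set."""
    best = min(aic_by_model.values())
    return {name: value - best for name, value in aic_by_model.items()}

# Hypothetical AIC values for three candidate models fitted to the same data
scores = {"linear": 104.2, "quadratic": 101.7, "cubic": 103.5}
deltas = aic_differences(scores)

# Adding a constant to every score does not change the differences
shifted = aic_differences({name: value + 1000.0 for name, value in scores.items()})
best_model = min(deltas, key=deltas.get)
```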
