HSRP 734: Advanced Statistical Methods June 19, 2008

Extensions of Logistic Regression • Outcomes with more than 2 categories • Categories have order • Unordered • Conditional logistic regression • Analysis of matched data

Extensions of Logistic Regression • Exact methods for small samples • Fisher’s exact • Exact logistic regression • Correlated/Clustered data • GEE method • Mixed models

Extensions of Logistic Regression • Outcomes with more than 2 categories (polytomous or polychotomous) • Cumulative logit model – Proportional odds model for ordinal outcomes (ordered categories) • Generalized logit model for nominal outcomes or non-proportional odds models (unordered categories)

Extensions of Logistic Regression • Cumulative logit model • Fits a logistic regression model with g-1 intercepts for a g category outcome and one model coefficient for each predictor • Models cumulative probability of being in a “lower” category
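As a minimal sketch of how g-1 intercepts and a single set of slopes turn into category probabilities, the function below assumes the parameterization logit P(Y ≤ j | x) = α_j + β'x (the "lower category" convention on this slide). The function name and the numeric values are made up for illustration and do not come from the course data.

```python
import math


def cumulative_logit_probs(intercepts, coefs, x):
    """Category probabilities under a cumulative logit model.

    Assumes logit P(Y <= j | x) = intercepts[j] + sum(coefs * x) for the
    first g-1 categories, with intercepts given in increasing order.
    """
    # One shared linear predictor: the slopes do not depend on the cutpoint.
    eta = sum(b * v for b, v in zip(coefs, x))
    # Cumulative probabilities P(Y <= j) for j = 1, ..., g-1.
    cum = [1.0 / (1.0 + math.exp(-(a + eta))) for a in intercepts]
    cum.append(1.0)  # P(Y <= g) is always 1
    # Differences of adjacent cumulative probabilities give P(Y = j).
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]


# Hypothetical 3-category outcome: 2 intercepts, 1 predictor.
probs = cumulative_logit_probs([-1.0, 0.5], [0.8], [1.0])
```

The intercepts must be increasing so that the cumulative probabilities are ordered and every category probability is positive.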

Ordinal Logistic Regression • Odds ratios take on interpretation “% increase/decrease in the odds of being in a lower/higher category” • Subject to the “Proportional Odds” assumption
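The proportional odds assumption can be written out as follows, again assuming the parameterization logit P(Y ≤ j | x) = α_j + β'x. The symbols α_j, β_k, and x_k are notation introduced here, not taken from the slides.

```latex
\[
\frac{\operatorname{odds}(Y \le j \mid x_k + 1)}{\operatorname{odds}(Y \le j \mid x_k)}
  = \frac{\exp\{\alpha_j + \beta_k (x_k + 1)\}}{\exp\{\alpha_j + \beta_k x_k\}}
  = \exp(\beta_k),
  \qquad j = 1, \dots, g-1 .
\]
```

The intercept α_j cancels, so the same odds ratio exp(β_k) applies at every cutpoint j. That is what "proportional odds" means.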

Extensions of Logistic Regression • Generalized logit model • Fits a logistic regression model with g-1 intercepts and g-1 model coefficients for each predictor for a g category outcome • Model captures the multinomial probability of being in a particular category using generalized logits
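A minimal sketch of the generalized logit model, assuming the last category is the reference and log(P(Y = j | x) / P(Y = ref | x)) = α_j + β_j'x for each of the other g-1 categories. The function name and parameter values are hypothetical.

```python
import math


def generalized_logit_probs(intercepts, coefs, x):
    """Category probabilities under a generalized (baseline-category) logit model.

    Assumes one intercept and one coefficient vector per non-reference
    category; the reference category is returned last.
    """
    # A separate linear predictor for each non-reference category.
    etas = [a + sum(b * v for b, v in zip(bs, x))
            for a, bs in zip(intercepts, coefs)]
    # The reference category has linear predictor 0, so it contributes exp(0) = 1.
    denom = 1.0 + sum(math.exp(e) for e in etas)
    probs = [math.exp(e) / denom for e in etas]
    probs.append(1.0 / denom)  # reference category
    return probs


# Hypothetical 3-category outcome with one predictor.
nominal_probs = generalized_logit_probs([0.2, -0.5], [[1.0], [0.3]], [2.0])
```

Each coefficient is a log odds ratio for one category versus the reference, which is why the choice of reference category determines the comparisons being reported.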

Nominal Logistic Regression • Odds ratios have regular interpretation, just have to be careful with which comparisons are being made (reference category) • Does not assume “Proportional Odds”

SAS

Conditional logistic regression • Can be used for matched data (e.g., matched case-control studies) • Provides unbiased estimates of odds ratios and confidence intervals
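For the simplest matched design (1:1 matched case-control pairs with a single binary exposure), the conditional maximum likelihood estimate of the odds ratio reduces to the ratio of the two kinds of discordant pairs. The sketch below illustrates only that special case. The function name and the pair counts are invented for illustration.

```python
def matched_pair_odds_ratio(pairs):
    """Conditional odds ratio estimate for 1:1 matched pairs.

    `pairs` holds (case_exposed, control_exposed) booleans, one per matched
    pair. Concordant pairs carry no information and drop out.
    """
    pairs = list(pairs)
    case_only = sum(1 for case, control in pairs if case and not control)
    control_only = sum(1 for case, control in pairs if control and not case)
    if control_only == 0:
        raise ValueError("no pairs with only the control exposed")
    return case_only / control_only


# Hypothetical data: 10 both exposed, 30 case only, 10 control only, 50 neither.
example_pairs = (
    [(True, True)] * 10
    + [(True, False)] * 30
    + [(False, True)] * 10
    + [(False, False)] * 50
)
or_hat = matched_pair_odds_ratio(example_pairs)  # 30 / 10 = 3.0
```

With more than one covariate or with 1:m matching there is no closed form, and the conditional likelihood has to be maximized numerically.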

SAS

Extensions to Logistic Regression • Exact Logistic Regression • Small Sample Size • Adequate sample size but rare event (sparse data)

Fisher’s exact test • Exact test for RxC table where Chi-square test assumptions are doubtful • Why not always use Fisher’s exact test and Exact logistic regression?
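To make the idea of an "exact" test concrete, here is a minimal sketch of the two-sided Fisher's exact test for a 2x2 table. It enumerates every table with the observed margins and sums the hypergeometric probabilities that are no larger than that of the observed table. The function name and the example table are illustrative assumptions.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k
        # when all margins are held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum over tables at least as extreme as the observed one; the small
    # relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(k) for k in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical small table with all margins equal to 4.
p_value = fisher_exact_2x2(3, 1, 1, 3)  # 34/70, about 0.486
```

The enumeration is cheap for a 2x2 table, but the number of tables to enumerate grows quickly with larger samples and larger RxC tables.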

SAS

Extensions of Logistic Regression • Longitudinal data / repeated measures data / Clustered data with binary outcomes • Multilevel models (nested data structures) • GEE (Generalized Estimating Equations) • GLMM (Generalized Linear Mixed Models)

Two methods for handling clustered outcomes • Mixed models • Likelihood based • Use random effects to model clustered observations • Developed for continuous outcomes (but now extended to categorical outcomes) • Generalized Estimating Equations (GEE) • Non-likelihood based • Can handle a large number of clusters • Suited to categorical outcomes

GEE • GEE can be used in • Longitudinal studies • repeated measures of the same individual form a cluster • Community studies • subjects clustered by neighborhood • Familial studies • subjects clustered by family • Epidemiological studies • Different forms of clusters – e.g., pedigree

GEE • In general GEE has 3 sets of parameters to estimate: • Regression parameter (population-averaged effects) • Correlation parameter (cluster parameter) • Scale factor (not uncommon to assume =1)
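The three sets of parameters can be seen in the standard form of the estimating equations. The notation below is introduced here rather than taken from the slides: K is the number of clusters, y_i is the response vector for cluster i, μ_i(β) is its mean, and A_i is the diagonal matrix of variance functions.

```latex
\[
\sum_{i=1}^{K} D_i^{\top} V_i^{-1} \bigl( y_i - \mu_i(\beta) \bigr) = 0,
\qquad
D_i = \frac{\partial \mu_i}{\partial \beta},
\qquad
V_i = \phi \, A_i^{1/2} R(\alpha) A_i^{1/2} .
\]
```

Here β holds the regression parameters, α indexes the working correlation matrix R(α), and φ is the scale factor.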

Comparing SLR and GEE

GEE • In its simplest form, GEE can be considered an extension of logistic regression for clustered data • Clustered data are common • Time: Longitudinal analysis with repeated measurements on an individual (e.g., baseline, 1-month, 2-month, and 6-month follow-up) • Individual: Cross-sectional analysis with multiple outcomes per individual (e.g., left eye, right eye) • Background: Subjects clustered because of common geographical or social background (e.g., clinic)

Correlation structure • Often called the working correlation structure in GEE • Specifies how the observations within a cluster are related • Often assumed to be the same for all clusters

Unstructured • All correlation coefficients free to take any value • E.g., with 3 responses per cluster there are 3 separate correlations to estimate: ρ12, ρ13, ρ23

Exchangeable • Any two responses within the same cluster have the same correlation • Simple (1 parameter to estimate)

Autoregressive AR(1) • Correlation between responses depends on the interval of time between responses • Farther apart responses => weaker correlation • Only 1 parameter to estimate!
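The two one-parameter structures can be written down directly. The sketch below builds the working correlation matrix for a cluster of n equally spaced responses. The function names and the values of rho are chosen only for illustration.

```python
def exchangeable(n, rho):
    """Exchangeable structure: every pair of responses has correlation rho."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def ar1(n, rho):
    """AR(1) structure: correlation rho**|i - j| decays with the time lag."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


# Hypothetical cluster of 4 repeated measurements.
R_exch = exchangeable(4, 0.3)
R_ar1 = ar1(4, 0.5)
```

In `R_ar1` the correlation between the first and last responses is 0.5 ** 3 = 0.125, while in `R_exch` it stays at 0.3 regardless of how far apart the responses are.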

Correlation matrix • Selection of a “working correlation structure” is at the discretion of the researcher! • How does the correlation structure affect the results?

Properties of GEE estimators • What happens to the variance (standard error) estimates if the “working” correlation matrix is not correctly specified? • Model-based estimate => not consistent • Empirical (robust) estimate => still consistent

Properties of GEE estimators • Even if the correlation structure is misspecified, the estimates of the logistic regression coefficients are still consistent • If the correlation structure is misspecified, the estimates are not as efficient (SE is larger) • This property contributes to the popularity of GEE • GEE works well with larger numbers of clusters

SAS

Review
