1 / 53

Splines and Kernels in Nonparametric Regression for Clustered Data

Learn about utilizing splines and kernels in nonparametric regression for clustered and longitudinal data, focusing on efficiency and correlation considerations. Discover methods like GLS, working independence, and local likelihood kernels.

klibbey
Download Presentation

Splines and Kernels in Nonparametric Regression for Clustered Data

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Non/Semiparametric Regression and Clustered/Longitudinal Data Raymond J. Carroll Texas A&M University http://stat.tamu.edu/~carroll carroll@stat.tamu.edu Postdoctoral Training Program: http://stat.tamu.edu/B3NC

  2. Palo Duro Canyon, the Grand Canyon of Texas West Texas  East Texas  Wichita Falls, my hometown Guadalupe Mountains National Park College Station, home of Texas A&M University Midland I-45 Big Bend National Park I-35

  3. Acknowledgments Raymond Carroll Oliver Linton Alan Welsh Series of papers are on my web site Lin, Wang and Welsh: Longitudinal data (Mammen & Linton for pseudo-observation methods) Linton and Mammen: time series data Xihong Lin Naisyin Wang Enno Mammen

  4. Outline • Longitudinal models: • panel data • Background: • splines = kernels for independent data • Correlated data: • do splines = kernels?

  5. Panel Data (for simplicity) • i = 1,…,n clusters/individuals • j = 1,…,m observations per cluster

  6. Panel Data (for simplicity) • i = 1,…,n clusters/individuals • j = 1,…,m observations per cluster • Important points: • The cluster size m is meant to be fixed • This is not a multiple time series problem where the cluster size increases to infinity • Some comments on the single time series problem are given near the end of the talk

  7. The Marginal Nonparametric Model • Y = Response • X = time-varying covariate • Question: can we improve efficiency by accounting for correlation?

  8. Nonstandard Example: Colon carcinogenesis experiments. Cell-Based Measures in single rats of DNA damage, differentiation, proliferation, apoptosis, P27, etc.

  9. The Marginal Nonparametric Model • Important assumption • Covariates at other waves are not conditionally predictive, i.e., they are surrogates • This assumption is required for any GLS fit, including parametric GLS

  10. Independent Data • Splines (smoothing, P-splines, etc.) with penalty parameter = l • Ridge regression fit • Some bias, smaller variance • is over-parameterized least squares • is a polynomial regression

  11. Independent Data • Kernels (local averages, local linear, etc.), with kernel density function K and bandwidth h • As the bandwidth h 0, only observations with X near t get any weight in the fit

  12. Independent Data • Major methods • Splines • Kernels • Smoothing parameters required for both • Fits: similar in many (most?) datasets • Expectation: some combination of bandwidths and kernel functions look like splines 12

  13. Independent Data • Splines and kernels are linear in the responses • Silverman showed that there is a kernel function and a bandwidth so that the weight functions are asymptotically equivalent • In this sense, splines = kernels

  14. The weight functions Gn(t=.25,x) in a specific case for independent data Kernel Smoothing Spline Note the similarity of shape and the locality: only X’s near t=0.25 get any weight

  15. Working Independence • Working independence: Ignore all correlations • Fix up standard errors at the end • Advantage: the assumption is not required • Disadvantage: possible severe loss of efficiency if carried too far

  16. Working Independence • Working independence: • Ignore all correlations • Should posit some reasonable marginal variances • Weighting important for efficiency • Weighted versions: Splines and kernels have obvious analogues • Standard method: Zeger & Diggle, Hoover, Rice, Wu & Yang, Lin & Ying, etc.

  17. Working Independence • Working independence: • Weighted splines and weighted kernels are linear in the responses • The Silverman result still holds • In this sense, splines = kernels

  18. Accounting for Correlation • Splines have an obvious analogue for non-independent data • Let be a working covariance matrix • Penalized Generalized least squares (GLS) • GLS ridge regression • Because splines are based on likelihood ideas, they generalize quickly to new problems

  19. Accounting for Correlation • Splines have an obvious analogue for non-independent data • Kernels are not so obvious • One can do theory with kernels • Local likelihood kernel ideas are standard in independent data problems • Most attempts at kernels for correlated data have tried to use local likelihood kernel methods

  20. Kernels and Correlation • Problem: how to define locality for kernels? • Goal: estimate the function at t • Let be a diagonal matrix of standard kernel weights • Standard Kernel method: GLS pretending inverse covariance matrix is • The estimate is inherently local

  21. Kernels and Correlation Specific case: m=3, n=35 Exchangeable correlation structure Red: r = 0.0 Green: r = 0.4 Blue: r= 0.8 Note the locality of the kernel method The weight functions Gn(t=.25,x) in a specific case 18

  22. Splines and Correlation Specific case: m=3, n=35 Exchangeable correlation structure Red: r = 0.0 Green: r = 0.4 Blue: r= 0.8 Note the lack of locality of the spline method The weight functions Gn(t=.25,x) in a specific case

  23. Splines and Correlation Specific case: m=3, n=35 Complex correlation structure Red: Nearly singular Green: r = 0.0 Blue: r= AR(0.8) Note the lack of locality of the spline method The weight functions Gn(t=.25,x) in a specific case

  24. Splines and Standard Kernels • Accounting for correlation: • Standard kernels remain local • Splines are not local • Numerical results can be confirmed theoretically

  25. Results on Kernels and Correlation • GLS with weights • Optimal working covariance matrix is working independence! • Using the correct covariance matrix • Increases variance • Increases MSE • Splines Kernels (or at least these kernels) 24

  26. Pseudo-Observation Kernel Methods • Better kernel methods are possible • Pseudo-observation: original method • Construction: specific linear transformation of Y • Mean = Q(X) • Covariance = diagonal matrix • This adjusts the original responses without affecting the mean

  27. Pseudo-Observation Kernel Methods • Construction: specific linear transformation of Y • Mean = Q(X) • Covariance = diagonal • Iterative: • Efficiency: More efficient than working independence • Proof of Principle: kernel methods can be constructed to take advantage of correlation

  28. Efficiency of Splines and Pseudo-Observation Kernels Exchng: Exchangeable with correlation 0.6 AR: autoregressive with correlation 0.6 Near Sing: A nearly singular matrix

  29. What Do GLS Splines Do? • GLS Splines are really working independence splines using pseudo-observations • Let • GLS Splines are working independence splines

  30. GLS Splines and SUR Kernels • GLS Splines are working independence splines • Algorithm: iterate until convergence • Idea: for kernels, do same thing • This is Naisyin Wang’s SUR method

  31. Better Kernel Methods: SUR • Another view • Consider current state in iteration • For every j, • assume function is fixed and known for • Use the seemingly unrelated regression (SUR) idea • For j, form estimating equation for local averages/linear for jth component only using GLS with weights • Sum the estimating equations together, and solve

  32. SUR Kernel Methods • It is well known that the GLS spline has an exact, analytic expression • We have shown that the SUR kernel method has an exact, analytic expression • Both methods are linear in the responses • Relatively nontrivial calculations show that Silverman’s result still holds • Splines = SUR Kernels

  33. Nonlocality • The lack of locality of GLS splines and SUR kernels is surprising • Suppose we want to estimate the function at t • Result: All observations in a cluster contribute to the fit, not just those with covariates near t • Locality: Defined at the cluster level

  34. Nonlocality • Wang’s SUR kernels = pseudo kernels with a clever linear transformation. Let • SUR kernels are working independence kernels

  35. Locality of Splines • Splines = SUR kernels (Silverman-type result) • GLS spline: • Iterative • standard independent spline smoothing • SUR pseudo-observations at each iteration • GLS splines are not local • GLS splines are local in (the same!) pseudo-observations

  36. Time Series Problems • Time series problems: many of the same issues arise • Original pseudo-observation method • Two stages • Linear transformation • Mean Q(X) • Independent errors • Single standard kernel applied • Potential for great gains in efficiency (even infinite for AR problems with large correlation)

  37. Time Series: AR(1) Illustration, First Pseudo Observation Method • AR(1), correlation r: • Regress Yt0 on Xt

  38. Time Series Problems • More efficient methods can be constructed • Series of regression problems: for all j, • Pseudo observations • Mean • White noise errors • Regress for each j: fits are asymptotically independent • Then weighted average • Time series version of SUR-kernels for longitudinal data?

  39. Time Series: AR(1) Illustration, New Pseudo Observation Method • AR(1), correlation r: • Regress Yt0 on Xt and Yt1 on Xt-1 • Weights: 1 and r2

  40. Time Series Problems • AR(1) errors with correlation r • Efficiency of original pseudo-observation method to working independence: • Efficiency of new (SUR?) pseudo-observation method to original method: 36

  41. The Semiparametric Model • Y = Response • X,Z = time-varying covariates • Question: can we improve efficiency for bby accounting for correlation?

  42. Profile Methods • Given b, solve for Q, say • Basic idea: Regress • Working independence • Standard kernels • Pseudo –observations kernels • SUR kernels

  43. Profile Methods • Given b, solve for Q, say • Then fit GLS or W.I. to the model with mean • Question: does it matter what kernel method is used? • Question: How bad is using W.I. everywhere? • Question: are there efficient choices?

  44. The Semiparametric Model: Special Case • If X does not vary with time, simple semiparametric efficient method available • The basic point is that has common mean and covariance matrix • If were a polynomial, GLS likelihood methods would be natural

  45. The Semiparametric Model: Special Case • Method: Replace polynomial GLS likelihood with GLS local likelihood with weights • Then do GLS on the derived variable • Semiparametric efficient

  46. Profile Method: General Case • Given b, solve for Q, say • Then fit GLS or W.I. to the model with mean • In this general case, how you estimate Q matters • Working independence • Standard kernel • Pseudo-observation kernel • SUR kernel

  47. Profile Methods • In this general case, how you estimate Q matters • Working independence • Standard kernel • Pseudo-observation kernel • SUR kernel • The SUR method leads to the semiparametric efficient estimate • So too does the GLS spline

  48. Longitudinal CD4 Count Data (Zeger and Diggle) Semiparametric GLS refit Semiparametric GLS Z-D Working Independence Est. s.e. Est. s.e. Est. s.e.

  49. Conclusions (1/3): Nonparametric Regression • In nonparametric regression • Kernels = splines for working independence (W.I.) • Working independence is inefficient • Standard kernels splines for correlated data

  50. Conclusions (2/3): Nonparametric Regression • In nonparametric regression • Pseudo-observation methods improve upon working independence • SUR kernels = splines for correlated data • Splines and SUR kernels are not local • Splines and SUR kernels are local in pseudo-observations

More Related