Advanced Statistics for Interventional Cardiologists
What you will learn
• Introduction
• Basics of multivariable statistical modeling
• Advanced linear regression methods
• Hands-on session: linear regression
• Bayesian methods
• Logistic regression and generalized linear model
• Resampling methods
• Meta-analysis
• Hands-on session: logistic regression and meta-analysis
• Multifactor analysis of variance
• Cox proportional hazards analysis
• Hands-on session: Cox proportional hazards analysis
• Propensity analysis
• Most popular statistical packages
• Conclusions and take home messages
Sessions are split across the 1st day and the 2nd day.
Focus on different programs
• Complex and all-purpose statistical packages are not always necessary for statistical analyses: R, S, S-Plus, SAS, SPSS, Stata, Statistica, WinBUGS, …
• Leaner and less expensive programs can sometimes be just as effective, and are often available as 30-day free trials: StatsDirect, JMP, Minitab, StatXact, LogXact, …
• However, if you wish to become more competent in advanced statistical analysis for clinical cardiovascular research, the best choice is to progressively familiarize yourself with one or two complex and all-purpose statistical packages
R
• R is a programming language and software environment for statistical computing and graphics. It is an implementation of the S programming language with lexical scoping semantics.
• R is widely used for statistical software development and data analysis. Its source code is freely available under the GNU General Public License, and pre-compiled binary versions are provided for various operating systems. R uses a command line interface, though several graphical user interfaces are available.
• Pros: flexibility and programming capabilities (eg for bootstrap), sophisticated graphical capabilities.
• Cons: complex and user-unfriendly interface.
• Price: free.
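As an illustration of the bootstrap capability mentioned above, here is a minimal sketch of a nonparametric bootstrap of a sample mean in base R. The data values, the number of resamples and the object names (x, n_boot, boot_means) are made up for this example and do not come from the course material.

```r
# Nonparametric bootstrap of a sample mean, using base R only.
# All numbers below are hypothetical and serve only as an illustration.
set.seed(42)  # fixed seed so the resampling is reproducible

x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 4.4)  # made-up measurements

n_boot <- 2000  # number of bootstrap resamples (arbitrary choice)

# Resample the data with replacement and recompute the mean each time
boot_means <- replicate(n_boot,
                        mean(sample(x, size = length(x), replace = TRUE)))

boot_se <- sd(boot_means)                                  # bootstrap standard error
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))   # percentile 95% interval

print(boot_se)
print(boot_ci)
```

The percentile interval is only one of several bootstrap intervals; it is used here because it needs no add-on package.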
S and S-Plus
• S-PLUS is a commercial package sold by TIBCO Software Inc. with a focus on exploratory data analysis, graphics and statistical modeling
• It is an implementation of the S programming language. It features object-oriented programming capabilities and advanced analytical algorithms (eg for robust regression, repeated measurements, …)
• Pros: flexibility and programming capabilities (eg for bootstrap), user-friendly graphical user interface
• Cons: complex matrix programming environment
• Price: €€€€-€€.
S and S-Plus
Regression with S-Plus, via the menu or via programming
Call lm to create an object that models stack.loss as a linear function of three predictors:
> stack.lm <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc.)
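The same call can be tried in R, which ships a built-in data frame named stackloss with the columns Air.Flow, Water.Temp, Acid.Conc. and stack.loss. The sketch below assumes that this data frame corresponds to the data used on the slide, and therefore adds a data argument that the S-Plus call does not show.

```r
# Linear regression on R's built-in 'stackloss' data frame.
# Assumption: this data frame matches the S-Plus data used on the slide.
stack.lm <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc.,
               data = stackloss)

print(summary(stack.lm))  # coefficients, standard errors, R-squared
print(confint(stack.lm))  # 95% confidence intervals for the coefficients

# Diagnostic quantities for checking the fit
res <- residuals(stack.lm)
fit <- fitted(stack.lm)
```

Plotting res against fit is the usual next step for checking linearity and constant variance.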
SAS
• SAS (originally Statistical Analysis System, 1968) is an integrated suite of platform-independent software modules provided by SAS Institute (1976, Jim Goodnight and colleagues).
• The functionality of the system is very complete and built around four major tasks: data access, data management, data analysis and data presentation.
• Applications of the SAS system include: statistical analysis, data mining, forecasting; report writing and graphics; operations research and quality improvement; applications development; data warehousing (extract, transform, load).
• Pros: very complete tool for data analysis, flexibility and programming capabilities (eg for Bayesian, bootstrap, conditional, or meta-analyses), handles large volumes of data
• Cons: complex programming environment, labyrinth of modules and interfaces, very expensive
• Price: €€€€-€€€€
SAS
Programming
Frequency table
…
proc freq data=AAA(where = (charge > 100));
  table charge;
run;
…
ANCOVA model
…
proc mixed data=name;
  class gregion trtl;
  model varY = gregion trtl varX / ddfm=kr;
  lsmeans trtl / pdiff=all;
  estimate 'Difference CZP' trtl -1 1 / cl alpha=0.05;
run;
…
JMP Statistical Discovery Software
• JMP is a software package first developed by John Sall, co-founder of SAS, to perform simple and complex statistical analyses. It dynamically links statistics with graphics to interactively explore, understand, and visualize data. This allows you to click on any point in a graph and see the corresponding data point highlighted in the data table and in other graphs.
• JMP provides a comprehensive set of statistical tools as well as design of experiments and statistical quality control in a single package.
• JMP allows for custom programming and script development via JSL, originally known as "John's Scripting Language".
• An add-on, JMP Genomics, comes with over 100 analytic procedures to facilitate the treatment of data involving genetics, microarrays or proteomics.
• Pros: very intuitive, lean package for design and analysis in research
• Cons: less complete and less flexible than the complete SAS system
• Price: €€€€.
Statistica
• STATISTICA is a powerful statistics and analytics software package developed by StatSoft, Inc.
• It provides a wide selection of data analysis, data management, data mining, and data visualization procedures. Features of the software include basic and multivariate statistical analysis, quality control modules and a collection of data mining techniques.
• Pros: extensive range of methods, user-friendly graphical interface, has been called "the king of graphics"
• Cons: limited flexibility and programming capabilities, labyrinthine organization
• Price: €€€€.
SPSS
• SPSS (originally Statistical Package for the Social Sciences) is a computer program for statistical analysis, first released in 1968 and now distributed by SPSS Inc.
• SPSS is among the most widely used programs for statistical analysis in social science. It is used by market researchers, health researchers, survey companies, government, education researchers, marketing organizations and others.
• Pros: extensive range of tests and procedures, user-friendly graphical interface.
• Cons: limited flexibility and programming capabilities.
• Price: €€€€.
Stata
• Stata (name formed by blending "statistics" and "data") is a general-purpose statistical software package created in 1985 by StataCorp.
• Stata's full range of capabilities includes: data management, statistical analysis, graphics generation, simulations, custom programming. Most meta-analysis tools were first developed for Stata, and thus this package offers one of the most extensive libraries of statistical tools for systematic reviewers
• Pros: flexibility and programming capabilities (eg for bootstrap, or meta-analyses), sophisticated graphical capabilities
• Cons: relatively complex interface
• Price: €€€€-€€€€
WinBUGS and OpenBUGS
• WinBUGS (Windows-based Bayesian inference Using Gibbs Sampling) is statistical software for the Bayesian analysis of complex statistical models using Markov chain Monte Carlo (MCMC) methods, developed by the MRC Biostatistics Unit at the University of Cambridge, UK. It is based on the BUGS (Bayesian inference Using Gibbs Sampling) project started in 1989.
• OpenBUGS is the open source variant of WinBUGS.
• Pros: flexibility and programming capabilities
• Cons: complex interface
• Price: free
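To give an idea of what Gibbs sampling does, here is a minimal sketch written in plain R rather than in the BUGS language. It samples from a bivariate standard normal distribution with correlation rho by drawing each variable in turn from its conditional distribution given the other. The target distribution, the value of rho and the chain length are arbitrary choices for this illustration, not taken from the course.

```r
# Gibbs sampler for a bivariate standard normal with correlation rho.
# Plain R, not BUGS code; all settings are hypothetical.
set.seed(1)

rho     <- 0.8    # assumed correlation of the target distribution
n_iter  <- 5000   # total number of iterations
burn_in <- 500    # initial draws discarded before summarising

x <- numeric(n_iter)  # chains start at 0
y <- numeric(n_iter)
cond_sd <- sqrt(1 - rho^2)  # standard deviation of each conditional

for (i in 2:n_iter) {
  # x | y ~ N(rho * y, 1 - rho^2)
  x[i] <- rnorm(1, mean = rho * y[i - 1], sd = cond_sd)
  # y | x ~ N(rho * x, 1 - rho^2)
  y[i] <- rnorm(1, mean = rho * x[i], sd = cond_sd)
}

keep <- (burn_in + 1):n_iter
est_cor <- cor(x[keep], y[keep])  # should be close to rho
print(est_cor)
```

WinBUGS and OpenBUGS automate this kind of conditional sampling: the user specifies the model, and the software handles the sampling.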
And many more … Extensive overview of functionality and cost of statistical software packages: http://en.wikipedia.org/wiki/Comparison_of_statistical_packages
For further slides on these topics please feel free to visit the metcardio.org website: http://www.metcardio.org/slides.html