Get support for statistical computing, numerical analysis, and internet tools for research. We offer training and assistance to staff and postgraduate students, with a focus on data management and software training.
NUIT, Research Support • Our remit is to support researchers and includes support in: • Statistical Computing • Numerical Analysis (Condor, HPC, cloud computing, etc.) • Internet Tools (website development, collaboration over the internet with other researchers within and outside NU (Sakai), wikis, blogs, mobile applications) • Training in the above
Statistical Computing Support / Advice • Staff • Postgraduate Students (Master and PhD) • Website: http://www.ncl.ac.uk/itservice/dataanalysis/ If you need help get in touch: Email: simon.kometa@ncl.ac.uk also copy to it.servicedesk@ncl.ac.uk Based in Claremont Tower
Stats News Mailing List • To subscribe go to: • https://lists.ncl.ac.uk/wws/info/statsnews • Community of Users
Research Data Management (RDM) • What do you understand by research data? • Why do you think managing your research data is important? • RDM is becoming very important and includes: • Creating, Processing, Analysing, Preserving, Giving Access, and Re-using your research data • http://datalib.edina.ac.uk/mantra/ • RDM at NCL http://www.ncl.ac.uk/res/research/gov-ethics/rdm/
Other Statistical Software Training • Dr Colin Gillespie: http://www.ncl.ac.uk/maths/rcourse/ • Mr David McGeeney: Practical Statistics
Text Analytics • Anyone currently involved with text analysis? Software? • Text Analytics • Open ended questions • Social media • document
Introduction: Why use SPSS, Minitab, SAS or Excel? • There are many statistical software packages, e.g. STATA, PRISM, STATISTICA, CLUSTAN, GENSTAT, Mathematica, GLIM, SPLUS, JMP, R, etc. • Some Definitions • Availability of Basic Statistics • Original Intent of the Software • Ease of Use • Industrial vs Academic Usage • Frequency of New Releases • Operating Platform (Systems) • Graphics Output • Advanced Statistics • Availability of Help Interpreting Output • Availability of Technical Support • Online Getting Started Tutorial / Online Forum • Onscreen Statistics Wizard • Design of Experiments • Prospective and Retrospective Power Analysis • Interchanging Files with Other Software • Getting Output (results of analysis) into Other Software • Programming Language • Automatic Updating of Output after Adding More Data • Pivot Tables: Availability and Interactivity • Text Analytics • ANOVA, Regression and Time Series Analysis • Popularity and Peer Pressure • Questions
Some Definitions • SPSS: Statistical Package for the Social Sciences (IBM SPSS Statistics these days) • SAS: Statistical Analysis System (just SAS these days) • Minitab • Excel • SPSS, SAS, and Minitab are statistical packages, while Excel is a spreadsheet
Descriptive (Summary) Statistics Options (screenshots: Minitab, SPSS, Excel, SAS EG)
Advanced Statistics: Model Building (screenshots: SPSS, Minitab, SAS, Excel) • Fertility: average number of children • Infant mortality: deaths per 1000 live births
Bar Charts (screenshots: SAS, SPSS, Excel, Minitab)
How Easy to Use? • SPSS and Minitab are relatively easy to use • SAS is a bit more difficult to use, but is made easier via Enterprise Guide (EG) • Excel is very easy to use! • In general, most software is easy to use once you learn how to use it! • Getting Started Guides are available for SPSS, Minitab, Excel and SAS.
Online Tutorial / Online Forum • Demonstrate the Getting Started Tutorials for all four packages via RAS or Common Desktop and ask for the views of the audience • Which is the best? • Online Tutorial available in Help menu • Does the software have an active online forum? sasprofessionals.net; Assess (SPSS);
Help to Interpret Output • SPSS provides assistance in interpreting some of the output via Case Studies • Minitab provides assistance too, via StatGuide • SAS by examples (similar to Case Studies) • Excel • Demonstrate with example
Technical Support • SPSS Good, but only via representative • Minitab Good, deals directly with anybody • SAS very Good, deals directly with anybody • Excel Good, deals directly with anybody • User Groups if available can be very useful
Statistics Wizard • SPSS has a statistics wizard that asks you some questions and then suggests what statistical test you can do (demonstrate). But it is best to be very clear about what you want! • SAS (via EG) • Minitab and Excel don’t
Design of Experiments (DOE) • A very important aspect of statistics: it helps you to set optimum conditions for your experiment, for example what temperature, catalyst and pressure are needed for maximum yield of a chemical reaction. • Not available in SPSS • Available in Minitab • Available in SAS • Not available in Excel
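To make the idea concrete, here is a minimal Python sketch of the simplest kind of design, a full factorial, which lists every combination of factor levels to run. The factor names and levels are hypothetical values chosen for illustration; they are not from the slides, and real DOE tools (such as those in Minitab or SAS) offer far more than this.

```python
from itertools import product

# Hypothetical factor levels for a chemical-yield experiment (illustrative only).
factors = {
    "temperature_C": [60, 80],
    "catalyst": ["A", "B"],
    "pressure_bar": [1, 5],
}

def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]

runs = full_factorial(factors)
for run_number, run in enumerate(runs, start=1):
    print(run_number, run)
```

With three factors at two levels each, this gives 2 x 2 x 2 = 8 runs; in practice the run order would normally be randomised before the experiment is carried out.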
Prospective and Retrospective Power Analysis • Power analysis helps you to make a decision on sample size / power • Only retrospective in SPSS • Both in SAS • Both in Minitab • Not available in Excel except maybe through an add-in module
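As a rough illustration of the two directions, the sketch below uses the normal approximation for a two-sided comparison of two group means. The function names are hypothetical, the effect size is a standardised difference (difference in means divided by the common standard deviation), and packages that use the t distribution will usually report slightly larger sample sizes.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Prospective: sample size per group needed to reach the target power."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

def achieved_power(effect_size, n_per_group, alpha=0.05):
    """Retrospective: approximate power for a given sample size per group.

    The negligible contribution of the opposite tail is ignored.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(effect_size * sqrt(n_per_group / 2) - z_alpha)

n = sample_size_per_group(0.5)
print(n, round(achieved_power(0.5, n), 3))
```

Under these assumptions, a medium standardised effect of 0.5 needs 63 subjects per group for 80% power at the 5% significance level.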
Reading Data Files from Other Software • SPSS is okay • Minitab Good • SAS Good • Excel Very Good (most statistical software can read (open) an Excel file) • The best way to store data is as a flat file using Notepad (text).
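A flat file is just plain text with one record per line, which is why almost any package can read it. A minimal sketch using Python's standard csv module is shown below; the file name, column names and values are hypothetical.

```python
import csv
import os
import tempfile

# Hypothetical records; values are kept as text, as they would be in a flat file.
rows = [
    {"id": "1", "group": "control", "score": "4.2"},
    {"id": "2", "group": "treated", "score": "5.1"},
]

# Write to a temporary folder so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "example_data.csv")

with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "group", "score"])
    writer.writeheader()
    writer.writerows(rows)

# Read the same file back; each line becomes a dict keyed by the header row.
with open(path, newline="", encoding="utf-8") as f:
    loaded = list(csv.DictReader(f))

print(loaded)
```

The resulting file can be opened in Notepad, and comma-separated text like this is a common import format for statistical packages.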
Getting Output (tables and graphs) into Other Software • They are all good at this, particularly getting output into a Word document or PowerPoint presentation via Copy and Paste. • SPSS output file can be exported as html, pdf, rtf, doc • SAS output can be listing (local to SAS) or html (or pdf and rtf via EG) • Minitab output can be txt, html, pdf, rtf • Excel output can also be copied and pasted in a variety of formats
Programming Language • SPSS, Minitab and SAS all have very powerful programming languages. As you point and click, a programme is being built behind the scenes. Demonstrate • Excel also has a programming language, Visual Basic for Applications (VBA) (Anyone here used it before?)
Automatic Updating of Graphs and Tables after Adding More Data • SPSS does not have this function; you simply have to reproduce the output again (e.g. by re-running the syntax)! • With Minitab, you can update the graph and table • SAS does not have this function • Excel has this function
New Releases • IBM SPSS Statistics produces new versions too often, almost every year! Version 23 is currently on campus, but 24 is out! • Minitab Ltd produces new releases at just the right pace; version 17 is currently on campus • SAS Institute Inc. produces new releases about every couple of years; the latest version is 9.4. 9.1/9.2/9.3 for SAS Base and 7 for SAS EG • Microsoft produces new releases about every couple of years, currently on Windows 10.
Usage In Industry Vs Academia • SPSS and Minitab heavily used in Academia, used in Industry but not a lot • SAS not heavily used in Academia, heavily used in Industry (most clinical trials use SAS) • Excel heavily used in both Academia and Industry
Operating Systems • SPSS runs on Windows, Macintosh, and Unix • Minitab runs mainly on Windows and Macintosh (10Xtra last version on Mac) • SAS runs on Windows, Macintosh, Unix, Linux • Excel runs mainly on Windows and Macintosh
Pivot Tables Very good for displaying information online. Can be very interactive • Minitab: static, not interactive • Excel: interactive • SPSS: interactive • SAS: interactive Demonstrate if possible!
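For anyone who has not met a pivot table, the sketch below shows the underlying idea in plain Python: records are grouped by a row key and a column key, and the values are summed. The data (faculty, role, number of attendees) is made up for illustration.

```python
from collections import defaultdict

# Hypothetical records: (faculty, role, number of attendees).
records = [
    ("Science", "Staff", 3),
    ("Science", "PhD", 5),
    ("Arts", "Staff", 2),
    ("Arts", "PhD", 4),
    ("Science", "PhD", 1),
]

def pivot_sum(records):
    """Cross-tabulate records: rows by first key, columns by second key, summing values."""
    table = defaultdict(lambda: defaultdict(int))
    for row_key, col_key, value in records:
        table[row_key][col_key] += value
    # Convert nested defaultdicts to plain dicts for easy printing and comparison.
    return {row: dict(cols) for row, cols in table.items()}

pivot = pivot_sum(records)
for row, cols in pivot.items():
    print(row, cols)
```

The interactive pivot tables in Excel, SPSS and SAS do the same grouping, but let you drag fields between rows and columns and re-summarise on the fly.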
Text Analytics • SPSS • SPSS Modeler (different from IBM SPSS Statistics) • SAS • SAS Enterprise Miner (different from SAS Base and Enterprise Guide) • Minitab • ?? • Excel • With extension
Statistical Modelling • “A statistical model is a class of mathematical model, which embodies a set of assumptions concerning the generation of some sample data, and similar data from a larger population. A statistical model represents, often in considerably idealized form, the data-generating process” • There are three purposes for a statistical model: • Predictions • Extraction of information • Description of stochastic structures • Some common models: • Regression Model • Simple Linear Regression • Multiple Linear Regression • Logistic Regression (Binary and Multinomial) • Ordinal • Nonlinear • SAS, SPSS and Minitab are good for statistical modelling
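As a small worked example of the first model in the list, the sketch below fits a simple linear regression by least squares using only the Python standard library. The function name and the data are hypothetical; the data was constructed to lie exactly on the line y = 1 + 2x so the result is easy to check by hand.

```python
from statistics import mean

def simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line through the points."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data lying exactly on y = 1 + 2x.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

intercept, slope = simple_linear_regression(x, y)
print(intercept, slope)
```

The statistical packages report the same two coefficients, together with standard errors, p-values and diagnostics that this sketch leaves out.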
Ordinary Least Squares (OLS); Weighted Least Squares (WLS); Two-Stage Least Squares (2SLS); Non-Linear Least Squares (NLLS); General Linear Model (GLM); Least Absolute Deviation regression (LAD)
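For reference, three of the estimators named above can be written out as follows. This is a sketch in standard textbook notation that does not appear in the slides: y is the response vector, X the design matrix with i-th row x_i, W = diag(w_1, ..., w_n) a matrix of known weights, and the matrices being inverted are assumed to be invertible.

```latex
\begin{align*}
\hat{\beta}_{\mathrm{OLS}} &= \arg\min_{\beta} \sum_{i=1}^{n} \bigl(y_i - x_i^{\top}\beta\bigr)^2 = (X^{\top}X)^{-1}X^{\top}y \\
\hat{\beta}_{\mathrm{WLS}} &= \arg\min_{\beta} \sum_{i=1}^{n} w_i \bigl(y_i - x_i^{\top}\beta\bigr)^2 = (X^{\top}WX)^{-1}X^{\top}Wy \\
\hat{\beta}_{\mathrm{LAD}} &= \arg\min_{\beta} \sum_{i=1}^{n} \bigl\lvert y_i - x_i^{\top}\beta \bigr\rvert
\end{align*}
```

OLS and WLS have closed-form solutions, whereas LAD generally has to be solved numerically, which is one reason it is less sensitive to outliers but less commonly the default.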
Now Some Demonstrations • Most of these packages can open more than one file in a session • SPSS • Minitab • SAS • Excel
Big Data Analytics • IBM SPSS Modeler • SAS Enterprise Miner • R
Text Mining Tools (screenshot labels: Tools Palette, Text Mining Tools, Project Panel, Diagram Workspace showing Connected Nodes, Help Panel, Properties Panel)
END • You will use many different software packages depending on your needs! • Questions
Summary • DOE: Minitab or SAS • Power / Sample size calculation: Minitab or SAS • Best way to store your data: Notepad • Automatic Update of Output: Minitab or Excel • SAS very popular in industry • Pivot table: SAS, SPSS or Excel • Modelling: SAS, SPSS or Minitab • Summary statistics: SAS, SPSS or Minitab