
Statistical Data Analysis STAT221A



Presentation Transcript


  1. Statistical Data Analysis STAT221A
     Dr. Judi McWhirter
     Room G3.29 – third floor of G block
     Office hours: by appointment
     Web page for this course: http://www.stats.waikato.ac.nz/Courses
     We will use this website to distribute lecture notes, assignments, and related documents or computer files.
     Statistical Data Analysis - Lecture 1 - 04/03/03

  2. Course Structure
     • Data Exploration, Presentation, and Analysis (3 weeks)
     • Analysis of Variance (ANOVA) and Design (3 weeks)
     • Regression (3 weeks)
     • Multivariate data (3 weeks)

  3. Textbooks
     • There are no set texts for this course, but you might like to read:
     • A.J. Lee, Data analysis – an introduction using R
       • This book will be on desk copy in the library and is currently out of print.
       • Don’t use this as a reference for R: the commands used are extras written especially for that book.
     • Peter Dalgaard, Introductory Statistics with R
       • This book will be on desk copy in the library.
     • Moore & McCabe, Introduction to the Practice of Statistics, 3rd Ed.
       • This book is on desk copy in the library. If you took 0655.121 last year you should already own a copy.

  4. Computer Laboratories
     • The lab is Lab 5 in R Block, room RG.12.
     • Login names/user names will be on the R Block notice board.
     • If your name is not on the board, see Harry Johnston, room RG.20.
     • Get in early! Inability to get to the computer lab is not a valid excuse for late assignments.

  5. Lab times
     • There is a space reserved (sometimes with demonstrators) for our class on:
       • Tuesday 11am-1pm
       • Thursday 2pm-4pm
       • Friday 2pm-4pm

  6. Assessment
     • Internal assessment
       • Computing assignments: 80%
       • One test during class time (30th April): 20%
     • Exam/internal assessment
       • Ratio 1:1
     • Late assignments get zero. Medical certificates and counsellor's certificates are the only accepted excuses.
     • You must get over 40% for your coursework to get credit for it.

  7. Some aims of course
     • Learn more about how statistics can help solve real problems.
       • Including how to figure out what the problem really is!
       • And what the statistical result means!
     • Learn more about communicating the results of statistical manipulations to the ‘client’.
     • Gain and improve specific skills in statistical analysis, e.g. regression, analysis of variance, multivariate analysis.

  8. Some aims of course (continued)
     • But NOT to learn the mathematical theorems that may lie behind statistical procedures.
     • Note the emphasis is on practical statistical inference.
     • R will be used to perform all statistical calculations. We will also use Excel for data management.
     • R: http://lib.stat.cmu.edu/R/CRAN

  9. The statistical process
     [Diagram. Labels: Population, Sample, The statistician, Population value, Sample estimate. Process labels: Sampling, Calculating, Inferring, Estimating.]

  10. Statistical inference about a population
     • Population: the entire set of things we wish to make statements about.
     • Sample: the set of things we have data from.
     • Statistical inference: making probabilistic statements about population parameters based on sample statistics.
     • This requires that the sample be chosen randomly by some probabilistic method.
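
The definitions above can be illustrated with a minimal sketch (in Python, not the course's R). Everything in it is an assumption made for illustration: the population is simulated, the function name `estimate_population_mean` is hypothetical, and the interval uses the normal approximation with the critical value 1.96.

```python
import random
import statistics


def estimate_population_mean(population, sample_size, seed=0):
    """Draw a simple random sample and estimate the population mean.

    Returns the sample mean and an approximate 95% confidence interval
    (normal approximation; the finite-population correction is ignored).
    """
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)  # random, without replacement
    estimate = statistics.mean(sample)            # sample statistic
    std_error = statistics.stdev(sample) / sample_size ** 0.5
    interval = (estimate - 1.96 * std_error, estimate + 1.96 * std_error)
    return estimate, interval


# Hypothetical population of 10,000 simulated values (not data from the course).
population_rng = random.Random(42)
population = [population_rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)  # the parameter we pretend not to know

estimate, interval = estimate_population_mean(population, sample_size=100)
print(f"population mean = {population_mean:.2f}")
print(f"sample estimate = {estimate:.2f}, approx. 95% CI = "
      f"({interval[0]:.2f}, {interval[1]:.2f})")
```

The interval is only meaningful because the sample is drawn by a random mechanism; a hand-picked sample would give no basis for the probabilistic statement.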

  11. Exploration and presentation of univariate data
     • Single variable (there may be multiple samples of the same variable).
     • E.g. reported rapes for 2001 in NZ.
       • We could group these into regions.
     • E.g. results of the “Sex, Drugs, & Rock n Roll” class survey.
       • We will look at each of the questions separately.
       • We will look at responses grouped into males and females.
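
As a rough sketch of what looking at one variable grouped into males and females might involve (in Python, not the course's R), the following computes basic summaries per group. The records are made-up values, not the class survey data, and `summarise_by_group` is a hypothetical helper name.

```python
import statistics
from collections import defaultdict


def summarise_by_group(records):
    """Summarise a single numeric variable separately for each group.

    `records` is an iterable of (group_label, value) pairs.
    """
    groups = defaultdict(list)
    for group, value in records:
        groups[group].append(value)

    summary = {}
    for group, values in groups.items():
        summary[group] = {
            "n": len(values),
            "min": min(values),
            "median": statistics.median(values),
            "max": max(values),
            "mean": statistics.mean(values),
        }
    return summary


# Made-up responses to one survey question, for illustration only.
records = [
    ("female", 2), ("male", 5), ("female", 4),
    ("male", 1), ("female", 3), ("male", 6),
]
summary = summarise_by_group(records)
for group, stats in sorted(summary.items()):
    print(group, stats)
```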

  12. Statistical inference about a relationship
     • We have observations on two or more variables for a number of items.
     • We believe there is an underlying (linear) relationship between the variables, but we don’t know the parameters.
     • Assuming a random error structure on the observations, we make inferences about the parameters of the relationship.
     • Whether or not the conclusions extend to the population of items depends on how the ones we measured were selected.
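
One common way to make inferences about the parameters of a linear relationship is ordinary least squares; the slide does not name a method, so treating it as least squares is an assumption. The sketch below (in Python, not the course's R) estimates the intercept and slope and attaches a standard error to the slope, assuming independent errors with constant variance. The data values and the name `fit_line` are hypothetical.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, slope_se), where slope_se is the usual
    standard error of the slope under independent, constant-variance errors.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    residual_variance = sum(r * r for r in residuals) / (n - 2)
    slope_se = math.sqrt(residual_variance / sxx)
    return intercept, slope, slope_se


# Made-up observations on two variables, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope, slope_se = fit_line(x, y)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f} (SE {slope_se:.3f})")
```

The standard error is what turns the fitted slope into an inference: it measures how much the estimate would vary under the assumed random error structure.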

  13. Inference on a relationship
     • Observational study: which items go into the treatment group(s) and the control group is not determined using randomisation.
     • Randomised experiment: we use randomisation to determine which items go into treatment groups and which go into the control group.
     • We can infer the relationship is a causal one only when the data come from a randomised experiment.
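
The randomisation step of a randomised experiment can be sketched as follows (in Python, not the course's R). This is a minimal illustration assuming one treatment group and one control group of equal size; the item labels and the function name `randomise` are hypothetical.

```python
import random


def randomise(items, seed=None):
    """Randomly split items into a treatment group and a control group.

    Every item is equally likely to land in either group, so the assignment
    cannot depend on any characteristic of the items.
    """
    rng = random.Random(seed)
    shuffled = list(items)   # copy, so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical experimental units.
items = [f"item-{i}" for i in range(1, 11)]
treatment, control = randomise(items, seed=1)
print("treatment:", treatment)
print("control:  ", control)
```

In an observational study this step is missing: group membership is decided by something other than the random number generator, which is why a causal reading is not justified.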

  14. Sex Drugs & Rock n Roll Data
     • Census data (population: students taking first-year statistics 121 or 122 at Waikato) with errors (incorrect responses).
     • The class was surveyed using the randomised response technique.
     • Each person responded to either a sensitive or a dummy question, depending on the outcome of a roll of dice.
     • The data contain incorrect responses.
     • We did not get individual information from the data.
     • The incorrect responses are based on probability, so we can still make inferences about population parameters from the data.
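
To show why probability-based "incorrect" responses still allow inference, here is a sketch of one standard randomised response design (in Python, not the course's R). The design details are assumptions, since the slide does not give them: each respondent answers the sensitive question with known probability `p_sensitive` and otherwise answers a dummy question whose "yes" probability `dummy_yes_prob` is known. The counts are made up, and the function name is hypothetical.

```python
import math


def randomised_response_estimate(yes_count, n, p_sensitive, dummy_yes_prob):
    """Estimate the proportion who would answer "yes" to the sensitive question.

    Model: P(yes) = p_sensitive * pi + (1 - p_sensitive) * dummy_yes_prob,
    which is solved for pi using the observed proportion of "yes" answers.
    """
    observed = yes_count / n
    estimate = (observed - (1 - p_sensitive) * dummy_yes_prob) / p_sensitive
    # Standard error of the estimate, from the binomial variance of `observed`.
    std_error = math.sqrt(observed * (1 - observed) / n) / p_sensitive
    return estimate, std_error


# Hypothetical set-up: a die roll of 1-4 means "answer the sensitive question"
# (probability 2/3); 5-6 means "answer a dummy question" with P(yes) = 0.5.
estimate, std_error = randomised_response_estimate(
    yes_count=120, n=300, p_sensitive=2 / 3, dummy_yes_prob=0.5
)
print(f"estimated proportion = {estimate:.3f} (SE {std_error:.3f})")
```

No individual answer reveals which question was answered, yet the overall proportion is recoverable. The price is a larger standard error than a direct question would give, since the error is divided by `p_sensitive`.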

  15. Questions to think about
     • Can we generalise the conclusions from the population of students taking first-year statistics course 121 or 122 at Waikato to any wider population?
     • What population do you think they could be generalised to?
     • On what basis could you justify generalising them?
