Statistics 100 Lecture Set 1
Lecture Set 1 • Course outline and important details about the course • Chapter 1 … today • Will be doing chapter 2 in the next lecture set • Some suggested problems: • Chapter 1: 1.3, 1.5, 1.11, 1.13, 1.17
Important Stuff • Statistics and Actuarial Science Stats Lab (Statistics Workshop) • What is the Stats Lab for? One-on-one help is available during its hours of operation. • Where is it? The Stats Lab is located in K9516 (inside K9510)… • How does the Stats Lab work? • The Statistics Workshop opens for regular use from the second week of classes. The hours will depend on the amount of T.A. time available and will be posted at the end of the first week of classes. The Workshop will be open only when there is a T.A. on duty. • Typically, Mon-Fri: 9:30-16:30
Important Stuff • Text: Statistics: Concepts and Controversies, 8th edition, by Moore and Notz • People have asked about 7th edition … • Read Chapters 1 and 2 this week (they are short)
Important Stuff • Course web page can be found: www.stat.sfu.ca/~dbingham/stat100 • Download lecture notes day before class • Will also have announcements (e.g., exam dates) • Also has my office hours posted (Monday and Wednesday 1:00-2:00)
Important Stuff • Grading Scheme: • Assignments – 10% • Midterm 1 – 20% • Midterm 2 – 20% • Final – 50% • Tentative mid-term dates: • Mid-Term 1: Monday, February 17 • Mid-Term 2: Monday, March 17
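The grading scheme is a weighted average of the four components. A minimal sketch in Python, assuming each component is scored as a percentage out of 100; the function name `course_grade` and the example scores are hypothetical and not part of the course material:

```python
# Weights taken from the grading scheme on the slide (fractions of the total).
WEIGHTS = {"assignments": 0.10, "midterm1": 0.20, "midterm2": 0.20, "final": 0.50}


def course_grade(scores):
    """Return the weighted course grade, given component scores out of 100."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student scores, for illustration only.
example_scores = {"assignments": 90, "midterm1": 70, "midterm2": 80, "final": 75}
print(round(course_grade(example_scores), 1))
```

Because the final exam carries half the weight, a change of 10 points on the final moves the course grade by 5 points, while the same change on the assignments moves it by only 1.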
Important Stuff • Assignments: 8-10 of them • Usually will be due Wednesdays, before 4:30 in boxes outside lab • The boxes are labeled (by class and alphabetically) • Note: • Late assignments will not be accepted • Assignments placed in the wrong box (e.g., stat 270) will not be accepted
Important Stuff • The classroom is likely to be full • Be courteous … when you come in, do not sit in the aisle seat (unless you are left-handed) • Do not put your bag down on a seat …. • Turn off cell phones, do NOT text, … • People with laptops …
Important Stuff Other stuff • Class email list: I will occasionally email the class with hints and other information. • If the email is not from me or Robin Insley (lab instructor), then it is likely spam
What is this course about? • Statistical methods are used everywhere • Health studies • Industry • Economics • Studying manuscripts • Most courses I teach are concerned with statistical methods • How to fit models to data • How to use statistics to make decisions • This course is not about those things • This course is about statistical reasoning
How to do well • Study and practice • Ask questions • Office hours and the drop-in lab
PART I: Producing Data • Not every product you could buy is well-made • Cars, phones, clothes, food • Cheaply & poorly made vs. carefully & properly made • Data are the same way • Not all numbers should be viewed as having equal quality • How they are collected says a lot about the information that they convey and our degree of belief • Chapter 1 introduces data collection
Chapter 1: Where do Data Come From? What are the data?
Example • Does living near high-voltage power lines cause childhood leukemia? • A study was conducted to see if there is evidence that magnetic fields are related to leukemia • Researchers compared 628 children who had leukemia and 620 who did not • Measured the magnetic field in the rooms of their houses • What are the data?
Example • What are data? • Individuals • Variables
Some Definitions • We are interested in something about a population. • A population is a collection of individuals. • Individuals are the objects described by the data. • Data sets contain information/facts relating to individuals. • Variables are attributes of an individual (e.g., hair color, pain severity, ...).
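One way to picture these definitions is as a table: each row is an individual and each column is a variable. A minimal sketch in Python, loosely modeled on the leukemia example; the class name, field names, units, and values are all made up for illustration and are not the study's data:

```python
from dataclasses import dataclass, fields


@dataclass
class Child:
    """One individual in a hypothetical data set; each field is a variable."""
    has_leukemia: bool      # variable: disease status
    magnetic_field: float   # variable: measured field strength (made-up units)


# Made-up records for illustration only.
data = [
    Child(has_leukemia=True, magnetic_field=0.12),
    Child(has_leukemia=False, magnetic_field=0.09),
    Child(has_leukemia=False, magnetic_field=0.15),
]

n_individuals = len(data)                          # rows = individuals
variable_names = [f.name for f in fields(Child)]   # columns = variables
print(n_individuals, variable_names)
```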
How are data collected? • A good deal of effort is spent trying to figure out what data to collect • Which individuals are measured? • What should be measured to answer the questions of interest? • What population was the data collected from? • What is the population of interest? • Can we afford to conduct the study?
How are data collected? • Purpose of Study • Learn something about a group of individuals • Population = group of individuals that you want to know about • Sample = group of individuals that you actually measure • Examples… • Why not just measure the entire population (census)?
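The difference between a population, a sample, and a census can be sketched with simulated numbers. The Python below assumes a hypothetical population of 10,000 heights and draws the sample at random, which is only one possible way of choosing a sample; none of the numbers come from the course:

```python
import random
from statistics import mean

rng = random.Random(100)  # fixed seed so the sketch is reproducible

# Hypothetical population: simulated heights (cm) of 10,000 individuals.
population = [rng.gauss(170, 10) for _ in range(10_000)]

# The sample is the group we actually measure; here, 100 chosen at random.
sample = rng.sample(population, 100)

# A census measures every individual, so it gives the population mean exactly.
census_mean = mean(population)
sample_mean = mean(sample)

print(f"census mean: {census_mean:.1f}, sample mean: {sample_mean:.1f}")
```

The sample measures 1% of the individuals, which hints at one answer to the question on the slide: a census costs far more time and money than a sample.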
Observational studies • Observational study: observes individuals and measures variables but does not attempt to influence the responses. • An outcome of interest is called a response variable. • Observational studies (“you can observe a lot by watching”) • Identify an individual, watch/measure variables • Do not interfere, merely observe (collect data) • Generally inexpensive… very common
Example (back to leukemia study) • Chapter 1 has a discussion of leukemia and power lines • Looking for an association between magnetic fields and leukemia • Measured electromagnetic fields & lots of other variables • Found no link, despite anecdotes • Notice that researchers did not interfere in the study (e.g., did not intentionally expose children to magnetic fields)
Sample Surveys • Sample survey: a collection of individuals (the sample) is chosen from the population in a specific, quantifiable manner and then measured • Special kind of observational study • Use a sample carefully chosen from the population to best represent the population • Idea is that the sample should be representative of the population, so that we can learn about the population from the sample • Examples: • Political polls: how can we tell who will win an election? • Government surveys: inform policy • Market research • NOT the leukemia study (more on this in chapters 2-4)
Census • Census: A sample survey where the sample is (ideally) the entire population • Example: Statistics Canada conducts the Census of Population and the Census of Agriculture to develop a statistical portrait of Canada and its people
Census • Interesting side note: In the summer of 2010, the Canadian Federal Government announced that the 2011 long-form census questionnaire would no longer be mandatory • What does this mean? • “I want to take this opportunity to comment on a technical statistical issue which has become the subject of media discussion. This relates to the question of whether a voluntary survey can become a substitute for a mandatory census. It can not.” Munir Sheikh, Chief Statistician of Canada
Experiments • Experiment: a study in which a treatment is deliberately imposed on individuals in order to observe their responses. • Why do this? • Why was this not done in the leukemia study? • Experiments: • Clinical Trials • Agriculture • Manufacturing
Example (Pain Reduction and Reiki) • Is Reiki an effective pain management tool? • Reiki is a touch therapy used as an alternative to pain medication. • A pilot study involving 20 volunteers experiencing pain was conducted • All treatments were provided by a certified Reiki therapist • Pain was measured before and after the Reiki treatment • What kind of study is this? • Is this a good study (more on this later)? • If the study was repeated, would we see the same results?
Example (Saving for Retirement) • What are the attitudes of low wage earners about saving for retirement? • Americans earning $35,000 or less were asked how they are likely to accumulate enough money to retire. • What are the data? • What is the population? • What kind of study is this?
Answer: an observational study, likely a sample survey
Which is worse: • Not knowing the answer to a question • Thinking you know the answer, but being wrong "We know he's been absolutely devoted to trying to acquire nuclear weapons, and we believe he has, in fact, reconstituted nuclear weapons." Dick Cheney, March 16, 2003
There are many ways to collect data • Some studies provide good information • Most don’t • How can you tell which is which? • Being skeptical about studies and identifying good sampling techniques is key
Brief Moment of Statistical Relevance • Two highlights from the commercial: • Shoe is proven to “…work your hamstrings and calves 11% harder…” • Shoe is proven to “…tone your butt up to 25% more than regular sneakers just by walking…” • Some other facts not in the commercial: • The study was based on a sample of 5 women who walked on a treadmill for 500 steps wearing either the EasyTone or another Reebok walking shoe, and while barefoot. • From the Reebok fine print: “The shoes are designed only for walking, and because of the instability design, wearers are discouraged from running, jumping and engaging in other athletic activities while wearing them.”
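A sample of 5 is small, and small studies give results that vary a lot from one repetition to the next. The Python sketch below is a simulation under stated assumptions, not an analysis of the Reebok study: it supposes a shoe with no true effect and an assumed person-to-person variation of 15 percentage points (`NOISE_SD`, a made-up value), then compares how much the average result varies across repeated studies of 5 walkers and of 500 walkers:

```python
import random
from statistics import mean, pstdev

rng = random.Random(2024)  # fixed seed so the sketch is reproducible

NOISE_SD = 15.0  # assumed person-to-person variation, in percentage points


def simulate_study(n):
    """Average measured % change for n walkers when the true effect is zero."""
    return mean([rng.gauss(0.0, NOISE_SD) for _ in range(n)])


def spread_of_results(n, repeats=1000):
    """Standard deviation of the study result across many repeated studies."""
    return pstdev([simulate_study(n) for _ in range(repeats)])


spread_5 = spread_of_results(5)      # studies with 5 walkers
spread_500 = spread_of_results(500)  # studies with 500 walkers
print(f"n=5: {spread_5:.1f}   n=500: {spread_500:.1f}")
```

Under these assumptions the results of the 5-person studies vary roughly ten times as much as those of the 500-person studies, which is one reason to be skeptical of a percentage reported from a very small sample.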