Chapter 1

Chapter 1 Introduction to Statistics

Section 1.1: Getting Started

Defining Statistics
• As a category of study, statistics is the science of gathering, describing, and analyzing data.
• As an item of interest, statistics are the actual numerical descriptions of sample data.
• We will mostly use the second definition in what follows.

More Vocabulary
• A population is a particular group of interest.
• A variable is a value or characteristic that changes among members of the population.
• Data are the counts, measurements, or observations gathered about a specific variable in a population in order to study it.
• A census is a study in which data are obtained from every member of the population.
• A parameter is a numerical description of a population characteristic.
• A sample is a subset of the population from which data are collected.
• Sample statistics are numerical descriptions of sample characteristics.
• Sample statistics are used to estimate population parameters.
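The vocabulary above can be illustrated with a minimal Python sketch. The population here (exam scores for 1,000 students), the sample size of 50, and the fixed seed are all made-up assumptions for illustration; they are not from the textbook.

```python
import random
import statistics

# Hypothetical population: exam scores for all 1,000 students in a course.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(40, 100) for _ in range(1000)]

# Parameter: a numerical description of the whole population.
# Computing it requires data from every member, that is, a census.
population_mean = statistics.mean(population)

# Sample: a subset of the population from which data are collected.
sample = rng.sample(population, 50)

# Sample statistic: a numerical description of the sample,
# used here to estimate the population parameter.
sample_mean = statistics.mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

The two printed values will usually be close but not equal, which is why a sample statistic is described as an estimate of the parameter.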

Branches of Statistics
• Descriptive statistics is the branch that gathers, sorts, summarizes, and displays data.
• Inferential statistics is the branch that uses descriptive statistics to estimate population parameters.

Section 1.2: Data Classification

Qualitative vs. Quantitative Data
• Qualitative data consist of labels or descriptions of traits of the sample.
• Quantitative data consist of counts or measurements.
• The number on a player's jersey is qualitative data even though it is a number; it is really a label.
• Numerical scales may also represent qualitative data, so you must consider what the numbers represent. If item 1 is rated 4 and item 2 is rated 8, does this imply that item 2 is twice as good as item 1?

Continuous vs. Discrete Data
• Quantitative data can be either continuous or discrete.
• Continuous data can take any value within a range of values.
• Consider a number line: each point represents some real number.
• Given any two values in a range of continuous data, there is always another value between them.
• Discrete data can take values only from a set of separate, non-continuous values.
• Consider the time shown on a digital clock.
• There is always a next or previous value in the set of available values.

Levels of Measurement
• The higher the level of measurement, the more mathematical calculations can be performed on the data.
• The nominal level of measurement consists of qualitative data such as labels or names. You can count the number of items of each type in a sample.
• The ordinal level of measurement consists of qualitative data that can be arranged in a meaningful order. You can also sort the data.
• The interval level of measurement consists of quantitative data that can be ordered and for which the difference between values is meaningful. You can also subtract one data value from another and get a meaningful result.
• The ratio level of measurement consists of quantitative data for which zero also means the absence of the quantity being measured. You can add, subtract, multiply, and divide data values with meaningful results.
• Data should be categorized using the highest level possible.
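The difference between the interval and ratio levels can be sketched with temperature, a common illustration that is assumed here rather than taken from the slides. Celsius has an arbitrary zero (interval level), while Kelvin has a true zero (ratio level). The helper name `celsius_to_kelvin` is hypothetical.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15

# Two hypothetical temperature readings in degrees Celsius.
a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful at the interval level:
# the gap is the same on both scales.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios are meaningful only at the ratio level:
# 20 C is not "twice as warm" as 10 C, as the Kelvin ratio shows.
ratio_c = b_c / a_c
ratio_k = b_k / a_k

print(f"difference: {diff_c:.2f} C, {diff_k:.2f} K")
print(f"ratio: {ratio_c:.3f} (Celsius), {ratio_k:.3f} (Kelvin)")
```

The difference agrees on both scales, but the ratio changes when the zero point moves, so only the scale with a true zero supports statements such as "twice as much".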

Section 1.3: The Process of a Statistical Study

Procedure for Conducting a Statistical Study
• Determine the design of the study: state the question to be studied, determine the population and variables, and determine the sampling method.
• Collect the data.
• Organize the data.
• Analyze the data to answer the question.

Data Collection
• An observational study observes data that already exist. It cannot be used to determine cause-and-effect relationships.
• An experiment or simulation can be used to create data. This technique can be used to determine a cause-and-effect relationship.
• Moral issues, particularly with experiments on living things, may preclude an experiment.

Observational Studies
• It may not be possible or cost-effective to perform a census for a given study, so a representative sample must be collected. A representative sample has the same relevant characteristics as the population and does not favor one group from the population over another.
• How the sample is selected is very important to the validity of the study.

Sampling Methods
• Random Sampling
• Simple Random Sampling
• Stratified Sampling
• Cluster Sampling
• Systematic Sampling
• Convenience Sampling

Random Sampling
• A random sample is one in which every member of the population has an equal chance of being selected.
• For large populations, technology is typically used to select members randomly. See the technology section at the end of Chapter 1 of the textbook.

Simple Random Sampling
• Simple random sampling is similar to random sampling in that every member of the population has an equal chance of being selected.
• It goes further: every possible sample of a given size also has an equal chance of being selected.
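A minimal sketch of drawing a simple random sample with Python's standard library follows. The population of 200 labeled members, the sample size of 10, and the seed are hypothetical. `random.sample` selects without replacement, so every subset of the requested size is equally likely.

```python
import random

# Hypothetical population: 200 members identified by a label.
population = [f"member_{i:03d}" for i in range(1, 201)]

rng = random.Random(2024)  # fixed seed so the sketch is repeatable

# Draw 10 distinct members; each possible group of 10 is equally likely.
sample = rng.sample(population, 10)

print(sample)
```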

Stratified Sampling
• To collect a stratified sample, start by dividing the population into strata, or groups, whose members share characteristics that are important to the study.
• A random sample is then selected from each stratum, and these samples are combined to make the stratified sample.
• This type of sampling is done to make sure the sample represents the structure of the population with respect to the characteristics used.
• Quota sampling is a type of stratified sampling that matches the percentages in the sample to the population percentages. For example, if 15% of the population has characteristic A, one would make characteristic A a stratum and make sure that 15% of the total sample comes from that stratum.
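The proportional approach described above might be sketched as follows. The population (1,000 students split by class year), the function name `stratified_sample`, and the seed are assumptions made for illustration. Note that rounding each stratum's share can make the total differ slightly from the requested size for other inputs.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 students; class year defines the strata.
population = (
    [("freshman", i) for i in range(400)]
    + [("sophomore", i) for i in range(300)]
    + [("junior", i) for i in range(150)]
    + [("senior", i) for i in range(150)]
)

def stratified_sample(members, sample_size, rng):
    """Draw a random sample from each stratum, sized in proportion
    to that stratum's share of the population."""
    strata = {}
    for label, ident in members:
        strata.setdefault(label, []).append((label, ident))

    chosen = []
    for label, group in strata.items():
        # Proportional allocation: a stratum holding 15% of the
        # population supplies 15% of the sample.
        k = round(sample_size * len(group) / len(members))
        chosen.extend(rng.sample(group, k))
    return chosen

sample = stratified_sample(population, 100, rng)

# Count how many sampled members came from each stratum.
counts = {}
for label, _ in sample:
    counts[label] = counts.get(label, 0) + 1

print(counts)
```

With these made-up numbers, juniors are 15% of the population and supply 15 of the 100 sampled members.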

Cluster Sampling
• To collect a cluster sample, divide the population into clusters (groups of members), where each cluster is similar to the entire population.
• Then some number of clusters is randomly chosen, and every member of each chosen cluster is observed.
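A short sketch of cluster sampling, assuming a hypothetical population of 20 classrooms (the clusters) with 25 students each. Whole clusters are chosen at random, and then every member of each chosen cluster is included.

```python
import random

rng = random.Random(11)  # fixed seed so the sketch is repeatable

# Hypothetical population: 20 classrooms (clusters) of 25 students each.
clusters = {
    f"room_{r:02d}": [f"room_{r:02d}_student_{s:02d}" for s in range(1, 26)]
    for r in range(1, 21)
}

# Step 1: randomly choose some number of clusters (3 here).
chosen_rooms = rng.sample(sorted(clusters), 3)

# Step 2: observe every member of each chosen cluster.
sample = [student for room in chosen_rooms for student in clusters[room]]

print(chosen_rooms)
print(len(sample))
```

In contrast with stratified sampling, the randomness here is in which groups are picked, not in which members are picked within a group.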

Systematic Sampling
• To collect a systematic sample, choose every nth member of the population.
• This technique may not produce a representative sample if there is a pattern in the population.
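Choosing every nth member can be sketched with list slicing. The population of 100 numbered members and the interval of 10 are hypothetical, and the random starting position is an added assumption (the slide says only to take every nth member).

```python
import random

rng = random.Random(5)  # fixed seed so the sketch is repeatable

# Hypothetical population: members numbered 1 through 100, in list order.
population = list(range(1, 101))

n = 10  # sampling interval: take every 10th member

# Assumption: start at a random position within the first interval.
start = rng.randrange(n)

# Take every nth member from the starting position onward.
sample = population[start::n]

print(sample)
```

If the list order repeats with the same period as the interval (for example, every 10th member is a group leader), the sample would contain only one kind of member, which is the pattern problem noted above.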

Convenience Sampling
• A convenience sample is one that is convenient for the researcher to collect.
• For example, if you were to collect a sample of people, it would probably be convenient to talk to the people in the room with you right now.
• Convenience sampling may not produce samples that accurately represent the population.

Types of Observational Studies
• In a cross-sectional study, data are gathered at a single point in time. This gives a “snapshot” of the situation at that point in time.
• In a longitudinal study, data are gathered by following a particular group over a period of time. This allows for the observation of patterns that are not visible in a single snapshot.
• A meta-analysis is a study that compiles information from previous studies. This technique requires all of the other studies to have been done with good technique.
• A case study looks at multiple variables that affect a single event. This technique focuses on a single event.

Experiments
• A treatment is some condition that is applied to a group of subjects in an experiment.
• Subjects are people or things being studied in an experiment.
• Participants are people being studied in an experiment.
• The response variable is the variable in an experiment that responds to the treatment.
• The explanatory variable is the variable in an experiment that causes the change in the response variable.

Principles of Experimental Design
• Randomize the control and treatment groups.
• Control for outside effects on the response variable.
• Replicate the experiment a significant number of times to see meaningful patterns.

Definitions
• A control group is a group of subjects to which no treatment is applied in an experiment.
• A treatment group is a group of subjects to which researchers apply a treatment in an experiment.
• Confounding variables are factors other than the treatment that cause an effect on the subjects of an experiment.
• The placebo effect is a response to the power of suggestion, rather than the treatment itself, by participants of an experiment.
• A placebo is a substance that appears identical to the actual treatment but contains no intrinsic beneficial elements.
• In a single-blind experiment, subjects do not know if they are in the control group or the treatment group, but the people interacting with the subjects in the experiment know in which group each subject has been placed.
• In a double-blind experiment, neither the subjects nor the people interacting with the subjects know to which group each subject belongs.
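Randomizing the control and treatment groups can be sketched as a shuffle followed by a split. The 20 labeled subjects, the even split, and the seed are hypothetical choices for illustration, not a prescribed procedure.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Hypothetical subjects enrolled in an experiment.
subjects = [f"subject_{i:02d}" for i in range(1, 21)]

# Shuffle a copy so that chance, not the researcher, decides the groups.
shuffled = subjects[:]
rng.shuffle(shuffled)

# Split the shuffled list in half.
half = len(shuffled) // 2
treatment_group = shuffled[:half]  # receives the treatment
control_group = shuffled[half:]    # receives no treatment (or a placebo)

print("treatment:", treatment_group)
print("control:  ", control_group)
```

In a double-blind setup, the resulting assignment would be kept from both the subjects and the people interacting with them.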

Institutional Review Boards
• An Institutional Review Board is a group of people who review the design of a study to make sure that it is appropriate and that no unnecessary harm will come to the subjects involved.
• Informed consent, human or animal subjects, and confidentiality are of great concern in experiments.
• Informed consent involves completely disclosing to participants the goals and procedures involved in a study and obtaining their agreement to participate.

Section 1.4: How to Critique a Published Study

Consider the Source
• You need to consider why the study was done and by whom.
• You need to consider when and where the study was done.
• You need to consider the population studied.
• Any of these issues may give you cause to doubt the validity or applicability of the study.

Consider the Variables
• Were the terms clearly defined? Do they mean the same thing to the researchers, respondents, and you?
• Were all of the confounding variables accounted for?

Consider the Setup
• You need to consider any bias in the study. The different types of bias are defined later, but they tend to fall into a few categories.
• In most cases, the sample needs to be large and to reflect the total population.
• The actual collection and processing of the data need to be accurate and fair.
• The researcher must not influence the respondents.
• The respondents need to participate truthfully in the study.

Consider the Conclusions
• You need to consider whether the data actually support the conclusion.
• Are all of the results given?

Definitions
• Bias is the favoring of a certain outcome in a study.
• Sampling bias occurs when the sample chosen does not accurately represent the population being studied.
• Dropouts are participants who begin a study but fail to complete it.
• Processing errors are errors that occur simply from the data being processed, such as typos when data are being entered.
• Nonadherents are participants who remain in the study until the end but stray from the directions they were given.
• Researcher bias occurs when a researcher influences the results of a study.
• Response bias occurs when a researcher’s behavior causes a participant to alter his or her response or when a participant gives an inaccurate response.
• Participation bias occurs when there is a problem with the participation, or lack of participation, of those chosen for the study.
• Nonresponse bias occurs when there is a lack of participation in a self-selected sample from certain segments of a population, when a person refuses to participate in a survey, or when a respondent omits questions when answering a survey.
