Biotechnology Laboratory Technician Program

Course: Basic Biotechnology Laboratory Skills for a Regulated Workplace. Lisa Seidman, Ph.D. Statistics: A Brief Introduction.

Presentation Transcript


  1. Biotechnology Laboratory Technician Program • Course: Basic Biotechnology Laboratory Skills for a Regulated Workplace • Lisa Seidman, Ph.D.

  2. STATISTICS A BRIEF INTRODUCTION

  3. WHY LEARN ABOUT STATISTICS? • Statistics provides tools that are used in • Quality control • Research • Measurements • Sports

  4. IN THIS COURSE • We will use some of these tools • Ideas • Vocabulary • A few calculations

  5. VARIATION • There is variation in the natural world • People vary • Measurements vary • Plants vary • Weather varies

  6. Variation among organisms is the basis of natural selection and evolution

  7. EXAMPLE • 100 people take a drug and 75 of them get better • 100 people don’t take the drug but 68 get better without it • Did the drug help?

  8. VARIABILITY IS A PROBLEM • There is variation in response to the illness • There is variation in response to the drug • So it’s difficult to figure out if the drug helped
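
The arithmetic behind the drug example can be sketched in a few lines of Python. This only restates the counts given in the example as recovery rates; the variable names are illustrative. The numbers alone do not show whether the difference is larger than what ordinary variation would produce.

```python
# Counts taken from the drug example above.
treated_total, treated_recovered = 100, 75
untreated_total, untreated_recovered = 100, 68

# Recovery rate = number who got better / number in the group.
treated_rate = treated_recovered / treated_total
untreated_rate = untreated_recovered / untreated_total
difference = treated_rate - untreated_rate

print(f"Recovered with drug:    {treated_rate:.0%}")
print(f"Recovered without drug: {untreated_rate:.0%}")
print(f"Difference:             {difference * 100:.0f} percentage points")
```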

  9. STATISTICS • Provides mathematical tools to help arrive at meaningful conclusions in the presence of variability

  10. Statistics might help researchers decide whether a drug is helpful • This is a more advanced application of statistics than we will get into

  11. DESCRIPTIVE STATISTICS • Chapter 16 in your textbook • Descriptive statistics is one area within statistics

  12. DESCRIPTIVE STATISTICS • Provides tools to DESCRIBE, organize and interpret variability in our observations of the natural world

  13. DEFINITIONS • Population: • Entire group of events, objects, results, or individuals, all of whom share some unifying characteristic

  14. POPULATIONS • Examples: • All of a person’s red blood cells • All the enzyme molecules in a test tube • All the college students in the U.S.

  15. SAMPLE • Sample: Portion of the whole population that represents the whole population

  16. Example: It is virtually impossible to measure the level of hemoglobin in every cell of a patient • Rather, take a sample of the patient’s blood and measure the hemoglobin level

  17. MORE ABOUT SAMPLES • Representative sample: sample that truly represents the variability in the population -- good sample

  18. TWO VOCABULARY WORDS • A sample is random if all members of the population have an equal chance of being drawn • A sample is independent if the choice of one member does not influence the choice of another • Samples need to be taken randomly and independently in order to be representative
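
As a rough illustration of random sampling, the sketch below draws 15 vials from a batch using Python's standard `random` module. The batch size of 500 and the vial labels are assumptions made up for the example. `random.sample` gives every member of the list the same chance of being drawn and does not pick the same member twice.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: a batch of 500 labeled vials (assumed size).
batch = [f"vial-{i:03d}" for i in range(1, 501)]

# Draw 15 vials; every vial in the batch has an equal chance of selection.
sample = rng.sample(batch, k=15)
print(sample)
```

In a real laboratory the hard part is usually physical, for example making sure vials from every tray or every part of the run can be reached, rather than generating the random numbers.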

  19. SAMPLING • How we take a sample is critical and often complex • If sample is not taken correctly, it will not be representative

  20. EXAMPLE • How would you sample a field of corn?

  21. VARIABLES • Variables: • Characteristics of a population (or a sample) that can be observed or measured • Called variables because they can vary among individuals

  22. VARIABLES • Examples: • Blood hemoglobin levels • Activity of enzymes • Test scores of students

  23. A population or sample can have many variables that can be studied • Example • The same population of six-year-old children can be studied for • Height • Shoe size • Reading level • Etc.

  24. DATA • Data: Observations of a variable (singular is datum) • May or may not be numerical • Examples: • Heights of all the children in a sample (numerical) • Lengths of insects (numerical) • Pictures of mouse kidney cells (not numerical)

  25. ALWAYS UNCERTAINTY • Even if you take a sample correctly, there is uncertainty when you use a sample to represent the whole population • Various samples from the same population are unlikely to be identical • So, need to be careful about drawing conclusions about a population, based on a sample – there is always some uncertainty

  26. SAMPLE SIZE • If a sample is drawn correctly, then, the larger the sample, the more likely it is to accurately reflect the entire population • If it is not done correctly, then a bigger sample may not be any better • How does this apply to the corn field?
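
The effect of sample size can be explored with a small simulation. The sketch below is only an illustration under assumed conditions: the population is 10,000 made-up heights, and samples are drawn correctly (at random). It repeats the sampling many times and measures how much the sample means scatter for small and large samples.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 heights in cm (values are invented).
population = [rng.gauss(170, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

def spread_of_sample_means(sample_size, repeats=200):
    """Draw many random samples and report how much their means vary."""
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)

spread_small = spread_of_sample_means(5)
spread_large = spread_of_sample_means(500)

print(f"Population mean: {population_mean:.1f}")
print(f"Scatter of sample means, n = 5:   {spread_small:.2f}")
print(f"Scatter of sample means, n = 500: {spread_large:.2f}")
```

The means of the larger samples cluster much more tightly around the population mean. The simulation says nothing about badly drawn samples; a biased sampling method stays biased no matter how many items are taken.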

  27. INFERENTIAL STATISTICS • Another branch of statistics • Won’t talk about it much • Deals with tools to handle the uncertainty of using a sample to represent a population

  28. EXAMPLE PROBLEM • In a quality control setting, 15 vials of product from a batch are tested. What is the sample? What is the population? • In an experiment, the effect of a carcinogenic compound was tested on 2000 lab rats. What is the sample? What is the population?

  29. In a clinical study, a new drug was tested on fifty patients. What is the sample? What is the population?

  30. ANSWERS • 15 vials, the sample, were tested for QC. The population is all the vials in the batch. • The sample is the rats that were tested. The population is probably all lab rats. • The sample is the 50 patients tested in the trial. The population is all patients with the same condition.

  31. EXAMPLE PROBLEM • An advertisement says that 2 out of 3 doctors recommend Brand X. • What is the sample? What is the population? • Is the sample representative? • Does this statement ensure that Brand X is better than competitors?

  32. ANSWER • Many abuses of statistics relate to poor sampling. The population of interest is all doctors. No way to know what the sample is. The sample could have included only relatives of employees at Brand X headquarters, or only doctors in a certain area. Therefore the statement does not ensure that the majority of doctors recommend Brand X. It certainly does not ensure that Brand X is best.

  33. DESCRIBING DATA SETS • Draw a sample from a population • Measure values for a particular variable • Result is a data set

  34. DATA SETS • Individuals vary, therefore the data set has variation • Data without organization is like letters that aren’t arranged into words

  35. Numerical data can be arranged in ways that are meaningful – or that are confusing or deceptive

  36. DESCRIPTIVE STATISTICS • Provides tools to organize, summarize, and describe data in meaningful ways • Example: • The exam scores for a class are the data set • What is the variable of interest? • Can summarize with the class “average”; what does this tell you?

  37. A measure that describes a data set, such as the average, is sometimes called a “statistic” • Average gives information about the center of the data

  38. MEDIAN AND MODE • Two other statistics that give information about the center of a set of data • Median is the middle value • Mode is most frequent value

  39. MEASURES OF CENTRAL TENDENCY • Measures that describe the center of a data set are called: Measures of Central Tendency • Mean, median, and the mode

  40. HYPOTHETICAL DATA SET • 2 5 6 7 8 3 9 3 10 4 7 4 6 11 9 • Simplest way to organize them is to put them in order: • 2 3 3 4 4 5 6 6 7 7 8 9 9 10 11 • By inspection they center around 6 or 7
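
Putting the hypothetical data set in order is easy to reproduce. This short Python sketch sorts the 15 values from the slide and picks out the middle one; the variable names are illustrative.

```python
# The hypothetical data set, in the order it was recorded.
data = [2, 5, 6, 7, 8, 3, 9, 3, 10, 4, 7, 4, 6, 11, 9]

# Sorting is the simplest way to organize the values.
ordered = sorted(data)

# With 15 values, the middle one is the 8th (index 7, counting from 0).
middle = ordered[len(ordered) // 2]

print(ordered)
print("Middle value:", middle)
```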

  41. MEAN • Mean is basically the same as the average • Add all the numbers together and divide by the number of values • 2 3 3 4 4 5 6 6 7 7 8 9 9 10 11 • What is the mean for this data set?

  42. NOMENCLATURE • Mean = 6.3 = $\bar{X}$, read “X bar” • The observations are called $X_1$, $X_2$, etc. • There are 15 observations in this example, so the last one is $X_{15}$ • $\bar{X} = \frac{\sum X_i}{n}$, where n = number of values
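
The mean, the sum of the observations divided by the number of values, can be checked directly for the 15 observations in this example. A minimal Python sketch, with illustrative variable names:

```python
# The 15 observations from the example.
observations = [2, 3, 3, 4, 4, 5, 6, 6, 7, 7, 8, 9, 9, 10, 11]

n = len(observations)      # number of values
total = sum(observations)  # sum of all the observations
mean = total / n           # the mean, "X bar"

print(f"n = {n}, sum = {total}, mean = {mean:.1f}")
```

The sum is 94 and n is 15, so the mean is about 6.27, which rounds to the 6.3 quoted on the slide.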

  43. EXAMPLE • Data set: 2 3 3 4 5 6 7 8 9 • What is the mode? • What is the median?
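
One way to check an answer to this example is with Python's standard `statistics` module, which provides `median` and `mode` functions. The sketch below applies them to the nine values from the slide.

```python
import statistics

data = [2, 3, 3, 4, 5, 6, 7, 8, 9]

# Median: the middle value of the ordered data (the 5th of 9 values here).
median = statistics.median(data)

# Mode: the value that occurs most often (3 appears twice).
mode = statistics.mode(data)

print("Median:", median)
print("Mode:", mode)
```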

  44. MEAN OF A POPULATION VERSUS THE MEAN OF A SAMPLE • Statisticians distinguish between the mean of a sample and the mean of a population • The sample mean is $\bar{X}$ • The population mean is μ • It is rare to know the population mean, so the sample mean is used to represent it

  45. DISPERSION • Data sets A and B both have the same average: • A: 4 5 5 5 6 6 • B: 1 2 4 7 8 9 • But they are not the same: • A is more clumped around the central value • B is more dispersed, or spread out
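
The comparison of data sets A and B can be verified with a short calculation. This Python sketch confirms that the two means match and uses the distance between the lowest and highest values as a first, rough indication of spread.

```python
a = [4, 5, 5, 5, 6, 6]
b = [1, 2, 4, 7, 8, 9]

# Both data sets add up to 31 over 6 values, so the means are identical.
mean_a = sum(a) / len(a)
mean_b = sum(b) / len(b)

# Distance between the lowest and highest value in each set.
spread_a = max(a) - min(a)
spread_b = max(b) - min(b)

print(f"A: mean = {mean_a:.2f}, lowest to highest = {spread_a}")
print(f"B: mean = {mean_b:.2f}, lowest to highest = {spread_b}")
```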

  46. MEASURES OF DISPERSION • Measures of central tendency do not describe how dispersed a data set is • Measures of dispersion do; they describe how much the values in a data set vary from one another

  47. MEASURES OF DISPERSION • Common measures of dispersion are: • Range • Variance • Standard deviation • Coefficient of variation
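
For readers who want to see these measures as numbers, the sketch below computes them for the 15-value hypothetical data set with Python's `statistics` module. It rests on assumptions not stated on the slides: the data are treated as a sample, so `variance` and `stdev` divide by n - 1, and the coefficient of variation is taken as the standard deviation divided by the mean, expressed as a percentage.

```python
import statistics

data = [2, 3, 3, 4, 4, 5, 6, 6, 7, 7, 8, 9, 9, 10, 11]

mean = statistics.mean(data)
data_range = max(data) - min(data)

# Sample variance and sample standard deviation (divide by n - 1).
variance = statistics.variance(data)
std_dev = statistics.stdev(data)

# Coefficient of variation: standard deviation relative to the mean, in percent.
cv_percent = std_dev / mean * 100

print(f"Range:                    {data_range}")
print(f"Variance:                 {variance:.2f}")
print(f"Standard deviation:       {std_dev:.2f}")
print(f"Coefficient of variation: {cv_percent:.1f}%")
```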

  48. CALCULATIONS OF DISPERSION • Measures of dispersion, like measures of central tendency, are calculated • Range is the difference between the lowest and highest values in a data set

  49. Example: 2 3 3 4 4 5 6 6 7 7 8 9 9 10 11 • Range: 11-2 = 9 or, 2 to 11 • Range is not particularly informative because it is based only on two values from the data set
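
The range calculation can be written out as a small Python sketch. It uses only the lowest and highest values, which is exactly why the range says little about the values in between.

```python
data = [2, 3, 3, 4, 4, 5, 6, 6, 7, 7, 8, 9, 9, 10, 11]

lowest = min(data)
highest = max(data)

# Range: the difference between the highest and lowest values.
data_range = highest - lowest

print(f"Range: {highest} - {lowest} = {data_range} (or {lowest} to {highest})")
```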
