Statistical Reasoning and Applications Nutan S. Mishra University Of South Alabama
Definition
• Statistics, as a field of study, is a group of methods for collecting, analyzing, presenting, and interpreting data.
• Statistics is different from mathematics. How?
• In mathematics we study deterministic models.
• In statistics we study non-deterministic models.
Two areas of statistics
• Descriptive statistics: a group of methods to organize, display, and describe data.
• Inferential statistics: methods to draw inferences about a population from a given sample.
Question: what is a population and what is a sample?
• A population (target population) is the collection of all items of interest. It is usually defined in the statement of the problem itself.
Examples of populations
• To study the GPA of all students at USA (University of South Alabama), the population consists of all students at USA.
• To study the GPA of all students at USA with age < 20, the population consists of all students at USA with age < 20.
More examples of populations
• To study the annual income of the citizens of the United States, the population consists of all citizens of the United States.
• To study the rainfall pattern in Mobile during 2003, the population consists of all 365 days of 2003.
• To study the monthly growth pattern of a child, the population consists of all months since she was born.
What is data?
• The information collected from one member of the population is a datum.
• A collection of such information on different members is data.
Example of data
Problem: To study the GPA of all students at USA with age < 20.
• Target population: all students at USA with age < 20 (Jennie, Tom, …).
• Jennie is a member of this population.
• The characteristic we are studying here is GPA.
• Jennie's GPA is 3.68; 3.68 is a datum.
• Tom's GPA is 2.95; 2.95 is a datum.
• A collection of such values is called data.
What is a sample?
A sample is a small portion of the population.
• Collecting information on a small set of members (a sample) of the population is called a SAMPLE SURVEY.
• Collecting information on all members of the population is called a CENSUS.
Why sample?
When collecting information on every member of the population is not possible, we draw a sample from the population and collect information on the members of the sample. Some situations where this arises:
• Time constraints
• Budget constraints
• Destructive testing
• Very large population
• Unreachable members
A desirable property of a sample
The sample should be representative of the population. The purpose of a sample survey is to make decisions about the population, so it is important that the sample closely match the population.
A sample is representative when its characteristics match those of the population as closely as possible.
Example (of a representative sample)
Problem: To study the annual income of all citizens of the United States.
• A representative sample consists of members from all income brackets: some from the low-income group, some from the middle-income group, and others from the high-income group.
• A non-representative sample would consist of members from the high-income group only (and would give a rosy picture).
Simple random sample
• A sample is called a RANDOM SAMPLE if each member of the population has some chance of being selected in the sample.
• A sample is called a SIMPLE RANDOM SAMPLE if each member of the population has an equal chance of being selected in the sample.
• A simple random sample tends to be a representative sample.
How to draw a simple random sample?
• Use a lottery system.
• Use a random number table.
A simple random sample can be drawn in two different ways:
• With replacement
• Without replacement
These two ways give nearly the same results when the population size is very large compared to the sample size.
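The two ways of drawing can be sketched in Python, with a pseudo-random number generator standing in for the lottery or random number table. This is only an illustration: the population of 1,000 numbered members, the sample size of 10, and the seed are all assumptions, not part of the slides.

```python
import random

# Hypothetical population: ID numbers 1..1000 standing in for its members.
population = list(range(1, 1001))

# Fixed seed (an arbitrary choice) so the draw can be repeated.
rng = random.Random(2003)

# Without replacement: each member can appear in the sample at most once.
sample_without = rng.sample(population, k=10)

# With replacement: a member goes back into the pool after being drawn,
# so the same member may be selected more than once.
sample_with = rng.choices(population, k=10)

print("without replacement:", sample_without)
print("with replacement:   ", sample_with)
```

With 1,000 members and only 10 draws, repeats in the with-replacement sample are unlikely, which is the sense in which the two ways agree for a large population.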
Basic terms
• Variable: a characteristic under study, which assumes different values for different members of the population.
• Example: To study the GPA of the students at USA with age < 20, the variable is GPA. Tom's GPA is 2.95, Jennie's GPA is 3.68, …
• Variables are usually denoted by x, y, z, etc.
Basic terms
• An observation (or measurement) is a specific value of the variable.
• For example, one value of the variable GPA is 3.68; thus 3.68 is an observation.
• A dataset is a collection of observations on one or more variables.
• The number of observations in a dataset is called the size of the dataset.
Types of variables
• Quantitative: the observation is a numerical value.
• Example: GPA. Member: Tom; observation: 2.95.
• Qualitative: the observation is one of a specified set of categories.
• Example: an opinion poll where the answer can be YES, NO, or DON'T KNOW. Member: Tom; observation: YES.
More on quantitative variables
• Discrete variables take values in a set of integers.
• Example: x = number of cars per family. Then x takes values 0, 1, 2, 3, …
• Example: x = number of earthquakes per year in California. Then x takes values 0, 1, 2, 3, 4, …
• Continuous variables take values in an interval.
• Example: x = GPA of a student at USA with age < 20. Then x can take any value in the interval [0, 4].
Important note
• A continuous variable can be discretized.
• Example: Time is a continuous variable.
• We discretize it as minutes, seconds, milliseconds, or microseconds, depending on the precision of our measuring instrument.
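As a small illustration of discretizing time, the sketch below rounds one duration to three different precisions. The duration value is made up for the example; only the idea of choosing a unit to match the instrument comes from the slide.

```python
# Hypothetical measured duration in seconds (a continuous quantity).
elapsed_seconds = 12.3456789

# Discretize to the precision the measuring instrument supports.
whole_seconds = round(elapsed_seconds)              # nearest second
milliseconds = round(elapsed_seconds * 1_000)       # nearest millisecond
microseconds = round(elapsed_seconds * 1_000_000)   # nearest microsecond

print(whole_seconds, milliseconds, microseconds)
```

Each result is an integer count of the chosen unit, so the recorded variable is discrete even though the underlying quantity is continuous.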
Cross-section data
• Cross-section data is collected on different members at the same point in time.
• Example: Collecting data on all 2003 Camry cars to study the miles per gallon of the model.
• Example: Collecting data on the percentage increase in the stock values of all listed stocks in the month of December.
Time series data
• Time series data is collected on the same member at different points in time.
• Example: Collecting data on the annual change in the stock value of the company ABCH during the last thirty years.
• Example: Collecting data on the monthly growth of a child since she was born.
Summation notation
Consider the study of GPA. The variable is GPA; call it x. Then:
• GPA of the first student: $x_1 = 3.68$
• GPA of the second student: $x_2 = 2.95$
• …
• GPA of the 100th student: $x_{100} = 3.44$
Then $x_1 + x_2 + \dots + x_{100} = \sum x$.
Σ is the upper-case Greek letter sigma. It denotes the sum of the values of the variable in the dataset.
Example of summation
Let the sample consist of four students, with $x_1 = 3.95$, $x_2 = 2.65$, $x_3 = 3.00$, and $x_4 = 2.00$. Then:
• Sum: $\sum x = 3.95 + 2.65 + 3.00 + 2.00 = 11.60$
• Square of the sum: $(\sum x)^2 = (11.60)^2 = 134.56$
• Sum of squares: $\sum x^2 = (3.95)^2 + (2.65)^2 + (3.00)^2 + (2.00)^2 = 35.625$
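The three quantities in this example can be checked with a few lines of Python. The variable names are illustrative choices; the four GPA values are the ones from the slide.

```python
# The four GPA observations from the example.
x = [3.95, 2.65, 3.00, 2.00]

sum_x = sum(x)                           # sum of the values
square_of_sum = sum_x ** 2               # add first, then square
sum_of_squares = sum(v ** 2 for v in x)  # square each value, then add

# Rounded for display, since floating-point sums carry tiny errors.
print(round(sum_x, 2), round(square_of_sum, 2), round(sum_of_squares, 3))
```

The point of computing both is that the square of the sum and the sum of squares are different quantities and should not be confused.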
Sum of products of two sets of numbers
Consider a sample of size 10. In the sample there are two students with GPA 3.95, two students with GPA 2.65, and three students each with GPA 2.00 and GPA 3.00. This data can be summarized as follows:
Value of x       3.95   3.00   2.65   2.00
Frequency (f)    2      3      2      3
Sum of products
Value of x       3.95   3.00   2.65   2.00
Frequency (f)    2      3      2      3
$\sum fx = f_1 x_1 + f_2 x_2 + f_3 x_3 + f_4 x_4 = 2(3.95) + 3(3.00) + 2(2.65) + 3(2.00) = 7.90 + 9.00 + 5.30 + 6.00 = 28.20$
Practice assignment: solve 1.25 and 1.26
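A short Python sketch of the sum of products, pairing each value in the frequency table with its frequency. The list names are illustrative; the numbers are those of the table (frequencies 2, 3, 2, 3).

```python
# Frequency table from the slide: each value of x with its frequency f.
values = [3.95, 3.00, 2.65, 2.00]
freqs = [2, 3, 2, 3]

# The frequencies should add up to the sample size.
n = sum(freqs)

# Sum of products: multiply each value by its frequency, then add.
sum_fx = sum(f * x for f, x in zip(freqs, values))

print(n, round(sum_fx, 2))
```

Because each value is counted as many times as it occurs, the result is the same total one would get by adding all 10 individual GPAs.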