Fun with Statistics Workshop
How the workshop works • 3 hours @ Temasek Polytechnic • First 1.5 hours: workshop on statistics • Next 1.5 hours: create an interesting infographic to tell a story using the data.
Why attend this workshop? • Learn key statistics concepts that will help you make better decisions • Pick up useful Microsoft Excel Skills • Win attractive prizes!
What is Statistics? • Statistics is the study of the • Collection • Organization • Analysis • Interpretation of data
Why Study Statistics? • Numbers are everywhere! • Statistical techniques are used to make decisions that affect our daily lives • How can stats affect you?
Types of Statistics • Descriptive Statistics • Inferential Statistics
Definition • Descriptive Statistics – Methods of organizing, summarizing, and presenting data in an informative way. Examples: measures of central tendency (mean, median, mode), frequency distribution tables, charts (bar chart, line chart), graphs (histogram, box-and-whisker plot), etc. • Inferential Statistics – Methods used to estimate a property of a population on the basis of a sample. Example: sampling.
Which type of statistics is involved when ...
• a research firm observes that women are twice as likely as men to shop impulsively? ANSWER: Inferential Statistics
• an accountant observes that the current year’s total sales of $60 million represent a 20% increase compared to last year’s total sales? ANSWER: Descriptive Statistics
Population and Sample • Population refers to the set of all possible observations of some specific characteristic. • Sample refers to a portion of the population.
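A minimal sketch of drawing a sample from a population, assuming a made-up population of 100 resident ID numbers (the IDs, the sample size of 10, and the seed are illustrative assumptions, not workshop data):

```python
import random

# Hypothetical population: ID numbers 1..100 (an assumption for illustration).
population = list(range(1, 101))

# A seeded generator so the same sample is drawn on every run.
rng = random.Random(42)

# Simple random sample without replacement: a portion of the population.
sample = rng.sample(population, 10)

print("Population size:", len(population))
print("Sample size:", len(sample))
print("Sample:", sorted(sample))
```

Every member of the sample comes from the population, and no member is picked twice.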
Variable Definition
• 1. Qualitative variable: when the characteristic being studied is non-numeric, it is called a qualitative variable. Examples: gender, state, country. A qualitative variable is discrete.
• 2. Quantitative variable: when the variable studied can be reported numerically, it is called a quantitative variable. Examples: age, amount, number of children. A quantitative variable can be either discrete or continuous.
• (a) Discrete variable: individually separate and distinct. It can only assume certain values, and there are usually “gaps” between values. Examples: children in a family, number of students, number of employees.
• (b) Continuous variable: can assume any value within a specified range. Examples: amount, height, temperature.
Levels of Measurement There are four levels of data: • Nominal • Ordinal • Interval • Ratio
Definition • 1. Nominal: variables that are classified into categories whose order is meaningless. Examples: race, gender, religious affiliation. • Nominal level variables must be: • (a) Mutually exclusive: an individual object can belong to only one category at a time. Can you be both F and M? • (b) Exhaustive: each individual object must belong to one of the categories, e.g. either F or M.
Definition • 2. Ordinal: Ordinal level variables are arranged in some order, and the categories have some relationship among them. • Examples: student’s grade, customer’s rating, military rank.
Definition • 3. Interval: Similar to the ordinal level, but there is a meaningful difference between values. • Examples: temperature (in °C or °F), dress size, time. • Note: this is not the same as a mathematical interval such as 0 ≤ x ≤ 1, which contains 0 and 1 as well as all numbers between them.
Definition • 4. Ratio: Similar to the interval level, but has an absolute zero (0). Practically all quantitative data is recorded at the ratio level of measurement. • Examples: number of employees, distance.
M.E.A.N M.E.D.I.A.N M.O.D.E
Mean • The average value of the data set. • The most important and most frequently used measure of central tendency. • Computed as the sum of all observed values divided by the total number of observations.
Example
The following shows the net profits of 12 branches of Evergreen Florist Shop on Mother’s Day.
Net Profits ($): 903 1745 1883 863 1204 1624 1698 957 1041 1138 1354 1802
• Compute the mean net profit, assuming that the data are from a population.
Solution (Population Data)
Population Mean = (sum of all observed values) / (Population Size)
= (903 + 1745 + 1883 + 863 + 1204 + 1624 + 1698 + 957 + 1041 + 1138 + 1354 + 1802) / 12
= 16212 / 12
= $1,351
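The same calculation can be sketched in Python. The variable names are illustrative; the figures are the 12 net profits from the example:

```python
# Net profits ($) of the 12 Evergreen Florist branches (from the example).
net_profits = [903, 1745, 1883, 863, 1204, 1624,
               1698, 957, 1041, 1138, 1354, 1802]

total = sum(net_profits)             # sum of all observed values
population_size = len(net_profits)   # number of observations
population_mean = total / population_size

print(total)            # 16212
print(population_size)  # 12
print(population_mean)  # 1351.0
```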
Median
• Middle value in the data set.
• Median = ((n + 1) / 2)th observation in the data array.
Example: Compute the median for the following odd number of observations.
Net Profits ($) of Evergreen Florist: 903 1745 1883 863 1204 1624 1698 957 1041 1138 1354
First arrange the data in an array (in ascending order): 863 903 957 1041 1138 1204 1354 1624 1698 1745 1883
Median = ((11 + 1) / 2)th observation = 6th observation = $1,204
Example
• Median = ((n + 1) / 2)th observation in the data array.
Compute the median for the following even number of observations.
Net Profits ($) of Evergreen Florist: 903 1745 1883 863 1204 1624 1698 957 1041 1138 1354 1895
First arrange the data in an array (in ascending order): 863 903 957 1041 1138 1204 1354 1624 1698 1745 1883 1895
Median = ((12 + 1) / 2)th observation = 6.5th observation, i.e. the average of the 6th and 7th observations = (1204 + 1354) / 2 = $1,279
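A minimal sketch of the median rule for both cases. The helper name `median` is our own, not from the workshop; Python's built-in `statistics.median` would give the same answers:

```python
def median(values):
    """Return the middle value of a non-empty list of numbers."""
    ordered = sorted(values)   # arrange the data in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th observation is the single middle value.
        return ordered[mid]
    # Even n: average the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [903, 1745, 1883, 863, 1204, 1624, 1698, 957, 1041, 1138, 1354]
even_data = odd_data + [1895]

print(median(odd_data))   # 1204
print(median(even_data))  # 1279.0
```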
Mode • The value that occurs most frequently. Determine the mode for the following data: $100,000 $5,000 $10,000 $20,000 $30,000 $50,000 $100,000 $100,000 Since $100,000 occurs most frequently (three times), Mode = $100,000
Example
• No Mode. Raw data: 8 6 7 9 2 5. Answer: No Mode
• One Mode. Raw data: 8 8 7 9 2 8. Answer: 8
• More Than One Mode. Raw data: 8 8 7 9 2 9. Answer: 8 and 9
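The three cases above can be sketched with a small helper. The function name `modes` is hypothetical, and it follows the slide's convention that a data set in which every value occurs only once has no mode:

```python
from collections import Counter


def modes(values):
    """Return a sorted list of modes; an empty list means 'no mode'.

    Assumes a non-empty list of values.
    """
    counts = Counter(values)          # how often each value occurs
    highest = max(counts.values())    # the largest frequency
    if highest == 1:
        return []                     # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([8, 6, 7, 9, 2, 5]))  # []
print(modes([8, 8, 7, 9, 2, 8]))  # [8]
print(modes([8, 8, 7, 9, 2, 9]))  # [8, 9]
```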
Comparison of Mean, Median & Mode
• Symmetrical Distribution or Normal Distribution: Mean = Median = Mode
• Distribution Skewed to Right or Positively Skewed: Mean > Median (order along the axis: mode, median, mean)
• Distribution Skewed to Left or Negatively Skewed: Mean < Median (order along the axis: mean, median, mode)
• For skewed distributions, the MEDIAN is the best measure as it lies between the mean and mode.
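A quick way to see the positively skewed case is to compute all three measures for a small data set with one large value on the right. The numbers below are made up for illustration and are not workshop data:

```python
import statistics

# Hypothetical right-skewed data: most values are small, one is very large.
data = [2, 3, 3, 4, 5, 6, 30]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

print("Mode:", mode)
print("Median:", median)
print("Mean:", round(mean, 2))

# In this data set the large value pulls the mean to the right of the median.
print("Mean > Median > Mode:", mean > median > mode)
```

The single large value (30) moves the mean much more than it moves the median, which is why the median is the preferred measure for skewed data.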
Range
• Range = Largest Value - Smallest Value
Example: The following shows the net profits of 12 branches of Evergreen Florist Shop on Mother’s Day. Find the range for the net profit.
Net Profits ($): 903 1745 1883 863 1204 1624 1698 957 1041 1138 1354 1802
Range = 1883 - 863 = $1,020
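The range calculation as a short Python sketch (variable names are illustrative):

```python
# Net profits ($) of the 12 Evergreen Florist branches (from the example).
net_profits = [903, 1745, 1883, 863, 1204, 1624,
               1698, 957, 1041, 1138, 1354, 1802]

largest = max(net_profits)
smallest = min(net_profits)
value_range = largest - smallest   # Range = Largest Value - Smallest Value

print(largest, smallest, value_range)  # 1883 863 1020
```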
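Variance
• The average of the squared distances of the observations from the mean.
• Population Variance: σ² = Σ(x - μ)² / N, where μ is the population mean and N is the population size.
• Sample Variance: s² = Σ(x - x̄)² / (n - 1), where x̄ is the sample mean and n is the sample size. Note that the sample formula divides by n - 1 rather than n.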
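A minimal sketch of both variance calculations, reusing the 12 florist net profits from the earlier example. Treating the same figures first as a population and then as a sample is our assumption, made only to show the two divisors side by side:

```python
import statistics

net_profits = [903, 1745, 1883, 863, 1204, 1624,
               1698, 957, 1041, 1138, 1354, 1802]

n = len(net_profits)
mean = sum(net_profits) / n

# Squared distance of each observation from the mean.
squared_deviations = [(x - mean) ** 2 for x in net_profits]

population_variance = sum(squared_deviations) / n        # divide by N
sample_variance = sum(squared_deviations) / (n - 1)      # divide by n - 1

print(round(population_variance, 2))
print(round(sample_variance, 2))

# Cross-check against Python's standard library.
print(round(statistics.pvariance(net_profits), 2))
print(round(statistics.variance(net_profits), 2))
```

The sample variance is always slightly larger than the population variance for the same figures, because it divides by a smaller number.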
What’s the difference? • What is the difference between the 3 curves (Curve A, Curve B, Curve C)? • They have the same mean but different amounts of spread (variability). • So how far is each data value from the mean?
Standard Deviation • Defined as the square root of the variance, i.e. the square root of the average of the squared distances (deviations) of the observations from the mean. • The most important and most commonly used measure of dispersion.
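To illustrate "same mean, different spread", here is a sketch with three made-up data sets (A, B, and C are hypothetical and are not the curves from the slide). Each is treated as a population:

```python
import math


def population_std_dev(values):
    """Square root of the average squared deviation from the mean."""
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    return math.sqrt(variance)


# Hypothetical data sets, all with mean 50 but increasing spread.
data_a = [48, 49, 50, 51, 52]
data_b = [40, 45, 50, 55, 60]
data_c = [10, 30, 50, 70, 90]

for name, data in [("A", data_a), ("B", data_b), ("C", data_c)]:
    mean = sum(data) / len(data)
    print(name, "mean =", mean, "std dev =", round(population_std_dev(data), 2))
```

All three means are 50, but the standard deviation grows from A to C as the values move further from the mean.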
Histogram • A graphical presentation of a frequency distribution. • Constructed by (i) marking class intervals on the x-axis, and (ii) drawing rectangles whose heights correspond to the class frequencies.
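A rough sketch of how class frequencies for a histogram can be counted. The daily sales figures and the class width of 10 are made-up assumptions, and the chart is drawn with text characters rather than rectangles:

```python
# Hypothetical daily sales figures (not workshop data).
daily_sales = [12, 25, 27, 31, 34, 36, 38, 41, 43, 45, 47, 52, 55, 58, 63]

class_width = 10
frequencies = {}

# Step (i): mark the class intervals 10 to 19, 20 to 29, ..., 60 to 69.
for lower in range(10, 70, class_width):
    upper = lower + class_width - 1
    # Count the observations that fall inside this class.
    count = sum(1 for value in daily_sales if lower <= value <= upper)
    frequencies[f"{lower} to {upper}"] = count

# Step (ii): draw one bar per class; its length is the class frequency.
for label, count in frequencies.items():
    print(f"{label:>8} | {'#' * count} ({count})")
```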
Frequency Polygon (Line Chart) • Formed by letting the midpoint of each class represent the data in that class and then connecting the sequence of midpoints at their respective frequencies. [Figure: frequency polygon showing daily sales turnover, with classes 10 to 19, 20 to 29, 30 to 39, 40 to 49, 50 to 59, 60 to 69]
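The points of a frequency polygon can be sketched as (midpoint, frequency) pairs. The class limits are the ones shown on the slide; the frequencies are hypothetical, since the slide's values are not available:

```python
# Class limits taken from the slide's axis labels.
class_limits = [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59), (60, 69)]

# Hypothetical class frequencies (an assumption for illustration).
frequencies = [1, 2, 4, 4, 3, 1]

# The midpoint of each class represents the data in that class.
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

# The polygon connects these points in order.
points = list(zip(midpoints, frequencies))
for midpoint, frequency in points:
    print(f"midpoint {midpoint}: frequency {frequency}")
```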
Pie Chart • Circular display divided into sections based on the number of observations. • Useful in showing proportional relationships, such as market share & budgets. • Number of Degrees For Each Category = (Value of the Category / Total) × 360° • The degrees for all categories add up to 360°. • Example: Total = 3000
Example: Pie Chart Showing the Ethnic Composition of Residents in ABC New Town • Chinese (2240) • Malay (400) • Indian (230) • Others (130)
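A short sketch of how the angle of each pie section could be worked out from the resident counts above. The variable names are illustrative:

```python
# Residents of ABC New Town by ethnic group (from the example).
residents = {"Chinese": 2240, "Malay": 400, "Indian": 230, "Others": 130}

total = sum(residents.values())  # 3000

# Degrees for each category = (value of the category / total) x 360.
degrees = {group: count * 360 / total for group, count in residents.items()}

for group, angle in degrees.items():
    print(f"{group}: {angle:.1f} degrees")

print(f"Total: {sum(degrees.values()):.1f} degrees")
```

This gives 268.8° for Chinese, 48.0° for Malay, 27.6° for Indian, and 15.6° for Others, which add up to 360°.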
Pictogram • A display that uses pictures or symbols to represent frequencies. Example: Pictogram Showing the Ethnic Composition of Residents in ABC New Town • Chinese: 2240 • Malay: 400 • Indian: 230 • Others: 130 • Key: one symbol = 100 residents
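A text-only sketch of the pictogram, using the key of one symbol per 100 residents. Rounding to whole symbols and the `*` character are our simplifying assumptions; a drawn pictogram would usually show a partial symbol instead:

```python
residents = {"Chinese": 2240, "Malay": 400, "Indian": 230, "Others": 130}
residents_per_symbol = 100  # key: one symbol = 100 residents

# Number of whole symbols per group (rounded to the nearest symbol).
symbols = {group: round(count / residents_per_symbol)
           for group, count in residents.items()}

for group, n_symbols in symbols.items():
    print(f"{group:<8} {'*' * n_symbols} ({residents[group]})")
```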
Scatter & Bubble Plot • Showing relation based on 2 dimensions
A Short Survey https://docs.google.com/forms/d/1kXw0_RvLI-A5S-O6oTHC4svOsCtTtgU_DNz63mrgBqw/viewform?usp=sharing&edit_requested=true