ENGR 610 Applied Statistics Fall 2007 - Week 1. Marshall University CITE Jack Smith http://mupfc.marshall.edu/~smith1106. Overview for Today. Syllabus Introductions Chapters 1-3 Introduction to Statistics and Quality Improvement Tables and Charts Describing and Summarizing Data
ENGR 610 Applied Statistics Fall 2007 - Week 1 Marshall University CITE Jack Smith http://mupfc.marshall.edu/~smith1106
Overview for Today • Syllabus • Introductions • Chapters 1-3 • Introduction to Statistics and Quality Improvement • Tables and Charts • Describing and Summarizing Data • Homework assignment
Syllabus, cont’d Text -- Levine, Ramsey, Smidt, “Applied Statistics for Engineers and Scientists: Using Microsoft Excel and MINITAB” (Prentice-Hall, 2001) - with CD-ROM
Grading • 25% - Homework and attendance • 25% - Exam 1 • 25% - Exam 2 • 25% - Exam 3
Introductions • Name • Home town • Undergraduate degree, major, where • Major focus of study at MU • Occupation, if working • Background in statistics • Hopes for this course
Introduction to Statistics (Ch 1) • What is Statistics? • Variables • Operational Definitions • Sampling • Software
What is Statistics? • Descriptive Statistics • Methods that lead to the collection, tabulation, summarization and presentation of data • Inferential Statistics • Methods that lead to conclusions, or estimates of parameters, about a population (of size N) based on summary measures (statistics) on a sample (of size n) - in lieu of a census
Why Statistics? • Describe numerical information • Draw conclusions on a large population from sample information only • Derive and test models • Understand and control variation • Improve quality of processes • Design experiments to extract maximum information • Predict or affect future behavior
Variables • Categorical • Nominal • Mutually exclusive • Collectively exhaustive • Numerical • Discrete or Continuous • Scale • Ordered • Interval - equally spaced • Ratio - with absolute zero
Operational Definitions • Objective, not subjective • Specific tests, measurements • Specific criteria • Agreed to by all • Consistent between individuals • Stable over time
Sampling • Advantages • Cost, time, accuracy, feasibility, scope • Minimize destructive tests • Probability samples • Simple random • With or without replacement • Systematic random • Random start, but constant increment or rate • Non-probability samples • Convenience, Judgment, Quota (representative)
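The probability-sampling schemes on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's procedure: the function names, the `seed` parameter, and the choice of increment k = N // n for the systematic sample are assumptions made for the example.

```python
import random

def simple_random_sample(population, n, replace=False, seed=None):
    """Draw n items at random, with or without replacement."""
    rng = random.Random(seed)
    if replace:
        # With replacement: the same item may be drawn more than once.
        return [rng.choice(population) for _ in range(n)]
    # Without replacement: every drawn item is distinct.
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Random start, then every k-th item (constant increment)."""
    rng = random.Random(seed)
    k = len(population) // n      # assumed increment: N // n
    start = rng.randrange(k)      # random start within the first k items
    return [population[start + i * k] for i in range(n)]

# Hypothetical population of 100 numbered items.
items = list(range(100))
print(simple_random_sample(items, 10, seed=1))
print(systematic_sample(items, 10, seed=1))
```

Passing a seed makes the draw repeatable, which helps when a sample has to be audited later.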
Software • Historical (mainframe, batch) • SAS, SPSS,… • Specialized (workstations, stand-alone) • SAS, SPSS, MINITAB, S-PLUS (R*), BMDP,… • Integrated (standard desktops) • DataDesk, JMP, SYSTAT, MINITAB • Excel, add-ons (e.g., PHStat- from Prentice-Hall) • MATLAB (Octave*) *Open Source
Introduction to Quality Improvement • Quality = fitness for use • Meeting user/customer needs, expectations, perceptions and experience • Quality of… • Design - intentional differences, grades • Conformance - meets/exceeds design • Performance - long-term consistency
History of Quality Improvement Middle Ages > Industrial Revolution > Information Age Smith, Taylor, Ford, Shewhart, Deming Read text!
Themes of Quality Improvement • The primary focus is on process improvement • Shewhart-Deming cycle: Plan, Do, Study, Act • Most of the variation in a process is systemic and not due to the individual • Teamwork is an integral part of a quality-management organization • Customer satisfaction - primary organizational goal • Organizational transformation needs to occur to implement quality management • Fear must be removed from organizations • Higher quality costs less, not more, but it requires an investment in training
Tables and Charts (Ch 2) • Process Flow Diagrams • Cause-and-Effect Diagrams • Time-Order Plots • Numerical Data • Concentration Diagrams • Categorical Data • Bivariate Categorical Data • Graphical Excellence
Cause-and-Effect Diagrams • Also known as an Ishikawa or a “fishbone” Diagram Procedures or methods People or personnel Effect Environment Materials or supplies Machinery or equipment
Tables and Charts for Numerical Data • Stem-and-Leaf Displays • Poor man’s histogram • Frequency Distribution • “Binning” by range • Histogram • Polygon
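A stem-and-leaf display and a binned frequency distribution can both be built by hand. The sketch below assumes non-negative integer data with the tens digit as stem and the units digit as leaf; the data values and function names are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group non-negative integers by tens digit (stem); leaves are units digits."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)

def frequency_distribution(values, low, width, n_bins):
    """Count values in n_bins equal-width bins [low + i*width, low + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        idx = int((v - low) // width)
        if 0 <= idx < n_bins:     # values outside the binned range are ignored
            counts[idx] += 1
    return counts

# Hypothetical measurements.
data = [12, 15, 21, 23, 23, 34, 35, 38, 41]
for stem, leaves in stem_and_leaf(data).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
print(frequency_distribution(data, low=10, width=10, n_bins=4))
```

The bin counts are the bar heights of the histogram; joining the bin midpoints at those heights gives the polygon.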
Concentration Diagrams • Data points overlaid on schematic or picture of object or process of interest • By location • Displayed as individual symbols or tallies
Tables and Charts for Categorical Data • Bar Chart • Pie Chart • Almost always in percentages • Pareto Diagram • Sorted (usually descending) • Overlaid with cumulative line (polygon) plot • Separate scales • Usually in percentages
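The numbers behind a Pareto diagram are a sorted frequency table plus a running cumulative percentage. The sketch below computes only the table, not the chart; the defect categories and counts are made up for illustration.

```python
def pareto_table(counts):
    """Return rows (category, count, percent, cumulative percent), largest first."""
    total = sum(counts.values())
    rows = []
    cumulative = 0.0
    for category, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        percent = 100.0 * count / total
        cumulative += percent
        rows.append((category, count, percent, cumulative))
    return rows

# Hypothetical defect tallies.
defects = {"scratch": 12, "dent": 25, "misalignment": 5, "other": 8}
for category, count, percent, cumulative in pareto_table(defects):
    print(f"{category:<14}{count:>4}{percent:>8.1f}%{cumulative:>8.1f}%")
```

The bars use the percent column and the overlaid line uses the cumulative column, which is why the diagram needs separate scales when the bars are drawn as raw counts.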
Tables and Charts for Bivariate Categorical Data • Contingency Table • Cross-classification • Joint responses • Percentages by row, column, total • Side-by-Side (Cluster) Bar Chart • May prefer stacked bars with percentage data
Example contingency table (rows 1-3, columns A-C, with totals):
        A   B   C   Total
1       5   3   2   10
2       2   3   4    9
3       0   2   3    5
Total   7   8   9   24
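Using the counts from this slide's example (rows 1-3, columns A-C), the three kinds of percentages can be computed as below. The function names and the list-of-lists layout are choices made for this sketch.

```python
def margins(counts):
    """Row totals, column totals and grand total of a table of counts."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    return row_totals, col_totals, sum(row_totals)

def percentages(counts, by="total"):
    """Cell percentages relative to the row total, column total, or grand total."""
    row_totals, col_totals, grand = margins(counts)
    result = []
    for i, row in enumerate(counts):
        out_row = []
        for j, cell in enumerate(row):
            if by == "row":
                base = row_totals[i]
            elif by == "column":
                base = col_totals[j]
            else:
                base = grand
            out_row.append(100.0 * cell / base)
        result.append(out_row)
    return result

# Counts from the slide: rows 1, 2, 3 by columns A, B, C.
table = [[5, 3, 2],
         [2, 3, 4],
         [0, 2, 3]]
print(margins(table))
print(percentages(table, by="row")[0])
```

Row percentages answer "within row 1, how are responses split across A, B, C?", while column percentages answer the same question within each column, so the choice depends on which variable is treated as the grouping.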
Graphical Excellence • Tufte, “The Visual Display of Quantitative Information” • Graphical excellence… gives the viewer the largest number of ideas, in the shortest time, with the least ink - clearly, precisely, efficiently, and truthfully • Data-ink Ratio • (data-ink)/(total ink used in graphic) • Chartjunk • Non-data or redundant “ink” • Lie Factor • (size of effect in graph)/(size of effect in data)
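The two ratios on this slide are simple arithmetic. In the sketch below, "size of effect" is taken to be the relative change between two values, which is an assumption about how the effect is measured; the numbers are hypothetical.

```python
def relative_change(first, second):
    """Size of effect, assumed here to be |second - first| / |first|."""
    return abs(second - first) / abs(first)

def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """(size of effect shown in graphic) / (size of effect in data)."""
    return (relative_change(graphic_first, graphic_second)
            / relative_change(data_first, data_second))

def data_ink_ratio(data_ink, total_ink):
    """(data-ink) / (total ink used in graphic)."""
    return data_ink / total_ink

# Hypothetical chart: the data rise from 100 to 150 (a 50% change),
# but the bars are drawn 2.0 cm and 6.0 cm tall (a 200% change).
print(lie_factor(2.0, 6.0, 100, 150))
```

A lie factor near 1 means the graphic represents the change in the data faithfully; values well above 1 exaggerate it.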
Describing and Summarizing Data - Descriptive Statistics (Ch 3) • Measures of… • Central Tendency • Variation • Shape • Skewness • Kurtosis • Box-and-Whisker Plots
Measures of Central Tendency • Mean (arithmetic) • Average value: X̄ = (ΣXᵢ)/n • Median • Middle value - 50th percentile (2nd quartile) • Mode • Most popular (peak) value(s) - can be multi-modal • Midrange • (Max+Min)/2 • Midhinge • (Q3+Q1)/2 - average of 1st and 3rd quartiles
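All five measures can be computed with Python's standard `statistics` module. One assumption to note: the quartiles here come from `statistics.quantiles(..., method="exclusive")`, which places Qi at position i(N+1)/4 and interpolates linearly between neighbors; the text's rounding rules for quartiles may give slightly different values. The data are hypothetical.

```python
import statistics

def central_tendency(data):
    """Mean, median, mode(s), midrange and midhinge of a list of numbers."""
    # Quartiles at positions i(N+1)/4, linearly interpolated (an assumption).
    q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "modes": statistics.multimode(data),   # list, since data can be multi-modal
        "midrange": (max(data) + min(data)) / 2,
        "midhinge": (q1 + q3) / 2,
    }

# Hypothetical sample of 7 observations.
sample = [2, 4, 4, 5, 7, 9, 11]
print(central_tendency(sample))
```

The midrange depends only on the two extreme values, so a single outlier moves it a lot, while the midhinge ignores everything outside the middle 50%.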
Measures of Variation • Range (max-min) • Inter-Quartile Range (Q3-Q1) • Variance • Sum of squares (SS) of the deviation from mean divided by the degrees of freedom (df) - see pp 113-5 • df = N, for the whole population • df = n-1, for a sample • 2nd moment about the mean (dispersion) (1st moment about the mean is zero!) • Standard Deviation • Square root of variance (same units as variable) • Sample (s², s, n) vs Population (σ², σ, N)
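The difference between the population and sample variance is only the divisor. A minimal sketch, with hypothetical data and function names chosen for illustration:

```python
import math

def sum_of_squares(data):
    """SS: sum of squared deviations from the mean."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)

def variance(data, sample=True):
    """SS / df, with df = n - 1 for a sample and df = N for a population."""
    df = len(data) - 1 if sample else len(data)
    return sum_of_squares(data) / df

def standard_deviation(data, sample=True):
    """Square root of the variance (same units as the variable)."""
    return math.sqrt(variance(data, sample))

# Hypothetical observations: mean = 5, SS = 32.
values = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(values, sample=False), standard_deviation(values, sample=False))
print(variance(values, sample=True), standard_deviation(values, sample=True))

# The 1st moment about the mean is zero: deviations always sum to 0.
mean = sum(values) / len(values)
print(sum(x - mean for x in values))
```

For these eight values the population variance is 32/8 = 4 and the sample variance is 32/7, so the sample figure is always the larger of the two.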
Quantiles • Equipartitions of ranked array of observations • Percentiles - 100 • Deciles - 10 • Quartiles - 4 (25%, 50%, 75%) • Median - 2 • Position in the ranked array of N observations: • Pn = n(N+1)/100 -th ordered observation • Dn = n(N+1)/10 • Qn = n(N+1)/4 • Median = (N+1)/2 = Q2 = D5 = P50
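The position formulas on this slide translate directly into code. In the sketch below, `k` is the quantile index and `parts` is 100, 10, 4 or 2; when the position is not a whole number the value is linearly interpolated between the two neighboring observations, which is an assumption (the text may round the position instead). The data are hypothetical.

```python
def quantile_position(k, parts, n_obs):
    """1-based position k(N+1)/parts in the ranked array of N observations."""
    return k * (n_obs + 1) / parts

def quantile_value(data, k, parts):
    """Value at that position, interpolating linearly between neighbors."""
    ordered = sorted(data)
    n_obs = len(ordered)
    pos = quantile_position(k, parts, n_obs)
    pos = min(max(pos, 1), n_obs)      # keep the position inside the array
    lower = int(pos)                   # floor, since pos >= 1
    if lower == n_obs:
        return ordered[-1]
    frac = pos - lower
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])

# Hypothetical ranked array of N = 11 observations.
obs = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
print(quantile_position(50, 100, 11))   # P50
print(quantile_position(5, 10, 11))     # D5
print(quantile_position(2, 4, 11))      # Q2
print(quantile_value(obs, 1, 4), quantile_value(obs, 3, 4))
```

With N = 11 the positions for P50, D5 and Q2 all come out to 6, the median position (N+1)/2.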
Measures of Shape • Symmetry • Skewness - extended tail in one direction • 3rd moment about the mean • Kurtosis • Flatness, peakedness • Leptokurtic - highly peaked, long tails • Mesokurtic - “normal”, triangular, short tails • Platykurtic - broad, even • 4th moment about the mean See p 118.
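Skewness and kurtosis can be computed from the 3rd and 4th moments about the mean. The sketch below uses the plain moment ratios m3/m2^1.5 and m4/m2^2 (on this scale a normal distribution has kurtosis 3); this is an assumed convention, and the formulas on p 118 of the text, like the Excel functions, may apply sample-size corrections that give different numbers.

```python
def central_moment(data, k):
    """k-th moment about the mean: average of (x - mean)**k."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

def skewness(data):
    """Standardized 3rd moment: 0 for symmetric data, > 0 for a long right tail."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    """Standardized 4th moment (not excess kurtosis)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

# Hypothetical data sets.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]
print(skewness(symmetric), kurtosis(symmetric))
print(skewness(right_tailed), kurtosis(right_tailed))
```

The sign of the skewness gives the direction of the extended tail, which is the same information a box-and-whisker plot shows graphically.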
Box-and-Whisker Plots • Graphical representation of five-number summary • Min, Max (full range) • Q1, Q3 (middle 50%) • Median (50th %-ile) See pp 123-5 • Shows symmetry (skewness) of distribution
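The five-number summary behind a box-and-whisker plot can be sketched with the standard library. As an assumption, quartiles are taken from `statistics.quantiles(..., method="exclusive")`, which uses positions i(N+1)/4 with linear interpolation; the procedure on pp 123-5 of the text may round positions differently. The data and the helper `tail_comparison` are hypothetical.

```python
import statistics

def five_number_summary(data):
    """(min, Q1, median, Q3, max) of a list of numbers."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return (min(data), q1, median, q3, max(data))

def tail_comparison(data):
    """Compare the two whiskers: a positive result means a longer right tail."""
    lowest, q1, _, q3, highest = five_number_summary(data)
    return (highest - q3) - (q1 - lowest)

# Hypothetical observations.
obs = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
print(five_number_summary(obs))
print(tail_comparison(obs))
```

The box spans Q1 to Q3 (the middle 50%), the line inside it marks the median, and the whiskers run out to the minimum and maximum.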
Homework • Ch 1 • Appendix 1.2 • Excel, Analysis ToolPak, PHStat add-in • Problems: 1.25 • Ch 2 • Appendix 2.1 • Problems: 2.54, 2.55, 2.61 • Ch 3 • Appendix 3.1 • Problems: 3.27, 3.31 (data on CD)
Next Week • Probability and Discrete Probability Distributions (Ch 4)