Understanding Covariance and Correlation in Data Analysis

These EGR 252 slides introduce covariance and correlation as measures of the association between two variables, then turn to known discrete probability distributions used to model and predict results. Most of the deck covers Bernoulli trials and the binomial distribution, with worked examples of its probabilities, mean, and variance.

rebeccal

Presentation Transcript


  1. Covariance / Correlation
     • A measure of the nature of the association between two variables
     • Describes a potential linear relationship
       • Positive relationship: large values of X tend to occur with large values of Y
       • Negative relationship: large values of X tend to occur with small values of Y
     • “Manual” calculations are based on the joint probability distributions
     • See examples in Chapter 4 (pp. 123-126)
     EGR 252 2015
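
The "manual" calculation from a joint probability distribution can be sketched in a few lines of Python. The joint distribution below is hypothetical (made-up values that sum to 1), and the names `joint` and `expectation` are illustrative only; this is not one of the Chapter 4 examples.

```python
import math

# Hypothetical joint probability distribution f(x, y).
# The values are invented for illustration and sum to 1.
joint = {
    (0, 0): 0.3, (0, 1): 0.1,
    (1, 0): 0.2, (1, 1): 0.4,
}

def expectation(g):
    """E[g(X, Y)]: sum of g(x, y) * f(x, y) over every (x, y) pair."""
    return sum(g(x, y) * prob for (x, y), prob in joint.items())

mu_x = expectation(lambda x, y: x)
mu_y = expectation(lambda x, y: y)

# Covariance: E[XY] - E[X] E[Y]
cov_xy = expectation(lambda x, y: x * y) - mu_x * mu_y

# Correlation: covariance scaled by both standard deviations
var_x = expectation(lambda x, y: x ** 2) - mu_x ** 2
var_y = expectation(lambda x, y: y ** 2) - mu_y ** 2
rho = cov_xy / math.sqrt(var_x * var_y)

print(f"cov = {cov_xy:.4f}, rho = {rho:.4f}")
```

In this made-up table the covariance is positive, so large X tends to occur with large Y. The correlation rescales the covariance to lie between -1 and 1.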

  2. Known Probability Distributions
     • Engineers frequently work with data that can be modeled as one of several known probability distributions.
     • Being able to model the data allows us to:
       • model real systems
       • design
       • predict results
     • Key discrete probability distributions include:
       • binomial
       • negative binomial
       • hypergeometric
       • Poisson

  3. Binomial & Multinomial Distributions
     • Bernoulli Trials
       • Inspect tires coming off the production line. Classify each as defective or not defective. Define “success” as defective. If historical data show that 95% of all tires are defect-free, then P(“success”) = 0.05.
       • Signals picked up at a communications site are either incoming speech signals or “noise.” Define “success” as the presence of speech. P(“success”) = P(“speech”).
     • Bernoulli Process
       • n repeated trials
       • the outcome of each trial may be classified as “success” or “failure”
       • the probability of success (p) is constant from trial to trial
       • repeated trials are independent
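
A Bernoulli process with these properties can be simulated as a rough illustration. The sketch below reuses p = 0.05 from the tire example; the function name `bernoulli_process`, the seed, and the number of trials are arbitrary choices, not part of the slides.

```python
import random

def bernoulli_process(n, p, rng):
    """Return n independent trials; True means "success", each with probability p."""
    return [rng.random() < p for _ in range(n)]

rng = random.Random(252)   # fixed seed so the run is repeatable
p_defective = 0.05         # tire example: "success" = defective tire
trials = bernoulli_process(100_000, p_defective, rng)

successes = sum(trials)    # True counts as 1
observed_rate = successes / len(trials)
print(f"{successes} successes in {len(trials)} trials (rate {observed_rate:.4f})")
```

Each call to `rng.random()` is independent of the others and p never changes, which matches the constant-probability and independence conditions above. Over many trials the observed rate should settle near p.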

  4. Binomial Distribution
     • Example: Historical data indicate that 10% of all bits transmitted through a digital transmission channel are received in error. Let X = the number of bits in error in the next 4 bits transmitted. Assume that the transmission trials are independent. What is the probability that
       • exactly 2 of the bits are in error?
       • at most 2 of the 4 bits are in error?
       • more than 2 of the 4 bits are in error?
     • The number of successes, X, in n Bernoulli trials is called a binomial random variable.

  5. Binomial Distribution
     • The probability distribution is called the binomial distribution.
     • $b(x; n, p) = \binom{n}{x} p^x q^{n-x}$, x = 0, 1, 2, …, n
       where p = probability of success
             q = probability of failure = 1 - p
     • For our example, n = 4 and p = 0.1, so
       $b(x; 4, 0.1) = \binom{4}{x} (0.1)^x (0.9)^{4-x}$, x = 0, 1, 2, 3, 4
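
As a minimal sketch, the binomial formula can be evaluated with Python's standard library. The function name `binomial_pmf` is an illustrative choice; the values n = 4 and p = 0.1 come from the bit-error example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """b(x; n, p): probability of exactly x successes in n Bernoulli trials."""
    if not 0 <= x <= n:
        return 0.0
    q = 1 - p  # probability of failure
    return comb(n, x) * p ** x * q ** (n - x)

# Full distribution for the bit-error example: n = 4 bits, p = 0.1
n, p = 4, 0.1
distribution = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(distribution):
    print(f"b({x}; {n}, {p}) = {prob:.4f}")
```

The probabilities over x = 0, 1, …, n sum to 1, which is a quick sanity check on any hand calculation.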

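  6. For Our Example …
     • What is the probability that exactly 2 of the bits are in error?
         P(X = 2) = b(2; 4, 0.1) = 6 (0.1)² (0.9)² = 0.0486
     • At most 2 of the 4 bits are in error?
         P(X ≤ 2) = b(0; 4, 0.1) + b(1; 4, 0.1) + b(2; 4, 0.1) = 0.6561 + 0.2916 + 0.0486 = 0.9963
     • More than 2 of the 4 bits are in error?
         P(X > 2) = 1 - P(X ≤ 2) = 1 - 0.9963 = 0.0037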
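
The three questions can be checked numerically. This is a sketch using the standard binomial formula with n = 4 and p = 0.1 from the example; the helper name `binomial_pmf` and the variable names are illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 4, 0.1  # 4 bits, each in error with probability 0.1

# Exactly 2 bits in error
p_exactly_2 = binomial_pmf(2, n, p)

# At most 2 bits in error: add the probabilities for x = 0, 1, 2
p_at_most_2 = sum(binomial_pmf(x, n, p) for x in range(0, 3))

# More than 2 bits in error: complement of "at most 2"
p_more_than_2 = 1 - p_at_most_2

print(f"P(X = 2) = {p_exactly_2:.4f}")    # 0.0486
print(f"P(X <= 2) = {p_at_most_2:.4f}")   # 0.9963
print(f"P(X > 2) = {p_more_than_2:.4f}")  # 0.0037
```

Using the complement for "more than 2" avoids summing the upper tail separately. Both routes give the same answer.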

  7. Expectations of the Binomial Distribution
     • The mean and variance of the binomial distribution are given by
         μ = np
         σ² = npq
     • Suppose, in our example, we check the next 20 bits. What is the expected number of bits in error? What is the standard deviation?
         μ = 20 (0.1) = 2
         σ² = 20 (0.1) (0.9) = 1.8
         σ = √1.8 ≈ 1.34
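
The shortcut formulas can be cross-checked against the definitions of mean and variance by summing over the whole distribution. The sketch below does this for n = 20 and p = 0.1; the variable names are illustrative.

```python
from math import comb, sqrt

n, p = 20, 0.1
q = 1 - p

# Shortcut formulas for a binomial random variable
mean = n * p
variance = n * p * q
std_dev = sqrt(variance)

# Cross-check from the definitions: E[X] and E[(X - mean)^2]
pmf = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]
mean_from_pmf = sum(x * f for x, f in enumerate(pmf))
variance_from_pmf = sum((x - mean_from_pmf) ** 2 * f for x, f in enumerate(pmf))

print(f"mean = {mean:.2f}, variance = {variance:.2f}, std dev = {std_dev:.2f}")
print(f"from pmf: mean = {mean_from_pmf:.2f}, variance = {variance_from_pmf:.2f}")
```

Both routes should agree to within floating-point rounding. That is why np and npq can replace the full summation in practice.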

  8. Another example
     • A worn machine tool produces 1% defective parts. If we assume that parts produced are independent, what is the mean number of defective parts that would be expected if we inspect 25 parts?
         μ = 25 (0.01) = 0.25
     • What is the variance of the number of defective parts among the 25?
         σ² = 25 (0.01) (0.99) = 0.2475
     • Note that the variance (0.2475) does not equal the mean (0.25).

  9. Helpful Hints …
     • Suppose we inspect the next 5 parts … b(x; 5, 0.01)
     • Sometimes it helps to draw a picture.
         P(at least 3)   [number line: 0 1 2 3 4 5]
         P(2 ≤ X ≤ 4)    [number line: 0 1 2 3 4 5]
         P(less than 4)  [number line: 0 1 2 3 4 5]
     • Appendix Table A.1 (pp. 726-731) lists Binomial Probability Sums, $\sum_{x=0}^{r} b(x; n, p)$
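
Each of the three probabilities can be written in terms of cumulative sums of the kind tabulated in Table A.1. The sketch below computes those sums directly for n = 5 and p = 0.01 instead of reading them from the table; the function names `binomial_pmf` and `binomial_sum` are illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_sum(r, n, p):
    """Cumulative sum of b(x; n, p) for x = 0..r, i.e. P(X <= r)."""
    return sum(binomial_pmf(x, n, p) for x in range(0, r + 1))

n, p = 5, 0.01  # inspect 5 parts, each defective with probability 0.01

# P(at least 3) covers x = 3, 4, 5: everything except x <= 2
p_at_least_3 = 1 - binomial_sum(2, n, p)

# P(2 <= X <= 4) covers x = 2, 3, 4: sum up to 4 minus sum up to 1
p_2_to_4 = binomial_sum(4, n, p) - binomial_sum(1, n, p)

# P(less than 4) covers x = 0, 1, 2, 3: sum up to 3
p_less_than_4 = binomial_sum(3, n, p)

print(f"P(X >= 3) = {p_at_least_3:.3e}")
print(f"P(2 <= X <= 4) = {p_2_to_4:.3e}")
print(f"P(X < 4) = {p_less_than_4:.6f}")
```

Marking the included values on a 0 to 5 number line first makes it easier to pick the right upper limit r for each cumulative sum.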
