
Statistical model for count data

Statistical model for count data. Speaker: Tzu-Chun Lo. Advisor: Yao-Ting Huang. Outline: why use a statistical model, target, gene expression, binomial distribution, Poisson distribution, overdispersion, negative binomial, chi-square approximation, conclusion.


Presentation Transcript


  1. Statistical model for count data • Speaker: Tzu-Chun Lo • Advisor: Yao-Ting Huang

  2. Outline • Why use a statistical model • Target • Gene expression • Binomial distribution • Poisson distribution • Overdispersion • Negative binomial • Chi-square approximation • Conclusion

  3. Statistical model • A statistical model is a probability distribution constructed to enable inferences to be drawn or decisions to be made from data. • Information: a sample (height, weight, etc.) taken from the population • Inference: we have to choose a statistical model for the sample (mean, variance) • Decision: hypothesis testing (mean, variance) • [Figure: sample and population, with labels "size", "designer", "consumer"]

  4. Target • Gene expression • We would like to use a statistical model to test whether an observed difference in read counts is significant. • One region looks significant, but what about another one? Can we be sure, or is it just noise?

  5. Count data • A type of data in which the observations can take only the non-negative integer values {0, 1, 2, 3, ...}, and where these integers arise from counting rather than ranking. • An individual piece of count data is often termed a count variable. • The binomial, Poisson, and negative binomial distributions all model this type of data.

  6. Binomial distribution • The number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. • Notation: $X \sim B(n, p)$, with $f(k) = \binom{n}{k} p^k (1-p)^{n-k}$

  7. Binomial distribution • Example: p = 0.8, (1 - p) = 0.2, 3 trials, 2 successes • Possible outcomes: (1 1 0), (1 0 1), (0 1 1) • f(2) = 0.384 • A player scored 33 goals from 110 shots this season, so success: 0.3, failure: 0.7 • What is the probability that he scores 6 goals in 10 shots?

  8. Binomial distribution • Exactly six goals: $P(X = 6)$ • At most three goals: $P(X \le 3)$ • [Figure: probability of 0 to 10 goals, with 6 goals marked]
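
The binomial examples on slides 7 and 8 can be checked numerically. The following is a minimal sketch, not code from the presentation; the function names `binomial_pmf` and `binomial_cdf` are chosen here for illustration, and "most three goals" is read as "at most three goals".

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


# Slide 7: 2 successes in 3 trials with p = 0.8 (the slide reports 0.384).
f2 = binomial_pmf(2, 3, 0.8)

# Shooter example: 33 goals from 110 shots gives p = 0.3; 10 new shots.
p_goal = 33 / 110
p_exactly_six = binomial_pmf(6, 10, p_goal)
p_at_most_three = binomial_cdf(3, 10, p_goal)

print(f"f(2)      = {f2:.3f}")
print(f"P(X = 6)  = {p_exactly_six:.4f}")
print(f"P(X <= 3) = {p_at_most_three:.4f}")
```

Summing the probability mass term by term is enough for n = 10; a library routine would be preferable for large n.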

  9. Poisson distribution • Expresses the probability of a given number of events occurring in a fixed interval. • Notation: $X \sim \mathrm{Pois}(\lambda)$, with $f(k) = \frac{\lambda^k e^{-\lambda}}{k!}$

  10. Poisson distribution • e = 2.718281828… • Suppose the interval is one game, so the rate is measured in goals per game

  11. Poisson • Total: 11 games • Score: 33 goals • 33/11 = 3 goals per game • [Figure: Poisson model compared with the raw data, goals against games] • In this case a test based on the Poisson model could be inaccurate
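
As a rough illustration of the Poisson model with the rate from this slide (33 goals in 11 games, so 3 goals per game), the sketch below tabulates the probability of each goal count. The function name `poisson_pmf` is an assumption made for this example; the raw per-game data from the slide are not available, so only the model side is shown.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)


# Rate estimated from the slide: 33 goals over 11 games.
rate = 33 / 11

# Probability of scoring 0..6 goals in a single game under the model.
for goals in range(7):
    print(f"P({goals} goals) = {poisson_pmf(goals, rate):.4f}")
```

Under this model the mean and the variance of goals per game are both equal to the rate, which is the property that overdispersed data violate.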

  12. Overdispersion • The presence of greater variability (statistical dispersion) in a data set than would be expected based on a given simple statistical model.
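
One simple way to see overdispersion relative to a Poisson model is the ratio of sample variance to sample mean, which should be close to 1 for Poisson data. The sketch below uses a hypothetical list of goals per game (invented for illustration so that it totals 33 goals over 11 games; it is not the raw data from the talk).

```python
from statistics import mean, variance


def dispersion_index(counts: list[int]) -> float:
    """Sample variance divided by sample mean (about 1 for Poisson data)."""
    return variance(counts) / mean(counts)


# Hypothetical goals per game: 11 games, 33 goals in total.
goals_per_game = [0, 7, 1, 6, 0, 5, 2, 8, 0, 3, 1]

index = dispersion_index(goals_per_game)
print(f"mean = {mean(goals_per_game)}, variance = {variance(goals_per_game)}")
print(f"dispersion index = {index:.2f}")
print("overdispersed relative to Poisson" if index > 1 else "no sign of overdispersion")
```

A ratio well above 1 only suggests overdispersion; a formal test would be needed before drawing conclusions from a sample this small.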

  13. Negative binomial • Gamma-Poisson (mixture) distribution
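
The Gamma-Poisson mixture can be simulated directly: draw a rate from a gamma distribution, then draw a count from a Poisson distribution with that rate. The sketch below is an illustration under assumed parameters (`shape = 1.5`, `scale = 2.0`, chosen so the mean rate is 3); the helper names and the use of Knuth's multiplication method for Poisson sampling are choices made here, not details from the talk.

```python
import math
import random
from statistics import mean, variance


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_gamma_poisson(shape: float, scale: float, rng: random.Random) -> int:
    """Draw a rate from Gamma(shape, scale), then a Poisson count at that rate."""
    lam = rng.gammavariate(shape, scale)
    return sample_poisson(lam, rng)


rng = random.Random(0)    # fixed seed so the run is repeatable
shape, scale = 1.5, 2.0   # hypothetical parameters; mean rate = shape * scale = 3
draws = [sample_gamma_poisson(shape, scale, rng) for _ in range(20000)]

print(f"sample mean     = {mean(draws):.2f}")
print(f"sample variance = {variance(draws):.2f}")
```

For these parameters the mixture has mean 3 and variance 3 + 3²/1.5 = 9, so the simulated counts should show a variance clearly larger than the mean, unlike a plain Poisson sample.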

  14. Negative binomial

  15. Parameter estimation

  16. Approximate control limits • Chi-square approximation

  17. Example = 67.0

  18. Conclusion • Thanks for your attention

  19. Statistical model • Suitable type • Which distribution should we use? • Parameters • Get some information from the data • Inference • What do we want to know? • How can we make a decision? • Hypothesis testing

  20. Statistical model • Suitable type • Binomial distribution • Parameters • n = 10, p = 0.7 • Inference • 2 successes
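
To connect this example to hypothesis testing, one possible reading is a one-sided exact test: how likely is an outcome of 2 or fewer successes if n = 10 and p = 0.7 really hold? The slide does not state the direction of the test or a significance level, so the lower-tail choice, the threshold `alpha = 0.05`, and the function names below are all assumptions.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def lower_tail_p_value(observed: int, n: int, p: float) -> float:
    """P(X <= observed) under the null model X ~ B(n, p)."""
    return sum(binomial_pmf(i, n, p) for i in range(observed + 1))


n, p, observed = 10, 0.7, 2   # values from the slide
alpha = 0.05                  # assumed significance level, not from the slide

p_value = lower_tail_p_value(observed, n, p)
print(f"P(X <= {observed}) = {p_value:.6f}")
print("reject the null model" if p_value < alpha else "do not reject the null model")
```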

  21. Multinomial distribution • The analog of the Bernoulli distribution is the categorical distribution, where each trial results in exactly one of some fixed finite number k of possible outcomes. • http://en.wikipedia.org/wiki/Multinomial_distribution
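
A short sketch of the multinomial probability mass function may help here: with k = 2 categories it reduces to the binomial case, and with k = 3 it gives the trinomial distribution. The function name `multinomial_pmf` and the example probabilities are assumptions made for illustration.

```python
from math import factorial, prod


def multinomial_pmf(counts: list[int], probs: list[float]) -> float:
    """Probability of observing `counts` across categories with `probs`."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    # Multinomial coefficient n! / (c1! * c2! * ... * ck!), kept as an integer.
    coefficient = factorial(sum(counts))
    for c in counts:
        coefficient //= factorial(c)
    return coefficient * prod(p**c for p, c in zip(probs, counts))


# k = 2 reduces to the binomial example: 2 successes, 1 failure, p = 0.8.
print(f"{multinomial_pmf([2, 1], [0.8, 0.2]):.3f}")

# Trinomial example with hypothetical category probabilities.
print(f"{multinomial_pmf([1, 1, 1], [0.5, 0.3, 0.2]):.3f}")
```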

  22. Trinomial distribution

  23. Count data • A type of data in which the observations can take only the non-negative integer values {0, 1, 2, 3, ...}, and where these integers arise from counting rather than ranking. • We tend to use fixed fractions of genes. • The probability that a read appears in this region (binomial distribution) • The number of read counts in this interval (Poisson distribution)

  24. Poisson example

  25. Negative binomial
