
Genome-wide association studies

Genome-wide association studies. BNFO 602, Roshan. Application of SNPs: association with disease. Experimental design to detect cancer-associated SNPs: pick random humans with and without cancer (say, breast cancer), perform SNP genotyping, and look for associated SNPs.

kael

Presentation Transcript


  1. Genome-wide association studies BNFO 602 Roshan

  2. Application of SNPs: association with disease • Experimental design to detect cancer-associated SNPs: • Pick random humans with and without cancer (say, breast cancer) • Perform SNP genotyping • Look for associated SNPs • Also called a genome-wide association study

  3. Case-control example • Study of 100 people • Case: 50 subjects with cancer • Control: 50 subjects without cancer • Count the number of alleles and form a contingency table

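  4. Odds ratio • In the contingency table, a and b are the counts of allele 1 and allele 2 in cases, and c and d are the counts of allele 1 and allele 2 in controls • Odds of allele 1 in cancer = a/b = e • Odds of allele 1 in healthy = c/d = f • Odds ratio of allele 1 in cancer vs healthy = e/f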

  5. Example • Odds of allele 1 in case = 15/35 • Odds of allele 1 in control = 2/48 • Odds ratio of allele 1 in case vs control = (15/35)/(2/48) = 10.3
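
The arithmetic on this slide can be checked with a short Python sketch. The function name `odds_ratio` and the argument order (a, b = allele 1 and allele 2 counts in cases; c, d = the same counts in controls) are assumptions made for illustration, chosen to match the a/b and c/d notation of the previous slide.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of allele 1 in cases vs controls.

    a, b: counts of allele 1 and allele 2 in cases (assumed labeling)
    c, d: counts of allele 1 and allele 2 in controls (assumed labeling)
    """
    odds_case = a / b      # odds of allele 1 among cases
    odds_control = c / d   # odds of allele 1 among controls
    return odds_case / odds_control


# Counts from the example slide: 15/35 in cases, 2/48 in controls.
print(round(odds_ratio(15, 35, 2, 48), 1))  # 10.3
```

An odds ratio well above 1 suggests allele 1 is more common among cases, but it does not by itself say whether the difference could be due to chance; that is what the tests on the following slides address.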

  6. Statistical test of association (P-values) • P-value = probability of data at least as extreme as the observed data, computed under the null hypothesis • Example: • Suppose we are given a series of coin tosses • We suspect that a biased coin produced the tosses • We can ask the following question: if the coin were fair, how probable would tosses at least this extreme be? • If this probability is very small, then it is unlikely that a fair coin produced the observed tosses • In this example the null hypothesis is the fair coin and the alternative hypothesis is the biased coin

  7. Binomial distribution • Bernoulli random variable: • Two outcomes: success or failure • Example: coin toss • Binomial random variable: • Number of successes in a series of independent Bernoulli trials • Example: • Probability of heads = 0.5 • Given four coin tosses, what is the probability of three heads? • Possible outcomes: HHHT, HHTH, HTHH, THHH • Each outcome has probability = 0.5^4 • Total probability = 4 * 0.5^4 = 0.25

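  8. Binomial distribution • Bernoulli trial: probability of success = p, probability of failure = 1-p • Given n independent Bernoulli trials, what is the probability of k successes? • P(k successes) = C(n,k) p^k (1-p)^(n-k), where C(n,k) = n!/(k!(n-k)!) is the number of ways to choose which k trials succeed • Binomial applet: http://www.stat.tamu.edu/~west/applets/binomialdemo.html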
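
As a minimal sketch, the standard binomial probability C(n,k) p^k (1-p)^(n-k) can be computed with Python's standard library. The function name `binomial_pmf` is an illustrative choice, not something from the slides.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    # comb(n, k) counts the arrangements of k successes among n trials;
    # each such arrangement has probability p^k * (1-p)^(n-k).
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Four fair coin tosses, three heads (the example from the previous slide).
print(binomial_pmf(3, 4, 0.5))  # 0.25
```

Summing `binomial_pmf(k, n, p)` over k = 0..n should give 1, which is a quick sanity check on any implementation.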

  9. Hypothesis testing under the binomial distribution • Null hypothesis: fair coin (probability of heads = probability of tails = 0.5) • Data: HHHHTHTHHHHHHHTHTHTH (20 tosses, 15 heads) • P-value under null hypothesis = probability that #heads >= 15 in 20 tosses (one-sided) • This probability is 0.021 • Since it is below 0.05 we can reject the null hypothesis
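
The p-value on this slide can be reproduced by summing binomial probabilities over the upper tail. This is a sketch under the slide's one-sided setup; the helper name `binomial_upper_tail` is hypothetical.

```python
from math import comb


def binomial_upper_tail(k_min, n, p=0.5):
    """P(number of successes >= k_min) in n independent Bernoulli(p) trials."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1)
    )


tosses = "HHHHTHTHHHHHHHTHTHTH"   # data from the slide
heads = tosses.count("H")          # observed number of heads
p_value = binomial_upper_tail(heads, len(tosses))

print(heads, len(tosses), round(p_value, 3))  # 15 20 0.021
```

A two-sided test, which would also count outcomes with 5 or fewer heads, would double this value for a fair coin; the slide uses the one-sided version.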

  10. Null hypothesis for case control contingency table • We have two random variables: X: disease status; A: allele type • Null hypothesis: the two variables are independent of each other (unrelated) • Under independence, P(X=case and A=1) = P(X=case)P(A=1) • Expected number of cases with allele 1 is P(X=case)P(A=1)N, where N is the total number of observations • P(X=case) = (a+b)/N • P(A=1) = (a+c)/N • What is the expected number of controls with allele 2? • Do the probabilities sum to 1?

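  11. Chi-square statistic • chi-square = sum over i = 1..n of (Oi - Ei)^2 / Ei • Oi = observed frequency for ith outcome • Ei = expected frequency for ith outcome • n = total outcomes • For large samples, the probability distribution of this statistic is approximately the chi-square distribution with n-1 degrees of freedom when the expected frequencies are fully specified in advance. For a 2x2 case-control table, where the expected counts are estimated from the row and column totals, the degrees of freedom are (2-1)(2-1) = 1. • Proof can be found at http://ocw.mit.edu/NR/rdonlyres/Mathematics/18-443Fall2003/4226DF27-A1D0-4BB8-939A-B2A4167B5480/0/lec23.pdf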

  12. Chi-square • Using chi-square we can test how well the observed values fit the expected values computed under the independence hypothesis • We could also test the data under a multinomial or multivariate normal distribution with probabilities given by the independence assumption. This would require the cumulative distribution functions of the multinomial and multivariate normal distributions, which are hard to compute. • Chi-square p-values are easier to compute

  13. Case control • E1: expected cases with allele 1 • E2: expected cases with allele 2 • E3: expected controls with allele 1 • E4: expected controls with allele 2 • N = a + b + c + d • E1 = ((a+b)/N)((a+c)/N)N = (a+b)(a+c)/N • E2 = (a+b)(b+d)/N • E3 = (c+d)(a+c)/N • E4 = (c+d)(b+d)/N • Now compute the chi-square statistic
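
The four expected counts can be sketched in Python as follows. The function name `expected_counts` is an assumption for illustration, and the sample counts are reused from the earlier odds-ratio example (15 and 35 in cases, 2 and 48 in controls).

```python
def expected_counts(a, b, c, d):
    """Expected cell counts (E1, E2, E3, E4) under independence.

    a, b: observed cases with allele 1 and allele 2
    c, d: observed controls with allele 1 and allele 2
    """
    n = a + b + c + d
    e1 = (a + b) * (a + c) / n   # cases, allele 1
    e2 = (a + b) * (b + d) / n   # cases, allele 2
    e3 = (c + d) * (a + c) / n   # controls, allele 1
    e4 = (c + d) * (b + d) / n   # controls, allele 2
    return e1, e2, e3, e4


print(expected_counts(15, 35, 2, 48))  # (8.5, 41.5, 8.5, 41.5)
```

The expected counts always add up to N, because the four products of row and column totals sum to N^2 before the division by N.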

  14. Chi-square statistic • Compute expected values and chi-square statistic • Compute chi-square p-value by referring to chi-square distribution
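
Putting the steps together, here is a minimal sketch that computes the chi-square statistic for a 2x2 case-control table and a p-value. It assumes 1 degree of freedom, the usual choice for a 2x2 table with expected counts estimated from the margins, and uses the identity that a chi-square variable with 1 degree of freedom is a squared standard normal, so the upper tail is erfc(sqrt(x/2)). The function names are hypothetical, and the counts come from the earlier example slide.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 case-control table."""
    n = a + b + c + d
    observed = [a, b, c, d]
    expected = [
        (a + b) * (a + c) / n,   # E1: cases, allele 1
        (a + b) * (b + d) / n,   # E2: cases, allele 2
        (c + d) * (a + c) / n,   # E3: controls, allele 1
        (c + d) * (b + d) / n,   # E4: controls, allele 2
    ]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p_value_df1(stat):
    """Upper-tail p-value for 1 degree of freedom (assumed, see lead-in)."""
    return erfc(sqrt(stat / 2))


stat = chi_square_2x2(15, 35, 2, 48)
p_value = chi_square_p_value_df1(stat)
print(round(stat, 2), p_value < 0.001)  # 11.98 True
```

For tables with more rows or columns the closed form above no longer applies, and a general chi-square distribution function (for example from a statistics library) would be needed instead.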
