MEDG 505 Pharmacogenomics, March 18, 2004, A. Brooks-Wilson
Reminder: What is Genomics? According to http://genomics.ucdavis.edu/what.html: “Genomics is operationally defined as investigations into the structure and function of very large numbers of genes undertaken in a simultaneous fashion”
Pharmacogenetics • “The study of how genes affect people’s response to medicines” (NIH) • A subset of complex genetics where the traits relate to drugs • First observed in 1957 • Part of “personalized medicine” • 20-95% of variability in drug disposition and effects is thought to be genetic • Non-genetic factors: age, interacting medications, organ function • Drug metabolism: >30 families of genes
Pharmacogenetics: Examples • Drug metabolism genes • NAT2: hepatotoxicity of the anti-tuberculosis drug isoniazid • CYP3A5: many drugs • Thiopurine S-methyltransferase (TPMT): 6-thioguanine • Drug targets (receptors) • β2-adrenergic receptor: inhaled β-agonists for asthma • Drug transporters • P-glycoprotein (ABCB1, MDR1): resistance to anti-epileptic drugs • The examples known today are those that come closest to simple genetic traits
Potential Consequences • Extended pharmacological effect • Adverse drug reactions • Lack of pro-drug activation • Increased effective dose • Metabolism by alternative, deleterious pathways • Exacerbated drug-drug interactions
Genomics and the Future of Cancer Care: Where does PGX fit? (flow diagram) • General population: healthy individuals each provide a blood sample for testing at a clinic • Susceptibility testing: genotyping, genetic analysis • A. Increased surveillance: frequent screening (blood-based tests, mammography, colonoscopy, cervical screening); proteomics: new tumour markers • B. Early detection and characterization of lesion: gene expression profiling, somatic mutation detection, pharmacogenomics • Customized treatment + clinical care • Cancer patients: a cancer patient visits a BC Cancer Agency clinic; gene expression profiling, somatic mutation detection, pharmacogenomics • Legend: no appreciable cancer susceptibility; susceptible to developing a cancer; diagnosed with cancer
Concepts • Family studies vs. population studies • Penetrance • Genetic heterogeneity • Linkage vs. association • Haplotypes in family and association studies • Genetic variation, SNPs • Genotyping
Types of Genetic Studies • Family studies • multi-generation families • Association studies • Case / control (easiest to collect)
Penetrance • Penetrance = the proportion of carriers who show the phenotype • Expressivity = severity of the phenotype
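The penetrance definition above is a simple proportion. A minimal sketch, assuming we already have counts of carriers and of affected carriers (the function name and the counts are hypothetical, chosen only for illustration):

```python
def penetrance(affected_carriers: int, total_carriers: int) -> float:
    """Proportion of carriers who show the phenotype."""
    if total_carriers <= 0:
        raise ValueError("total_carriers must be positive")
    if not 0 <= affected_carriers <= total_carriers:
        raise ValueError("affected_carriers must be between 0 and total_carriers")
    return affected_carriers / total_carriers


# Hypothetical counts: 16 of 20 carriers show the phenotype.
print(penetrance(16, 20))  # 0.8
```

Expressivity (severity) is a separate quantity and is not captured by this ratio.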
Genetic Heterogeneity • Locus heterogeneity (what we usually refer to when we talk about genetic heterogeneity) • Allelic heterogeneity • Examples: • in a family study, effect • in an association study, effect
Family Studies Identify Highly Penetrant Mutations • High penetrance disease allele(s) • Availability of suitable families is the limiting factor • Family studies are effective for only a minority of conditions
THE IMPORTANCE OF GENETIC BACKGROUND (figure) • Panels: many tumors; few tumors • * = mutation carrier
HETEROGENEOUS GENETIC BACKGROUNDS (figure) • Panels: few tumors; tumor type B only; many A and B tumors; tumor type A only; few A or B tumors; many tumors
GENETICS WITH ENVIRONMENT (figure) • Individuals labeled with exposures: no exercise; accountant; smokes; dyes hair; chronic stress; fireman; farmer; hates broccoli; eats too much; chemist
RANDOM MATING (figure) • Individuals labeled with exposures: no exercise; accountant; smokes; dyes hair; chronic stress; fireman; farmer; hates broccoli; eats too much; chemist
Association Studies Can Identify Variants with High or Low Penetrance • Case / control groups • Not limited to high penetrance alleles • Amenable to the study of gene-environment interactions • Preferred approach for the majority of complex genetic disorders
Complex Diseases • Multigenic (genetic heterogeneity) • Environmental effects (multiple) • Gene-gene interactions • Gene-environment interactions • Association studies will hold up under these complications but family studies will not!
Linkage vs. Association • Linkage is to a locus • different families can be linked to the same locus but have different disease alleles • how to take advantage of this in proving a gene is responsible for a disease • Association is with an allele • done in groups or populations • the allele arose and was propagated in the population; the haplotype was degraded by recombination
Case / Control Studies • Collect blood samples from patients and controls, with consent • Establish database of clinical and epidemiological data • Select ‘candidate’ genes of interest for each trait • Sequence the candidate genes in a small group of patients • Genotype selected variants in case / control groups • Analyze for association with a phenotype • Analyze for gene-gene and gene-environment interactions • Investigate Genetic, Ethical, Legal and Social (GELS) issues
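The "analyze for association" step can be sketched as an allelic chi-square test on a 2 x 2 table of allele counts (cases vs. controls). This is one common choice, not necessarily the analysis used in the course; the function name and the counts below are hypothetical. The p-value uses the fact that a 1-degree-of-freedom chi-square is a squared standard normal.

```python
import math


def allelic_association(case_a, case_b, ctrl_a, ctrl_b):
    """Chi-square test (1 df, no continuity correction) on allele counts.

    case_a / case_b: counts of allele A / allele B on case chromosomes.
    ctrl_a / ctrl_b: the same counts on control chromosomes.
    Returns (chi2, p_value, odds_ratio).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        raise ValueError("every row and column total must be non-zero")

    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )
    # For 1 df: P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    cross = case_b * ctrl_a
    odds_ratio = (case_a * ctrl_b) / cross if cross else math.inf
    return chi2, p_value, odds_ratio


# Hypothetical data: 100 cases and 100 controls (200 chromosomes each).
chi2, p, odds = allelic_association(case_a=90, case_b=110, ctrl_a=60, ctrl_b=140)
print(f"chi2={chi2:.2f}  p={p:.4f}  OR={odds:.2f}")
```

A significant result here implicates the allele (or something in LD with it); gene-gene and gene-environment interactions need a stratified or regression-based analysis instead.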
LD and Association • Direct association • asks about the effect of a variant • if negative, the gene may still be involved! • Indirect association • uses LD • can be conclusively negative if all haplotypes are assessed
Human haplotype blocks . . . (figure) • Ancestral chromosomes • Observed pattern of historical recombination in common haplotypes • Rather than 50 kb
. . . Simplify association studies (figure) • Ancestral chromosomes carrying SNP1 and SNP2 • A disease-causing mutation (*) arises • Association with nearby SNPs • Location of mutation in the gene
Haplotype Blocks • Became clear in October 2001 • 87% of the genome is in blocks ~> 30 kb • Average block 20 kb (Gabriel et al., 2002) • A few common haplotypes at a given locus in a given population • African populations generally have the greatest number of haplotypes and the shortest haplotype blocks • Strength of LD and size of blocks vary greatly between regions • ‘Haplotype tagging’
How to Generate Haplotypes • Haplotyping in families • Impute haplotype frequencies • Expectation Maximization algorithm • generate frequencies for case group, control group separately • Physical determination • long-range PCR, separation of molecules • cloning of single molecules • very labor intensive
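The Expectation Maximization approach can be illustrated for the simplest case, two biallelic SNPs. This is a minimal sketch under stated assumptions: genotypes are coded as the count (0, 1 or 2) of one chosen allele at each SNP, haplotypes are coded as pairs of 0/1, and only double heterozygotes are phase-ambiguous. The function name and the example data are hypothetical.

```python
from collections import Counter

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def em_haplotype_freqs(genotypes, n_iter=100, tol=1e-10):
    """Estimate two-SNP haplotype frequencies from unphased genotypes."""
    if not genotypes:
        raise ValueError("need at least one genotype")
    geno_counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)
    freqs = {h: 0.25 for h in HAPLOTYPES}  # uninformative starting point

    def alleles(g):
        # Alleles carried on the two chromosomes at one SNP.
        return {0: (0, 0), 1: (0, 1), 2: (1, 1)}[g]

    for _ in range(n_iter):
        counts = {h: 0.0 for h in HAPLOTYPES}
        for (g1, g2), n in geno_counts.items():
            if g1 == 1 and g2 == 1:
                # E-step: split double heterozygotes between the two phases
                # in proportion to the current haplotype frequencies.
                cis = freqs[(0, 0)] * freqs[(1, 1)]
                trans = freqs[(0, 1)] * freqs[(1, 0)]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                counts[(0, 0)] += n * w
                counts[(1, 1)] += n * w
                counts[(0, 1)] += n * (1 - w)
                counts[(1, 0)] += n * (1 - w)
            else:
                # Phase is unambiguous when at most one SNP is heterozygous.
                a, b = alleles(g1), alleles(g2)
                counts[(a[0], b[0])] += n
                counts[(a[1], b[1])] += n
        # M-step: re-estimate frequencies from the expected counts.
        new_freqs = {h: counts[h] / n_chrom for h in HAPLOTYPES}
        converged = max(abs(new_freqs[h] - freqs[h]) for h in HAPLOTYPES) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs


# Hypothetical sample: (SNP1 genotype, SNP2 genotype) per individual.
sample = [(0, 0)] * 4 + [(2, 2)] * 2 + [(1, 1)] * 4
print(em_haplotype_freqs(sample))
```

Following the slide, the function would be run once on the case group and once on the control group, and the two sets of estimated frequencies compared.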
Haplotypes and Association Tests • A negative result of a single SNP association test rules out the SNP, not the gene or the region!
Uses of Haplotypes • In the context of family studies • In the context of association studies • In which context will the haplotypes extend further physically?
Strategies • Candidate gene based studies • hypothesis-driven • must guess the right gene!! • Current state of the art • Genome scans • “hypothesis-free” • true genome scans not currently done • scans of ~ 50,000 markers possible
Genetic Markers • SNPs: substitutions, for example, C / T • most common type of genetic variation • 1 SNP every ~ 300 base pairs • The SNP Consortium db contains 1.4 M mapped SNPs • ideal for association mapping over short distances • Microsatellites: (CA)n or other short repeats • more polymorphic than SNPs • less common than SNPs • 1 polymorphic microsatellite per ~ 100,000 base pairs • best for linkage mapping over long distances, in families • Minisatellites: also known as VNTRs (variable number of tandem repeats) • highly polymorphic • forensic applications and paternity testing
SNPs • Single Nucleotide Polymorphisms • Can also use “Indels”, though some investigators throw them away! • Synonymous, non-synonymous SNPs • Mutation vs. polymorphism vs. variant • The 1% definition (and why I don’t like it)
Genetic Markers • Linkage is seen over large distances • think about why! • Microsatellites, repeats of 2, 3 or 4 bp units • 400 markers for a “10 cM” genome scan • Association (LD) is seen over short distances • Think about why! • SNPs • Could need ~500,000 markers for a true genome scan
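The marker counts above imply very different marker spacings for linkage and association scans. A back-of-envelope sketch, assuming a genome of roughly 3 billion base pairs (an assumption; the slides do not state a genome size) and using a hypothetical helper name:

```python
GENOME_BP = 3_000_000_000  # assumption: approximate human genome size in bp


def average_spacing(total_length, n_markers):
    """Average distance between evenly spread markers."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return total_length / n_markers


# Association scan: ~500,000 SNPs across the genome.
print(average_spacing(GENOME_BP, 500_000))  # bp per SNP marker

# Linkage scan: 400 microsatellites over the same genome.
print(average_spacing(GENOME_BP, 400))      # bp per microsatellite marker

# 400 markers at 10 cM spacing imply a genetic map of about this many cM.
print(400 * 10)
```

Under that assumption the association scan needs a marker every few kilobases, while the linkage scan gets by with one every several megabases, which is the contrast the "think about why" prompts are pointing at.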
SNP Databases • dbSNP • Human Genome Variation Database • At least 11 others! • ~ 50,000 non-synonymous SNPs in the human genome • ~ 10 million SNPs with minor allele >1% • ~ 7 million SNPs with minor allele >5%
SNP Discovery is Still Necessary • Most have been found by multi-read sequence mining • Directed SNP discovery in certain genes but not most • Individuals used were “unaffected” • John Todd’s group: the current databases of human variation are inadequate to specify all the common haplotypes in most gene regions
SNP Discovery • All exons and regulatory regions of each gene • Identify regulatory regions by comparative genomics • Bi-directional sequencing • Denaturing High Performance Liquid Chromatography (DHPLC) • Other methods
Sample Output (figure) • Genotypes shown: GG, GA, AA
Does a variant have a functional effect? • Worthy of excitement: • Nonsense mutations (create a stop codon) • Splice site mutations • Coding region deletions • To assay further: • Missense mutations (a.a. substitutions) • Promoter variants • Probably not causal: • Synonymous variation • Using LD to infer function: • is it the only likely variant in the block?
Genotyping • Determining the allele(s) present in a particular sample at a particular (SNP) marker • Many methods
TaqMan Output (figure) • Groups shown: homozygous 1,1; heterozygous; homozygous 2,2
MassEXTEND REACTION (figure) • Unlabeled primer (23-mer); the same primer is used for Allele 1 and Allele 2 • Template at the variant site: TCT (Allele 1), ACT (Allele 2) • + Enzyme + ddATP + dCTP/dGTP/dTTP • Extended primer: 24-mer (Allele 1), 26-mer (Allele 2) • Diagram courtesy of Sequenom
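Why the two alleles give extension products of different lengths can be sketched in a few lines. This is a toy model, not Sequenom's software: it assumes the template strings from the figure (TCT and ACT) are read in the direction of primer extension, that each added base is the complement of the template base, and that extension stops once the terminator (ddATP in the figure's mix) is incorporated. The function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def extend_primer(primer_length, template, terminators=frozenset({"A"})):
    """Return (added_bases, product_length) for a primer-extension reaction.

    template: bases downstream of the primer, in the order they are copied.
    terminators: bases supplied only as dideoxynucleotides (chain terminators).
    """
    added = []
    for base in template:
        incoming = COMPLEMENT[base]
        added.append(incoming)
        if incoming in terminators:
            break  # a dideoxynucleotide blocks further extension
    # If no terminator is incorporated, extension ends with the template.
    return "".join(added), primer_length + len(added)


print(extend_primer(23, "TCT"))  # Allele 1 -> ('A', 24)
print(extend_primer(23, "ACT"))  # Allele 2 -> ('TGA', 26)
```

The two products differ by two bases in this model, consistent with the 24-mer and 26-mer shown in the figure.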
Multiplex Genotyping (figure) • Labels shown: T, C, CT, A, G, AG, AG • Diagram courtesy of Sequenom
Linkage Disequilibrium • The difference between the observed frequency of a haplotype and its expected frequency if all alleles were segregating randomly • For adjacent loci with alleles A,a and B,b: • D = P_AB - P_A x P_B • D is dependent on allele frequencies • The 2 most commonly used measures now: • D’ = |D / D_max| • D’ = 1 if complete LD, but inflated in small samples • r² = D² / (P_A x P_a x P_B x P_b), the product of the 4 allele frequencies • r² = 1 only if the markers have not recombined • r² is emerging as the preferred measure of LD
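The three measures can be computed directly from the AB haplotype frequency and the two allele frequencies. A minimal sketch: the slide does not define D_max, so the standard definition is assumed here (the largest |D| possible given the allele frequencies and the sign of D); the function name and example values are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the A-B haplotype.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b

    # Assumed (standard) D_max: depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # Denominator of r^2: product of the 4 allele frequencies.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical frequencies: allele B occurs only on A-bearing chromosomes.
d, d_prime, r2 = ld_measures(p_ab=0.4, p_a=0.5, p_b=0.4)
print(f"D={d:.3f}  D'={d_prime:.3f}  r2={r2:.3f}")
```

In this example D' reaches 1 while r² stays well below 1, because the two markers have different allele frequencies.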
The Common Disease / Common Variant Hypothesis • Vs. the Common Disease / multiple rare variant hypothesis • Combinations
Sequencing pipeline (figure, numbered steps 1 to 6; step order not recoverable from the extracted text) • Template aliquoting: Robbins Hydra • PCR set-up: Packard Multiprobe II liquid handler • PCR and cycle sequencing: MJ Tetrads • PCR products • Purification of PCR products: Agencourt • Cycle sequencing • Sequencing: ABI 3700s
Determining Allele Frequencies • Reduce the number of genotypes and genotyping cost, particularly for whole genome scans • Pool of case DNAs vs. pool of control DNAs • DNAs must be mixed in precisely equimolar proportions in the pools! • Requires a quantitative genotyping technique • E.g. 40% in cases vs. 20% in controls • Verify positives by genotyping individual samples
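The saving from pooling, and the frequency estimate a quantitative assay gives, can be sketched as follows. This assumes an idealized assay whose two allele signals are exactly proportional to allele amounts in the pool; the function names, signal values, sample sizes and replicate count are all hypothetical.

```python
def pooled_allele_frequency(signal_1, signal_2):
    """Allele 1 frequency in a pool, from two quantitative allele signals."""
    total = signal_1 + signal_2
    if total <= 0:
        raise ValueError("signals must sum to a positive value")
    return signal_1 / total


def reactions_needed(n_cases, n_controls, n_markers, pooled, replicates=1):
    """Number of genotyping reactions for individual vs. pooled designs."""
    if pooled:
        return 2 * n_markers * replicates  # one case pool, one control pool
    return (n_cases + n_controls) * n_markers


# Hypothetical signals reproducing the slide's 40% vs. 20% example.
case_freq = pooled_allele_frequency(40.0, 60.0)
control_freq = pooled_allele_frequency(20.0, 80.0)
print(case_freq, control_freq)

# Hypothetical scan: 500 cases, 500 controls, 50,000 markers.
print(reactions_needed(500, 500, 50_000, pooled=False))
print(reactions_needed(500, 500, 50_000, pooled=True, replicates=4))
```

Real assays show unequal allele amplification and measurement noise, which is why the slide calls for precisely equimolar pools and for confirming positives by genotyping individual samples.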