MEDG 505 Pharmacogenomics. March 17, 2005. A. Brooks-Wilson.
Reminder: What is Genomics? According to http://genomics.ucdavis.edu/what.html: “Genomics is operationally defined as investigations into the structure and function of very large numbers of genes undertaken in a simultaneous fashion”
Pharmacogenetics • “The study of how genes affect people’s response to medicines” (NIH) • A subset of complex genetics for which the traits relate to drugs • First observed in 1957 • Part of “personalized medicine” • 20-95% of variability in drug disposition and effects is thought to be genetic • Non-genetic factors: age, interacting medications, organ function • Drug absorption, distribution, metabolism, excretion • >30 families of genes
Pharmacogenetics: Examples • Drug metabolism genes • NAT2, isoniazid anti-tuberculosis drug hepatotoxicity • CYP3A5, many drugs • Thiopurine S-methyltransferase (TPMT), 6-thioguanine • Drug targets (receptors) • β2-adrenergic receptor, inhaled β-agonists for asthma • Drug transporters • P-glycoprotein (ABCB1, MDR1), resistance to anti-epileptic drugs • The examples known today are those that come closest to simple genetic traits
Potential Consequences • Extended / shortened pharmacological effect • Adverse drug reactions • Lack of pro-drug activation • Increased / decreased effective dose • Metabolism by alternative, deleterious pathways • Exacerbated drug-drug interactions
The Goal of Pharmacogenomics [Figure: picture from the Perlegen website, www.perlegen.com]
Complex Genetics: Concepts • Family studies vs. population studies • Penetrance • Genetic heterogeneity • Linkage vs. association • Haplotypes in family and association studies • Genetic variation, SNPs • Genotyping
Types of Genetic Studies • Family studies • multi-generation families • Association studies • Case / control (easiest to collect)
Penetrance • Penetrance = the proportion of carriers who show the phenotype • Expressivity = severity of the phenotype
Genetic Heterogeneity • Locus heterogeneity (what we usually refer to when we talk about genetic heterogeneity) • Allelic heterogeneity
Family Studies Identify Highly Penetrant Mutations • High penetrance disease allele(s) • Availability of suitable families is the limiting factor • Family studies are effective for only a minority of conditions
Association Studies Can Identify Variants with High or Low Penetrance • Case / control groups • Not limited to high penetrance alleles • Amenable to the study of gene-environment interactions • A preferred approach for the majority of complex genetic disorders
Complex Diseases / Phenotypes • Multigenic (genetic heterogeneity) • Environmental effects (multiple) • Gene-gene interactions • Gene-environment interactions (for pharmacogenetic traits: age, alcohol consumption, hepatitis exposure, etc.) • Association studies will hold up under these complications but family-based linkage studies will not!
Linkage vs. Association • Linkage is to a locus • different families can be linked to the same locus but have different disease alleles • how to take advantage of this in proving a gene is responsible for a disease • Association is with an allele • done in groups or populations • the allele arose and was propagated in the population; the haplotype was degraded by recombination
Genetic Markers • SNPs: substitutions, for example, C / T • Most common type of genetic variation • Ideal for association mapping over short distances • 1 SNP every ~200 base pairs in a population • 1 SNP every ~1000 base pairs between 2 individuals • dbSNP: >10M putative SNPs, >5M validated SNPs • Microsatellites: (CA)n or other short repeats • More polymorphic than SNPs • Less common than SNPs • 1 polymorphic microsatellite per ~100,000 base pairs • Best for linkage mapping over long distances, in families
SNPs • Single Nucleotide Polymorphisms • Can also use “Indels”, though some investigators throw them away! • Synonymous, non-synonymous SNPs • Mutation vs. polymorphism vs. variant or variation • The 1% definition
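The synonymous / non-synonymous distinction above can be sketched in a few lines. This is a minimal illustration, assuming the standard genetic code and a SNP that falls inside a known reading frame; the function name `classify_coding_snp` and its arguments are hypothetical, not part of the lecture.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(codon, position, alt_base):
    """Classify a single-base change within one codon.

    codon    -- reference codon on the coding strand, e.g. "GAG"
    position -- 0-based position of the SNP within the codon (0, 1 or 2)
    alt_base -- the alternate base at that position
    A change to or from a stop codon ('*') is reported as non-synonymous here.
    """
    alt_codon = codon[:position] + alt_base + codon[position + 1:]
    same_amino_acid = CODON_TABLE[codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same_amino_acid else "non-synonymous"


print(classify_coding_snp("GAA", 2, "G"))  # GAA and GAG both encode Glu
print(classify_coding_snp("GAG", 1, "T"))  # GAG (Glu) to GTG (Val)
```

The sketch ignores splice sites, regulatory variants and indels, all of which need different handling.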
SNP Databases • dbSNP (more than just human) • Human Genome Variation Database • At least 11 others! • ~ 10 million SNPs with minor allele >1% • ~ 7 million SNPs with minor allele >5% • ~ 50,000 non-synonymous SNPs in the human genome
Case / Control Studies • Collect blood samples from patients and controls, with consent • Establish database of clinical and epidemiological data • Select ‘candidate’ genes of interest for each trait • Sequence the candidate genes in a small group of patients • Genotype selected variants in case / control groups • Analyze for association with a phenotype • Analyze for gene-gene and gene-environment interactions • Genetic, Ethical, Legal and Social (GELS) issues investigations
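The "analyze for association" step can be illustrated with a simple allelic test. This is a sketch only, assuming a biallelic SNP and a 2 x 2 table of allele counts tested with a Pearson chi-square (1 degree of freedom); the function names and the genotype counts in the example are made up for illustration, and real studies typically also use genotype-based tests and correct for multiple testing.

```python
import math


def allele_counts(n_11, n_12, n_22):
    """Convert genotype counts (1/1, 1/2, 2/2) into counts of allele 1 and allele 2."""
    return 2 * n_11 + n_12, 2 * n_22 + n_12


def allelic_association(case_genotypes, control_genotypes):
    """Pearson chi-square test (1 df) on a 2 x 2 table of allele counts.

    Each argument is a tuple (n_11, n_12, n_22) of genotype counts.
    Returns (chi_square, p_value, odds_ratio for allele 1).
    Assumes every row and column total of the table is non-zero.
    """
    a, b = allele_counts(*case_genotypes)      # cases: allele 1, allele 2
    c, d = allele_counts(*control_genotypes)   # controls: allele 1, allele 2
    n = a + b + c + d
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    odds_ratio = (a * d) / (b * c)
    return chi_square, p_value, odds_ratio


# Hypothetical counts: 100 cases and 100 controls.
chi_square, p_value, odds_ratio = allelic_association((10, 40, 50), (4, 32, 64))
print(f"chi2 = {chi_square:.2f}, p = {p_value:.4f}, OR = {odds_ratio:.2f}")
```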
Linkage Disequilibrium • The difference between the observed frequency of a haplotype and its expected frequency if all alleles were segregating randomly • For adjacent loci: A,a B,b • $D = P_{AB} - P_A \times P_B$ • D is dependent on allele frequencies • Other related measures also used
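As a worked example of the definition above, the sketch below computes D from a haplotype frequency and two allele frequencies. It also computes D' and r², two commonly used related measures; the slide does not say which "other related measures" it has in mind, so their inclusion here is an assumption. The function name and the example frequencies are illustrative.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab -- observed frequency of the A-B haplotype
    p_a  -- frequency of allele A at the first locus
    p_b  -- frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b
    # D' rescales D by the largest magnitude it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d ** 2 / denominator if denominator > 0 else 0.0
    return d, d_prime, r_squared


# Hypothetical frequencies: P_AB = 0.30, P_A = 0.60, P_B = 0.40.
d, d_prime, r_squared = ld_measures(0.30, 0.60, 0.40)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r_squared:.4f}")
```

Because the maximum possible D depends on the allele frequencies, raw D values are hard to compare across marker pairs, which is why normalized measures are usually reported.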
Human haplotype blocks . . . [Figure: ancestral chromosomes and the observed pattern of historical recombination in common haplotypes; annotation: "Rather than 50 kb"]
. . . Simplify association studies [Figure: ancestral chromosomes carrying SNP1 and SNP2; a disease-causing mutation (*) arises on one haplotype, producing association with nearby SNPs around the location of the mutation in the gene]
LD and Association • Direct association • asks about the effect of a variant • if negative, the gene may still be involved! • Indirect association • uses LD • can be more convincingly negative if haplotypes are assessed
Haplotype Blocks • Became clear in October 2001 • 87% of the genome is in blocks ~> 30 kb • Not all of the genome is in haplotype blocks! • Average block 22 kb, 11 kb in African populations (Gabriel et al., 2002) • A few common haplotypes at a given locus in a given population • African populations generally have the greatest number of haplotypes and the shortest haplotype blocks • Strength of LD and size of blocks vary greatly between regions
How to Generate Haplotypes • Haplotyping in families • Physical determination • long-range PCR, separation of molecules • cloning of single molecules • labor intensive • Estimate haplotype frequencies • Expectation Maximization algorithm, others • generate frequencies for case group, control group
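The Expectation Maximization approach can be sketched for the simplest case of two biallelic SNPs. This is an illustration under stated assumptions, not the algorithm of any particular haplotyping package: genotypes are coded as the number of copies (0, 1 or 2) of the alternate allele at each SNP, mating is assumed random, and only double heterozygotes are phase-ambiguous. All names are hypothetical.

```python
from collections import Counter

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def _alleles(genotype):
    """Map an alternate-allele count (0, 1, 2) to the two alleles carried."""
    return {0: (0, 0), 1: (0, 1), 2: (1, 1)}[genotype]


def em_haplotype_frequencies(genotypes, n_iter=100):
    """Estimate two-SNP haplotype frequencies from unphased genotypes.

    genotypes -- non-empty list of (g1, g2) tuples, each value 0, 1 or 2
    Returns a dict mapping each haplotype (allele at SNP1, allele at SNP2)
    to its estimated frequency.
    """
    counts = Counter(genotypes)
    n_chromosomes = 2 * len(genotypes)
    n_double_het = counts.pop((1, 1), 0)

    # Haplotype counts from genotypes whose phase is unambiguous.
    fixed = dict.fromkeys(HAPLOTYPES, 0.0)
    for (g1, g2), n in counts.items():
        for haplotype in zip(_alleles(g1), _alleles(g2)):
            fixed[haplotype] += n

    freqs = dict.fromkeys(HAPLOTYPES, 0.25)
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote carries 00/11 rather than 01/10.
        cis = freqs[(0, 0)] * freqs[(1, 1)]
        trans = freqs[(0, 1)] * freqs[(1, 0)]
        w_cis = cis / (cis + trans) if cis + trans > 0 else 0.5
        expected = dict(fixed)
        expected[(0, 0)] += n_double_het * w_cis
        expected[(1, 1)] += n_double_het * w_cis
        expected[(0, 1)] += n_double_het * (1 - w_cis)
        expected[(1, 0)] += n_double_het * (1 - w_cis)
        # M-step: re-estimate frequencies from the expected counts.
        freqs = {h: expected[h] / n_chromosomes for h in HAPLOTYPES}
    return freqs


# Hypothetical sample: 4 individuals 0/0 at both SNPs, 4 individuals 1/1 at both, 2 double heterozygotes.
sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
print(em_haplotype_frequencies(sample))
```

To follow the last bullet above, the same function would be run separately on the case group and on the control group, and the two sets of estimated frequencies compared.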
Tag SNPs [Figure: four chromosome copies (1 to 4)]
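One way to pick tag SNPs from a set of phased chromosome copies is a greedy search on pairwise r². The slide does not specify a selection method, so the approach, the threshold of 0.8, the function names and the four example chromosomes below are all assumptions made for illustration.

```python
def r_squared(x, y):
    """Pairwise r^2 between two SNPs given as lists of 0/1 alleles per chromosome."""
    n = len(x)
    p_a = sum(x) / n
    p_b = sum(y) / n
    p_ab = sum(1 for a, b in zip(x, y) if a == 1 and b == 1) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return (p_ab - p_a * p_b) ** 2 / denominator


def select_tag_snps(haplotypes, threshold=0.8):
    """Greedily choose SNP indices so that every SNP is a tag or has r^2 >= threshold with one.

    haplotypes -- list of chromosomes, each a list of 0/1 alleles (one entry per SNP)
    """
    n_snps = len(haplotypes[0])
    columns = [[h[i] for h in haplotypes] for i in range(n_snps)]
    covers = {
        i: {j for j in range(n_snps)
            if i == j or r_squared(columns[i], columns[j]) >= threshold}
        for i in range(n_snps)
    }
    untagged = set(range(n_snps))
    tags = []
    while untagged:
        # Pick the SNP that captures the most still-untagged SNPs (lowest index on ties).
        best = max(range(n_snps), key=lambda i: (len(covers[i] & untagged), -i))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Four hypothetical chromosome copies typed at six SNPs.
chromosomes = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1, 0],
]
print(select_tag_snps(chromosomes))  # SNP indices chosen as tags
```

In this toy data, SNPs 0 to 2 are perfectly correlated, as are SNPs 3 and 4, so three tags capture all six SNPs.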
The HapMap • Reference map for association studies • Expected to reduce the number of markers required to conduct effective genome scans for association • 270 samples from 4 populations: • 30 Yoruban trios (Nigeria) • 45 unrelated Japanese (Tokyo) • 45 unrelated Chinese (Beijing) • 30 U.S. trios (CEPH, N/W European ancestry) • >400,000 markers genotyped in all samples, nearly 1M in CEPH trios
Strategies • Candidate gene based studies • hypothesis-driven • must guess (one of) the right gene(s)!! • Current state of the art • Genome scans • “hypothesis-free” • scans of ~ 1 million markers are now possible
SNP Discovery is Still Necessary • Many have been found by multi-read sequence mining • Directed public SNP discovery in certain sets of genes, e.g.: • SNP500Cancer • Environmental Genome Project (EGP) • Individuals used usually “unaffected”
SNP Discovery • All exons and regulatory regions of each gene • Identify regulatory regions by comparative genomics • Bi-directional sequencing • Denaturing High Performance Liquid Chromatography (DHPLC) • Other methods
[Figure: six-step laboratory workflow. Labels: Template aliquoting: Robbins Hydra • PCR set-up: Packard Multiprobe II liquid handler • PCR and cycle sequencing: MJ Tetrads • PCR products • Purification of PCR products: Agencourt • Cycle sequencing • Sequencing: ABI 3700s]
SNP Discovery: PolyPhred and Consed • PolyPhred: Debbie Nickerson • Consed: Phil Green
Sample Output [Figure: sample output for genotypes GG, GA and AA]
Genotyping, Technology • Determining the allele(s) present in a particular sample at a particular (SNP) marker • Many methods
TaqMan Output [Figure: groups labelled Homozygous 1,1; Heterozygous; Homozygous 2,2]
MassEXTEND Reaction [Figure, courtesy of Sequenom: the same unlabeled 23-mer primer anneals next to Allele 1 or Allele 2; with enzyme, ddATP and dCTP/dGTP/dTTP it is extended to a 24-mer (Allele 1) or a 26-mer (Allele 2)]
Sequenom MassARRAY: < 12-plex [Figure: multiplexed output with allele labels (T, C, CT, A, G, AG); diagram courtesy of Sequenom]
Illumina BeadArray System: 1152-plex • 1152-fold multiplexing • 0.26 ng of genomic DNA per genotype • $0.05 USD per genotype
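The per-genotype figures above translate into study-level requirements by simple multiplication. The sketch below uses the slide's 0.26 ng and $0.05 values as defaults; the function name and the example study size (1,000 samples typed on one 1152-plex panel) are hypothetical, and the estimate ignores reagent overheads, repeats and failed assays.

```python
def genotyping_budget(n_samples, n_markers, ng_per_genotype=0.26, usd_per_genotype=0.05):
    """Rough DNA and cost requirements for genotyping n_markers in n_samples."""
    n_genotypes = n_samples * n_markers
    return {
        "genotypes": n_genotypes,
        "dna_ng_per_sample": n_markers * ng_per_genotype,
        "cost_usd": n_genotypes * usd_per_genotype,
    }


budget = genotyping_budget(n_samples=1000, n_markers=1152)
print(budget)
```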
Illumina BeadArray System [Figure: panels A and B]
Affymetrix Whole Genome Sampling Analysis: 500,000-plex Kennedy et al., 2003
Affymetrix: Allele-Specific Hybridization • PM = perfect match • MM = mismatch
DNA Pooling Strategies • Reduce the number of genotypes and genotyping cost, particularly for whole genome scans • Pool of case DNAs vs. pool of control DNAs • DNAs must be mixed in precisely equimolar proportions in the pools! • Requires a quantitative genotyping technique • E.g. 40% in cases vs. 20% in controls • Verify positives by genotyping individual samples
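The 40% vs. 20% comparison can be made concrete with a two-proportion z-test on the estimated pool allele frequencies. This is a sketch under assumptions: the pool sizes in the example (100 individuals each) are invented, the function name is hypothetical, and the test accounts only for sampling of chromosomes, not for measurement error in the pooled genotyping assay, which is one reason positives are verified by individual genotyping.

```python
import math


def pooled_frequency_test(freq_cases, n_cases, freq_controls, n_controls):
    """Two-proportion z-test comparing allele frequencies estimated from two DNA pools.

    freq_cases, freq_controls -- estimated allele frequencies in each pool (0 to 1)
    n_cases, n_controls       -- number of diploid individuals in each pool
    Returns (z, two_sided_p_value).
    """
    chrom_cases = 2 * n_cases        # each individual contributes two chromosomes
    chrom_controls = 2 * n_controls
    pooled = (freq_cases * chrom_cases + freq_controls * chrom_controls) / (
        chrom_cases + chrom_controls
    )
    standard_error = math.sqrt(
        pooled * (1 - pooled) * (1 / chrom_cases + 1 / chrom_controls)
    )
    z = (freq_cases - freq_controls) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p_value


z, p_value = pooled_frequency_test(0.40, 100, 0.20, 100)
print(f"z = {z:.2f}, p = {p_value:.2e}")
```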