Optical Mapping as a Method of Whole Genome Analysis May 4, 2009 Course: 22M:151 Presented by: Austin J. Ramme
Presentation Outline
• Introduction to Optical Mapping
• Definitions for Understanding
• Modern Optical Mapping Process
• Data Analysis
  • Overview
  • Steps to Restriction Map Generation
• Applications of Optical Mapping
• Conclusions
Optical Mapping (OM) Introduction
• The number of identified polygenic diseases is ever increasing
• Methods to analyze the entire genome will enhance current diagnostic and treatment methods for a variety of diseases
• Patient-specific genomic analysis has become the goal in genetics-based medical research
• Optical mapping (OM) is an automated method of ordered restriction map generation, aimed at whole genome analysis, that avoids the limitations inherent to traditional techniques
Definitions
• Restriction Enzymes
  • Proteins that cleave DNA molecules at a specific base pair sequence (e.g. ATCG)
[Figure: Pac-Man analogy for a restriction enzyme acting on a DNA strand. Image sources: http://www.belchfire.net/screenshots/Pacman.jpg, http://www.dnavitaminpro.com/wp-content/uploads/2008/07/dna-horizontal.jpg, http://static.rbytes.net/full_screenshots/z/e/zenwaw-pacman.jpg]
Definitions
• Restriction Map
  • Representation of the cut sites on a given DNA molecule to provide spatial information of genetic loci
• Optical Mapping
  • Process to generate ordered restriction maps from single DNA molecules
• Optical Map
  • Ordered restriction map of a portion of genomic DNA
[Figure: DNA strand [2]]
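As a toy illustration of the definitions above, the following Python sketch cuts a short sequence at every occurrence of a recognition site and reports the ordered fragment lengths, which is the information a restriction map encodes. The function name `digest`, the `cut_offset` parameter, and the example sequence are assumptions made for illustration; ATCG is the slide's illustrative site.

```python
def digest(sequence, site, cut_offset=0):
    """Return ordered fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` (hypothetical parameter) is where the cut falls relative to
    the start of the recognition site; 0 means the cut is made just before it.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    # Skip zero-length pieces, e.g. when the site sits at the very start.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Made-up 16-base sequence with the site ATCG at positions 2 and 10.
fragments = digest("GGATCGTTTTATCGCC", "ATCG")
print(fragments)  # [2, 8, 6]
```

The fragment lengths always sum to the length of the input sequence, and their order along the molecule is preserved.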
Computer Representation of Imaging Data
• Imaged datasets are converted into barcode patterns corresponding to the cleaved fragments
• Lengths are determined using an internal λ standard and fluorescence intensity values
[Figures: Imaged Cleaved DNA Fragments [5]; Computer Representation of Ordered DNA Fragments]
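The sizing step can be sketched as a simple proportion. This is a minimal sketch that assumes integrated fluorescence intensity scales linearly with fragment length and that a co-imaged λ DNA molecule (about 48.5 kb) serves as the standard; the function name `size_fragments` and the intensity values are hypothetical.

```python
LAMBDA_KB = 48.5  # approximate length of the lambda phage genome in kb


def size_fragments(intensities, standard_intensity, standard_kb=LAMBDA_KB):
    """Convert fragment fluorescence intensities to lengths in kb.

    Assumes intensity is proportional to length, so the standard defines
    a single kb-per-intensity-unit conversion factor.
    """
    kb_per_unit = standard_kb / standard_intensity
    return [round(value * kb_per_unit, 1) for value in intensities]


# Hypothetical intensities in arbitrary units; the standard reads 970.0.
print(size_fragments([970.0, 1940.0, 194.0], standard_intensity=970.0))
# [48.5, 97.0, 9.7]
```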
Raw Data
• Description
  • Image collection containing genomic restriction fragments of known length deposited in an ordered manner
  • Fragments represent randomly sheared genomic DNA
  • Each OM imaging study redundantly represents the entire genomic region of interest
• Challenges with analyzing individual DNA molecules:
  • Extra cut sites: physical breakage
  • Missing cut sites: partial digestion
  • Loss of small fragments
  • Sizing error
  • Chimeric maps: physically overlapped molecules
• Combining multiple OMs gives more accurate restriction maps
  • Graph-based methods have been used to accomplish this
Steps to Restriction Map Generation
• Calculation of OM Overlaps
• Overlap Graph Construction
• Graph Correction Procedure
• Identification of Islands
• Contig Construction
• Construction of Draft Consensus Map
• Consensus Map Refinement
Calculation of Overlaps
• A multitude of OMs are collected per optical mapping experiment
• A scoring system is used to find overlaps between individual optical maps [6]
• Scoring system components:
  • Matching sites are rewarded
  • Discordant sites are penalized
  • Length similarity is rewarded
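The three scoring components listed above can be illustrated with a deliberately simplified score. This is not the scoring function of [6]; the function name `overlap_score`, its parameters, and the default weights are all assumptions chosen only to show how rewards and penalties combine.

```python
def overlap_score(matched_pairs, discordant_sites,
                  match_reward=1.0, site_penalty=0.5, size_weight=1.0):
    """Toy overlap score between two optical maps.

    matched_pairs: list of (length_a, length_b) for fragments aligned to each other.
    discordant_sites: number of cut sites present in one map but not the other.
    """
    score = 0.0
    for length_a, length_b in matched_pairs:
        # Reward each aligned fragment pair (bounded by matching cut sites).
        score += match_reward
        # Reward length similarity: 1.0 for identical sizes, falling toward 0.
        relative_diff = abs(length_a - length_b) / max(length_a, length_b)
        score += size_weight * (1.0 - relative_diff)
    # Penalize cut sites that have no partner in the other map.
    score -= site_penalty * discordant_sites
    return score


# Two aligned fragment pairs (sizes in kb) and one unmatched cut site.
print(overlap_score([(10.0, 10.0), (20.0, 10.0)], discordant_sites=1))  # 3.0
```

In practice a threshold on such a score would decide which pairs of maps count as high-quality overlaps; that threshold is likewise an assumption here.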
Overlap Graph Construction
• Overlap Graph = G(V,E)
  • The literature describes it as a graph, but it's technically a digraph
  • The set of nodes (V) represents individual optical maps
  • The set of edges (E) represents high-quality overlaps between pairs of maps
• Weighting and orienting the edges of the graph
  • Edge weights correspond to genomic distances of the overlapping map regions
  • Orientation is based on the sign of distance measurements from neighboring map centerpoints
• Goal: the heaviest-weight path represents the most comprehensive genomic restriction map
[Figure: optical mapping data (OM1, OM2, OM3, OM4, …) feeding into graph construction]
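One way to picture the construction described above is as a dictionary-based digraph. This sketch assumes each accepted overlap is reported as a triple (map A, map B, signed center-to-center distance in kb), that a positive distance means A lies before B, and that the absolute distance is used as the edge weight; the function name `build_overlap_graph` and the sample values are hypothetical.

```python
def build_overlap_graph(overlaps):
    """Build a weighted digraph: node -> {successor: weight}.

    overlaps: iterable of (map_a, map_b, center_offset_kb), where the sign of
    center_offset_kb gives the relative order of the two map centerpoints.
    """
    graph = {}
    for map_a, map_b, center_offset_kb in overlaps:
        graph.setdefault(map_a, {})
        graph.setdefault(map_b, {})
        if center_offset_kb >= 0:
            graph[map_a][map_b] = center_offset_kb   # A precedes B
        else:
            graph[map_b][map_a] = -center_offset_kb  # B precedes A
    return graph


overlap_records = [("OM1", "OM2", 120.0), ("OM3", "OM2", -80.0)]
print(build_overlap_graph(overlap_records))
# {'OM1': {'OM2': 120.0}, 'OM2': {'OM3': 80.0}, 'OM3': {}}
```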
Graph Correction Procedure (1)
• False edges correspond to falsely identified overlaps
• Spurious edges
  • Connect two nodes so that a cycle forms, which is not possible in linear DNA
• Orientation-consistent false overlaps (cut edges)
  • Edges that connect two unrelated portions of the genome
[Figures from [4]]
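Cycles of the kind described above can be found with a depth-first search that looks for a back edge. The sketch below only locates one edge that closes a cycle; deciding which edge in that cycle is the false overlap is a separate step that is not shown. The function name `find_back_edge` and the adjacency-list input format are assumptions.

```python
def find_back_edge(graph):
    """Return one edge (u, v) that closes a directed cycle, or None if acyclic.

    graph: dict mapping every node to an iterable of its successors.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:      # edge back into the current path
                return (node, nxt)
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


linear = {"OM1": ["OM2"], "OM2": ["OM3"], "OM3": []}
cyclic = {"OM1": ["OM2"], "OM2": ["OM3"], "OM3": ["OM1"]}
print(find_back_edge(linear))  # None
print(find_back_edge(cyclic))  # ('OM3', 'OM1')
```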
Graph Correction Procedure (2)
• False nodes: chimeric maps
  • Appear as a single node (cut vertex) that is the only connection between two groups of nodes
  • Connect two unrelated portions of the genome
[Figure from [4]]
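Candidate chimeric maps can be located as cut vertices (articulation points) of the overlap graph with edge directions ignored. The following is a minimal sketch of the standard depth-first articulation-point search, written recursively and therefore suited only to small graphs; the function name `cut_vertices` and the example graph are assumptions, and a cut vertex is only a candidate, not proof of a chimeric map.

```python
def cut_vertices(adjacency):
    """Return the set of cut vertices of an undirected graph.

    adjacency: dict mapping every node to the set of its neighbors.
    """
    depth, low, result = {}, {}, set()

    def dfs(node, parent, d):
        depth[node] = low[node] = d
        children = 0
        for nxt in adjacency[node]:
            if nxt == parent:
                continue
            if nxt in depth:
                # Already visited: a non-tree edge reaching back up the DFS tree.
                low[node] = min(low[node], depth[nxt])
            else:
                children += 1
                dfs(nxt, node, d + 1)
                low[node] = min(low[node], low[nxt])
                # The subtree under nxt cannot reach above node without node.
                if parent is not None and low[nxt] >= depth[node]:
                    result.add(node)
        # A DFS root is a cut vertex only if it has more than one child.
        if parent is None and children > 1:
            result.add(node)

    for node in adjacency:
        if node not in depth:
            dfs(node, None, 0)
    return result


# Two tightly connected groups joined only through the hypothetical map "OM3".
example = {
    "OM1": {"OM2", "OM3"}, "OM2": {"OM1", "OM3"},
    "OM3": {"OM1", "OM2", "OM4", "OM5"},
    "OM4": {"OM3", "OM5"}, "OM5": {"OM3", "OM4"},
}
print(cut_vertices(example))  # {'OM3'}
```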
Identification of Islands
• Islands correspond to genomic regions spanned by multiple overlapping optical maps
[Figure: overlap graph separated into Island 1, Island 2, and Island 3 [4]]
Contig Construction
• For each island, “contigs” are defined as paths from sources to sinks within the overlap graph for the island
• The most complete representation of the genomic region is given by the heaviest-weight path from source to sink
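Both steps above can be sketched on a small weighted digraph. This sketch treats islands as connected components with edge direction ignored, and finds the heaviest source-to-sink path by memoized recursion; it assumes the graph is already acyclic (after correction), every node appears as a key, and all weights are positive. The function names `islands` and `heaviest_path` and the example graph are hypothetical.

```python
from functools import lru_cache


def islands(graph):
    """Group nodes into connected components, ignoring edge direction."""
    undirected = {node: set(succ) for node, succ in graph.items()}
    for node, succ in graph.items():
        for nxt in succ:
            undirected[nxt].add(node)
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(undirected[node] - component)
        seen |= component
        components.append(component)
    return components


def heaviest_path(graph):
    """Return (total_weight, path) of the heaviest path starting at a source."""
    @lru_cache(maxsize=None)
    def best_from(node):
        best = (0.0, (node,))
        for nxt, weight in graph[node].items():
            tail_weight, tail_path = best_from(nxt)
            if weight + tail_weight > best[0]:
                best = (weight + tail_weight, (node,) + tail_path)
        return best

    has_incoming = {nxt for succ in graph.values() for nxt in succ}
    sources = [node for node in graph if node not in has_incoming]
    return max((best_from(s) for s in sources), key=lambda result: result[0])


example_graph = {
    "OM1": {"OM2": 120.0, "OM3": 150.0}, "OM2": {"OM4": 100.0},
    "OM3": {"OM4": 30.0}, "OM4": {},
    "OM5": {"OM6": 60.0}, "OM6": {},
}
for island in islands(example_graph):
    subgraph = {node: example_graph[node] for node in island}
    print(heaviest_path(subgraph))
# (220.0, ('OM1', 'OM2', 'OM4'))
# (60.0, ('OM5', 'OM6'))
```

Because all weights are positive, the best path from a source always runs on to a sink, so each result is a full source-to-sink contig.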
Construction of Draft Consensus Map
• The nodes and edges along the determined paths are used to merge the individual optical maps corresponding to each chosen island component
• Each of the individual composite optical maps is stored for further analysis
[Figure from [4]]
Consensus Map Refinement (1)
• The draft map may contain errors:
  • Missing cut sites
  • False cut sites
• Hidden Markov Model (HMM) for map refinement
  • Compares the draft map to many other optical maps
  • Statistics are used to identify matching, deleted, and inserted cut sites
[Figure: Hidden Markov Model [7]]
Consensus Map Refinement (2)
• The corrected consensus map for each island is pieced back together to form a complete genomic restriction map
• Statistical correction of the draft map typically takes 13-15 iterations
[Figure: Sample HMM Path [7]]
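The intuition behind removing false cut sites can be shown with a much simpler stand-in for the HMM of [7]: plain support counting. In this sketch a draft cut site is kept only if enough aligned maps show a cut near the same position. It does not add missing sites, does not iterate, and is not the published method; the function name `refine_sites`, the tolerance, and the support threshold are assumptions.

```python
def refine_sites(draft_sites, aligned_maps, tolerance_kb=2.0, min_support=0.5):
    """Keep draft cut sites supported by at least `min_support` of the maps.

    draft_sites: cut-site positions (kb) on the draft consensus map.
    aligned_maps: list of cut-site position lists, already placed in the
        coordinate system of the draft map (assumed done by an earlier step).
    """
    kept = []
    for site in draft_sites:
        # Count maps that contain at least one cut within the tolerance.
        support = sum(
            any(abs(site - observed) <= tolerance_kb for observed in single_map)
            for single_map in aligned_maps
        )
        if support / len(aligned_maps) >= min_support:
            kept.append(site)
    return kept


draft = [10.0, 25.0, 40.0]
observed_maps = [[10.5, 40.2], [9.8, 39.5], [10.1, 25.3, 41.0]]
print(refine_sites(draft, observed_maps))  # [10.0, 40.0]
```

Here the site at 25.0 kb is seen in only one of three maps and is dropped as a likely false cut.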
Applications of Optical Mapping
• Identification of genetic insertions, deletions, inversions, and repeats
• Establishing genotype-phenotype correlations for advancements in diagnosis and treatment of genetic disorders
• Reduction of the time and cost needed to sequence entire strands of DNA
• In the future: patient-specific whole genome analysis
Conclusions
• Optical mapping is a method of restriction map generation for whole genome analysis
• Applications range from clinical phenotype-genotype correlations to identification of polymorphisms in a variety of diseases
• In the future, optical mapping technology will help to realize the goal of patient-specific whole genome analysis
• Optical mapping is a modern application of discrete mathematics with the potential to change medicine
References
[1] Samad A, Huff EF, Cai W, Schwartz DC. Optical mapping: A novel, single-molecule approach to genomic analysis. Genome Res. 1995;5:1-4.
[2] Ramme AJ. Personal image collection.
[3] Schwartz DC, Samad A. Optical mapping approaches to molecular genomics. Curr Opin Biotechnol. 1997;8:70-74.
[4] Valouev A, Schwartz DC, Zhou S, Waterman MS. An algorithm for assembly of ordered restriction maps from single DNA molecules. Proc Natl Acad Sci U S A. 2006;103:15770-15775.
[5] Aston C, Mishra B, Schwartz DC. Optical mapping and its potential for large-scale sequencing projects. Trends Biotechnol. 1999;17:297-302.
[6] Valouev A, Li L, Liu YC, et al. Alignment of optical maps. J Comput Biol. 2006;13:442-462.
[7] Valouev A, Zhang Y, Schwartz DC, Waterman MS. Refinement of optical map assemblies. Bioinformatics. 2006;22:1217-1224.
Questions?
Further information available from:
1.) Laboratory for Molecular and Computational Genetics (http://www.lmcg.wisc.edu/)
2.) Opgen (http://www.opgen.com/)