Sorting Cancer Karyotypes by Elementary Operations. Michal Ozery-Flato and Ron Shamir, School of Computer Science, Tel Aviv University.
Outline • Introduction • Modeling the evolution of cancer karyotypes • The karyotype sorting problem • Combinatorial Analysis • Results
Normal female karyotype http://www.ncbi.nlm.nih.gov/sky/skyweb.cgi
Breast cancer karyotype (MCF-7) http://www.ncbi.nlm.nih.gov/sky/skyweb.cgi
Chromosomal Instability • A phenotype of most cancer cells • Losses or gains of chromosomes result from errors during mitosis • Chromosome rearrangements are associated with "double strand breaks" (Figure: multi-polar mitoses)
Double Strand Breaks • Constitute the most dangerous type of DNA damage • A successful repair ligates two matching broken ends • Mis-repair can result in rearrangements (e.g. translocations) or deletions (Figure labels: double strand break; M.C. Escher, 1953)
The Challenge Analyze the evolution of aberration events in cancer karyotypes
The Mitelman Database of Chromosome Aberrations in Cancer • Over 55,000 cancer karyotypes, culled from over 8000 scientific publications • Can be parsed automatically (CyDAS parser www.cydas.org) • The largest current data resource on cancer genomes' organization
The Normal Karyotype • Band = the basic unit observable in a karyotype: a unique region in the genome, identified by an integer • Normal chromosome = an interval of bands • Two normal chromosomes are either disjoint or equivalent • Normal karyotype = a collection of normal chromosomes • Usually contains two copies of each chromosome (with the possible exception of the sex chromosomes)
The Cancer Karyotype • Fragment = a sub-interval (>1 bands) of a normal chromosome • Chromosome = one fragment, or a concatenation of several fragments, where "::" marks each concatenation (breakpoint) • Orientation-less: [1,4]::[37,40] ≡ [40,37]::[4,1] • Cancer karyotype = a collection of chromosomes
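The orientation-less rule can be made concrete with a small sketch. It assumes a chromosome is stored as a list of (first band, last band) fragments, so that (4, 1) is the fragment [1,4] read backwards; the type aliases and function names are illustrative and not taken from the talk.

```python
from typing import List, Tuple

Fragment = Tuple[int, int]    # (first band, last band); (4, 1) is [1,4] reversed
Chromosome = List[Fragment]   # consecutive fragments are joined by "::"

def reverse(chrom: Chromosome) -> Chromosome:
    """Read the chromosome from its other end: reverse the fragment
    order and flip every fragment."""
    return [(b, a) for (a, b) in reversed(chrom)]

def canonical(chrom: Chromosome) -> Tuple[Fragment, ...]:
    """Pick one fixed representative out of the two reading directions."""
    return min(tuple(chrom), tuple(reverse(chrom)))

def same_chromosome(x: Chromosome, y: Chromosome) -> bool:
    """Two fragment lists denote the same orientation-less chromosome
    when their canonical forms agree."""
    return canonical(x) == canonical(y)

# The example from the slide: [1,4]::[37,40] and [40,37]::[4,1]
print(same_chromosome([(1, 4), (37, 40)], [(40, 37), (4, 1)]))  # True
```

Comparing canonical forms also makes it possible to store chromosomes in sets or counters without double-counting the two reading directions.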
Elementary Operations • Breakage • Fusion • Deletion • Duplication • These operations can generate all known chromosomal aberrations!
The Karyotype Sorting (KS) Problem • Find a shortest sequence of elementary operations that transforms the normal karyotype into a given cancer karyotype • Find the elementary distance = #operations in such a solution to KS
The Karyotype Sorting (KS) Problem (inverse formulation) • Find a shortest sequence of inverse elementary operations that transforms the given cancer karyotype into the normal karyotype
Inverse Elementary Operations • Breakage ↔ fusion (each is the inverse of the other) • Inverse of deletion: addition • Inverse of duplication: c-deletion (constrained deletion)
Assumptions • Example: in [20,39]::[12,1] the breakpoint ID is {39^0, 12^0} • ~95% of the karyotypes in the Mitelman Database have no recurrent breakpoints • Assumptions: • The cancer karyotype contains no recurrent breakpoints • Every added chromosome contains no breakpoints
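A minimal sketch of how the no-recurrent-breakpoints assumption could be checked. It assumes that a breakpoint is identified by the two fragment ends it joins, and that "recurrent" means the same breakpoint occurs more than once in the karyotype. The ("lo"/"hi") side labels are a hypothetical encoding chosen for this sketch and are not the notation used on the slide.

```python
from collections import Counter
from typing import FrozenSet, List, Tuple

Fragment = Tuple[int, int]    # (first band, last band); (12, 1) is [1,12] reversed
Chromosome = List[Fragment]
End = Tuple[int, str]         # (band, exposed side), side is "lo" or "hi" (hypothetical)

def right_end(frag: Fragment) -> End:
    """The fragment end that faces the next fragment."""
    a, b = frag
    return (b, "hi" if b > a else "lo")

def left_end(frag: Fragment) -> End:
    """The fragment end that faces the previous fragment."""
    a, b = frag
    return (a, "lo" if a < b else "hi")

def breakpoints(chrom: Chromosome) -> List[FrozenSet[End]]:
    """One unordered pair of joined ends per "::" in the chromosome."""
    return [frozenset((right_end(x), left_end(y)))
            for x, y in zip(chrom, chrom[1:])]

def has_recurrent_breakpoint(karyotype: List[Chromosome]) -> bool:
    """True if some breakpoint ID occurs more than once in the karyotype."""
    counts = Counter(bp for chrom in karyotype for bp in breakpoints(chrom))
    return any(n > 1 for n in counts.values())

# [20,39]::[12,1] joins band 39 and band 12
print(breakpoints([(20, 39), (12, 1)]))
```

Because each ID is an unordered pair of ends, the result does not depend on the direction in which a chromosome is read.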
The Reduced Karyotype Sorting (RKS) Problem • Assumptions leading to the reduced problem: • No breakpoints in the cancer karyotype (i.e. every chromosome is an interval) • No breakpoints created by fusions / additions • All the normal chromosomes are identical • Operations: breakage, fusion, c-deletion, addition (Figure: the normal karyotype, made of identical chromosomes, and the cancer karyotype, both drawn over positions 0-11)
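The four operations of the reduced problem can be sketched on a multiset of intervals. This is only an illustration under stated assumptions: an interval is a pair of boundary positions (left, right) on the 0-11 axis, a fusion is allowed only when one interval ends exactly where the other starts (so no breakpoint is created), and a c-deletion is allowed only when an identical copy of the deleted chromosome remains. The function names are illustrative.

```python
from collections import Counter
from typing import Tuple

Interval = Tuple[int, int]   # (left, right) boundary positions, left < right

def _remove(k: Counter, iv: Interval) -> None:
    """Remove one copy of iv from the multiset k, in place."""
    if k[iv] <= 0:
        raise ValueError(f"{iv} is not in the karyotype")
    k[iv] -= 1
    if k[iv] == 0:
        del k[iv]

def breakage(k: Counter, iv: Interval, p: int) -> Counter:
    """Split one copy of iv at an interior position p."""
    a, b = iv
    if not a < p < b:
        raise ValueError("break position must lie strictly inside the interval")
    out = k.copy()
    _remove(out, iv)
    out[(a, p)] += 1
    out[(p, b)] += 1
    return out

def fusion(k: Counter, left: Interval, right: Interval) -> Counter:
    """Join two intervals whose ends meet, so that no breakpoint is created."""
    if left[1] != right[0]:
        raise ValueError("the interval ends are not complementing")
    out = k.copy()
    _remove(out, left)
    _remove(out, right)
    out[(left[0], right[1])] += 1
    return out

def c_deletion(k: Counter, iv: Interval) -> Counter:
    """Delete one copy of iv, assuming an identical copy must remain."""
    if k[iv] < 2:
        raise ValueError("c-deletion needs at least two identical copies")
    out = k.copy()
    _remove(out, iv)
    return out

def addition(k: Counter, iv: Interval) -> Counter:
    """Add a chromosome that is a single interval (no breakpoints)."""
    out = k.copy()
    out[iv] += 1
    return out

cancer = Counter({(0, 5): 1, (5, 11): 1, (0, 11): 2})
step1 = fusion(cancer, (0, 5), (5, 11))   # three copies of (0, 11)
step2 = c_deletion(step1, (0, 11))        # two copies of (0, 11)
print(step2)
```

Each function returns a new karyotype and leaves its input untouched, which keeps a search over operation sequences simple to write.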
Combinatorial Analysis (RKS Problem)
Extending the karyotypes (Figure: the normal karyotype and the cancer karyotype, drawn over positions 0-11)
Parameter 1: f = #disjoint pairs of complementing interval ends • Observation: • Δf = -1 for fusion; Δf = 1 for breakage • Δf ∈ {0,-1,-2} for c-deletion • Δf ∈ {0,1,2} for addition (Figure: an example over positions 0-11 with f = 5)
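One possible reading of Parameter 1, sketched under assumptions: intervals are pairs of boundary positions, and two interval ends are complementing when one interval ends at the position where another starts. Under that reading, the largest number of disjoint complementing pairs at a position is the smaller of the two end counts there. The function name and the example karyotype are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Interval = Tuple[int, int]   # (left, right) boundary positions, left < right

def count_complementing_pairs(karyotype: List[Interval]) -> int:
    """f under the assumed reading: at each position p, pair the intervals
    that end at p with the intervals that start at p, each end used once."""
    starts = Counter(a for a, _ in karyotype)
    ends = Counter(b for _, b in karyotype)
    return sum(min(ends[p], starts[p]) for p in ends)

k = [(0, 3), (3, 7), (7, 11), (0, 11)]
print(count_complementing_pairs(k))                # 2

# Fusing (0,3) with (3,7) uses up one complementing pair
fused = [(0, 7), (7, 11), (0, 11)]
print(count_complementing_pairs(fused))            # 1
```

With this reading, a fusion lowers f by exactly one and a breakage raises it by exactly one, in line with the observations on the slide.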
The histogram • Parameter 2: w = #bricks • Observations: • w is even • Δw = 0 for breakage / fusion • Δw ∈ {0,2} for addition / c-deletion (Figure: the cancer karyotype and its histogram over positions 0-11, with the labels "A brick" and "A wall with 2 bricks")
Simple Bricks • A brick is simple if • it has no lower brick (in the same wall), and • it has no complementing interval ends • Parameter 3: s = #simple bricks • Observation: • Δs ∈ {0,-1} for breakage • Δs = 0 for constrained-deletion • |Δs| ≤ 2 for addition (Figure: simple bricks over positions 0-11)
The Weighted Bipartite Graph of Bricks • Parameter 4: m = the minimum weight of a perfect matching (Figure: positive bricks over positions 0-11)
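For readers unfamiliar with Parameter 4, the sketch below computes the minimum weight of a perfect matching in a small weighted bipartite graph by trying every assignment. The weight matrix is made up for illustration; the slide does not say how the brick graph's vertices or edge weights are defined, so nothing here reproduces that construction.

```python
from itertools import permutations
from typing import List

def min_weight_perfect_matching(weights: List[List[int]]) -> int:
    """weights[i][j] is the weight of the edge between left vertex i and
    right vertex j. Brute force over all perfect matchings, so this is
    only suitable for very small graphs."""
    n = len(weights)
    if any(len(row) != n for row in weights):
        raise ValueError("a perfect matching needs equally sized sides")
    return min(
        sum(weights[i][j] for i, j in enumerate(assignment))
        for assignment in permutations(range(n))
    )

# Hypothetical 3 x 3 weight matrix
example = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
print(min_weight_perfect_matching(example))  # 5
```

For larger graphs, a polynomial-time assignment solver such as `scipy.optimize.linear_sum_assignment` would be the usual choice in place of the brute-force loop.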
Main Theorem • The elementary distance, d, satisfies: w/2 + f + s + m − 2N ≤ d ≤ 3w/2 + f + s + m − 2N, where N = #intervals in the normal karyotype
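The two bounds of the theorem are simple to evaluate once the four parameters are known. The sketch below does only that arithmetic; the parameter values in the example are invented, and computing w, f, s and m from an actual karyotype is outside its scope.

```python
from typing import Tuple

def distance_bounds(w: int, f: int, s: int, m: int, n: int) -> Tuple[int, int]:
    """Return (lower, upper) with
    lower = w/2 + f + s + m - 2N and upper = 3w/2 + f + s + m - 2N.
    w must be even, as stated for Parameter 2."""
    if w % 2 != 0:
        raise ValueError("w is expected to be even")
    base = f + s + m - 2 * n
    return w // 2 + base, 3 * w // 2 + base

# Invented parameter values, for illustration only
lower, upper = distance_bounds(w=4, f=5, s=1, m=2, n=2)
print(lower, upper)  # 6 10
```

The gap between the two bounds is always exactly w, so a karyotype with no bricks (w = 0) has its elementary distance determined by the formula.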
Results (2) • Used the main theorem to devise a polynomial-time 3-approximation algorithm • Combined with a greedy heuristic, it computed optimal solutions for 100% of the karyotypes in the real data (95% of the Mitelman DB) • 99.99% of cases: the lower bound is achieved (hence the solution is optimal) • 30 cases: lower bound + 2, but actually optimal (manual verification)
Summary • A new framework for analyzing chromosomal aberrations in cancer • A 3-approximation algorithm when there are no recurrent breakpoints • 100% success on 57,252 karyotypes (with no recurrent breakpoints) from the Mitelman DB. • Future work: handle recurrent breakpoints • Analyze the remaining 5% of the karyotypes in the Mitelman DB.
Thank you for your attention. Questions?