CSCI2950-C Lecture 9 Cancer Genomics

Explore the genomics of cancer, including rearrangements and fusion genes, through paired-end sequencing and comparative genomic hybridization.



Presentation Transcript


  1. CSCI2950-C Lecture 9: Cancer Genomics. October 16, 2008. http://cs.brown.edu/courses/csci2950-c/

  2. Outline • Cancer Genomes • Paired-end Sequencing • Rearrangements • Comparative Genomic Hybridization

  3. Cell Division and Mutation • Single nucleotide change • Copy number • Structural

  4. Rearrangements in Cancer 1) Change gene structure, create novel fusion genes (e.g., Gleevec targets the ABL-BCR fusion). 2) Alter gene regulation (e.g., Burkitt’s lymphoma). Image credit: Gregory Schuler, NCBI, NIH, Bethesda, MD

  5. Cancer Genomes • Fusion gene in >50% of prostate cancer patients (Tomlins et al. Science 2005)

  6. Shotgun Sequencing • Genomic segment cut many times at random (shotgun). • Get one or two reads (~500 bp each) from each segment.

  7. Sequencing of Cancer Genomes What to sequence from each tumor? • Whole genome: all alterations • Specific genes: point mutations • Hybrid approach: structural rearrangements etc.

  8. End Sequence Profiling (ESP), C. Collins and S. Volik (2003) • Pieces of cancer genome: clones (100-250kb). • Sequence ends of clones (500bp). • Map end sequences to human genome. • Each clone corresponds to a pair of end sequences (ES pair) (x,y). • Retain clones that correspond to a unique ES pair. [Figure: a clone from cancer DNA with its two ends mapped to positions x and y in human DNA]

  9. End Sequence Profiling (ESP), C. Collins and S. Volik (2003) • Pieces of cancer genome: clones (100-250kb). • Sequence ends of clones (500bp). • Map end sequences to human genome. • Valid ES pairs: Lmin ≤ y – x ≤ Lmax, where Lmin (Lmax) is the minimum (maximum) clone size, and the ends have convergent orientation. [Figure: a clone of length L from cancer DNA with ends mapped to x and y in human DNA]

  10. End Sequence Profiling (ESP), C. Collins and S. Volik (2003) • Pieces of cancer genome: clones (100-250kb). • Sequence ends of clones (500bp). • Map end sequences to human genome. • Invalid ES pairs: putative rearrangement in cancer. • ES directions point toward the breakpoints (a,b): Lmin ≤ |x-a| + |y-b| ≤ Lmax. [Figure: a clone of length L whose ends map to x and y, near breakpoints a and b in human DNA]
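
The valid/invalid test on slides 9 and 10 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `ESPair` type, the `'+'`/`'-'` strand encoding of "convergent orientation", and the example coordinates are all assumptions, and both ends are assumed to map to the same chromosome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ESPair:
    """Hypothetical record for one mapped end-sequence pair."""
    x: int          # mapped coordinate of the first end
    y: int          # mapped coordinate of the second end
    x_strand: str   # '+' or '-' (assumed encoding of read direction)
    y_strand: str


def is_valid(pair, l_min, l_max):
    """Valid pair: Lmin <= y - x <= Lmax and convergent orientation.

    Convergent is modelled here as left end on '+', right end on '-'.
    """
    convergent = pair.x_strand == '+' and pair.y_strand == '-'
    return convergent and l_min <= pair.y - pair.x <= l_max


def consistent_with_breakpoint(pair, a, b, l_min, l_max):
    """Invalid pair explained by breakpoint (a, b): Lmin <= |x-a| + |y-b| <= Lmax."""
    return l_min <= abs(pair.x - a) + abs(pair.y - b) <= l_max


L_MIN, L_MAX = 100_000, 250_000  # clone size range from the slides (100-250kb)

normal = ESPair(1_000, 151_000, '+', '-')
suspect = ESPair(1_000_000, 5_000_000, '+', '+')

print(is_valid(normal, L_MIN, L_MAX))
print(is_valid(suspect, L_MIN, L_MAX))
print(consistent_with_breakpoint(suspect, 1_080_000, 5_070_000, L_MIN, L_MAX))
```

The second pair fails both the length and the orientation test, but a breakpoint at (1,080,000, 5,070,000) would account for a clone of 150kb, which is within range.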

  11. ESP of Normal Cell • All ES pairs valid: Lmin ≤ y – x ≤ Lmax. • 2D representation: each point (x,y) is an ES pair. [Plot axes: genome coordinate vs. genome coordinate]

  12. ESP of Tumor Cell • Valid ES pairs satisfy length/direction constraints: Lmin ≤ y – x ≤ Lmax. • Invalid ES pairs indicate rearrangements or experimental errors.

  13. Clusters and Coverage • Pieces of tumor genome: clones (100-250kb). • Sequence ends of clones (500bp). • Map end sequences to human genome. [Figure labels: rearrangement, cluster of invalid pairs; chimeric clone, isolated invalid pair; cancer DNA mapped to human DNA at x and y]

  14. Clusters • Clone size: Lmin ≤ (a – x1) + (b – y1) ≤ Lmax. [Figure: ES pairs (x1,y1) and (x2,y2) plotted in genome coordinates; each pair restricts the breakpoint (a,b) to a band between Lmin and Lmax]
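
One way to read the cluster constraint is that several invalid pairs can be explained by the same breakpoint (a,b) only if their constraint regions intersect. The sketch below checks that intersection under a simplifying assumption that is not stated on the slide: every pair in the candidate cluster has x ≤ a and y ≤ b, so the clone size is (a – x) + (b – y). Other orientations would need the signs adjusted. The function name `share_breakpoint` is hypothetical.

```python
def share_breakpoint(pairs, l_min, l_max):
    """Return True if some (a, b) satisfies, for every (x, y) in pairs,
    a >= x, b >= y and l_min <= (a - x) + (b - y) <= l_max.
    """
    max_x = max(x for x, _ in pairs)
    max_y = max(y for _, y in pairs)
    sums = [x + y for x, y in pairs]
    # Each pair restricts s = a + b to [l_min + x + y, l_max + x + y];
    # a >= max_x and b >= max_y additionally force s >= max_x + max_y.
    lowest_s = max(l_min + max(sums), max_x + max_y)
    highest_s = l_max + min(sums)
    return lowest_s <= highest_s


# Toy coordinates (arbitrary units), with Lmin = 50 and Lmax = 100.
print(share_breakpoint([(100, 500), (120, 530)], 50, 100))
print(share_breakpoint([(100, 500), (200, 600)], 50, 100))
```

In the first call the breakpoint (150, 550) gives clone sizes of 100 and 50, so the two pairs can form a cluster; in the second call no single breakpoint fits both pairs.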

  15. Fusion Genes [Figure: Gene 1 and Gene 2 on the human genome with ES pair (x,y) and breakpoints (a,b); the two genes are joined in the tumor genome]

  16. Fusion Genes • Intersection → probability of fusion gene. • Respect direction of transcription. Bashir, et al. (2008) PLoS Comp Biol. [Figure: breakpoint region (a,b) bounded by Lmin and Lmax for ES pairs (x1,y1) and (x2,y2), overlaid on Gene 1 and Gene 2]

  17. Results: Fusion Gene in Breast Cancer, BCAS3-BCAS4 • Probability of fusion = 1. • Note: more precise sizing information is available for some clones. Bashir, et al. (2008) In Press.

  18. ESP Data • Coverage of human genome: ≈ 0.34 for MCF7, BT474. [Table groups: breast cancer cell lines; tumors] Raphael, et al. (2008)

  19. Candidate Fusion Genes • PTPRG (chromosome 3) and ASTN2 (chromosome 9). • Confirmed by clone sequencing (97kb sequenced clone). [Figure: Gene 1 and Gene 2 with ES pair (x,y) and breakpoints (a,b)]

  20. Breakpoint Detection • Detect a rearrangement breakpoint when a clone includes the breakpoint. [Figure: breakpoint ζ in the cancer genome; clone ends xC and yC mapped to the normal genome]

  21. Lander-Waterman Statistics • Given: N clones of length L from a genome of size G. • $P(\zeta \text{ covered by clone}) = 1 - (1 - L/G)^N \approx 1 - e^{-c}$, where $c = NL/G$ is the coverage. • $P(\text{breakpoint } \zeta \text{ detected}) \approx 1 - e^{-c}$
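
A short numerical check of the formula, as a sketch rather than anything from the lecture: the clone count, clone length, and genome size below are hypothetical values chosen only so that c = NL/G comes out to 0.34, the coverage quoted on slide 18.

```python
import math


def p_covered_exact(n_clones, clone_len, genome_size):
    """1 - (1 - L/G)^N: probability that a fixed point is inside some clone."""
    return 1.0 - (1.0 - clone_len / genome_size) ** n_clones


def p_covered_approx(n_clones, clone_len, genome_size):
    """1 - e^{-c} with coverage c = N * L / G."""
    coverage = n_clones * clone_len / genome_size
    return 1.0 - math.exp(-coverage)


# Hypothetical numbers: 6800 clones of 150 kb on a 3 Gb genome, so c = 0.34.
N, L, G = 6800, 150_000, 3_000_000_000

print(f"coverage c  = {N * L / G:.2f}")
print(f"exact       = {p_covered_exact(N, L, G):.4f}")
print(f"approximate = {p_covered_approx(N, L, G):.4f}")
```

At this coverage both expressions give a detection probability a little under 0.29, so most breakpoints would be missed by any single library.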

  22. Cancer Genome Organization • What is the detailed organization of cancer genomes? • What sequence of rearrangements produces these architectures?

  23. ESP Genome Reconstruction Problem • Human genome (known). • Unknown sequence of rearrangements. • Tumor genome (unknown). • Map ES pairs to human genome: locations of ES pairs (x1,y1), …, (x5,y5) in the human genome are known. • Reconstruct tumor genome. [Figure: blocks A, B, C, D, E with the mapped ES pairs]

  24. ESP Genome Reconstruction Problem • Human genome (known). • Unknown sequence of rearrangements. • Tumor genome (unknown): A -C E -D B. • Map ES pairs to human genome: locations of ES pairs (x1,y1), …, (x5,y5) in the human genome are known. • Reconstruct tumor genome. [Figure: blocks A, B, C, D, E with the mapped ES pairs]

  25. ESP Plot • 2D representation of ESP data. • Each point is an ES pair. • Can we reconstruct the tumor genome from the positions of the ES pairs? [Plot: human vs. human, both axes divided into blocks A B C D E; points (x1,y1), (x2,y2), (x3,y3), (x4,y4)]

  26. ESP Plot → Tumor Genome • Reconstructed tumor genome: A -C E -D B. [Plot: human vs. human, both axes divided into blocks A B C D E]

  27. Real data are noisy and incomplete! • Valid ES pairs satisfy length/direction constraints: Lmin ≤ y – x ≤ Lmax. • Invalid ES pairs indicate rearrangements or experimental errors.

  28. Computational Approach • Use known genome rearrangement mechanisms. • Find the simplest explanation for ESP data, given these mechanisms. • Motivation: genome rearrangement studies in evolution/phylogeny. [Figure, human vs. tumor with breakpoints s and t: inversion A B C → A -B C; translocation A B C D → -C -B D A]

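  29. ESP Sorting Problem • $G = [0, M]$, unichromosomal genome. • Inversion (reversal) $\rho_{s,t}$: $\rho_{s,t}(x) = x$ if $x < s$ or $x > t$, and $\rho_{s,t}(x) = t - (x - s)$ otherwise; $G' = \rho G$. • Given: ES pairs $(x_1, y_1), \ldots, (x_n, y_n)$. • Find: minimum number of reversals $\rho_{s_1,t_1}, \ldots, \rho_{s_n,t_n}$ such that if $\rho = \rho_{s_1,t_1} \cdots \rho_{s_n,t_n}$, then $(\rho x_1, \rho y_1), \ldots, (\rho x_n, \rho y_n)$ are valid ES pairs. [Figure: genome G with blocks A B C and ES pairs (x1,y1), (x2,y2); the reversal between s and t gives G' with blocks A -B C]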
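
The reversal map can be written directly from its piecewise definition. The sketch below is illustrative only: it checks the length constraint and ignores orientation, it reorders the mapped ends so that x ≤ y (an assumption, not something the slide specifies), and the coordinates are made up.

```python
def reversal(s, t):
    """Return the map x -> x outside [s, t] and x -> t - (x - s) inside it."""
    def rho(x):
        if x < s or x > t:
            return x
        return t - (x - s)
    return rho


def apply_to_pair(rho, pair):
    """Map both ends of an ES pair, then order them so that x <= y."""
    x, y = rho(pair[0]), rho(pair[1])
    return (min(x, y), max(x, y))


def length_valid(pair, l_min, l_max):
    """Length part of the validity test: Lmin <= y - x <= Lmax."""
    return l_min <= pair[1] - pair[0] <= l_max


L_MIN, L_MAX = 100, 250          # toy units (think kb)
rho = reversal(1000, 2000)       # hypothetical inversion of [1000, 2000]

pair = (950, 1900)               # spans 950 units: too long to be one clone
print(length_valid(pair, L_MIN, L_MAX))
print(apply_to_pair(rho, pair))
print(length_valid(apply_to_pair(rho, pair), L_MIN, L_MAX))
```

After the reversal the pair maps to (950, 1100), a span of 150 units, so a single inversion is enough to explain this invalid pair.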

  30. Sparse Data Assumptions 1. Each cluster results from a single inversion or translocation. 2. Each clone contains at most one breakpoint. [Figure: ES pairs (x1,y1), (x2,y2), (x3,y3) shown on the tumor genome and mapped to the human genome]

  31. ESP Genome Reconstruction: Discrete Approximation • Remove isolated invalid pairs (x,y). [Plot axes: human vs. human]

  32. ESP Genome Reconstruction: Discrete Approximation • Remove isolated invalid pairs (x,y). • Define segments from clusters. [Plot axes: human vs. human]

  33. ESP Genome Reconstruction: Discrete Approximation • Remove isolated invalid pairs (x,y). • Define segments from clusters. • ES orientations define links between segment ends. [Plot axes: human vs. human]

  34. ESP Genome Reconstruction: Discrete Approximation • Remove isolated invalid pairs (x,y). • Define segments from clusters. • ES orientations define links between segment ends. [Plot axes: human vs. human; points (x1,y1), (x2,y2), (x3,y3); segment boundaries s and t]

  35. ESP Graph • Edges: human genome segments and ES pairs. • Paths in the graph are tumor genome architectures. • Human genome (1 2 3 4 5) → tumor genome (1 -3 -4 2 5) by a minimal sequence* of translocations and inversions. *Hannenhalli-Pevzner theory. [Plot axes: segments 1 to 5 vs. segments 1 to 5]

  36. Sorting Permutations by Reversals • $\pi = \pi_1 \pi_2 \ldots \pi_n$: signed permutation (Sankoff et al. 1990). • Reversal $\rho(i,j)$ [inversion]: $\pi_1 \ldots \pi_{i-1}\, {-\pi_j} \ldots {-\pi_i}\, \pi_{j+1} \ldots \pi_n$. • Problem: given $\pi$, find a sequence of reversals $\rho_1, \ldots, \rho_t$ such that $\pi \cdot \rho_1 \cdot \rho_2 \cdots \rho_t = (1, 2, \ldots, n)$ and $t$ is minimal. • Solution: analysis of breakpoint graph ← ESP graph. • Polynomial-time algorithms: $O(n^4)$: Hannenhalli and Pevzner, 1995. $O(n^2)$: Kaplan, Shamir, Tarjan, 1997. $O(n)$ [distance $t$]: Bader, Moret, and Yan, 2001. $O(n^3)$: Bergeron, 2001.

  37. Sorting Permutations • $\pi$ = 1 -3 -4 2 5 → 1 -3 -2 4 5 → 1 2 3 4 5
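
The two-step example (1 -3 -4 2 5 → 1 -3 -2 4 5 → 1 2 3 4 5) can be replayed in code. The sketch below applies a signed reversal to positions i..j (1-based) and uses a brute-force breadth-first search to confirm the minimum number of reversals. The search is exponential and is only a check for tiny permutations; it is not one of the polynomial-time algorithms cited on slide 36. Function names are hypothetical.

```python
from collections import deque


def apply_reversal(perm, i, j):
    """Reverse the order and flip the signs of positions i..j (1-based, inclusive)."""
    segment = tuple(-v for v in reversed(perm[i - 1:j]))
    return perm[:i - 1] + segment + perm[j:]


def reversal_distance_bfs(perm):
    """Minimum number of reversals to reach (1, 2, ..., n), by exhaustive BFS."""
    start = tuple(perm)
    target = tuple(range(1, len(start) + 1))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            return dist[cur]
        for i in range(1, len(cur) + 1):
            for j in range(i, len(cur) + 1):
                nxt = apply_reversal(cur, i, j)
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
    return None


pi = (1, -3, -4, 2, 5)
step1 = apply_reversal(pi, 3, 4)      # flips the block (-4, 2)
step2 = apply_reversal(step1, 2, 3)   # flips the block (-3, -2)
print(step1)
print(step2)
print(reversal_distance_bfs(pi))
```

The search visits at most 5! · 2^5 = 3840 signed permutations here, which is why it is only practical for very small n.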

  38. Breakpoint Graph • Black edges: adjacent elements of $\pi$ = start 1 -3 -4 2 5 end. • Gray edges: adjacent elements of the identity $i$ = start 1 2 3 4 5 end. • Key parameter: black-gray cycles.

  39. Breakpoint Graph • Black edges: adjacent elements of $\pi$ = start 1 -3 -4 2 5 end. • Gray edges: adjacent elements of the identity $i$ = start 1 2 3 4 5 end. • Key parameter: black-gray cycles. • ESP graph → tumor permutation and breakpoint graph. • Theorem: the minimum number of reversals $d(\pi)$ needed to transform $\pi$ to the identity permutation $i$ satisfies $d(\pi) \ge n + 1 - c(\pi)$, where $c(\pi)$ = number of gray-black cycles. [Figure: breakpoint graphs for 1 -3 -4 2 5 and 1 -3 -2 4 5]

  40. Breakpoint Graph • Black edges: adjacent elements of $\pi$ = start 1 -3 -4 2 5 end. • Gray edges: adjacent elements of the identity $i$ = start 1 2 3 4 5 end. • ESP graph → tumor permutation and breakpoint graph. • Theorem: the minimum number of reversals needed to transform $\pi$ to the identity permutation $i$ is $d(\pi) = n + 1 - c(\pi) + h(\pi) + f(\pi)$, where $c(\pi)$ = number of gray-black cycles, $h(\pi)$ = number of hurdles, and $f(\pi)$ = 1 if $\pi$ is a fortress and 0 otherwise. [Figure: breakpoint graphs for 1 -3 -4 2 5 and 1 -3 -2 4 5]
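
The cycle count, and with it the lower bound n + 1 − c, is easy to compute. The sketch below uses the standard encoding of a signed permutation as an unsigned one (+x becomes 2x−1, 2x; −x becomes 2x, 2x−1; framed by 0 and 2n+1), which is an assumption about how the graph on the slide is built. It does not compute hurdles or fortresses, so it gives only the bound, not the exact distance. Function names are hypothetical.

```python
def count_cycles(perm):
    """Number of alternating black-gray cycles in the breakpoint graph of perm."""
    n = len(perm)
    seq = [0]
    for v in perm:
        a = abs(v)
        seq += [2 * a - 1, 2 * a] if v > 0 else [2 * a, 2 * a - 1]
    seq.append(2 * n + 1)

    # Black edges join consecutive vertices at positions (0,1), (2,3), ...
    black = {}
    for k in range(0, len(seq), 2):
        u, w = seq[k], seq[k + 1]
        black[u] = w
        black[w] = u

    def gray(v):
        # Gray edges join values 2k and 2k+1 (adjacent in the identity).
        return v ^ 1

    seen = set()
    cycles = 0
    for start in seq:
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:
            seen.add(v)
            w = black[v]      # follow the black edge
            seen.add(w)
            v = gray(w)       # then the gray edge
    return cycles


def reversal_lower_bound(perm):
    """Lower bound n + 1 - c on the number of reversals."""
    return len(perm) + 1 - count_cycles(perm)


for p in [(1, -3, -4, 2, 5), (1, -3, -2, 4, 5), (1, 2, 3, 4, 5)]:
    print(p, count_cycles(p), reversal_lower_bound(p))
```

For 1 -3 -4 2 5 the graph has 4 cycles, giving a bound of 5 + 1 − 4 = 2, which matches the two reversals used in the sorting example on slide 37.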

  41. Multichromosomal Sorting • Concatenate chromosomes. • Translocations are modeled by reversals in the concatenate. • Minimal sequence in polynomial time (Hannenhalli & Pevzner 1996, Tesler 2003, Ozery-Flato and Shamir, 2003). [Figure: translocation (A1 A2), (B1 B2) → (A1 B2), (B1 A2); equivalently, concatenation A1 A2 -B2 -B1 → reversal → A1 B2 -A2 -B1]
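
The figure's claim, that a translocation is a reversal in a suitable concatenation, can be checked on the four-block example. This is a sketch under stated assumptions: blocks are encoded as signed integers (A1=1, A2=2, B1=3, B2=4), the second chromosome is flipped before concatenation, and the function names and the cut-point parameters are hypothetical.

```python
def flip(chrom):
    """Read a chromosome from the other end: reverse order, negate signs."""
    return tuple(-v for v in reversed(chrom))


def translocate_via_reversal(chrom_a, chrom_b, cut_a, cut_b):
    """Exchange the suffixes of two chromosomes with one reversal.

    cut_a and cut_b are the numbers of blocks kept as the prefix of each
    chromosome. Returns (prefix_a + suffix_b, prefix_b + suffix_a).
    """
    concat = chrom_a + flip(chrom_b)
    i = cut_a                                     # start of suffix of A
    j = len(chrom_a) + (len(chrom_b) - cut_b)     # end of flipped suffix of B
    reversed_concat = concat[:i] + flip(concat[i:j]) + concat[j:]
    split = cut_a + (len(chrom_b) - cut_b)        # length of the first product
    return reversed_concat[:split], flip(reversed_concat[split:])


A = (1, 2)   # A1 A2
B = (3, 4)   # B1 B2

print(A + flip(B))                         # concatenation A1 A2 -B2 -B1
print(translocate_via_reversal(A, B, 1, 1))
```

The result is (1, 4) and (3, 2), that is, A1 B2 and B1 A2, the same pair of chromosomes the translocation in the figure produces.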

  42. MCF7 Breast Cancer Cell Line

  43. MCF7 Breast Cancer Cell Line • Sequence from human chromosomes to MCF7 chromosomes: 5 inversions, 15 translocations. Raphael, et al. (2003) Bioinformatics

  44. What about duplications? • 11240 ES pairs • 10453 valid (black) • 737 invalid • 489 isolated (red) • 248 form 70 clusters (blue) 33/70 clusters Total length: 31Mb