
On the weight of indels in genomic distances

Marília D. V. Braga, Raphael Machado, Leonardo C. Ribeiro and Jens Stoye (Inmetro, Brazil / Bielefeld University, Germany). RECOMB-CG 2011.


Presentation Transcript


  1. On the weight of indels in genomic distances. Marília D. V. Braga, Raphael Machado, Leonardo C. Ribeiro and Jens Stoye (Inmetro, Brazil / Bielefeld University, Germany). RECOMB-CG 2011

  2. Guidance
     • Background: hybrid models for genome rearrangements; triangle inequality disruption
     • Results: general framework to establish the triangle inequality; tight bounds for the DCJ-indel (and DCJ-substitution) distance

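  3. Definitions
     • Terms illustrated: genome, chromosome, marker, telomere
     • Each marker has a tail and a head, written with the suffixes t and h (for example at and ah for marker a)
     [Figure: two example genomes A and B drawn with their markers and marker extremities.]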

  4. Genomic distance
     Some models:
     • Classical genomic distances: Hannenhalli & Pevzner 1995 (inversions + translocations), Yancopoulos et al. 2005 (DCJ), Bergeron et al. 2006 (DCJ)
     • Distances with indels: El Mabrouk 2001 (inversion-indel distance), Yancopoulos et al. 2008 ("ghost-DCJ" distance), Braga et al. 2010 (DCJ-indel distance)
     • Organizational operations: inversion, translocation
     • Indel operations: insertion (and deletion)
     • Indels in these models are applied to blocks of markers
     [Figures: examples of an inversion, a translocation and an insertion.]

  5. Triangle Inequality
     When indel operations of multiple markers are allowed, the triangle inequality dist(A, B) ≤ dist(A, C) + dist(C, B) may be disrupted [Yancopoulos et al. 2008].
     • A = a b c d e, B = a c d b e: dist(A, B) = 3 inversions
     • C = a e: dist(A, C) = 1 indel and dist(C, B) = 1 indel
     • So dist(A, B) = 3 > 2 = dist(A, C) + dist(C, B)
     Is there a distance definition that does not disrupt the triangle inequality?
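The counterexample on this slide can be checked mechanically. The sketch below is only an illustration: the function name `violates_triangle_inequality` is hypothetical, and the three distances are the values read off the slide (3 inversions, 1 indel, 1 indel), not computed from the genomes.

```python
def violates_triangle_inequality(d_ab, d_ac, d_cb):
    """Return True if the detour through C is strictly cheaper than the direct distance."""
    return d_ab > d_ac + d_cb


# Values taken from the slide: A = a b c d e, B = a c d b e, C = a e.
d_ab = 3  # three inversions
d_ac = 1  # one indel (delete the block b c d)
d_cb = 1  # one indel (insert the block c d b)

print(violates_triangle_inequality(d_ab, d_ac, d_cb))
```

Any candidate distance definition can be screened the same way by feeding in its three pairwise values.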

  6. Double cut and join with indels
     The adjacency graph AG(A, B) is used for sorting A into B.
     • Only common markers: minimum number of DCJs
       d_DCJ(A, B) = n_AB - (# cycles + # AB-paths / 2) [Bergeron et al. 2006]
     • Including unique markers: DCJ + indel operations
       d_DCJ-id(A, B) ≤ d_DCJ(A, B) + Σ λ(P) [WABI 2010]
       where the sum runs over the components P of AG(A, B) and is the term related to the markers added or removed, Λ(P) = # of runs in component P, and λ(P) = ⌈(Λ(P) + 1) / 2⌉
     [Figure: adjacency graph of two example genomes A and B, with a component whose A-runs and B-runs (labels L1 to L4) give λ(P) = 2.]
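The formula d_DCJ(A, B) = n_AB - (# cycles + # AB-paths / 2) can be evaluated directly from the adjacency graph. The following is a minimal sketch under stated assumptions, not the authors' implementation: a genome is assumed to be a list of `(markers, is_circular)` pairs, markers are signed integers (a negative sign means reversed orientation), both genomes contain exactly the same markers, and all function names are hypothetical.

```python
def extremities(marker):
    """Return the (left, right) extremities of a signed marker read left to right."""
    tail, head = (abs(marker), "t"), (abs(marker), "h")
    return (tail, head) if marker > 0 else (head, tail)


def vertices(genome):
    """List the adjacencies (2 extremities) and telomeres (1 extremity) of a genome."""
    out = []
    for markers, circular in genome:
        if not markers:
            continue
        ends = [extremities(m) for m in markers]
        for (_, right), (left, _) in zip(ends, ends[1:]):
            out.append((right, left))
        if circular:
            out.append((ends[-1][1], ends[0][0]))
        else:
            out.append((ends[0][0],))
            out.append((ends[-1][1],))
    return out


def dcj_distance(genome_a, genome_b):
    """n_AB - (#cycles + #AB-paths / 2) on the adjacency graph of two genomes."""
    va, vb = vertices(genome_a), vertices(genome_b)
    # Map every extremity to the vertex that contains it, in each genome.
    where_a = {x: i for i, v in enumerate(va) for x in v}
    where_b = {x: j for j, v in enumerate(vb) for x in v}
    if where_a.keys() != where_b.keys():
        raise ValueError("this sketch handles common markers only")
    n_ab = len(where_a) // 2
    seen, cycles, ab_paths = set(), 0, 0
    nodes = [("A", i) for i in range(len(va))] + [("B", j) for j in range(len(vb))]
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, telomere_sides = [start], []
        while stack:  # explore one connected component
            side, idx = stack.pop()
            vertex = va[idx] if side == "A" else vb[idx]
            if len(vertex) == 1:
                telomere_sides.append(side)
            other, where = ("B", where_b) if side == "A" else ("A", where_a)
            for x in vertex:  # one edge per shared extremity
                nxt = (other, where[x])
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if not telomere_sides:
            cycles += 1
        elif sorted(telomere_sides) == ["A", "B"]:
            ab_paths += 1
    return n_ab - (cycles + ab_paths // 2)


# A single inversion of marker 2 in a linear chromosome costs one DCJ.
print(dcj_distance([([1, 2, 3], False)], [([1, -2, 3], False)]))
```

The integer division is safe because every linear chromosome contributes two telomeres, so the number of AB-paths is always even. Unique markers and the λ(P) term are deliberately left out of this sketch.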

  7. A posteriori correction
     Fixing the triangle inequality, prior work [JCB 2011]: applying an a posteriori correction, the triangle inequality holds for the function
       m_id(A, B) = d_DCJ-id(A, B) + k u(A, B)
     for any constant k ≥ 3/2, where u(A, B) = # unique markers in A and B.
     To improve the lower bound of k we study the worst case for the inequality disruption.
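As a small illustration of the correction, the sketch below evaluates d_DCJ-id(A, B) + k u(A, B) for given inputs. It assumes that genomes are passed as collections of distinct marker names, that u(A, B) is the size of the symmetric difference of the two marker sets, and that the DCJ-indel distance is supplied by the caller rather than computed here. The function names are hypothetical.

```python
from fractions import Fraction


def unique_marker_count(markers_a, markers_b):
    """u(A, B): number of markers that occur in exactly one of the two genomes."""
    return len(set(markers_a) ^ set(markers_b))


def corrected_distance(d_dcj_id, markers_a, markers_b, k=Fraction(3, 2)):
    """d_DCJ-id(A, B) + k * u(A, B), with d_dcj_id given by the caller."""
    return d_dcj_id + k * unique_marker_count(markers_a, markers_b)


# A = a b c d e and C = a e differ by the three markers b, c, d;
# a single deletion (distance 1) is assumed for the uncorrected value.
print(corrected_distance(1, "abcde", "ae"))
```

Using `Fraction` keeps k = 3/2 exact, which is convenient when comparing corrected values against each other.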

  8. Evaluation of k
     • General case: genomes A and B, with a third genome C between them
     • Worst case (suppose unichromosomal genomes): A and B at maximum distance, d_DCJ-id(A, B) = diameter, while C = { } is at minimum distance from both, d_DCJ-id(A, C) = 1 and d_DCJ-id(B, C) = 1

  9. Finding the diameter/lowest k
     1. The DCJ-indel distance is at most the number of vertices of AG(A, B). For each component P of the graph, d_DCJ-id(P) = d_DCJ(P) + λ(P) ≤ |P|, so d_DCJ-id(A, B) ≤ Σ d_DCJ-id(P) ≤ Σ |P| = |AG(A, B)|.
     2. The number of vertices in the adjacency graph AG(A, B) is 2 n_AB + 2: each genome contributes (number of common markers + 1) vertices. So we have d_DCJ-id(A, B) ≤ |AG(A, B)| = 2 n_AB + 2.
     3. The corrected distance m_id satisfies the triangle inequality if k ≥ 1. With n_A and n_B the numbers of markers unique to A and to B, and C empty:
        d_DCJ-id(A, C) + k u(A, C) + d_DCJ-id(B, C) + k u(B, C) ≥ d_DCJ-id(A, B) + k u(A, B)
        1 + 1 + k (2 n_AB + n_A + n_B) ≥ 2 n_AB + 2 + k (n_A + n_B)
        2 k n_AB ≥ 2 n_AB, that is, k ≥ 1
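The last step of the argument is plain arithmetic and can be checked numerically. The sketch below only evaluates the worst-case inequality 1 + 1 + k (2 n_AB + n_A + n_B) ≥ 2 n_AB + 2 + k (n_A + n_B) over a small grid of sizes; the function name and the chosen ranges are assumptions made for illustration.

```python
from fractions import Fraction


def worst_case_holds(k, n_ab, n_a, n_b):
    """Evaluate the worst-case triangle inequality with C empty."""
    via_empty_genome = 1 + 1 + k * (2 * n_ab + n_a + n_b)
    direct_upper_bound = 2 * n_ab + 2 + k * (n_a + n_b)
    return via_empty_genome >= direct_upper_bound


# Small grid of sizes: n_AB from 1 to 5, n_A and n_B from 0 to 3.
sizes = [(n_ab, n_a, n_b)
         for n_ab in range(1, 6)
         for n_a in range(4)
         for n_b in range(4)]

# k = 1 satisfies the inequality on the whole grid (with equality).
print(all(worst_case_holds(1, *s) for s in sizes))
# A slightly smaller k already fails for some sizes.
print(any(not worst_case_holds(Fraction(9, 10), *s) for s in sizes))
```

The terms in n_A and n_B cancel on both sides, which is why only n_AB decides whether a given k is large enough.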

  10. Framework to assign weights to Indels
      Let w(ρ) be the weight of an operation ρ.
      • For any organizational operation: w(ρ) = 1
      • For indels: w(ρ) = p + k · m(ρ), where m(ρ) is the number of markers inserted or deleted by ρ
      [Figure: an insertion with m(ρ) = 2.]

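  11. Distance on Hybrid Model
      Assuming p = 1, the weight of a sorting sequence ρ1, ..., ρn splits into two parts:
      • the constant parts add up to the number of operations, which is the DCJ-indel distance d^H_{p,0}(A, B)
      • the marker-dependent parts add up to k · (m(ρ1) + m(ρ2) + ... + m(ρn)) = k · u(A, B)
      Hence d^H_{p,k}(A, B) = d^H_{p,0}(A, B) + k u(A, B)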
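The decomposition of the hybrid distance can be illustrated by summing operation weights over a sorting sequence. This is a sketch under assumptions: operations are modelled as hypothetical `(kind, markers)` pairs, where `kind` is either `"indel"` or anything else for an organizational operation, and the example sequence is invented for illustration rather than taken from the talk.

```python
def operation_weight(kind, markers=0, p=1, k=1):
    """w(rho): p + k * m(rho) for an indel, 1 for any organizational operation."""
    if kind == "indel":
        return p + k * markers
    return 1


def sequence_weight(operations, p=1, k=1):
    """Total weight of a sequence of (kind, markers) operations."""
    return sum(operation_weight(kind, m, p, k) for kind, m in operations)


# Invented sequence: two DCJs, one deletion of 3 markers, one insertion of 2 markers.
ops = [("dcj", 0), ("indel", 3), ("dcj", 0), ("indel", 2)]
k = 2
total = sequence_weight(ops, p=1, k=k)
markers_touched = sum(m for kind, m in ops if kind == "indel")

# With p = 1 the total is (number of operations) + k * (markers inserted or deleted).
print(total, len(ops) + k * markers_touched)
```

With p = 1 every operation contributes exactly 1 to the first term, so the two printed values agree for any such sequence.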

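  12. More plausible distances?
      Example: A = a b c d e, B = a c d b e, C = a e. With unweighted indels, dist(A, B) = 3 inversions while dist(A, C) = 1 indel and dist(C, B) = 1 indel.
      • "ghost-DCJ" model: dist(A, B) = 3, dist(A, C) = 2, dist(C, B) = 2
      • DCJ-indel model (k = 1): dist(A, B) = 3, dist(A, C) = 4, dist(C, B) = 4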

  13. Conclusion
      • DCJ-indel distance is a metric for k ≥ 1
      • A posteriori distance correction is equivalent to the hybrid model
      • Similar results for DCJ-substitution distance (see talk by Marília Braga, Sunday)
      • Open: p ≠ 1; other weight functions; inversion-indel distance

  14. Thank You.
