Variable Penalty Dynamic Time Warping For Aligning Chromatography Data

Presentation Transcript


  1. Variable Penalty Dynamic Time Warping For Aligning Chromatography Data. David Clifford, Research Scientist. June 2009

  2. Talk Outline • Gas Chromatography Mass Spectrometry • Examples and Properties • Dynamic time warping – origins in speech recognition • Uses in the 21st century: aligning GC-MS data • Central idea of the talk – variable penalty DTW, joint work with Glenn Stone • Results of alignment and how to do it. CSIRO Issues in aligning multiple - MS spectra

  3. Gas Chromatography • Separates a gas into its constituent parts • These elute from the machine over a period of 40 minutes • Measures quantity several times a second • Does not identify compounds • Gold standard in analytical chemistry • Slow process, expensive technology

  4. Uses of Gas Chromatography • Wine Chemistry • Meat quality • Metabolomic studies • Data format is similar to Liquid Chromatography-MS etc.

  5. Goal of this talk • How can we align two signals? • How can we align many signals? • Dynamic time warping – yes, but it overdoes the warping • Variable penalty DTW – balances warping with alignment needs • VPdtw package now available on CRAN

  6. Before and After Alignment

  7. Calling for a taxi… • Matches what you say with a database of placenames • Dynamic time warping was invented in the late 1960s and early 1970s to do this kind of matching • DTW can expand or contract your words to match placenames • DTW is a natural choice for matching speech • Speed of speech differs between individuals • Ums and ahs need to be cut out, etc. • DTW is a very fast algorithm and achieves the global optimum
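The expand-or-contract matching described on this slide can be sketched with the textbook dynamic-programming recursion. This is a minimal illustration in Python, not code from the talk or from the VPdtw package. The function name `dtw`, the absolute-difference local cost, and the symmetric step pattern (diagonal, vertical, horizontal) are all assumptions made for the example.

```python
import math

def dtw(query, reference):
    """Textbook DTW sketch: returns (total cost, warping path as (i, j) pairs)."""
    n, m = len(query), len(reference)
    # cost[i][j]: cheapest cumulative cost of aligning query[:i+1] with reference[:j+1]
    cost = [[math.inf] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(query[i] - reference[j])  # assumed local distance
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            candidates = []
            if i > 0 and j > 0:
                candidates.append((cost[i - 1][j - 1], (i - 1, j - 1)))  # diagonal move
            if i > 0:
                candidates.append((cost[i - 1][j], (i - 1, j)))  # repeat a reference point
            if j > 0:
                candidates.append((cost[i][j - 1], (i, j - 1)))  # repeat a query point
            best, prev = min(candidates)
            cost[i][j] = d + best
            back[i][j] = prev
    # Follow the back-pointers from the last cell to recover the optimal path.
    path = [(n - 1, m - 1)]
    while back[path[-1][0]][path[-1][1]] is not None:
        i, j = path[-1]
        path.append(back[i][j])
    path.reverse()
    return cost[n - 1][m - 1], path
```

Because every cell takes the minimum over all of its allowed predecessors, the final cell holds the globally optimal cost for this step pattern, at the price of filling an n-by-m table.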

  8. No alignment

  9. Alignment by Shift

  10. Linear Transformation (Shift and Stretch)

  11. Parametric Time Warping

  12. Asymmetric Dynamic Time Warping

  13. Sakoe-Chiba DTW (bound on shift) • Memory-efficient variation of DTW – faster method
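A bound on the shift can be sketched by visiting only the cells whose indices differ by at most a fixed amount. The Python below is an illustrative assumption, not the talk's implementation. `banded_dtw_cost` and `max_shift` are hypothetical names, and the local cost is again the absolute difference.

```python
import math

def banded_dtw_cost(query, reference, max_shift):
    """DTW cost restricted to cells with |i - j| <= max_shift (a Sakoe-Chiba style band)."""
    n, m = len(query), len(reference)
    prev = [math.inf] * m  # only the previous row is kept, not the full table
    for i in range(n):
        curr = [math.inf] * m
        lo = max(0, i - max_shift)
        hi = min(m - 1, i + max_shift)
        for j in range(lo, hi + 1):  # cells outside the band stay at infinity
            d = abs(query[i] - reference[j])
            if i == 0 and j == 0:
                curr[j] = d
                continue
            best = math.inf
            if i > 0:
                best = min(best, prev[j])          # vertical move
                if j > 0:
                    best = min(best, prev[j - 1])  # diagonal move
            if j > 0:
                best = min(best, curr[j - 1])      # horizontal move
            curr[j] = d + best
        prev = curr
    return prev[m - 1]
```

With `max_shift = 0` only the diagonal is allowed, so the result is the plain point-by-point difference. A band at least as wide as the signals reproduces unconstrained DTW. If the lengths differ by more than `max_shift`, the last cell is unreachable and the function returns infinity.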

  14. Dynamic Time Warping • Guaranteed global optimum, but lots of non-diagonal moves

  15. Paths found with two different penalties

  16. Why do we need to care about this? • Analysis is based on peak area, and overwarping will affect peak shape and area • Overwarping introduces artificial features into data • Overwarping occurs due to too many non-diagonal moves • Solution #1: penalise non-diagonal moves • Solution #2: variable penalty dependent on size of peaks

  17. Variable penalty DTW • Minimise over paths w • Choose penalty vector using a dilation of the signals • Large penalty with large peaks • Minimise this function using dynamic programming • Easy to implement • How does it compare to DTW, constant penalty DTW, and parametric time warping?
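The objective function on this slide did not survive in the transcript, so the following Python is only a sketch of the general idea and not the formulation used in the VPdtw package or the paper. It assumes a symmetric step pattern in which every non-diagonal move into cell (i, j) is charged `penalty[j]`, a value indexed by reference position. `penalised_dtw` is a hypothetical name.

```python
import math

def penalised_dtw(query, reference, penalty):
    """DTW with a per-position charge on non-diagonal moves.

    penalty holds one non-negative value per reference index; a constant
    list gives constant-penalty DTW. Returns (cost, non-diagonal moves used).
    """
    n, m = len(query), len(reference)
    cost = [[math.inf] * m for _ in range(n)]
    moves = [[0] * m for _ in range(n)]  # non-diagonal moves on the best path to each cell
    for i in range(n):
        for j in range(m):
            d = abs(query[i] - reference[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            options = []
            if i > 0 and j > 0:
                options.append((cost[i - 1][j - 1], moves[i - 1][j - 1]))  # free diagonal move
            if i > 0:
                options.append((cost[i - 1][j] + penalty[j], moves[i - 1][j] + 1))
            if j > 0:
                options.append((cost[i][j - 1] + penalty[j], moves[i][j - 1] + 1))
            best, count = min(options)  # ties on cost prefer fewer non-diagonal moves
            cost[i][j] = d + best
            moves[i][j] = count
    return cost[n - 1][m - 1], moves[n - 1][m - 1]
```

A usage example under the same assumptions: a penalty that is large over the peak and small over the baseline lets the path warp only where little signal is at stake.

```python
query = [0, 0, 1, 2, 1, 0]
reference = [0, 1, 2, 1, 0, 0]
print(penalised_dtw(query, reference, [0] * 6))    # plain DTW
print(penalised_dtw(query, reference, [10] * 6))   # heavy constant penalty
print(penalised_dtw(query, reference, [0.5, 10, 10, 10, 0.5, 0.5]))  # variable penalty
```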

  18. Key Ingredient for VPdtw • Penalty vector – proportional to a dilation of the signal. • There is some subjectivity here to balance the need for alignment with the effect on raw signals.
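One way to read "proportional to a dilation of the signal" is a running maximum over a sliding window, scaled by a constant. The Python below sketches that reading and is not the package's `dilation` function. The window half-width `span` and the multiplier `scale` are assumed parameters, and choosing them is the subjective part the slide mentions.

```python
def dilation(signal, span):
    """Running maximum of signal over a window of span points on each side."""
    n = len(signal)
    return [max(signal[max(0, i - span):min(n, i + span + 1)]) for i in range(n)]

def penalty_from_signal(signal, span, scale):
    """Penalty vector proportional to the dilated signal (scale is an assumed tuning constant)."""
    return [scale * v for v in dilation(signal, span)]
```

The dilation widens each peak, so the penalty is already high just before and after a large peak. That discourages non-diagonal moves on the flanks of the peak as well as at its apex.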

  19. Before Alignment – can’t see detail but

  20. Check Alignment #1

  21. Check Alignment #2

  22. Check Alignment #3

  23. How far are points moved by alignment?

  24. VPdtw package – now on CRAN, GPL 2 • VPdtw, dilation, plot.VPdtw, print.VPdtw • result <- VPdtw(reference, query, penalty, maxshift = 350) • print(result) • plot(result, "Before") • plot(result, "After") • plot(result, "Shifts") • plot(result) • Many queries, one penalty • One query, many penalties • Reference can be NULL

  25. Comparisons – Time

  26. Summary • Introduced GC-MS data • This talk is really about improving data quality • Improvement via alignment • without data reduction • without unnatural features • via fast computation • VPdtw available on CRAN • Faster • Better than available alternatives

  27. References • DTW: Vintsyuk, T. K. Kibernetika 1968, 4, 81–88 • Sakoe, H., and Chiba, S. Proceedings of the International Congress on Acoustics, Budapest, Hungary, 1971; paper 20 c 13 • Parametric Time Warping: Eilers, P. H. C. Anal. Chem. 2004, 76, 404–411 • Clifford, Stone, Montoliu, Rezzi, Martin, Guy, Bruce and Kochhar. Alignment Using Variable Penalty Dynamic Time Warping. Anal. Chem. 2009, 81 (3), 1000–1007

  28. Statistical Bioinformatics - Agribusiness • David Clifford, Research Scientist, CSIRO Division of Mathematics, Informatics and Statistics • Phone: +61 2 9325 3210 • Email: David.Clifford@csiro.au • Web: www.csiro.au/science/org/CMIS.html • Thank you • Contact Us • Phone: 1300 363 400 or +61 3 9545 2176 • Email: Enquiries@csiro.au • Web: www.csiro.au

  29. VPdtw package – plot(result, "Before")

  30. VPdtw package – plot(result, "After")

  31. VPdtw package – print(result)

      Reference is NULL. Query column # 13 is chosen at random.
      Query matrix is made up of 16 samples of length 5000.
      Single Penalty vector supplied by user.
      Max allowed shift is 150.

                     Cost  Overlap  Max Obs Shift  # Diag Moves  # Expanded  # Dropped
      Query #1:   1521.10     4994             51          4996          47          2
      Query #2:   1708.30     4996             53          5000          49          0
      Query #3:   1479.60     4998             59          5000          57          0
      Query #4:   1302.30     4998             62          5000          60          0
      Query #5:   1505.40     4996             61          5000          57          0
      Query #6:   1296.80     4997             60          5000          57          0
      Query #7:   1420.80     5000             61          5000          62          0
      Query #8:   1484.20     5000             59          5000          60          0
      Query #9:   1424.30     5000             51          5000          53          0
      Query #10:  1306.30     4997             42          5000          39          0
      Query #11:  1193.30     4994             29          4990          28          5
      Query #12:   225.04     4999             13          4998          13          1
      Query #13:     0.00     5000              0          5000           0          0
      Query #14:   266.09     4944             56          4894           2         53
      Query #15:   746.93     4937             63          4880           4         60
      Query #16:   345.87     4914             86          4836           0         82
