Software Project MassAnalyst Roeland Luitwieler Marnix Kammer April 24, 2006
Overview • Introduction • System requirements • Our solution: Spectre • Progress so far • Conclusion
Introduction • Project initiator • Scientific background • The need for software tools
Project initiator • Dr. ir. Bas van Breukelen • Department of Biomolecular Mass Spectrometry • Utrecht University • WENT building • Expert in: • Bioinformatics • Proteomics
Scientific background: Proteomics • Our body consists of cells • Proteins provide cell structure and function • Proteomics: the large-scale study of proteins • Main research areas: • Identification of proteins • Interaction of proteins • Comparison of protein levels
Protein identification • How to identify proteins? • Identity defined by their structure • Protein structure • Protein: sequence of peptides • Peptide: sequence of amino acids • 20 common types • Consist of different atoms – have different masses • Too small to see… but not to weigh • Mass Spectrometry!
Mass Spectrometry (MS) • Technique using a mass spectrometer • Input: sample of peptides • Proteins have been split chemically • This improves, among other things, accuracy and efficiency • Most head / tail subsequences are present • Output: mass spectrum • Frequencies of particles of certain masses • Full peptide sequence can be derived
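The last bullet can be illustrated with a small sketch: if the masses of all head subsequences are known, the differences between consecutive masses identify the amino acids one by one. This is a simplified illustration and not the method used by any particular tool. The function names are hypothetical, only a subset of residues is included, and ion-type mass offsets, noise, and missing peaks are ignored.

```python
# Monoisotopic residue masses in daltons (subset chosen for illustration).
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
    "T": 101.04768,
    "L": 113.08406,
    "K": 128.09496,
    "E": 129.04259,
}


def prefix_masses(peptide):
    """Return the cumulative residue mass of every head subsequence."""
    total = 0.0
    ladder = []
    for amino_acid in peptide:
        total += RESIDUE_MASS[amino_acid]
        ladder.append(total)
    return ladder


def read_sequence(ladder, tolerance=0.01):
    """Recover a sequence from the mass differences between ladder steps."""
    sequence = []
    previous = 0.0
    for mass in ladder:
        difference = mass - previous
        matches = [aa for aa, m in RESIDUE_MASS.items()
                   if abs(m - difference) <= tolerance]
        # Mark the position as unknown unless exactly one residue fits.
        sequence.append(matches[0] if len(matches) == 1 else "?")
        previous = mass
    return "".join(sequence)
```

For example, `read_sequence(prefix_masses("GASPEK"))` reads the peptide back from its ladder. Leucine and isoleucine have the same mass, which is one reason a real analysis cannot rely on mass differences alone.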
Mass Spectrometry (MS) • How does it work? • Ionize particles • Now particles have an electrical charge • Accelerate them in an electric field • Deflect them in a magnetic field • Deflection depends on mass (F = m a) • Measure how far they have been deflected
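The dependence of deflection on mass can be made concrete with a small calculation. This sketch assumes an idealized magnetic-sector setup: an ion is accelerated through a potential difference and then follows a circular path in a uniform magnetic field, where the magnetic force q v B supplies the centripetal force m v² / r. The function name and parameters are illustrative and not taken from the slides.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms


def deflection_radius(mass_da, charge_state, voltage, field_tesla):
    """Radius (m) of the circular path of an ion in a uniform magnetic field.

    The ion first gains kinetic energy q*V in the electric field, so
    v = sqrt(2*q*V/m). In the magnetic field q*v*B = m*v**2/r, so r = m*v/(q*B).
    """
    charge = charge_state * ELEMENTARY_CHARGE
    mass = mass_da * DALTON
    speed = math.sqrt(2 * charge * voltage / mass)
    return mass * speed / (charge * field_tesla)
```

Under these assumptions the radius grows with the square root of the mass-to-charge ratio: an ion four times as heavy with the same charge is bent along a path with twice the radius.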
Mass Spectrometry (MS) • Improvements for better analysis (1) • Use chromatography • Spreads input over time: more details • Output: a sequence of MS spectra
Mass Spectrometry (MS) • Improvements for better analysis (2) • Use “recursive” mass spectrometry • Called MS/MS (or MS2 or tandem MS) • Take part of the sample that produces a peak • Usually concerns one certain peptide • Output: MS spectra with related MS/MS spectra
Mass Spectrometry (MS) • Improvements for better analysis (3) • Use bioinformatics • All output is translated to mzXML • A database is searched on MS/MS spectra • Input: raw MS data • Output: pepXML: peptide information • Tools are used to e.g. display the data • Lots of redundant / boring work is taken care of!
Bioinformatics: what can be done? • Remember the Proteomics research areas: • Identification of proteins • Interaction of proteins • Comparison of protein levels • Most research: vary one aspect at a time • Requires interactive display of data • Zooming, “stacking”, cross sections, etc. • But not just display of data • Filtering, “warping”, peak detection, etc.
Bioinformatics: existing tools • Tools exist, but… • Lots of different tools to do different things • Functionality not always as desired • They also lack functionality • Not easily extendable • Example: Pep3D • Nice visualization, but • Only one sample at a time, only a single view • Solution: develop new software
System requirements • Load raw spectrometry data • Visualize the data • Manipulate and analyze the data interactively • Export data • Extendibility • Use in open community • Open source
Loading data • mzXML: raw spectrometry data • MS spectra • Embedded MS/MS spectra • pepXML: database of matches with peptides
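As a rough illustration of what loading raw data involves: mzXML stores the peak list of each scan as base64-encoded text containing alternating m/z and intensity values. The sketch below assumes uncompressed data in network (big-endian) byte order with 32-bit or 64-bit floats. The function names are hypothetical, this is not Spectre's actual parser, and `encode_peaks` exists only to make the example self-contained.

```python
import base64
import struct


def decode_peaks(encoded, precision=32):
    """Decode a base64 peak list into (m/z, intensity) pairs.

    Assumes uncompressed, big-endian floats stored as m/z, intensity, m/z, ...
    """
    raw = base64.b64decode(encoded)
    code = "f" if precision == 32 else "d"
    count = len(raw) // struct.calcsize(">" + code)
    values = struct.unpack(">" + str(count) + code, raw)
    return list(zip(values[0::2], values[1::2]))


def encode_peaks(pairs, precision=32):
    """Inverse of decode_peaks, used here only to build test input."""
    code = "f" if precision == 32 else "d"
    flat = [value for pair in pairs for value in pair]
    raw = struct.pack(">" + str(len(flat)) + code, *flat)
    return base64.b64encode(raw).decode("ascii")
```

A full loader would also have to read the scan metadata (MS level, retention time) and link the embedded MS/MS scans to their parent MS scan.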
Visualizing the data • List of loaded samples • MS spectrum • Cross sections of the MS spectrum • MS/MS spectra • Peptide information
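To make the idea of a cross section concrete: a chromatography run yields a sequence of spectra over time, so the data form a map with a time axis and an m/z axis. The sketch below assumes a run is held as a list of `(retention_time, peaks)` tuples, where `peaks` is a list of `(mz, intensity)` pairs. This layout and both function names are assumptions made for the example, not Spectre's internal format.

```python
def chromatogram(scans, mz_low, mz_high):
    """Cross section along the time axis.

    Returns, per scan, the summed intensity of all peaks inside the m/z window.
    """
    return [
        (rt, sum(intensity for mz, intensity in peaks if mz_low <= mz <= mz_high))
        for rt, peaks in scans
    ]


def spectrum_at(scans, time):
    """Cross section along the m/z axis: peaks of the scan closest in time."""
    _, peaks = min(scans, key=lambda scan: abs(scan[0] - time))
    return peaks
```

The first function answers "how does the signal around this mass change over time?", the second "what does the spectrum look like at this moment?".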
Manipulating and analyzing the data • Stacking: toggle samples on/off • Warping • Zooming • Peak detection • More analysis, like ratio calculation
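Peak detection and ratio calculation can be sketched in a few lines. The version below uses the simplest possible rule, a point counts as a peak when it lies above a threshold and above both of its neighbours, and is meant only as an illustration. The slides do not say which algorithm Spectre uses, and the function names are hypothetical.

```python
def detect_peaks(intensities, threshold=0.0):
    """Return indices of points above the threshold and above both neighbours."""
    peaks = []
    for i in range(1, len(intensities) - 1):
        value = intensities[i]
        if (value > threshold
                and value > intensities[i - 1]
                and value > intensities[i + 1]):
            peaks.append(i)
    return peaks


def peak_ratio(intensities, first, second):
    """Intensity ratio of two detected peaks, e.g. the two halves of a peak pair."""
    return intensities[first] / intensities[second]
```

Real spectra are noisy, so a practical detector would usually smooth the signal first and treat flat-topped peaks explicitly. This sketch does neither.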
Export data • Lists of peak pairs • Modified pepXML (i.e. with ratios) • Images of spectra • Modified samples
The structure of Spectre • Graph: MS spectra, cross sections, MS/MS spectra • Workspace: a collection of samples and settings • Sample: internal data structure for one sample • GUI: the user interface • Processor: the main link between parts of the program
The structure of Spectre • [Class diagram: GUI, Processor, Workspace, Graph, Sample; multiplicities shown on the associations: 1, 1, 4, *, *]
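The roles described for Workspace, Sample and Processor could be modelled roughly as follows. This is a hypothetical sketch to show how the parts might relate, with a workspace holding samples and settings and a processor mediating access. All field and method names are invented for the example and do not come from the Spectre source.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """Internal data for one sample (fields are illustrative)."""
    name: str
    scans: list = field(default_factory=list)
    visible: bool = True


@dataclass
class Workspace:
    """A collection of samples and settings."""
    samples: list = field(default_factory=list)
    settings: dict = field(default_factory=dict)


class Processor:
    """Link between the user interface and the data."""

    def __init__(self, workspace):
        self.workspace = workspace

    def add_sample(self, sample):
        self.workspace.samples.append(sample)

    def toggle_sample(self, name):
        """Switch a sample on or off, as in "stacking"."""
        for sample in self.workspace.samples:
            if sample.name == name:
                sample.visible = not sample.visible

    def visible_samples(self):
        return [s.name for s in self.workspace.samples if s.visible]
```

Routing every change through one object means the user interface never touches the samples directly, which is one common way to keep a program modular.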
Systematic approach to the problem • Phased development • Three versions • Lots of diagrams • Application of courses MSO, PM • HCI team and data layer team • Later on: data visualization team • Extreme Programming
Progress so far • First version will be due in week 18 • Functionality: • Loading raw data • Visualization and user interface • Basic interaction with zooming etc. • Complete internal data structures • Export of images • Missing link between mzXML and pepXML!
Further planning • Version 2 – week 23 • Warping • Peak detection / analysis • Export of calculated data • Version 3 – week 27 • Ratio calculation • Modification of samples
After completion of the project • Web site • Open source • Further maintenance • Extendable
Conclusion • Spectre: a modular and extendable program • A combination of many different requirements • Phased addition of features • Any questions?
The data structure • [Class diagram: Sample, MzTable, MzNode, SampleParser, SampleWriter, MzParser, PepParser, …; multiplicities shown: 1, *]