Improving peptide identification for tandem mass spectrometry by incorporating translatomics information Chuan-Le Xiao (肖传乐) State Key Laboratory of Ophthalmology, Sun Yat-sen University
1 Background 3 steps in protein identification:
1 Background J. Proteome Res. 2014, 13, 4113−4119
1 Background Translatomics (Ribosome profiling, Ribo-seq)
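1 Background Sequencing reads: 50-150 bp. Sequencing currently produces millions of short reads, and each read must be located accurately on the genome. Mapping difficulties: the genome sequence is very long, and the number of reads to align is large. Requirements for the computational method: • Speed and accuracy • Acceptable memory usage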
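The mapping task described on this slide can be illustrated with a toy seed-and-verify mapper. This is only a minimal sketch of the general idea (hash the genome's k-mers, look up seeds from the read, then verify candidates by counting mismatches); it is not the FANSe or FANSe2 algorithm, and the function names `build_index` and `map_read` and the parameters `k` and `max_mismatches` are assumptions made for illustration.

```python
from collections import defaultdict


def build_index(genome, k):
    """Map every k-mer in the genome to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def map_read(read, genome, index, k, max_mismatches=1):
    """Return sorted genome start positions where the read aligns
    with at most max_mismatches substitutions (no indels in this toy)."""
    hits = set()
    # Non-overlapping seeds taken from the read; each exact seed hit
    # proposes a candidate start position that is then verified.
    for offset in range(0, len(read) - k + 1, k):
        seed = read[offset:offset + k]
        for pos in index.get(seed, ()):
            start = pos - offset
            if start < 0 or start + len(read) > len(genome):
                continue
            window = genome[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits.add(start)
    return sorted(hits)
```

The index trades memory for speed: lookups are constant time per seed, but the table grows with genome length, which is the memory constraint the slide points to.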
1 Background FANSe: an accurate algorithm for quantitative mapping of large scale sequencing reads. Nucleic Acids Res. 2012 FANSe2: A Robust and Cost-Efficient Alignment Tool for Quantitative Next-Generation Sequencing Applications. PLoS ONE High sensitivity! High speed! High accuracy!
1 Background Relationship between the amount of mRNA being transcribed and protein abundance in A549 cells
1 Background [Figure: MS/MS spectra, peak intensity vs. m/z] Problem: protein identification efficiency is low (about 10-30%)
1 Background ProVerB: high identification power and high accuracy, broad applicability, and high reliability
1 Background 1) Statistical analysis of amino acid pairs and peak intensity (b- and y-ion intensity matrices) i=A,C,…. j=A,C,….
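1 Background 2) Generating the theoretical spectrum Rules for generating theoretical peaks: 1. b and y fragment ions are always generated. 2. Fragment ions containing S, T, E, or D generate b-H2O and y-H2O. 3. Fragment ions containing R, K, Q, or N generate b-NH3 and y-NH3. 4. If the precursor ion charge is greater than 1 and the fragment contains S, H, or K, doubly charged ions are generated.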
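The four peak-generation rules can be sketched in a few lines of Python. This is an illustrative reading of the rules as stated on the slide, not ProVerB's actual code: the function name `theoretical_peaks`, the peak labels, and the choice to report singly charged m/z for the base and neutral-loss ions are assumptions. Masses are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
AMMONIA = 17.026549
PROTON = 1.007276


def theoretical_peaks(peptide, precursor_charge):
    """Return {label: m/z} for the theoretical fragment peaks of a peptide."""
    peaks = {}
    n = len(peptide)
    for i in range(1, n):
        fragments = (
            ("b", peptide[:i], 0.0),        # N-terminal fragment
            ("y", peptide[n - i:], WATER),  # C-terminal fragment keeps H2O
        )
        for series, frag, extra in fragments:
            neutral = sum(MONO[aa] for aa in frag) + extra
            name = f"{series}{i}"
            # Rule 1: b and y ions are always generated (singly charged).
            peaks[name] = neutral + PROTON
            # Rule 2: water loss if the fragment contains S, T, E, or D.
            if set(frag) & set("STED"):
                peaks[name + "-H2O"] = neutral - WATER + PROTON
            # Rule 3: ammonia loss if the fragment contains R, K, Q, or N.
            if set(frag) & set("RKQN"):
                peaks[name + "-NH3"] = neutral - AMMONIA + PROTON
            # Rule 4: doubly charged ion if precursor charge > 1
            # and the fragment contains S, H, or K (as listed on the slide).
            if precursor_charge > 1 and set(frag) & set("SHK"):
                peaks[name + "++"] = (neutral + 2 * PROTON) / 2
    return peaks
```

For example, `theoretical_peaks("SAK", 2)` yields a `y1` peak for the C-terminal lysine together with its `y1-NH3` and `y1++` variants, while `b1` (serine) gains `b1-H2O` and `b1++` but no ammonia loss.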
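1 Background 3) Scoring model • Match scoring model, P0=0.06 • Consecutive-match scoring model, r=0.09083 (example: b3 and b4, b4 and b5) • b- and y-ion match scoring model • Total score and background subtraction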
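As a rough illustration of a match scoring model, the sketch below treats each theoretical peak as matching by chance with probability P0 = 0.06 and converts the binomial tail probability into a score of the form -10·lg(P), the form that appears later in the deck. The exact formula used by ProVerB is not given on the slides, so the function `match_score` and its tail-probability definition are assumptions; the consecutive-match and b/y-ion components are not modelled here.

```python
from math import comb, log10


def match_score(n_theoretical, n_matched, p0=0.06):
    """Score = -10 * log10(P), where P is the chance probability of seeing
    at least n_matched matches among n_theoretical peaks (binomial tail)."""
    p_value = sum(
        comb(n_theoretical, k) * p0 ** k * (1 - p0) ** (n_theoretical - k)
        for k in range(n_matched, n_theoretical + 1)
    )
    return -10 * log10(p_value)
```

With this definition, more matched peaks out of the same number of theoretical peaks always gives a higher score, and zero matches gives a score of essentially 0.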
1 Background IPomics: We propose a novel strategy and develop a software system called IPomics for peptide identification that incorporates prior abundance information from translatomics
2 Materials and method 1. Data resources: five paired Ribo-seq and MS/MS datasets
2 Materials and method 2. Analysis pipeline (ProVerB, FANSe2) The analysis pipeline of IPomics is made up of five key steps
3 RESULTS 1. The prior information of FPKM for protein identification 2. The incorporation of translatomic FPKM in the scoring model 3. Comparison of IPomics with Mascot, OMSSA, X!Tandem and pFind 4. Computational validation with SILAC and tyrosine phosphorylation datasets
3 RESULTS 1. The prior information of FPKM for protein identification
3 RESULTS We established a quantification model that transforms the translatomic FPKM into the corresponding probability of protein identification
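For readers unfamiliar with the quantity involved, FPKM (fragments per kilobase of transcript per million mapped fragments) follows a standard formula, sketched below. The second function is purely hypothetical: it estimates an empirical identification rate per log10(FPKM) bin from labelled examples, as one simple way an abundance-to-probability mapping could be built. It is not the quantification model from the talk, whose form is not given on the slide, and the names `fpkm` and `identification_rate_by_bin` are illustrative.

```python
from collections import defaultdict
from math import floor, log10


def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Standard FPKM: fragments per kilobase per million mapped fragments."""
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


def identification_rate_by_bin(records):
    """Hypothetical helper: records is an iterable of (fpkm_value, identified)
    pairs; returns {log10 bin: fraction of proteins identified in that bin}."""
    totals = defaultdict(int)
    found = defaultdict(int)
    for value, identified in records:
        if value <= 0:
            continue  # log10 is undefined for non-positive FPKM
        bin_id = floor(log10(value))
        totals[bin_id] += 1
        found[bin_id] += bool(identified)
    return {b: found[b] / totals[b] for b in sorted(totals)}
```

For instance, a 2,000 bp transcript with 500 mapped fragments in a library of 10 million mapped fragments has an FPKM of 25.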
3 RESULTS 2. The incorporation of translatomic FPKM in the scoring model The PF derived from the prior FPKM information was incorporated into the binomial scoring model in two ways: simple fragment match and consecutive ion match. We evaluated the distributions of peptide scores under the two scoring methods, -10·lg(P) and -10·lg(Psimple)
3 RESULTS 3. Comparison of IPomics with Mascot, OMSSA, X!Tandem and pFind
3 Comparison_peptides
3 Comparison_high-confidence peptides Table 2. Fractions of high confidence peptides of the five algorithms
3 RESULTS 4. Computational validation with SILAC and Tyrosine phosphorylation datasets
3 RESULTS 4. Computational validation with tyrosine phosphorylation datasets Table S8. The identified spectra and peptides in the tyrosine dataset Of the 304 tyrosine sites identified by IPomics, 175 were also identified by both Mascot and OMSSA, and the high-confidence tyrosine peptides identified by at least two engines accounted for 85.5% of the IPomics identifications (Fig. 7). The remaining 14.5% (44) of tyrosine phosphorylation peptides were identified by IPomics alone, with no overlap with the other engines. Nevertheless, all of these peptides with tyrosine phosphorylation sites have been experimentally verified in PhosphoSitePlus