wKinMut: An integrated tool for the analysis and interpretation of mutations in human protein kinases. José MG Izarzugaza. 1 Spanish National Cancer Research Center (CNIO); 2 Center for Biological Sequence Analysis (CBS).
Protein Kinases: Metabolic switches of the cell. José MG Izarzugaza - txema@cbs.dtu.dk
PK Functions; Disease / Cancer; Cell Cycle Regulation; Signal Transduction; Angiogenesis; Immune Evasion; Proliferation; Anti-apoptosis; Metastasis; …
Kinases and Cancer: Some examples. ~30,000 articles in PubMed match the query ‘kinase AND mutation AND cancer’.
wKinMut: Interpreting Kinase Mutations. http://wkinmut.bioinfo.cnio.es
wKinMut: Gene & Protein info
wKinMut: PFAM Domain Information
wKinMut: Mutations onto structures (Izarzugaza et al., BMC Bioinformatics 2009)
wKinMut: Annotation databases
Mutations occurring at the same position (different amino acids allowed)
- Uniprot: General information, experimental characterization
- SAAPdb: Structural consequences of mutations
- KinMutBase: Kinase mutations and disease
- COSMIC: Somatic Mutations in Cancer
wKinMut: Assessing the Pathogenicity
Methods to predict the pathogenicity of mutations: SNAP, SIFT, SNPs&GO, PolyPhen-2, PMUT, MutationAssessor, Torkamani
KinMut: A Kinase-Specific Predictor
• Mutation selection criteria
- Source db: Uniprot
- Human protein kinases
- Experimentally classified
- Non-synonymous, non-truncating
• Dataset Description
- Disease set: 865 muts, 65 kinases
- Neutral set: 2627 muts, 447 kinases
- Best dataset according to Care et al., 2007
- Unevenly distributed in kinase groups
(Izarzugaza et al., BMC Genomics 2012)
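The selection criteria above can be read as a simple record filter. The following is a minimal sketch only: the `Mutation` record, its field names, and the example entries are hypothetical and are not taken from the KinMut code or from Uniprot's file formats.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mutation:
    """Hypothetical record layout; field names are illustrative only."""
    source_db: str
    organism: str
    is_protein_kinase: bool
    label: Optional[str]  # "disease", "neutral", or None if unclassified
    wild_type: str        # one-letter amino acid code
    position: int
    mutant: str           # one-letter code; "*" stands for a stop codon


def passes_selection(m: Mutation) -> bool:
    """Apply the four selection criteria listed on the slide."""
    return (
        m.source_db == "Uniprot"                 # source db
        and m.organism == "Homo sapiens"         # human ...
        and m.is_protein_kinase                  # ... protein kinases
        and m.label in {"disease", "neutral"}    # experimentally classified
        and m.wild_type != m.mutant              # non-synonymous
        and "*" not in (m.wild_type, m.mutant)   # non-truncating
    )


# Made-up example records, for illustration only.
records = [
    Mutation("Uniprot", "Homo sapiens", True, "disease", "V", 600, "E"),
    Mutation("Uniprot", "Homo sapiens", True, None, "G", 719, "A"),
    Mutation("Uniprot", "Homo sapiens", True, "neutral", "R", 10, "*"),
    Mutation("Uniprot", "Mus musculus", True, "disease", "A", 5, "T"),
]
selected = [m for m in records if passes_selection(m)]
```

Only the first record survives: the second is unclassified, the third is truncating, and the fourth is not human.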
Features: General & Kinase-Specific
• Mutations represented as vectors of features
• General + Kinase-specific features
Gene Level
- Membership to KinBase groups
- “Propensity to disease” of GO terms
Domain Level
- Mutation within PFAM domains
Residue Level
- Amino acid types
- Hydrophobicity changes
- Sequence conservation (SIFT)
- Functional annotations (SW, FireDB)
- Phosphorylation propensity
- Specificity Determining Positions (SDPs)
(Izarzugaza et al., BMC Genomics 2012)
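To make the idea of a feature vector concrete, here is a small sketch of two of the residue-level features (amino acid types and hydrophobicity change). It assumes the Kyte-Doolittle hydropathy scale and a one-hot encoding; the slide does not say which scale or encoding KinMut uses, so both are illustrative choices, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not stated in the slides).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(KYTE_DOOLITTLE)  # fixed ordering for the one-hot encoding


def hydrophobicity_change(wild_type: str, mutant: str) -> float:
    """Hydropathy of the mutant residue minus that of the wild-type residue."""
    return KYTE_DOOLITTLE[mutant] - KYTE_DOOLITTLE[wild_type]


def one_hot(residue: str) -> list:
    """Encode a residue as a 20-dimensional indicator vector."""
    return [1.0 if aa == residue else 0.0 for aa in AMINO_ACIDS]


def residue_features(wild_type: str, mutant: str) -> list:
    """Concatenate amino acid types and hydrophobicity change (41 values)."""
    return one_hot(wild_type) + one_hot(mutant) + [
        hydrophobicity_change(wild_type, mutant)
    ]


vector = residue_features("V", "E")  # e.g. the V600E substitution
```

Gene-level and domain-level features (KinBase group, GO term propensity, PFAM domain membership) would be appended to the same vector in the same way.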
Machine Learning: SVM
Support Vector Machine (SVM)
- Kernel (ϕ): Radial Basis Function
- Optimized parameters: soft margin (C = 8), radius (γ = 6×10⁻⁴)
- 10-fold cross-validation
[Figure: parameter optimization plot; axes: Soft Margin (C), Gaussian Parameter (γ)]
(Izarzugaza et al., BMC Genomics 2012)
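The two ingredients named on this slide, the RBF kernel and 10-fold cross-validation, can be sketched in plain Python. This is an illustration under assumptions, not the authors' implementation: the slides do not name the SVM library used, the helper names are hypothetical, and the fold assignment (shuffled, interleaved) is one of several reasonable choices. The values of C and γ are those reported on the slide.

```python
import math
import random

C = 8         # soft-margin penalty from the slide (consumed by the SVM solver)
GAMMA = 6e-4  # RBF parameter from the slide


def rbf_kernel(x, y, gamma=GAMMA):
    """K(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Dataset size from the slides: 865 disease + 2627 neutral mutations.
splits = list(k_fold_indices(865 + 2627, k=10))
```

In practice a library SVM would normally be trained on each `train` split and evaluated on the matching `test` split; scikit-learn's `SVC(kernel="rbf", C=8, gamma=6e-4)` is one such option, named here only as an example.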
KinMut: Kinase-Specific Prioritization of Muts. (Izarzugaza et al., BMC Genomics 2012)
Top most informative features
• Disease propensity of GO terms (log odds ratio)
• Specificity determining positions
• Change in Hydrophobicity
• Amino acid type
• PFAM domains (Any, Pkinase_tyr, Fn…)
• SIFT
• Protein Kinase group (TK, CAMK, CK1, TKL…)
• Uniprot annotations
• Phosphorylation propensity
(bold: Kinase-specific features)
[Figure callouts: Kinbase groups; Specificity determining positions]
(Izarzugaza et al., BMC Genomics 2012)
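The top feature is described as a log odds ratio, without a formula. One common way to compute such a score for a single GO term is sketched below. The definition, the pseudocount, and the per-term counts are assumptions made for illustration and may differ from the published KinMut definition; only the set sizes (865 and 2627) come from the slides.

```python
import math


def log_odds_ratio(disease_with, neutral_with,
                   disease_total=865, neutral_total=2627, pseudocount=0.5):
    """Log odds ratio of a GO term being annotated on disease vs neutral mutations.

    Positive values mean the term is enriched in the disease set.
    The pseudocount avoids division by zero for rare terms (an assumption).
    """
    a = disease_with + pseudocount
    b = disease_total - disease_with + pseudocount
    c = neutral_with + pseudocount
    d = neutral_total - neutral_with + pseudocount
    return math.log((a / b) / (c / d))


# Hypothetical counts: a term seen on 100 disease and 50 neutral mutations.
enriched = log_odds_ratio(100, 50)
# Hypothetical counts: a term seen on 10 disease and 300 neutral mutations.
depleted = log_odds_ratio(10, 300)
```

With these made-up counts, `enriched` is positive and `depleted` is negative.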
wKinMut: Literature searches with SNP2L
• SNP2L:
- Automatic literature mining.
- Kinase mutation mentions.
- Mapping to protein sequences.
- False positive filtering.
(Krallinger et al., BMC Bioinformatics 2009)
SNP2L: A Text Mining Pipeline (Krallinger et al., BMC Bioinformatics 2009)
SNP2L: Simplified Pipeline
- Abstracts / Fulltext: “… V600E B-Raf, which confers increased kinase activity, …”
- Mutation Detection: V600E
- Protein Detection: B-Raf
- Filtering: Sequence Match (583 GDFGLAT VKSRWSG 606; position 600)
Results
- Abstracts: 643 different muts in 128 kinases
- Full Texts: 6970 different muts in 325 kinases
(Krallinger et al., BMC Bioinformatics 2009)
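The three steps of the simplified pipeline (mutation detection, protein detection, sequence-match filtering) can be sketched as follows. This is not SNP2L's code. The regular expression, the one-entry protein lexicon, and the function names are hypothetical, and the fragment coordinates assume that the 14 residues visible on the slide end at residue 606, which places the valine at position 600.

```python
import re

AA = "ACDEFGHIKLMNPQRSTVWY"
# Matches point mutations written as <wild type><position><mutant>, e.g. V600E.
MUTATION_RE = re.compile(rf"\b([{AA}])(\d+)([{AA}])\b")

# Hypothetical lexicon: protein name -> (first residue number, sequence fragment).
# Start coordinate assumed so that the fragment ends at residue 606.
FRAGMENTS = {"B-Raf": (593, "GDFGLATVKSRWSG")}


def detect_mutations(text):
    """Step 1: find mutation mentions as (wild_type, position, mutant)."""
    return [(m.group(1), int(m.group(2)), m.group(3))
            for m in MUTATION_RE.finditer(text)]


def detect_proteins(text):
    """Step 2: find protein names from the lexicon that occur in the text."""
    return [name for name in FRAGMENTS if name in text]


def sequence_match(protein, wild_type, position):
    """Step 3: keep a pair only if the wild-type residue is at that position."""
    start, fragment = FRAGMENTS[protein]
    offset = position - start
    return 0 <= offset < len(fragment) and fragment[offset] == wild_type


def extract(text):
    """Run all three steps and return the surviving (mutation, protein) pairs."""
    return [(f"{wt}{pos}{mut}", protein)
            for wt, pos, mut in detect_mutations(text)
            for protein in detect_proteins(text)
            if sequence_match(protein, wt, pos)]


pairs = extract("… V600E B-Raf, which confers increased kinase activity, …")
```

The sequence match is what filters false positives: a mention such as E600V would be detected by the regular expression but rejected, because residue 600 of the fragment is V rather than E.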
SNP2L: Manual Validation (100 abstracts)
- Ambiguous cases (2%)
- Wrong Mutation Detection (3%)
- Already in DBs (23%)
- Wrong Mutation to Protein Assignment (23%)
- Orthologues (8%)
- Correct, Manual Validation (41%)
~72% of mutation extractions proved correct, including ~49% that were not already in databases.
(Krallinger et al., BMC Bioinformatics 2009)
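The two summary figures can be reproduced from the category percentages. The grouping used below is inferred from the arithmetic (it is the only combination of categories that yields 72 and 49) and is not stated explicitly on the slide.

```python
# Category percentages from the slide (100 manually validated abstracts).
validation = {
    "Ambiguous cases": 2,
    "Wrong Mutation Detection": 3,
    "Already in DBs": 23,
    "Wrong Mutation to Protein Assignment": 23,
    "Orthologues": 8,
    "Correct, Manual Validation": 41,
}

total = sum(validation.values())

# Assumed grouping: extractions counted as correct.
correct = (validation["Correct, Manual Validation"]
           + validation["Already in DBs"]
           + validation["Orthologues"])

# Assumed grouping: correct extractions not already present in databases.
novel = validation["Correct, Manual Validation"] + validation["Orthologues"]
```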
SNP2L: Structural distribution of mentions
- Literature mentions are distributed all over the PK domain.
- Higher frequency is associated with functional hot-spots.
(Krallinger et al., BMC Bioinformatics 2009)
SNP2L: From Mutations to Literature
Kinase Mutations → SNP2L → Evidence (Phenotype, Biochemical mechanism, Organism/Tissue, Experimental conditions, …)
(Krallinger et al., BMC Bioinformatics 2009)
wKinMut: Interactions from Literature (iHop) (Hoffmann & Valencia, Nature Genetics 2004)
wKinMut: Integration of mutation information
Sources: Gene/Protein Features; Domain Features; Residue Features; SNP2L Literature; iHop Interactions; Info in other Databases
Example: EGFR Gly 719 Ala
- Lung cancer, somatic mutation
- Located in ATP binding pocket
- Confers resistance to gefitinib
http://wkinmut.bioinfo.cnio.es
Personalized (Stratified) Medicine. Sequencing and Genome Analysis: “Here is my sequence”
Simplified Personalized Med. Pipeline
[Diagram labels: Cancer Patient; Blood (normal tissue); Cancer cells; Mutations; Information Integration (wKinMut); Clinician; Educated Decision; Treatment; Personalized Medicine]
Integrate Information, help the clinician decide (Courtesy of M.Vazquez)
wKinMut as part of our Pers. Med. Pipeline
- Cancer types: Endocrine Cancer, Pancreas Cancer
- Predictors: SIFT, PolyPhen-2, MutationAssessor, KinMut (kinases)
Acknowledgements
Pathogenicity Prediction & Personalized Medicine (Izarzugaza et al., BMC Genomics 2012)
- Alfonso Valencia, Miguel Vazquez, Angela del Pozo, Victor de la Torre
Text Mining (Izarzugaza et al., Frontiers 2012) (Krallinger et al., BMC Bioinformatics 2009)
- Martin Krallinger
Mutation Mapping & Distribution Analysis (Izarzugaza et al., Proteins 2009) (Izarzugaza et al., BMC Bioinformatics 2009) (Izarzugaza et al., BMC Bioinformatics 2011)
- Gonzalo Lopez, Antonio Rausell
Christine Orengo’s Group (UCL)
- Ollie Redfern, Corin Yeats, Benoît Desailly
Andrew Martin’s Group (UCL)
- Anja Baresic, Lisa Hopcroft