Associating Biomedical Terms: Case Study for Acetylation

Aaron Buechlein, Indiana University School of Informatics. Advisor: Dr. Predrag Radivojac.


Presentation Transcript


  1. Associating Biomedical Terms: Case Study for Acetylation
     Aaron Buechlein, Indiana University School of Informatics
     Advisor: Dr. Predrag Radivojac

  2. Overview
     • Background
     • Previous Work
     • Methods
     • Results

  3. Central Dogma
     Image: http://www.accessexcellence.org/RC/VL/GG/images/central.gif

  4. Post-Translational Modifications (PTMs)

  5. Acetylation
     • Acetylation involves the substitution of an acetyl group (-COCH3) for hydrogen
     • Typically occurs on N-terminal tails and lysine residues (Lys or K)

  6. Previous Predictors
     • Several PTM predictors were created prior to this work
     • Acetylation predictors also existed prior to this work
       • NetAcet predicts only N-terminal acetylation sites
       • AutoMotif Server predicts various PTMs and includes an acetylation component
       • PAIL is a lysine acetylation predictor

  7. Methods
     • Create Dataset
       • Download articles relevant to acetylation and extract sites
       • Rank articles in order to elucidate sites quickly
       • Swiss-Prot and Human Protein Reference Database (HPRD)
     • Create Predictors
       • Leave-one-protein-out validation
       • MATLAB
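The slide names leave-one-protein-out validation but does not spell out the procedure. A minimal sketch of the usual reading, assuming each site carries the identifier of its parent protein and that all sites from one protein are held out together; the function name and the example identifiers are hypothetical, and the original work used Matlab rather than Python.

```python
from collections import defaultdict


def leave_one_protein_out(protein_ids):
    """Yield (held_out_protein, train_indices, test_indices), one fold per protein.

    protein_ids[i] is the protein that site i belongs to (an assumed layout).
    """
    by_protein = defaultdict(list)
    for index, protein in enumerate(protein_ids):
        by_protein[protein].append(index)
    for protein, test_indices in by_protein.items():
        held_out = set(test_indices)
        train_indices = [i for i in range(len(protein_ids)) if i not in held_out]
        yield protein, train_indices, test_indices


# Hypothetical example: five lysine sites drawn from three proteins.
site_proteins = ["P1", "P1", "P2", "P3", "P3"]
folds = list(leave_one_protein_out(site_proteins))
```

Holding out whole proteins keeps sites from the same sequence out of the training set when that protein is being tested, which a per-site split would not guarantee.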

  8. Article Retrieval
     • Searched individual journal sites for articles relevant to acetylation
     • Saved the resulting HTML pages for each journal
     • These pages were then used as input for a web crawler that downloaded the articles
     • Because journal sites are constructed differently, each journal required its own regular expression to extract article links
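The link-extraction step could look roughly like the sketch below. The journal keys and both regular expressions are invented for illustration; the patterns actually used for each journal are not given in the slides.

```python
import re

# Hypothetical per-journal patterns; each real journal site would need its own.
LINK_PATTERNS = {
    "journal_a": re.compile(r'href="(/articles/[^"]+)"'),
    "journal_b": re.compile(r'href="(/content/[^"]+\.full)"'),
}


def extract_article_links(journal, html):
    """Return article links from a saved results page, de-duplicated in page order."""
    pattern = LINK_PATTERNS[journal]
    links = []
    for link in pattern.findall(html):
        if link not in links:
            links.append(link)
    return links


sample_html = (
    '<a href="/articles/abc123">A</a> '
    '<a href="/about">About</a> '
    '<a href="/articles/abc123">A again</a>'
)
links = extract_article_links("journal_a", sample_html)
```

The returned links would then be handed to the crawler; navigation links such as `/about` are ignored because they do not match the journal's article pattern.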

  9. Rank Articles
     • First, locate occurrences of the first phrase, “phrase 1”:
       $A = \{a_1, a_2, \ldots, a_{|A|}\}$
     • Next, locate occurrences of the second phrase, “phrase 2”:
       $R = \{r_1, r_2, \ldots, r_{|R|}\}$
     • $c$ and $d$ are constants
     • $x$ is the distance in characters between $r$ and the nearest word $a$

  10. An example: acetylation
      1. Word “acetylat”: $A = \{a_1, a_2, \ldots, a_m\}$
      2. Regular expression (k | lys | lysine)(space)*(digit)+: $R = \{r_1, r_2, \ldots, r_n\}$
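The two searches and the distance x can be sketched as follows. This assumes the alternatives in the slide's pattern were separated by `|`, adds a word boundary so that a "k" inside another word is not matched, and measures distance between match start positions; all three are assumptions. The scoring formula itself is not reproduced because it does not survive in the transcript.

```python
import re

KEYWORD = re.compile(r"acetylat", re.IGNORECASE)
# Assumed reading of the slide's pattern: (k | lys | lysine)(space)*(digit)+
SITE = re.compile(r"\b(?:lysine|lys|k)\s*\d+", re.IGNORECASE)


def site_distances(text):
    """Return (site mention, x) pairs; x is the character distance from the
    start of the mention to the start of the nearest keyword occurrence."""
    keyword_positions = [m.start() for m in KEYWORD.finditer(text)]
    if not keyword_positions:
        return []
    results = []
    for match in SITE.finditer(text):
        x = min(abs(match.start() - a) for a in keyword_positions)
        results.append((match.group(0), x))
    return results


sample = "p53 is acetylated at K382 and Lys 373."
pairs = site_distances(sample)
```

Each pair gives one element of R together with its distance to the nearest element of A, which is the input the article score is described as using.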

  11. An example: acetylation
      Score for article S: [formula not captured in transcript]

  12. An example: acetylation
      Score for article S: [formula not captured in transcript]
      Papers with S > 100 are rich in sites; papers with S < 30 fall in the “twilight” zone

  13. Elucidate Sites
      • Sites were manually extracted from articles, beginning with the highest-ranked
      • The original experimental paper for each site was checked for traceable evidence
      • Sites were extracted from Swiss-Prot
      • Sites were extracted from HPRD

  14. Predictors
      • Support Vector Machine
      • Artificial Neural Network
      • Decision Tree

  15. Predictor Input
      • Positives taken as all lysines found to be acetylated
      • Negatives taken as all lysines not found to be acetylated
      • Features created based on characteristics surrounding lysines
        • Amino acid content, hydrophobicity, charge, disorder, etc.
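Two of the listed feature types, amino acid content and charge, can be sketched for a window around each lysine. The window size, the simplified charge table, and the example sequence are assumptions made for illustration; the slides do not give the actual feature definitions, and hydrophobicity and disorder are left out.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Simplified side-chain charges at neutral pH (assumption: histidine counted as neutral).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def lysine_windows(sequence, flank=3):
    """Yield (1-based position, window) for every lysine; windows are clipped at the ends."""
    for i, residue in enumerate(sequence):
        if residue == "K":
            yield i + 1, sequence[max(0, i - flank): i + flank + 1]


def window_features(window):
    """Return 20 amino acid fractions followed by the net charge of the window."""
    counts = Counter(window)
    composition = [counts[aa] / len(window) for aa in AMINO_ACIDS]
    net_charge = sum(CHARGE.get(aa, 0) for aa in window)
    return composition + [net_charge]


sequence = "MSGRGKGGKGLG"  # illustrative sequence fragment
windows = list(lysine_windows(sequence))
features = [window_features(window) for _, window in windows]
```

Each lysine yields one feature vector; whether it is labelled positive or negative depends on whether that position was found to be acetylated.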

  16. Predictor Input

  17. Article and Ranking Results
      • 4888 articles from 10 sites were searched
        • Nature provided 2147 articles
        • ScienceDirect provided 1519 articles
      • The highest-ranking article overall came from the Journal of Biological Chemistry
        • Score of 151.87
        • Contained 10 acetylation sites
      • With histones excluded, the highest-ranking article came from Nature
        • Ranked #5 before the exclusion
        • Score of 116.36
        • Contained 9 unique acetylation sites

  18. Top 25

  19. Ranking Results
      • Articles with scores greater than 30 had potential for providing at least one site
      • As scores approached 30, articles became less fruitful

  20. Dataset Results
      • Dataset included 1442 total sites and 1085 non-redundant sites
        • HPRD contributed 90 total sites
        • Swiss-Prot contributed 825 sites
        • Our study contributed 527 sites
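The three source counts add up to the reported total (90 + 825 + 527 = 1442), and the smaller non-redundant figure reflects sites reported by more than one source. A minimal sketch of that bookkeeping, assuming a site is identified by a (protein, position) pair; the criterion actually used for redundancy is not stated in the slides, and the toy entries are invented.

```python
def merge_sites(sources):
    """sources maps a source name to a list of (protein_id, position) sites.

    Returns (total number of sites, set of non-redundant sites).
    """
    total = sum(len(sites) for sites in sources.values())
    unique_sites = set()
    for sites in sources.values():
        unique_sites.update(sites)
    return total, unique_sites


# Source counts reported on the slide.
reported_total = 90 + 825 + 527

# Hypothetical toy data: two sites are reported by two sources each.
toy_sources = {
    "HPRD": [("P1", 382)],
    "Swiss-Prot": [("P1", 382), ("P2", 16)],
    "literature": [("P2", 16), ("P3", 9), ("P3", 14)],
}
total, unique_sites = merge_sites(toy_sources)
```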

  21. Dataset Results

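  22. Sensitivity, Specificity, and Precision
      • Sensitivity (sn): $TP / (TP + FN)$
      • Specificity (sp): $TN / (TN + FP)$
      • Precision (pr): $TP / (TP + FP)$
      (TP, TN, FP, and FN are the counts of true positives, true negatives, false positives, and false negatives.)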
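These three measures are standard ratios over the confusion matrix. A short sketch using their textbook definitions, with labels encoded as 1 for acetylated and 0 for not acetylated (an assumed encoding) and hypothetical example predictions:

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels encoded as 1 and 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Fraction of true sites that were predicted as sites."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-sites that were predicted as non-sites."""
    return tn / (tn + fp)


def precision(tp, fp):
    """Fraction of predicted sites that are true sites."""
    return tp / (tp + fp)


# Hypothetical labels and predictions for eight lysines.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sn, sp, pr = sensitivity(tp, fn), specificity(tn, fp), precision(tp, fp)
```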

  23. Accuracy and AUC
      • Accuracy (acc): [formula not captured in transcript]
      • Area Under Curve (AUC)
        • Refers to the area under the Receiver Operating Characteristic (ROC) curve
        • The ROC curve is the graphical plot of sensitivity vs. 1 - specificity
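AUC can be computed without drawing the curve: it equals the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as one half. The sketch below uses that pairwise form; it is not necessarily how the values in this study were computed, and the labels and scores are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictor outputs for two positives and two negatives.
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of positives from negatives.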

  24. SVM Predictor

  25. Artificial Neural Network

  26. Decision Tree

  27. Algorithm Comparison

  28. I would like to acknowledge those who have helped me throughout this project: Dr. Predrag Radivojac, Dr. Haixu Tang, and Wyatt Clark

  29. I welcome your questions and/or comments

  30. An example: acetylation
      1. Word “acetylat”: $A = \{a_1, a_2, \ldots, a_m\}$
      2. Regular expression (k | lys | lysine)(space)*(digit)+: $R = \{r_1, r_2, \ldots, r_n\}$

  31. An example: acetylation
      Score for article S: [formula not captured in transcript]
