Informatics Journal Club and Research Talk Template


Presentation Transcript


  1. Informatics Journal Club and Research Talk Template

  2. Research Paradigm
     • Driving biomedical problem
     • Informatics methods (existing)
     • New methods
     • Apply to a 2nd problem area to see the generality of the new method
     • Evaluate: 1) ability to solve the biomedical problem & 2) incremental improvement from the new method

  3. Guide to Talks
     • Pick a journal club paper or research topic that discusses new or improved informatics methods
     • Make the focus of your talk a description of the methods
     • Describe the methods in the context of previous work and in light of evaluations of the methods applied to the biomedical problem

  4. Journal Club: {Title of Paper} by {Authors}
     {Bibliographic reference}
     {Your Name}
     {Date}

  5. BMI Journal Club: Finding function: evaluation methods for functional genomic data
     Myers, Barrett, Hibbs, Huttenhower & Troyanskaya
     BMC Genomics 2006, 7:187
     Russ B. Altman, 10/4/06

  6. Why this paper? • {Brief bullet points about why this paper is a good BMI journal club paper, and why you selected it}

  7. Outline (part 1)
     • General description of the medical/biological problem
     • Informatics issues that come up in solving those problems
     • Additional biological/informatics background
     • Aims of the paper

  8. Outline (part 2)
     • Methods employed
     • Results
     • Comparison/evaluation of methods
     • Authors' conclusions
     • Assessment of paper: informatics
     • Assessment of paper: biomedicine
     • Concerns
     • Summary / your conclusions

  9. Why this paper?
     • Needed a good methodological paper
     • Proliferation of work here and elsewhere on predicting gene function from high-throughput genomics
     • This paper addresses an important problem in evaluation, and uses general informatics principles
     • Olga is a recent BMI graduate :)

  10. Potentially confounding biomedicine! ;)
     • {What is the application area of biology or medicine in which this work is presented?}
     • {Discussion of the biological or medical problem that drove/required/suggested researchers to recognize the potential for informatics innovation}
     • {What is the significance of this biomedical problem?}
     • {REMEMBER TO SEPARATE THE INFORMATICS FROM THE BIOMEDICAL APPLICATION. THAT MAY LEAVE NOTHING…}

  11. (Potentially confounding) biomedical background…
     • With the human genome sequenced, we need to understand the interactions and functions of genes (for understanding, drug design)
     • High-throughput experimental data sets are used and integrated for this purpose: two-hybrid, mRNA expression, affinity precipitation
     • Diverse algorithms are also created for integrating these data:
        • Naïve Bayes (Troyanskaya & others)
        • Probabilistic Relational Models (Koller)
        • Comparative techniques (Segal & Stuart)

  12. More biology context…
     • It is critical to assemble networks of interacting and functionally related genes in order to generate hypotheses about cellular biology, identify drug targets, and assess pathway engineering opportunities.
     • Yeast is the best-studied organism because of the wealth of data sets.
     • The authors suspect that use of existing “silver standards” may skew conclusions about high- vs. low-information-content methods/data sources.
     • Scientists are frustrated if many predictions are “high confidence” and then fail in the lab.

  13. Informatics Problem
     • {Describe the general biomedical informatics question/problem addressed in the paper}
     • {Brief review of what others have done to solve this problem, and how well they have performed. THIS MAY REQUIRE READING OTHER PAPERS!}
     • {Why is there another paper on this topic?}

  14. Informatics Problem
     • Whenever a method is created that makes “predictions” or “diagnoses,” it must be evaluated against a gold standard of truth.
     • When making multiple predictions, there can be biases in the gold standard based on its coverage of the predicted space.
     • The resulting reports of performance can vary widely and unpredictably based on which parts of the gold standard are used.
     • This is a relatively new problem in the context of large-scale predictive technologies.

  15. Informatics Problem
     • What is the best way to evaluate a system making thousands or millions of predictions?
     • How can we “level the playing field” so that different methods and data sources can be assessed fairly with respect to information content?
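The coverage issue raised on the last two slides can be made concrete with a small sketch. Nothing below is from the paper: the gene pairs, scores, the two “silver standards,” and the 0.5 call threshold are all invented for illustration. The point is only that the same predictions earn different precision depending on which pairs a standard happens to cover.

```python
# Hedged illustration (not the paper's code): how gold-standard coverage
# changes reported precision for the same set of predictions.

def precision_on_standard(predictions, positives, negatives):
    """Precision computed only over pairs the standard actually covers."""
    covered = [(pair, score) for pair, score in predictions
               if pair in positives or pair in negatives]
    called = [pair for pair, score in covered if score >= 0.5]  # arbitrary cutoff
    if not called:
        return float("nan")
    true_pos = sum(1 for pair in called if pair in positives)
    return true_pos / len(called)

# Hypothetical predictions: (gene_a, gene_b) with a confidence score
predictions = [(("YFG1", "YFG2"), 0.9), (("YFG1", "YFG3"), 0.8),
               (("YFG2", "YFG4"), 0.7), (("YFG3", "YFG5"), 0.2)]

# Two "silver standards" (positives, negatives) covering different slices
standard_a = ({("YFG1", "YFG2")}, {("YFG3", "YFG5")})
standard_b = ({("YFG1", "YFG2"), ("YFG2", "YFG4")}, {("YFG1", "YFG3")})

print(precision_on_standard(predictions, *standard_a))  # 1.0
print(precision_on_standard(predictions, *standard_b))  # 0.666...
```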

  16. Biomedical Context (alternative slide location) • {You may want to address the informatics question first and then raise the medical/biological context, but it often flows better if you start with the biomedical context and use that to motivate the informatics question.}

  17. Background • {Review of informatics and biomedicine people need to know in order to understand the key contributions of the paper}

  18. Background
     • Gene Ontology (GO)
        • Taxonomy of gene function, 30K+ terms
        • Terms are assigned to genes manually; genes are considered related if they share a term
     • KEGG
        • Database of biological pathways
        • Mostly metabolic, manually curated
        • Genes in the same pathway are considered related
     • Each of these provides a biased coverage of gene-function space!
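A minimal sketch of how such a “related if they share a term or pathway” standard can be derived, assuming a toy annotation table; the genes and identifiers below are illustrative only, not from the paper.

```python
# Treat two genes as functionally related when they share at least one
# GO term or KEGG pathway, as the background slide describes.
from itertools import combinations

annotations = {              # gene -> set of GO / KEGG identifiers (toy data)
    "GENE1": {"GO:0006412", "ko00010"},
    "GENE2": {"GO:0006412"},
    "GENE3": {"ko00010"},
    "GENE4": {"GO:0008150"},
}

related = {
    frozenset((a, b))
    for a, b in combinations(annotations, 2)
    if annotations[a] & annotations[b]   # at least one shared term/pathway
}
print(related)   # GENE1-GENE2 (shared GO term) and GENE1-GENE3 (shared pathway)
```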

  19. Uneven gold-standard

  20. Different conclusions from different silver standards

  21. Background
     • GO is organized from most general (top) to most specific (bottom)
     • For validation, people often choose a “level” of GO at which they define GO annotations to be “meaningful”
     • E.g., all GO codes at level 5 or below = sufficiently precise predictions
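A small sketch of what such a “level” cutoff means in practice. The parent map below is hypothetical (the real ontology is a DAG of 30K+ terms), the cutoff of 2 stands in for the slide's “level 5 or below,” and shortest-path depth from the root is used here even though level conventions vary.

```python
# Keep only GO terms whose depth (distance from the root) meets a cutoff.

def depth(term, parents):
    """Shortest distance from a term to a root (a term with no parents)."""
    ps = parents.get(term, [])
    return 0 if not ps else 1 + min(depth(p, parents) for p in ps)

parents = {                   # child -> list of parents (toy ontology, not real GO)
    "GO:root": [],
    "GO:a": ["GO:root"],
    "GO:b": ["GO:a"],
    "GO:c": ["GO:b", "GO:a"], # a DAG: GO:c has two parents
}

cutoff = 2                    # stands in for "level 5 or below" on the slide
specific_enough = [t for t in parents if depth(t, parents) >= cutoff]
print(specific_enough)        # ['GO:b', 'GO:c']
```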

  22. Wide variability in GO depth annotation frequency

  23. Aims of Paper
     • {As in BMI 212, a listing of the specific aims of the paper. Usually no more than 3 (often fewer).}
     • {NOTE: the paper should initially be presented in the most positive light, as the authors would have presented it. The time for critique is after the “author perspective” presentation.}

  24. Aims of Paper
     • Define the problem of biased gold standards in high-throughput evaluations
     • Create a method for comparing prediction methods fairly
     • Build a manual gold standard and an associated web tool
     • Allow evaluations to report not only overall performance but also area-specific performance

  25. Methods Employed • {This is the key part of the presentation for the BMI crowd. This should be a presentation of the methods described in the paper at a sufficient technical level that people can discuss and evaluate them. Avoid detailed math/equations unless absolutely critical to the discussion.}

  26. Methods Employed
     • 6 post-doctoral biologists
     • Examine every GO code and vote “informative” or “not informative” as applied to a gene
     • 3 “informative” votes = useful category
     • <1 “informative” vote and >1000 annotations = not-useful category
     • The “not usefuls” are the key denominator for computations of precision/specificity
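A sketch of this curation rule as the slide summarizes it, reading “3 votes” as “at least 3 of the 6” and “<1 informative” as “no informative votes.” The GO identifiers are real, but the vote counts and annotation totals below are invented.

```python
# Classify GO terms from curator votes: >=3 of 6 "informative" votes -> useful;
# zero such votes plus >1000 annotated genes -> explicitly "not useful"
# (the not-useful set supplies the denominator for precision/specificity).

terms = {
    # GO term: (informative votes out of 6, number of genes annotated) -- invented
    "GO:0006260": (5, 120),    # DNA replication: specific, clearly useful
    "GO:0008150": (0, 6000),   # biological_process: catch-all, not useful
    "GO:0009987": (1, 4000),   # cellular process: vague, but left unclassified here
}

useful     = {t for t, (votes, n_genes) in terms.items() if votes >= 3}
not_useful = {t for t, (votes, n_genes) in terms.items() if votes < 1 and n_genes > 1000}
print(useful, not_useful)      # {'GO:0006260'} {'GO:0008150'}
```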

  27. Results • {Recapitulate major results. Usually by presenting main figures from the paper.}

  28. Results of selecting GO codes manually for Gold Standard

  29. Methods
     • With “gold standard” GO codes that they trust, the authors can now analyze methods/data sources and give a specific performance report on different areas (of biology).
     • They can also systematically remove GO topics in order to see whether there are dominant effects (e.g., remove ribosomes).
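A sketch of the “remove a topic and re-score” idea. All pairs, labels, scores, and category names below are hypothetical; the loop simply recomputes precision with each biological area held out to see whether one area (e.g. ribosome) is carrying the overall number.

```python
# Leave-one-category-out check for a dominant GO topic.

def precision(pairs, threshold=0.5):
    called = [p for p in pairs if p["score"] >= threshold]
    return sum(p["label"] for p in called) / len(called) if called else float("nan")

pairs = [  # hypothetical predictions with gold-standard labels and GO areas
    {"score": 0.9, "label": 1, "category": "ribosome"},
    {"score": 0.8, "label": 1, "category": "ribosome"},
    {"score": 0.7, "label": 0, "category": "DNA repair"},
    {"score": 0.6, "label": 1, "category": "transport"},
]

print("overall:", precision(pairs))                # 0.75
for cat in sorted({p["category"] for p in pairs}):
    held_out = [p for p in pairs if p["category"] != cat]
    print(f"without {cat}:", precision(held_out))  # drops to 0.5 without ribosome
```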

  30. Comparison of methods using new gold standard

  31. New method to compare/assess methods

  32. GRIFn website available (?)

  33. Authors' Conclusions • {A presentation of how the authors summarize their results and significance. Usually no more than 3 major points; often one.}

  34. Authors' Conclusions
     • Curated GO codes now provide a more trustworthy gold standard
     • Allows tools to be built that give:
        • Overall performance
        • Subarea-specific breakdown of performance
        • Direct comparison of different methods/data sources
     • Sets the bar on evaluation, and starts a discussion about community-wide standards

  35. Assessment of Paper: Informatics
     • {What are the major methodological (engineering) innovations in the paper, in your opinion?}
     • {Are the methods presented soundly, completely, and evaluated appropriately?}
     • {How general are the methods presented for use in other areas, either directly or with some effort by others?}

  36. Assessment of Paper: Informatics
     • Beautiful description and justification of the work. Clearly a general problem.
     • Well informed by research in the field, and by evaluation of the problems that arise in evaluation.
     • Solution applicable in many domains:
        • Close (KEGG, NLP, others)
        • Farther (any large-volume prediction activity)
     • Some bias in expert-based gold standards
     • Very good availability of a specific tool to allow use (cf. Maureen)

  37. Assessment of Paper: Biomedicine
     • {Has the paper helped make a new contribution to biomedical knowledge?}
     • {What is the domain significance of this paper?}
     • {Was it published in the right journal to reach the audience who should care about it the most?}

  38. Assessment of Paper: Biomedicine
     • Should greatly reduce the noise in papers about high-throughput predictions
     • Should create a new bar for performance
     • Systems biology and interaction informatics workers need to pay attention
     • Microarray information content may be lower, on average, than previously thought
     • The genomics audience is a good one, since they need to be aware of these relatively sophisticated informatics issues

  39. Detailed Concerns • {Particularly if you don’t like the paper, what are your technical informatics concerns about the method, implementation or evaluation?}

  40. Detailed Concerns
     • A little confused about the negative gold standard and how it is meant to be used (email in to Olga…)
     • There are still biases in the gold standard (e.g., GO) by omission that can’t be addressed without more work
     • What is the #2 bad GO area after “ribosome”? That example is used a lot in the paper.

  41. Summary Conclusions
     • {Do you accept all of the authors' conclusions previously presented?}
     • {Modified conclusions that you would accept}

  42. Summary Conclusions
     • Very important paper for the evaluation of these methods
     • It is now mandatory for future papers to address these issues
     • Authors' aims achieved:
        • Showed the problem
        • Proposed a general solution
        • Built and disseminated a specific solution

  43. References • {This paper, and other related papers that a BMI student studying for quals or otherwise interested could review.}

  44. References
     • Myers CL, Barrett DR, Hibbs MA, Huttenhower C, Troyanskaya OG. Finding function: evaluation methods for functional genomic data. BMC Genomics. 2006 Jul 25;7:187. PMID: 16869964
     • Lin N, Wu B, Jansen R, Gerstein M, Zhao H. Information assessment on predicting protein-protein interactions. BMC Bioinformatics. 2004 Oct 18;5:154. PMID: 15491499
     • Lee SG, Hur JU, Kim YS. A graph-theoretic modeling on GO space for biological interpretation of gene clusters. Bioinformatics. 2004 Feb 12;20(3):381-8. Epub 2004 Jan 22. PMID: 14960465
     • Jansen R, Gerstein M. Analyzing protein function on a genomic scale: the importance of gold-standard positives and negatives for network prediction. Curr Opin Microbiol. 2004 Oct;7(5):535-45. PMID: 15451510
     • Ben-Hur A, Noble WS. Choosing negative examples for the prediction of protein-protein interactions. BMC Bioinformatics. 2006 Mar 20;7 Suppl 1:S2. PMID: 16723005

  45. Faculty of 1000 entry

  46. Acknowledgments
     • {Thanks to those who contributed to the preparation of the presentation.}
     • {Don’t hesitate to contact the authors of the paper for clarifications. They are usually flattered that you are looking at their paper.}

  47. Acknowledgments
     • Maureen Hillenmeyer first brought this paper to my attention.
     • Olga provided a few clarifications that I needed after reading the paper.
     • BMI-exec encouraged me to do this as an example of how we would like students to select and present BMI JC papers this year.

  48. Thanks. {insert your email address}

  49. Thanks. russ.altman@stanford.edu
