Biomed Summarization With Citation Sentences
Authors: Prabha Yadav, Hoa T Dang, Anita de Waard, Lucy Vanderwende, Kevin B. Cohen
Prabha.Yadav@ucdenver.edu
Task Definition
• Given:
  • A “reference” paper
  • 10 “citing” papers that cite the reference paper
  • Citations in the citing paper
• Return:
  • Task 1A: Substrings of the reference paper that are the source of specific citations in the citing papers
  • Task 1B: Identify the facet of the reference span
  • Task 2: Write a 250-word summary of the reference paper that takes into account the citations
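The inputs and outputs listed in the task definition can be written down as a small data model. This is a minimal sketch, not the organizers' official data format: the class and field names (`Citance`, `MappedCitance`, `within_word_limit`) are hypothetical, and only the 250-word limit comes from the task definition.

```python
from dataclasses import dataclass
from typing import List

SUMMARY_WORD_LIMIT = 250  # Task 2 limit stated in the task definition


@dataclass
class Citance:
    """A citation sentence in a citing paper (hypothetical record layout)."""
    citing_paper: str
    text: str


@dataclass
class MappedCitance:
    """One system answer for Tasks 1A and 1B (hypothetical record layout)."""
    citance: Citance
    reference_spans: List[str]  # Task 1A: substrings of the reference paper
    facet: str                  # Task 1B: facet of the reference span


def within_word_limit(summary: str, limit: int = SUMMARY_WORD_LIMIT) -> bool:
    """Check a Task 2 summary against the word limit (whitespace tokens)."""
    return len(summary.split()) <= limit
```

Counting whitespace-separated tokens is an assumption made here for simplicity; the slides do not say how words are counted.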
Example of Citation Mapping
Reference Paper: Voorhoeve et al. (2006), A Genetic Screen Implicates miRNA-372 and miRNA-373 As Oncogenes in Testicular Germ Cell Tumors
We subsequently created a human miRNA expression library (miR-Lib) by cloning almost all annotated human miRNAs into our vector (Rfam release 6) (Figure S3). Additionally, we made a corresponding microarray (miR-Array) containing all miR-Lib inserts, which allow the detection of miRNA effects on proliferation.
Citing Paper: Osada and Takahashi (2007), MicroRNAs in biological processes and carcinogenesis
…suggesting that miR-21 overexpression may contribute to the malignant phenotype by suppressing critical apoptosis-related genes (115). Voorhoeve et al. (116) employed a novel strategy by combining an miRNA vector library and corresponding bar code array…
Facet: Method
Given a citance (or citing clause):
• Task 1A: Find the most pertinent sentence(s) in the reference paper to the citation text
• Task 1B: Identify the facet (from a given set of facets) of the reference span(s)
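Finding the most pertinent reference sentence for a citance can be illustrated with a simple lexical-overlap baseline. The sketch below ranks reference sentences by Jaccard similarity of their word sets. It is an assumed baseline for illustration only, not the method used by the organizers or any participating team, and the function names are hypothetical. The example sentences are taken from the slide above.

```python
import re
from typing import List, Tuple


def tokens(text: str) -> set:
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the token sets of two strings."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def rank_reference_sentences(citance: str,
                             sentences: List[str]) -> List[Tuple[float, str]]:
    """Return (score, sentence) pairs, most similar to the citance first."""
    scored = [(jaccard(citance, s), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


citance = ("Voorhoeve et al. (116) employed a novel strategy by combining "
           "an miRNA vector library and corresponding bar code array")
reference_sentences = [
    "We subsequently created a human miRNA expression library (miR-Lib) by "
    "cloning almost all annotated human miRNAs into our vector "
    "(Rfam release 6) (Figure S3).",
    "Additionally, we made a corresponding microarray (miR-Array) containing "
    "all miR-Lib inserts, which allow the detection of miRNA effects on "
    "proliferation.",
]
ranked = rank_reference_sentences(citance, reference_sentences)
for score, sentence in ranked:
    print(f"{score:.3f}  {sentence[:60]}")
```

Pure word overlap misses paraphrases such as "bar code array" versus "microarray", which is part of what makes the mapping task hard.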
Application: Summarize the reference paper from ordered faceted citances
Reference Paper: Voorhoeve et al. (2006), A Genetic Screen…
In mammals, a near-perfect complementarity between miRNAs and protein coding genes almost never exists, making it difficult to directly pinpoint relevant downstream targets of a miRNA. Several algorithms were developed that predict miRNA targets, most notably TargetScanS, PicTar, and miRanda (John et al., 2004, Lewis et al., 2005 and Robins et al., 2005). These programs predict dozens to hundreds of target genes per miRNA, making it difficult to directly infer the cellular pathways affected by a given miRNA. Furthermore, the biological effect of the downregulation depends greatly on the cellular context, which exemplifies the need to deduce miRNA functions by in vivo genetic screens in well-defined model systems. The cancerous process can be modeled by in vitro neoplastic transformation assays in primary human cells (Hahn et al., 1999). Using this system, sets of genetic elements required for transformation were identified. For example, the joint expression of the telomerase reverse transcriptase subunit (hTERT), oncogenic H-RASV12, and SV40-small t antigen combined with the suppression of p53 and p16INK4A were sufficient to render primary human fibroblasts tumorigenic (Voorhoeve and Agami, 2003).
Citing Papers
• Goal: to identify miRNAs that when overexpressed could substitute for p53 loss and allow continued proliferation in the context of Ras activation
• Method: Voorhoeve et al. (116) employed a novel strategy by combining an miRNA vector library and corresponding bar code array…
• Result: miR-372 and miR-373 were consequently found to permit proliferation and tumorigenesis of these primary cells carrying both oncogenic RAS and wild-type p53,
• Conclusion: probably through direct inhibition of the expression of the tumor-suppressor LATS2 and subsequent neutralization of the p53 pathway.
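The idea of building a summary from ordered faceted citances can be sketched in a few lines: sort the citances by facet, concatenate them, and cut the result at the word limit. This is only an illustrative sketch. The facet order (Goal, Method, Result, Conclusion) follows the labels shown on this slide, while the function name, the input layout, and the plain truncation step are assumptions.

```python
from typing import List, Tuple

# Facet labels as shown on the slide; treating this as the sort order is an assumption.
FACET_ORDER = ["Goal", "Method", "Result", "Conclusion"]


def summarize(faceted_citances: List[Tuple[str, str]],
              word_limit: int = 250) -> str:
    """Order (facet, text) pairs by facet and truncate to word_limit words."""
    rank = {facet: i for i, facet in enumerate(FACET_ORDER)}
    # Unknown facets sort last; sorted() is stable, so ties keep input order.
    ordered = sorted(faceted_citances,
                     key=lambda pair: rank.get(pair[0], len(FACET_ORDER)))
    words = " ".join(text for _, text in ordered).split()
    return " ".join(words[:word_limit])


citances = [
    ("Result", "miR-372 and miR-373 were consequently found to permit "
               "proliferation and tumorigenesis of these primary cells"),
    ("Goal", "to identify miRNAs that when overexpressed could substitute "
             "for p53 loss"),
    ("Method", "Voorhoeve et al. employed a novel strategy by combining an "
               "miRNA vector library and corresponding bar code array"),
]
print(summarize(citances))
```

A real system would also need to merge overlapping citances and smooth the joined text; truncation is used here only to respect the 250-word limit from the task definition.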
Size of Corpus
[Table: counts of annotations; number of reference papers]
Conclusions
• 15 teams participated in the shared task, a significant impact for the first year
• Many valuable lessons learnt about automation of quality control
• Significant code base developed for public release
• Data available free of charge: http://www.nist.gov/tac/2014/BiomedSumm/data.html
Thank You