
National Centre for Text Mining

National Centre for Text Mining. John Keane, NaCTeM Co-director, University of Manchester.


Presentation Transcript


  1. National Centre for Text Mining. John Keane, NaCTeM Co-director, University of Manchester

  2. Welcome To All
     • JISC, BBSRC, EPSRC
     • National Agencies (British Libraries, HMCE, MoD)
     • Regional Agencies
     • Industry (pharma, software-related, etc.)
     • Academic community (Univs, DCC, CURL, etc.)
     • Thanks to the host institutions
     • Thanks to: Anne Trefethen, Ross King, Leona Carpenter

  3. Funding Bodies, Community etc.
     Thanks to the funding bodies (JISC (JCSR), BBSRC, EPSRC) and the UK and international Text Mining Community for recognising the potential impact and significance of Text Mining for the bio-sector and the wider academic community, and for articulating the need for a National Centre

  4. Invited Speakers/Panellists
     • Terri Attwood, University of Manchester
     • Clifford Lynch, Coalition for Networked Information
     • Rob Procter, National Centre for e-Social Science
     • Dietrich Rebholz-Schuhmann, European Bioinformatics Institute

  5. Self-funded Partners
     • University of California, Berkeley: Ray Larson
     • University of Geneva: Margaret King
     • University of Tokyo: Jun-ichi Tsujii
     • San Diego Supercomputer Centre: Reagan Moore

  6. Involvement
     MANCHESTER
     • Bill Black; Informatics
     • Julia Chruszcz; MIMAS, Manchester Computing
     • Carole Goble; ESNW and Computer Science
     • John McCarthy; MIB and Faculty of Life Sciences
     • John McNaught; Informatics
     LIVERPOOL
     • Paul Watry; University Library and Dept of English
     SALFORD
     • Sophia Ananiadou; Computing, Science and Engineering
     Wendy Johnson, now MerseyBio

  7. Text Mining – definition (Auvril and Searsmith (Illinois) 2003)
     • Non-trivial extraction of implicit, previously unknown, and potentially useful information from (large amounts of) textual data
     • Exploration and analysis of textual (natural-language) data by automatic and semi-automatic means to discover new knowledge and update existing knowledge
     • What is “previously unknown” information?
       • Strict: information that not even the authors knew
       • Lenient: rediscovering information that the author encoded in the text

  8. [Diagram labels: Text Mining; Users; Bio-science; Medicine; Science; Engineering; Humanities; User interface; Ontologies; Term & information extraction; Data mining; Information retrieval; Digital libraries]

  9. Text Mining – vision
     • (Bio)DBs with accurate, valid, exhaustive, rapidly updated data
       • only 12% of TOXLINE users find what they want
       • significant error rate and gaps in manually curated data
     • Drug discovery costs slashed; animal experimentation reduced through early identification of unpromising paths
       • $800M over 12 years to develop a new drug -> reduce by 2 years
     • New insights gained through integration and exploitation of experimental results, (bio)DBs, and scientific knowledge
     • Product development archives and patents yield new directions for R&D
     Searching yields FACTS rather than documents

  10. Text Mining – realism (Computerworld 2004)
     • Technical: the technology is becoming mature, but issues of efficiency and scalability remain; a myriad set of tools needs to be integrated
     • Person-intensive: a skill set is required to understand the domain (e.g. to develop an ontology) and to interpret and analyse results

  11. NaCTeM so far …
     • £1M over 3 years (review after 2 years) – co-funding by institutions of ~£800K
     • 6 core staff – joined October ’04 to January ’05
     • Requirements gathering and technical development phases begun
     • UGeneva have received funding for a part-time post on ‘evaluation’
     • Planned move to Manchester Interdisciplinary Biocentre in summer 2005
     Thanks to all involved, and the NaCTeM team, in particular Richard Barker for organising
