
Data Integration and Extraction over Molecular Biological Data

Cui Tao. Supported by NSF.


Presentation Transcript


  1. Data Integration and Extraction over Molecular Biological Data • Cui Tao • Supported by NSF

  2. Motivation • Online biological data: • Highly diverse in granularity and variety • Various formats • Different terminologies, ID systems, units

  3. How to Build a Gene Extraction Ontology? • Concepts • Relationship sets • Constraints • Data Frames
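
The four components named on this slide can be pictured as a small data structure. The sketch below is only an illustration, not the presenter's implementation: every class name, field, and sample value is an assumption, and `DataFrame` here means a value recognizer for a concept (it has nothing to do with the pandas type). The "at most one sequence per gene" constraint is made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataFrame:
    """Hypothetical recognizer for one concept: a value pattern plus context keywords."""
    value_pattern: str
    keywords: List[str] = field(default_factory=list)


@dataclass
class Concept:
    name: str
    data_frame: Optional[DataFrame] = None


@dataclass
class RelationshipSet:
    source: str
    target: str
    # Participation constraint (assumed form): None means unbounded.
    max_per_source: Optional[int] = None


@dataclass
class ExtractionOntology:
    concepts: Dict[str, Concept] = field(default_factory=dict)
    relationship_sets: List[RelationshipSet] = field(default_factory=list)

    def add_concept(self, concept: Concept) -> None:
        self.concepts[concept.name] = concept

    def add_relationship_set(self, rel: RelationshipSet) -> None:
        # Both ends of a relationship set must be declared concepts.
        for end in (rel.source, rel.target):
            if end not in self.concepts:
                raise ValueError(f"unknown concept: {end}")
        self.relationship_sets.append(rel)


# Illustrative content only; concept names are taken from the slides.
ontology = ExtractionOntology()
ontology.add_concept(Concept("Gene"))
ontology.add_concept(Concept("Gene Sequence", DataFrame(r"[GATC]+", ["sequence"])))
ontology.add_relationship_set(RelationshipSet("Gene", "Gene Sequence", max_per_source=1))
```

Keeping constraints on the relationship set rather than on the concept is one possible design choice; it lets the same concept participate in several relationship sets with different limits.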

  4. How to Build a Gene Extraction Ontology? • (G*A*T*C*)* • (G*A*U*C*)*
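
The two patterns on this slide accept any string over the alphabets {G, A, T, C} and {G, A, U, C}, that is, DNA and RNA sequences. A minimal Python sketch of such a recognizer follows. It uses the equivalent character classes `[GATC]` and `[GAUC]` instead of the nested-star form to avoid needless regex backtracking, and it requires at least one character, which is an assumption (the slide's patterns also match the empty string). The function name `classify_sequence` is hypothetical.

```python
import re

# Character-class equivalents of (G*A*T*C*)* and (G*A*U*C*)*,
# with "+" instead of "*" so that empty input is rejected (assumption).
DNA_PATTERN = re.compile(r"[GATC]+")
RNA_PATTERN = re.compile(r"[GAUC]+")


def classify_sequence(text):
    """Return the list of sequence kinds whose pattern matches the whole input."""
    candidate = text.strip().upper()
    kinds = []
    if DNA_PATTERN.fullmatch(candidate):
        kinds.append("DNA")
    if RNA_PATTERN.fullmatch(candidate):
        kinds.append("RNA")
    return kinds
```

A sequence containing neither T nor U, such as `GACC`, matches both patterns, so the function returns both kinds and leaves disambiguation to surrounding context.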

  5. Knowledge Sources • Gene Ontology • Thousands of terms (Molecular Function, Biological Process, Cellular Component) • All Species Toolkit • 1,231,935 species names • Protein Databases • Thousands of protein names

  6. Extraction Rules • Statistical NLP • Machine learning • Naïve Bayes • Hidden Markov Models • Decision Trees

  7. Integration

  8. Integration • Information Hidden behind Links

  9. Query-based Extraction • Query the gene extraction ontology • Find applicable resources • Fill out forms • Extract information
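
The four steps on this slide can be sketched as a small pipeline. This is a toy illustration under stated assumptions, not the system described in the talk: the `Resource` type, its fields, and the `submit` callback (a stand-in for submitting a real web form and extracting values from the result page) are all hypothetical, and no network access is performed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Resource:
    """Hypothetical description of one online source."""
    name: str
    provides: Set[str]   # ontology concepts this source can supply
    form_field: str      # name of the search-form input
    # Stand-in for "submit the form and extract values from the result page".
    submit: Callable[[Dict[str, str]], Dict[str, str]]


def find_applicable(resources: List[Resource], wanted: Set[str]) -> List[Resource]:
    # Step 2: keep only sources that supply at least one requested concept.
    return [r for r in resources if r.provides & wanted]


def answer_query(resources: List[Resource], gene_name: str, wanted: Set[str]) -> Dict[str, str]:
    # Step 1 (querying the ontology) is assumed to have produced `wanted`.
    result: Dict[str, str] = {}
    for resource in find_applicable(resources, wanted):
        form = {resource.form_field: gene_name}   # Step 3: fill out the form
        record = resource.submit(form)            # Step 4: extract information
        for concept in resource.provides & wanted:
            if concept in record:
                # The first source to supply a concept wins.
                result.setdefault(concept, record[concept])
    return result
```

For example, a query asking for a gene's sequence and its protein's function would pass `wanted = {"Gene Sequence", "Protein Function"}`, and only sources whose `provides` set overlaps those concepts would be contacted.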

  10. Query-based Extraction • Example: “Find the alfR gene, its sequence, its protein's function, and any mutant that inhibits this gene.” • [Diagram labels: Gene Sequence, Mutant, Gene Name, Gene, Mutant Function, Protein Function]

  11. Contribution • Provides a way to automatically integrate online biological data from different sources • Provides an approach that finds suitable online resources, fills out their forms, and extracts data based on the user's query
