
PathCase: A Web-Based Exploratory Querying and Visualization Tool for Biological Pathways

Z. Meral Özsoyoğlu, Case Western Reserve University, Cleveland, Ohio 44106


Presentation Transcript


  1. PathCase: A Web-Based Exploratory Querying and Visualization Tool for Biological Pathways • Z. Meral Özsoyoğlu, Case Western Reserve University, Cleveland, Ohio 44106

  2. “Digital” Biology • Biology and the life sciences have become increasingly “data rich” over the past decade • Rapid growth of biological data (distributed, heterogeneous) due to: • investments in public and private resources, • significant advances in data generation, storage, analysis, web-based availability, and sharing technologies, • emerging large-scale biological data gathering technologies

  3. More growth in amount and diversity • Huge investments in • developing large biological information resources, • assembling this information in public databases. • Many such resources and tools are available: • NCBI’s GenBank, PubMed, BLAST, MGI’s tools and databases, etc. • Continued explosive growth in the amount and diversity of biological and biochemical data is expected in the next century.

  4. Medical informatics and physiological data • Also very large, diverse, non-standard, and distributed

  5. Biological Data Challenges • In addition to being large, diverse, and distributed, biological data has three important characteristics: • Complexity, • Heterogeneity, and • Evolution of both the data and the schema.

  6. Biological Data is Complex • It is very rich in metadata and requires metadata management techniques. • It is large, temporal, and historical, and requires special knowledge warehouse design and management techniques. • It has inherently deeply nested hierarchical structures (e.g., ontologies), or is best modeled as graph structures at the conceptual level (e.g., metabolic pathways or signaling pathways).

  7. Biological Data is Heterogeneous • It involves a wide array of data types, including text, image, and sequence data, as well as streaming data (e.g., medical sensor data), temporal data, and incomplete or missing data. • Its sources and formats are also heterogeneous.

  8. Biological Data is very Dynamic • Data management techniques effectively handle dynamic data content. • But dynamic schema evolution poses challenges for data management: • applications and software tools are built on the schema, so they must be updated whenever the schema evolves.

  9. Research • Off-the-shelf data management software tools will not be sufficient for the data management needs of “digital biology”. • Integration of existing technologies for biological data and development of new data management techniques are needed. • NIH BISTI workshop on Digital Biology • http://www.bisti.nih.gov/2003meeting/

  10. PathCase: Case Pathways DataBase System • An integrated software tool for • storing, • visualizing, • querying, and • analyzing biological pathways at different levels of genetic, molecular, biochemical, and organismal detail. • http://nashua.case.edu/pathways

  11. Data Model • Graph-structured database (hypergraph) • nodes: substrates and products • hyperedges: processes (reactions) • represented using a relational database • Querying and visualization • based on the graph conceptual view.
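One way to picture slide 11 is a minimal sketch of a hypergraph stored in relational tables. The table names, column names, and the example reaction below are assumptions chosen for illustration and are not taken from the PathCase schema. Each process is a hyperedge; a join table records which molecular entities it consumes and produces.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with one example process (hyperedge).

    The schema is hypothetical: it only illustrates how a hypergraph
    can be flattened into relational tables.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE molecular_entity (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE process (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE process_entity (
            process_id INTEGER REFERENCES process(id),
            entity_id  INTEGER REFERENCES molecular_entity(id),
            role       TEXT CHECK (role IN ('substrate', 'product'))
        );
    """)
    entities = ["glucose", "ATP", "glucose-6-phosphate", "ADP"]
    conn.executemany(
        "INSERT INTO molecular_entity (name) VALUES (?)",
        [(name,) for name in entities],
    )
    conn.execute("INSERT INTO process (id, name) VALUES (1, 'glucose phosphorylation')")
    roles = [
        ("glucose", "substrate"),
        ("ATP", "substrate"),
        ("glucose-6-phosphate", "product"),
        ("ADP", "product"),
    ]
    for name, role in roles:
        # Attach each entity to the single hyperedge with its role.
        conn.execute(
            "INSERT INTO process_entity (process_id, entity_id, role) "
            "SELECT 1, id, ? FROM molecular_entity WHERE name = ?",
            (role, name),
        )
    return conn

def one_step_products(conn, substrate_name):
    """Names of entities produced by any process that consumes the substrate."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.name
        FROM molecular_entity AS s
        JOIN process_entity AS ps ON ps.entity_id = s.id AND ps.role = 'substrate'
        JOIN process_entity AS pp ON pp.process_id = ps.process_id AND pp.role = 'product'
        JOIN molecular_entity AS p ON p.id = pp.entity_id
        WHERE s.name = ?
        ORDER BY p.name
        """,
        (substrate_name,),
    )
    return [row[0] for row in rows]
```

Calling `one_step_products(build_demo_db(), "glucose")` follows the hyperedge from a substrate to its products, which is the graph view that the querying and visualization layers would work with.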

  12. Other systems and resources • Reactome • KEGG • BioCyc & Pathway Tools • PATIKA • Cytoscape • BioCarta • and others.

  13. Data Model • Pathway: interconnected arrangements of processes (representing the functional role of genes in the genome). • Process: a reaction (or step) in a pathway involving one genetically unique gene product (substrates, products, co-factors, inhibitors, and activators of a reaction are all molecular entities in this perspective). • Molecular Entity: the general name given to any entity participating in a process, such as a basic molecule, protein, enzyme, gene, or amino acid.
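The three definitions on slide 13 could be sketched as plain Python data classes. This is only an illustration under stated assumptions: the class and field names are hypothetical, and the example entities are chosen for illustration, not drawn from the PathCase database.

```python
from dataclasses import dataclass, field
from typing import List, Set

@dataclass(frozen=True)
class MolecularEntity:
    """Any participant in a process (basic molecule, protein, enzyme, gene, ...)."""
    name: str
    kind: str

@dataclass
class Process:
    """One reaction or step, tied to a single gene product."""
    name: str
    gene_product: MolecularEntity
    substrates: List[MolecularEntity] = field(default_factory=list)
    products: List[MolecularEntity] = field(default_factory=list)
    cofactors: List[MolecularEntity] = field(default_factory=list)
    inhibitors: List[MolecularEntity] = field(default_factory=list)
    activators: List[MolecularEntity] = field(default_factory=list)

    def participants(self) -> Set[MolecularEntity]:
        """Every molecular entity that plays any role in this process."""
        return {self.gene_product, *self.substrates, *self.products,
                *self.cofactors, *self.inhibitors, *self.activators}

@dataclass
class Pathway:
    """An interconnected arrangement of processes."""
    name: str
    processes: List[Process] = field(default_factory=list)

    def molecular_entities(self) -> Set[MolecularEntity]:
        """Union of the participants of all processes in the pathway."""
        found: Set[MolecularEntity] = set()
        for process in self.processes:
            found |= process.participants()
        return found

# Illustrative data only.
glucose = MolecularEntity("glucose", "basic molecule")
atp = MolecularEntity("ATP", "basic molecule")
g6p = MolecularEntity("glucose-6-phosphate", "basic molecule")
adp = MolecularEntity("ADP", "basic molecule")
hexokinase = MolecularEntity("hexokinase", "enzyme")

step = Process("glucose phosphorylation", hexokinase,
               substrates=[glucose, atp], products=[g6p, adp])
demo_pathway = Pathway("demo pathway (single step)", [step])
```

Keeping every role as a list of `MolecularEntity` mirrors the slide's point that substrates, products, co-factors, inhibitors, and activators are all the same kind of object and differ only in the role they play in a process.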

  14. Browser view

  15. PathCase usage statistics: hits from 62 countries. User statistics

  16. Browse Pathways

  17. Metabolic Pathway groups

  18. Select for pathway details

  19. Select for interactive pathway graph

  20. Database content • Metabolic Pathways (39) • 37 from [Michal, G. Biochemical Pathways, John Wiley & Sons Inc., 1999] • 2 (Folate and Homocysteine) for human and mouse, by Joe Nadeau and Toshimori Kitami • 876 processes (for different organisms) • Organisms: • Human, mouse, animals, prokarya, plants & yeasts, unspecified

  21. Server-Client Architecture: Web-based Pathways Query and Visualization sub-system • [Architecture diagram; recoverable labels, in extraction order: Server, Client, Database, Rich Client, Web Service, Windows User Interface, Data Object Classes, SOAP, Object Access, XML, Interfaces for Access/Edit, Graph Windows Control, Basic Queries, SQL Queries, Advanced Queries, XML Graph Access, Graph, Web Browser, XML Graphing/Layout, Web Site, Graph Applet, Graph Generation, HTML User Interface, HTML Display, HTML Graph Caching, Doc]

  22. Exploratory Querying and Visualization • Viewing the whole network of pathways • Viewing at multiple levels of abstraction • Querying specific properties of any pathway component at any level of granularity • Path queries • Neighborhood queries • Different forms of queries and output display: • textual vs. graphical queries • built-in vs. parametrized queries • tabular vs. graphical query outputs • advanced query interface
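A neighborhood query such as the one listed on slide 22 can be sketched as a breadth-first search over the hypergraph. The slides do not describe how PathCase implements it, so the function name, the dictionary-based graph encoding, and the placeholder entities A to D are all assumptions. Here two entities are neighbors when they take part in the same process, regardless of direction.

```python
from collections import deque

def neighborhood(processes, start, max_steps):
    """Return {entity: distance} for entities within max_steps processes of start.

    `processes` maps a process id to a (substrates, products) pair.
    The start entity itself is excluded from the result.
    """
    # Index each entity to the processes it participates in.
    by_entity = {}
    for pid, (substrates, products) in processes.items():
        for entity in set(substrates) | set(products):
            by_entity.setdefault(entity, []).append(pid)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        entity = queue.popleft()
        if distance[entity] == max_steps:
            continue  # do not expand past the requested radius
        for pid in by_entity.get(entity, []):
            substrates, products = processes[pid]
            for neighbor in set(substrates) | set(products):
                if neighbor not in distance:
                    distance[neighbor] = distance[entity] + 1
                    queue.append(neighbor)
    del distance[start]
    return distance

# Hypothetical linear chain: A -> B -> C -> D
demo_processes = {
    "r1": (["A"], ["B"]),
    "r2": (["B"], ["C"]),
    "r3": (["C"], ["D"]),
}
```

With `demo_processes`, a radius of 1 around `"A"` reaches only `"B"`, and a radius of 2 also reaches `"C"`. A direction-aware variant would expand only from substrates to products.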

  23. Calls the query interface for finding the paths between two molecular entities

  24. Query interface for “Find paths between two molecular entities” query
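The "find paths between two molecular entities" query shown on slides 23 and 24 could be approximated by a bounded depth-first search. The sketch below is an assumption about the general technique, not PathCase's actual algorithm; the function name, the `max_processes` limit, and the placeholder entities are hypothetical. A path alternates entities and processes and follows each process from a substrate to a product.

```python
def find_paths(processes, source, target, max_processes=5):
    """Enumerate simple directed paths from source to target.

    `processes` maps a process id to a (substrates, products) pair.
    Each path is a list alternating entity, process id, entity, ...
    and uses at most `max_processes` processes.
    """
    paths = []

    def extend(entity, path, visited):
        if entity == target:
            paths.append(path)
            return
        if (len(path) - 1) // 2 >= max_processes:
            return  # path already uses the maximum number of processes
        for pid, (substrates, products) in processes.items():
            if entity not in substrates:
                continue
            for nxt in products:
                if nxt not in visited:  # keep paths simple (no repeated entity)
                    extend(nxt, path + [pid, nxt], visited | {nxt})

    extend(source, [source], {source})
    return paths

# Hypothetical network with two routes from A to C.
demo_network = {
    "r1": (["A"], ["B"]),
    "r2": (["B"], ["C"]),
    "r3": (["A"], ["C"]),
}
```

For `demo_network`, `find_paths(demo_network, "A", "C")` yields both the direct route through `r3` and the two-step route through `r1` and `r2`. Setting `max_processes=1` keeps only the direct route. The bound matters on real pathway networks, where the number of simple paths can grow very quickly.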
