MSD Search and Visualization Tools
Jawahar Swaminathan
Issues
• The raw database is large and complex:
  • 27,190+ PDB entries
  • 120+ tables in the warehouse, many very large
  • Cross-referenced against UniProt, PubMed...
• Need to expose as much of the data as possible, without making the interface too complex
• We want to cater for three categories of user:
  • "Novice" user
  • Experienced user
  • Expert user
biobar
A toolbar search application for Mozilla/Netscape or Firefox browsers
biobar
• All major bioinformatics databases covered.
• Search genomic, proteomic, structural, literature and functional databases.
• Links to deposition and analysis tools for sequence and structural data.
MSDlite
A simple form-based query system to search the MSD databases
AstexViewer™@MSD-EBI
• View structures as wireframe, backbone or ribbons
• Built-in sequence viewer
• Calculate and display surfaces
• Various display options:
  • Ramachandran plots
  • Distance matrix
  • B-factors
Based on the AstexViewer™ from Astex Technology Limited and modified under licence by the MSD group
Simple search interface
• Strengths:
  • simple, easy-to-use form
  • allows multiple search fields to be combined
  • relatively fast, despite performing quite complex SQL queries
• Weaknesses:
  • does not expose the power of a relational database
  • user can't specify the relationship between search fields:
    • "name" AND "title" AND "keyword"
    • "name" OR "title" OR "keyword"
    • ( "name" OR "title" ) AND NOT "keyword"
  • the search form is defined by the authors of the search system, not the author of a query
Describing complex searches
• We want to allow the user to entirely control their query
• Since HTML forms are inherently static, we'll use an applet to provide a dynamic "form" that will let the user:
  • choose the fields to be searched
  • specify the relationships between search fields
  • choose the result fields and how results are presented
  • perform "complex" sub-queries, e.g. SSM, FASTA
A graphical database search system
• MSDpro uses an applet for constructing queries and a server to execute them
• Avoids the need for the user to understand a complex database schema or know SQL
• The user describes their query entirely graphically, including logical operations such as AND, OR and NOT
• Applet generates an XML description of the user’s query, which is sent to the MSD query server and converted to SQL automatically
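The idea of describing a query as a tree of AND, OR and NOT operations and serialising it to XML can be sketched as follows. This is only an illustration: the element names (`query`, `and`, `or`, `not`, `field`) and the nested-tuple input are assumptions made for the example, not the actual MSDpro query format.

```python
import xml.etree.ElementTree as ET


def to_xml(node):
    """Convert one node of a nested-tuple query into an XML element.

    A leaf is ("field", name, value); a branch is
    ("and" | "or" | "not", child, ...). Both shapes are hypothetical.
    """
    kind = node[0]
    if kind == "field":
        _, name, value = node
        leaf = ET.Element("field", {"name": name})
        leaf.text = value
        return leaf
    if kind not in ("and", "or", "not"):
        raise ValueError(f"unknown node type: {kind!r}")
    if kind == "not" and len(node) != 2:
        raise ValueError("'not' takes exactly one child")
    branch = ET.Element(kind)
    for child in node[1:]:
        branch.append(to_xml(child))
    return branch


def build_query(expression):
    """Wrap the expression tree in a <query> root and return it as text."""
    root = ET.Element("query")
    root.append(to_xml(expression))
    return ET.tostring(root, encoding="unicode")


# ( "name" OR "title" ) AND NOT "keyword"
example = (
    "and",
    ("or", ("field", "name", "lysozyme"), ("field", "title", "lysozyme")),
    ("not", ("field", "keyword", "mutant")),
)
print(build_query(example))
```

A tree is used because it captures grouping such as ( "name" OR "title" ) AND NOT "keyword" directly, which a flat list of form fields cannot express.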
MSDpro
A flexible graphical search interface for advanced searching
Automatic SQL query generation
• The query server is a Java servlet:
  • accepts a query description as XML
  • converts the user’s query description into a true SQL query, which is then submitted to the search database
• Searches can include components that are executed outside of the database, e.g. sequence similarity (determined using FASTA) or structural similarity (determined using SSM)
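The conversion from an XML query description to SQL can be sketched roughly as below. The real query server is a Java servlet; this Python sketch only illustrates the recursive translation. The XML element names, the `COLUMNS` mapping and the `entry` table are all assumptions invented for the example, and components executed outside the database (FASTA, SSM) are not covered.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from search-field names to database columns.
COLUMNS = {"name": "e.name", "title": "e.title", "keyword": "e.keyword"}


def to_sql(element, params):
    """Translate one XML node into a WHERE fragment, collecting bind values."""
    if element.tag == "field":
        column = COLUMNS[element.get("name")]  # KeyError for unknown fields
        params.append(f"%{element.text or ''}%")
        return f"{column} LIKE ?"
    children = [to_sql(child, params) for child in element]
    if element.tag == "not":
        if len(children) != 1:
            raise ValueError("<not> takes exactly one child")
        return f"NOT ({children[0]})"
    if element.tag in ("and", "or"):
        if not children:
            raise ValueError(f"<{element.tag}> needs at least one child")
        return "(" + f" {element.tag.upper()} ".join(children) + ")"
    raise ValueError(f"unexpected element <{element.tag}>")


def xml_to_sql(xml_text):
    """Return (sql, params) for a <query> document with one root expression."""
    root = ET.fromstring(xml_text)
    if root.tag != "query" or len(root) != 1:
        raise ValueError("expected <query> with exactly one child")
    params = []
    where = to_sql(root[0], params)
    return f"SELECT e.entry_id FROM entry e WHERE {where}", params
```

User-supplied values are passed as bind parameters rather than pasted into the SQL text, so the generated statement stays safe regardless of what the user types.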
Search system is generic
• The search system is designed to be entirely database-independent
• All information about the architecture of the search database is stored in XML dictionaries
• Similarly, the search and result fields which the applet presents to the user are controlled by a dictionary
• The entire system could move to a completely different database simply by modifying the dictionaries
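A minimal sketch of the dictionary idea, assuming a made-up XML layout in which each `field` element maps a logical search field to a table and column. The two dictionaries and all table and column names are hypothetical; the point is only that the query code stays the same when the dictionary is swapped.

```python
import xml.etree.ElementTree as ET

# Two hypothetical dictionaries describing the same logical fields
# in two differently structured databases.
DICTIONARY_A = """
<dictionary>
  <field name="title" table="entry" column="title"/>
  <field name="resolution" table="experiment" column="resolution"/>
</dictionary>
"""

DICTIONARY_B = """
<dictionary>
  <field name="title" table="structures" column="struct_title"/>
  <field name="resolution" table="structures" column="res_angstrom"/>
</dictionary>
"""


def load_dictionary(xml_text):
    """Parse a dictionary into {field name: (table, column)}."""
    root = ET.fromstring(xml_text.strip())
    return {
        field.get("name"): (field.get("table"), field.get("column"))
        for field in root.iter("field")
    }


def select_for(field, dictionary):
    """Build a SELECT for a logical field using whichever dictionary is loaded."""
    table, column = dictionary[field]
    return f"SELECT {column} FROM {table}"


for source in (DICTIONARY_A, DICTIONARY_B):
    print(select_for("title", load_dictionary(source)))
```

Because `select_for` never names a table or column itself, moving to a different schema in this sketch means editing only the dictionary text.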
Java server architecture
[Architecture diagram; labels recovered from the figure: Methods DB and external object ontology, User interface DB, Methods, Interface, Ontology]
Web-services
Some of the new services from MSD are designed as web-services:
• web-services are network-based services with published method signatures
• they can be accessed over HTTP, using the SOAP protocol, from any language with a SOAP library
• The same services used within MSDpro will be accessible to any SOAP client
• The MSD query engine will also be available as a web-service, allowing users to submit queries programmatically
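To make the SOAP access pattern concrete, here is a sketch that builds a SOAP 1.1 request using only the Python standard library. The envelope namespace is the standard SOAP 1.1 one; the endpoint URL, the service namespace, the method name `submitQuery` and the argument name `queryXml` are placeholders invented for the example, not published MSD method signatures. The request is constructed but never sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SERVICE_NS = "urn:example:msd-query"        # hypothetical service namespace
ENDPOINT = "http://example.org/msd/query"   # hypothetical endpoint URL


def build_envelope(method, arguments):
    """Return a SOAP 1.1 envelope (bytes) calling `method` with string arguments."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in arguments.items():
        ET.SubElement(call, name).text = value
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def build_request(method, arguments):
    """Prepare (but do not send) an HTTP POST carrying the envelope."""
    return urllib.request.Request(
        ENDPOINT,
        data=build_envelope(method, arguments),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{SERVICE_NS}#{method}"',
        },
        method="POST",
    )


request = build_request("submitQuery", {"queryXml": "<query/>"})
print(request.data.decode("utf-8"))
```

In practice a SOAP library would generate this envelope from the service's published description; building it by hand here just shows what travels over HTTP.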