
MSD Search tools

Introducing a cutting-edge search interface allowing users to dynamically control queries and visualize data for in-depth understanding. This tool simplifies complex database searches, especially for non-experts, across multiple scientific fields.

Presentation Transcript


  1. MSD Search tools

  2. MSDlite

  3. MSDlite

  4. The “Atlas” Pages

  5. The Atlas: Ligands

  6. The Atlas: Sequence

  7. Simple search interface
     • Strengths:
       • simple, easy-to-use form
       • allows multiple search fields to be combined
       • relatively fast, despite performing quite complex SQL queries
     • Weaknesses:
       • does not expose the power of a relational database
       • the user can't specify the relationship between search fields:
         • "name" AND "title" AND "keyword"
         • "name" OR "title" OR "keyword"
         • ( "name" OR "title" ) AND NOT "keyword"
       • the search form is defined by the authors of the search system, not the author of a query
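
The three relationships listed on this slide can be made concrete with a small sketch. The following is an illustration only, not the MSD implementation: it assumes a query is held as a tree of `Term`, `And`, `Or` and `Not` nodes (hypothetical class names) and that each search field maps to a column of the same name matched with `LIKE`. It turns such a tree into a parameterised SQL `WHERE` fragment.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Term:
    field: str  # column to search (hypothetical column names)
    value: str  # text the user typed


@dataclass
class Not:
    child: "Node"


@dataclass
class And:
    children: List["Node"]


@dataclass
class Or:
    children: List["Node"]


Node = Union[Term, Not, And, Or]

# Only these fields may appear in a query; anything else is rejected.
ALLOWED_FIELDS = {"name", "title", "keyword"}


def to_where(node: Node) -> Tuple[str, list]:
    """Return a WHERE fragment with ? placeholders plus its parameters."""
    if isinstance(node, Term):
        if node.field not in ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {node.field}")
        return f"{node.field} LIKE ?", [f"%{node.value}%"]
    if isinstance(node, Not):
        sql, params = to_where(node.child)
        return f"NOT ({sql})", params
    if isinstance(node, (And, Or)):
        if not node.children:
            raise ValueError("AND/OR needs at least one child")
        joiner = " AND " if isinstance(node, And) else " OR "
        parts, params = [], []
        for child in node.children:
            sql, child_params = to_where(child)
            parts.append(f"({sql})")
            params.extend(child_params)
        return joiner.join(parts), params
    raise TypeError(f"unsupported node: {node!r}")


# ( "name" OR "title" ) AND NOT "keyword"
query = And([
    Or([Term("name", "kinase"), Term("title", "kinase")]),
    Not(Term("keyword", "human")),
])
where, params = to_where(query)
print(where)
print(params)
```

A fixed HTML form can only ever produce one of these tree shapes, which is the weakness the slide describes; letting the user build the tree removes that restriction.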

  8. Describing complex searches
     • We want to allow the user to entirely control their query
     • Since HTML forms are inherently static, we'll use an applet to provide a dynamic "form" that will let the user:
       • choose the fields to be searched
       • specify the relationships between search fields
       • choose the result fields and how results are presented
       • perform "complex" sub-queries, e.g. SSM, FASTA

  9. Graphical DB search system
     • MSDpro uses an applet for constructing queries and a server to execute them
     • Avoids the need for the user to understand a complex database schema or know SQL
     • The user describes their query entirely graphically, including logical operations such as AND, OR and NOT
     • Applet generates an XML description of the user’s query, which is sent to the MSD query server and converted to SQL automatically
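
As a rough illustration of the client side of this design, the sketch below serialises a query into an XML description. The slides do not show the actual MSDpro XML format, so the element and attribute names (`query`, `select`, `field`, `where`, `term`, `and`, `or`, `not`) are assumptions made up for this example, and the query is represented as nested tuples purely for brevity.

```python
import xml.etree.ElementTree as ET


def condition_to_element(cond):
    """Convert a nested-tuple condition into an XML element (hypothetical schema)."""
    op = cond[0]
    if op == "term":
        _, field, value = cond
        return ET.Element("term", field=field, value=value)
    if op in ("and", "or", "not"):
        elem = ET.Element(op)
        for child in cond[1:]:
            elem.append(condition_to_element(child))
        return elem
    raise ValueError(f"unknown operator: {op}")


def build_query_xml(result_fields, condition):
    """Describe which fields to return and which condition rows must satisfy."""
    root = ET.Element("query")
    select = ET.SubElement(root, "select")
    for name in result_fields:
        ET.SubElement(select, "field", name=name)
    where = ET.SubElement(root, "where")
    where.append(condition_to_element(condition))
    return ET.tostring(root, encoding="unicode")


# ( "name" OR "title" ) AND NOT "keyword"
xml_text = build_query_xml(
    ["entry_id"],
    ("and",
     ("or", ("term", "name", "kinase"), ("term", "title", "kinase")),
     ("not", ("term", "keyword", "human"))),
)
print(xml_text)
```

The point of the intermediate XML is that the client never needs to know table or column layouts; it only names fields and logical operators.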

  10. Automatic SQL generation
      • The query server is a Java servlet:
        • accepts a query description as XML
        • converts the user’s query description into a true SQL query, which is then submitted to the search database
      • Searches can include components that are executed outside of the database, e.g. sequence similarity, determined using FASTA, or structural similarity, determined using SSM
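
The conversion step can be sketched as follows. This is not the servlet's code (which is Java and not shown in the slides); it is a minimal Python illustration under stated assumptions: the XML element names, the `entry_search` table and the column whitelist are all hypothetical. It walks the XML tree recursively and emits a parameterised SQL statement, so user-supplied values never end up inside the SQL text.

```python
import xml.etree.ElementTree as ET

TABLE = "entry_search"  # hypothetical table name
ALLOWED_FIELDS = {"entry_id", "name", "title", "keyword"}  # hypothetical columns

EXAMPLE_XML = """
<query>
  <select><field name="entry_id"/></select>
  <where>
    <and>
      <or>
        <term field="name" value="kinase"/>
        <term field="title" value="kinase"/>
      </or>
      <not><term field="keyword" value="human"/></not>
    </and>
  </where>
</query>
"""


def check_field(name):
    """Reject any column name that is not on the whitelist."""
    if name not in ALLOWED_FIELDS:
        raise ValueError(f"unknown field: {name}")
    return name


def condition_to_sql(elem):
    """Recursively convert a condition element into (sql, params)."""
    if elem.tag == "term":
        field = check_field(elem.get("field"))
        return f"{field} LIKE ?", [f"%{elem.get('value')}%"]
    if elem.tag == "not":
        clause, params = condition_to_sql(elem[0])
        return f"NOT ({clause})", params
    if elem.tag in ("and", "or"):
        parts, params = [], []
        for child in elem:
            clause, child_params = condition_to_sql(child)
            parts.append(f"({clause})")
            params.extend(child_params)
        return f" {elem.tag.upper()} ".join(parts), params
    raise ValueError(f"unknown element: {elem.tag}")


def xml_to_sql(xml_text):
    """Convert an XML query description into a SQL string and its parameters."""
    root = ET.fromstring(xml_text)
    columns = [check_field(f.get("name")) for f in root.findall("select/field")]
    if not columns:
        raise ValueError("query selects no fields")
    sql = f"SELECT DISTINCT {', '.join(columns)} FROM {TABLE}"
    params = []
    where = root.find("where")
    if where is not None and len(where):
        clause, params = condition_to_sql(where[0])
        sql += f" WHERE {clause}"
    return sql, params


sql, params = xml_to_sql(EXAMPLE_XML)
print(sql)
print(params)
```

Components executed outside the database, such as a FASTA or SSM search, are not covered by this sketch; one plausible approach would be to run them first and feed the resulting identifiers into the SQL as an additional condition.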

  11. Visualisation
      The process of representing abstract data to aid in understanding the meaning of the data. Not to be confused with rendering data (drawing pictures). Typically, though, we render data in such a way as to visualise the information within that data.

  12. Introduction
      Biological data comes from, and is of interest to:
      • Chemists: reaction mechanism, drug design
      • Biologists: sequence, expression, homology, function
      • Structural biologists: atomic structure, fold, classification, function
      • Medicine: clinical effect
      • Education:
      • Media:
      Presentation of diverse information to a diverse audience. Each has their own point of view (context).
      • Expert = scientist working within their own field of expertise
      • Non-expert = scientist using data/information outside their field
      • Novice = non-scientist

  13. Web pages
      These are notoriously badly designed, often resulting in the information on the site being unusable.
      • The front page should load quickly
      • The main point should appear on the first full screen
      • Clutter: not logically laid out
      • Too busy: cannot find the salient point
      • 8% of men and 0.5% of women are colour blind
      • Bad text/fonts
      • Too often it doesn’t work
        • The user will go somewhere else
      • The latest whizz-bang stuff only works on the latest browsers
      • Only works in one browser (they only tested on one)
      • Does not conform to standard HTML
      • Not just presentation of results
      • Google is a good design

  14. Asking questions
      • Biological data is very complex
        • Chemistry, Biology, Physics, Statistics, Medicine…
      • Most users will be from a different field
      • Asking the right question is difficult
        • The user cannot use the correct terminology
        • Too many things to query (2000 attributes in MSD)
        • SQL: not suitable for most users
      • Interface too complex
        • Too many check boxes, widgets etc.
        • Trying to be too clever
        • The “Go” button is buried somewhere

  15. Result presentation
      • Biological data is complex
        • Chemistry, physics, biology, statistics, medicine…
      • Expert users want all the detail
        • i.e. they want to use a specific method
        • They want all the details
        • They want (I hope) the statistical validity of the results
      • The non-expert wants the best-practice answer returned within their own context
        • They want comparative analysis with other fields
        • They want to know the results are valid

  16. Query design
      • Suitable for text queries
      • Only one logic: AND or OR
      • Predefined
      • Easy to use
      • Limited scope
      • 2000 attributes -> 2000 check-boxes!
      • The simple text box design is very common

  17. Query design
      • Graphical interface
      • Multiple logic: AND/OR/NOT
      • Under the user's control
      • Slower
      • Steep learning curve
      • Some users just cannot get it
      • Intuitive once mastered
      • Pretty

  18. Query design
      • Figurative
      • 2D sketch for 3D query (active sites)
      • Informative: presents meaning for the question
      • Slower
      • Less error prone

      SQL shown on the slide:

      select distinct entry_id, ligand_id from contact_search sel
        where neighbour_code_3_letter in ('SER','HIS')
          and DISTANCE <= 2.0 and type_id = 1
          and neighbour_substruct_code = 'side'
          and MACROMOL_SEC_STRUCT_TYPE = 1
      intersect
      select distinct entry_id, ligand_id from contact_search sel
        where neighbour_code_3_letter = 'HIS'
          and ( NEIGHBOUR_ATOM_NAME = 'NE2' and type_id = 1 and distance <= 2.0
             or NEIGHBOUR_SYMBOL = 'N' and type_id = 1 and distance <= 2.0)
          and TYPE_ID != 0
        group by entry_id, ligand_id
        having count(distinct neighbour_residue_id) >= 2
      intersect
      select distinct entry_id, ligand_id from contact_search sel
        where neighbour_code_3_letter = 'HIS'
          and NEIGHBOUR_ATOM_NAME = 'NE2'
          and DISTANCE <= 2.0 and type_id = 1
          and neighbour_substruct_code = 'side'
          and MACROMOL_SEC_STRUCT_TYPE = 2
      intersect
      select distinct entry_id, ligand_id from contact_search sel
        where neighbour_code_3_letter = 'HIS'
          and NEIGHBOUR_SYMBOL = 'N'
          and DISTANCE <= 2.0 and type_id = 1
          and neighbour_substruct_code = 'side'
          and MACROMOL_SEC_STRUCT_TYPE = 3
      intersect
      select distinct entry_id, ligand_id from residue_contact sel
        where neighbour_code_3_letter in ('HIS','SER','HIS')
          and BOND_STRENGTH != 10
        group by entry_id, ligand_id
        having count(*) >= 3;

      Compact query notation shown on the slide:

      HIS|SER:S/H>C2.0 HIS.ne2:S/S>C2.0 HIS.[n]:S/T>C2.0
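
The SQL on this slide combines one sub-query per sketched contact with `intersect`, so only entry/ligand pairs satisfying every contact survive. The sketch below reproduces that pattern on a toy in-memory SQLite table. It is a simplification under stated assumptions: only a handful of the `contact_search` columns are modelled, the column types are guesses, and the rows (entries `e001` to `e003`) are invented purely for illustration and are not real MSD data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE contact_search (
        entry_id TEXT,
        ligand_id INTEGER,
        neighbour_residue_id INTEGER,
        neighbour_code_3_letter TEXT,
        neighbour_atom_name TEXT,
        distance REAL
    )
    """
)

# Invented toy rows: (entry, ligand, residue id, residue type, atom, distance)
rows = [
    ("e001", 1, 10, "SER", "OG", 1.9),
    ("e001", 1, 11, "HIS", "NE2", 1.8),
    ("e002", 1, 20, "HIS", "NE2", 1.7),  # has no SER contact
    ("e003", 1, 30, "SER", "OG", 1.9),
    ("e003", 1, 31, "HIS", "NE2", 2.6),  # HIS contact is too far away
]
conn.executemany("INSERT INTO contact_search VALUES (?, ?, ?, ?, ?, ?)", rows)

# One sub-query per required contact; INTERSECT keeps only the pairs
# that appear in every sub-query result.
QUERY = """
SELECT DISTINCT entry_id, ligand_id FROM contact_search
 WHERE neighbour_code_3_letter = 'SER' AND distance <= ?
INTERSECT
SELECT DISTINCT entry_id, ligand_id FROM contact_search
 WHERE neighbour_code_3_letter = 'HIS'
   AND neighbour_atom_name = 'NE2'
   AND distance <= ?
"""

hits = conn.execute(QUERY, (2.0, 2.0)).fetchall()
print(hits)
```

Writing even this two-contact query by hand takes some care, which illustrates why the slide rates a sketch-based interface as less error prone than typed SQL.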

  19. YAMGP (yet another molecular graphics program)
      Many different programs are available: VMD, AstexViewer@MSD-EBI, LigPlot, Quanta, InsightII, Bobscript, WebMol, Frodo, iMol, Chime, Grasp, Pymol, POVRay, Spock, Rasmol, Mage, Raster3D, Yasara, Molscript, Chimera, O, MolMol, Whatif, XtalView, WebLab-viewer, Swiss-PDBviewer

  20. Result visualisation
      Multiple types of biological data:
      • Textual data
      • 3D structure
      • 2D chemical sketches
      • 1D sequence
      • Node linked
      • General/derived data
      • Web pages
      • Errors/Variance
      • Data provenance

  21. AstexViewer@MSD-EBI
      • Java 1.1 Applet
        • Should run under most browsers
        • Small footprint, high speed
      • Structure
        • Line, stick, ball & stick, sphere, schematic, surface + texture map
      • Written by Mike Hartshorn (Astex Therapeutics Ltd)
      • Multiple structures supported

  22. AstexViewer@MSD-EBI
      Sequence
      • Multiple sequence alignment
      • Editing
      • Annotation, colours…
      • Consensus alignment
      • Pick, brushing and magic lens

  23. Chemistry
      • 2D flat representation
      • Annotation, colours…
      • Interaction types
      • Placement fn(contact distance)
      • Editable
      • Pick, brush and magic lens

  24. Graphs
      • 2D, 2D grid and ND
      • Linkage plots
      • Annotation, colours…
      • Ramachandran, etc…
      • Pick, brush and magic lens

  25. AstexViewer@MSD-EBI
      Visualisation
      • Lensing
      • Linked views
      • Brushing
      • Picking
      • Flying views
      • Hyperbolic distortion
      • Animation
      • Solid rendering
      • Depth cues
      • Colour, lighting
      • Highlighting
      • Etc…

  26. Visualisation: comparative analysis
      • Similarity/Difference
      • Data superposition
      • Attribute display
      • Colour, size…
      • Correlation
      • Attribute mapping
      • Sequence colour by structure alignment
