MSD-mine
MSD-mine overview • Web application for online data analysis and mining • For the advanced MSDSD researcher • Interactive ad-hoc queries • Exploitation of integrated knowledge • Analysis, charts and data drill • Combines information through multiple joins • Generic, but customised for the MSDSD
Characteristics • Not just an overview visualisation of hits from predefined queries • Online analysis of homogenised data • Arbitrary queries on • 100 entities (tables) in 9 sections (marts) • restrictions and results for 2000 attributes • combine entities based on 450 relations • Operability safeguards • Rejects long-running queries and oversized result sets
Exploring MSDSD • Explores and explains MSDSD • With context-sensitive help and descriptions • With links to MSDSD documentation • Helps to understand the structure of MSDSD • Helps in learning to write SQL for advanced custom queries
Filter build page • Page areas • Entities (entities and relations) • Restrictions • Filter(entities joined) • Description (context sensitive)
MSDSD marts • MSDSD is organised in sections (marts) • A mart is a closely related set of tables • Screenshot callouts: Use in your query; Click for documentation; Click to expand & use
Define Restrictions • Select the attribute • Choose the operator • Type in the value or select one from the list of sample values • Add the new restriction
Combine entities • Using one of its relations • Relations are organised per mart • Understand cardinality • Choose the working node and follow its relations
MSDSD preferences • Constraint shortcuts • Important for correct analysis • All/Representative assembly • Asymmetric unit • All/Representative model • One chain per sequence • All entries • SCOP or DALI entries • Custom set of entries
Execute query • View-Navigate results • Load all records • Result based constraints • View details • Relation links • Export: Text-XML-script
Data analysis • Complete or Sample • Range or Value • Fully customisable • Context sensitive chart • Data drill operations
Analysis over a base attribute • Choose base attribute • Choose grouping operation for analysis attribute • Options and data-drill operations supported
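The analysis described above, grouping an analysis attribute over a base attribute, can be sketched in a few lines of Python. This is only an illustration and not MSD-mine's implementation: the record fields (`helix_length`, `n_contacts`), the `analyse` function and its `bin_width` parameter are hypothetical names, and the sample records are made up.

```python
from collections import defaultdict
from statistics import mean

def analyse(records, base, target, operation=mean, bin_width=None):
    """Group `target` values by `base` and reduce each group with `operation`.

    With bin_width=None every distinct base value forms its own group
    ("Value" mode); otherwise base values are bucketed into ranges of
    that width ("Range" mode).
    """
    groups = defaultdict(list)
    for record in records:
        key = record[base]
        if bin_width is not None:
            key = (key // bin_width) * bin_width  # lower bound of the range
        groups[key].append(record[target])
    return {key: operation(values) for key, values in sorted(groups.items())}

# Made-up sample records, for illustration only.
helices = [
    {"helix_length": 8, "n_contacts": 2},
    {"helix_length": 12, "n_contacts": 4},
    {"helix_length": 14, "n_contacts": 6},
    {"helix_length": 23, "n_contacts": 9},
]

# "Range" mode: average number of contacts per 10-residue length range.
average_by_range = analyse(helices, "helix_length", "n_contacts", mean, bin_width=10)
# "Value" mode: how many helices exist for each exact length.
count_by_value = analyse(helices, "helix_length", "n_contacts", len)
```

A data-drill step corresponds to calling `analyse` again on the subset of records that fall in the range of interest, typically with a smaller `bin_width`.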
Basic example • Find the entries with resolution < 1.2 • Select the “Structure” mart • Choose the Entry table • Set restriction on resolution • Browse the results
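For readers who want to see what this filter amounts to in SQL, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `entry`, its columns and the three rows are assumptions made for the illustration; the real MSDSD schema and the SQL generated by MSD-mine may differ.

```python
import sqlite3

# Hypothetical stand-in for the MSDSD "Entry" table (assumed columns, made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (accession_code TEXT, title TEXT, resolution REAL)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?)",
    [
        ("E001", "EXAMPLE PROTEIN ONE", 0.95),
        ("E002", "EXAMPLE PROTEIN TWO", 2.10),
        ("E003", "EXAMPLE PROTEIN THREE", 1.15),
    ],
)

def entries_below_resolution(connection, cutoff):
    """Return (accession_code, resolution) for entries with resolution < cutoff."""
    query = (
        "SELECT accession_code, resolution FROM entry "
        "WHERE resolution < ? ORDER BY resolution"
    )
    return connection.execute(query, (cutoff,)).fetchall()

basic_result = entries_below_resolution(conn, 1.2)
```

The cutoff is passed as a bound parameter rather than pasted into the SQL string, which mirrors typing the value into the restriction form.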
Filter Expressions • Entries with resolution < 1.2 related to HEMOGLOBIN • Add restriction on resolution • Add an “Or” sub-expression • Title contains the word “HEMO” or “HAEMO” or “GLOBIN”
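The filter expression above combines one restriction with an "Or" sub-expression. A rough SQL equivalent, sketched with Python's `sqlite3`, is shown below. The `entry` table, its columns, the `build_filter` helper and the sample rows are all hypothetical; only the shape of the expression is taken from the slide.

```python
import sqlite3

def build_filter(max_resolution, title_words):
    """Build 'resolution < ? AND (title LIKE ? OR ...)' with bound parameters."""
    or_clause = " OR ".join("title LIKE ?" for _ in title_words)
    sql = (
        "SELECT accession_code FROM entry "
        f"WHERE resolution < ? AND ({or_clause}) "
        "ORDER BY accession_code"
    )
    params = [max_resolution] + [f"%{word}%" for word in title_words]
    return sql, params

# Hypothetical table and made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (accession_code TEXT, title TEXT, resolution REAL)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?)",
    [
        ("E001", "HEMOGLOBIN EXAMPLE", 1.10),
        ("E002", "HAEMOGLOBIN EXAMPLE", 1.50),  # title matches, resolution too high
        ("E003", "LYSOZYME EXAMPLE", 1.00),     # resolution matches, title does not
        ("E004", "MYOGLOBIN EXAMPLE", 1.15),    # matches through "GLOBIN"
    ],
)

sql, params = build_filter(1.2, ["HEMO", "HAEMO", "GLOBIN"])
filtered = [row[0] for row in conn.execute(sql, params)]
```

The parentheses around the OR terms matter: in SQL, AND binds more tightly than OR, so without them the resolution restriction would apply only to the first title term.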
Simple distribution chart • Find the distribution of assembly types • Use table “Assembly” • Execute the query • Analysis for the attribute “Assembly type”
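A distribution of this kind is essentially a count per distinct value. The sketch below shows the idea with `sqlite3`; the `assembly` table, its `assembly_type` column and the values in it are assumed for the example and are not taken from the MSDSD schema.

```python
import sqlite3

# Hypothetical "Assembly" table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assembly (assembly_id TEXT, assembly_type TEXT)")
conn.executemany(
    "INSERT INTO assembly VALUES (?, ?)",
    [
        ("A1", "monomeric"),
        ("A2", "dimeric"),
        ("A3", "dimeric"),
        ("A4", "tetrameric"),
        ("A5", "dimeric"),
    ],
)

# Count assemblies per type, most frequent first (ties broken alphabetically).
distribution = conn.execute(
    "SELECT assembly_type, COUNT(*) AS n FROM assembly "
    "GROUP BY assembly_type ORDER BY n DESC, assembly_type"
).fetchall()
```

Each `(assembly_type, n)` pair corresponds to one bar of the distribution chart.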
Relations - external links • Entries related to “cell death” • follow their GO mappings • “Entries” where title contains the word “death” • GO mappings for an entry • Links to GO database
A more complex example • Linearity of helices that are part of beta-alpha-beta motifs and have active site contacts • Start with “Motif” table • Combine with “Helix” and “Residue Contacts” • Add a restriction • View results and statistics for the helix linearity • Focus (drill) on an area of interest
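Combining three entities as in this example corresponds to a multi-table join. The following sketch shows one way such a query could look. Everything about the schema is an assumption: the table names `motif`, `helix` and `residue_contact`, their key columns, the `linearity` column and the category labels are hypothetical stand-ins for the real MSDSD relations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified tables; the real relations may be modelled differently.
conn.executescript(
    """
    CREATE TABLE motif (motif_id INTEGER, motif_type TEXT);
    CREATE TABLE helix (helix_id INTEGER, motif_id INTEGER, linearity REAL);
    CREATE TABLE residue_contact (helix_id INTEGER, contact_type TEXT);
    """
)
conn.executemany("INSERT INTO motif VALUES (?, ?)",
                 [(1, "beta-alpha-beta"), (2, "beta hairpin")])
conn.executemany("INSERT INTO helix VALUES (?, ?, ?)",
                 [(10, 1, 0.5), (11, 1, 0.75), (12, 2, 0.25)])
conn.executemany("INSERT INTO residue_contact VALUES (?, ?)",
                 [(10, "active site"), (10, "active site"),
                  (11, "crystal"), (12, "active site")])

# Start from Motif, follow its relation to Helix, then to Residue Contacts.
# DISTINCT keeps one row per helix even when it has several matching contacts.
helix_rows = conn.execute(
    """
    SELECT DISTINCT h.helix_id, h.linearity
    FROM motif AS m
    JOIN helix AS h ON h.motif_id = m.motif_id
    JOIN residue_contact AS c ON c.helix_id = h.helix_id
    WHERE m.motif_type = 'beta-alpha-beta'
      AND c.contact_type = 'active site'
    ORDER BY h.helix_id
    """
).fetchall()
```

The one-to-many relation between a helix and its contacts is why cardinality matters here: without `DISTINCT`, helix 10 would be counted twice in any statistic over linearity.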
Saving results and exporting • Binding sites of “kinked” residues • Combining “Residue”, “Helix” and “Site” • Save the results to a local file • Export the results • as XML • as text • as a script
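To give a feel for the XML and text export options, here is a small sketch that serialises a result set both ways using only the Python standard library. The element names (`results`, `record`), the tab-separated layout and the sample row are assumptions; the slides do not specify the actual formats produced by MSD-mine, and the script export is not sketched here.

```python
import xml.etree.ElementTree as ET

def rows_to_xml(columns, rows):
    """Serialise rows as <results><record><col>value</col>...</record></results>."""
    root = ET.Element("results")
    for row in rows:
        record = ET.SubElement(root, "record")
        for name, value in zip(columns, row):
            ET.SubElement(record, name).text = str(value)
    return ET.tostring(root, encoding="unicode")

def rows_to_text(columns, rows, sep="\t"):
    """Serialise rows as delimited text with a header line."""
    lines = [sep.join(columns)]
    lines += [sep.join(str(value) for value in row) for row in rows]
    return "\n".join(lines)

# Made-up result row, for illustration only.
columns = ["helix_id", "length"]
rows = [(10, 12)]

xml_export = rows_to_xml(columns, rows)
text_export = rows_to_text(columns, rows)
```

Column names are used directly as element names, so this sketch assumes they are valid XML names (no spaces, not starting with a digit).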
Preferences - representative sets • Find the distribution of number of crystals in experiments • Use the “XRay-data” table • View the distribution of number of crystals • For the whole PDB • For the DALI set • For a custom representative set
Custom filters and results • Percentage of residues that take part in helix interactions, for helices of similar size • “Helix interaction” table • Custom “normalised interaction factor” result item • Custom restriction “one helix is at most double the size of the other” • View the distribution of the “interaction factor”
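The custom restriction and custom result item can be illustrated with two small functions. The slides do not give the formula for the "normalised interaction factor", so the definition used here (interacting residues as a percentage of the combined length of both helices) is an assumption, as are the function and parameter names and the sample numbers.

```python
def similar_size(length_a, length_b, max_ratio=2.0):
    """Custom restriction: the longer helix is at most `max_ratio` times the shorter."""
    return max(length_a, length_b) <= max_ratio * min(length_a, length_b)

def interaction_factor(n_interacting, length_a, length_b):
    """Assumed definition: interacting residues as a percentage of all residues
    in the two helices. The real MSD-mine formula may differ."""
    return 100.0 * n_interacting / (length_a + length_b)

# Made-up helix pairs: (interacting residues, length of helix A, length of helix B).
pairs = [(6, 10, 20), (4, 10, 21), (5, 12, 13)]

# Apply the restriction first, then compute the custom result item.
factors = [
    interaction_factor(n, a, b)
    for n, a, b in pairs
    if similar_size(a, b)
]
```

The second pair is dropped because 21 is more than double 10; the remaining values are what the distribution view would be built from.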