This presentation discusses the process of conducting a crystallographic experiment in an e-Science environment, from the initial conception to the final publication. It covers topics such as experiment properties prediction, data collection and analysis, automated structure solution, and data dissemination. The presentation also explores the challenges of separating raw data from interpretations and proposes an open archive solution for data presentation.
Simon J. Coles, EPSRC National Crystallography Service, School of Chemistry, University of Southampton. The 'end to end' crystallographic experiment in an e-Science environment: From conception to publication. All Hands Meeting 2005
Data – Information – Knowledge Cycle • Experiment • Properties • Prediction • Model
Leveraging eScience: Simulation Video Analysis Structures Database Diffractometer Properties e-Lab X-Ray e-Lab Grid Middleware
2002: ECSES Demonstrator
Data ‘Acquisition’ and ‘Workup’ • Application for an allocation • Secure access to NCS Grid resources • Sample submission • Monitoring sample status • Data collection • Raw data download • Automated structure solution
Application
Security (diagram labels: NCS CI, KEYSTORE, CLIENT) • Signed certificate imported into browser • Applicant identity independently verified by NCS • Panel award access to NCS • NCS CI signs key pair • NCS CI exports signed certificate • Passcode & signed PFX • CSR • NCS CI public key
Sample Submission
Status Monitoring (diagram labels: CLIENT, NCS)
Data Collection (flowchart labels, in extraction order) • Setup via GUI • BruNo Unmount • Sample Tray • BruNo Mount • PreScans • Diffraction (No / Yes) • Unit Cell • Success (No / Yes) • Strategy • Data Collection • Data Process • System Y
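The decision points in the flowchart above can be sketched in Python. This is only an illustration under stated assumptions: the `Sample` class, the step names, and the rule that every "No" branch leads straight to unmounting the sample are hypothetical, since the flattened slide does not show where the "No" arrows go.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    """Hypothetical stand-in for one crystal on the sample tray."""
    name: str
    diffracts: bool      # outcome of the pre-scans (assumed known here)
    unit_cell_ok: bool   # outcome of unit cell determination (assumed known here)


def process_sample(sample: Sample) -> List[str]:
    """Return the ordered list of steps taken for one sample."""
    steps = ["mount", "prescans"]
    if sample.diffracts:                      # "Diffraction" decision
        steps.append("unit_cell")
        if sample.unit_cell_ok:               # "Success" decision
            steps += ["strategy", "data_collection", "data_process"]
    # Assumption: every path ends by unmounting so the next sample can be tried.
    steps.append("unmount")
    return steps


def run_tray(tray: List[Sample]) -> List[Tuple[str, List[str]]]:
    """Process every sample on the tray in order."""
    return [(s.name, process_sample(s)) for s in tray]


tray = [
    Sample("good", diffracts=True, unit_cell_ok=True),
    Sample("weak", diffracts=False, unit_cell_ok=False),
    Sample("no_cell", diffracts=True, unit_cell_ok=False),
]
for name, steps in run_tray(tray):
    print(name, "->", " > ".join(steps))
```

Only the sample that passes both decisions reaches strategy calculation, data collection and processing; the others are unmounted early.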
Data Collection • Metadata capture
Data Collection
Automatic Structure Solution • Background process designed to adopt the ‘Human Approach’, using refinement indicators and structural knowledge • Incorporates all ‘Q peaks’ above a cut-off as C atoms • Rejects atoms on the basis of thermal parameters, adjusts atom types accordingly & iterates • Hybridisation & hydrogens assigned from connectivity & difference map peaks, then fixed • Usual crystallographic validation performed, introducing ‘chemical validation’
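The assign, reject, retype and iterate loop described above might look roughly like the following Python sketch. It is not the NCS implementation: the thresholds, the C to N to O promotion ladder, and `toy_refine` (a fake refinement in which the thermal parameter scales with assigned versus true electron count) are all assumptions made purely so the loop can run.

```python
from dataclasses import dataclass
from typing import List, Optional

Z = {"C": 6, "N": 7, "O": 8}        # electron counts
HEAVIER = {"C": "N", "N": "O"}      # assumed promotion ladder


@dataclass
class Peak:
    height: float                   # Q peak height (e/Å^3)
    true_z: float                   # toy ground truth, used only by toy_refine
    element: Optional[str] = None
    u_iso: float = 0.0


def toy_refine(atoms: List[Peak], u_ref: float = 0.04) -> None:
    """Fake refinement: too many assigned electrons inflates U, too few shrinks it."""
    for a in atoms:
        a.u_iso = u_ref * Z[a.element] / a.true_z


def auto_solve(peaks: List[Peak], cutoff: float = 1.0, u_low: float = 0.036,
               u_high: float = 0.08, max_cycles: int = 10) -> List[Peak]:
    # Step 1: every Q peak above the cut-off starts life as a carbon atom.
    atoms = [p for p in peaks if p.height >= cutoff]
    for a in atoms:
        a.element = "C"
    # Step 2: refine, reject or retype on thermal parameters, and iterate.
    for _ in range(max_cycles):
        toy_refine(atoms)
        changed = False
        kept = []
        for a in atoms:
            if a.u_iso > u_high:            # implausibly large U: treat as spurious
                changed = True
                continue
            if a.u_iso < u_low and a.element in HEAVIER:
                a.element = HEAVIER[a.element]   # too small U: try a heavier atom
                changed = True
            kept.append(a)
        atoms = kept
        if not changed:
            break
    return atoms


peaks = [Peak(5.0, 6), Peak(6.0, 7), Peak(7.0, 8), Peak(1.5, 2), Peak(0.4, 6)]
solution = auto_solve(peaks)
print([a.element for a in solution])
```

In a real pipeline the refinement step would be an external least-squares program, and hydrogen placement and validation would follow once the atom types are stable.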
Data Overload & the Publication Problem 2,000,000 25,000,000 300,000
Data Dissemination Mandate
Current Publishing Protocols • Aims, intellectual ideas, conclusions • Inferences, interpretation, derived results • Raw & underlying data
Separating Data from Interpretations • Underlying data • Intellect & Interpretation
The Open Archive Solution for Data • Presentation services: subject, media-specific, data, commercial portals • Searching, harvesting, embedding • Resource discovery, linking, embedding • Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media • Data analysis, transformation, mining, modelling • Aggregator services: national, commercial • Harvesting metadata • Research & e-Science workflows • Repositories: institutional, e-prints, subject, data, learning objects • Deposit / self-archiving • Validation • Validation • Publication • Linking • Peer-reviewed publications: journals, conference proceedings • Data curation: databases & databanks
Workflow (data categories: RAW DATA, DERIVED DATA, RESULTS DATA) • Initialisation: mount new sample on diffractometer & set up data collection • Collection: collect data • Processing: process and correct images • Solution: solve structures • Refinement: refine structure • CIF: produce CIF (Crystallographic Information File) • Validation: chemical & crystallographic checks • Report: generate Crystal Structure Report
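The eight stages listed above form a fixed, ordered pipeline, which can be sketched in Python as follows. The `Stage` enum mirrors the slide; the `run_workflow` function and the handler mechanism are hypothetical, and the slide's grouping of stages into raw, derived and results data is deliberately not modelled because the flattened text does not say where the boundaries fall.

```python
from enum import Enum
from typing import Callable, Dict


class Stage(Enum):
    """Workflow stages, in the order given on the slide."""
    INITIALISATION = "mount new sample on diffractometer & set up data collection"
    COLLECTION = "collect data"
    PROCESSING = "process and correct images"
    SOLUTION = "solve structures"
    REFINEMENT = "refine structure"
    CIF = "produce CIF (Crystallographic Information File)"
    VALIDATION = "chemical & crystallographic checks"
    REPORT = "generate Crystal Structure Report"


Handler = Callable[[dict], dict]


def run_workflow(handlers: Dict[Stage, Handler], record: dict) -> dict:
    """Run each stage in order; a handler returns new fields to merge into the record."""
    record = dict(record)
    record["completed"] = []
    for stage in Stage:                      # Enum iteration follows definition order
        handler = handlers.get(stage)
        if handler is not None:
            record.update(handler(record))
        record["completed"].append(stage.name)
    return record


# Hypothetical handler: attach a file name when the CIF stage runs.
result = run_workflow({Stage.CIF: lambda rec: {"cif_file": "structure.cif"}}, {"sample": "demo"})
print(result["completed"])
print(result["cif_file"])
```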
Simple Deposition • Data manipulation toolbox • Associated Metadata
An Archive Entry: ecrystals.chem.soton.ac.uk
Access to ALL underlying data
Metadata Publication • Using simple Dublin Core: Crystal structure; Title (Systematic IUPAC Name); Authors; Affiliation; Creation Date • Additional chemical information through Qualified Dublin Core: Empirical formula; International Chemical Identifier (InChI); Compound Class; Keywords • Specifies which ‘datasets’ are present in an entry • DOI • Rights
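A simple Dublin Core record carrying the fields above could be assembled as in this Python sketch. The two namespace URIs are the standard `dc` and `oai_dc` ones; everything else is an assumption for illustration: the mapping of affiliation to `dc:publisher`, the use of prefixed `dc:subject` values in place of the archive's Qualified Dublin Core terms, and all the example values, including the placeholder DOI.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
ET.register_namespace("dc", DC)
ET.register_namespace("oai_dc", OAI_DC)


def build_record(fields):
    """Build an oai_dc:dc element from (element_name, [values]) pairs."""
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for name, values in fields:
        for value in values:
            ET.SubElement(root, f"{{{DC}}}{name}").text = value
    return root


# Illustrative values only; none of these come from a real archive entry.
entry = [
    ("type", ["Crystal structure"]),
    ("title", ["benzene"]),                      # systematic IUPAC name
    ("creator", ["A. Author", "B. Author"]),
    ("publisher", ["Example University"]),       # assumption: affiliation as publisher
    ("date", ["2005-09-19"]),
    ("subject", [
        "formula: C6H6",                          # assumption: prefixed dc:subject
        "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
        "class: aromatic hydrocarbon",
    ]),
    ("identifier", ["doi:10.xxxx/placeholder"]),  # placeholder, not a real DOI
    ("rights", ["See archive rights statement"]),
]

record = build_record(entry)
print(ET.tostring(record, encoding="unicode"))
```

Keeping the record in plain `oai_dc` means any generic harvester can read it, while subject-specific services can pick out the chemical identifiers.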
Harvesting & Aggregating: Google. Coles, S.J., Day, N.E., Murray-Rust, P., Rzepa, H.S., Zhang, Y., Org. Biomol. Chem., 2005, (10), 1832-1834. DOI:10.1039/b502828k
OAI Harvesting & Aggregating OAIster: Generic
OAI Harvesting & Aggregating eBank: Subject Specific
OAI Harvesting & Aggregating PSIgate: Science Portal
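Aggregators such as those named on the three slides above harvest repositories through OAI-PMH requests. The Python sketch below builds a `ListRecords` request URL; the verb and parameter names come from the OAI-PMH protocol, while the base URL and the helper function name are hypothetical. No network call is made.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     from_date: Optional[str] = None,
                     set_spec: Optional[str] = None,
                     resumption_token: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords URL for a repository endpoint."""
    if resumption_token is not None:
        # A resumptionToken is an exclusive argument: it replaces all the others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date     # selective harvesting by datestamp
        if set_spec:
            params["set"] = set_spec       # selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


BASE = "https://repository.example.org/oai"   # hypothetical endpoint
print(list_records_url(BASE))
print(list_records_url(BASE, from_date="2005-01-01"))
```

A harvester would fetch the first URL, parse the returned XML, and keep requesting with the returned resumption token until none is supplied.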
Thanks • NCS: Mike Hursthouse, Mark Light, Peter Horton, Ann Bingham • CombeChem: Jeremy Frey, Sam Peppe, Paul Walker • IT Innovation: Mike Surridge, Ken Meacham, Steve Taylor, Darren Marvin • ECS: Dave de Roure, Hugo Mills, Graham Smith, Les Carr, Chris Gutteridge • eBank / UKOLN / PSIgate: Andrew Milsted, Liz Lyon, Rachel Heery, Monica Duke, Michael Day, Andy Powell, John Blundon-Ellis • ££££($$$$)’s
Take-Home Message: “The internet wasn't created for mockery! It was created so scientists from different universities could share datasets....” Simpson, H., The Simpsons (2005), Eds. Groening, M., Brooks, J.L. & Simon, S., Series 16, Episode 8, Original air date (US) 06-Feb-2005. http://www.tvtome.com/tvtome/servlet/GuidePageServlet/showid-146/epid-346864/