X-ray Validation Package
Present status
Swanand Gore
PDBe
D&A meeting: 21-Oct-2010
VTF recommendations
• Model-based indicators
  • Covalent geometry (E&H) outliers
  • Protein backbone (Ramachandran) and sidechain (rotamericity, flips) outliers
  • RNA backbone (atypical suites)
  • Carbohydrate chirality and naming
  • Ligands
    • Features not observed in high-quality small-molecule xtal structures and other instances in PDB
  • Packing
    • Bad vdW clashes
    • Underpacking, voids
    • Unusual contacts
    • Unsatisfied hbond donors, acceptors
VTF recommendations
• Data-based indicators
  • Wilson plot
  • Data anisotropy plot
  • Twinning (Padilla-Yeates plot)
  • Mislabelling of amplitudes / intensities
  • Translational NCS
  • Missed symmetry
• Data and model based indicators
  • R, Rfree
  • Reproducibility and difference
  • Real-space R
  • Per-residue measure of fit with 2FoFc map, normalized per residue type
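
The data-and-model indicators above can be sketched in a few lines. This is a minimal illustration, not the package's implementation: it uses the standard definition R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, assumes the amplitudes are already on a common scale, and assumes that "normalized per residue type" means a Z-score against a per-residue-type mean and standard deviation. The function names and the `reference_stats` table are hypothetical.

```python
def r_factor(pairs):
    """Crystallographic R over (f_obs, f_calc) amplitude pairs.

    Assumes f_calc has already been scaled to f_obs.
    """
    pairs = list(pairs)
    numerator = sum(abs(abs(f_obs) - abs(f_calc)) for f_obs, f_calc in pairs)
    denominator = sum(abs(f_obs) for f_obs, _ in pairs)
    if denominator == 0:
        raise ValueError("no observed amplitudes")
    return numerator / denominator


def r_work_and_free(reflections):
    """Split (f_obs, f_calc, is_free) reflections and return (R, Rfree)."""
    reflections = list(reflections)
    work = [(o, c) for o, c, is_free in reflections if not is_free]
    free = [(o, c) for o, c, is_free in reflections if is_free]
    return r_factor(work), r_factor(free)


def normalized_rsr(rsr, resname, reference_stats):
    """Z-score of a residue's real-space R against its residue type.

    reference_stats is a hypothetical table {resname: (mean, stdev)};
    where those statistics come from is outside this sketch.
    """
    mean_rsr, sd_rsr = reference_stats[resname]
    if sd_rsr == 0:
        raise ValueError("zero spread for residue type " + resname)
    return (rsr - mean_rsr) / sd_rsr
```

The reflection tuples and the reference statistics would come from the deposited data and from archive-wide distributions respectively; both are left to the caller here.
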
VTF recommendations
• Percentile scores
  • Per criterion, calculate the percentile rank against the whole set of X-ray entries and also against structures in its resolution bin
  • Update the percentiles periodically
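
A minimal sketch of the two percentile ranks, under stated assumptions: the percentile is taken as the percentage of entries that score strictly worse than the query, the resolution bin edges are illustrative, and the archive is a plain list of dictionaries. None of these choices comes from the VTF text; the names `percentile_rank`, `resolution_bin` and `percentiles_for_entry` are hypothetical.

```python
from bisect import bisect_right


def percentile_rank(value, population, lower_is_better=True):
    """Percentage of the population scoring strictly worse than value."""
    if not population:
        raise ValueError("empty population")
    if lower_is_better:
        worse = sum(1 for v in population if v > value)
    else:
        worse = sum(1 for v in population if v < value)
    return 100.0 * worse / len(population)


def resolution_bin(resolution, edges=(1.5, 2.0, 2.5, 3.0)):
    """Index of the resolution bin; the edges (in Angstrom) are assumed."""
    return bisect_right(edges, resolution)


def percentiles_for_entry(entry, archive, criterion, lower_is_better=True):
    """Rank one entry against all X-ray entries and against its own bin."""
    all_values = [e[criterion] for e in archive]
    bin_id = resolution_bin(entry["resolution"])
    bin_values = [
        e[criterion]
        for e in archive
        if resolution_bin(e["resolution"]) == bin_id
    ]
    return {
        "all_entries": percentile_rank(entry[criterion], all_values, lower_is_better),
        "resolution_bin": percentile_rank(entry[criterion], bin_values, lower_is_better),
    }
```

Updating the percentiles periodically then amounts to re-running `percentiles_for_entry` against a refreshed `archive` list.
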
VTF recommendations
• Presentation of results for various consumers
  • Depositors (and annotators)
  • Reviewers
    • Concise PDF report highlighting any unusual features
  • End-users (experts and non-experts)
    • Web-based frontends with adjustable level of detail
  • Developers
    • Webservices and XML files
VTF recommendations
• Validation package
  • Be open-source and freely distributable
    • wwPDB sites, labs, companies
  • Import/wrap existing 3rd party functionality
    • EDS (Uppsala), Molprobity, CCDC Mogul, WhatIf
    • Phenix, CCP4
    • RosettaHoles, pdb-care, DACA, ProSA
  • Calculate recommended validation metrics and publish XML file per entry
  • Present XML contents in various kinds of reports
Prototypes – Validation Viewer
• Entry viewer
• Raw data and plots of phi-psi, omega, chi, B-factor, occupancy, RSR, RSCC
• Residue and maps viewer
New ligand-validation functionality
• Mogul is a chemical mining engine developed by CCDC for small-molecule xtal structures in CSD
• Splits query molecule into bond, angle, torsion and ring substructures
• Finds comparable substructures from high-quality small-mol structures in CSD
• Compares query substructures against CSD distributions
  • Bonds, angles: Z scores can be computed
  • Torsions: Z-score is undefined, but Mogul gives an idea of where a torsion lies w.r.t. the distribution
  • Rings: computes query ring's torsion RMSD against each comparable CSD ring, then finds the mean and stdev of the tRMSDs to estimate a Z score for the ring
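
The comparison step can be illustrated without calling Mogul itself. The sketch below is plain Python operating on lists of numbers that are assumed to have been extracted already; it is not Mogul's API. It shows a bond or angle Z-score against a list of CSD values, and the torsion RMSD (tRMSD) of a query ring against each comparable ring together with the mean and stdev of those tRMSDs. How the mean and stdev are combined into the ring Z score is not spelled out on the slide, so the sketch stops at the summary statistics. All function names are hypothetical.

```python
import math
from statistics import mean, stdev


def z_score(query_value, csd_values):
    """Z-score of a query bond length or angle against CSD observations."""
    sigma = stdev(csd_values)  # sample standard deviation, needs >= 2 values
    if sigma == 0:
        raise ValueError("CSD distribution has zero spread")
    return (query_value - mean(csd_values)) / sigma


def angular_difference(a, b):
    """Signed difference a - b in degrees, wrapped into (-180, 180]."""
    d = (a - b) % 360.0
    return d - 360.0 if d > 180.0 else d


def torsion_rmsd(query_torsions, reference_torsions):
    """RMSD over corresponding ring torsions, respecting periodicity."""
    if len(query_torsions) != len(reference_torsions):
        raise ValueError("rings must have the same number of torsions")
    squared = [
        angular_difference(q, r) ** 2
        for q, r in zip(query_torsions, reference_torsions)
    ]
    return math.sqrt(sum(squared) / len(squared))


def ring_trmsd_summary(query_ring, csd_rings):
    """Mean and stdev of the query ring's tRMSD to each comparable CSD ring."""
    trmsds = [torsion_rmsd(query_ring, ring) for ring in csd_rings]
    return mean(trmsds), stdev(trmsds)
```

Wrapping the angular difference matters for torsions near ±180°, where a naive subtraction would report a deviation of almost 360° for two nearly identical conformations.
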
Prototypes – Mogul webservice
• Upload or select a ligand
• 2D & 3D views of ligand
• Bonds, angles, torsions, rings with comparable CSD fragments
• Distribution for the angle from Mogul
D&A pipeline on all sites
[Diagram; box labels:]
• mmCIF under deposition
• Validation package (installed on each site)
• Validation XML file (Data, Percentiles)
• D&A Webservers
• D&A API
• D&A clients
• Released Validation XML file
• Public Access
• Distributions Webservice (if DB only at PDBe)
• Distributions Calculator (Runs yearly)
• Distributions Oracle Database (Time-stamped by year)
• wwPDB sites (PDBe - ?)
Validation XML
• Contents
  • Administrative
    • Version of validation package and various 3rd party programs
    • Creation date
    • Distribution database version
  • Hierarchy of validation XML for data
    • Entry (id)
      • Model (id)
        • Chain (id)
          • Residue (seqnum, icode, resname, gri)
            • Atom (name, altcode, gai)
  • Annotations
    • Level (e.g. chain), identifier (chain_id), attributes
    • Supports modular development of validation package as annotations can be appended as and when new wrapper modules are ready
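
One possible shape for such a file, sketched with Python's standard `xml.etree.ElementTree`. The element names, the attribute names for the administrative block, and all values are placeholders chosen for illustration; only the hierarchy and the per-level attributes (id, seqnum, icode, resname, gri, name, altcode, gai) are taken from the slide, and the slide does not define `gri` or `gai`, so they are carried through unchanged.

```python
import xml.etree.ElementTree as ET


def new_validation_document(package_version, creation_date, distribution_db_version):
    """Root element carrying the administrative information."""
    return ET.Element("validation", {
        "package_version": package_version,
        "creation_date": creation_date,
        "distribution_db_version": distribution_db_version,
    })


def add_annotation(root, level, identifier, **attributes):
    """Append one annotation; each wrapper module can call this independently."""
    annotations = root.find("annotations")
    if annotations is None:
        annotations = ET.SubElement(root, "annotations")
    fields = {"level": level, "identifier": identifier}
    fields.update({key: str(value) for key, value in attributes.items()})
    return ET.SubElement(annotations, "annotation", fields)


def build_example():
    """Build a one-atom document; every value here is a placeholder."""
    root = new_validation_document("0.1", "2010-10-21", "2010")
    entry = ET.SubElement(root, "entry", {"id": "XXXX"})
    model = ET.SubElement(entry, "model", {"id": "1"})
    chain = ET.SubElement(model, "chain", {"id": "A"})
    residue = ET.SubElement(chain, "residue", {
        "seqnum": "42", "icode": "", "resname": "ALA", "gri": "0",
    })
    ET.SubElement(residue, "atom", {"name": "CA", "altcode": "", "gai": "0"})
    # Annotations live outside the hierarchy and point into it by level + id.
    add_annotation(root, "chain", "A", average_rsr=0.12)
    add_annotation(root, "entry", "XXXX", rfree=0.25)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_example())
```

Keeping annotations in their own list, keyed by level and identifier, is what lets a new wrapper module append results without touching the data hierarchy.
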
Example annotations
• Atom-level
  • Clashes
• Residue-level
  • Average B factor, occupancy
  • Phi-psi, Rama outliers
  • Sidechain flips, rotamer outliers
  • RNA backbone and pucker values
• Atom-group-level
  • Covalent bond-length and angle outliers
• Chain-level
  • WhatIf Rama score, average RSR, NCS deviation
• Entry-level
  • Rfree, Clash-score
  • Twinning, tNCS, anisotropy, fit to ideal Wilson plot
Summary
• VTF recommendations will be implemented in a validation package.
• The package will consist of modules which import/wrap 3rd party functionality.
• The package will be open-source and freely distributable.
• A process for periodically updating distributions and validation XMLs will be implemented.