Accessing the data: going beyond what the author wanted to tell you. Interactive Publications and the Record of Science ICSTI Winter Workshop Paris, Monday, February 8, 2010. Brian McMahon International Union of Crystallography 5 Abbey Square, Chester CH1 2HU, UK bm@iucr.org.
PDFs and data impoverishment
Henry Rzepa, 19 December 2009:
"Publishers are likely to love interactive PDF, since it is easy to archive. However ... such objects are data impoverished. Whereas with Jmol, one is obliged to provide semantically accurate data (e.g. CML or equivalent), the PDF object is simply a (pre)rendering of that data. Thus reconstituting a useful molecule from Jmol is trivial (and that reconstitution can then be used for many other purposes), reconstituting a molecule from a 3D PDF is likely to be non trivial, and will almost certainly suffer information loss compared to the original data. By all means, provide both, but I strongly urge that a 3D PDF should not be the only object provided."
http://www.mail-archive.com/jmol-users@lists.sourceforge.net/msg13417.html
Jmol interactive visualizations
• Not new
• Biochem J. (2008). 412, 399–413
• Bespoke design / implementation
• Expensive
• Requires consultation
• Supplementary information
The right tool for the job
• Jmol
• Then (ca. 2004):
  • Protein structures (RasMol)
  • Small organic chemical molecules (Chime)
• Now:
  • Crystal lattices (symmetry)
  • Inorganic materials (coordination polyhedra)
  • Displacement ellipsoids
  • Symmetry operations
  • Electron orbitals
  • Electron density maps
Making it easier to use
• Editing toolkit
• http://submission.iucr.org/jtkt
• High-quality immediate visual feedback
• Context-sensitive help
• Manuals, examples, tutorials
• Reference: McMahon, B. & Hanson, R.M. (2008). A toolkit for publishing enhanced figures. J. Appl. Cryst. 41, 811-814.
Interactive molecular visualizations enhance understanding
Acta Cryst. (2008). F64, 156-162
• Rotate
• Modify orientation
• Alternative representations
• Overlay representations
• Interrogate
Infrastructure for publication workflow
• Server/client architecture
• Ability to create interactive figures before or during article submission/review
• Opportunity for peer review/revision
• Auto-generation of static equivalent
• Easy generation/activation of multiple scripts to provide alternative views
Requirements for routine publication of enhanced figures
• Platform independence
• Web access for authors
• Serving visualization application and data
• Integration into submission/review procedures
• Integration into journal production workflow
• Automated generation of static copy (for failsafe/PDF edition/archiving)
• Authoring tools
The authoring environment
• The author uploads a data file (CIF)
• The system provides different default styles according to the type of structure
• The author edits and annotates the view
• The author may supply additional scripts
• The author saves the result as an enhanced figure + publication-quality static figure
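The authoring steps above can be sketched in a few lines of Python. This is only an illustration of the idea of choosing a default style from the type of structure: the classification rule, the preset descriptions and every name here (`classify_structure`, `DEFAULT_STYLES`, `EnhancedFigure`, `new_figure`) are assumptions made for the example and are not taken from the toolkit itself.

```python
from dataclasses import dataclass, field
from typing import Iterable, List

# Hypothetical preset descriptions; the toolkit's real presets are not
# listed in the slides.
DEFAULT_STYLES = {
    "macromolecule": "cartoon",
    "small_molecule": "ball-and-stick with displacement ellipsoids",
    "inorganic": "coordination polyhedra in a packed unit cell",
}


def classify_structure(elements: Iterable[str], has_polymer_chains: bool) -> str:
    """Crude, assumed heuristic for the type of structure in the data file."""
    if has_polymer_chains:
        return "macromolecule"
    if {"C", "H"} <= set(elements):
        return "small_molecule"
    return "inorganic"


@dataclass
class EnhancedFigure:
    """What the author eventually saves: data, view and optional scripts."""
    data_file: str
    style: str
    annotations: List[str] = field(default_factory=list)
    extra_scripts: List[str] = field(default_factory=list)


def new_figure(data_file: str, elements: Iterable[str],
               has_polymer_chains: bool = False) -> EnhancedFigure:
    """Start a figure with the default style for this kind of structure."""
    kind = classify_structure(elements, has_polymer_chains)
    return EnhancedFigure(data_file=data_file, style=DEFAULT_STYLES[kind])


# The author then edits and annotates the default view.
figure = new_figure("quartz.cif", {"Si", "O"})
figure.annotations.append("SiO4 tetrahedra highlighted")
```

The point of the design is that the author starts from a sensible view and only has to refine it, rather than scripting the visualization from scratch.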
Saving the enhanced figure
• Interactive applet
• Active scripts provided by the author
• High-resolution static image
• Option to view dynamic or static image online
• Link to allow peer review
The toolkit editing interface
• Essential tool for authors
• Accommodates novice and advanced users
• Tabbed interface allows authors to concentrate on scientific aspects of visualization
• Presets tuned to journal style requirements
• Live testing, preview and feedback mechanisms
Submission/review
• Author may prepare enhanced figure ahead of publication
• Simply enter URL of edit workspace when asked to ‘upload source files’
• Presented alongside other conventional figures
• Available for peer review
• Can be edited in response to referee comments
Interactive authorship: publBio
http://publbio.iucr.org
• Start with the data (PDB)
  • example 3jw1
• Add structured text
• Online look-up:
  • authors
  • references
  • crystallization solution components
• Validation
  • references
• Visualisation (Jmol)
• Update data file as submission vehicle
Uniform (compatible) markup systems
• Crystallographic Information Framework (CIF)
  • Treat data/metadata, text/numerical data as peers
  • Domain-specific extensions (dictionaries = ontologies)
  • Image format
• Some data fields may need to contain richer content
  • Text markup
  • Mathematical equations
  • Interactive figure scripts
• Machine validation of dictionary attributes
• Methods
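To make the points about treating text and numerical data as peers and about machine validation concrete, here is a minimal Python sketch. It reads only the simplest CIF constructs (single tag-value pairs and semicolon-delimited text fields, no loops), and the small `MINI_DICTIONARY` is a stand-in invented for this example, loosely modelled on how a CIF dictionary declares types and ranges; it is not a real dictionary file. The data names in the sample are standard core CIF names; the function names are hypothetical.

```python
import re

SAMPLE_CIF = """\
data_example
_cell_length_a    5.4307(2)
_symmetry_space_group_name_H-M  'F d -3 m'
_publ_section_abstract
;
Text and numbers sit side by side
in the same file.
;
"""

# Assumed stand-in for dictionary attributes (type and permitted minimum).
MINI_DICTIONARY = {
    "_cell_length_a": {"type": "numb", "min": 0.0},
    "_symmetry_space_group_name_H-M": {"type": "char"},
    "_publ_section_abstract": {"type": "char"},
}

# A CIF number may carry a standard uncertainty in parentheses, e.g. 5.4307(2).
NUMBER = re.compile(r"^([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)(?:\(\d+\))?$")


def parse_cif(text):
    """Return {tag: value} for simple items and semicolon text fields."""
    items, pending_tag = {}, None
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(";") and pending_tag:
            # Multi-line text field: collect up to the closing semicolon.
            buffer = [line[1:]]
            i += 1
            while i < len(lines) and not lines[i].startswith(";"):
                buffer.append(lines[i])
                i += 1
            items[pending_tag] = "\n".join(buffer).strip()
            pending_tag = None
        elif line.strip().startswith("_"):
            parts = line.strip().split(None, 1)
            if len(parts) == 2:
                items[parts[0]] = parts[1].strip().strip("'\"")
            else:
                pending_tag = parts[0]  # value follows on later lines
        i += 1
    return items


def validate(items, dictionary):
    """Check each item against the dictionary; return a list of problems."""
    problems = []
    for tag, value in items.items():
        rule = dictionary.get(tag)
        if rule is None:
            problems.append(f"{tag}: not defined in dictionary")
        elif rule["type"] == "numb":
            match = NUMBER.match(value)
            if not match:
                problems.append(f"{tag}: '{value}' is not a number")
            elif float(match.group(1)) < rule.get("min", float("-inf")):
                problems.append(f"{tag}: {value} is below the permitted minimum")
    return problems


items = parse_cif(SAMPLE_CIF)
problems = validate(items, MINI_DICTIONARY)
```

For the sample above `problems` is empty; changing the cell length to a negative or non-numeric value produces a message, which is the kind of check that a machine-readable dictionary makes possible.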
Conclusions
• The working scientist really wants to interact with the data
• What interactive PDF offers is currently limited
• Publishers should develop compatible architectures
• Need domain-specific implementations (learned societies)
• Investment in new applications; integration with workflow
• Education for a new paradigm
• Archiving
  • requires more standardisation
  • proper compound document model
  • concentrate on data (or semantic content), not the implementation
  • ‘record not what it looks like, but what you are looking at’
• Distributed content sources
  • data not necessarily integral part of document
  • retrieval of non-discrete data sets