Interoperability of phylogenetic data Weigang Qiu, Rutger Vos
Introduction • Improving interoperability at the data level. • We consider two options: • “new standard” approach: a new file format, endorsed by this meeting; • “abstraction” approach: parsing and serialization layer, intermediate data model layer;
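To make the "abstraction" option concrete, here is a minimal sketch of a parsing layer, an intermediate data model, and a serialization layer. It is not taken from the slides: `CharacterMatrix`, `parse_fasta`, and `serialize_phylip` are hypothetical names chosen for illustration, and FASTA and relaxed PHYLIP stand in for whichever formats such a layer would actually map between.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterMatrix:
    """Hypothetical intermediate data model: taxon label -> aligned sequence."""
    rows: dict[str, str] = field(default_factory=dict)


def parse_fasta(text: str) -> CharacterMatrix:
    """Parsing layer: read FASTA text into the intermediate model."""
    matrix = CharacterMatrix()
    label = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new record.
            label = line[1:].strip()
            matrix.rows[label] = ""
        elif label is not None:
            # Sequence lines are appended to the current record.
            matrix.rows[label] += line
    return matrix


def serialize_phylip(matrix: CharacterMatrix) -> str:
    """Serialization layer: write the model as relaxed (space-delimited) PHYLIP."""
    lengths = {len(seq) for seq in matrix.rows.values()}
    if len(lengths) > 1:
        raise ValueError("sequences must be aligned to equal length")
    nchar = lengths.pop() if lengths else 0
    lines = [f"{len(matrix.rows)} {nchar}"]
    lines += [f"{label} {seq}" for label, seq in matrix.rows.items()]
    return "\n".join(lines) + "\n"


fasta = ">taxon_A\nACGT\n>taxon_B\nAC-T\n"
phylip = serialize_phylip(parse_fasta(fasta))
print(phylip)
```

Each additional format needs only one parser and one serializer against the shared model, rather than a converter for every pair of formats; the cost is that the model must be rich enough to hold the union of what those formats can express.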
A new standard • What is to be done to achieve this? • Walk-through of the steps involved; • Compare & contrast with abstraction approach;
Creating a new standard • Exhaustively define and publish structure and syntax; • Create unambiguous validation procedure; • Create extension protocol: • Governed extension adoption mechanism • Versioning
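The three requirements above (defined structure, unambiguous validation, governed extension with versioning) can be sketched as a validator that returns a list of errors, so that a document is valid exactly when the list is empty. Everything here is an assumption made for illustration: the slides do not define any field names, the `x-` extension prefix, or a version scheme.

```python
# Hypothetical rules for an imagined standard; none of these come from the slides.
SUPPORTED_MAJOR = 1
CORE_FIELDS = {"version", "taxa", "trees"}
EXTENSION_PREFIX = "x-"  # assumed convention for governed extensions


def validate(doc: dict) -> list[str]:
    """Return all rule violations found in doc; an empty list means valid."""
    errors = []

    # Structure: every core field must be present.
    for name in sorted(CORE_FIELDS - set(doc)):
        errors.append(f"missing required field: {name}")

    # Extension protocol: non-core fields must use the extension prefix.
    for name in sorted(set(doc) - CORE_FIELDS):
        if not name.startswith(EXTENSION_PREFIX):
            errors.append(f"unknown field (not a registered extension): {name}")

    # Versioning: only the supported major version is accepted.
    if "version" in doc:
        version = str(doc["version"])
        major = version.split(".")[0]
        if not major.isdigit():
            errors.append(f"malformed version: {version}")
        elif int(major) != SUPPORTED_MAJOR:
            errors.append(f"unsupported major version: {major}")

    # Consistency: every tree leaf must refer to a declared taxon.
    taxa = set(doc.get("taxa", []))
    for i, tree in enumerate(doc.get("trees", [])):
        for leaf in tree.get("leaves", []):
            if leaf not in taxa:
                errors.append(f"tree {i}: leaf {leaf!r} is not a declared taxon")

    return errors


good = {"version": "1.0", "taxa": ["A", "B"],
        "trees": [{"leaves": ["A", "B"]}], "x-model": "GTR"}
bad = {"version": "2.1", "taxa": ["A"],
       "trees": [{"leaves": ["A", "C"]}], "notes": "hi"}
good_errors = validate(good)
bad_errors = validate(bad)
```

Reporting every violation rather than stopping at the first one makes the procedure objective and repeatable: two implementations that follow the same rules should produce the same error list for the same file.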
Commitment to maintenance • Governing a standard means a long-term commitment from some person or body (NESCent? OBF? Us?), but: • So does the abstraction approach, • …which doesn’t encourage standardization • …and trails rather than leads data trends • …neither of which provides impetus for maintenance
New standard, new features • When designing standard, add attractive new features from the start: • Substitution models; • More metadata for taxa, trees, nodes, matrices, sequences, sites; • More metadata for “project” (analysis metadata, logging)
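One way to picture "more metadata" on every kind of object is a shared base class that gives each object a free-form annotation slot. This is only a sketch under assumed names (`Annotated`, `Node`, `Tree`, and the metadata keys are hypothetical), not a proposal from the slides.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Annotated:
    """Hypothetical base: anything that can carry key/value metadata."""
    metadata: dict[str, str] = field(default_factory=dict)


@dataclass
class Node(Annotated):
    label: str = ""
    children: list["Node"] = field(default_factory=list)


@dataclass
class Tree(Annotated):
    root: Optional[Node] = None


# Node-level metadata (for example a support value) and tree-level
# metadata (for example the substitution model) use the same mechanism.
leaf = Node(label="taxon_A", metadata={"support": "0.98"})
tree = Tree(root=Node(children=[leaf]),
            metadata={"substitution_model": "GTR+G"})
```

Taxa, matrices, sequences, sites, and a "project" record could extend the same base, so that analysis metadata and logs attach wherever they apply.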
Expanding abstraction architecture • Even without new features, supporting the union of existing features in the abstraction approach implies a complex ontology and metaformat, with the risk of: • Premature generalization • Analysis paralysis
Implement IO in common tools • Possible early adopters: • Services: CIPRES and TreeBASE • Analysis apps: PAUP*, HyPhy, MEGA, MrBayes, Mesquite • Toolkits: Bio::*
Implementation of abstraction • Abstraction architecture needs to be developed separately • By whom? In what language? • More of a “complete rewrite” • Doesn’t tap into phylogenetics community expertise • Might lead to “lock-in”
Adoption of new standard • Advocacy to increase community adoption: • Carrots: • Access to new features • Robust, can be validated objectively • Interoperability • Stick: • Submission requirement for services, databases, journals
Adoption of abstraction architecture • Adoption is hampered by a catch-22: • Abstraction architecture only encourages third-party authors to contribute mappings if doing so “adds value” to their application, • But the main added value of the abstraction architecture is the number of contributed mappings