Explore the implementation of the Gold Standard Phenotype Library with GitHub/Shiny framework proposal for better user experience, automated processes, and detailed documentation. Learn about the actors involved and the step-by-step walkthrough of the library architecture.
OHDSI Gold Standard Phenotype Library Working Group
Library Architecture and Implementation
Aaron Potvien
February 12, 2019
Gold Standard Phenotype Library
• Phenotypes: Design
  • Rule-Based (Heuristic)
  • Computable (Probabilistic)
• Phenotypes: Evaluation
  • Chart Review (Annotations)
  • Automated (Algorithmic)
• Library: Architecture and Implementation
Who are the actors?
• End User
• Authors
• Librarians
• Validators
Actor Interactions
The library implementation is largely framed by how these actors interact with each other:
• The User needs the Librarians to maintain Gold Standard library entries.
• The Librarians need Authors to develop Gold Standard phenotypes to populate the library.
• The Authors need the Validators to test the performance of their phenotypes at different sites.
• The Validators need the Librarians to identify candidates to validate.
A GitHub/Shiny Framework Proposal
• Back-end Development & Maintenance
• User Experience
Advantages
• Documentation “heavy lifting” is automated by GitHub:
  • Every change is precisely tracked (authorship, ownership, dates of changes, rationale, etc.)
  • Pull requests encourage peer review
• GitHub and Shiny are already in use by many in the OHDSI community.
• Both are free to use, and Shiny is open source!
A Starting Point
• Beginning with the user experience…
• How would a user know which phenotypes there are to choose from?
• We could create a Shiny application to help them navigate:
  • Identify which phenotypes are currently in the library.
  • Allow the user to search the library (with, say, autocompletion) and choose a phenotype.
  • On selecting a phenotype, display all of the characteristics of that phenotype and aggregate the validations that have been performed.
  • Allow for exporting the chosen phenotype (or going back to browse others).
• As things change, the tool would always represent the “cutting edge” of the library (e.g. a new phenotype was added, a new validation was done).
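The search-and-select step above can be sketched in a few lines. This is a minimal illustration in Python, not the proposed Shiny app (which would be written in R). The `LibraryEntry` fields, the `suggest` and `summarize` helpers, and the phenotype and site names are all hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryEntry:
    """One phenotype in the library (fields are illustrative assumptions)."""
    name: str
    description: str
    validation_sites: list = field(default_factory=list)


def suggest(entries, query, limit=5):
    """Autocomplete: names starting with the query first, then names containing it."""
    q = query.strip().lower()
    if not q:
        return []
    names = sorted(e.name for e in entries)
    prefix = [n for n in names if n.lower().startswith(q)]
    inside = [n for n in names if q in n.lower() and not n.lower().startswith(q)]
    return (prefix + inside)[:limit]


def summarize(entry):
    """Collect what the app might display once a phenotype is selected."""
    return {
        "name": entry.name,
        "description": entry.description,
        "n_validations": len(entry.validation_sites),
        "sites": sorted(set(entry.validation_sites)),
    }


# Placeholder entries; not real library content.
entries = [
    LibraryEntry("Type 2 diabetes mellitus", "placeholder", ["Site A", "Site B"]),
    LibraryEntry("Type 1 diabetes mellitus", "placeholder"),
    LibraryEntry("Atrial fibrillation", "placeholder", ["Site A"]),
    LibraryEntry("Gestational diabetes", "placeholder"),
]

print(suggest(entries, "type"))
print(summarize(entries[0]))
```

Because the suggestions are computed from the current list of entries each time, a newly added phenotype or validation would show up without any change to the search logic.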
Step-by-Step Walkthrough
• Librarians house the list of phenotypes.
• Authors house the phenotypes and documentation, but librarians know the implementation at the time it was submitted and can detect any changes made to the cohort definition with hashes.
• More on this point…
Hashes in a Nutshell
• A hash function takes data of any size as input and outputs a fixed-length string that is, for practical purposes, unique to that input.
• Therefore, putting the implementation instructions through a hash function allows them to be “frozen in time”.
• Any change made to the implementation will result in a different hash.
• Thus, if an author successfully submitted a phenotype to the library and subsequently changed it, we could detect the change and act accordingly.
• Easy to implement in R!
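A minimal sketch of the idea, shown here in Python with the standard `hashlib` module rather than R. The choice of SHA-256, the helper names, and the example implementation text are all assumptions made for illustration; the slides do not specify a hash algorithm.

```python
import hashlib


def implementation_hash(data: bytes) -> str:
    """Return the SHA-256 digest of the implementation as a hex string."""
    return hashlib.sha256(data).hexdigest()


def has_changed(stored_hash: str, current_data: bytes) -> bool:
    """True if the current implementation no longer matches the stored hash."""
    return implementation_hash(current_data) != stored_hash


# Made-up implementation content, standing in for a cohort definition.
submitted = b'{"cohort": "example", "min_age": 18}'
edited = b'{"cohort": "example", "min_age": 21}'

# The librarian records this hash at submission time.
stored = implementation_hash(submitted)

print(len(stored))                     # SHA-256 hex digests are 64 characters
print(has_changed(stored, submitted))  # unchanged implementation
print(has_changed(stored, edited))     # implementation edited after submission
```

The librarians would only need to keep the short `stored` string per phenotype, not a full copy of the implementation, in order to notice later edits.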
To Hash or not to Hash?
• Don’t check any hash: Authors could change the phenotype after having submitted it to the library without librarians (and, by extension, users) knowing!
• Check every hash: Trivial changes (e.g. fixing spelling errors, adding an author) would falsely cause the phenotype to appear substantively different.
• Happy Medium: Keep track of the implementation hash only. The notion of “implementation” is flexible. It could be a JSON file or an archive of multiple files.
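The "happy medium" could look roughly like the sketch below, again in Python for illustration. It assumes a hypothetical entry layout with separate `metadata` and `implementation` parts, where the implementation is a set of named files. Only the implementation is hashed, so a metadata edit leaves the hash alone. The file names, field names, and SHA-256 are assumptions, not part of the proposal.

```python
import hashlib
import json


def canonical_json(obj) -> bytes:
    """Serialize JSON with sorted keys so formatting differences do not alter the hash."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def hash_implementation(files) -> str:
    """Hash a dict of {file name: bytes}, independent of insertion order."""
    h = hashlib.sha256()
    for name in sorted(files):
        # Length-prefix each part so file boundaries are unambiguous.
        for part in (name.encode("utf-8"), files[name]):
            h.update(len(part).to_bytes(8, "big"))
            h.update(part)
    return h.hexdigest()


# Hypothetical library entry; contents are placeholders.
entry = {
    "metadata": {"title": "Example phenotype", "authors": ["Author A"]},
    "implementation": {
        "cohort.json": canonical_json({"cohort": "example", "min_age": 18}),
        "notes.txt": b"implementation notes",
    },
}

stored = hash_implementation(entry["implementation"])

# Trivial change: adding an author touches metadata only.
entry["metadata"]["authors"].append("Author B")
after_metadata_edit = hash_implementation(entry["implementation"])

# Substantive change: the cohort definition itself is edited.
entry["implementation"]["cohort.json"] = canonical_json({"cohort": "example", "min_age": 21})
after_implementation_edit = hash_implementation(entry["implementation"])

print(stored == after_metadata_edit)        # metadata edits are ignored
print(stored == after_implementation_edit)  # implementation edits are detected
```

Where the line between metadata and implementation falls is exactly the flexibility the slide mentions; this sketch simply hashes whatever is placed under `implementation`.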
Under Development
• Lots of details remain to be filled in!
• Who will the librarians be, and how will they operate coherently?
• What would the interface of the Shiny app be like?
• What are the exact elements required in the metadata file?
• What criteria are required for admittance into the phenotype library?
• What action should be taken when the hashes don’t match up?
• What is the procedure for submitting a validation set to the librarians?
• How would multiple validation sets be aggregated in a meaningful way within the Shiny app?
• What’s the best way to incorporate tagging and versioning?
• Would we need an extra program (e.g. a metadata generator) to help authors prepare their phenotypes for submission to the library and maintain them afterward?
• What exactly is the user provided with at the end when they choose a phenotype?
• For now, we are thinking about this at a “high level” while allowing the Gold Standard Design/Evaluation components to develop organically.
Thank you! Questions or Comments?