CCEGA Informatics Working Group

CCEGA Informatics Working Group. Bradley Hemminger, School of Information and Library Science. Supported in part by NIH Grant 5P20RR020751-02. Participants: Roger Akers, Shepp Center; Peter DeSaix, Epidemiology; Xiaojun Guan, RENCI; Kevin Gamiel, RENCI


Presentation Transcript


  1. CCEGA Informatics Working Group • Bradley Hemminger, School of Information and Library Science • Supported in part by NIH Grant 5P20RR020751-02 CCEGA Informatics Hemminger

  2. Participants • Roger Akers, Shepp Center • Peter DeSaix, Epidemiology • Xiaojun Guan, RENCI • Kevin Gamiel, RENCI • Barrie Hayes, Health Sciences Library • Brad Hemminger (chair), School of Information & Library Science • Clark Jeffries, RENCI • Joel Kingsolver, Biology • Lavanya Ramakrishnan, RENCI • David Threadgill, Genetics • Kirk Wilhelmsen, Genetics • Dong Xiang, Lineberger Cancer Center

  3. Aims • Universal data model sharable by everyone. • Standardized, independent methods, so location can be anywhere. • Practical. Adoptable by many disparate groups for both new and legacy systems. • Utilize existing domain standards, controlled vocabularies and ontologies (e.g. GO, MIAME, caBIG, …) • Data repository should be safe and secure, with only controlled and accountable access by appropriate qualified entities.

  4. Areas of Focus • Development of common data model • Determine ways the common data model can be implemented as a common shared digital repository that allows for the ingest of digital content from many varied sources (both existing projects and new projects), and controlled access by appropriate people and automated agents.

  5. Areas of Focus cont’d • Address practical issues of how such a repository could be utilized by different groups with different needs in different contexts. Demonstrate how using the repository benefits these groups, to encourage them to adopt it. • Define security and privacy issues for the repository, and propose and implement methods to address them. • Preservation and curation.

  6. Overview • Status quo (difficulties summarized in Kirk’s talk). • Diagram and brief explanation of planned architecture. • How labs, clinics, and analysis would interact with repository.

  7. Issues: Lab and Clinic to Analysis • Independent data management • Data security • Version control • Redundancy • Controlled access [Diagram labels: ELSI, Clinical, Laboratory, Analysis]

  8. CCEGA Model • We want the integration of the data operations across the labs, clinics, and analysis. [Diagram labels: ELSI, Analysis, Integration & Informatics, Lab, Clinic]

  9. [Architecture diagram. Labels: Lab, mapping, Ingest, Output, Permissions, Analysis Methods, Data Store, Repository, Association Table]

  10. Timeline • First intramural workshop (spring 2005) • Weekly meetings (beginning spring 2005) • Development of draft common model based on wealth of experience in local labs, and existing standards • Analysis of data requirements, and existing infrastructure at UNC. Internal interviews with labs • Second intramural workshop (summer/fall 2005) • Present draft common model for review and feedback by UNC community

  11. Timeline continued • Extramural workshop (winter 2005) • Bring community of experts to UNC for discussions. • Learn in more detail about related work outside of UNC • Present our draft model to get feedback and criticism. • Refine model • Implement and test model using data from the three main projects identified in this grant. • Plan for how this model spreads, including how to promote its use by groups with existing infrastructure as well as by new groups.

  12. Common Data Model • Survey schema/models in use by labs • Develop set of general requirements • Get ELSI and HIPAA requirements • Develop generalized model capable of meeting needs • Test model with data collection and analysis programs for alcoholism and addiction, breast cancer, and epidemiology studies that are part of the grant.

  13. Initial Examples • Epidemiology Specimen Collection and Tracking System (Roger) • Alcoholism and Addiction Study (Kirk) • Proteomics Core Facility General Model (Brad)

  17. Security • Security will be designed into the CCEGA model and implemented in the repository to protect information while still allowing researchers timely access to data. • Data will be protected via trusted broker methodology. • Information is made anonymous by use of randomly chosen keys assigned by the trusted broker. The assignment is made at the clinic-database interface. • The coded key will be used to identify experimental data, while providing linkage to the source organism's private information in a secure association table.
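The trusted-broker idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the working group's implementation: the class `TrustedBroker`, its methods, and the field names `subject_id` and `coded_key` are all hypothetical, and a real association table would live in a separately secured store rather than in memory.

```python
import secrets


class TrustedBroker:
    """Hypothetical broker sitting at the clinic-database interface."""

    def __init__(self):
        # Secure association table: coded key -> private information.
        self._association_table = {}
        # Remembers the key already assigned to each subject.
        self._key_for_subject = {}

    def code_record(self, record, private_fields):
        """Return a copy of `record` with private fields replaced by a coded key."""
        # The subject identifier itself is always treated as private.
        private = set(private_fields) | {"subject_id"}
        subject = record["subject_id"]
        key = self._key_for_subject.get(subject)
        if key is None:
            # Randomly chosen key, so it carries no information about the subject.
            key = secrets.token_hex(16)
            self._key_for_subject[subject] = key
            self._association_table[key] = {
                name: record[name] for name in private if name in record
            }
        coded = {k: v for k, v in record.items() if k not in private}
        coded["coded_key"] = key
        return coded

    def lookup(self, coded_key, authorized=False):
        """Link a coded key back to private information, for authorized callers only."""
        if not authorized:
            raise PermissionError("re-identification requires authorization")
        return dict(self._association_table[coded_key])


broker = TrustedBroker()
clinic_record = {"subject_id": "S1", "name": "Jane Doe", "genotype": "AA"}
coded_record = broker.code_record(clinic_record, private_fields=["name"])
# coded_record keeps "genotype" and gains "coded_key"; "name" and "subject_id" are gone.
```

Only the coded record would travel on to the repository and analysis side; the link back to the person stays with the broker.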

  18. Accountability • Access permissions will determine which entities are allowed access to which data. • All access to data is tracked via logs. • “Audit-readiness” will be maintained to respond quickly to an outside investigation and challenge with the goal of quick clearance. • Regular or random internal security audits will be included in a management strategy. Documents used in audits include 24/7 logs, flowcharts of procedures, training documents, incident reports, etc.
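A minimal Python sketch of the first two points, permission checks plus a log entry for every access attempt, might look like the following. The class `AuditedRepository`, the log fields, and the example entity and dataset names are assumptions made for illustration; the slides do not specify a design.

```python
from datetime import datetime, timezone


class AuditedRepository:
    """Hypothetical repository wrapper that checks permissions and logs every access."""

    def __init__(self, data, permissions):
        self._data = data                # dataset name -> content
        self._permissions = permissions  # entity -> set of readable dataset names
        self.access_log = []             # one entry per attempt, granted or denied

    def read(self, entity, dataset):
        allowed = dataset in self._permissions.get(entity, set())
        # Log before enforcing, so denied attempts are recorded as well.
        self.access_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "entity": entity,
            "dataset": dataset,
            "granted": allowed,
        })
        if not allowed:
            raise PermissionError(f"{entity} may not read {dataset}")
        return self._data[dataset]


repo = AuditedRepository(
    data={"assay_results": [0.4, 0.7]},
    permissions={"analysis_group": {"assay_results"}},
)
values = repo.read("analysis_group", "assay_results")
```

Recording denied attempts alongside granted ones is one way to support the audit-readiness goal, since an auditor can then see who tried to reach which data as well as who succeeded.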

  19. Future (P50) Goals • Comprehensive survey and publication of different schemas, architectures, controlled vocabularies/ontologies used by different groups. Comparison of similarities and differences. • Digital content preservation planning. • Study of what factors determine how well such models are adopted in this environment. • Make publicly available the developed resources (data model, digital repository content, database structure/schema).

