Development and Experience with Tissue Banking Tools to Support Cancer Research. Waqas Amin M.D , Anil V. Parwani M.D PhD and Michael J. Becich M.D, PhD1
Development and Experience with Tissue Banking Tools to Support Cancer Research
Waqas Amin, M.D., Anil V. Parwani, M.D., Ph.D., and Michael J. Becich, M.D., Ph.D.
1 Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
2 Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
Introduction:
• Over the last decade, the Department of Biomedical Informatics (DBMI) at the University of Pittsburgh has developed and deployed various tissue banking informatics tools to expedite translational medicine research.
• These tools manage the clinicopathologic annotation, inventory, and distribution of biospecimens that are collected and stored for translational research use by the scientific community.
Tissue Banking Informatics:
• Aggregation: The process of associating tissue samples with valuable data, including demographic, epidemiologic, pathology, progression, vital status, therapy, and outcomes data.
• Standardization: Collected data must be uniform and shareable. A standardized approach to annotation ensures the uniformity, consistency, and quality of collected data, which facilitates information sharing across multiple institutions.
• Searchability: An information model supported by standardized data collection allows annotated tissue samples to be matched to research queries, thereby facilitating better understanding of the experimental design and results.
Data Requirements in Cancer Research:
• High-quality, accurate, and comprehensive data are required to support genomic, proteomic, clinical, and translational research.
• Data must be acquired in accordance with legal and ethical human subject policies.
• Types of data collected:
  • Demographic data
  • Patient clinical data
  • Pathology block-level data
  • Patient treatment data
  • Outcome and follow-up data
  • Biochemical data
  • Genomic-level data
  • Cell- and tissue-level data
Data Collection Standards:
• Development of Common Data Elements (CDEs):
  • CDEs are standardized clinical annotations defined in detail using metadata. They allow uniform, consistent, shareable data collection across multiple institutions and systems.
  • Development of CDEs is supervised by a multidisciplinary team. A CDE subcommittee developed consensus CDEs incorporating the following standards, as applicable to organ-specific tissue:
    • Association of Directors of Anatomic and Surgical Pathology (ADASP) Cancer Reporting Guidelines
    • American Joint Committee on Cancer (AJCC) Cancer Staging Manual
    • North American Association of Central Cancer Registries (NAACCR) Data Standards for Cancer Registries
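A CDE can be pictured as a field definition plus the metadata that constrains its values. The following is a minimal sketch under that assumption, not the CDE schema used by DBMI: the class `CommonDataElement`, the helper `validate_record`, the field `specimen_laterality`, and its value list are all hypothetical illustrations.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CommonDataElement:
    """Hypothetical CDE: a field name plus the metadata that constrains it."""
    name: str
    definition: str
    source_standard: str                 # e.g. "AJCC", "ADASP", "NAACCR"
    permissible_values: Tuple[str, ...]  # controlled vocabulary for this field

    def validate(self, value: str) -> Optional[str]:
        """Return an error message if value is outside the vocabulary, else None."""
        if value not in self.permissible_values:
            return f"{self.name}: {value!r} is not a permissible value"
        return None


# Illustrative element only; the value list is an assumption, not a real CDE.
LATERALITY = CommonDataElement(
    name="specimen_laterality",
    definition="Side of the body from which the specimen was taken",
    source_standard="illustrative",
    permissible_values=("left", "right", "bilateral", "not applicable"),
)


def validate_record(record: dict, elements: list) -> list:
    """Check a record against every CDE and collect all error messages."""
    errors = []
    for cde in elements:
        if cde.name not in record:
            errors.append(f"{cde.name}: missing")
            continue
        error = cde.validate(record[cde.name])
        if error is not None:
            errors.append(error)
    return errors
```

Rejecting free-text variants such as "Left side" at entry time is what keeps records from different institutions comparable.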
Data Sources:
Data are imported from automated electronic systems such as AP-LIS, CP-LIS, and the Radiology and Registry Information System (RIS). Other sources include patient questionnaires, patient health records and treatment charts, existing databases, consultation with referring physicians, archived data, and pathology reports.
De-Identification of PHI:
The purpose is to ensure the confidentiality and privacy of human subjects under Institutional Review Board approved protocols. De-identification of PHI is performed by an Honest Broker in accordance with Health Insurance Portability and Accountability Act (HIPAA) regulations, by assigning unique codes to patient identifiers.
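The coding step performed by the Honest Broker can be sketched as follows. This is an illustration under stated assumptions, not the DBMI procedure: the class `HonestBroker`, the identifier field names, and the `SUBJ-` code format are hypothetical, and real HIPAA de-identification covers many more identifier types than shown here.

```python
import secrets


class HonestBroker:
    """Hypothetical honest broker holding the only link between IDs and codes."""

    # Assumed identifier fields; a real protocol defines its own list.
    IDENTIFIER_FIELDS = ("patient_name", "mrn", "date_of_birth")

    def __init__(self):
        self._code_by_mrn = {}  # private linkage table, never shared

    def code_for(self, mrn: str) -> str:
        """Return the stable study code for a patient, creating one if needed."""
        if mrn not in self._code_by_mrn:
            self._code_by_mrn[mrn] = "SUBJ-" + secrets.token_hex(8)
        return self._code_by_mrn[mrn]

    def deidentify(self, record: dict) -> dict:
        """Copy a record, dropping identifiers and adding the study code."""
        cleaned = {k: v for k, v in record.items()
                   if k not in self.IDENTIFIER_FIELDS}
        cleaned["subject_code"] = self.code_for(record["mrn"])
        return cleaned
```

Because the code is random rather than derived from the identifier, it cannot be reversed without the broker's linkage table, and the same patient receives the same code across records.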
Specimen Collection and Standardization:
• Biospecimens are collected according to standardized pathology and tissue banking protocols.
• Biospecimens collected and stored for the tissue banking project include:
  • Paraffin blocks
  • Fresh frozen tissue
  • Blood products, including:
    • Serum
    • Plasma
    • Buffy coat
    • RBC
    • WBC
Tissue Banking Information Models and Architecture:
• Two types of information models have been used in developing the tissue banks:
  • Organ-specific databases (OSD)
    • Cooperative Prostate Cancer Tissue Resource (CPCTR) (www.cpctr.info)
    • Pennsylvania Cancer Alliance for Bioinformatics Consortium (PCABC) (www.pcabc.upmc.edu)
    • Early Detection Research Network (EDRN) Colorectal and Pancreatic Neoplasm Database
    • SPORE Head and Neck Neoplasm Database
  • Model-driven approach (database)
    • National Mesothelioma Virtual Bank (NMVB) (www.mesotissue.org)
OSD (Organ-Specific Database):
• OSD has a three-tiered architecture. It is implemented on Oracle Application Server v10.1.2.3 running on Windows 2003, with Oracle RDBMS v10.2.0.2 running on an AIX 5L virtual host definition supported by IBM x3850 system hardware.
• Dynamic web pages are generated for database users using Oracle HTTP Server and the mod_plsql extension.
• The data annotation engine is a flexible, dynamic, web-based tool, while the data query engine lets investigators search de-identified information within the warehouse through a “point and click” interface.
OSD Multi-Tier Architecture:
[Architecture diagram. Labels recoverable from the slide, in extraction order: Presentation, Metadata Engine, Physical Data, Metadata Curation, Common Data Elements (CDE) Definitions, Application Data Layer, Admin, Security, HELP Builder, Business Rules Engine, Mapping Engine, Metadata Data Layer, Manual Annotation, Data Query, Security Engine, Security Data Layer, Registration, Authorization, Data Import, Export, Authentication]
OSD Feature List:
• To address the needs of heterogeneous users, we identified numerous criteria for success. Some requirements and features are listed below:
  • Quick statistics on overall data.
  • Multi-mode search: multiplex search and advanced search.
  • Mechanisms for keeping users oriented (e.g., help, persistence of last entered query text).
  • Results in tabular form, with sorting on each column and access to the full case report.
  • Both Honest Broker and de-identified (researcher) access.
  • Controlled access to subjects for different studies.
Feature List (contd.):
• Standard and customized query results of the data.
• Individual research and consent-based access to information.
• Quick search using cases saved in “My Cases”.
• Query Builder interface.
• Online Help Manual Builder.
• The model can support a multi-institutional data enterprise.
• The User Management Module helps create, revoke, and control users' access and activities within the database.
• The business layer allows creation of complex, logical data fields based on data interpretation by experts.
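The interaction between user management and the two access levels (Honest Broker versus de-identified researcher) can be sketched as below. This is a minimal illustration, not the OSD implementation, which the slides describe as Oracle-based: the names `Role`, `UserRegistry`, and `IDENTIFYING_FIELDS` are hypothetical.

```python
from enum import Enum


class Role(Enum):
    HONEST_BROKER = "honest_broker"
    RESEARCHER = "researcher"


# Assumed set of identifying fields; not taken from the OSD schema.
IDENTIFYING_FIELDS = frozenset({"patient_name", "mrn"})


class UserRegistry:
    """Hypothetical user management: grant and revoke per-study roles."""

    def __init__(self):
        self._grants = {}  # (user, study) -> Role

    def grant(self, user: str, study: str, role: Role) -> None:
        self._grants[(user, study)] = role

    def revoke(self, user: str, study: str) -> None:
        self._grants.pop((user, study), None)

    def case_report(self, user: str, study: str, case: dict) -> dict:
        """Return the view of a case that the user's role permits."""
        role = self._grants.get((user, study))
        if role is None:
            raise PermissionError(f"{user} has no access to {study}")
        if role is Role.HONEST_BROKER:
            return dict(case)  # identified view
        # Researchers see the de-identified view only.
        return {k: v for k, v in case.items() if k not in IDENTIFYING_FIELDS}
```

Keying grants on the (user, study) pair mirrors the per-study controlled access in the feature list: revoking one study leaves a user's other studies untouched.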
OSD Model-Based Head and Neck Neoplasm Virtual Biorepository:
• Develops a bioinformatics-driven system that uses multimodal data sets from patient questionnaires and clinical, pathology, radiology, and molecular systems.
• Results in one architecture supported by a set of CDEs to facilitate basic science, clinical, and translational research.
• Designed to facilitate semantic and syntactic interoperability in the development of data elements (i.e., metadata or data descriptors using controlled vocabularies and ontologies).
• Provides data entry, data mining, and analysis tools.
OSD Integration with Other Data Sources:
[Integration diagram. Labels recoverable from the slide, in extraction order: Genotype Lab data, BIOS, AP-LIS/CP-LIS, Patient Insurance information, Bio-marker data, SPORE H&N Neoplasm Database, Human Papilloma Virus, Questionnaire data, RIS, Radiology (PET/CT) data, Epidemiology Project-1 questionnaire data]
Data Collection & Annotation Tool: User authentication
Data Collection & Annotation Tool: User Management Module
Data Collection & Annotation Tool: Administrator can create, edit, revoke, and control users and their access to different applications
Data Collection & Annotation Tool: Manual data collection module, case summary
Data Collection & Annotation Tool: Users can switch quickly between available applications according to their access rights
Data Collection & Annotation Tool: Quick overall review of statistics on the collected database
Data Collection & Annotation Tool: Data query template
Data Collection & Annotation Tool: Standard view
Data Collection & Annotation Tool: Descriptions of different views for reference
Data Collection & Annotation Tool: Allows data export for statistical analysis packages such as SAS
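The slides do not state the export format. As a hedged sketch, the function below (the hypothetical `export_view_csv`) writes the columns of a chosen view to CSV, a format that statistical packages can generally import; the column names in the usage note are illustrative.

```python
import csv
import io


def export_view_csv(cases, columns):
    """Write the selected columns of each case to CSV text.

    Missing values become empty cells; fields outside `columns` are ignored.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns,
                            restval="", extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(cases)
    return buffer.getvalue()
```

For example, `export_view_csv(cases, ["subject_code", "stage"])` emits only those two columns, so identifying fields that are not part of the view never reach the exported file.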
Data Collection & Annotation Tool: Users can keep multiple “My Cases” lists for different studies; full case report view (identified or de-identified, according to access level)
Data Collection & Annotation Tool: Users can also select any data fields to create personalized views and save them under “My Views”
Data Collection & Annotation Tool: Administrator can edit or create data views
OSD-Based Database Accruals: (Amin et al., Tissue banking informatics, 2010)
Model-Driven Database (MDD):
• NMVB is developed using a model-driven approach (MDD).
• Application components are generated from UML domain models.
• It is a Java-based application designed using a Model-Driven Development framework.
MDD (contd.):
• Web Tier: Constructs web pages from the metadata dictionary.
• Business Tier: Provides an object/relational mapping mechanism, a metadata interrogation mechanism, an application programming interface, and a set of shared services.
• Data Tier: Consists of the domain database, which houses clinically annotated data, indexes that support the query mechanism, and security data.
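The idea of constructing web pages from a metadata dictionary can be sketched as below. NMVB itself is Java-based and its framework is not shown in the slides, so this Python fragment only illustrates the principle: the `METADATA` entries, field names, option lists, and the `render_field` and `render_form` helpers are all hypothetical.

```python
from html import escape

# Hypothetical metadata dictionary; field names and options are illustrative.
METADATA = [
    {"name": "histologic_type", "label": "Histologic type",
     "options": ["epithelioid", "sarcomatoid", "biphasic"]},
    {"name": "age_at_diagnosis", "label": "Age at diagnosis", "options": None},
]


def render_field(field: dict) -> str:
    """Render one form control: a select if options exist, else a text input."""
    name = escape(field["name"], quote=True)
    label = f'<label for="{name}">{escape(field["label"])}</label>'
    if field["options"]:
        opts = "".join(f"<option>{escape(o)}</option>" for o in field["options"])
        control = f'<select id="{name}" name="{name}">{opts}</select>'
    else:
        control = f'<input id="{name}" name="{name}">'
    return label + control


def render_form(metadata: list) -> str:
    """Build a whole data entry form from the metadata dictionary."""
    return "<form>" + "".join(render_field(f) for f in metadata) + "</form>"
```

With this design, adding a data element means adding one dictionary entry; the page code itself does not change.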
Virtual Component of NMVB:
• Statistical Data Query Interface
• Approved Investigator Query Interface
• Data Entry Interface
Conclusion:
Informatics-supported tissue banking initiatives act as a large source of annotated biospecimens and facilitate basic and clinical science research. Tissue banking infrastructure allows efficient governance, standardized data capture, and detailed standardized annotation at the local institution and across multiple collaborating sites. Finally, the tissue banking tools developed at DBMI (Department of Biomedical Informatics) provide an important knowledge base for the development of integrated tissue banking efforts and benefit other tissue banking initiatives through consultation.