FBIRN Federated Informatics Research Environment (FIRE) David B. Keator University of California, Irvine
Overview • Function BIRN introduction • Federated landscape • Image/File publication • Clinical data collection • Data import/export • Derived data • Federated QC and project management tracking
FBIRN Goals • Develop the capability to analyze, as a single data set, data acquired at multiple sites using tools developed at multiple sites • Develop multi-site functional neuroimaging tools • Develop a federated data management system to support these multi-site imaging and genetics studies
Informatics Requirements • Federated: sites control their own data collections (GridFTP) • Sites maintain their own database (HID database) • Original and derived data provenance • Low cost, low maintenance overhead • Data access quality-of-service requirements • Access control • Site = RW, Project Group = R, User = Site • Replication • Performance for data transfers • Faster than scopy & sftp • Human time • Improve efficiency over data warehousing
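The access-control rule on this slide (Site = RW, Project Group = R, User = Site) can be sketched as a small permission check. This is only an illustration under stated assumptions: it reads "User = Site" as "a user inherits the rights of their home site", and the `User`, `Dataset`, and `allowed` names are hypothetical and are not taken from HID or GridFTP.

```python
from dataclasses import dataclass

# Hypothetical permission model; names are illustrative, not HID's actual API.
READ, WRITE = "r", "w"


@dataclass(frozen=True)
class User:
    name: str
    site: str            # home site; assumed to determine the user's rights
    projects: frozenset  # project groups the user belongs to


@dataclass(frozen=True)
class Dataset:
    owner_site: str  # the site that published and controls the data
    project: str     # the project group the data belongs to


def allowed(user: User, data: Dataset, action: str) -> bool:
    """Apply: site = read/write on its own data, project group = read only."""
    if user.site == data.owner_site:
        return action in (READ, WRITE)
    if data.project in user.projects:
        return action == READ
    return False


alice = User("alice", "UCI", frozenset({"fbirn"}))
own_data = Dataset(owner_site="UCI", project="fbirn")
remote_data = Dataset(owner_site="UCSD", project="fbirn")
unrelated = Dataset(owner_site="UCSD", project="other")

print(allowed(alice, own_data, WRITE))     # own site: write permitted
print(allowed(alice, remote_data, WRITE))  # other site, same project: read only
```

Keeping the rule in one function makes the federation property visible: write access never leaves the owning site, while project membership grants read access across sites.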
GridFTP/Database Server Federation • Diagram legend: PostgreSQL HID database; firewall; GridFTP server • Sites shown: UMN, BWH, MGH, UI, UCSF, VA, UCLA, Duke, UCI, VA, UCSD, MIND/UNM
Local Data Publication Workflow • fMRI images: automated image upload to Data Grid/HID for sharing (DICOM -> NIfTI) • Clinical data: computer-aided scale input via clinical data entry interface and remote tablet PC interface • Diagram labels: fMRI Scanner; Result Images and XML wrapper in Data Grid; RLS; BIRN-CC; GridFTP (Local); Processing Pipeline (FIPS, SPM, FreeSurfer, etc.); Analysis Results; HID(s) (Local); Results with standard descriptions in HID (i.e., data provenance); Multi-Site User Query
Image/File Upload Scripts • Uploads via GridFTP tools • Location independence via Replica Location Service (RLS) • Single sign-on to resources • Faster than scopy/sftp
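As a rough illustration of what such an upload script might do, the sketch below builds (but does not run) a `globus-url-copy` command line and records the resulting physical URL under a logical file name. The host name, paths, and the in-memory `replicas` dictionary are assumptions for illustration only. A real deployment would register files with the Replica Location Service rather than a Python dictionary, and would need a valid grid proxy credential for single sign-on before any transfer.

```python
import shlex


def gridftp_upload_command(local_path, host, remote_path, streams=4):
    """Build (but do not execute) a globus-url-copy command line."""
    if not local_path.startswith("/") or not remote_path.startswith("/"):
        raise ValueError("paths must be absolute")
    return [
        "globus-url-copy",
        "-vb",                # print transfer performance while copying
        "-p", str(streams),   # number of parallel data streams
        "-cd",                # create destination directories if needed
        f"file://{local_path}",
        f"gsiftp://{host}{remote_path}",
    ]


# Toy stand-in for a replica catalog: logical file name -> physical URLs.
# This only models the idea of location independence; it is not the RLS API.
replicas = {}


def register_replica(logical_name, physical_url):
    replicas.setdefault(logical_name, []).append(physical_url)


# Hypothetical host and paths, used only to show the shape of the command.
cmd = gridftp_upload_command(
    "/data/sub01/run1.nii.gz",
    "gridftp.example.org",
    "/fbirn/sub01/run1.nii.gz",
)
register_replica("fbirn/sub01/run1.nii.gz", cmd[-1])
print(shlex.join(cmd))
```

Users then ask for the logical name and let the catalog resolve it to whichever site holds a copy, which is what makes the data location independent.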
Keator, et al. IEEE Trans Inf Technol Biomed. 2008 Mar;12(2):162-72.
HID: Human Imaging Database. Ozyurt, et al. Neuroinformatics 2010. • UCSD: Lead Development • UCI: Lead Development • UCLA: PostgreSQL Specifics • UNM: Clinical Measures • UIowa: Performance • Duke: XML
Architecture Overview • Diagram labels: CALM/GAME; Assessments; Multi-Site Query; XCEDE Services; Subject Management; Web Application; Study Protocols; Study Data; Clinical / Demographics; Core Hierarchy; Schema; Oracle; PostgreSQL; File System; Data Grid; Database Data
Data Entry: TabletPC Remote • Remote TabletPC based assessment entry
Data Entry: Excel Spreadsheet • Excel-based data input • Excel-based online GUI form creation • Excel spreadsheet demo
Data Export • Shopping cart model via HID GUI • Add scans and assessments from multiple sites for download (via job scheduler) • CSV values file for assessment data • Excel spreadsheet
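The export step can be pictured as flattening the contents of the cart into one CSV file. The sketch below assumes a cart is simply a list of assessment rows gathered from several sites. The column names and the sample rows are made up for illustration and do not reflect the actual HID export layout.

```python
import csv
import io


def cart_to_csv(cart):
    """Flatten assessment rows from several sites into one CSV string."""
    fields = ["site", "subject_id", "assessment", "item", "value"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    # Sort so that rows from the same site and subject stay together.
    ordered = sorted(cart, key=lambda r: (r["site"], r["subject_id"], r["item"]))
    for row in ordered:
        writer.writerow(row)
    return buf.getvalue()


# Made-up rows standing in for items a user added to the cart.
cart = [
    {"site": "UCSD", "subject_id": "S002", "assessment": "demo_scale",
     "item": "q1", "value": 3},
    {"site": "UCI", "subject_id": "S001", "assessment": "demo_scale",
     "item": "q1", "value": 2},
]

csv_text = cart_to_csv(cart)
print(csv_text)
```

A single flat file with a `site` column keeps the multi-site origin of each row visible after the data leaves the federation.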
FIPS – FBIRN Image Processing Scripts • FSL-based package for the comprehensive management of large-scale multi-site fMRI projects, including data storage, retrieval, calibration, analysis, multi-modal integration, and quality control.
XML-Based Metadata Format • The XML-Based Clinical and Experimental Data Exchange (XCEDE; www.xcede.org) XML schema provides an extensive metadata hierarchy for describing and documenting research and clinical studies. Keator, et al. Neuroinformatics 2006; 4(2):199-212.
XCEDE Analysis Document Data provenance Pipeline component provenance Analytic results
Derived Data – HID • Pipelines consist of many components arranged in specific orders. • Results from each step depend on command-line parameters. • Raw data enter as inputs at various stages of the workflow and are combined with derived data from previous steps. • Individual tools come from different developers. • Many tools log the command-line parameters of an execution but provide little additional provenance. Keator, et al. Front. Neuroinformatics. 2009;3:30.
HID Analysis Components • Generic bag of processes representation • Encourages reuse of processes in subsequent pipelines • Analysis flow and component tables document how processes are assembled into pipelines • Tables for storing instantiation of pipeline across sites • Supports distributed data and analyses • Generic tables for analytic results used by semi-automated query page builder
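One way to picture the "bag of processes" idea is with a few small record types: processes defined once, pipelines that reference them in order, and per-site runs that store the command line, inputs, and outputs of each step. The sketch below is an assumption-laden illustration. The class names, tool names, and command lines are hypothetical and do not mirror the actual HID tables.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Process:
    """One reusable entry in the bag of processes."""
    name: str
    version: str


@dataclass
class Pipeline:
    """An ordered assembly of processes (the analysis flow)."""
    name: str
    steps: list = field(default_factory=list)


@dataclass
class StepRun:
    """One executed step: what ran, how, and on which files."""
    process: Process
    command_line: str
    inputs: tuple
    outputs: tuple


@dataclass
class PipelineRun:
    """An instantiation of a pipeline at one site."""
    pipeline: Pipeline
    site: str
    step_runs: list = field(default_factory=list)

    def record(self, process, command_line, inputs, outputs):
        if process not in self.pipeline.steps:
            raise ValueError(f"{process.name} is not part of {self.pipeline.name}")
        self.step_runs.append(
            StepRun(process, command_line, tuple(inputs), tuple(outputs)))

    def provenance_of(self, output):
        """Walk back from a derived file to every step that contributed."""
        trail, wanted = [], {output}
        for run in reversed(self.step_runs):
            if wanted & set(run.outputs):
                trail.append(run)
                wanted |= set(run.inputs)
        return list(reversed(trail))


# Hypothetical tools; the same Process objects are reused by both pipelines.
motion = Process("motion_correct", "1.0")
smooth = Process("smooth", "1.0")
stats = Process("stats", "1.0")
preproc = Pipeline("preproc", [motion, smooth])
full = Pipeline("full_analysis", [motion, smooth, stats])

run = PipelineRun(full, site="UCI")
run.record(motion, "motion_correct --ref 0 raw.nii", ["raw.nii"], ["mc.nii"])
run.record(smooth, "smooth --fwhm 5 mc.nii", ["mc.nii"], ["sm.nii"])
# Raw data re-enter at a later stage, combined with derived data.
run.record(stats, "stats sm.nii raw.nii", ["sm.nii", "raw.nii"], ["zstat.nii"])

trail = run.provenance_of("zstat.nii")
print([step.process.name for step in trail])
```

Separating process definitions from pipeline runs is what allows the same component to be reused in later pipelines while each site still records its own parameters.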
Automatic Federated Study Management Tracking • Diagram labels: HID Database; GridFTP Servers
Screenshot panels: Imaging QC Tool by Subject • Wiki QC Tracking Table • Cardio and Respiratory Tracking • Image QC Tracking
Federated Study Management DTI Portal
FBIRN Open-Source Software • Data Management: • XCEDE XML schema (www.xcede.org): XML schema for describing and documenting research and clinical studies • Human Imaging Database (HID; www.nitrc.org/projects/hid): query interface, workflow pipeline documentation, image download • Clinical Assessment Layout Manager (CALM; www.nitrc.org/projects/hid): graphical web-enabled form builder for data entry • Data Analysis: • FBIRN Image Processing Scripts (FIPS; www.nitrc.org/projects/fips): comprehensive management of large-scale multi-site fMRI projects