MAHI Research Database

MAHI Research Database Project Directions June 7, 2001

Overview of Client Requirements • Quaterlyreports • (STS, ACC) • Data to/from external • organizations • General patient data • Simple queries • Fast & reliable queries • Add new data points • Efficient data storage • Internal security • Extensibility • Maintenance

Concept Diagram of MAHI Architecture(designed by: Kelly Kerns) Management Database (contains metadata) M Generates user interface using DHTML, ASP Patient Primary customer: Physicians Processing, normalization using ATL, COM, C++ MAHI DB Interface (HTML?) SAS Ad-Hoc Query HL7 SPSS Q R A ODBC ODBC T ASCII ASCII HL7 Change Request /Record Editor ACC L Legacy database (no quarantine/data cleansing) Cardiff (optical recognition software) DTS STS Currently manually implemented using SQL scripts Digital form scanner/transmitter (model HP 9100C) Digital Sender DB Explorer user interface Interim Solution

Status: Interim solution in place Angioplasty and Cardiac Cath data available Simple queries available via DB Explorer Limitations: Missing Surgical data (legacy database) Unable to perform quarterly reports No single repository to store common data No simple analytic tools built-in Current Implementation

Why are we using a database? • To expose legacy data • To record/store/maintain/access patient and treatment information • To allow extraction and investigative analysis of above information • To meet clinical organization IS standards • To collect and store data from external entities

Storage and Processing • MAHI’s storage and processing needs are data-centric (as opposed to document-centric), related to dynamic procedures, and patient-oriented.

Storage and Retrieval technologies • Database (relational, OO, hierarchical) and middleware (built-in or 3rd party) • Data warehousing • XML server - Native XML database to manage large numbers of XML documents • XML-enabled web server - Web server that can build XML documents from data in a database

Database features of XML • Storage (XML documents) • Schemas (DTDs, XML Schema) • Query language (XQuery) • Programming interface (SAX, DOM)

Why XML as storage format? (contd.) • System needs to store data with repeating fields • Example: • Caregiver has several patients (some repeat patients) • Location can be used for multiple encounters • Patient may be administered multiple materials/medication (some prescriptions are repeated) • XML provides mechanism to quickly parse and extract data with repeating structure, or to generate new documents. • System needs to store data with arbitrary length • Example: • Text of medical notes • XML can store nested elements of arbitrary depth and length.

Why XML as storage format? (contd.) • System needs to exchange information with other organizations • Example: • External Stroke database, external Women’s Health Clinic database, external HL7 compliant database; currently this formatting has to be done manually because data formats are different. • XML can be used to make formatting easier (similar data items will have similar markup tags; more human readable), and possibly automate some of the work by using DTDs/Schemas as reference.

Why XML as storage format? (contd.) • System needs to store standard queries/analytical results for display on multiple end-user client apps • Example: • Currently, Kelly and biostatisticians have to make specialized query every time it is requested by a user. • Standard queries can be stored as SQL procedures, and query results can be returned in a single XML format document (with its associated DTD) to be transformed into views for multiple end-users (e.g. via XML-enabled web server).

Why XML as storage format? (contd.) • System needs to store metadata • Example: • Medical groups access to information, originating source of data, etc. • Metadata can be easily incorporated into XML documents as attributes in the element tags. • Example: <Caregiver id=“1234” medgroup=“MHI”> <name> … </name> <ssn> … </ssn> <type> … </type> </Caregiver>

Limitations of XML as storage format • Efficient storage? • Security? • Transactions and data integrity? • Multi-user access? • Recovery and fault-tolerance? • Queries across multiple documents? • Others?

An XML extension to existing architecture Input data Q R Current architecture I Adaptable front-end apps for Query/Report Adapter Extender Neutral storage facility Q/R XQ XR XML H Report Harvester (e.g. ACC, STS) Converter Legacy storage Legacy (e.g. Surgical database)

Key tasks (initial thoughts) • Develop storage facility for XML documents • Design schema/DTD for data to be converted and stored • Develop retrieval (via query) facility for searching, indexing, extraction, and analysis of documents • Perform data conversion of legacy database • Integrate into existing architecture

Modified Project Tasks • Re-visit the tables in the Repository data store • Re-engineer • Validate structure of Repository data model • Develop a X-Quarantine using XML (from Legacy & New Data) • Develop a process from X-Quarantine to Repository • Develop a process from Repository to X-Repository • Query tool (for Bio-Statistician) from X-Repository

Other Issues • Ensure X-Repository information is ‘secure’ (i.e. bio-statisticians can ‘trust’.)

MAHI Research Database