Information Lineage: Building Blocks of the Complete Information Production Process (please use presentation mode and animations). Sami Laine, Project Manager, QUALIDAT project, Department of Computer Science and Engineering, sami.k.laine@aalto.fi
Personal background combines technical, social and healthcare perspectives
Which one is better? Where do these numbers come from? What do these figures actually include? What do they really mean in practice? What is missing from them? Are you sure?
Information production process (diagram) • Phases: DATA GENERATION, DATA MANIPULATION, DATA UTILIZATION • Human roles: enters data for primary purpose; builds data sets for secondary use; analyses and reports data; interprets data and makes decisions for secondary purposes • Automated steps: scripts collect manually entered data; application inspects machine-generated event data; algorithms produce analyses for internal use; scripts construct data sets for external use • Systems: Electronic Patient Record, Laboratory System, Medical Imaging System, Analytical Datawarehouse • Information products: Monthly Service Reports, National Hospital Benchmarking
Data flows are currently traced by managerial models or automatic software features • IP Map: manual, managerial system model • Data Lineage: automatic, technical software feature
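Where does the data come from? Data lineage traces backward from an information product (e.g. report, graph) to its data source (e.g. database, file). Impact analysis traces forward from a data source to the information products that depend on it.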
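The two tracing directions can be illustrated with a small directed graph. This is only a sketch: the `LineageGraph` class and its method names are hypothetical, and the flows between the systems are assumed for illustration, since the slides name the systems but not the exact connections between them.

```python
from collections import defaultdict


class LineageGraph:
    """Directed graph; an edge points from an input to what is derived from it."""

    def __init__(self):
        self._downstream = defaultdict(set)
        self._upstream = defaultdict(set)

    def add_flow(self, source, target):
        """Record that `target` is produced from `source`."""
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _reach(start, edges):
        """Collect every node reachable from `start` along `edges`."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def lineage(self, product):
        """Backward trace: everything the product was derived from."""
        return self._reach(product, self._upstream)

    def impact(self, source):
        """Forward trace: everything that depends on the source."""
        return self._reach(source, self._downstream)


# Assumed flows, for illustration only.
graph = LineageGraph()
graph.add_flow("Electronic Patient Record", "Analytical Datawarehouse")
graph.add_flow("Laboratory System", "Analytical Datawarehouse")
graph.add_flow("Analytical Datawarehouse", "Monthly Service Reports")
graph.add_flow("Analytical Datawarehouse", "National Hospital Benchmarking")

print(sorted(graph.lineage("Monthly Service Reports")))
print(sorted(graph.impact("Laboratory System")))
```

Both questions are answered by the same traversal run over opposite edge directions, which is why tools often offer lineage and impact analysis together.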
IP Map, developed by the MIT IQ Program, models the Information Production Process
The same information production process diagram (DATA GENERATION, DATA MANIPULATION, DATA UTILIZATION), annotated with the two tracing approaches • IP Map: managerial system model • Data Lineage: technical software feature
But what about the reality behind the data flows and the resulting figures?
Which one is better? 11:00:00 - 9:30:00 (time - time) or 10:00:00 - 9:30:00 (time - time)? Easy!
Length-of-Stay is the difference between the patient's arrival and departure times. Easy?!
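On paper the calculation is a single subtraction. A minimal sketch using the two example intervals from the previous slide (the function name `length_of_stay` is a hypothetical helper, not part of any system mentioned here):

```python
from datetime import datetime

FMT = "%H:%M:%S"


def length_of_stay(arrival, departure):
    """Return departure minus arrival as a timedelta."""
    if departure < arrival:
        raise ValueError("departure precedes arrival")
    return departure - arrival


arrival = datetime.strptime("09:30:00", FMT)
stay_a = length_of_stay(arrival, datetime.strptime("11:00:00", FMT))
stay_b = length_of_stay(arrival, datetime.strptime("10:00:00", FMT))

print(stay_a)  # 1:30:00
print(stay_b)  # 0:30:00
```

The arithmetic is trivial; the result is only as meaningful as the two timestamps that go into it.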
AUTOMATIC TIMESTAMPS: Length-of-Stay => “length of documentation process” • Timestamp 1 • Timestamp 2 • Distractions leading to too-early timestamps: staff enter default values to rush forward in the electronic patient record system • Distractions leading to too-late timestamps: staff skip documentation and fill in forms later • Too early / IMPOSSIBLE / Too late
MANUALLY ENTERED TIME VALUES: Length-of-Stay => “Length of Something”, which is ambiguous and subjective • Timestamp 1 • Timestamp 2 • Distractions leading to too-early timestamps: staff estimate or remember timings too positively; staff manipulate their clinic's performance indicators positively • Distractions leading to too-late timestamps: staff estimate or remember timings too negatively; staff manipulate their clinic's performance indicators negatively • Too early / IMPOSSIBLE / Too late
TRIGGERS FROM RFID TAGS ATTACHED TO PATIENTS: Length-of-Stay => describes presence in locations, not progress through process phases! • Timestamp 1 • Timestamp 2 • Distractions leading to too-early timestamps: cigarette breaks; visiting the parking lot • Distractions leading to too-late timestamps: patient stays for lunch or dinner; patient visits a friend in • Too early / IMPOSSIBLE / Too late
One has to know EXACTLY where timestamp data VALUES come from! Patient Leaves at 11:00 ≠ 11:00 ≠ 11:00
One has to know EXACTLY what kind of COMPONENT produces timestamp data! Patient Leaves at 11:00 ≠ 11:00 ≠ 11:00
One has to know what happens in LOCAL HUMAN context! Patient Leaves at 11:00 ≠ 11:00 ≠ 11:00
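One way to make "11:00 ≠ 11:00 ≠ 11:00" concrete is to store each time value together with the component that produced it and the local context it was recorded in. The sketch below is illustrative only: the `ObservedTime` class, its field names, and the context strings are all hypothetical.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ObservedTime:
    value: time      # the clock value that ends up in the data set
    component: str   # what kind of component produced the value
    context: str     # local human context (hypothetical examples below)


automatic = ObservedTime(time(11, 0), "automatic timestamp", "form saved in the patient record")
manual = ObservedTime(time(11, 0), "manually entered time", "staff estimate after the visit")
rfid = ObservedTime(time(11, 0), "RFID trigger", "tag left the monitored location")

# The raw clock values are identical...
same_clock_value = automatic.value == manual.value == rfid.value
# ...but the observations are not, once their origin is part of the record.
same_meaning = automatic == manual == rfid

print(same_clock_value)  # True
print(same_meaning)      # False
```

Comparing only `value` reproduces the naive reading of the data, while comparing whole records keeps the origin of each timestamp visible to whoever analyses it.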
One must know where data comes from and which factors affect it. Current management and development practices: • Data modeling is not enough: data models are under the hood, and nobody enters data into a model! • Business process modeling is not enough: processes are worked around or have variations, and work is a situated activity affected by contextual factors! • The user interface is what people use to advance along the process and to enter data into a hidden data model. To understand the actual meaning of a data instance, one needs to understand: • The triangular combination of data models, user interfaces and workflows. • How social, environmental and technical context affects these.
Information Production Process Requires Multi-disciplinary Understanding of Tiny Critical Details! DATA UTILIZATION DATA GENERATION DATA MANIPULATION TDQM – ACTOR - Human Sub-Roles of Data Supplier WORK PRACTICES DATA FLOW MEASUREMENT USER INTERFACE BUSINESS FUNCTION DATA MODEL APPLICATION LOGIC ANALYTICS BUSINESS CASE
Data flows are currently traced by managerial models or automatic software • IP Map: managerial system model • Data Lineage: technical feature
Unfortunately, current traceability methods track only a small part of the information production process! DATA UTILIZATION DATA GENERATION DATA MANIPULATION ? ? WORK PRACTICES DATA FLOW MEASUREMENT USER INTERFACE BUSINESS FUNCTION DATA MODEL APPLICATION LOGIC ANALYTICS BUSINESS CASE
Data flow-tracing methods should be integrated, extended in scope and enriched with semantic details. System models => IP Maps; software support => Data lineage. Integrate the methods and extend the length of “traceability”. DATA UTILIZATION DATA GENERATION DATA MANIPULATION WORK PRACTICES DATA FLOW MEASUREMENT USER INTERFACE BUSINESS FUNCTION DATA MODEL APPLICATION LOGIC ANALYTICS BUSINESS CASE
Questions? Sami Laine, Project Manager, Department of Computer Science and Engineering, sami.k.laine@aalto.fi, http://qualidat.aalto.fi/