
Flexible and efficient retrieval of haemodialysis time series. S. Montani, G. Leonardi, A. Bottrighi, L. Portinale, P. Terenziani. DISIT, Sezione di Informatica, Università del Piemonte Orientale, Alessandria, Italy.


Presentation Transcript


  1. Flexible and efficient retrieval of haemodialysis time series. S. Montani, G. Leonardi, A. Bottrighi, L. Portinale, P. Terenziani. DISIT, Sezione di Informatica, Università del Piemonte Orientale, Alessandria, Italy • Introduction • Data structures • Retrieval process • Experiments • Future work

  2. Introduction: Time Series - evolution of a phenomenon over time, to understand its behavior for future problem solving → TIME SERIES • Medical domain: continuous monitoring, control instruments (e.g., ICU, hemodialysis) • State variables (e.g., diastolic pressure value) vs. trend variables (e.g., increasing, decreasing) • PROBLEM: difficult interpretation and retrieval, e.g. - find similar cases, - find “abstract” cases, - understand results to interactively refine/relax the search → need for automatic support for these tasks

  3. Introduction: Time Series Retrieval (literature) • DIMENSIONALITY REDUCTION • mathematical transforms able to preserve the distance between two time series (or to underestimate it), e.g. Discrete Fourier Transform (DFT) • Complexity (preprocessing, post-processing) • INPUT: a specific time series (case) • Black-box behavior → difficult interpretation, no flexibility, no interactivity • Symbolic approaches to dimensionality reduction (e.g., [Xia, 96], survey [Daw et al., 2001])

  4. Our approach: Time Series Retrieval + Temporal Abstraction (TA) • Original contribution: TA used for dimensionality reduction and flexible retrieval • TA: deriving high-level concepts from time-stamped data (from a point-based to an interval-based representation) • In our proposal: two-level TA: SYMBOL (e.g., increase vs. low_increase) and TIME GRANULARITY (e.g., 1h vs. 20min) • DOMAIN-INDEPENDENT methodology: • General DATA STRUCTURES • CONSTRAINTS on the data structures

  5. DATA STRUCTURES: SYMBOL TAXONOMY [Figure: example symbol taxonomy] • SYMBOL ORDERING naturally emerges from the domain-dependent interpretation (e.g., Ds may abstract slopes from −90 to −45 degrees, thus preceding Dw (slopes from −44 to −10 degrees)) - Domain-independent general constraint: the symbol taxonomy must respect the ordering: ∀x, y, x′, y′: isa(x, x′) ∧ isa(y, y′) ∧ x′ ≠ y′ ∧ x < y → x′ < y′
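The ordering constraint on the taxonomy can be checked mechanically. The following is a minimal sketch, not the authors' implementation: the symbol set is hypothetical (Ds, Dw, S, Iw appear in the slides; Is and the parent symbols D and I are assumed by analogy), and S is assumed to be its own parent.

```python
from itertools import permutations

# Hypothetical two-level taxonomy: leaf symbols and their parents (assumption).
LEAF_ORDER = ["Ds", "Dw", "S", "Iw", "Is"]    # leaf-level ordering
PARENT_ORDER = ["D", "S", "I"]                # parent-level ordering
ISA = {"Ds": "D", "Dw": "D", "S": "S", "Iw": "I", "Is": "I"}


def respects_ordering(isa, leaf_order, parent_order):
    """Check: isa(x, x') and isa(y, y') and x' != y' and x < y  implies  x' < y'."""
    leaf_rank = {s: i for i, s in enumerate(leaf_order)}
    parent_rank = {s: i for i, s in enumerate(parent_order)}
    for x, y in permutations(leaf_order, 2):
        xp, yp = isa[x], isa[y]
        if xp != yp and leaf_rank[x] < leaf_rank[y]:
            if not parent_rank[xp] < parent_rank[yp]:
                return False
    return True


print(respects_ordering(ISA, LEAF_ORDER, PARENT_ORDER))  # True
```

A taxonomy that files Dw under I would fail the check, because Dw precedes S while I does not precede S.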

  6. DATA STRUCTURES: SYMBOL DISTANCE • ANY DISTANCE function is admitted (domain-independent) - However, the DISTANCE function must be CONSISTENT with the SYMBOL ORDERING (if any): ∀x, y, z: x < y < z → distance(x, y) < distance(x, z)
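As an illustration of the consistency requirement, the sketch below tests a candidate distance against a symbol ordering. The rank-difference distance and the symbol list are assumptions chosen for the example; the slides admit any distance that satisfies the constraint.

```python
from itertools import combinations

# Hypothetical ordered symbol set (assumption, following the Ds/Dw/Iw naming).
ORDER = ["Ds", "Dw", "S", "Iw", "Is"]
RANK = {s: i for i, s in enumerate(ORDER)}


def rank_distance(a, b):
    """One admissible distance: difference of positions in the ordering."""
    return abs(RANK[a] - RANK[b])


def is_consistent(distance, order):
    """Check: x < y < z implies distance(x, y) < distance(x, z).

    combinations() keeps the input order, so every triple is already sorted.
    """
    return all(
        distance(x, y) < distance(x, z)
        for x, y, z in combinations(order, 3)
    )


print(is_consistent(rank_distance, ORDER))        # True
print(is_consistent(lambda a, b: 1, ORDER))       # False: constant distance
```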

  7. DATA STRUCTURES: TIME GRANULARITY TAXONOMY • ANY taxonomy of time granularities (to describe the episodes at increasingly abstract levels of temporal aggregation), e.g. 10 min → 30 min → 1 h → 2 h → 4 h • HOMOGENEITY: aggregation must be “homogeneous” at every given level, in the sense that each granule at a given level must be an aggregation of exactly the same number of consecutive granules at the lower level → IMPLICIT information about the DURATION of (sub)episodes • “up” function, to aggregate from each level to the upper one, e.g. up(<I,I,S>, 10 min, 30 min) → <I, 30 min>
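A minimal sketch of homogeneous aggregation follows. The concrete rule used to merge a group of symbols (sign of the summed scores) is an assumption made for illustration, since the slides leave the “up” function domain-dependent; the names `SCORE`, `up_group` and `aggregate` are hypothetical.

```python
# Hypothetical scores for the coarse symbols D (decrease), S (stationary), I (increase).
SCORE = {"D": -1, "S": 0, "I": 1}


def up_group(symbols):
    """Merge one group of consecutive granules into a single symbol (assumed rule)."""
    total = sum(SCORE[s] for s in symbols)
    return "I" if total > 0 else "D" if total < 0 else "S"


def aggregate(series, factor):
    """Move a series one level up; every upper granule covers exactly `factor` lower ones."""
    if len(series) % factor != 0:
        raise ValueError("homogeneity violated: length is not a multiple of factor")
    return [up_group(series[i:i + factor]) for i in range(0, len(series), factor)]


# Three 10-minute granules become one 30-minute granule.
print(aggregate(["I", "I", "S"], 3))        # ['I']
# Four 1-hour granules become two 2-hour granules.
print(aggregate(["S", "I", "I", "I"], 2))   # ['I', 'I']
```

Because the factor is fixed per level, the duration of each aggregated episode is implied by its level and need not be stored.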

  8. DATA STRUCTURES: TIME GRANULARITY TAXONOMY: UP FUNCTION • ANY “up” function (domain-dependent), BUT • CONSTRAINT about PERSISTENCE: ∀x: up(x, x) = x • CONSTRAINTS about ORDERING PRESERVATION: ∀x, y: x < y → x ≤ up(x, y) ≤ y; ∀x, y, z: x < y < z → up(x, y) ≤ up(x, z); ∀x, y, z: x < y < z → up(x, z) ≤ up(y, z)
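The persistence and ordering-preservation constraints can be verified exhaustively for a small symbol set. This is a sketch under assumptions: the binary `up2` rule and the three-symbol ordering D < S < I are hypothetical examples, not the function used by the authors.

```python
from itertools import combinations

ORDER = ["D", "S", "I"]                  # assumed ordering D < S < I
SCORE = {"D": -1, "S": 0, "I": 1}


def up2(x, y):
    """Hypothetical binary "up": sign of the summed scores."""
    total = SCORE[x] + SCORE[y]
    return "I" if total > 0 else "D" if total < 0 else "S"


def satisfies_constraints(up, order):
    rank = {s: i for i, s in enumerate(order)}

    def le(a, b):
        return rank[a] <= rank[b]

    # Persistence: up(x, x) = x
    persistence = all(up(x, x) == x for x in order)
    # x < y  implies  x <= up(x, y) <= y
    bounded = all(
        le(x, up(x, y)) and le(up(x, y), y)
        for x, y in combinations(order, 2)
    )
    # x < y < z  implies  up(x, y) <= up(x, z) <= up(y, z)
    monotone = all(
        le(up(x, y), up(x, z)) and le(up(x, z), up(y, z))
        for x, y, z in combinations(order, 3)
    )
    return persistence and bounded and monotone


print(satisfies_constraints(up2, ORDER))                # True
print(satisfies_constraints(lambda x, y: "I", ORDER))   # False: breaks persistence
```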

  9. DATA STRUCTURES: INDEX OF Time Series (cases) • FOREST of TREEs • First, the TIME GRANULARITY dimension is (partially) expanded • Then, the SYMBOL dimension is (partially) expanded • Each node in the tree addresses all the time series (cases) that are abstracted (“up” function + ISA symbol taxonomy) by the pattern of the node

  10. BUILDING\MAINTAINING THE INDEX

  11. DATA RETRIEVAL • Exploits Temporal Abstraction (“up” function on temporal granularity and ISA on symbol taxonomy) and the INDEX • Supports both • “basic” queries (retrieve time series similar to a given one) • “abstract” queries (retrieve time series similar to (<S,Iw,Iw,Iw>, 1h)) • QUERY PROCESSING HIGHLIGHTS - Abstract on the symbol taxonomy (ISA) - Abstract on the time granularity taxonomy (“up”) - Find the proper (root of the) index tree in the forest - Descend the index tree backward to the lowest possible node - Return the time series (cases) addressed by that node
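The query-processing steps above can be sketched end to end on a toy dataset. This is a simplified illustration, not the authors' index: the forest of trees is flattened into a dictionary from pattern to case identifiers, all cases are assumed to hold four 1-hour symbols, granules are merged pairwise, and the symbol set, the merge rule and the case data are hypothetical.

```python
from collections import defaultdict

# Hypothetical taxonomy and merge scores (assumptions for illustration).
ISA = {"Ds": "D", "Dw": "D", "S": "S", "Iw": "I", "Is": "I"}
SCORE = {"D": -1, "S": 0, "I": 1}


def abstract_symbols(pattern):
    """ISA step: replace each leaf symbol by its parent; coarse symbols pass through."""
    return tuple(ISA.get(s, s) for s in pattern)


def up(pattern, factor=2):
    """"up" step: merge `factor` consecutive granules (assumed sign-of-sum rule)."""
    out = []
    for i in range(0, len(pattern), factor):
        total = sum(SCORE[s] for s in pattern[i:i + factor])
        out.append("I" if total > 0 else "D" if total < 0 else "S")
    return tuple(out)


def abstraction_chain(pattern):
    """Patterns from the most specific one up to the root of its index tree."""
    chain = [tuple(pattern)]
    coarse = abstract_symbols(pattern)
    if coarse != chain[-1]:
        chain.append(coarse)
    while len(coarse) > 1:
        coarse = up(coarse)
        chain.append(coarse)
    return chain


def build_index(cases):
    """Each node (pattern) addresses every case that it abstracts."""
    index = defaultdict(set)
    for case_id, pattern in cases.items():
        for node in abstraction_chain(pattern):
            index[node].add(case_id)
    return index


def retrieve(index, query):
    """Return the lowest indexed node on the query's chain and its cases."""
    for node in abstraction_chain(query):   # most specific first
        if node in index:
            return node, index[node]
    return None, set()


# Toy cases: four 1-hour symbols each (fabricated purely for the example).
CASES = {
    "c1": ("S", "Iw", "Iw", "Iw"),
    "c2": ("S", "Is", "Iw", "Iw"),
    "c3": ("Dw", "Dw", "S", "S"),
}
INDEX = build_index(CASES)

print(abstraction_chain(("S", "Iw", "Iw", "Iw")))
# [('S', 'Iw', 'Iw', 'Iw'), ('S', 'I', 'I', 'I'), ('I', 'I'), ('I',)]
print(retrieve(INDEX, ("S", "Iw", "Iw", "Iw")))   # (('S', 'Iw', 'Iw', 'Iw'), {'c1'})
```

Relaxing the query to `("S", "I", "I", "I")` returns both `c1` and `c2`, which mirrors the interactive relax/refine loop: a coarser pattern addresses a superset of the cases addressed by any of its specializations.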

  12. DATA RETRIEVAL: an example • “Abstract” query: S Iw Iw Iw (1h time granularity level) • Abstraction, symbol taxonomy: S Iw Iw Iw → S I I I • Abstraction, time granularity (“up” function): S I I I → I I → I

  13. DATA RETRIEVAL: an example • Descend the index from the root “I” to search for “S Iw Iw Iw” • ALL the corresponding time series are returned

  14. DATA RETRIEVAL: advantages • FLEXIBLE and UNDERSTANDABLE - “Abstract” query: S Iw Iw Iw (1h time granularity level) • Understandable query, process and output: all time series that can be abstracted as dictated by the “abstract” query are returned • Support for INTERACTIVITY: e.g., depending on the output of the query, the user may • Relax the query, e.g., by asking “S I I I” • Refine the query, e.g., by asking “S S Iw Iw Iw Iw Iw S”

  15. DATA RETRIEVAL: Experimental Results • Dataset of 10388 hemodialysis sessions (i.e. cases), collected at the Vigevano hospital, Italy. • Comparison with RHENE, an approach based on DFT for dimensionality reduction, and on spatial indexing (through TV-trees) to further improve retrieval performance • ADVANTAGES • Efficiency • Flexibility (“abstract” queries vs. specific time series) • Interactivity • Trends vs. state abstractions

  16. DATA RETRIEVAL: Experimental Results

  17. FUTURE WORK • Queries about SUBpatterns • Higher-level queries (e.g., regular expressions)
