Dynamic query tools for time series data sets: Timebox widgets for interactive exploration. Harry Hochheiser, Ben Shneiderman. Presented by Justin Domke.
Motivation • Data that changes over time is common. • Algorithmic and statistical methods are good at answering questions. • How to choose the questions themselves?
Standard time plots are very compelling, but can only display a limited amount of data
Idea: Query the data!
Notation • n_i is an item in a time series data set • n_i(t) is the value of n_i at time t
Three Widgets: (1) Timebox • A timebox is a 4-tuple b = (t_min, t_max, v_min, v_max) • n_i satisfies b if, for all t with t_min ≤ t ≤ t_max, v_min ≤ n_i(t) ≤ v_max
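The timebox definition above can be sketched as a small predicate. This is only an illustration under stated assumptions: time points are taken to be integer indices into a list of values, and the names `Timebox` and `satisfies` are hypothetical, not taken from the presented tool.

```python
from typing import NamedTuple, Sequence


class Timebox(NamedTuple):
    """The 4-tuple b = (t_min, t_max, v_min, v_max) from the slide."""
    t_min: int
    t_max: int
    v_min: float
    v_max: float


def satisfies(series: Sequence[float], box: Timebox) -> bool:
    """Return True if every value at t_min..t_max (inclusive) lies in [v_min, v_max].

    Assumption: series[t] plays the role of n_i(t), with integer t.
    """
    return all(
        box.v_min <= series[t] <= box.v_max
        for t in range(box.t_min, box.t_max + 1)
    )
```

For example, with the series `[1, 5, 6, 7, 20]`, a box covering times 1 to 3 and values 4 to 8 is satisfied, while stretching the box to time 4 excludes the series because of the value 20.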
Three Widgets: (2) Variable Time Timebox • A variable time timebox is a 5-tuple b = (t_min, t_max, v_min, v_max, R) • n_i satisfies b if there exists t_0 with t_min ≤ t_0 ≤ t_max − R such that, for all t with t_0 ≤ t ≤ t_0 + R, v_min ≤ n_i(t) ≤ v_max
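A direct reading of the variable time timebox definition is a sliding window of length R that may start anywhere between t_min and t_max − R. The sketch below assumes integer time indices and uses a hypothetical function name; it tries every start position and is not meant as an efficient implementation.

```python
from typing import Sequence


def satisfies_variable(series: Sequence[float], t_min: int, t_max: int,
                       v_min: float, v_max: float, r: int) -> bool:
    """Return True if some window [t0, t0 + r] inside [t_min, t_max] stays in range.

    Assumption: series[t] plays the role of n_i(t), with integer t.
    """
    for t0 in range(t_min, t_max - r + 1):  # t_min <= t0 <= t_max - r
        if all(v_min <= series[t] <= v_max for t in range(t0, t0 + r + 1)):
            return True
    return False
```

With the series `[0, 9, 5, 6, 7, 9, 0]`, values 4 to 8, and the full time range 0 to 6, the query succeeds for R = 2 (the window at times 2 to 4) and fails for R = 3.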
Three Widgets: (3) Angular Query Widget • An angular query widget is a 4-tuple b = (t_min, t_max, θ_min, θ_max) • n_i satisfies b if, for all t with t_min ≤ t < t_max, θ_min ≤ φ(n_i(t), n_i(t+1)) ≤ θ_max • Here φ is the angle that the segment between consecutive points forms on the graph.
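The angular query can be sketched in the same way, with two loud assumptions: φ is taken to be the angle of the line segment joining consecutive points, and one time step is treated as one unit on the horizontal axis, so the angle is measured in data units rather than screen pixels. The function name is hypothetical.

```python
import math
from typing import Sequence


def satisfies_angular(series: Sequence[float], t_min: int, t_max: int,
                      theta_min: float, theta_max: float) -> bool:
    """Return True if every segment between t_min and t_max has an angle
    (in degrees) within [theta_min, theta_max].

    Assumptions: integer time indices, unit spacing between time points.
    """
    for t in range(t_min, t_max):
        rise = series[t + 1] - series[t]
        angle = math.degrees(math.atan2(rise, 1.0))  # run of one time step
        if not theta_min <= angle <= theta_max:
            return False
    return True
```

A real display would probably need to scale the rise and run by the axis scales before taking the angle, since the angle seen on screen depends on the aspect ratio of the plot.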
Demonstration • Standard Timeboxes • Drag From Display Window • Manipulate multiple boxes • Coupling of windows • Variable Time Timeboxes • Angular Queries • Query Inversion • Query Multiple Variables • Leaders and Laggards
Performance • Over 75% of time is spent on query evaluation. • Naïve approach: • For each item in the set, examine every point in each timebox. • Easy improvement: • Throw an item out if it fails any query.
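The naive approach with the easy improvement can be sketched as follows. It assumes a query is a conjunction of plain timeboxes, that the data set is a dictionary from item name to a list of values, and that time points are integer indices; `run_query` is a hypothetical name.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[int, int, float, float]  # (t_min, t_max, v_min, v_max)


def run_query(dataset: Dict[str, Sequence[float]], boxes: List[Box]) -> List[str]:
    """Sequential evaluation: return the names of items that satisfy every box."""
    matches = []
    for name, series in dataset.items():
        keep = True
        for t_min, t_max, v_min, v_max in boxes:
            # Examine the points covered by this timebox; any() stops at
            # the first point that falls outside the value range.
            if any(not (v_min <= series[t] <= v_max)
                   for t in range(t_min, t_max + 1)):
                keep = False
                break  # throw the item out without checking remaining boxes
        if keep:
            matches.append(name)
    return matches
```

The `break` is the improvement mentioned above: once an item fails one timebox, the remaining timeboxes are never examined for that item.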
Performance (2) – Alternatives • Suppose the data has n time series, each with m time points. • Think of this as mn points in 2-d space. • Use geometric methods to find the points that fall in each query range. • For each point found, increment a counter for the series it belongs to. If a series' count equals the number of time points the timebox covers, the series satisfies the query. • Use an orthogonal range tree or a grid approach with buckets.
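The counting idea can be illustrated with a very simple grid. This is a sketch under assumptions, not the structure used in the presented experiments: buckets are formed by dividing values by a fixed `bucket_size` (the slides only say "X buckets"), time points are integer indices, and all names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Grid = Dict[Tuple[int, int], List[Tuple[float, int]]]


def build_grid(dataset: Sequence[Sequence[float]], bucket_size: float) -> Grid:
    """Place each of the m*n points in a cell keyed by (time, value bucket)."""
    grid: Grid = defaultdict(list)
    for sid, series in enumerate(dataset):
        for t, v in enumerate(series):
            grid[(t, int(v // bucket_size))].append((v, sid))
    return grid


def query_grid(grid: Grid, bucket_size: float, t_min: int, t_max: int,
               v_min: float, v_max: float) -> List[int]:
    """Return ids of series that have a point inside the box at every time step."""
    counts: Dict[int, int] = defaultdict(int)
    lo, hi = int(v_min // bucket_size), int(v_max // bucket_size)
    for t in range(t_min, t_max + 1):
        for b in range(lo, hi + 1):           # only cells overlapping the range
            for v, sid in grid.get((t, b), ()):
                if v_min <= v <= v_max:        # edge cells need an exact check
                    counts[sid] += 1
    needed = t_max - t_min + 1                 # one hit per covered time point
    return sorted(sid for sid, c in counts.items() if c == needed)
```

Because each series has exactly one value per time point, a count equal to the number of covered time points means no point of that series left the value range, which is the "sum is right" test described above.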
Performance (3) • [Figure: average query completion time vs. number of items for random data (100 time points)] • Legend: Seq – sequential; Orth – orthogonal range tree; Grid-X – grid approach with X buckets
Performance (4) • [Figure: average query completion time vs. number of time points for random data (100 items)] • Legend: Seq – sequential; Orth – orthogonal range tree; Grid-X – grid approach with X buckets
Design Studies • 24 Computer Science students completed various tasks using different but semantically equivalent input mechanisms: • Timebox queries • Fill-in • Range sliders
Design Study 1 • Fully specified tasks. (“During days 22-23, are there more stocks between 69-119, 59-109, or 49-99?”) • Form fill-in was fastest. • Range sliders were second. • Timeboxes were last.
Design Study 2 • More open-ended tasks. • Compare: • Timeboxes with graphical output • Forms with graphical output • Forms with tabular output • No statistically significant difference. (Were the users already familiar with timeboxes?)
Comments • Problems with user interface? • Why “timesearcher”, instead of “parallelcoordinatesearcher”? • In the performance experiment, what did the data look like? • In the design study, were the users already familiar with Timesearcher?