Data Mining and Structure Retrieval. Presented by Abdullah Mueen, The Keogh Lab.
Overview of our work
• Our Goal: Extract information from raw, noisy, massive, unstructured data.
• We develop algorithms for
  • Classification
  • Clustering
  • Rule finding
  • Motif discovery
  • Discord discovery
  • Shapelet discovery
  • Linkage discovery
• We work closely with the domain experts.
  • For collecting new data.
  • To verify our results.
Case 1: Motif Discovery
[Figure: recording setup for the Beet Leafhopper (Circulifer tenellus). Labels: voltage source (V), input resistor, conductive glue, to insect, to soil near plant, voltage reading, plant membrane, stylet. Plot with axis ticks 0 to 20 and 0 to 200; caption: MK motif discovery.]
Exact Discovery of Time Series Motifs. Abdullah Mueen, Eamonn Keogh, Qiang Zhu, Sydney Cash, Brandon Westover. SDM 2009.
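To make the idea concrete, here is a minimal sketch of motif discovery as a closest-pair search over subsequences. It is a brute-force baseline, not the MK algorithm from the cited paper. The function names, the z-normalization step, and the rule that excludes overlapping windows are assumptions made for illustration.

```python
import math

def znorm(x):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in x]

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def brute_force_motif(series, m):
    """Return (distance, i, j) for the closest pair of non-overlapping
    length-m subsequences (hypothetical helper, quadratic in len(series))."""
    subs = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    best = (math.inf, -1, -1)
    for i in range(len(subs)):
        # j starts at i + m so the two windows never overlap (no trivial matches)
        for j in range(i + m, len(subs)):
            d = euclidean(subs[i], subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best

# Toy series with the pattern [1, 5, 2, 8] planted at offsets 2 and 10.
series = [3, 7, 1, 5, 2, 8, 4, 9, 6, 0, 1, 5, 2, 8, 7, 3]
distance, first, second = brute_force_motif(series, 4)
print(first, second, distance)
```

The quadratic scan is only practical for short series; it is included to show what "the motif" means, not how to find it efficiently.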
Case 2: Shapelet Discovery
[Figure: stinging nettles vs. false nettles, with the shapelet marked.]
Time Series Shapelets: A New Primitive for Data Mining. Lexiang Ye and Eamonn Keogh. SIGKDD 2009.
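As a rough illustration of how a shapelet can be used once it has been found, the sketch below measures how closely a short pattern matches anywhere in a series and classifies by a distance threshold. The function names, the class labels, the threshold value, and the use of plain Euclidean distance are assumptions for illustration; the search for the best shapelet itself is not shown.

```python
import math

def subsequence_dist(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    same-length window of the series."""
    m = len(shapelet)
    return min(
        math.sqrt(sum((series[i + k] - shapelet[k]) ** 2 for k in range(m)))
        for i in range(len(series) - m + 1)
    )

def classify(series, shapelet, threshold, near_label, far_label):
    """Assign near_label if the shapelet occurs (within threshold), else far_label."""
    if subsequence_dist(series, shapelet) <= threshold:
        return near_label
    return far_label

# Hypothetical toy data: the shapelet is a sharp spike.
shapelet = [0, 4, 0]
with_spike = [1, 1, 0, 4, 0, 1]
without_spike = [1, 1, 1, 2, 1, 1]

print(classify(with_spike, shapelet, 1.0, "class A", "class B"))
print(classify(without_spike, shapelet, 1.0, "class A", "class B"))
```

The appeal of this design is interpretability: the decision rests on one local pattern rather than on the whole series.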
Case 3: Linkage Discovery
[Figure: ornaments and text from a hand-press book; a character matrix compared with the CK-1 distance measure; a single linkage dendrogram (axis ticks 0.6 to 0.9) labeled Print House 1 and Print House 2; CK-1 distances shown: 0.9033 and 0.6291.]
A Compression Based Distance Measure for Texture. Bilson Campana and Eamonn Keogh. SDM 2010.
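The single linkage step named on this slide can be sketched as follows. The code takes a precomputed pairwise distance matrix and repeatedly merges the two closest clusters. It does not implement the CK-1 measure; the distance values, the item names, and the function name are hypothetical placeholders.

```python
def single_linkage(dist, labels):
    """Agglomerative single-linkage clustering on a symmetric distance matrix.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples."""
    index = {name: k for k, name in enumerate(labels)}
    clusters = [(name,) for name in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: cluster distance is the smallest member-to-member distance
                d = min(dist[index[p]][index[q]]
                        for p in clusters[a] for q in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges

# Hypothetical distances between four items (not values from the slide).
labels = ["p1", "p2", "p3", "p4"]
dist = [
    [0.0, 0.2, 0.8, 0.9],
    [0.2, 0.0, 0.7, 0.85],
    [0.8, 0.7, 0.0, 0.3],
    [0.9, 0.85, 0.3, 0.0],
]
for left, right, d in single_linkage(dist, labels):
    print(left, right, d)
```

The merge history is the information a dendrogram draws: each merge becomes a join at the height given by its distance.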
Lab Members: Dr. Eamonn Keogh, Dr. Gustavo Batista, Abdullah Mueen, Qiang Zhu, Bilson Campana, Thanawin Art R., Bing Hu, Yuan Hao, Jesin Zakaria
Motif in Online Data • Maintain the motif in streaming data without introducing latency.
Motion Motif • Find repeated motions in motion capture data, which is a 32-dimensional time series.