ShiftTree: an Interpretable Model-Based Approach for Time Series Classification

Balázs Hidasi - hidasi@tmit.bme.hu, Csaba Gáspár-Papanek - gaspar@tmit.bme.hu. Budapest University of Technology and Economics, Hungary. ECML/PKDD, 6th September 2011, Athens.

Presentation Transcript


  1. ShiftTree: an Interpretable Model-Based Approach for Time Series Classification • Balázs Hidasi - hidasi@tmit.bme.hu • Csaba Gáspár-Papanek - gaspar@tmit.bme.hu • Budapest University of Technology and Economics, Hungary • ECML/PKDD, 6th September 2011, Athens

  2. Outline • General concept • Task & adaptation • Labeling & learning • Evaluation • Forest methods • Conclusion

  3. General Concept • Not limited to time series • Time series • Semi-structured data • Graphs • Two questions: • "Where to look at?" • "What to observe?" • Operator families • EyeShifter Operator(s) (ESO) • ConditionBuilder Operators (CBO) • Dynamic attributes • Model: sequence of simple operators / rules (figure label: F(x))

  4. Time Series Classification • Time series • With one observed variable • Evenly sampled • The algorithm has no such limitations • Standard classification task • Focusing on accuracy • Goals • Satisfying accuracy • Model-based instead of memory-based • Interpretable models (diagram labels: Learning, Model, Labeling)

  5. Explaining the goals • Why model-based? • Labeling is much faster • Can learn more general properties of the classes • Needs more training samples • What is interpretability good for? • Gaining user trust • Helping to understand the data

  6. Concept Applied to Time Series • Dynamic attributes of time series • EyeShifter Operator (ESO): "Where to look at?" • Moves a cursor along the time axis • To a specified point of the series • E.g.: "To the next local minimum", "100 steps ahead", etc. • Note: the result of an operator sequence depends on the order • ConditionBuilder Operator (CBO): "What to observe?" • Computes the dynamic attribute • E.g.: "Value at the given time", "Length of the jump", etc. • Decision tree as model • Works well with the dynamic attribute concept • Operator sequence from root to leaf (figure labels: F(x), Weighted average)
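The two operator families on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the clamping at the end of the series, and the "stay in place if no local maximum is found" rule are hypothetical and not taken from the talk. The example at the end shows the order dependence noted above.

```python
from typing import Callable, List

Series = List[float]

# ESO: "Where to look at?" Maps (series, cursor) to a new cursor position.
def eso_to_global_max(series: Series, cursor: int) -> int:
    """Move the cursor to the position of the global maximum."""
    return max(range(len(series)), key=lambda i: series[i])

def eso_step_forward(steps: int) -> Callable[[Series, int], int]:
    """Build an ESO that moves the cursor `steps` ahead (clamped: an assumption)."""
    def move(series: Series, cursor: int) -> int:
        return min(cursor + steps, len(series) - 1)
    return move

def eso_to_next_local_max(series: Series, cursor: int) -> int:
    """Move to the next local maximum after the cursor, if there is one."""
    for i in range(cursor + 1, len(series) - 1):
        if series[i - 1] < series[i] >= series[i + 1]:
            return i
    return cursor  # assumption: the cursor stays where it is

# CBO: "What to observe?" Maps (series, cursor) to a dynamic attribute value.
def cbo_value_at_time(series: Series, cursor: int) -> float:
    return series[cursor]

def apply_sequence(series: Series, esos, start: int = 0) -> int:
    """Apply ESOs one after another and return the final cursor position."""
    cursor = start
    for eso in esos:
        cursor = eso(series, cursor)
    return cursor

series = [0.0, 1.0, 0.5, 3.0, 2.0, 2.5, 1.0]
a = apply_sequence(series, [eso_to_global_max, eso_step_forward(2)])
b = apply_sequence(series, [eso_step_forward(2), eso_to_global_max])
print(a, b, cbo_value_at_time(series, a))
```

The same two ESOs applied in a different order leave the cursor at different positions, so the dynamic attribute computed afterwards differs as well.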

  7. Labeling • Level 0 (root): ESO: To global max; CBO: Value at time; Threshold: 1,20775 • Level 1 (first node): ESO: Step forward 100; CBO: Value at time; Threshold: 0,390093 • Level 1 (second node): ESO: To next local max; CBO: Value at time; Threshold: -0,700739 • Level 2: four leaves • Dynamic attribute values shown for the example series: 0,767139 and 0,836848
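Labeling with such a tree amounts to walking from the root to a leaf while moving the cursor. The sketch below is a rough illustration, not the authors' implementation: the `Node` class, the branch direction (values below the threshold go to `low`), and the leaf labels "A" to "D" are assumptions. The operators and thresholds are the ones listed on the slide, written with decimal points.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

def to_global_max(series: List[float], cursor: int) -> int:
    return max(range(len(series)), key=lambda i: series[i])

def step_forward_100(series: List[float], cursor: int) -> int:
    return min(cursor + 100, len(series) - 1)  # clamping is an assumption

def to_next_local_max(series: List[float], cursor: int) -> int:
    for i in range(cursor + 1, len(series) - 1):
        if series[i - 1] < series[i] >= series[i + 1]:
            return i
    return cursor  # assumption: stay in place if none is found

def value_at_time(series: List[float], cursor: int) -> float:
    return series[cursor]

@dataclass
class Node:
    eso: Optional[Callable] = None
    cbo: Optional[Callable] = None
    threshold: float = 0.0
    low: Optional["Node"] = None    # taken when attribute < threshold (assumed)
    high: Optional["Node"] = None   # taken otherwise
    label: Optional[str] = None     # set only on leaves

def label_series(root: Node, series: List[float], start: int = 0) -> str:
    """Walk the tree: move the cursor, compute the attribute, branch."""
    node, cursor = root, start
    while node.label is None:
        cursor = node.eso(series, cursor)
        value = node.cbo(series, cursor)
        node = node.low if value < node.threshold else node.high
    return node.label

# Tree shaped like the slide; leaf labels are hypothetical placeholders.
tree = Node(to_global_max, value_at_time, 1.20775,
            low=Node(step_forward_100, value_at_time, 0.390093,
                     low=Node(label="A"), high=Node(label="B")),
            high=Node(to_next_local_max, value_at_time, -0.700739,
                      low=Node(label="C"), high=Node(label="D")))

print(label_series(tree, [0.0, 0.5, 1.0, 0.5, 0.0]))
print(label_series(tree, [0.0, 0.2, 2.0, 0.1, -1.0, 0.5, 0.0]))
```

Because the cursor is carried from node to node, the operator sequence along the path determines which part of the series each test looks at.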

  8. Learning (1/3) • Operator pool definition: ESOs, CBOs • Root node • Cursor starts at the beginning • May be overridden

  9. Learning (2/3) • Available dynamic attributes (CBO pool) • At/around reachable positions • Selecting a… • Attribute • Threshold value • Minimize weighted child node entropy
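For a single dynamic attribute, choosing the threshold that minimizes the weighted child node entropy could look like the following sketch. The midpoint thresholds and the function names are assumptions for illustration; the slide only states the criterion. The values are sorted once, which matches the usual O(N log N) cost of such a search.

```python
import math
from collections import Counter

def entropy(counts: Counter, total: int) -> float:
    """Shannon entropy (base 2) of a class-count table."""
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)

def best_threshold(values, labels):
    """Return (threshold, score) minimizing weighted child entropy."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    left = Counter()
    right = Counter(label for _, label in pairs)
    best_score, best_thr = math.inf, None
    for i in range(n - 1):
        value, label = pairs[i]
        left[label] += 1       # move one sample from the right to the left child
        right[label] -= 1
        next_value = pairs[i + 1][0]
        if next_value == value:
            continue           # no threshold fits between equal values
        n_left, n_right = i + 1, n - (i + 1)
        score = (n_left / n) * entropy(left, n_left) \
              + (n_right / n) * entropy(right, n_right)
        if score < best_score:
            # Midpoint between neighbouring values: an assumed convention.
            best_score, best_thr = score, (value + next_value) / 2
    return best_thr, best_score

thr, score = best_threshold([0.1, 0.2, 0.8, 0.9], ["a", "a", "b", "b"])
print(thr, score)
```

Running the same search for every available attribute and keeping the lowest score selects both the attribute and its threshold.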

  10. Learning (3/3) • Best split selected • Cursor is moved by ESO • Reachable positions will change • Train set split • Child nodes continue the learning process • Stop: homogeneous nodes
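The three learning slides together describe a recursive procedure, which might be sketched as below. This is a simplified illustration under assumptions: trees are plain dictionaries, thresholds are midpoints between distinct attribute values, the search is exhaustive rather than optimized, and all names are hypothetical.

```python
import math
from collections import Counter

def weighted_entropy(groups):
    """Entropy of each label group, weighted by group size."""
    total = sum(len(g) for g in groups)
    score = 0.0
    for g in groups:
        if g:
            h = -sum((c / len(g)) * math.log2(c / len(g))
                     for c in Counter(g).values())
            score += len(g) / total * h
    return score

def best_split(series_list, labels, cursors, esos, cbos):
    """Try every ESO/CBO pair and threshold; keep the lowest-entropy split."""
    best = None
    for eso_name, eso in esos.items():
        moved = [eso(s, c) for s, c in zip(series_list, cursors)]
        for cbo_name, cbo in cbos.items():
            values = [cbo(s, c) for s, c in zip(series_list, moved)]
            distinct = sorted(set(values))
            for lo, hi in zip(distinct, distinct[1:]):
                thr = (lo + hi) / 2
                low = [l for v, l in zip(values, labels) if v < thr]
                high = [l for v, l in zip(values, labels) if v >= thr]
                score = weighted_entropy([low, high])
                if best is None or score < best[0]:
                    best = (score, eso_name, cbo_name, thr, moved, values)
    return best

def learn(series_list, labels, cursors, esos, cbos):
    if len(set(labels)) == 1:                 # stop: homogeneous node
        return {"label": labels[0]}
    split = best_split(series_list, labels, cursors, esos, cbos)
    if split is None:                         # nothing separates the samples
        return {"label": Counter(labels).most_common(1)[0][0]}
    _, eso_name, cbo_name, thr, moved, values = split
    node = {"eso": eso_name, "cbo": cbo_name, "threshold": thr}
    for key, keep in (("low", lambda v: v < thr), ("high", lambda v: v >= thr)):
        idx = [i for i, v in enumerate(values) if keep(v)]
        # Children continue from the moved cursor positions.
        node[key] = learn([series_list[i] for i in idx],
                          [labels[i] for i in idx],
                          [moved[i] for i in idx], esos, cbos)
    return node

esos = {"to_global_max": lambda s, c: max(range(len(s)), key=lambda i: s[i]),
        "step_forward_1": lambda s, c: min(c + 1, len(s) - 1)}
cbos = {"value_at_time": lambda s, c: s[c]}
train = [[0, 3, 0], [0, 3.2, 0], [0, 1, 0], [0, 1.1, 0]]
tree = learn(train, ["high", "high", "low", "low"], [0, 0, 0, 0], esos, cbos)
print(tree)
```

Each child receives the cursor positions produced by the chosen ESO, so the set of reachable positions changes from level to level, as the slide notes.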

  11. Interpretability • Depends on the operators • Helps understanding the data • E.g.: CBF dataset • Z-normalized, 3 classes: Cylinder, Bell, Funnel • Distinguish cylinder from bell and funnel by global maxima • The data is z-normalized (standardized) • Distinguish bell from funnel by stepping back 25 steps + noise filtering through weighted average • On which side is the peak

  12. Performance of the algorithm • Databases • UCR: mostly smaller datasets (20) • Ford: larger datasets (2) • No optimizations in the experiments • Performance • Better on larger datasets (model-based)

  13. Advantages and drawbacks • Advantages • Advantages of being model-based (fast labeling, etc.) • Interpretable (depends on operators) • Can be used independently of application domain • Expert's knowledge can be built in through operators • Disadvantages • Optimization of the operators is not trivial • Learning may take a long time when too many operators are used • Disadvantages of the model-based methods (larger training set required, etc.)

  14. ShiftForest: combining multiple models • Combining models enhances accuracy • Boosting • Some issues on smaller datasets • The "XV" method • Splitting training set (random splits) • Building model on the first part, evaluating on the second • Model weight: the measured accuracy
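The "XV" method described on this slide can be sketched as a weighted-voting ensemble. The code below is an assumption-laden illustration: the number of models, the 70/30 split, the weighted-vote combination, and the stand-in learner `fit_peak_stump` (which is not ShiftTree) are all hypothetical choices made to keep the example short and runnable.

```python
import random
from collections import defaultdict

def build_xv_forest(samples, labels, fit, n_models=5, train_fraction=0.7, seed=0):
    """Random splits: build on the first part, weight by accuracy on the second."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    cut = int(len(indices) * train_fraction)
    if not 0 < cut < len(indices):
        raise ValueError("both parts of the split must be non-empty")
    forest = []
    for _ in range(n_models):
        rng.shuffle(indices)
        train_idx, eval_idx = indices[:cut], indices[cut:]
        model = fit([samples[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        hits = sum(model(samples[i]) == labels[i] for i in eval_idx)
        forest.append((model, hits / len(eval_idx)))  # weight = measured accuracy
    return forest

def predict(forest, sample):
    """Weighted vote over all models (the combination rule is an assumption)."""
    votes = defaultdict(float)
    for model, weight in forest:
        votes[model(sample)] += weight
    return max(votes, key=votes.get)

def fit_peak_stump(samples, labels):
    """Stand-in learner: nearest class mean of the series maximum."""
    peaks = defaultdict(list)
    for s, l in zip(samples, labels):
        peaks[l].append(max(s))
    means = {l: sum(p) / len(p) for l, p in peaks.items()}
    return lambda s: min(means, key=lambda l: abs(means[l] - max(s)))

samples = [[0, 3.0, 0], [0, 3.1, 0], [0, 2.9, 0], [0, 3.2, 0], [0, 3.3, 0],
           [0, 1.0, 0], [0, 1.1, 0], [0, 0.9, 0], [0, 1.2, 0], [0, 0.8, 0]]
labels = ["high"] * 5 + ["low"] * 5
forest = build_xv_forest(samples, labels, fit_peak_stump)
print(predict(forest, [0, 2.8, 0]), predict(forest, [0, 1.3, 0]))
```

Any learner that returns a callable model could be passed as `fit`; a ShiftTree learner would take the place of the stand-in here.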

  15. ShiftForest results • Accuracy increases • Interpretability is often lost • The intersection of the trees may still be interpretable

  16. Conclusion • Novel model-based algorithm • Uses cursors and operators • Cursor movement • Attribute computation • Sequence of simple operators as the model • Decision tree • Base of a whole algorithm family

  17. Thank you for your attention! • Balázs Hidasi (hidasi@tmit.bme.hu) • Csaba Gáspár-Papanek (gaspar@tmit.bme.hu) • For more ShiftTree related research visit my site: http://www.hidasi.eu

  18. Time of model building, scalability • O(N*log(N)) per node (ordering) • Time of model building • Depends on tree structure • Acceptable • Scales linearly • Experiments

  19. Learning the model (animated walkthrough; only the labels are recoverable from the transcript) • EyeShifter cursor positions: ESONextMax, ESONext100, ESOMax • ConditionBuilder ordering: CBOSimple • Dynamic attribute selection • Tracked at each step: best split in node so far (ESO, CBO, threshold), best threshold by current dynamic attribute, score of the best split by current dynamic attribute, best score in the node so far, best ordering
