
A conceptual Data Model for Trajectory Data Mining

Introducing a conceptual framework supporting data mining in trajectory database modeling, addressing challenges in preprocessing and transforming data for trajectory analysis.


Presentation Transcript


  1. Universidade Federal de Santa Catarina, Florianopolis, Brazil Informatics and Statistics Department A conceptual Data Model for Trajectory Data Mining * Prof. Vania Bogorny (INE/UFSC - Brazil) vania@inf.ufsc.br Prof. Carlos Alberto Heuser (II/UFRGS - Brazil) Prof. Luis Otavio Alvares (II/UFRGS-Brazil) GIScience 2010 – A conceptual data model for trajectory data mining Vania Bogorny, Universidade Federal de Santa Catarina, Brazil, www.inf.ufsc.br/~vania

  2. Outline • Motivation • Objective • Basic concepts • Proposed Model • Evaluation • Conclusion

  3. Introduction and Motivation
     On the one side (database technology): since its origin, database design has aimed at modeling data for operational purposes only. Database designers do not think about data mining during conceptual database design.

  4. Introduction and Motivation
     On the other side (artificial intelligence): data mining (DM), or knowledge discovery from databases (KDD), has become very popular in recent years in many fields and application domains. Dozens of new data mining algorithms have been proposed in the last decade, but very little has been done for automatic data preprocessing, which is the most time-consuming step.

  5. Introduction and Motivation
     Database modelling (normalization) versus data mining (denormalization into one single file)

  6. Introduction and Motivation
     • Another problem for data mining: data have to be preprocessed and transformed into different granularities
     • Example: Louvre Museum → Museum → TouristicPlace (slide labels: type, instance + type)

  7. Introduction and Motivation • These problems increase when dealing with trajectories of moving objects, which are the focus of this paper

  8. Objective We propose a conceptual framework for trajectory database modeling that supports data mining

  9. Basic Concepts

  10. Trajectory Data
      • Trajectories are a new kind of spatio-temporal data
      • Trajectories have attracted intensive research in both the database and data mining communities

  11. Trajectory Raw Data
      • Trajectory data are spatio-temporal data, represented by a set of points located in space and time
      • Form: (tid, x, y, t), where tid is the trajectory identifier and (x, y) is the spatial location at time t

      Tid  x          y         t
      1    48.890018  2.246100  08:25
      1    48.890018  2.246100  08:26
      ...  ...        ...       ...
      1    48.890020  2.246102  08:40
      1    48.888880  2.248208  08:41
      1    48.885732  2.255031  08:42
      ...  ...        ...       ...
      1    48.858434  2.336105  09:04
      1    48.853611  2.349190  09:05
      ...  ...        ...       ...
      2    ...        ...       ...
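The (tid, x, y, t) form above can be loaded with a few lines of code. The following is a minimal sketch, assuming whitespace-separated rows and zero-padded HH:MM times; the names `Point` and `parse_points` are illustrative and do not come from the presentation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    tid: int   # trajectory identifier
    x: float   # spatial coordinate x
    y: float   # spatial coordinate y
    t: str     # time as zero-padded "HH:MM" (assumption)


def parse_points(rows):
    """Group 'tid x y t' rows by trajectory id and order each group by time."""
    trajectories = defaultdict(list)
    for row in rows:
        tid, x, y, t = row.split()
        trajectories[int(tid)].append(Point(int(tid), float(x), float(y), t))
    for points in trajectories.values():
        # Zero-padded HH:MM strings sort correctly as plain text.
        points.sort(key=lambda p: p.t)
    return dict(trajectories)


rows = [
    "1 48.888880 2.248208 08:41",
    "1 48.890018 2.246100 08:25",
    "1 48.890018 2.246100 08:26",
]
trajs = parse_points(rows)
print(len(trajs[1]), trajs[1][0].t)
```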

  12. The Model of Stops and Moves (Spaccapietra 2008)
      STOPS
      • Important parts of trajectories
      • Where the moving object has stayed for a minimal amount of time
      • Stops are application dependent
        • Tourism application: hotels, touristic places, airport, …
        • Traffic management application: traffic lights, roundabouts, big events, …
      MOVES
      • Are the parts that are not stops

  13. Semantic Trajectories
      • A semantic trajectory is a set of stops and moves
      • Stops are characterized by a place, a start time and an end time
      • Moves are characterized by two consecutive stops
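The definition above, where each move is delimited by two consecutive stops, can be sketched as a small data structure. This is only an illustration under the assumption that times are zero-padded HH:MM strings; the class and function names (`Stop`, `Move`, `moves_between`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stop:
    place: str
    start: str  # start time, "HH:MM" (assumption)
    end: str    # end time, "HH:MM" (assumption)


@dataclass(frozen=True)
class Move:
    origin: Stop
    destination: Stop


def moves_between(stops):
    """Derive one move for every pair of consecutive stops."""
    ordered = sorted(stops, key=lambda s: s.start)
    return [Move(a, b) for a, b in zip(ordered, ordered[1:])]


stops = [
    Stop("IbisHotel", "10:00", "11:00"),
    Stop("Airport", "08:00", "09:00"),
    Stop("LouvreMuseum", "14:00", "16:00"),
]
moves = moves_between(stops)
for m in moves:
    print(f"{m.origin.place}-{m.destination.place}")
```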

  14. STOPS at Multiple Granularities
      Example: stop at Ibis Hotel from 6:04PM to 7:42PM, September 16, 2010
      • Space: IbisHotel or Hotel or Accommodation
      • Time: Afternoon or Thursday or 6:00PM – 8:00PM or RUSH-HOUR
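The Ibis Hotel example can be reproduced with two small generalization helpers. This is a hedged sketch: the hierarchy dictionary, the interval list and the function names `space_g` and `time_g` are assumptions chosen to echo the spaceG/timeG idea, not the authors' implementation.

```python
from datetime import datetime, time

# Assumed concept hierarchy: child -> parent.
PARENT = {"IbisHotel": "Hotel", "Hotel": "Accommodation"}
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def space_g(name, levels_up):
    """Climb the hierarchy the given number of levels (stops at the root)."""
    for _ in range(levels_up):
        name = PARENT.get(name, name)
    return name


def time_g(moment, intervals):
    """Return the label of the first interval [start, end) containing the time."""
    for start, end, label in intervals:
        if start <= moment.time() < end:
            return label
    return None


stop_start = datetime(2010, 9, 16, 18, 4)
intervals = [(time(18, 0), time(20, 0), "RUSH-HOUR")]

print(space_g("IbisHotel", 1))        # one level up in the hierarchy
print(DAYS[stop_start.weekday()])     # day-of-week granularity
print(time_g(stop_start, intervals))  # user-defined interval granularity
```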

  15. ITEMS - the building blocks for semantic pattern discovery
      • An item is generated either from a stop or a move
      • An item is a set of complex information (space + time) that can be defined in many formats/types and at different granularities

  16. Building an ITEM for Data Mining
      Formats/types for an item:
      • NameOnly: the name of the stop/move
        • STOPS: name of the spatial feature instance, e.g. IbisHotel
        • MOVES: names of the two stops that define the move, e.g. ZurichAirport-IbisHotel
      • NameStart: the name of the stop/move + start time
        • IbisHotel [morning] --stop
        • LouvreMuseum [weekend] --stop
        • IbisHotel-ZurichAirport [10:00AM-11:00AM] --move

  17. Building an ITEM for Data Mining
      • NameEnd: name of a stop/move + end time
        • IbisHotel [morning] --stop
        • IbisHotel-ZurichAirport [10:00AM-11:00AM] --move
      • NameStartEnd: name of a stop/move + start time + end time
        • IbisHotel [08:00AM-11:00AM][1:00pm-6:00pm] --stop
        • LouvreMuseum [morning][afternoon] --stop
        • ZurichAirport-IbisHotel [10:00AM-11:00PM][10:00AM-6:00PM]
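The four item formats can be illustrated with a small string builder. This is a sketch only: the function `build_item` and its parameters are hypothetical, and it assumes the start and end times have already been generalized to labels such as "morning".

```python
def build_item(name, start_label, end_label, item_type):
    """Format a stop or move name according to one of the four item types."""
    if item_type == "NameOnly":
        return name
    if item_type == "NameStart":
        return f"{name}[{start_label}]"
    if item_type == "NameEnd":
        return f"{name}[{end_label}]"
    if item_type == "NameStartEnd":
        return f"{name}[{start_label}][{end_label}]"
    raise ValueError(f"unknown item type: {item_type}")


# A stop uses the place name; a move uses the names of its two stops.
print(build_item("LouvreMuseum", "morning", "afternoon", "NameStartEnd"))
print(build_item("ZurichAirport-IbisHotel", "morning", "morning", "NameOnly"))
```

For a move, the caller is assumed to pass the two stop names already joined, as in the second call.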

  18. Semantic Trajectory Patterns: frequent patterns, sequential patterns and association rules

  19. Trajectory Frequent Pattern
      • A set of items that occurs at least a minimal number of times (support s)
      • Examples:
        {LouvreMuseum [08:00-10:00]} (s=0.1)
        {Airport [morning], Hotel [morning]} (s=0.2)
        {Airport-Hotel, Hotel-Museum} (s=0.15)
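A support value like those in the examples can be computed as follows. This is a minimal sketch, assuming that support is the fraction of trajectories whose items include every item of the set; the slides do not spell out the formula, and the toy data below is invented for illustration.

```python
def support(itemset, trajectories):
    """Fraction of trajectories that contain every item in the itemset."""
    wanted = set(itemset)
    hits = sum(1 for items in trajectories if wanted <= set(items))
    return hits / len(trajectories)


# Invented toy data: one set of items per trajectory.
trajectories = [
    {"Airport[morning]", "Hotel[morning]", "Museum[afternoon]"},
    {"Airport[morning]", "Hotel[morning]"},
    {"Hotel[morning]", "Museum[afternoon]"},
    {"Restaurant[evening]"},
]
s = support({"Airport[morning]", "Hotel[morning]"}, trajectories)
print(s)
```

An itemset would then count as frequent when its support reaches the user-defined minimum (minsup).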

  20. Trajectory Sequential Pattern
      • An ordered list of items that occurs at least a minimal number of times (support s)
      • Examples:
        <Airport[morning], Hotel[morning], Museum[afternoon]> (s=0.15)
        <Airport-Hotel, Hotel-Museum> (s=0.1)
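Unlike a frequent itemset, a sequential pattern depends on order. The sketch below assumes that a trajectory supports a pattern when the pattern's items appear in the same order, not necessarily contiguously; that matching rule and the toy data are assumptions, not taken from the slides.

```python
def contains_sequence(trajectory, pattern):
    """True if the pattern's items occur in the trajectory in the same order."""
    remaining = iter(trajectory)
    # 'item in remaining' consumes the iterator up to the first match,
    # so each later item must be found after the previous one.
    return all(item in remaining for item in pattern)


def sequence_support(pattern, trajectories):
    """Fraction of trajectories that contain the ordered pattern."""
    hits = sum(1 for items in trajectories if contains_sequence(items, pattern))
    return hits / len(trajectories)


# Invented toy data: one ordered list of items per trajectory.
trajectories = [
    ["Airport[morning]", "Hotel[morning]", "Museum[afternoon]"],
    ["Hotel[morning]", "Airport[morning]", "Museum[afternoon]"],
    ["Airport[morning]", "Restaurant[afternoon]", "Museum[afternoon]"],
    ["Hotel[morning]"],
]
print(sequence_support(["Airport[morning]", "Museum[afternoon]"], trajectories))
print(sequence_support(["Airport[morning]", "Hotel[morning]"], trajectories))
```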

  21. Trajectory Association Rule
      • A rule whose items occur at least a minimal number of times (support s) and with a minimal confidence (c)
      • Example: Airport[morning], Hotel[morning] → Museum[afternoon] (s=0.1) (c=0.5)
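Support and confidence of a rule can be worked through on toy data. The sketch below uses the standard association-rule definitions (support of antecedent plus consequent, divided by support of the antecedent, gives confidence); the slides do not state the formulas, so treat this as an assumption, and the data is invented.

```python
def support(itemset, trajectories):
    """Fraction of trajectories that contain every item in the itemset."""
    wanted = set(itemset)
    hits = sum(1 for items in trajectories if wanted <= set(items))
    return hits / len(trajectories)


def rule_metrics(antecedent, consequent, trajectories):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), trajectories)
    base = support(antecedent, trajectories)
    confidence = both / base if base > 0 else 0.0
    return both, confidence


# Invented toy data: one set of items per trajectory.
trajectories = [
    {"Airport[morning]", "Hotel[morning]", "Museum[afternoon]"},
    {"Airport[morning]", "Hotel[morning]"},
    {"Hotel[morning]", "Museum[afternoon]"},
    {"Airport[morning]", "Hotel[morning]", "Museum[afternoon]"},
]
s, c = rule_metrics(
    {"Airport[morning]", "Hotel[morning]"}, {"Museum[afternoon]"}, trajectories
)
print(s, round(c, 3))
```

Here three of the four trajectories contain the antecedent and two of those also contain the consequent, so the rule holds in two thirds of the cases where it applies.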

  22. The Proposed Model

  23. The Proposed Model
      • We extend the model of stops and moves proposed by Spaccapietra with new attributes and methods
      • We add new classes and relationships, with attributes and methods for automatic data preprocessing and multiple-level mining

  24. The Conceptual Data Model of Stops and Moves (Spaccapietra 2008)

  25. Proposed OO Model (figure; labeled regions: Spaccapietra's model, data pre-processing, compute and store the patterns)

  26. Proposed OO Model
      • Concept hierarchy for the spatial feature type (e.g. AccommodationPlace → Hotel)
      • Stops and moves are extended with new attributes (specific time, e.g. 07:10 – 08:05) and methods to instantiate stops and moves

  27. Proposed OO Model
      • Pattern: generic class to represent the 3 kinds of patterns
      • Frequent patterns
        Attributes: support, setOfItems
        Methods: countSupport(), frequentPattern()
      • Sequential patterns
        Attributes: support, listOfItems
        Methods: countSupport(), sequentialPattern()
      • Association rules
        Attributes: support, confidence, antecedent (set of items) and consequent (set of items)
        Methods: countSupport(), associatePattern() and computeConfidence()
      • Items
        Attributes: startT, endT (generic time, e.g. Morning)
        Methods: getGenericSpatialFeature(), which retrieves the hierarchy level; timeG(), which generalizes time; spaceG(), which generalizes space based on the hierarchy; buildItem(), which creates a generalized ITEM

  28. Example of an Instantiated Model

  29. Schema of Stops and Moves
      • STOP (Tid integer, Sid integer, SFTname string, SFid integer, startT timestamp, endT timestamp)
        Ex.: stop (1, 1, Hotel, 3, 10AM, 11AM)
      • MOVE (Tid integer, Mid integer, SFT1name string, SF1id integer, SFT2name string, SF2id integer, startT timestamp, endT timestamp, the_move geometry)
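The STOP relation and its example tuple can be tried out in an in-memory database. This is a rough sketch using SQLite through Python's standard `sqlite3` module, not the system used by the authors; it assumes timestamps are stored as HH:MM text and leaves out the MOVE relation, whose geometry column has no direct SQLite equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE stop (
        Tid     INTEGER,  -- trajectory identifier
        Sid     INTEGER,  -- stop identifier
        SFTname TEXT,     -- spatial feature type name
        SFid    INTEGER,  -- spatial feature instance identifier
        startT  TEXT,     -- start time, stored as text (assumption)
        endT    TEXT      -- end time, stored as text (assumption)
    )
    """
)
# The example tuple from the slide: stop (1, 1, Hotel, 3, 10AM, 11AM).
conn.execute(
    "INSERT INTO stop VALUES (?, ?, ?, ?, ?, ?)",
    (1, 1, "Hotel", 3, "10:00", "11:00"),
)
hotel_stops = conn.execute(
    "SELECT Tid, Sid, SFTname, SFid, startT, endT FROM stop WHERE SFTname = ?",
    ("Hotel",),
).fetchall()
print(hotel_stops)
```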

  30. Schema of the Patterns
      • FrequentPattern / SequentialPattern (Pid integer, pattern itemSetType, support real)
      • itemSetType (SFT1name string, SF1id integer, SFT2name string, SF2id integer, startT string, endT string), a nested relation
      • AssociatePattern (Pid integer, antecedent itemSetType, consequent itemSetType, support real, confidence real)

  31. Instantiating and Querying Patterns
      To instantiate the patterns we can use the ST-DMQL proposed in (Bogorny 2009).

  32. Instantiating Stops and Moves
      Methods: IB-SMOT, CB-SMOT, DB-SMOT, ...
      SELECT generateS (method, candidateStops, buffer) FROM trajectory
      SELECT generateM (method, candidateStops, buffer) FROM trajectory

  33. Method in the ST-DMQL: Instantiating Sequential Patterns
      Q1 (tourism application): Which sequences of moves occur most frequently in the morning and in the evening?
      SELECT sequentialPattern (itemType = NameEnd, timeG = [8:00-12:00 AS morning, 18:00-23:00 AS evening], spaceG = instance, minsup = 0.03)
      FROM move
      Ans: {IbisHotel-NotreDame [morning], EiffelTower-IbisHotel [evening]} (s=0.04)

  34. Example of Pattern Queries
      Q: How many moves of sequential patterns cross the Pont Neuf bridge?
      SELECT count(m.*)
      FROM sequentialPattern s, bridge b, move m
      WHERE s.pattern.SFT1name = m.SFT1name AND s.pattern.SF1id = m.SF1id
        AND s.pattern.SFT2name = m.SFT2name AND s.pattern.SF2id = m.SF2id
        AND b.name = 'Pont Neuf' AND intersects (m.the_geom, b.the_geom)

  35. Conclusions
      • Data pre-processing is the most time-consuming step for DM and KDD
      • Thinking about data mining during the conceptual design of a database can significantly reduce this effort

  36. Conclusions
      The proposed model:
      • Reduces the pre-processing tasks
      • Supports mining at multiple granularity levels
      • Automatically prepares the data for data mining
      • Stores the patterns for future queries
      (Figure labels: multiple-granularity data, patterns, queries)

  37. Geometric Patterns X Semantic Patterns (Bogorny 2009)
      (Figure: trajectories T1 to T4 passing by CC, shown once as a geometric pattern and once as semantic trajectory patterns; legend: H = Hotel, R = Restaurant, TP = Touristic Place)
      • Semantic trajectory patterns: (a) Hotel to Restaurant, passing by CC; (b) go to Cinema, passing by CC

  38. Thank You!

  39. More examples for generating stops
      SELECT generateS (CB-SMOT, [Hotel,60,TouristicPlace,15,ShoppingCenter,30], 5)
      FROM trajectory t, district d
      WHERE d.name = 'Bela Vista' AND intersects (t.movingpoint.geometry, d.geometry)

  40. Querying Rules
      Suppose that the user is interested in association patterns that have weekend as the time dimension in the antecedent of the rule:
      SELECT *
      FROM associatePattern
      WHERE antecedent.startT = 'weekend' OR antecedent.endT = 'weekend'

  41. Basic Concepts: Support

  42. Basic Concepts: Semantic Trajectory Patterns
      Example: Work [morning], ShoppingCenter [afternoon], Gym [afternoon] (s=0.08%)

  43. Basic Concepts: Semantic Trajectory Patterns
      Example: Home [night], Work [afternoon] → Gym [afternoon] (s=0.10%) (c=0.50)

  44. Basic Concepts: Semantic Trajectory Patterns
      Example: ReligiousPlace [weekend], Restaurant [weekend] (s=0.07)

  45. Related Works

  46. Method in the ST-DMQL: Example of Frequent Pattern Instantiation
      Q2: Which types of places are most frequently visited by tourists on weekdays and weekends?
      SELECT frequentPattern (itemType = NameStart, timeG = WEEKEND-WEEKDAY, spaceG = [type, GenericHotel = 1], minsup = 0.15)
      FROM stop
      Ans: {4StarsHotel[weekend], Museum[weekend], Restaurant[weekend]} (s=0.16)
