Maintaining Knowledge-Bases of Navigational Patterns from Streams of Navigational Sequences. Ajumobi Udechukwu, Ken Barker, Reda Alhajj Proceedings of the 15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications (RIDE-SDMA’05)
Maintaining Knowledge-Bases of Navigational Patterns from Streams of Navigational Sequences Ajumobi Udechukwu, Ken Barker, Reda Alhajj Proceedings of the 15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications (RIDE-SDMA’05) Advisor: Jia-Ling Koh Speaker: Chun-Wei Hsieh
Introduction • Navigational patterns: traversal patterns • Two broad techniques for mining navigational patterns • 1. level-wise, apriori-based techniques • 2. tree-based techniques
Methodology • Sliding window • Batch-update strategy • Batch: the portion of the web log collected in one base time unit
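The sliding window and batch-update strategy can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `SlidingWindow`, the window size measured in batches, and the batch identifiers are all assumptions made for the example.

```python
from collections import deque


class SlidingWindow:
    """Keeps the most recent `size` batches (hypothetical helper)."""

    def __init__(self, size):
        self.size = size          # window length, counted in batches (assumed)
        self.batches = deque()    # (batch_id, sequences), oldest first

    def add_batch(self, batch_id, sequences):
        """Append a new batch and return the ids of batches that expired."""
        self.batches.append((batch_id, sequences))
        expired = []
        while len(self.batches) > self.size:
            old_id, _ = self.batches.popleft()
            expired.append(old_id)
        return expired

    def batch_ids(self):
        return [batch_id for batch_id, _ in self.batches]


window = SlidingWindow(size=2)
window.add_batch("B1", ["LQR", "LQ"])
window.add_batch("B2", ["L"])
expired = window.add_batch("B3", ["QR"])  # B1 slides out of the window
```

Updating once per batch rather than once per sequence means the pattern store only has to be touched when a base time unit closes.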
Adapted GST • Adapted generalized suffix tree • Appending a stop symbol to all strings • Mining without thresholds
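To make the idea concrete, here is a small sketch that uses a plain (uncompressed) suffix trie as a stand-in for the adapted GST. The stop symbol `$`, the names `Node`, `build_suffix_trie` and `pattern_counts`, and the choice to count occurrences are assumptions for illustration. Every substring is kept with its count, so no support threshold is involved.

```python
STOP = "$"  # assumed stop symbol appended to every sequence


class Node:
    def __init__(self):
        self.children = {}
        self.count = 0  # number of occurrences of the path label


def build_suffix_trie(sequences):
    """Insert every suffix of every sequence (plus stop symbol) into a trie."""
    root = Node()
    for seq in sequences:
        symbols = list(seq) + [STOP]
        for start in range(len(symbols)):
            node = root
            for sym in symbols[start:]:
                node = node.children.setdefault(sym, Node())
                node.count += 1
    return root


def pattern_counts(root):
    """Collect every pattern (path label without the stop symbol) and its count."""
    counts = {}
    stack = [("", root)]
    while stack:
        label, node = stack.pop()
        for sym, child in node.children.items():
            if sym == STOP:
                continue  # the stop symbol only marks the end of a sequence
            counts[label + sym] = child.count
            stack.append((label + sym, child))
    return counts


counts = pattern_counts(build_suffix_trie(["LQR", "LQ"]))
# counts["L"] and counts["LQ"] are both 2; counts["LQR"] is 1
```

A real generalized suffix tree compresses single-child chains into one edge, which is what makes the count of a prefix such as "L" implicit in a longer node label.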
Adapted GST • [Figure: example adapted GST; recoverable node labels: LQR, LQ]
The Challenge of Adapted GST • “LQ” occurs in B1 with a support count of 4, and “L” occurs independently in B2 with a support count of 2 • Every occurrence of “LQ” is also an occurrence of “L”, so the total count of “L” should be 4 + 2 = 6
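The counting problem can be illustrated with a short sketch. It assumes each batch is summarized as a mapping from node label to count taken from a suffix tree, so that the shortest label starting with a pattern is the node covering that pattern. The function names and this representation are hypothetical, not the AC-NAP structure itself.

```python
def support_in_batch(nodes, pattern):
    """Support of `pattern` in one batch.

    `nodes` maps node labels to counts. The pattern may end in the middle
    of a compressed edge, so use the shortest node label that starts with it.
    """
    covering = [label for label in nodes if label.startswith(pattern)]
    if not covering:
        return 0
    return nodes[min(covering, key=len)]


def total_support(batches, pattern):
    """Sum the per-batch supports of `pattern` over all batches."""
    return sum(support_in_batch(nodes, pattern) for nodes in batches)


b1 = {"LQ": 4}  # "L" never appears as its own node in B1
b2 = {"L": 2}   # "L" occurs independently in B2
total = total_support([b1, b2], "L")  # 4 + 2 = 6
```

A naive lookup of the exact label "L" would find it only in B2 and report 2, which is the undercount the slide warns about.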
AC-NAP tree 2 • Output all node labels and counts to a database
Maintaining patterns within a window • Count the total support of each pattern across the batches in the window • Remove out-of-date patterns
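A minimal in-memory sketch of these two maintenance steps follows. It assumes pattern counts are stored per batch, so that dropping a batch removes exactly its contribution. The class `PatternStore` and its methods are hypothetical, and a dictionary stands in for the database table.

```python
from collections import Counter


class PatternStore:
    """Per-batch pattern counts for the batches currently in the window."""

    def __init__(self):
        self.by_batch = {}  # batch_id -> Counter of pattern counts

    def add_batch(self, batch_id, counts):
        self.by_batch[batch_id] = Counter(counts)

    def remove_batch(self, batch_id):
        """Drop an out-of-date batch together with all of its counts."""
        self.by_batch.pop(batch_id, None)

    def total_support(self):
        """Total support of every pattern over the batches still stored."""
        total = Counter()
        for counts in self.by_batch.values():
            total.update(counts)
        return total


store = PatternStore()
store.add_batch("B1", {"L": 4, "LQ": 4})
store.add_batch("B2", {"L": 2})
before = store.total_support()  # "L" has support 6
store.remove_batch("B1")        # B1 has left the window
after = store.total_support()   # "L" drops to 2, "LQ" disappears
```

Keeping counts per batch trades some storage for a cheap removal step: no pattern has to be re-mined when the window slides.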
Experiments • OS: Microsoft Windows XP Professional Edition • CPU: 2 GHz Intel Pentium 4 • RAM: 512 MB • Programming language: Java • DBMS: MySQL • Data: real-world web logs of “msnbc.com”