Automatic Classification of Bookmarked Web Pages
Individual APT Presentation
January 2007
Introduction
• There are no pre-requisites
• May be useful for students who intend to follow CSA3200 (Adaptive Hypertext Systems) in 4th year
Aims
• Helping to keep bookmark files organised
• When a user chooses to bookmark a web page, the system recommends one of the user’s existing categories (instead of just the last location saved to, or the bookmark root)
How?
• 2 algorithms to perform bookmark classification
• One builds a representative document of each category (will be provided)
• Second approach is up to you
• An additional utility may be proposed to improve results
• E.g., synonym recognition
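
The "representative document" idea can be sketched roughly as follows. This is only an illustration, not the algorithm that will be provided: it assumes a category is represented by the merged word counts of its pages and that a new page is matched by cosine similarity. The function names and the sample data are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_representatives(categories):
    """Merge the word counts of all pages in a category into one
    bag of words per category (assumed form of a representative document)."""
    reps = {}
    for name, pages in categories.items():
        counts = Counter()
        for page_text in pages:
            counts.update(tokenize(page_text))
        reps[name] = counts
    return reps


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_category(page_text, reps):
    """Return the category whose representative document is most similar
    to the page being bookmarked."""
    page = Counter(tokenize(page_text))
    return max(reps, key=lambda name: cosine(page, reps[name]))


# Hypothetical bookmark categories with the text of their saved pages.
categories = {
    "Cooking": ["pasta recipe with tomato sauce",
                "bread baking recipe flour yeast"],
    "Programming": ["python tutorial functions classes",
                    "javascript tutorial browser events"],
}
reps = build_representatives(categories)
suggestion = recommend_category("easy recipe for tomato soup", reps)
```

In a real plug-in the page text would come from the fetched web page, and raw counts would probably be replaced by a weighting scheme such as TF-IDF. A synonym-recognition utility could be applied inside `tokenize`.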
Why?
• Having organised bookmark files will enable us to do…
• Automatic query generation from bookmark files
• Web page recommendation based on other people’s bookmark files
• …
How?
• Start with Open Source framework provided by Ian Bugeja in his HyperBK project
• Build algorithms
• Build evaluation platform for your system
• I will provide 8 bookmark files for you to use
• You can remove some URLs at random to see if your algorithms classify them correctly
• You will also attempt to reconstruct each bookmark file from scratch!
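
A minimal sketch of the "remove some URLs at random" check, under stated assumptions: a bookmark file is modelled as a list of (category, page text) pairs, and `overlap_classifier` is a hypothetical stand-in for whichever classification algorithm is being evaluated. None of these names come from the HyperBK framework.

```python
import random


def split_bookmarks(bookmarks, holdout_fraction, seed=0):
    """Randomly remove a fraction of the bookmarks.
    Returns (kept, held_out); the seed makes the split repeatable."""
    rng = random.Random(seed)
    shuffled = bookmarks[:]
    rng.shuffle(shuffled)
    n_held = max(1, int(len(shuffled) * holdout_fraction))
    return shuffled[n_held:], shuffled[:n_held]


def accuracy(classify, kept, held_out):
    """Fraction of removed bookmarks that the classifier puts back
    into their original category."""
    correct = sum(1 for category, text in held_out
                  if classify(text, kept) == category)
    return correct / len(held_out)


def overlap_classifier(text, kept):
    """Placeholder classifier (an assumption, not a provided algorithm):
    pick the category whose kept pages share the most words with the text."""
    words = set(text.lower().split())
    scores = {}
    for category, page_text in kept:
        shared = len(words & set(page_text.lower().split()))
        scores[category] = scores.get(category, 0) + shared
    return max(scores, key=scores.get)


# Hypothetical bookmark file: (category, text of the bookmarked page).
bookmarks = [
    ("Cooking", "pasta recipe with tomato sauce"),
    ("Cooking", "bread baking recipe with flour"),
    ("Cooking", "tomato soup recipe"),
    ("Programming", "python tutorial on functions"),
    ("Programming", "javascript tutorial on events"),
    ("Programming", "python classes tutorial"),
]
kept, held_out = split_bookmarks(bookmarks, holdout_fraction=0.33, seed=1)
score = accuracy(overlap_classifier, kept, held_out)
```

If every bookmark of a category ends up removed, that category can no longer be recommended, so results on very small files should be read with care. Reconstructing a file from scratch is the harder case, since no categories are kept to compare against.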
Evaluation
• I will provide another 20 bookmark files (with some URLs randomly removed) for you to use to evaluate your algorithms
• Students who have the best performing algorithms and best reports will have the opportunity to continue working on the system for FYP and to submit a co-authored paper to a leading IR/Adaptive systems conference
Tools
• I recommend…
• Mozilla Firefox
• XUL (XML User Interface Language) and JavaScript
• A tutorial on XUL will be provided
• Google API
• You’ll be able to use Ian Bugeja’s framework and your plug-in will be portable!
• But you’re free to use any other browser, platform, language