Learning Patterns on the World Wide Web
Andrew Hogue
Advisor: David Karger
October 17, 2003
Agenda
• What is a pattern?
• How do we make one?
• How do we use it?
• Why do you want one?
• Demo
What is a pattern?
• Objects in the world have certain semantic properties
• A pattern is a way of recognizing the semantic properties of an object we’ve seen before
• A pattern is a structure with semantic slots to be filled in
Example – Books
• Define an object’s semantics (ontology):
Class: Book
Property: Author
Property: Title
Property: Price
Property: Publisher
Property: ISBN
. . .
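One way to picture the Book ontology is as a record whose fields are the semantic slots. The sketch below is only an illustration: the talk does not say how the ontology is stored, so the `Book` dataclass, the `slot_names` helper, and the sample values are all hypothetical. Only the five properties named on the slide are modeled; the elided ones are left out.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class Book:
    """Hypothetical record for the 'Book' class; each field is one semantic slot."""
    author: Optional[str] = None
    title: Optional[str] = None
    price: Optional[str] = None
    publisher: Optional[str] = None
    isbn: Optional[str] = None


def slot_names(cls) -> List[str]:
    """Return the slot names of an ontology class in declaration order."""
    return [f.name for f in fields(cls)]


# Made-up sample values: a book with only two slots filled in so far.
book = Book(author="Jane Doe", title="An Example Title")

# Slots that still need to be filled in from a page.
unfilled = [name for name in slot_names(Book) if getattr(book, name) is None]
print(unfilled)
```

A pattern in this picture is the structure that tells us where on a page each of these slots can be found.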
Creating a Pattern
• Choose positive examples
• Find best mapping between examples
• Merge mapped elements and assign semantic labels
• Eliminate unmapped elements
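The four steps above can be sketched on a toy representation. This is a simplification and not the method from the talk, which does not specify its mapping algorithm: each example is assumed to be a flat list of element tags, and Python's `difflib.SequenceMatcher` stands in for "find best mapping". The `build_pattern` function, the tag lists, and the label table are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Tuple

Pattern = List[Tuple[str, Optional[str]]]


def build_pattern(example_a: List[str], example_b: List[str],
                  labels: Dict[int, str]) -> Pattern:
    """Build a pattern from two positive examples.

    Elements that map between the examples are merged into the pattern;
    elements without a counterpart are dropped. `labels` maps a position
    in the merged pattern to a semantic label (a slot name).
    """
    matcher = SequenceMatcher(a=example_a, b=example_b, autojunk=False)
    merged: List[str] = []
    for block in matcher.get_matching_blocks():
        # Each block is a run of elements shared by both examples.
        merged.extend(example_a[block.a:block.a + block.size])
    return [(tag, labels.get(i)) for i, tag in enumerate(merged)]


# Two positive examples, flattened to tag sequences (made-up data).
first = ["tr", "td", "b", "td", "img", "td", "i"]
second = ["tr", "td", "b", "td", "i", "br"]

# The user says which merged elements carry meaning.
pattern = build_pattern(first, second, labels={2: "Title", 4: "Author"})
print(pattern)
```

Here `img`, the extra `td`, and `br` appear in only one example, so they are eliminated, while the shared `b` and `i` elements become the Title and Author slots.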
Matching Patterns
• Given a pattern with slots and a page to search
• Look for items on page with same structure
• Map pattern slots to page text
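As a rough sketch of the matching step, assume the page has been flattened into (tag, text) pairs and the pattern is a list of (tag, slot label) pairs. The `match_pattern` function and the sample data are hypothetical, and exact tag-sequence comparison is used here only as a stand-in for whatever structural matching the real system performs.

```python
from typing import Dict, List, Optional, Tuple


def match_pattern(pattern: List[Tuple[str, Optional[str]]],
                  page: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Find every run of page elements with the pattern's structure.

    For each match, return a dict mapping slot labels to the page text
    found at the labeled positions.
    """
    if not pattern:
        return []
    wanted_tags = [tag for tag, _ in pattern]
    results = []
    for start in range(len(page) - len(pattern) + 1):
        window = page[start:start + len(pattern)]
        if [tag for tag, _ in window] == wanted_tags:
            # Same structure: copy the text into the labeled slots.
            results.append({
                label: text
                for (_, label), (_, text) in zip(pattern, window)
                if label is not None
            })
    return results


# Made-up pattern and page.
pattern = [("b", "Title"), ("i", "Author")]
page = [
    ("h1", "Results"),
    ("b", "First Title"), ("i", "First Author"),
    ("b", "Second Title"), ("i", "Second Author"),
]

for item in match_pattern(pattern, page):
    print(item)
```

Both result rows on the toy page share the pattern's structure, so each yields its own filled-in set of slots, while the heading is skipped.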
Applications
• Extract search engine results
• Extract and email news headlines
• Watch sites for updates
• Reformat sites for easier reading
• Monitor bank account balances
More Information
http://haystack.lcs.mit.edu
ahogue@mit.edu