Extract structured information from automobile advertisements on Craigslist, where only rudimentary keyword search is available and ads with specific attributes are difficult to find. The target attributes are make, model, price, year, mileage, transmission, posted by, location, and contact information.
Information Extraction From Automobile Advertisements
Nipun Bhatia, Rakshit Kumar, Shashank Senapaty
Problem Definition
• Craigslist: rudimentary keyword search.
• Not a natural way to search for cars.
• Difficult to efficiently find ads with particular attributes.
• Want structured search over attributes.
• Attributes: Make, Model, Price, Year, Mileage, Transmission, PostedBy, Location, Contact
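As a rough illustration of what structured search over these attributes could look like once the fields are extracted, here is a minimal Python sketch. The `CarAd` class, the `search` helper, and the two sample ads are hypothetical and made up for this example; they are not taken from the slides or from Craigslist.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CarAd:
    """One extracted ad; field names mirror the attribute list (hypothetical schema)."""
    make: Optional[str] = None
    model: Optional[str] = None
    price: Optional[int] = None
    year: Optional[int] = None
    mileage: Optional[int] = None
    transmission: Optional[str] = None
    posted_by: Optional[str] = None
    location: Optional[str] = None
    contact: Optional[str] = None


def search(ads, **criteria):
    """Return the ads whose fields equal every given criterion."""
    return [
        ad for ad in ads
        if all(getattr(ad, field) == value for field, value in criteria.items())
    ]


# Made-up sample records, for illustration only.
ads = [
    CarAd(make="Honda", model="Civic", year=1998, price=3500),
    CarAd(make="Toyota", model="Camry", year=2002, price=5200),
]
matches = search(ads, make="Honda", year=1998)
```

With extracted fields, a query such as `search(ads, make="Honda", year=1998)` filters on attribute values directly instead of relying on keywords happening to appear in the ad text.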
Dataset & Issues
• 350 postings from the cars & trucks section in Craigslist.
• Manually annotated with the attributes.
Feature Selection
• Features:
• Title: isPresentLexicon, hasDollar_hasDigit, hasParanthesis, hasDigit, hasApostrophe_hasDigit, PrevLabel, Word
• Body: isPresentTrLexicon, isPresentOwLexicon, hasDigit_hasDash, hasDigit_hasDot, hasDigit_hasParanthesis, Word_Representation, Neighbor
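The title features can be sketched as a per-token feature function. The slides give only the feature names, so the definitions below are assumptions inferred from those names: for example, hasApostrophe_hasDigit is taken to fire on tokens like '98, and hasDollar_hasDigit on tokens like $3500. The lexicon contents, the `title_features` function, and the sample title are hypothetical.

```python
import re

# Hypothetical lexicon of makes and models; the slides do not list its contents.
MAKE_MODEL_LEXICON = {"honda", "civic", "toyota", "camry", "ford"}

_DIGIT = re.compile(r"\d")


def title_features(tokens, i, prev_label, lexicon=MAKE_MODEL_LEXICON):
    """Build a feature dict for tokens[i], keyed by the slide's title feature names."""
    word = tokens[i]
    has_digit = bool(_DIGIT.search(word))
    return {
        # Assumed: lexicon lookup on the lowercased token without punctuation.
        "isPresentLexicon": word.lower().strip("(),.") in lexicon,
        # Assumed: a dollar sign together with a digit, e.g. a price.
        "hasDollar_hasDigit": "$" in word and has_digit,
        # Assumed: the token carries an opening or closing parenthesis.
        "hasParanthesis": "(" in word or ")" in word,
        "hasDigit": has_digit,
        # Assumed: an apostrophe together with a digit, e.g. an abbreviated year.
        "hasApostrophe_hasDigit": "'" in word and has_digit,
        # Label assigned to the previous token by the classifier.
        "PrevLabel": prev_label,
        "Word": word.lower(),
    }


# Made-up title, split on whitespace.
tokens = "'98 Honda Civic $3500 (Palo Alto)".split()
features = [title_features(tokens, i, "START") for i in range(len(tokens))]
```

In actual sequential labeling, `prev_label` would be the label predicted for the preceding token rather than the constant "START" used here. The body features would follow the same pattern with their own lexicons and neighbor words.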
Results
[Figures: Body Classifier results; Title Classifier results]