Project Version 2.0 May 6, 2009 Seungmi Lee Mary Burns Ching-Tien Wu (Claire)
Business Case/Objectives
• Movie fans do travel to the filming locations of their favorite movies (example: Field of Dreams). However, finding out about those locations can be very difficult. Therefore, there is a need for a website that will:
1. cater to a specific market niche for which targeted advertising/product tie-ins can be designed
2. develop a strong brand for people who want to explore and/or travel to movie locations
3. take advantage of external APIs to decrease development time, cost, and maintenance
4. encourage customers to purchase from our revenue partners, such as Amazon.com
Our Functionality
• Our one-stop-shopping website functionality includes:
1. providing movie/travel information
2. recommending travel locations based on user preferences
3. displaying locations on maps (via Google Maps)
4. integrating pictures of movie locations from Flickr
5. providing weather information from Weather.com
6. enabling customers to buy products from Amazon.com
7. supporting flight, hotel, and car rental reservations via Kayak.com
Novelty
• 4,762 Movie Titles & 2,733 Movie Locations
• Combines movie and vacation preferences in making recommendations to users
• Stores user preferences between visits
• Provides one-stop-shopping for users who want not only travel recommendations but also the ability to book reservations or purchase related materials
• Steers away from New York City and Los Angeles, which dominate competitors’ sites
Architecture
[Architecture diagram. Labels shown: MovieLocationQuest Client; Tomcat Server; Recommendation component; MySQL (Database); Movie database from Web sources; Open APIs; Google Maps; Flickr; Kayak.com; Weather.com; IMDb.com; Amazon.com Associates]
Database/Data Mining
Extract, Transform, and Load into our 14 tables:
• IMDb contains > 750,000 Movie/TV Titles & > 300,000 Movie Locations!
• To limit our prototype and to introduce “novelty”, we limited our data set to the 69 U.S. cities with the largest number of movie locations, excluding NY and LA
• Version 2.0 contains: 4,762 Movie Titles & 2,733 Movie Locations
• Includes airport code and zip code tables with latitude/longitude
Data Mining:
• Mined IMDb.com and Yahoo! Travel to build, for each city location, a binary vector of genre and vacation-style attributes:
- 21 genre attributes, such as FilmNoir, Adventure, or Western
- 26 vacation style attributes, such as RomanticDest, OutdoorActivities, or Shopping
• Cities are compared against weighted user preferences using the Jaccard similarity function
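The city-limiting step above can be sketched in a few lines. This is only an illustration, not the project's actual ETL code: the rows, the `top_cities` helper, and the city spellings are hypothetical, and the limit is set to 2 so the toy data stays readable (the prototype used 69).

```python
from collections import Counter

# Hypothetical (title, city) filming-location rows; the real rows came from IMDb.
LOCATION_ROWS = [
    ("Movie A", "New York"),
    ("Movie B", "New York"),
    ("Movie C", "Los Angeles"),
    ("Movie D", "Boston"),
    ("Movie E", "Boston"),
    ("Movie F", "Chicago"),
    ("Movie G", "Boston"),
    ("Movie H", "Chicago"),
    ("Movie I", "Savannah"),
]

# Cities left out on purpose, as described on the slide.
EXCLUDED_CITIES = {"New York", "Los Angeles"}


def top_cities(rows, limit, excluded=EXCLUDED_CITIES):
    """Return the `limit` cities with the most filming locations, skipping excluded ones."""
    counts = Counter(city for _, city in rows if city not in excluded)
    return [city for city, _ in counts.most_common(limit)]


selected = top_cities(LOCATION_ROWS, limit=2)  # the prototype would use limit=69
kept_rows = [row for row in LOCATION_ROWS if row[1] in selected]
```

Only rows for the selected cities would then be loaded into the database tables.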
Data Mining/Vector Space
• Mined IMDb.com and various web sites about the cities to create a binary vector space combining 2 types of attributes: 21 movie genres and 26 vacation styles
• VECTORS(LocationID, Action, AdventureMovie, Biography, Comedy, Crime, Documentary, Drama, Family, Fantasy, FilmNoir, History, Horror, Music, Musical, Mystery, RomanceMovie, SciFi, Sport, Thriller, War, Western, RomanticDest, FamilyDest, AdventureDest, RoadTrip, Singles, Budget, Luxury, Historic, Art, Architecture, Sightseeing, Museums, Beach, OutdoorActivities, Spa, SkiSnow, Dining, Golf, Shopping, Nightlife, Music, Themepark, Summer, Winter, Weekend, Honeymoon)
• Examples:
• Beverly Hills (0,0,1,1,1,0,0,0,1,1,0,1,0,0,0,1,1,0,1,0,1,0,0,0,1,0,0,1,0,0,0,1,0,0,0,1,0,0,1,1,1,1,1,0,1,0,0)
• Boston (0,0,0,0,1,1,1,0,1,1,0,1,0,0,1,0,0,1,0,0,0,1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,1,1,0,1,1,1,0,1,0,1,1)
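To make the layout of these 47-position vectors concrete, here is a minimal sketch of encoding and decoding them. The column order is copied from the VECTORS listing above; the `encode` and `decode` helpers are hypothetical and not part of the project. The two attribute groups are kept in separate lists because the name Music appears in both.

```python
# Column order taken from the VECTORS listing (LocationID omitted).
GENRES = [
    "Action", "AdventureMovie", "Biography", "Comedy", "Crime", "Documentary",
    "Drama", "Family", "Fantasy", "FilmNoir", "History", "Horror", "Music",
    "Musical", "Mystery", "RomanceMovie", "SciFi", "Sport", "Thriller", "War",
    "Western",
]
VACATION_STYLES = [
    "RomanticDest", "FamilyDest", "AdventureDest", "RoadTrip", "Singles",
    "Budget", "Luxury", "Historic", "Art", "Architecture", "Sightseeing",
    "Museums", "Beach", "OutdoorActivities", "Spa", "SkiSnow", "Dining",
    "Golf", "Shopping", "Nightlife", "Music", "Themepark", "Summer", "Winter",
    "Weekend", "Honeymoon",
]


def encode(genres, styles):
    """Build a 47-position binary vector from two sets of attribute names."""
    genre_bits = tuple(int(name in genres) for name in GENRES)
    style_bits = tuple(int(name in styles) for name in VACATION_STYLES)
    return genre_bits + style_bits


def decode(vector):
    """Split a 47-position binary vector back into (genres, styles) name lists."""
    if len(vector) != len(GENRES) + len(VACATION_STYLES):
        raise ValueError("expected 47 positions")
    genre_bits = vector[:len(GENRES)]
    style_bits = vector[len(GENRES):]
    genres = [name for name, bit in zip(GENRES, genre_bits) if bit]
    styles = [name for name, bit in zip(VACATION_STYLES, style_bits) if bit]
    return genres, styles


# The Boston example vector from the slide.
BOSTON = (0,0,0,0,1,1,1,0,1,1,0,1,0,0,1,0,0,1,0,0,0,
          1,1,1,1,1,0,0,1,1,1,1,1,1,1,0,1,1,0,1,1,1,0,1,0,1,1)

boston_genres, boston_styles = decode(BOSTON)
```

Decoding a vector this way is a quick sanity check that a row's bits line up with the intended column names.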
Recommender System
In the recommendation section, we created a survey to collect weighted user preferences for both sets of attributes (movie genres and vacation styles). The user preferences can be saved. We used the Jaccard similarity function to find the cities closest to a user’s preferences.
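The slides do not give the exact formula used to combine survey weights with the binary city vectors. One common choice, assumed here purely for illustration, is the generalized (weighted) Jaccard similarity: the sum of element-wise minima divided by the sum of element-wise maxima. With 0/1 weights it reduces to the ordinary Jaccard similarity. The five attributes, the city names, and the weights below are hypothetical toy data.

```python
# Toy attribute order (hypothetical subset of the 47 real columns).
ATTRIBUTES = ["Comedy", "Western", "Beach", "Museums", "Nightlife"]

# Hypothetical binary city vectors in ATTRIBUTES order.
CITY_VECTORS = {
    "CityA": (1, 0, 1, 0, 1),
    "CityB": (0, 1, 0, 1, 0),
    "CityC": (1, 1, 0, 1, 0),
}

# Hypothetical survey weights in [0, 1], one per attribute.
USER_WEIGHTS = (1.0, 0.0, 0.5, 0.0, 1.0)


def weighted_jaccard(weights, city_vector):
    """Generalized Jaccard: sum of minima over sum of maxima (an assumed formula)."""
    if len(weights) != len(city_vector):
        raise ValueError("vectors must have the same length")
    overlap = sum(min(w, x) for w, x in zip(weights, city_vector))
    total = sum(max(w, x) for w, x in zip(weights, city_vector))
    return overlap / total if total else 0.0


def recommend(weights, city_vectors, top_n=3):
    """Return the top_n (city, score) pairs, highest similarity first."""
    scored = [(weighted_jaccard(weights, vec), city)
              for city, vec in city_vectors.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(city, score) for score, city in scored[:top_n]]


ranking = recommend(USER_WEIGHTS, CITY_VECTORS)
```

In this toy data the user cares most about Comedy and Nightlife, so the city that has both of those attributes ranks first.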
Scalability
• Vectors for new cities:
- Genre attributes can be added automatically
- Vacation style attributes require manual intervention
• Data for all cities/locations exists on IMDb.com; the downside is the amount of pre-processing the IMDb data requires, especially for locations
• All APIs are scalable
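The slides do not say how genre attributes are added automatically. One plausible approach, shown here only as a sketch under that assumption, is to set a city's genre bit whenever at least one movie filmed there carries that genre. The records and the `genre_bits` helper are hypothetical; the genre order follows the VECTORS listing.

```python
# Genre column order from the VECTORS listing.
GENRES = [
    "Action", "AdventureMovie", "Biography", "Comedy", "Crime", "Documentary",
    "Drama", "Family", "Fantasy", "FilmNoir", "History", "Horror", "Music",
    "Musical", "Mystery", "RomanceMovie", "SciFi", "Sport", "Thriller", "War",
    "Western",
]

# Hypothetical records: (city, genres of one movie filmed there).
MOVIES_BY_CITY = [
    ("CityA", {"Drama", "RomanceMovie"}),
    ("CityA", {"Comedy"}),
    ("CityB", {"Western", "Drama"}),
]


def genre_bits(city, records):
    """Set a genre bit if any movie filmed in `city` has that genre (assumed rule)."""
    seen = set()
    for record_city, genres in records:
        if record_city == city:
            seen |= genres
    return tuple(int(name in seen) for name in GENRES)


city_a_bits = genre_bits("CityA", MOVIES_BY_CITY)
```

The 26 vacation style bits would still have to be filled in by hand, as the slide notes.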
Contributions
• Seungmi Lee
- Project Manager
- Web Service Integration / Mash-up
- Java Server Pages/User Interface
- Data Loading
- Branding
• Mary Burns
- Database Design and Implementation
- Recommender System
- Data Pre-processing & Loading
• Ching-Tien Wu (Claire)
- Business Concept Refinement
- Data Mining
- Data Pre-processing & Loading
Future Directions
• Add more cities/locations/vectors to expand internationally
• Refine vectors to location-level vs. city-level
• Include more user preference choices, such as preferred travel time period or preferred cities
• Include links to additional movie data
• Add links to city data (example: Wikipedia)
• Add user forums
• Add another revenue source, similar to Google AdSense