Topics in AI: Applied Natural Language Processing

Information Extraction and Recommender Systems for Video Games: Gameplay
Krishna Achuthan, Stephanie Hasz, Carl Staab
November 23, 2009

Initial Tasks
• Research prior work
  • Video game review analysis
  • Other product review analysis
  • Recommender methods
• Create a lexicon of domain-specific terms for named entity recognition
  • Crawling sites, existing lexicons

Previous Research
• Jose Zagal's paper
  • Reviews include different commentary types
• Found that NLP on game reviews is a largely unexplored topic
• One paper determining the polarity of adjectives from review scores
• A couple of papers using the presence of feature nouns in user reviews for search

NER & Recommender Research
• Reviewed allgame, GameFly, GameSpot, GameSpy, GiantBomb, IGN, IMDB, MobyGames
  • GiantBomb: API for retrieving metadata
  • IGN: lexicon of video game terminology
• Most sites had no “similar games” feature
  • Those that did used page views, genre, or user-submitted data

GiantBomb Extraction
• Crawled the GiantBomb game database and extracted entity names and types for each game
  • Necessary for efficient tagging
  • Established a fixed dataset to avoid unexpected errors from edits to the live database
• Entity types: games, franchises and their games, platforms, companies, genres, characters, locations, concepts
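One way to picture the fixed dataset is as a lookup from entity name to entity type, built once from a saved snapshot instead of the live site. The sketch below assumes a hypothetical record layout (`name`, `type`) and made-up sample rows; it does not reflect the GiantBomb API or the schema the project actually used.

```python
from collections import defaultdict

# Hypothetical snapshot rows; the real crawl output format is not given in the slides.
SNAPSHOT = [
    {"name": "Super Mario", "type": "FRANCHISE"},
    {"name": "Mario", "type": "CHARACTER"},
    {"name": "Super Mario World", "type": "GAME"},
    {"name": "Nintendo", "type": "COMPANY"},
    {"name": "Mario", "type": "CHARACTER"},  # duplicate row, as a crawl might produce
]


def build_lexicon(rows):
    """Map each lowercased entity name to the set of types it was seen with."""
    lexicon = defaultdict(set)
    for row in rows:
        name = row["name"].strip()
        if name:  # skip rows with an empty name
            lexicon[name.lower()].add(row["type"])
    return dict(lexicon)


LEXICON = build_lexicon(SNAPSHOT)
```

Keying on the lowercased name keeps lookups cheap during tagging, and storing a set of types lets one string (for example a character who shares a name with a franchise) carry more than one type.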

Named Entity Tagging
• Used GiantBomb data to identify named entities in review text and their types
• Tagger underwent several iterations
• Result is flexible: capitalization and level of abbreviation can be specified for different starting strings and types of NEs
• Most effective strategy: prioritize-but-overwrite-shorter

Named Entity Tagging
• Example: occurrence of “Super Mario World” in review text for “Mario Galaxy”
  • Super <Mario CHARACTER> World
  • <Super Mario FRANCHISE> World
  • <Mario TITLE_PART> tag rejected: not longer than <Super Mario FRANCHISE>
  • <Super Mario FRANCHISE> <World LOCATION>
  • <Super Mario World OTHER_GAME>
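The example can be reproduced with a small sketch of one reading of prioritize-but-overwrite-shorter: candidates are tried in priority order, and a new tag replaces the tags it overlaps only if it is strictly longer than each of them. The function names, the token-based matching, and the candidate order (copied from the steps on the slide) are assumptions; the slides do not specify the tagger's actual priority scheme.

```python
def find_spans(tokens, phrase):
    """Yield (start, end) token spans where phrase occurs, ignoring case."""
    words = phrase.lower().split()
    lowered = [t.lower() for t in tokens]
    for i in range(len(tokens) - len(words) + 1):
        if lowered[i:i + len(words)] == words:
            yield (i, i + len(words))


def tag(tokens, candidates):
    """Apply (phrase, type) candidates in priority order.

    A candidate is accepted only if it is strictly longer than every
    already-accepted tag it overlaps; the overlapped tags are then dropped.
    """
    accepted = []
    for phrase, etype in candidates:
        for start, end in find_spans(tokens, phrase):
            overlapping = [s for s in accepted if s[0] < end and start < s[1]]
            if all(end - start > s[1] - s[0] for s in overlapping):
                accepted = [s for s in accepted if s not in overlapping]
                accepted.append((start, end, etype))
    return sorted(accepted)


def render(tokens, spans):
    """Rewrite the token list with <text TYPE> markers, as on the slide."""
    by_start = {start: (end, etype) for start, end, etype in spans}
    out, i = [], 0
    while i < len(tokens):
        if i in by_start:
            end, etype = by_start[i]
            out.append("<" + " ".join(tokens[i:end]) + " " + etype + ">")
            i = end
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)


# Candidate order follows the steps listed on the slide (an assumption).
CANDIDATES = [
    ("Mario", "CHARACTER"),
    ("Super Mario", "FRANCHISE"),
    ("Mario", "TITLE_PART"),
    ("World", "LOCATION"),
    ("Super Mario World", "OTHER_GAME"),
]

TOKENS = "I still prefer Super Mario World".split()
TAGGED = render(TOKENS, tag(TOKENS, CANDIDATES))
```

With these inputs, `<Mario TITLE_PART>` is rejected because it is not longer than the franchise tag already in place, and the three-token game title finally overwrites both shorter tags.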

Defining Gameplay
• Read reviews, looking for sentences describing gameplay
  • Age of Empires III, Legend of Zelda: Twilight Princess, Animal Crossing, Gauntlet: Dark Legacy, Tony Hawk’s Pro Skater 3, Mario & Luigi: Partners in Time
• Lack of emotional content in user reviews
• Flaws described in more detail than strengths
• Reviews focus on plot description
• Categories emerged
  • Purchasing advice, story/structure, staying power/replay value, non-emotional and emotional gameplay experience, external factors

Gameplay Adjectives
• Google bigram dataset gave us 531 adjectives describing gameplay
  • Separated review files into sentences, extracted sentences containing Google adjectives
• Also extracted adjectives from GameSpot reviews
  • Needed domain-specific data
  • Adjectives might show that users are describing things we haven't considered
  • Later used for noun extraction
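The sentence-extraction step can be sketched as follows. This is a minimal illustration, assuming a naive punctuation-based sentence splitter and a three-word sample adjective set; the project's actual splitter and its full 531-word list are not shown in the slides.

```python
import re

# Small illustrative sample; the real list had 531 adjectives.
GAMEPLAY_ADJECTIVES = {"addictive", "tedious", "innovative"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentences_with_adjectives(text, adjectives):
    """Return (sentence, matched adjectives) for sentences containing any adjective."""
    hits = []
    for sentence in split_sentences(text):
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        found = sorted(words & adjectives)
        if found:
            hits.append((sentence, found))
    return hits


REVIEW = "The combat is addictive. Graphics are fine! Later levels feel tedious and slow."
HITS = sentences_with_adjectives(REVIEW, GAMEPLAY_ADJECTIVES)
```

Matching on whole lowercased words avoids false hits from substrings, at the cost of missing inflected or hyphenated forms.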

Review Adjectives
• Using the Stanford POS tagger, extracted adjectives from a subset of 3,074 reviews
  • Review subset taken from all genres with > 200 games
  • 60,000+ “adjectives”
• Manually analyzed the list for gameplay words
  • Eliminated:
    • < 20 occurrences
    • Generic qualitative adjectives
    • Personality descriptors
  • Kept: action and experience words
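The counting and frequency cut-off can be sketched as below. The sketch assumes the tagger output is already available as (word, tag) pairs with Penn Treebank adjective tags (JJ, JJR, JJS); it does not call the Stanford tagger itself. The `stoplist` parameter is a hypothetical stand-in for the manual removal of generic and personality adjectives.

```python
from collections import Counter

# Penn Treebank tags for adjectives (base, comparative, superlative).
ADJ_TAGS = {"JJ", "JJR", "JJS"}


def adjective_counts(tagged_tokens):
    """Count lowercased words whose tag marks them as adjectives."""
    return Counter(word.lower() for word, tag in tagged_tokens if tag in ADJ_TAGS)


def filter_adjectives(counts, min_count=20, stoplist=frozenset()):
    """Keep adjectives seen at least min_count times and not on the stoplist."""
    return {w: c for w, c in counts.items() if c >= min_count and w not in stoplist}


# Toy tagged data standing in for tagger output over the review subset.
TAGGED_TOKENS = (
    [("Quick", "JJ")] * 25
    + [("good", "JJ")] * 40
    + [("odd", "JJ")] * 3
    + [("game", "NN")] * 50
)

COUNTS = adjective_counts(TAGGED_TOKENS)
KEPT = filter_adjectives(COUNTS, min_count=20, stoplist={"good"})
```

Here "odd" falls below the 20-occurrence threshold and "good" is dropped as a generic qualitative adjective, leaving only "quick".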

Resultant Adjective List
• 1,141 adjectives, each with between 20 and 16,094 occurrences
• Words describing:
  • Size: massive/tiny
  • Pace: quick/slow
  • Ease: easy/impossible
  • Uniqueness: innovative/uninspired
  • Experience: addictive/tedious
  • Aesthetics: gorgeous/ugly

Towards Using Adjectives
• Extracted sentences with potentially interesting adjectives from a sample of reviews and parsed them with the Minipar parser
  • Will allow us to further refine our lists of adjectives and especially nouns of interest
• Eventually, will also use the MK-means clustering algorithm implemented this quarter to determine which adjectives are most useful
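The slides do not describe MK-means itself, so the sketch below substitutes plain k-means to illustrate the general idea of grouping adjectives by the nouns they modify. The feature vectors (counts of co-occurrence with two hypothetical noun groups), the initial centroids, and all names are assumptions made for illustration only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))


def kmeans(points, centroids, max_iter=20):
    """Plain k-means. points: name -> vector; centroids: initial vectors.

    Returns a dict mapping each name to the index of its cluster.
    """
    centroids = [tuple(c) for c in centroids]
    assignment = {}
    for _ in range(max_iter):
        new_assignment = {
            name: min(range(len(centroids)),
                      key=lambda k: squared_distance(vec, centroids[k]))
            for name, vec in points.items()
        }
        if new_assignment == assignment:  # converged
            break
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [points[n] for n, c in assignment.items() if c == k]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[k] = mean(members)
    return assignment


# Hypothetical features: (count with pace-related nouns, count with visual nouns).
VECTORS = {"quick": (9, 1), "slow": (8, 0), "gorgeous": (0, 9), "ugly": (1, 7)}
CLUSTERS = kmeans(VECTORS, centroids=[(9, 1), (0, 9)])
```

With these toy vectors the pace adjectives and the aesthetics adjectives end up in separate clusters; real features would come from the parsed adjective-noun relations.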

Interface
• Back-end functionality for a basic interface coded by Krishna
  • Uses a different database, but ASP code might be portable
  • That database contains all GiantBomb data, vs. only the GameSpot subset that has review data

Next Steps
• Cluster gameplay adjectives using MK-means
  • Description vs. experience?
• Derive categories of gameplay
• Assign games to gameplay categories
  • Extract sentences with both a gameplay adjective and noun
  • Assign games to their adjectives' categories
• Incorporate gameplay features into database
• Back-end coding of website
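The planned step of assigning games to their adjectives' categories could look roughly like this. It is a sketch only: the adjective-to-category map is hypothetical (category names borrowed from the adjective list earlier in the deck), and counting category hits per game is one simple assignment rule among several possible ones.

```python
from collections import Counter

# Hypothetical mapping from gameplay adjective to gameplay category.
ADJ_CATEGORY = {
    "quick": "pace",
    "slow": "pace",
    "easy": "ease",
    "impossible": "ease",
    "addictive": "experience",
    "tedious": "experience",
}


def game_profile(adjectives_in_reviews, adj_category):
    """Count how often each category's adjectives occur in one game's reviews."""
    return Counter(adj_category[a] for a in adjectives_in_reviews if a in adj_category)


def top_categories(profile, n=2):
    """Return the n most frequent categories for a game."""
    return [category for category, _ in profile.most_common(n)]


# Adjectives extracted from one game's reviews (toy data); "shiny" is unmapped.
PROFILE = game_profile(["quick", "slow", "addictive", "shiny", "quick"], ADJ_CATEGORY)
TOP = top_categories(PROFILE, n=1)
```

Adjectives outside the map are ignored, so the profile reflects only words that were kept as gameplay descriptors.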
