Concept Description Vectors and the 20 Questions Game • Włodzisław Duch, Tomasz Sarnatowicz, Julian Szymański
Semantic Memory • Permanent container for general knowledge • Hierarchical model (Collins & Quillian, 1969) • Semantic network (Collins & Loftus, 1975)
Semantic Space • All concepts and keywords together form a semantic matrix
Concept Description Vectors • CDV: a vector of properties describing a single concept • Most elements are 0, so the vector is sparse
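A minimal sketch of how sparse CDVs and the resulting semantic matrix might be stored. The concepts, keywords, and function names here (`cdv`, `semantic_matrix`, and so on) are hypothetical illustrations, not taken from the slides; the sketch only assumes that each CDV keeps its non-zero properties and that the matrix has one row per concept and one column per keyword.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: each concept stores only its non-zero properties,
# which is one simple way to represent a sparse vector.
cdv: Dict[str, Dict[str, int]] = {
    "dog": {"is_animal": 1, "barks": 1, "has_fur": 1},
    "cat": {"is_animal": 1, "has_fur": 1},
    "car": {"has_wheels": 1},
}


def all_keywords(cdvs: Dict[str, Dict[str, int]]) -> List[str]:
    """Collect every keyword used by any concept, in a fixed (sorted) order."""
    return sorted({k for vec in cdvs.values() for k in vec})


def to_dense(vec: Dict[str, int], keys: List[str]) -> List[int]:
    """Expand one sparse CDV into a dense row; missing keywords become 0."""
    return [vec.get(k, 0) for k in keys]


def semantic_matrix(
    cdvs: Dict[str, Dict[str, int]]
) -> Tuple[List[str], Dict[str, List[int]]]:
    """Build the concept-by-keyword matrix as a dict of dense rows."""
    keys = all_keywords(cdvs)
    return keys, {concept: to_dense(vec, keys) for concept, vec in cdvs.items()}


keys, matrix = semantic_matrix(cdv)
```

Storing only the non-zero entries keeps memory proportional to the number of known properties rather than to the full keyword vocabulary.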
Data Sources I • Machine-readable dictionaries and ontologies: • WordNet • ConceptNet • SUMO/MILO ontology
Data Sources II • Data retrieval from dictionaries • On-line sources: Merriam-Webster, WordNet (gloss), MSN Encarta • Approach: word morphing, phrase extraction (with a POS tagger), statistical analysis
Data access • Binary dictionary search: 20 yes/no questions can distinguish 2^20 = 1,048,576 entries • Plain binary search is not acceptable in complex semantic applications • Instead, the concept space is narrowed by subsequent queries
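The 2^20 figure can be checked with a short sketch. It assumes an idealized game in which every yes/no question splits the remaining candidates in half; the function name `questions_needed` is a hypothetical helper, not something from the slides.

```python
def questions_needed(n_concepts: int) -> int:
    """Count the yes/no questions needed to isolate one item out of n_concepts,
    assuming each question halves the candidate set and the worst case keeps
    the larger half."""
    count = 0
    remaining = n_concepts
    while remaining > 1:
        remaining = (remaining + 1) // 2  # larger half survives in the worst case
        count += 1
    return count


dictionary_size = 2 ** 20
needed = questions_needed(dictionary_size)
```

Real keywords rarely split a concept set exactly in half, which is one reason a fixed binary search is a poor fit and keyword selection has to adapt to the answers given so far.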
20 Questions Game Algorithm • p(keyword = v_i) is the fraction of concepts for which the keyword has value v_i • Candidate concepts O(A) are selected according to O(A) = {i : |CDV_i − A| is minimal}, where CDV_i is the vector for concept i and A is the partial vector of answers retrieved so far
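The loop described above can be sketched as follows. Two points are assumptions rather than statements from the slide: the next keyword is chosen to maximize the entropy −Σ p(keyword = v_i) log p(keyword = v_i) over the current candidates, and |CDV_i − A| is taken as the sum of absolute differences over the keywords answered so far. The toy data and all names (`TOY_CDVS`, `play`, `oracle`) are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical toy CDVs with binary values: 1 = yes, 0 = no.
TOY_CDVS: Dict[str, Dict[str, int]] = {
    "dog":   {"is_animal": 1, "barks": 1, "flies": 0, "has_wheels": 0},
    "cat":   {"is_animal": 1, "barks": 0, "flies": 0, "has_wheels": 0},
    "eagle": {"is_animal": 1, "barks": 0, "flies": 1, "has_wheels": 0},
    "car":   {"is_animal": 0, "barks": 0, "flies": 0, "has_wheels": 1},
}


def entropy(keyword: str, candidates: List[str]) -> float:
    """Entropy of the keyword's values, where p is the fraction of candidates
    having each value (assumed selection criterion)."""
    counts = Counter(TOY_CDVS[c][keyword] for c in candidates)
    total = len(candidates)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def distance(concept: str, answers: Dict[str, int]) -> int:
    """|CDV_i - A| restricted to the keywords answered so far (assumed L1 norm)."""
    return sum(abs(TOY_CDVS[concept][k] - v) for k, v in answers.items())


def candidates_for(answers: Dict[str, int]) -> List[str]:
    """O(A): all concepts whose distance to the partial answer vector is minimal."""
    dists = {c: distance(c, answers) for c in TOY_CDVS}
    best = min(dists.values())
    return [c for c, d in dists.items() if d == best]


def play(oracle: Callable[[str], int],
         max_questions: int = 20) -> Tuple[List[str], Dict[str, int]]:
    """Ask up to max_questions keywords; oracle(keyword) returns the answer."""
    answers: Dict[str, int] = {}
    candidates = list(TOY_CDVS)
    keywords = list(next(iter(TOY_CDVS.values())))
    for _ in range(max_questions):
        unasked = [k for k in keywords if k not in answers]
        if len(candidates) == 1 or not unasked:
            break
        question = max(unasked, key=lambda k: entropy(k, candidates))
        answers[question] = oracle(question)
        candidates = candidates_for(answers)
    return candidates, answers


# Usage: the "player" is thinking of an eagle and answers truthfully.
guess, asked = play(lambda keyword: TOY_CDVS["eagle"][keyword])
```

Selecting candidates by minimal distance, rather than by exact match, means a single wrong or uncertain answer does not necessarily eliminate the correct concept.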
Word puzzles • The 20 questions game reversed • The concept is known • The keywords are the ones that would lead to that concept
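One way to picture the reversed game is a greedy search for keywords that single out a known concept. This is only an illustrative sketch: the greedy rule, the toy data, and the name `puzzle_keywords` are assumptions, and the slides do not say how the keywords are actually chosen.

```python
from typing import Dict, List, Tuple

# Hypothetical toy CDVs with binary values: 1 = yes, 0 = no.
PUZZLE_CDVS: Dict[str, Dict[str, int]] = {
    "dog":   {"is_animal": 1, "barks": 1, "flies": 0, "has_wheels": 0},
    "cat":   {"is_animal": 1, "barks": 0, "flies": 0, "has_wheels": 0},
    "eagle": {"is_animal": 1, "barks": 0, "flies": 1, "has_wheels": 0},
    "car":   {"is_animal": 0, "barks": 0, "flies": 0, "has_wheels": 1},
}


def puzzle_keywords(
    target: str, cdvs: Dict[str, Dict[str, int]]
) -> Tuple[Dict[str, int], List[str]]:
    """Greedily pick keyword values of the target that rule out the most rivals.
    Returns the chosen clues and any rivals that could not be ruled out."""
    rivals = [c for c in cdvs if c != target]
    unused = list(cdvs[target])
    clues: Dict[str, int] = {}
    while rivals and unused:
        # Keyword on which the largest number of remaining rivals differ from the target.
        best = max(
            unused,
            key=lambda k: sum(cdvs[c][k] != cdvs[target][k] for c in rivals),
        )
        eliminated = [c for c in rivals if cdvs[c][best] != cdvs[target][best]]
        if not eliminated:
            break  # no remaining keyword separates the target from the rivals
        clues[best] = cdvs[target][best]
        rivals = [c for c in rivals if c not in eliminated]
        unused.remove(best)
    return clues, rivals


clues, unresolved = puzzle_keywords("eagle", PUZZLE_CDVS)
```

In this toy data a single clue is enough for "eagle", while "cat" needs several, since it is defined mostly by properties it lacks.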