Whales & Cat Fur: Using A Semantic Net To Improve Precision & Recall
Semantic Technologies 2008
Information Tasks Today
[Figure: quadrant diagram. One axis runs from Query Well Formed to Query Not Well Formed, the other from Sources Known to Sources Not Known. The four tasks plotted are Discovery, Analysis, Exploration, and Search. A second build of the slide adds the labels Business (alongside Discovery and Analysis) and Consumer (alongside Exploration and Search).]
Current Problems
• Information Overload
  • Too many answers
  • Inability to classify or arrange effectively
  • Lack of time to read all documents
• Information Underload
  • Few or no answers
  • No classification or poor selection power of system
Current Technology
• 3 Problems with Search Technology:
1. Same Word, Different Meanings: Jaguar (animal); Jaguar (car)
2. Different Words, Same Meaning: Disability Legislation; Equal Opportunity Law
3. Different Words, Related Meaning: Organization and Company; Organization and Charity; Organization and Trade Union
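The first two problems map directly onto the precision and recall of the talk's title. The following is a minimal sketch, not the presenter's system: the four-document corpus, the relevance judgments, and the function names are all invented for illustration. It shows plain keyword matching losing precision on "jaguar" and losing recall on "disability legislation".

```python
# Toy corpus; texts and relevance judgments are invented for illustration.
DOCS = {
    "d1": "the jaguar is a big cat that hunts at night",
    "d2": "the new jaguar has a faster engine",
    "d3": "disability legislation was updated this year",
    "d4": "equal opportunity law was updated this year",
}


def keyword_search(query):
    """Return ids of documents that contain every query word."""
    terms = query.lower().split()
    return {
        doc_id
        for doc_id, text in DOCS.items()
        if all(term in text.split() for term in terms)
    }


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Problem 1 (same word, different meanings): the user wants the animal,
# but the keyword also matches the document about the car.
animal_result = precision_recall(keyword_search("jaguar"), {"d1"})

# Problem 2 (different words, same meaning): the document that says
# "equal opportunity law" is never retrieved.
law_result = precision_recall(
    keyword_search("disability legislation"), {"d3", "d4"}
)

print("jaguar:", animal_result)
print("disability legislation:", law_result)
```

In this toy setup the ambiguous query retrieves an irrelevant document, and the synonym query misses a relevant one, which is the overload and underload pattern described on the earlier slide.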
Current Problems
[Figure: chart plotting Productivity of Search against Amount of Information. Labels shown: Desktop PC Era; World Wide Web (Web 1.0); Social Web (Web 2.0); Semantic Web (Web 3.0); Files & Folders; Databases; Directories; Keyword Search (Google); Tagging; Natural Language Search; Semantic Search.]
Current Problems
• Lack of meaning-based processing
• Ignore stop words (e.g. articles & prepositions)
“I’ve been flying my bird of prey”: the bird will attack
“I’ve been flying my bird as prey”: the bird will be attacked
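A short sketch of why dropping stop words erases this distinction. The stop-word list below is an assumption chosen for the example, not one taken from any particular search engine.

```python
# Hypothetical stop-word list: articles, prepositions, and other function words.
STOP_WORDS = {"i've", "been", "my", "of", "as", "a", "the"}


def content_words(sentence):
    """Lowercase the sentence and drop every stop word."""
    return [w for w in sentence.lower().split() if w not in STOP_WORDS]


hunter = content_words("I've been flying my bird of prey")
hunted = content_words("I've been flying my bird as prey")

print(hunter)
print(hunted)
print("Indistinguishable after stop-word removal:", hunter == hunted)
```

Once "of" and "as" are discarded, both sentences reduce to the same three content words, so a system working only on that residue cannot tell the attacker from the victim.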
Current Problems
• Traditional systems have a superficial processing level: no understanding of the relation between elements and of the meaning. For example, to such a system this text:
The Dow fell 46.58, or 0.42 percent, to 11,002.14. The Standard & Poor's 500 index fell 1.44, or 0.11 percent, to 1,263.85, and the Nasdaq composite gained 6.84, or 0.32 percent, to 2,162.78.
and this text:
The Dow gained 46.58, or 0.42 percent, to 11,002.14. The Standard & Poor's 500 index fell 1.44, or 0.11 percent, to 1,263.85, and the Nasdaq composite fell 6.84, or 0.32 percent, to 2,162.78.
• are the same!
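The claim can be checked with a bag-of-words count, used here as a stand-in for "superficial processing" (an assumption; the slide does not name a specific model). The two market reports swap which index fell and which gained, yet their word counts are identical.

```python
from collections import Counter

TEXT_A = (
    "The Dow fell 46.58, or 0.42 percent, to 11,002.14. "
    "The Standard & Poor's 500 index fell 1.44, or 0.11 percent, to 1,263.85, "
    "and the Nasdaq composite gained 6.84, or 0.32 percent, to 2,162.78."
)
TEXT_B = (
    "The Dow gained 46.58, or 0.42 percent, to 11,002.14. "
    "The Standard & Poor's 500 index fell 1.44, or 0.11 percent, to 1,263.85, "
    "and the Nasdaq composite fell 6.84, or 0.32 percent, to 2,162.78."
)


def bag_of_words(text):
    """Count whitespace-separated tokens, ignoring their order."""
    return Counter(text.lower().split())


same_bag = bag_of_words(TEXT_A) == bag_of_words(TEXT_B)
print("Texts differ:", TEXT_A != TEXT_B)
print("Bags of words identical:", same_bag)
```

Each text contains "fell" twice and "gained" once, so any representation that discards word order cannot recover which index each verb belongs to.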
What is a Semantic Network?
• The heart of semantic technology
• Quality of results is derived from the complexity and richness of the network
• Includes all definitions of all words
• Includes relationships among all words
Semantic Networks
[Figure labels: Terms; Abbrev.; Concepts; Connections; Phrases; Meanings; Domains]
• Traditional technologies can only “guess” the meaning using:
  • keywords, shallow linguistics, & statistics
• Semantic Networks instead identify:
“San Jose is a geographic part of California”
“San Jose is an American city”
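A minimal sketch of how typed relations such as the two San Jose statements might be stored and queried. The `SemanticNet` class, the relation names `part_of` and `is_a`, and the two extra facts marked in the comments are assumptions made for this example; they do not describe the vendor's actual data model.

```python
from collections import defaultdict


class SemanticNet:
    """Tiny labeled graph: (concept, relation) -> set of target concepts."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, source, relation, target):
        self._edges[(source, relation)].add(target)

    def related(self, source, relation):
        """Follow one relation type transitively, starting from source."""
        seen = set()
        stack = [source]
        while stack:
            node = stack.pop()
            for target in self._edges.get((node, relation), ()):
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen


net = SemanticNet()
# The two statements from the slide:
net.add("San Jose", "part_of", "California")
net.add("San Jose", "is_a", "American city")
# Extra facts added only to show transitive lookup:
net.add("California", "part_of", "United States")
net.add("American city", "is_a", "city")

containers = net.related("San Jose", "part_of")
kinds = net.related("San Jose", "is_a")
print("San Jose is part of:", sorted(containers))
print("San Jose is a:", sorted(kinds))
```

Keeping the relation type on each edge is what lets a query for cities in the United States reach a document that only mentions San Jose, something keyword statistics alone would have to guess at.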
The Solution is Semantics
[Figure: Semantic Network and Linguistic Query Engine. Steps to establish meaning: 1 Parse, 2 Order & Priority, 3 Eliminate Ambiguity.]
• Machine understanding of text needs:
  • A semantic network
  • A parser to trace each text back to its basic elements
  • A linguistic engine to query the semantic network
  • A system to eliminate ambiguity
1. Parse
The parser assigns each element an exact logical and grammatical value by querying the semantic network with 3,500 rules.
“The salesclerk says that he can’t accept a credit card”
2. Order & Priority
Priority is set to order single words, or groups of two or more words when these work as lexical and grammatical units. Here “credit card” is an expression.
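One simple way to approximate this grouping step is a greedy longest-match against a lexicon of known expressions. The sketch below makes that assumption; the one-entry `MULTIWORD` lexicon and the `group_units` function are hypothetical and are not taken from the product described in the talk.

```python
# Hypothetical lexicon of multiword expressions, stored as lowercase tuples.
MULTIWORD = {("credit", "card")}
MAX_LEN = max(len(expr) for expr in MULTIWORD)


def group_units(tokens):
    """Greedily merge the longest known expression starting at each position."""
    units = []
    i = 0
    while i < len(tokens):
        for n in range(MAX_LEN, 1, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + n])
            if len(candidate) == n and candidate in MULTIWORD:
                units.append(" ".join(tokens[i:i + n]))
                i += n
                break
        else:
            # No expression starts here, so the word stands alone.
            units.append(tokens[i])
            i += 1
    return units


sentence = "The salesclerk says that he can't accept a credit card"
units = group_units(sentence.split())
print(units)
```

After grouping, "credit card" is a single unit, so later steps look up one meaning for the expression instead of separate meanings for "credit" and "card".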
3. Eliminate Ambiguity
• Each meaning obtains a weight according to:
  • frequency of use,
  • domains,
  • attributes of adjectives/nouns/verbs,
  • semantic congruence,
  • contextual information
• Meaning with the highest weight is assigned.
“Like a human reads & reasons”
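The weighting idea can be sketched with three of the listed signals: frequency of use, domain, and contextual congruence. Everything numeric here is invented for illustration, including the frequencies, the domain bonus of 1.0, and the related-word sets; the slide does not say how the real weights are computed or combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    label: str
    frequency: float      # assumed prior frequency of use, between 0 and 1
    domain: str
    related: frozenset    # words considered congruent with this sense


# Hypothetical sense inventory for the "Jaguar" example.
SENSES = {
    "jaguar": [
        Sense("jaguar (animal)", 0.4, "zoology",
              frozenset({"cat", "jungle", "prey", "hunts"})),
        Sense("jaguar (car)", 0.6, "automotive",
              frozenset({"engine", "drive", "dealer", "speed"})),
    ]
}


def disambiguate(word, context_words, active_domain=None):
    """Return the sense of word with the highest combined weight."""
    context = set(context_words)

    def weight(sense):
        score = sense.frequency                 # frequency of use
        if active_domain == sense.domain:       # domain match bonus
            score += 1.0
        score += len(sense.related & context)   # contextual congruence
        return score

    return max(SENSES[word], key=weight)


in_context = disambiguate("jaguar", "the jaguar hunts its prey in the jungle".split())
no_context = disambiguate("jaguar", "i saw a jaguar".split())
with_domain = disambiguate("jaguar", "i saw a jaguar".split(), active_domain="zoology")

print(in_context.label, "|", no_context.label, "|", with_domain.label)
```

With supporting context the animal sense wins, with no context the more frequent sense wins, and a known document domain can override the frequency prior, which mirrors the "highest weight is assigned" rule on the slide.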
Next Generation Technology
[Figure labels: Entity extraction; Natural lang. I/F; Disambiguation; Sentiment extraction; Categorization; Searching; Discovery]
• Semantic Intelligence
• Linguistic rules
• Sentence analysis
• Shallow text analytics
• Statistics
• Heuristic rules
• Morphological recognition
Keyword-based technologies
Superior Performance
Semantic text analysis processing speed (one CPU): 60 KB/sec
Scalability in number of CPUs: Virtually unlimited
Software memory footprint (semantic net and engine): 50 MB
Typical time of access to a concept in the semantic net: < 10^-6 sec
Number of concepts in English semantic net: 320,000
Hyponyms and hypernyms: 400,000+
Hypernyms and troponyms: 55,000
Average # of attributes for each concept: 20
Number of relations in semantic net (English): 2,800,000
Applications & Benefits
• Sentiment Monitor
• Question-Answer System (AskWiki)
• Intelligence System
Thank you
Brooke Aker
CEO of Expert System US
+1 860-614-2411
baker@expertsystem.net
www.expertsystem.net