Text Analytics Evaluation A Case Study: Amdocs

Text Analytics Evaluation A Case Study: Amdocs Tom Reamy, Chief Knowledge Architect, KAPS Group http://www.kapsgroup.com

Text Analytics Evaluation Case Study • Agenda • Introduction – Text Analytics Basics • Evaluation Process & Methodology Two Stages – Initial Filters & POC • Initial Evaluation Results • Proof of Concept • Methodology • Results • Final Recommendation • Sentiment Analysis and Beyond • Conclusions

KAPS Group: General • Knowledge Architecture Professional Services • Virtual Company: Network of consultants – 8-10 • Partners – SAS – 2 Whitepapers (Semantic infrastructure) • GAO, FDA, Amdocs – Sales & Development Projects • Other Partners: Smart Logic, FAST, Concept Searching, etc. • Consulting, Strategy, Knowledge architecture audit • Services: • Text Analytics evaluation, development, consulting, customization • Knowledge Representation – taxonomy, ontology, Prototype • Knowledge Management: Collaboration, Expertise, e-learning • Applied Theory – Faceted taxonomies, complexity theory, natural categories

Text Analytics Evaluation Case Study Text Analytics Features • Noun Phrase Extraction • Catalogs with variants, rule based dynamic • Multiple types, custom classes – entities, concepts, events • Feeds facets • Summarization • Customizable rules, map to different content • Fact Extraction • Relationships of entities – people-organizations-activities • Ontologies – triples, RDF, etc. • Sentiment Analysis • Rules – Objects and phrases

Introduction to Text Analytics: Text Analytics Features • Auto-categorization • Training sets – Bayesian, Vector space • Terms – literal strings, stemming, dictionary of related terms • Rules – simple – position in text (Title, body, url) • Semantic Network – Predefined relationships, sets of rules • Boolean – Full search syntax – AND, OR, NOT • Advanced – DIST(#), PARAGRAPH, SENTENCE • This is the most difficult to develop • Build on a Taxonomy • Combine with Extraction • If any of list of entities and other words

Evaluating Text Analytics Software Start with Self Knowledge • Strategic and Business Context • Strategic Questions – why, what value from the taxonomy/text analytics, how are you going to use it • Info Problems – what, how severe • Formal Process - KA audit – content, users, technology, business and information behaviors, applications - Or informal for smaller organization, application specific initiatives • Text Analytics Strategy/Model – forms, technology, people • Existing taxonomic resources, software • Need this foundation to evaluate and to develop

Evaluating Text Analytics Software Start with Self Knowledge • Do you need it – and what blend if so? • Taxonomy Management Full Functionality • Multiple taxonomies, languages, authors-editors • Technology Environment – Text Mining, ECM, Enterprise Search • Where is it embedded, integration issues • Publishing Process – where and how is metadata being added – now and projected future • Can it utilize auto-categorization, entity extraction, summarization • Applications – text mining, BI, CI, Social Media, Mobile?

Evaluation Process & Methodology: Team - Interdisciplinary • IT – Large software purchase, needs assessment • Text Analytics is different – semantics • Construction company designing your house • Business – Understand the business needs • Don’t understand information • Restaurant owner doing the cooking • Library - know information, search • Don’t understand the business, non-information experts • Accountant doing financial strategy • Team – 3 KAPS - Information • 5-8 Amdocs – SME - business, Technical.

Evaluation Process & Methodology: Amdocs Requirements / Initial Filters • Platform – range of capabilities • Categorization, Sentiment analysis, etc. • Technical • APIs, Java based, Linux run time • Scalability – millions of documents a day • Import-Export – XML, RDF • Total Cost of Ownership • Vendor Relationship - OEM • Usability, Multiple Language Support

Evaluation Process & Methodology: Two Phases • Phase I – Traditional Software Evaluation • Filter One – Ask Experts - reputation, research – Gartner, etc. • Market strength of vendor, platforms, etc. • Filter Two – Feature scorecard – minimum, must have, filter to top 3 • Filter Three – Technology Filter – match to your overall scope and capabilities – Filter not a focus • Filter Four – In-Depth Demo – 3-6 vendors • Phase II - Deep POC (2) – advanced, integration, semantics

Phase I – Case Study • Attensity • SAP – Inxight • Clarabridge • ClearForest • Concept Searching • Data Harmony / Access Innovations • Expert Systems • GATE (Open Source) • IBM • Lexalytics • Multi-Tes • Nstein • SAS • SchemaLogic • Smart Logic • Content Management • Enterprise Search • Sentiment Analysis Specialty • Ontology Platforms

Phase I - 4 Demos • SmartLogic • Taxonomy Management, good interface • 20 types of entities, APIs, XML-Http • Full Platform – no Sentiment Analysis • Expert Systems • Different Approach – Semantic Network – 400,000 words / 3,500 rules, 65 types of relationships • Strong out of the box – 80%, no training sets • Language concerns – no Spanish, high cost to develop new ones • Customization – add terms and relationships, develop rules – uncertain how much effort, use their professional linguists

Phase I - 4 Demos • SAS – Content Categorization & Sentiment • Full Platform – categorization, entity, sentiment – integrated • APIs, XML, Java – ease of integration • Strong history of company, range of experience • IBM – Classification, Concept Analytics – Two products • Classification Module – statistical emphasis • Once trained, it could “learn” new words • Rapid development / depends on training sets • Content Analytics, Languageware Workbench • Full Platform

Phase I – Findings • SAS & IBM – Full Platform, OEM Experience, multilingual • Proven ability to scale, customizable components, mature tool sets • SAS was the strongest offering • Capabilities, experience, integrated tool sets • IBM good second choice • Capabilities, experience - multiple products – strength and weakness • Single Vendor POC - Demonstrate it can be done • Ability to dive more deeply into capabilities, issues • Stronger foundation for future development, Learn the software better • Danger of missing better choice • Two Vendor POC • Balance of depth and full testing

Phase II - Proof Of Concept - POC • 4-6 weeks POC – bake off / or short pilot • Measurable Quality of results is the essential factor • Real life scenarios, categorization with your content • 2-3 rounds of development, test, refine / Not OOB • Need SMEs as test evaluators – also to do an initial categorization of content • Majority of time is on auto-categorization • Need to balance uniformity of results with vendor unique capabilities – have to determine at POC time • Taxonomy Developers – expert consultants plus internal taxonomists

Phase II – POC: Range of Evaluations • Basic Question – Can this stuff work at all? • Auto-categorization to existing taxonomy – variety of content • Essential Issue is complexity of language • Clustering – automatic node generation • Summarization • Entity extraction – build a number of catalogs – design which ones based on projected needs – example: privacy info (SS#, phone, etc.) • Entity examples – people, organization, methods, etc. • Essential issue is scale and disambiguation • Evaluate usability in action by taxonomists
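The privacy-info catalog mentioned above can be illustrated with a minimal sketch. It assumes a catalog is simply a mapping from entity type to a pattern; the names `PRIVACY_CATALOG` and `extract_entities`, the two simplified US-style patterns, and the sample note are all hypothetical and are not taken from any vendor product.

```python
import re

# Hypothetical catalog: entity type -> pattern. Real catalogs would carry
# many more variants and validation rules than these two simplified patterns.
PRIVACY_CATALOG = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def extract_entities(text):
    """Return (entity_type, matched_text, start_offset) tuples sorted by offset."""
    hits = []
    for entity_type, pattern in PRIVACY_CATALOG.items():
        for match in pattern.finditer(text):
            hits.append((entity_type, match.group(), match.start()))
    return sorted(hits, key=lambda hit: hit[2])


# Made-up note, used only to exercise the catalog.
note = "Cust gave SSN 123-45-6789 and callback number 555-867-5309."
for hit in extract_entities(note):
    print(hit)
```

Patterns this simple show why scale and disambiguation are the hard part: any nine-digit string in the right shape counts as a hit, so a production catalog needs context rules on top of the pattern.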

Phase II – POC: Evaluation Criteria & Issues • Basic Test Design – categorize test set • Score – by file name, human testers • Categorization & Sentiment – Accuracy 80-90% • Effort Level per accuracy level • Quantify development time – main elements • Comparison of two vendors – how to score? • Combination of scores and report • Quality of content & initial human categorization • Normalize among different test evaluators • Quality of taxonomists – experience with text analytics software and/or experience with content and information needs and behaviors • Quality of taxonomy – structure, overlapping categories

Phase II – POC: Risks • CIO/CTO Problem – This is not a regular software process • Language is messy not just complex • 30% accuracy isn’t 30% done – could be 90% • Variability of human categorization / expression • Even professional writers – journalists examples • Categorization is iterative, not “the program works” • Need realistic budget and flexible project plan • Anyone can do categorization • Librarians often overdo, SMEs often get lost (keywords) • Meta-language issues – understanding the results • Need to educate IT and business in their language

Text Analytics POC Outcomes: Categorization of CSR Notes • Content – 2,000 CSR notes categorized by humans • Variation among human categorization • Recall (finding all the correct documents) • Precision (not categorizing documents from other categories) • Precision is harder than recall • Two scores – raw and corrected – only raw for IBM precision • First score was very low, with an extra round got it up • Uncategorized documents – 50,000 – look at top 10 in each category
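As a rough illustration of how recall and precision can be scored against the human categorization, here is a minimal sketch. It assumes each note carries exactly one category; the function name, note IDs, and labels are hypothetical and are not the POC data.

```python
def precision_recall(human_labels, machine_labels, category):
    """Score one category, treating the human labels as the reference."""
    relevant = {doc for doc, cat in human_labels.items() if cat == category}
    retrieved = {doc for doc, cat in machine_labels.items() if cat == category}
    true_positives = len(relevant & retrieved)
    # Precision: of the notes the software put in this category, how many belong.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: of the notes that belong in this category, how many were found.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical labels for four notes.
human = {"n1": "billing", "n2": "billing", "n3": "cancel", "n4": "cancel"}
machine = {"n1": "billing", "n2": "cancel", "n3": "cancel", "n4": "cancel"}

for category in ("billing", "cancel"):
    print(category, precision_recall(human, machine, category))
```

In this toy data the "cancel" rule finds every cancel note (full recall) but also pulls in a billing note, which lowers its precision.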

Text Analytics POC Outcomes: Categorization Results

Text Analytics POC Outcomes: Vendor Comparisons • SAS has a much more complete set of operators – NOT, DIST, ORDDIST, START • IBM team was able to develop workarounds for some – more development effort • Operators impact most other features – Sentiment analysis, Entity and Fact Extraction, Summarization, etc. • SAS has relevancy – can be used for precision, applications • Sentiment Analysis – SAS has workbench, IBM would require more development • SAS also has statistical modeling capabilities • Development Environment & Methodology • IBM as toolkit provides more flexibility but it also increases development effort, enforces good method

Text Analytics POC Outcomes: Vendor Comparisons - Conclusions • Both can do the job • Product vs. Tool Kit (SAS has toolkit capabilities also) • IBM will require more development effort • Boolean Operators – NOT, DIST, ORDDIST, START, etc. • In rules, entity and fact extraction • Sentiment Analysis – rules, statistical • Summarization • Rule building more programming than taxonomy • IBM harder to learn – POC had 2X effort for IBM • Conclusion: Buy SAS ECC and Sentiment Workbench

Sentiment Analysis: Development Process • Combination of Statistical and categorization rules • Start with Training sets – examples of positive, negative, neutral documents • Develop a Statistical Model • Generate domain positive and negative words and phrases • Develop a taxonomy of Products & Features • Develop rules for positive and negative statements • Test and Refine • Test and Refine again
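One way to picture the step of generating domain positive and negative words from training sets is a smoothed log-odds ranking. This is a minimal sketch under stated assumptions, not the method used by any vendor workbench: the function name `domain_terms`, the whitespace tokenization, the add-one smoothing, and the sample sentences are all hypothetical.

```python
import math
from collections import Counter


def domain_terms(positive_docs, negative_docs, top_n=2):
    """Rank words by smoothed log-odds of appearing in positive vs. negative docs."""
    pos_counts = Counter(w for doc in positive_docs for w in doc.lower().split())
    neg_counts = Counter(w for doc in negative_docs for w in doc.lower().split())
    vocab = set(pos_counts) | set(neg_counts)
    # Add-one smoothing so words seen on only one side still get a finite score.
    pos_total = sum(pos_counts.values()) + len(vocab)
    neg_total = sum(neg_counts.values()) + len(vocab)
    scores = {
        word: math.log((pos_counts[word] + 1) / pos_total)
        - math.log((neg_counts[word] + 1) / neg_total)
        for word in vocab
    }
    # Sort by score, breaking ties alphabetically so the output is deterministic.
    ranked = sorted(scores, key=lambda word: (scores[word], word))
    most_positive = ranked[-top_n:][::-1]
    most_negative = ranked[:top_n]
    return most_positive, most_negative


# Made-up training examples.
positive = ["great service great coverage", "agent was helpful"]
negative = ["dropped calls again", "billing error and dropped calls"]
print(domain_terms(positive, negative))
```

The ranked lists are only candidates; the later steps in the process (rules for positive and negative statements, then repeated testing) are where they get reviewed and refined.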

Beyond Sentiment: Behavior Prediction • Case Study – Telecom Customer Service • Problem – distinguish customers likely to cancel from mere threats • Analyze customer support notes • General issues – creative spelling, second hand reports • Develop categorization rules • First – distinguish cancellation calls – not simple • Second - distinguish cancel what – one line or all • Third – distinguish real threats

Beyond Sentiment: Behavior Prediction – Case Study • Basic Rule • (START_20, (AND, (DIST_7, "[cancel]", "[cancel-what-cust]"), (NOT, (DIST_10, "[cancel]", (OR, "[one-line]", "[restore]", "[if]"))))) • Examples: • customer called to say he will cancell his account if the does not stop receiving a call from the ad agency. • cci and is upset that he has the asl charge and wants it offor her is going to cancel his act • ask about the contract expiration date as she wanted to cxltehacct • Combine sophisticated rules with sentiment statistical training and Predictive Analytics
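The basic rule above can be approximated in plain Python to show how the operators combine. This is a sketch under loudly stated assumptions: START_20 is read as "look only at the first 20 tokens", DIST_n as "within n tokens of each other", and the bracketed names as concept classes whose member terms are invented here. It is not the vendor's rule engine, and the term lists in `CONCEPTS` are hypothetical.

```python
import re

# Hypothetical concept classes standing in for "[cancel]", "[cancel-what-cust]", etc.
CONCEPTS = {
    "cancel": {"cancel", "cancell", "cxl", "disconnect"},
    "cancel-what-cust": {"account", "acct", "act", "service"},
    "one-line": {"line"},
    "restore": {"restore", "reconnect"},
    "if": {"if", "unless"},
}


def tokenize(text):
    return re.findall(r"[a-z0-9']+", text.lower())


def positions(tokens, concept, limit):
    """Token indexes, within the first `limit` tokens, that belong to a concept."""
    return [i for i, tok in enumerate(tokens[:limit]) if tok in CONCEPTS[concept]]


def within(pos_a, pos_b, max_dist):
    """DIST_n (assumed meaning): some pair of positions is at most n tokens apart."""
    return any(abs(a - b) <= max_dist for a in pos_a for b in pos_b)


def likely_cancellation(note, start=20):
    tokens = tokenize(note)
    cancel = positions(tokens, "cancel", start)
    target = positions(tokens, "cancel-what-cust", start)
    # OR over the three blocking concepts: one line only, restore, conditional.
    blockers = [
        pos
        for concept in ("one-line", "restore", "if")
        for pos in positions(tokens, concept, start)
    ]
    # AND( DIST_7(cancel, target), NOT( DIST_10(cancel, blockers) ) )
    return within(cancel, target, 7) and not within(cancel, blockers, 10)


print(likely_cancellation("customer called and wants to cancel his account today"))
print(likely_cancellation(
    "customer called to say he will cancell his account if the does not "
    "stop receiving a call from the ad agency."
))
```

Under these assumptions the first note satisfies the rule and the second does not, because "if" falls within ten tokens of the cancel term. A run-together spelling such as "cxltehacct" would not tokenize into known terms at all, which is why variant lists for creative spelling matter.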

Beyond Sentiment - Wisdom of Crowds: Crowd Sourcing Technical Support • Example – Android User Forum • Develop a taxonomy of products, features, problem areas • Develop Categorization Rules: • “I use the SDK method and it isn't to bad a all. I'll get some pics up later, I am still trying to get the time to update from fresh 1.0 to 1.1.” • Find product & feature – forum structure • Find problem areas in response, nearby text for solution • Automatic – simply expose lists of “solutions” • Search Based application • Human mediated – experts scan and clean up solutions

Beyond Sentiment: Expertise Analysis • Apply Sentiment Analysis techniques to Expertise • Expertise Characterization for individuals, communities, documents, and sets of documents • Experts prefer lower, subordinate levels • Novices prefer higher, superordinate levels • General Populace prefers basic level • Experts’ language structure is different • Focus on procedures over content • Develop expertise rules – sentiment and categorization

Expertise Analysis: Expertise – application areas • Taxonomy / Ontology development / design – audience focus • Card sorting – non-experts use superficial similarities • Business & Customer intelligence – add expertise to sentiment • Deeper research into communities, customers • Text Mining - Expertise characterization of writer, corpus • eCommerce – Organization/Presentation of information – expert, novice • Expertise location – Generate automatic expertise characterization based on documents • Experiments - Pronoun Analysis – personality types • Essay Evaluation Software - Apply to expertise characterization • Model levels of chunking, procedure words over content

Text Analytics Evaluation: Conclusions • Start with Self Knowledge – text analytics not an end in itself • Initial Evaluation – filters, not scorecards • Weights change output – need self knowledge for good weights • Proof of Concept – essential • OOB doesn’t tell you how it will work in real world • Content and Scenarios are your real world • Good idea even if you know SAS is the answer • Importance of operators, relevance for a platform • Sentiment needs full platform capabilities • Everyone has room for improvement
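The point that weights change the output of a scorecard can be seen in a small sketch. Everything here is hypothetical: the vendor names, feature scores, and weightings are invented for illustration and are not the evaluation's actual scorecard.

```python
def rank_vendors(scores, weights):
    """Order vendors by weighted feature total, highest first."""
    totals = {
        vendor: sum(weights[feature] * score for feature, score in features.items())
        for vendor, features in scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


# Invented scores on a 1-10 scale.
scores = {
    "Vendor A": {"categorization": 9, "sentiment": 4, "cost": 6},
    "Vendor B": {"categorization": 6, "sentiment": 8, "cost": 7},
}

# Two plausible weightings reflecting different business priorities.
categorization_first = {"categorization": 0.6, "sentiment": 0.2, "cost": 0.2}
sentiment_first = {"categorization": 0.2, "sentiment": 0.6, "cost": 0.2}

print(rank_vendors(scores, categorization_first))
print(rank_vendors(scores, sentiment_first))
```

The same feature scores put a different vendor on top depending on the weighting, so the choice of weights has to come from knowing what the organization actually needs.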

Text Analytics Future Directions • Start with the 80% of significant content that is not data • Enterprise search, content management, Search based applications • Text Analytics and Text Mining • Text Analytics turns text into data – Build better TM Apps • Better extraction and add Subject / Concepts • Sentiment and Beyond – Behavior, Expertise • Text Mining and Text Analytics • TM enriching TA • Taxonomy development • New Content Structures, ensemble models • Text Analytics and Predictive Analytics • More content, New content – social, interactive – CSR • New sources of content/data = new & better apps • Add Learning & Cognitive Science and the future is ?

Questions? Tom Reamy, tomr@kapsgroup.com • KAPS Group • Knowledge Architecture Professional Services • http://www.kapsgroup.com
