Soft Computing Research & Applications at BT Exact Marcus Thint Computational Intelligence Group Intelligent Systems Lab
About BT Exact • Part of British Telecom (BT Group) • BT’s research, technology, and IT operations business • Research group has established history (formerly BT Laboratories) • Main campus in Adastral Park, near Ipswich, UK • More info at btexact.com
Research & Venturing • Approaches to innovation (Tidd et al., 2000, p. 275)
Market (rows) vs. Technology (columns) | Core | Related | Unrelated
Core | Internal development | Internal development | Joint venture
Related | Internal development | Corporate venture | Corporate venture
Unrelated | Joint venture | Corporate venture | Acquisition
Computational Intelligence Group • Technology perspective: focused on technologies that exploit tolerance for imprecision to provide transparent, adaptive solutions quickly, including reasoning under uncertainty • Fuzzy Logic (approximate reasoning) • Neural Networks (learning and adaptation) • Probabilistic Reasoning (Bayesian learning) • Combinations of the above • Business perspective: applied research & development for corporate and venture clients • Two major initiatives: • Intelligent Data Analysis (IDA) • Intelligent Information Management (IIM)
IIM driver: Intelligent Personal Assistant Framework
[Diagram labels: iContact (Contact Finder), iDiary (Flexible Scheduling), iRemind (Info Reminder), myPaper (Personalised Newspaper), iMail, iPhone, IDEA, Adaptive Profile, user profiles, Acquaintance Model, Interest Model, Priority Model, Multimodal Interface, Sensor Fusion, Natural Communication, WWW, Database, ...]
User profile • Enables personalized services… • Consists of topics of interest • described by ‘positive’ and ‘negative’ phrases • attached attributes: • Privacy (private, restricted, public) • Expertise (curious, fair, competent, expert) • Duration factor • User has full access and control • Manual or semi-automatic maintenance • Agents suggest updates to profile
Document classification • iMail prioritizes email for personal productivity • Goal: minimize interruptions (frequency & timing) • Automatically monitor and sort incoming mail into folders named: • Read now • Read today • Read this week • Read this month • Read never (junk) • Notify when batches become large
iMail (cont’d) • Prioritization model based on combination of • Importance (content, sender) • Urgency (timing, how soon action is required) • Representation of modules in priority model is based on Bayes nets • Adaptation: conditional probabilities are updated with new observations • Content itself can be examined based on local knowledge (iMail agent) or community knowledge (e.g. info from iDiary agent).
iMail (cont’d) • Initially learns user preferences by watching how s/he handles similar incoming messages (user need not specify explicit rules) • During supervised phase, iMail informs user where an email is being filed • Once confident, user can delegate the sorting and eliminate interruption • Priorities are computed on sender, subject keywords (similarity to interest profile). Content keywords can also be added. • Integrated with Microsoft Outlook
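The adaptation step described for iMail's priority model (conditional probabilities updated as the user files messages) can be illustrated with a minimal sketch. It assumes a single conditional probability table P(folder | evidence) kept as smoothed counts. The class name, the folder identifiers, and the smoothing constant are illustrative assumptions, not details of the actual iMail implementation, which combines several Bayes-net modules.

```python
FOLDERS = ["read_now", "read_today", "read_this_week", "read_this_month", "read_never"]


class ConditionalTable:
    """P(folder | evidence) as smoothed counts (hypothetical stand-in for one Bayes-net node)."""

    def __init__(self, prior_count=1.0):
        # prior_count is an assumed pseudo-count: every folder starts equally likely.
        self.prior_count = prior_count
        self.counts = {}  # evidence value -> {folder: number of observations}

    def observe(self, evidence, folder):
        """Record that the user filed a message with this evidence into `folder`."""
        row = self.counts.setdefault(evidence, {})
        row[folder] = row.get(folder, 0.0) + 1.0

    def probabilities(self, evidence):
        """Return the current estimate of P(folder | evidence) for every folder."""
        row = self.counts.get(evidence, {})
        total = sum(row.values()) + self.prior_count * len(FOLDERS)
        return {f: (row.get(f, 0.0) + self.prior_count) / total for f in FOLDERS}


# Supervised phase: the user files three messages from the same sender under "read_now".
table = ConditionalTable()
for _ in range(3):
    table.observe("sender=manager", "read_now")
estimate = table.probabilities("sender=manager")
```

With the assumed pseudo-count of 1, an unseen sender starts at a uniform 0.2 per folder, and the estimate shifts toward the observed folder as evidence accumulates.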
Semantic similarity computation (via Word Replaceability Matrix) • Word-to-word and document-to-document similarity assessment • Common approach: vector space model (of term frequencies) • Can only compare like terms (e.g. taxi & taxi): ∑ (x1 ◦ x2) • WRM handles equivalent terms (e.g. taxi & cab): ∑ (W ◦ x1 ◦ y2) • Fuzzy WRM • Incremental update of WRM • Automatic (using WordNet) and semi-automatic (user feedback) verification & refinement processes
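The difference between the two sums on this slide can be sketched as follows. The sketch assumes documents are sparse term-frequency dictionaries and that the WRM is a dictionary of pairwise replaceability weights, with identical terms weighted 1. The function names, the weight 0.9, and the sample documents are hypothetical, and normalization (e.g. cosine) is omitted for brevity.

```python
def vsm_similarity(doc1, doc2):
    """Plain vector space model: only identical terms contribute."""
    return sum(freq * doc2.get(term, 0.0) for term, freq in doc1.items())


def wrm_similarity(wrm, doc1, doc2):
    """WRM-weighted similarity: every term pair contributes, scaled by its replaceability."""
    total = 0.0
    for term1, freq1 in doc1.items():
        for term2, freq2 in doc2.items():
            # Identical terms are fully replaceable; other pairs use the matrix entry, if any.
            weight = 1.0 if term1 == term2 else wrm.get((term1, term2), 0.0)
            total += weight * freq1 * freq2
    return total


# Hypothetical replaceability weights and term-frequency vectors.
wrm = {("taxi", "cab"): 0.9, ("cab", "taxi"): 0.9}
doc_a = {"taxi": 1.0, "airport": 1.0}
doc_b = {"cab": 1.0, "airport": 1.0}

plain = vsm_similarity(doc_a, doc_b)          # only "airport" matches
weighted = wrm_similarity(wrm, doc_a, doc_b)  # "taxi"/"cab" also contributes
```

In this toy case the plain model sees one shared term, while the WRM-weighted sum additionally credits the taxi/cab pair.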
Fuzzy WRM • Based on n-grams from a corpus of plain text or XML documents • Each sequence of M words has a probability distribution of n-grams associated with it • For each (stemmed) word, a fuzzy set of n-grams is formed • The semantic similarity of word1 and word2 is given by the semantic unification of word1 given word2 • During the creation of the WRM, a threshold value is used to adjust the ‘quality’, i.e. to keep only word pairs with a very high semantic unification value
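One way to read the construction above is sketched below. The slide does not say which conversion or which form of semantic unification is used, so the sketch makes several assumptions: a word's context is its (left, right) neighbour pair, the probability-to-fuzzy-set step uses the mass-assignment style formula mu_i = i * p_i + sum of p_j for j > i (probabilities sorted in descending order), and unification is the point-value form, i.e. the expected membership in word1's fuzzy set under word2's distribution. Stemming is skipped, and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def context_distributions(tokens):
    """Map each word to a probability distribution over its (left, right) neighbour pairs."""
    counts = defaultdict(Counter)
    for left, word, right in zip(tokens, tokens[1:], tokens[2:]):
        counts[word][(left, right)] += 1
    return {
        word: {ctx: n / sum(ctxs.values()) for ctx, n in ctxs.items()}
        for word, ctxs in counts.items()
    }


def to_fuzzy_set(dist):
    """Assumed conversion: mu_i = i * p_i + sum_{j > i} p_j, with p sorted descending."""
    ranked = sorted(dist.items(), key=lambda item: item[1], reverse=True)
    tail = sum(p for _, p in ranked)
    fuzzy = {}
    for rank, (ctx, p) in enumerate(ranked, start=1):
        tail -= p  # probability mass of all lower-ranked contexts
        fuzzy[ctx] = rank * p + tail
    return fuzzy


def unification(dist_word1, dist_word2):
    """Point-value unification of word1 given word2 (assumed form)."""
    mu1 = to_fuzzy_set(dist_word1)
    return sum(mu1.get(ctx, 0.0) * p for ctx, p in dist_word2.items())


def build_wrm(tokens, threshold):
    """Keep only word pairs whose unification value reaches the threshold."""
    dists = context_distributions(tokens)
    wrm = {}
    for w1 in dists:
        for w2 in dists:
            if w1 != w2:
                score = unification(dists[w1], dists[w2])
                if score >= threshold:
                    wrm[(w1, w2)] = score
    return wrm


tokens = "take a taxi to town take a cab to town take a bus to town".split()
wrm = build_wrm(tokens, threshold=0.5)
```

In this toy corpus "taxi", "cab", and "bus" occur in identical contexts and therefore pass the threshold as mutually replaceable, which also illustrates why a real corpus yields antonyms and anomalies alongside synonyms.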
Fuzzy WRM (cont’d) • Sample output from 12MB text corpus (movies)
Antonyms found: • face back • small big • break make • major minor • lost found • leading following • smallmouth big • hit miss • brother sister • unmarried married • minor major • lose win • unfaithful faithful • present past • sound silent • south north • foe friend • man woman • collect spread • boy girl • miss hit • night day
Synonyms found: • reality world • directions way • gamble risk • solid strong • father mother • top clear • video TV • planning plan • shooting shot • pop popular • enormously hugely • doubt question • combat fighting • sum amount • effective good • cook make • named called
Fuzzy WRM (cont’d)
Anomalies also found: • temporarily attacks • trashy fantastic • thrillers christian • tchaikovsky essential • flat etude • meeting lee • cbs old? • flaws eclipsed • bravado endearing • martin magner • grainy nevadas • minions goats • saves whos • leap british • lean meets • sweat swing • descriptions any • bizet composers • swear akira • cbc childrens
• Most of the derived ‘equivalent’ terms are thus in the “good” category, depending on the context • Much room for improvement • Use top N terms to augment other processing
Fuzzy Query & Search • BT-BISC collaboration • Applications to restaurants, movies, hotel databases • Fuzzy membership of restaurants/movies/hotels into different categories • Asian - science fiction - luxury • Mexican - action - economical • Spicy - drama - resort • … - … - … • Early version requires manual (human) assignment of membership
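A query over such manually assigned memberships can be sketched as follows. The sketch assumes each item carries a membership degree in [0, 1] per category and that a conjunctive query is scored with the minimum operator (the standard fuzzy AND). The restaurant names, the degrees, and the function name are invented for illustration and do not come from the BT-BISC system.

```python
# Hypothetical, manually assigned category memberships.
RESTAURANTS = {
    "Casa Sol": {"mexican": 0.9, "spicy": 0.7},
    "Thai Orchid": {"asian": 1.0, "spicy": 0.8},
    "Noodle Bar": {"asian": 0.8, "spicy": 0.3},
}


def fuzzy_query(items, required_categories):
    """Rank items by their minimum membership across all required categories."""
    scored = []
    for name, memberships in items.items():
        # A missing category counts as membership 0.
        score = min(memberships.get(cat, 0.0) for cat in required_categories)
        if score > 0.0:
            scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


results = fuzzy_query(RESTAURANTS, ["asian", "spicy"])
# results == [("Thai Orchid", 0.8), ("Noodle Bar", 0.3)]
```

Unlike a crisp category filter, the ranked result keeps partial matches and orders them by how well they satisfy every requested category.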
Fuzzy Query & Search (cont’d) • Searching in personal news archive [screenshot]
Intelligent Data Analysis • Corporations store large amounts of data on product features & use, competitors, customers • IDA is required to make full use of this latent knowledge as an asset • Need for IDA tools that can be used by non-experts • “open” models that are easily understandable • permit inclusion of prior knowledge in the analysis • move from technology-centered to user-centered tools
Intelligent Data Analysis (cont’d) • iKnowDis: soft computing-based platform for intelligent data analysis • Adaptive user preference modelling • Data pre-processing • Method selection (fuzzy, neural, neuro-fuzzy, decision trees, +) • Model creation • Model evaluation • Model deployment • ‘Wizard’ asks high-level questions about objectives and required analysis results • ‘Expert mode’ enables custom configuration and analysis via a visual programming approach • Plug-in API enables user extension of analysis methods
Intelligent Data Analysis (cont’d) • Problems in predicting travel time • Different traffic conditions • Find a place to park, get access, ... • Travel time is inter-job time
Intelligent Data Analysis (cont’d) • ITEMS: intelligent travel time estimation and management system • Typical logs consist of point-to-point, total job time • Mine historical travel times & estimate inter-job times (distinct local patterns do emerge) • Used by BT Wholesale mobile workforce • Improved scheduling and 10% reduction in travel times • Provides explanation facility based on decision trees and neuro-fuzzy systems that display rule-based info about individual journeys • Learning component that constantly builds new models for travel time prediction
Intelligent Data Analysis (cont’d) • Managerial tool: • Identify where travel is slow • Assess performance of technicians • Derived rules explain why a certain journey may have been late
Intelligent Data Analysis (cont’d) • DecTOP (decision table optimization tool) • Call center operators typically use decision models in the form of prescribed interviews • Based on customer responses, the operator navigates through the decision model to reach an assessment • DecTOP monitors the performance of the model and the individual user • Identifies poorly performing decisions, which can be optimized manually or automatically (based on accuracy or cost) • Permits study of model variants and what-if analysis (by changing individual decisions) • Used by BT Retail to improve fault allocation process
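The monitoring step can be sketched as a per-decision accuracy check. The sketch assumes a log of (decision id, operator assessment, actual outcome) records and flags decisions whose accuracy falls below a threshold once enough cases have been seen. The record layout, the thresholds, and the function name are assumptions for illustration. DecTOP itself is also described as supporting cost-based optimization, which is not shown here.

```python
from collections import defaultdict


def poor_decisions(log, min_accuracy=0.8, min_cases=3):
    """Return {decision_id: accuracy} for decisions performing below `min_accuracy`."""
    stats = defaultdict(lambda: [0, 0])  # decision id -> [correct, total]
    for decision_id, assessed, actual in log:
        stats[decision_id][1] += 1
        if assessed == actual:
            stats[decision_id][0] += 1

    flagged = {}
    for decision_id, (correct, total) in stats.items():
        accuracy = correct / total
        # Ignore decisions with too few logged cases to judge.
        if total >= min_cases and accuracy < min_accuracy:
            flagged[decision_id] = accuracy
    return flagged


# Hypothetical log: decision id, assessed fault type, actual fault type.
log = [
    ("line_noise", "exchange", "exchange"),
    ("line_noise", "exchange", "customer_wiring"),
    ("line_noise", "exchange", "exchange"),
    ("line_noise", "exchange", "exchange"),
    ("no_dial_tone", "network", "network"),
    ("no_dial_tone", "network", "network"),
    ("no_dial_tone", "network", "network"),
    ("slow_broadband", "modem", "network"),
]
flagged = poor_decisions(log)
```

Here only "line_noise" is flagged (3 of 4 correct); "slow_broadband" has too few cases to judge under the assumed minimum.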
Intelligent Data Analysis (cont’d) • DecTOP UI screens [screenshots]
Adaptive profiling • Fuzzy rule-based approach: • IF read_article_about_interest IS frequent THEN importance IS increased_slightly • IF access_to_interest IS negligible THEN duration IS decreased • …
[Diagram: a Fuzzy Inference Engine with observed user activity as input: interest accessed (add/delete/modify); interest/expertise sought or contacted; news article accessed (read/searched/keywords updated); reference links (followed/searched); tasks accessed (add/delete); other docs accessed (read/searched); ratings from articles read (excellent…poor); mail accessed (read/replied/deleted). Profile attributes shown: expertise level, importance level, duration level, keyword weights. Also shown: new interest creation, positive KWs of existing categories, current KW vector, similarity detector.]
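The two rules quoted on this slide can be sketched in simplified form. The sketch assumes linear ramp membership functions for "frequent" and "negligible" and scales a fixed crisp step by each rule's firing strength, which is a simplification rather than a full fuzzy inference engine. The breakpoints, step sizes, and function names are invented for illustration.

```python
def ramp_up(x, low, high):
    """Membership rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def update_profile(interest, articles_read_per_week, accesses_per_month):
    """Apply both rules to one interest entry and return the updated copy."""
    frequent = ramp_up(articles_read_per_week, 2, 10)       # assumed breakpoints
    negligible = 1.0 - ramp_up(accesses_per_month, 0, 4)    # assumed breakpoints

    updated = dict(interest)
    # IF read_article_about_interest IS frequent THEN importance IS increased_slightly
    updated["importance"] = min(1.0, interest["importance"] + 0.1 * frequent)
    # IF access_to_interest IS negligible THEN duration IS decreased
    updated["duration"] = max(0.0, interest["duration"] - 0.2 * negligible)
    return updated


interest = {"importance": 0.5, "duration": 0.5}
updated = update_profile(interest, articles_read_per_week=6, accesses_per_month=1)
```

Because each rule fires to a degree, a moderately active interest receives a proportionally smaller adjustment than a heavily used or clearly abandoned one.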
Adaptive profiling (cont’d) • Inductive logic programming approach • User models used to: • predict additional tasks to be carried out • assist in scheduling of multiple tasks by suggesting most likely task order
Scheduling • iDiary (Flexible Scheduling): • Tasks and meetings can be given fuzzy times and durations (e.g. “in the morning”, “early next week”) • automatically resolves conflicts among participants and finds common time slot • changes can be rescheduled under the fuzzy constraints • synchronizes with Microsoft Outlook
[Diagram: a user request (“I want a meeting with Jeff at the end of next week.”) is handled by user agents and coordination agents; labels: iDiary Flexible Scheduling, iMS toolkit]
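Finding a common slot under a fuzzy time constraint can be sketched as follows. The sketch assumes whole-hour candidate slots, a trapezoidal membership function for "in the morning", and a single centralized search over each participant's busy hours. iDiary is described as agent-based, so this is only an illustration of the constraint itself. The breakpoints, names, and calendars are hypothetical.

```python
def trapezoid(x, a, b, c, d):
    """Membership that is 0 outside (a, d), 1 on [b, c], and linear in between."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def morning(hour):
    """Assumed meaning of "in the morning": fully satisfied between 9:00 and 11:00."""
    return trapezoid(hour, 7, 9, 11, 13)


def best_slot(candidate_hours, busy_by_person, preference):
    """Return (hour, degree) of the free slot that best satisfies the fuzzy preference."""
    best = None
    for hour in candidate_hours:
        if any(hour in busy for busy in busy_by_person.values()):
            continue  # at least one participant has a conflict
        degree = preference(hour)
        if degree > 0.0 and (best is None or degree > best[1]):
            best = (hour, degree)
    return best


# Hypothetical calendars: hours already booked for each participant.
busy = {"Ann": {9, 10}, "Jeff": {8}}
slot = best_slot(range(8, 18), busy, morning)
```

If a later change blocks the chosen hour, rerunning the search yields the next-best slot that still satisfies the constraint to some degree, or `None` when no free slot has non-zero membership.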
Scheduling (cont’d) • iMeeting on mPAF [screenshot]
Summary • CIG is involved in research and applications of soft computing, agent, and data analysis technologies • Personalized information management requires handling of uncertainty in user models and information content • To date, we have been active in: • Document classification • Similarity computations • Fuzzy query and search • IDA tools & decision optimization • Adaptive systems (user profiling) • Flexible scheduling
Outlook • Information management → (true) knowledge management • Automated reasoning on natural language text/voice input is a key target • Incorporating context, domain knowledge, semantic nets… • But approaching the limits of ‘word manipulations without understanding’