FIRST European research for web information extraction and analysis for supporting financial decision making. EFMA Customer Week - April 2013 Tomás Pariente Lobo – Atos Spain. Motivation. Vision. Innovation. Tools. Why FIRST? - Motivations. The most reliable data sources today….
FIRST: European research for web information extraction and analysis for supporting financial decision making
EFMA Customer Week, April 2013
Tomás Pariente Lobo – Atos Spain
Motivation Vision Innovation Tools
Why FIRST? - Motivations
The most reliable data sources today… also have their weaknesses! They do not consider unstructured data, rumors, market sentiment, etc.
Why FIRST? - Motivations
Example: Apple iPhone 1 announcement on 2007-01-09
The stock price skyrocketed after the announcement. However, the announcement could have been sensed beforehand…
Why FIRST? - Motivations
Example: Market surveillance via FIRST (the Google News case)
• September 2008: Google News announced “United Airlines bankruptcy”.
• Within 12 minutes the stock price fell 75%, wiping out US$1bn.
• The “news” was actually 6 years old…
Plausibility checking will help in identifying hoaxes: consistency with regulatory news and other sources.
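The plausibility check described on this slide can be sketched roughly as follows. This is only an illustration, not the FIRST implementation: the `NewsItem` fields, the `plausibility_flags` function, the thresholds and the dates in the usage example are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class NewsItem:
    headline: str
    source: str
    first_published: date  # earliest known publication date of the story (assumed to be known)
    seen_on: date          # date the item surfaced in the monitored stream


def plausibility_flags(item, corroborating_sources,
                       max_age=timedelta(days=2), min_sources=1):
    """Return warning flags for a news item (hypothetical rule set)."""
    flags = []
    # An old story resurfacing as "news" is suspicious.
    if item.seen_on - item.first_published > max_age:
        flags.append("stale")
    # Consistency check: is the story confirmed by other (e.g. regulatory) sources?
    others = set(corroborating_sources) - {item.source}
    if len(others) < min_sources:
        flags.append("uncorroborated")
    return flags


# Illustrative dates only: a story six years older than the day it surfaced.
item = NewsItem("Airline files for bankruptcy", "aggregator",
                first_published=date(2002, 9, 1), seen_on=date(2008, 9, 1))
print(plausibility_flags(item, corroborating_sources=[]))
```

A surveillance analyst would treat an item carrying both flags as a candidate hoax rather than as a trading signal.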
Why FIRST? – Motivations
A growing universe of unstructured data… how to separate the wheat from the chaff?
Motivation Vision Innovation Tools
FIRST Project
European-funded research project
Project facts
• Running from October 2010 until September 2013
• 9 partners
• More than 30 people
• Preliminary results available
More to come... Stay tuned (http://project-first.eu/)
Who is behind FIRST? Industrial partners Academic/Research SMEs
FIRST Vision
The vision is to make the relevant information of the entire financial information space (including unreliable, unstructured, sentiment sources) available to the decision maker in near-real time and in an automated way.
FIRST Vision
[Diagram labels: financial resources, structured and unstructured (blogs, analyses, bulletin boards…; unreliable, poor quality, noisy…); AUTOMATION: Acquisition, Processing, Analysis, Decision support]
Automated data processing
• Overall goal: mixing structured information and unstructured web data in specific decision-making processes
• Our solutions, tailored to the financial services market, tackle four steps in the macro-process of converting data into information:
  • Data stream acquisition
  • Real-time processing
  • Sentiment building
  • Decision Support System
Sentiment from web data streams • Sentiment is extracted from data streams and correlated with events
Sentiment in Financial Services?
Sentiment cross-over happens before price plunge
In Sept 2011, the sentiment turns negative after a long period of positive values. A big plunge in the price happens shortly after, accompanied by a series of negative events (lost deals etc.).
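One simple way to detect such a cross-over is sketched below: smooth the sentiment series and report the points where it turns negative after a sustained positive run. The slides do not say how FIRST computes this; the function names, the window size and the minimum run length are assumptions of the sketch.

```python
def moving_average(values, window):
    """Trailing moving average; early points use the values available so far."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def crossovers_to_negative(sentiment, window=3, min_positive_run=3):
    """Indices where smoothed sentiment turns negative after a long positive run."""
    smooth = moving_average(sentiment, window)
    hits = []
    run = 0  # length of the current run of positive smoothed values
    for i, s in enumerate(smooth):
        if s > 0:
            run += 1
        else:
            if s < 0 and run >= min_positive_run:
                hits.append(i)
            run = 0
    return hits


# Hypothetical daily sentiment scores in [-1, 1].
series = [0.4, 0.5, 0.3, 0.4, 0.2, -0.6, -0.7, -0.5]
print(crossovers_to_negative(series, window=2))
```

The `min_positive_run` parameter is what separates a genuine turn after "a long period of positive values" from ordinary noise around zero.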
Motivation Vision Innovation Tools
Mining the Web for financial texts
Data acquisition pipeline: Web mining, natural language preprocessing and entity extraction
[Diagram labels: streaming content; HTML; cleaning; annotated content (financial terms, companies, instruments…)]
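The cleaning and annotation steps of such a pipeline might look like the sketch below. It uses only the Python standard library; the `TextExtractor` class, the `clean_html` and `annotate` functions and the tiny gazetteer are hypothetical stand-ins for the project's actual preprocessing and entity extraction components.

```python
import re
from html.parser import HTMLParser

BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "tr"}


class TextExtractor(HTMLParser):
    """Collects visible text, skipping script and style content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append(" ")  # keep words in adjacent blocks apart

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def clean_html(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()


# Hypothetical gazetteer: surface form -> entity type.
GAZETTEER = {"Acme Corp": "company", "ACME": "instrument", "dividend": "financial_term"}


def annotate(text, gazetteer=GAZETTEER):
    """Return sorted (start, end, term, label) tuples for case-sensitive matches."""
    annotations = []
    for term, label in gazetteer.items():
        for m in re.finditer(r"\b" + re.escape(term) + r"\b", text):
            annotations.append((m.start(), m.end(), term, label))
    return sorted(annotations)


page = ("<html><head><style>p{color:red}</style></head><body>"
        "<p>Acme Corp raises its <b>dividend</b>.</p><script>track()</script>"
        "<p>Shares of ACME rose.</p></body></html>")
text = clean_html(page)
print(text)
print(annotate(text))
```

A production pipeline would replace the gazetteer lookup with proper natural language preprocessing, but the shape of the output (text plus typed spans) stays the same.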
Data acquisition after one year
Some numbers:
• 176 Web sites
• 2,671 RSS sources
• ~40,000 documents per day
• >10,000,000 documents by end of 2012
• And growing
Essential for future evaluation and analysis
Analysing sentiments in Web texts
The analytical pipeline: identify, extract, classify, aggregate
[Diagram labels: annotated content (document with basic annotations); SENTIMENT CLASSIFICATION per object and feature; document with sentiment sentences; SENTIMENT AGGREGATION per object and feature; aggregated sentiments (indicators); object; positive sentiment; sentiment sentences]
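The classify-then-aggregate idea can be illustrated with a deliberately naive sketch. A word-list classifier stands in for whatever classifier FIRST actually uses, and the word lists, function names and input format (object, feature, sentence) are assumptions made for illustration.

```python
from collections import defaultdict

# Toy polarity lexicons (hypothetical; a real system would use a trained model).
POSITIVE = {"strong", "growth", "beats"}
NEGATIVE = {"weak", "loss", "lawsuit"}


def classify_sentence(sentence):
    """Return +1, 0 or -1 depending on the balance of polarity words."""
    words = [w.strip(".,;:!?").lower() for w in sentence.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)


def aggregate(annotated_sentences):
    """Average sentence polarity per (object, feature) pair."""
    polarities = defaultdict(list)
    for obj, feature, sentence in annotated_sentences:
        polarities[(obj, feature)].append(classify_sentence(sentence))
    return {key: sum(vals) / len(vals) for key, vals in polarities.items()}


sentences = [
    ("Acme", "earnings", "Acme beats forecasts with strong growth."),
    ("Acme", "earnings", "Analysts expect a loss."),
    ("Acme", "legal", "A new lawsuit was filed."),
]
print(aggregate(sentences))
```

Keeping the per-sentence polarities around, rather than only the averages, is what would allow a drill-down from an aggregated indicator back to the sentences behind it.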
Supporting the decision making process
The decision support techniques: analysis and visualization
[Diagram labels: FIRST acquisition & analytical pipelines; knowledge base; machine learning techniques; qualitative modeling; forecasting models; visualization techniques]
Outputs: forecasts of volatility or returns, alerts on pump and dump, reputation change of a counterpart; signals, charts, topic spaces, topic trends, reports…
Glassbox model
[Diagram labels: sentiment drill-down; document sentences; objects; features]
Sentiment analysis & decision making
• The integrated model of FIRST and its innovations
• The following slides briefly review results from incorporating sentiments in retail brokerage, investment management and reputational risk scenarios
• Main areas of research
  • Sentiment analysis
  • DSS models
  • Stream visualization
  • Scaling strategy
• Early adopters
  • Slovenian presidential elections
  • GAMA Perception Analytics
Motivation Vision Innovation Tools
The three FIRST use cases & their relevance for the industry
Market Surveillance
• Capital markets compliance can be automated today using structured data, but the automation does not take unstructured data into account.
• FIRST will
  • bring large volumes of unstructured data into financial compliance;
  • develop automated techniques to better detect market abuse and insider trading.
Reputational Risk Management
• There are no off-the-shelf solutions or methodologies for reputational risk management.
• FIRST will
  • provide a sustainable tool for reputational risk monitoring;
  • help break new ground in this field, which has a dramatically high impact in FSI.
Retail Brokerage
• Today, mainly based on quantitative analysis and key figures.
• FIRST will
  • use unstructured data to enhance both information for private investors and sophisticated tools for professional users.
The three use cases in the words of the FIRST UC owners
UC#1 – Market Surveillance
• “The development of surveillance scenarios based on unstructured information will allow the compliance offices to better investigate on unusual and suspicious trading activities and to better understand trends and patterns” – Stefan Queck, Business Dev. Manager at NEXT.
• “Especially in time of financial crises, new regulatory requirements and reputation loss risks, the financial industry is interested in new methods and approaches to detect abuse trading behaviour” – Wolfgang Fabisch, CEO at NEXT.
UC#2 – Reputational Risk Management
• “From the early prototype release, we are looking forward to utilising in a real-life environment the FIRST solution” – Maria Costante, Responsible for reputational risk modelling and Pillar 3 at Gruppo Montepaschi.
• “We already discussed the tool we are setting up in European contexts, and we are looking forward to presenting the first results, already in 1H/2012” – Giorgio Aprile, Head of Reputational and Operational Risks at Gruppo Montepaschi.
UC#3 – Retail Brokerage
• “When presenting the usecase to potential customers, they showed interest in this kind of data and the resulting tools” – Michael Diefenthäler, Director of Product Mgmt at IDMS.
• “We are looking forward to present the FIRST results to a variety of customers.” – Peter Heister, Head of Sales EMEA at IDMS.
Reputational risk…
Need for integrating online unstructured data analysis with the current analysis on financial structured data
[Diagram labels: unstructured sources; ontology; IE; sentiment analysis; structured sources; customer and product data (internal sources): performance, mismatching, volumes, nr. customers; Reputational Risk Index (RI) model; application scenarios; reputation cockpit]
• Query
  • on-demand
  • routine
• What-if scenarios:
  • events
  • probability-weighted risk scenarios
  • …
• Risk reporting:
  • reputational trends for each counterpart
  • events/topic
  • data source drill-downs
  • …
Goal: to measure and report, in quasi-real time, on reputational risk, using internal as well as external data sources, integrated into a single reputation engine and application scenario
Retail brokerage @ work
• Sentiments: enhance the investment process by assessing unstructured information
• Relieve the user of repeatedly reviewing various sources by automating this task
• Provide different levels of sentiments, e.g. for single instruments and sectors
• Support individual decision making by incorporating sentiments
Market surveillance @ work
Typically thinly-traded stocks:
1) Buying
2) Disseminating inaccurate or misleading information (Blog A, Blog B, Blog C, Twitter)
3) Waiting for changes of market price
4) Selling on artificial price level
[Chart axes: price p over time t]
• Identification and classification of unstructured information
• Readily understandable generation of alerts
• Functionalities to handle alerts
• Comparison of market, institute-specific and unstructured information
Decision support in evaluation of suspicious constellations
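The four-step pattern on this slide (buy, disseminate, wait, sell) can be turned into a toy alert rule, sketched below. It is not the rule used by FIRST or by any surveillance product; the input format, the `pump_and_dump_alerts` function and the thresholds are assumptions for illustration.

```python
def pump_and_dump_alerts(trades, post_times, prices, min_posts=3, min_rise=0.2):
    """Flag buy/sell pairs of one account that bracket a burst of posts and a price rise.

    trades:     list of (time, side) with side "buy" or "sell" (one account, one stock)
    post_times: times of promotional posts about the stock (blogs, Twitter, ...)
    prices:     dict mapping each trade time to the market price at that time
    """
    alerts = []
    buys = [t for t, side in trades if side == "buy"]
    sells = [t for t, side in trades if side == "sell"]
    for b in buys:
        for s in sells:
            if s <= b:
                continue  # the sale must come after the purchase
            posts = sum(b < p < s for p in post_times)
            rise = prices[s] / prices[b] - 1.0
            if posts >= min_posts and rise >= min_rise:
                # Carry the evidence along so the alert is easy to understand.
                alerts.append({"buy_t": b, "sell_t": s,
                               "posts": posts, "rise": round(rise, 3)})
    return alerts


# Hypothetical example: buy at t=1, four posts, sell at t=9 after a 50% rise.
print(pump_and_dump_alerts(
    trades=[(1, "buy"), (9, "sell")],
    post_times=[2, 3, 5, 6],
    prices={1: 1.00, 9: 1.50},
))
```

Each alert carries the number of posts and the price rise that triggered it, in the spirit of the slide's requirement that alerts be understandable to the compliance officer.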
Market surveillance @ work
Structured Information: Market data; Instrument Reference data; Ad-hoc news; Transaction data; Employee data; Order data
Unstructured Information: Blogs; Discussion Forums; “News”; Social Networks
Sentiment Analysis; Scenario Analysis; Analytic Models; Visualisation
Benefits for the market
• Broadened approach to detecting suspicious trading behaviour
• Early recognition of trends and patterns
• Decision support in investigation and escalation
Real-life implementation @ B-NEXT, Germany. Contact: Markus.Reinhardt@b-next.com
Stay tuned (http://project-first.eu/)
Acknowledgement
The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement n°257928.
THANKS