Real-Time News Analytics With Semantic Big Data Technologies. Dr. Volker Stümpflen and Michael Schramm, Clueda AG, 1.4.2014. Clueda. Founded 2012. Spin-off of the Institute for Bioinformatics and Systems Biology of the Helmholtz Zentrum München
Real-Time News Analytics With Semantic Big Data Technologies Dr. Volker Stümpflen and Michael Schramm, Clueda AG, 1.4.2014
Clueda • Founded 2012 • Spin-off of the Institute for Bioinformatics and Systems Biology of the Helmholtz Zentrum München • Real-time software solutions for semantic and associative knowledge processing and analysis • >40 man-years of R&D • 30 employees • Partner: Baader Bank AG • Winner of the Best in Big Data Award 2013
Why Big Data • Storage is cheap • Data is globally accessible
Newsflood • Millions of financial instruments, from stocks to derivatives: increasing • 500,000 news items per day (~4 bn sentences per year), from news agencies to social media channels (blogs, tweets): strongly increasing • X traders and analysts: constant and small • Decreasing time for increasing information
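News Moves Markets • Figure (price over time): manual path: news published -> news reading -> news reading finished -> trader is buying • Clueda path: news published -> automated analysis -> Clueda analysis ready -> trader is buying • The time gained is the commercial advantage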
Big Data Problem: Big Data – Big Noise • Junk-In -> Junk-Out
User-Centric Decision Making • See (Data): concepts, relations and events as they happen in multiple information sources • Understand (Information): trends, mood and relationships using semantics and systems biology approaches • Answer (Knowledge): questions that only specialists could answer before • Real-time engine
Market Moving Influences • Events • Mood • Insider knowledge • Sentiment • Information
Elementary Processing Steps • Recognizing concepts (companies, persons, ...) • Generating knowledge networks • Recognizing relations and events • Advanced analytics (e.g. sentiment)
Simple Detection and Utilization of Concepts • Applications and problems • Source: Preis, T., Moat, H. S. & Stanley, H. E. Quantifying Trading Behavior in Financial Markets Using Google Trends. Sci. Rep. 3, 1684 (2013).
Concept Detection • Recognizing the meaning of unknown words • Self-learning capabilities based on machine learning approaches • After initial training, the knowledge base is extended automatically
Real-Time Event Detection and Processing • Example text: “… big launch celebrations at hardware stores with Galaxy Tab III were canceled. Apple sues Samsung in Australia. Following earlier legal disputes …” • Extracted event: Apple (acting company) -> legal action, negative relation -> Samsung (receiving company), Australia (location of relation) • Other nodes in the network figure: Nokia, Sony, Motorola, Sharp, Microsoft, Foxconn, China, Rare Earths • Understands textual information and relations • Generates a semantic knowledge network • Identifies market-moving news in real time
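To make the "Apple sues Samsung in Australia" example concrete, here is a minimal sketch of how such an event could be represented and extracted. It is not Clueda's method: the lookup tables `COMPANIES`, `LOCATIONS` and `RELATIONS`, the `Event` record and the `extract_events` function are all hypothetical names, and the simple "company, verb, company" pattern stands in for the self-learning semantic analysis described on the slides.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical lookup tables; the slides describe a self-learning knowledge base instead.
COMPANIES = {"Apple", "Samsung", "Nokia", "Sony", "Motorola", "Sharp", "Microsoft", "Foxconn"}
LOCATIONS = {"Australia", "China"}
RELATIONS = {"sues": ("legal action", "NEGATIVE")}  # trigger word -> (relation, polarity)


@dataclass(frozen=True)
class Event:
    acting_company: str
    relation: str
    polarity: str
    receiving_company: str
    location: Optional[str]


def extract_events(text: str) -> List[Event]:
    """Find 'company <trigger> company' patterns, sentence by sentence."""
    events = []
    for sentence in re.split(r"[.!?]", text):
        tokens = re.findall(r"[A-Za-z]+", sentence)
        for i, token in enumerate(tokens):
            # A trigger word needs a token on each side to form a relation.
            if token not in RELATIONS or i == 0 or i + 1 >= len(tokens):
                continue
            actor, receiver = tokens[i - 1], tokens[i + 1]
            if actor in COMPANIES and receiver in COMPANIES:
                relation, polarity = RELATIONS[token]
                # First known location mentioned after the receiving company, if any.
                location = next((t for t in tokens[i + 2:] if t in LOCATIONS), None)
                events.append(Event(actor, relation, polarity, receiver, location))
    return events
```

Calling `extract_events` on the example text from the slide yields one `Event` with Apple as acting company, Samsung as receiving company and Australia as location. Each such event can then become an edge in the knowledge network.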
Event Determination With Big Data Analytics • Figure (price over time): labels open, low, close = high, move caused by news, market move, measurement error • Time axis marked t0, t1 and News Release
Analysis of News From One Year • Figure: number of news plotted against the market-move threshold • Labels: Clustering, Event Type 1, Event Type 2, Optimal threshold, Meaningful news events
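One way to read the figure is that the move attributed to a news item is the price change between the release time t0 and a later time t1. The sketch below computes that relative move from a list of timestamped prices. This reading is an assumption, as are the names `price_at` and `news_move` and the choice of the last traded price as reference. It does not separate the general market move or the measurement error shown in the figure.

```python
from bisect import bisect_right


def price_at(ticks, t):
    """Last observed price at or before time t.

    ticks is a list of (time, price) pairs sorted by time.
    """
    times = [time for time, _ in ticks]
    i = bisect_right(times, t)
    if i == 0:
        raise ValueError("no price observed at or before t")
    return ticks[i - 1][1]


def news_move(ticks, t0, t1):
    """Relative price change between news release t0 and window end t1."""
    p0 = price_at(ticks, t0)
    p1 = price_at(ticks, t1)
    return (p1 - p0) / p0
```

A news item could then be counted as market moving when the absolute value of `news_move` exceeds a chosen threshold. How that threshold is selected is not specified here.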
Statement-Centric Information Compression and Detection • Approximately 30-40% of all news items contain redundant information • Only one out of 500 news items is market moving
Behavioural Finance “We find an accuracy of 87.6% in predicting the daily up and down changes in the closing values of the DJIA”
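As an illustration of filtering redundant news, the sketch below drops items whose word sets overlap strongly with an item already kept. It uses Jaccard similarity on word sets, a generic near-duplicate heuristic swapped in for illustration and not the statement-centric approach named on the slide. The function names and the 0.8 threshold are assumptions.

```python
import re


def word_set(text):
    """Lower-cased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def drop_redundant(news, threshold=0.8):
    """Keep each item only if it is not too similar to an item kept earlier."""
    kept = []
    kept_sets = []
    for item in news:
        words = word_set(item)
        if all(jaccard(words, seen) < threshold for seen in kept_sets):
            kept.append(item)
            kept_sets.append(words)
    return kept
```

This pairwise comparison is quadratic in the number of kept items, so it only illustrates the idea. At the news volumes quoted earlier in the deck an indexed or hashed variant would be needed.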
Sentiment Detection • Simple approach: Counting positive and negative words • Problems
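The simple word-counting approach can be sketched as follows. The word lists `POSITIVE` and `NEGATIVE` and the function `sentiment_score` are hypothetical and far smaller than any real lexicon.

```python
import re

# Tiny hypothetical lexicons, for illustration only.
POSITIVE = {"gain", "gains", "profit", "growth", "strong", "beats"}
NEGATIVE = {"loss", "losses", "lawsuit", "sues", "weak", "canceled"}


def sentiment_score(text):
    """Number of positive words minus number of negative words."""
    words = re.findall(r"[a-z]+", text.lower())
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return positive - negative
```

With these lists, "Apple sues Samsung" scores -1. One well-known weakness of pure counting, not necessarily the one the slide had in mind, is that "No loss this quarter" also scores -1, because the negation is ignored.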
Systemic Interrelations / Systemic Mood • Figure: company network with the nodes Nokia, Sony, Motorola, Samsung, Apple, Sharp, Microsoft, Foxconn, China and Rare Earths, and a "legal action" relation • Sentiment influences with systems biology methods • Mood propagation in networks • Identification of indirect mood drivers
Understanding Complex Situations • Extraction from networks with millions of nodes and billions of edges
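Mood propagation in a network can be illustrated with a simple damped spreading scheme. Each node keeps its own seed mood and adds a damped average of its neighbours' current mood. This is a generic sketch, not the systems biology method referred to on the slide. The function `propagate_mood`, the damping factor and the example edges are assumptions.

```python
def propagate_mood(edges, seed_mood, damping=0.5, steps=2):
    """Spread mood over an undirected graph for a fixed number of steps.

    edges: iterable of (node, node) pairs
    seed_mood: dict mapping a node to its directly observed mood, e.g. -1.0
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    mood = {node: 0.0 for node in neighbours}
    mood.update(seed_mood)

    for _ in range(steps):
        updated = {}
        for node in mood:
            adjacent = neighbours.get(node, ())
            # Average mood of the neighbours, or 0.0 for an isolated node.
            incoming = sum(mood[n] for n in adjacent) / len(adjacent) if adjacent else 0.0
            updated[node] = seed_mood.get(node, 0.0) + damping * incoming
        mood = updated
    return mood
```

With the hypothetical edges Samsung-Foxconn and Foxconn-China and a seed mood of -1.0 on Samsung, two steps leave China with a negative mood although it has no direct link to Samsung. That is the kind of indirect effect the slide calls an indirect mood driver.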
Semantic Big Data News Analytics • Big Data is a reality • Big Data pitfalls: junk in, junk out; correlation vs. causation • Combination with intelligent methods is mandatory: semantic analysis, network analysis • It works: “With the software we save thousands of euros every day” (Uto Baader, Baader Bank)
Thank You! Volker Stümpflen and Michael Schramm