HPCC Systems Flavio Villanustre VP, Products and Infrastructure HPCC Systems


Presentation Transcript


  1. HPCC Systems. Flavio Villanustre, VP, Products and Infrastructure, HPCC Systems. Risk Solutions

  2. INTRODUCTION
     LexisNexis Risk Solutions
     • More than 15 years of Big Data experience
     • Provides information solutions to enterprise customers
     • Generates about $1.4 billion in revenue
     • Has been using the HPCC Systems platform for over 10 years
     HPCC Systems
     • Launched in June 2011
     • Open source, and enterprise-proven distributed Big Data analytics platform
     • To help enterprises manage Big Data at every step in the Complete Big Data Value Chain
     Strata 2012 Keynote

  3. THE COMPLETE BIG DATA VALUE CHAIN
     • Collection – structured, unstructured and semi-structured data from multiple sources
     • Ingestion – loading vast amounts of data onto a single data store
     • Discovery & Cleansing – understanding format and content; cleanup and formatting
     • Integration – linking, entity extraction, entity resolution, indexing and data fusion
     • Analysis – intelligence, statistics, predictive and text analytics, machine learning
     • Delivery – querying, visualization, real-time delivery on enterprise-class availability

  4. MACHINE LEARNING IN BIG DATA
     • How do you extract value from big data?
       • You surely can’t glance over every record;
       • And it may not even have records…
     • What if you wanted to learn from it?
       • Understand trends
       • Classify into categories
       • Detect similarities
       • Predict the future based on the past… (No, not like Nostradamus!)
     • Machine learning is quickly establishing itself as an emerging discipline.
     • But there are challenges with ML in big data:
       • Thousands of features
       • Billions of records
       • The largest machine that you can get may not be large enough…
     • Get the picture?

  5. ECL-ML: HPCC SYSTEMS MACHINE LEARNING
     • A fully distributed and extensible set of Machine Learning techniques for Big Data
     • State-of-the-art algorithms in each of the Machine Learning domains, including supervised and unsupervised learning:
       • Correlation
       • Classifiers
       • Clustering
       • Statistics
       • Document manipulation
       • N-gram extraction
       • Histogram computation
       • Natural Language Processing
     • Distributed and parallel underlying linear algebra library

  6. TAKEAWAYS
     • A fully parallel set of Machine Learning algorithms on Big Data gives you full insight
     • Outliers matter, especially when those outliers are the exact reason for the discovery effort (for example, in anomaly detection)
     • Dimensionality reduction can lead to information loss: why risk losing valuable information when you can have it all?
     • Leveraging a fully parallel machine learning solution on Big Data will help you identify fraud, bring products to market faster, and become more competitive
     • Organizations that don’t leverage the big data that they have risk losing ground to their competitors
     • Get on it, now!
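
The point about outliers and dimensionality reduction can be illustrated with a toy sketch. It is written in Python rather than ECL-ML, and the data, the function names and the threshold of 2.0 are all made up for illustration. Simply dropping a feature stands in for dimensionality reduction here, which is a deliberate simplification: the anomalous record is only visible in the feature that gets dropped.

```python
# Toy data, invented for this sketch: (feature_a, feature_b) per record.
# Only the last record is anomalous, and only in feature_b.
records = [(0.9, 5.0), (1.1, 5.0)] * 4 + [(0.9, 5.0), (1.1, 50.0)]


def z_scores(values):
    """Distance of each value from the mean, in population standard deviations."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    if std == 0:
        # All values identical: nothing deviates from the mean.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def outlier_indices(values, threshold=2.0):
    """Indices whose absolute z-score exceeds the (assumed) threshold."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > threshold]


feature_a = [r[0] for r in records]
feature_b = [r[1] for r in records]

# "Reduced" view: feature_b has been dropped, so only feature_a is inspected.
reduced_view = outlier_indices(feature_a)

# Full view: feature_b is still available.
full_view = outlier_indices(feature_b)

print("outliers seen after dropping feature_b:", reduced_view)
print("outliers seen with feature_b kept:", full_view)
```

With feature_b discarded, no record stands out. With it kept, the last record is flagged. Real dimensionality reduction techniques are more careful than dropping a column, so this shows only the general risk the slide describes.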
