A presentation by W H Inmon

Learn about the evolution of Big Data technology, the types of data it encompasses, and the crucial importance of extracting business value through textual disambiguation. Explore the challenges, strategies, and potential of leveraging Big Data for significant business gains.

Presentation Transcript


  1. ACHIEVING BUSINESS VALUE WITH BIG DATA. A presentation by W H Inmon

  2. Big Data: a brief history (using a military analogy). In military strategy, you want to take the high ground. In technology, having the dbms technology that manages the largest amount of data IS the high ground. 1960: IBM with IMS. 1970: IBM with IMS/DC (transaction processing). 1980: Teradata with MPP technology. 2010: IBM + Hadoop. Each successive iteration of technology took new high ground for the vendor.

  3. Hadoop: Google, Yahoo, Amazon.com. Then IBM, Cloudera, Hortonworks, Teradata et al. came along and discovered Hadoop.

  4. Big Data can be divided into two major types of data. Repetitive data: metering data, click stream data, call level detail, log tapes, analog. Non repetitive data: email, call center, corporate contracts, healthcare, warranty claims, insurance claims.

  5. Repetitive data: same record size; same structure; oftentimes the same data; very regular. Non repetitive data: record size is different; similar structures are accidental; almost never the same data; very irregular.

  6. Useful data. Repetitive data: only a small percentage of the data is useful. Non repetitive data: the vast majority of the data is useful.

  7. Big Data can be divided into two major types of data. Repetitive data (limited business value): metering data, click stream data, call level detail, log tapes, analog. Non repetitive data (massive business value): email, call center, corporate contracts, healthcare, warranty claims, insurance claims.

  8. Repetitive data: 90% of the fishermen, 10% of the fish. Non repetitive data: 10% of the fishermen, 90% of the fish.

  9. There is trouble in paradise. Wall Street Journal, winter 2013: the return on investment for every dollar spent on Big Data is $.55. Large consulting firm: in 18 months we have done 150 proofs of concept for Big Data; 5 have been successful. Large New York bank: for three years we have been trying to make Big Data work. We have done everything our vendor has told us to do. We just are not getting any business value from Big Data.

  10. So what happens when you go to a presentation on Big Data?

  11. What the analyst sees today: budget, sources of data, analytical processing, compatibility, Cirro, Mongo, Pig, Hive, Map Reduce.

  12. Business value: no one talks about business value.

  13. Here is what lies ahead in addressing the topic of achieving business value out of Big Data. (Diagram labels: Big Data; non repetitive business relevant unstructured data; business value.)

  14. In order to get to business value you MUST solve the issues of unstructured data. (Diagram labels: Big Data; non repetitive business relevant unstructured data; business value.)

  15. The vendor’s notion of a solution. (Diagram labels: Big Data; Map Reduce; data scientist; business value.)

  16. Big Data: here’s what everyone is talking about. Non repetitive business relevant unstructured data: here is the even bigger hurdle that no one is talking about.

  17. (Slide background: the word “text” repeated.) We need lots of things, but most of all we need CONTEXT.

  18. And what’s so challenging about raw text? It is dangerous and potentially very misleading to try to use raw text as a basis for decisions. Consider the following confusion: the answer is “7”. Seven what? Seven days? Seven dollars? Seven wonders of the world? Seven seas? Seven dwarfs?

  19. Or consider this confusion: “She’s hot.” What is being said here? She is attractive and I want to date her? It is Houston, Texas, and it is 98 degrees, and she is sweating? I just took her temperature and it was 104 degrees? Looking at the words “She’s hot” tells you nothing. In order to make sense of the text you MUST supply context, and that is true for ALL text.
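The point of slide 19 can be illustrated with a minimal sketch, assuming a small hand-built table of context cues. The cue words, the sense labels, and the function name `resolve_hot` are hypothetical and do not come from the presentation; real homographic resolution would rely on much richer vocabularies.

```python
import re

# Hypothetical cue words for each possible sense of "hot" (illustrative only).
SENSE_CUES = {
    "attractive": {"date", "attractive"},
    "weather": {"degrees", "sweating", "houston", "texas"},
    "fever": {"temperature", "took", "thermometer"},
}

def resolve_hot(context: str) -> str:
    """Pick the sense of "hot" whose cue words overlap most with the context."""
    words = set(re.findall(r"[a-z]+", context.lower()))
    scores = {sense: len(words & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    # With no supporting context the phrase stays ambiguous.
    return best if scores[best] > 0 else "unknown"
```

Called on the bare phrase "She's hot", the function returns "unknown"; only the surrounding sentences let it choose a sense, which is the slide's argument in miniature.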

  20. Textual disambiguation: in order to achieve Business Value, the raw non repetitive business relevant text found in Big Data must pass through a process known as textual disambiguation.

  21. So how do you do textual disambiguation? The first step is to “contextualize” the raw data: qualified vocabularies, document metadata, homographic resolution, taxonomies, ontologies, document sensitive inference, textual proximity, documents, acronym resolutions.
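Two of the techniques named on slide 21, acronym resolution and taxonomies, can be sketched together. This is only an illustration under stated assumptions: the lookup tables, the domain names, and the function `contextualize` are hypothetical, and a real system would draw on curated, domain-qualified vocabularies.

```python
import re

# Hypothetical domain-qualified acronym tables (illustrative only).
ACRONYMS = {
    "medical": {"HA": "heart attack", "BP": "blood pressure"},
    "it": {"HA": "high availability"},
}

# Hypothetical taxonomy mapping resolved terms to broader categories.
TAXONOMY = {
    "heart attack": "cardiac event",
    "blood pressure": "vital sign",
}

def contextualize(text: str, domain: str) -> list[dict]:
    """Expand known acronyms for the given domain and attach a taxonomy category."""
    vocabulary = ACRONYMS.get(domain, {})
    results = []
    # Treat runs of two or more capital letters as acronym candidates.
    for token in re.findall(r"\b[A-Z]{2,}\b", text):
        resolved = vocabulary.get(token)
        if resolved is not None:
            results.append({
                "raw": token,
                "resolved": resolved,
                "category": TAXONOMY.get(resolved, "unclassified"),
            })
    return results
```

The same token "HA" resolves differently depending on the domain passed in, which is why the vocabulary has to be qualified by the kind of document being read.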

  22. The problem with “contextualization” is that there are many ways to “contextualize” the text, all depending on the text. There is no one single algorithm.

  23. Repetitive data: context is easy to find. Non repetitive data: context is there but it is difficult to find.

  24. Contract type, date, contract party, term.

  25. Doctor, gender/race, cancer type, location, description.

  26. (Diagram labels: raw; non repetitive; textual disambiguation; disambiguated; standard dbms; analytical processing; analytical processing.)
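One reading of slide 26 is that text which has passed through textual disambiguation can be stored in a standard dbms and analyzed with ordinary queries. A minimal sketch of that last step follows, using an in-memory SQLite database as a stand-in for the dbms. The table name, the column names, and the function `load_and_count` are assumptions made for illustration, not part of the presentation.

```python
import sqlite3

def load_and_count(rows):
    """Load (doc_id, term, context) rows into a table and count rows per context."""
    conn = sqlite3.connect(":memory:")  # stand-in for a standard dbms
    # Hypothetical schema: one row per disambiguated term occurrence.
    conn.execute("CREATE TABLE disambiguated (doc_id TEXT, term TEXT, context TEXT)")
    conn.executemany("INSERT INTO disambiguated VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT context, COUNT(*) FROM disambiguated "
        "GROUP BY context ORDER BY context"
    ).fetchall()
    conn.close()
    return result
```

Once each term carries its context as a column, a plain GROUP BY is enough for a first analytical pass; the hard part is producing the context column, not querying it.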

  27. So where are organizations being pushed? (Diagram labels: limited business value; business value.)

  28. Thank goodness someone understands Big Data! For more information see our white papers and articles at www.forestrimtech.com. Everything on the site is FREE!!!