
J. Christopher Westland, University of Illinois - Chicago



Presentation Transcript


  1. Big Data’s Challenge to Business Analytics: Cloud Computing, Unstructured Data and Intangible Assets • J. Christopher Westland, University of Illinois - Chicago

  2. The Original ‘Big Data’ • Digital recording and calculation started long ago… • … started with the Lebombo Bone (35,000 BCE) • … the oldest known computer • A tally stick with counting notches carved into a baboon’s fibula Westland 2014

  3. Sumerian Sexagesimal Abacus (3000 BCE) • Spread to Egypt and Persia, whose dominance of central Asia spread it to all corners of the world, along with coinage • Adopted under the Qin Dynasty in China with the invention of money Westland 2014

  4. Office Automation: With spreadsheet data stores (c. 1870) Westland 2014

  5. Early Data Storage and Processing: Punched cards for the Jacquard Loom Westland 2014

  6. Accounting defined the growth of Modern Computing: the IBM 360 Westland 2014

  7. Google Redefines Computing around Big Data (c. 2013: 43 PB @ $75 million/yr energy bill) Westland 2014

  8. Expanded Data Output: Robotics Westland 2013

  9. Expanded Data Output: Additive Manufacturing Westland 2013

  10. Expanded Data Input: Drone Surveillance Westland 2014

  11. Expanded Data Input: Reading Minds Westland 2014

  12. Information is now the major source of firm value Westland 2014

  13. Information Production is the Generator of National Wealth Westland 2014

  14. Big Data • The amount of data in the world has been growing exponentially for decades. • Global data was 2.8 zettabytes (ZB) in 2012, or 2.8 trillion GB generated annually • Only around 0.5% of this is used for analysis, and most of that data and analysis is in the financial sector, of the sort that might conceivably be analyzed with 'Big Data' analytics. • Volumes of data are projected to reach 40 ZB by 2020, or 5,247 GB per person Westland 2014
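The unit conversions on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal SI prefixes (1 ZB = 10^21 bytes, 1 GB = 10^9 bytes); the variable and function names are illustrative and do not come from the presentation.

```python
# Sanity check of the slide's figures, assuming decimal SI prefixes.
GB_PER_ZB = 10**21 // 10**9  # 1 ZB = 10^21 bytes, 1 GB = 10^9 bytes


def zb_to_gb(zettabytes: float) -> float:
    """Convert zettabytes to gigabytes."""
    return zettabytes * GB_PER_ZB


data_2012_gb = zb_to_gb(2.8)   # 2.8 ZB expressed in GB
data_2020_gb = zb_to_gb(40)    # projected 40 ZB expressed in GB

# Share of the 2012 data said to be used for analysis (about 0.5%).
analyzed_2012_gb = 0.005 * data_2012_gb

# World population implied by "5,247 GB per person" at 40 ZB.
implied_population = data_2020_gb / 5247

print(f"2012 data: {data_2012_gb:.3g} GB")
print(f"Analyzed in 2012: {analyzed_2012_gb:.3g} GB")
print(f"Implied 2020 population: {implied_population:.3g}")
```

Under these assumptions, 2.8 ZB does equal 2.8 trillion GB, and the per-person figure implies a population of between 7.5 and 7.7 billion.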

  15. Varieties of Information • Financial data is declining • Unstructured and transient data is growing • Only a fraction of available data has been explored for analytic value. • ‘Protected’ data is growing faster than the total amount of data itself, due to regulation and additional legal, lending and trading requirements in the financial community. Westland 2014

  16. Why we have a problem: Data growth vastly outstrips our capability to process it Westland 2014

  17. Two Facets of Analytics' ‘Big Data’ Challenge • Organization of and Access to Data in non-Matrix Modalities • Text, Image, Sensory (smell, sound, taste), etc. • Public cloud, Internet and NoSQL databases • Exponential growth of dataset size but linear growth of budgets • Data Reduction & Information Distillation • Probability Distributions, Data Reduction and Pattern Matching algorithms • Compound distributions (with multiple modes) for accounting transactions Westland 2014

  18. The Rise of Non-Matrix Data: from accounting classifications to continuous time-series Westland CNAIS 2013

  19. Time • Incorporating ‘Time’ into scientific models is particularly troublesome • Three approaches: • Time Domain • Auto-correlation and cross-correlation analysis for serial dependence • ARMA, ARIMA, ARFIMA, ARCH, GARCH, etc. • Frequency Domain • Spectral analysis and wavelets (cyclic behavior) • Minkowski time (spatial conversion) Westland CNAIS 2013
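As a rough illustration of the time-domain approach, the sketch below estimates the sample autocorrelation of a simulated AR(1) series with NumPy. The series, the coefficient 0.8, and the function names are assumptions made for this example; none of them come from the presentation.

```python
import numpy as np


def sample_autocorrelation(x, max_lag):
    """Sample autocorrelation of x at lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    denom = np.dot(centered, centered)
    return np.array([
        np.dot(centered[:len(x) - k], centered[k:]) / denom
        for k in range(max_lag + 1)
    ])


def simulate_ar1(phi, n, seed=0):
    """Simulate x[t] = phi * x[t-1] + noise[t] with standard normal noise."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x


# Hypothetical serially dependent series (phi = 0.8 is an arbitrary choice).
series = simulate_ar1(phi=0.8, n=5000)
acf = sample_autocorrelation(series, max_lag=3)
print(acf)
```

For an AR(1) process the autocorrelation should decay roughly geometrically with the lag, which is the kind of serial dependence that ARMA-family models are designed to capture.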

  20. Big Data & Evidence Capture • We have to capture data wherever it may live • If it comes to us, it may be fraudulent • This requires sophisticated tools for access • To Clouds, Server farms, NoSQL and document collections and sensor data Westland 2013

  21. Financial & Economic Data Distributions are Very Non-Standard • They have probability spikes at zero • They have multiple-modes • This can be explained by modeling as compound distributions with two parts: • The occurrence rate of a type of transaction • E.g., a discrete Negative Binomial distribution • The economic value of a single transaction of a specific type • E.g., a continuous Pareto or Gamma distribution Westland 2014
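A minimal simulation of such a two-part compound distribution, assuming NumPy and arbitrary illustrative parameters, might look like the following. Transaction counts per period are drawn from a Negative Binomial distribution and each transaction's value from a Gamma distribution; the parameter values and names are hypothetical, not estimates from the presentation.

```python
import numpy as np


def simulate_compound_totals(n_periods, nb_n=1.0, nb_p=0.5,
                             gamma_shape=2.0, gamma_scale=100.0, seed=0):
    """Simulate period totals as a sum of Gamma values over a Negative Binomial count."""
    rng = np.random.default_rng(seed)
    # Part 1: how many transactions of this type occur in each period.
    counts = rng.negative_binomial(nb_n, nb_p, size=n_periods)
    # Part 2: the value of each transaction; periods with no transactions total 0.
    totals = np.array([
        rng.gamma(gamma_shape, gamma_scale, size=c).sum() for c in counts
    ])
    return counts, totals


counts, totals = simulate_compound_totals(n_periods=10_000)
zero_fraction = np.mean(totals == 0)
print(f"Share of periods with a zero total: {zero_fraction:.3f}")
print(f"Mean of nonzero totals: {totals[totals > 0].mean():.1f}")
```

With these parameters roughly half of the periods have no transactions, which reproduces the probability spike at zero; this sketch does not attempt to demonstrate the multiple modes.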

  22. ‘Big Data’ Analytics presents Us with Classic “Wicked Problems” • There is no definitive formulation of 'Big Data' analytics • 'Big Data' analytics have no stopping rule • 'Big Data' analytics products are not true-or-false, but rather are probability distributions • There is no immediate and no ultimate test of a 'Big Data' analytics conclusion • Each 'Big Data' analytics exercise is a "one-shot operation" and is essentially unique • Stakeholders have radically different world views and different frames for understanding the problem. • 'Big Data' analytics solutions depend on how the problem is framed • And vice-versa: the problem definition depends on the solution • The constraints that the problem is subject to and the resources needed to solve it change over time. • Every 'Big Data' analytics procedure can be considered to be a response to some other finding Westland 2014

  23. The Future of 'Big Data' Analytics • 'Big Data' analytics will increasingly require new tools: • Powerful computing platforms • Intimate knowledge of statistical theory • 'Big Data' analytics in today's complex industrial environment require: • New data streams • Science (analytics) • Art (experience and intuition) and • Access to unstructured, large-volume data on new product and service offerings • NoSQL data structures • Open-source ‘big data’ technologies • New reporting: forecasts, intangible accounting, social accounting • Demands for an accounting science Westland 2014
