
Modernizing Business with BIG DATA



Presentation Transcript


  1. Modernizing Business with BIG DATA
  Aashish Chandra
  Divisional VP, Sears Holdings
  Global Head, Legacy Modernization, MetaScale

  2. Big Data Fueling Enterprise Agility
  Press coverage of the Sears Holdings Hadoop program:
  • Harvard Business Review cites the Sears Holdings Hadoop use case in "Big Data: The Management Revolution"
  • "Sears eschews IBM/Oracle for open source and self-build"
  • "Sears' Big Data Swap Lesson: Functionality Over Price?"
  • "How banks can benefit from real-time Big Data analytics"

  3. Legacy Rides The Elephant Hadoop has changed the enterprise big data game. Are you languishing in the past or adopting outdated trends?

  4. Journey to the World with NO Mainframes
  Pain points: high TCO, inert business practices, resource crunch, mainframe migration
  Goals: cost savings, open-source platform, simpler and easier code, business agility, business and IT transformation, modernized systems, IT efficiencies
  • I. Optimize – Mainframe Optimization: 5%–10% MIPS reduction; quick wins with low-hanging fruit
  • II. Convert – Mainframe ONLINE: tool-based conversion of COBOL and JCL to Java
  • III. Rewrite – Mainframe BATCH: ETL modernization; move batch processing to Pig/Hadoop

  5. Why Hadoop and Why Now?
  THE ADVANTAGES:
  • Cost reduction
  • Alleviates performance bottlenecks where ETL is too expensive and complex
  • Moves mainframe and data warehouse processing to Hadoop
  THE CHALLENGE:
  • Traditional enterprises' lack of awareness
  THE SOLUTION:
  • Leverage the growing support system for Hadoop
  • Make Hadoop the data hub in the enterprise
  • Use Hadoop for processing batch and analytic jobs
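The "move batch processing to Hadoop" idea above can be sketched as a Hadoop Streaming style map/reduce pair — a minimal, hypothetical Python example (the field layout is invented for illustration; the actual Sears work used Pig, per the later slides):

```python
from collections import defaultdict

def mapper(lines):
    """Map phase: emit (store_id, sale_amount) pairs from
    tab-separated sales records (hypothetical layout)."""
    for line in lines:
        store_id, sku, amount = line.rstrip("\n").split("\t")
        yield store_id, float(amount)

def reducer(pairs):
    """Reduce phase: sum sale amounts per store. On a real cluster,
    Hadoop shuffles and groups the keys between the two phases."""
    totals = defaultdict(float)
    for store_id, amount in pairs:
        totals[store_id] += amount
    return dict(totals)

if __name__ == "__main__":
    sample = ["S001\tSKU1\t10.50", "S001\tSKU2\t4.25", "S002\tSKU1\t7.00"]
    print(reducer(mapper(sample)))  # per-store sales totals
```

On a cluster, the two functions would run as separate processes wired together by the Hadoop Streaming jar; the point of the sketch is that a nightly mainframe batch job reduces to two small, horizontally scalable functions.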

  6. The Classic Enterprise Challenge

  7. The Sears Holdings Approach
  Key to our approach:
  • Allowing users to continue to use familiar consumption interfaces
  • Providing inherent HA (high availability)
  • Enabling businesses to unlock previously unusable data

  8. The Architecture
  • Enterprise solutions using Hadoop must be an ecosystem
  • Large companies have a complex environment:
    • Transactional systems
    • Services
    • EDW and data marts
    • Reporting tools and needs
  • We needed to build an entire solution

  9. The Sears Holdings Architecture

  10. The Pig/Hadoop Ecosystem – MetaScale (architecture diagram)

  11. The Learning
  Over two years of experience using Hadoop for enterprise legacy workloads.
  HADOOP:
  • We can dramatically reduce batch processing times for mainframe and EDW workloads
  • We can retain and analyze data at a much more granular level, with longer history
  • Hadoop must be part of an overall solution and ecosystem
  IMPLEMENTATION:
  • We developed tools and skills – the learning curve is not to be underestimated
  • We developed experience moving workload from expensive, proprietary mainframe and EDW platforms to Hadoop, with spectacular results
  UNIQUE VALUE:
  • We can reliably meet our production delivery time windows by using Hadoop
  • We can largely eliminate the use of traditional ETL tools
  • New tools allow an improved user experience on very large data sets

  12. Some Example Use-Cases at Sears Holdings

  13. The Challenge – Use-Case #1
  Scale: Sales: 8.9B line items | Items: 11.3M SKUs | Stores: 3,200 sites | Inventory: 1.8B rows | Offers: 1.4B SKUs | Elasticity: 12.6B parameters | Price sync: daily | Timing: weekly
  • Intensive computational and large storage requirements
  • Needed to calculate item price elasticity based on 8 billion rows of sales data
  • Could only be run quarterly and on a subset of data – needed more often
  • Business need: react to market conditions and new product launches

  14. The Result – Use-Case #1
  Business problem:
  • Intensive computational and large storage requirements
  • Needed to calculate store-item price elasticity based on 8 billion rows of sales data
  • Could only be run quarterly and on a subset of data
  • Business was missing the opportunity to react to changing market conditions and new product launches
  Hadoop result:
  • Price elasticity calculated weekly
  • 100% of the data set, at full granularity
  • Meets all SLAs
  • New business capability enabled
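As a rough illustration of the per-store-item computation behind these numbers, the standard midpoint (arc) price-elasticity formula can be sketched in Python. This is the textbook formula, not Sears' actual model; the example prices and quantities are invented:

```python
def arc_elasticity(q1, q2, p1, p2):
    """Midpoint (arc) price elasticity of demand for one store-item pair:
    percent change in quantity sold divided by percent change in price,
    each measured against the midpoint of the two observations."""
    pct_dq = (q2 - q1) / ((q1 + q2) / 2)
    pct_dp = (p2 - p1) / ((p1 + p2) / 2)
    return pct_dq / pct_dp

# Hypothetical example: a price cut from $10.00 to $9.00
# lifts weekly units sold from 100 to 120.
elasticity = arc_elasticity(100, 120, 10.00, 9.00)
# |elasticity| > 1 means demand is elastic at this price point.
```

In the Hadoop job, this small calculation runs once per store-item key, which is how 8 billion sales rows become 12.6 billion elasticity parameters: the hard part is the grouping and data volume, not the arithmetic.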

  15. The Challenge – Use-Case #2
  Scale: Data sources: 30+ | Input records: billions | Mainframe: 100 MIPS on 1% of the data | Unable to scale 100-fold
  • Mainframe batch business process would not scale
  • Needed to process 100 times more detail to handle business-critical functionality
  • Business need required processing billions of records from 30 input data sources
  • Complex business logic and financial calculations
  • SLA for this cyclic process was 2 hours per run

  16. The Result – Use-Case #2
  Business problem:
  • Mainframe batch business process would not scale
  • Needed to process 100 times more detail to handle the rollout of high-value, business-critical functionality
  • Time-sensitive business need required processing billions of records from 30 input data sources
  • Complex business logic and financial calculations
  • SLA for this cyclic process was 2 hours per run
  Hadoop result:
  • Teradata and mainframe data on Hadoop
  • Pig implemented for processing; Java UDFs for financial calculations
  • 6,000 lines of code reduced to 400 lines of Pig
  • Scalable solution in 8 weeks; processing met a tighter SLA
  • $600K annual savings
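The kind of join-then-aggregate financial logic that Pig compresses into a few hundred lines can be sketched in Python. Everything here is hypothetical — the field names, the currency-conversion step, and the data — it only illustrates the shape of the workload (a JOIN, a per-record calculation, a GROUP/SUM), not Sears' actual business rules:

```python
def settle(transactions, rates):
    """Join each transaction to its currency rate, convert to USD,
    and total by account — the shape of logic a Pig script expresses
    as a JOIN, a FOREACH with a UDF, and a GROUP ... SUM."""
    rate_by_ccy = {r["ccy"]: r["rate"] for r in rates}
    totals = {}
    for t in transactions:
        usd = t["amount"] * rate_by_ccy[t["ccy"]]
        # Round at each step to keep the financial totals stable.
        totals[t["account"]] = round(totals.get(t["account"], 0.0) + usd, 2)
    return totals

txns = [
    {"account": "A1", "ccy": "EUR", "amount": 100.0},
    {"account": "A1", "ccy": "USD", "amount": 50.0},
    {"account": "A2", "ccy": "EUR", "amount": 10.0},
]
fx = [{"ccy": "USD", "rate": 1.0}, {"ccy": "EUR", "rate": 1.10}]
```

Because each statement in Pig operates on whole relations rather than single records, procedural COBOL loops and file handling collapse into declarative steps — which is how 6,000 lines shrink to 400.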

  17. The Challenge – Use-Case #3
  Scale: Data storage: mainframe DB2 tables | Price data: 500M records | Mainframe jobs: 64 | Processing window: 3.5 hours
  • Mainframe unable to meet SLAs on growing data volume

  18. The Result – Use-Case #3
  Business problem: mainframe unable to meet SLAs on growing data volume (DB2 tables, 500M price records, 64 jobs, 3.5-hour window)
  Hadoop result:
  • Source data in Hadoop
  • Maintenance improvement – under 50 lines of Pig code
  • Job runs over 100% faster – now in 1.5 hours
  • $100K in annual savings

  19. The Challenge – Use-Case #4
  Before: Teradata via Business Objects | Transformation: on Teradata | Batch processing output: .CSV files | History retained: no | New report development: slow | User experience: unacceptable
  • Needed to enhance the user experience and the ability to perform analytics on granular data
  • Restricted availability of data due to space constraints
  • Needed to retain granular data
  • Needed Excel-style interaction with data sources of 100 million records, with agility

  20. The Result – Use-Case #4
  Business problem:
  • Needed to enhance the user experience and the ability to perform analytics on granular data
  • Restricted availability of data due to space constraints
  • Needed to retain granular data
  • Needed Excel-style interaction with data sources of 100 million records, with agility
  Hadoop result:
  • Sourcing data directly to Hadoop; transformation moved to Hadoop
  • Over 50 data sources retained in Hadoop, with granular history
  • Pig scripts ease code maintenance
  • Datameer for additional analytics
  • Redundant storage eliminated
  • User experience expectations met
  • Business's single source of truth
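The pattern behind this use-case — streaming a huge extract down to a spreadsheet-sized summary instead of shipping raw .CSV files to users — can be sketched in Python. The column names and data are invented for illustration; the actual solution used Pig and Datameer, per the slide:

```python
import csv
import io

def summarize(csv_source, group_col, value_col):
    """Stream rows from a CSV source one at a time, keeping only
    per-group totals, so hundreds of millions of rows reduce to a
    table small enough for Excel-style interaction."""
    totals = {}
    for row in csv.DictReader(csv_source):
        key = row[group_col]
        totals[key] = totals.get(key, 0.0) + float(row[value_col])
    return totals

data = "region,sales\nEast,100.5\nWest,200.0\nEast,50.0\n"
summary = summarize(io.StringIO(data), "region", "sales")
# summary is {"East": 150.5, "West": 200.0}
```

Memory use is bounded by the number of groups, not the number of input rows, which is what makes granular history cheap to retain upstream while still serving analysts a small, familiar table.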

  21. Summary of Benefits

  22. Summary
  • Hadoop can revolutionize enterprise workloads and make the business agile
  • Can reduce strain on legacy platforms
  • Can reduce cost
  • Can bring new business opportunities
  • Must be an ecosystem
  • Must be part of an overall data strategy
  • The learning curve is not to be underestimated

  23. The Horizon – What Do We Need Next?
  • Automation tools and techniques that ease enterprise integration of Hadoop
  • Education for traditional enterprise IT organizations about the possibilities and reasons to deploy Hadoop
  • Continued development of a reusable framework for legacy workload migration

  24. Legacy Modernization Made Easy!
  For more information, visit: www.metascale.com
  Follow us on Twitter: @LegacyModernizationMadeEasy
  Join us on LinkedIn: www.linkedin.com/company/metascale-llc
  Contact: Kate Kostan, National Solutions, Kate.Kostan@MetaScale.com
