
Got Big Data?



Presentation Transcript


  1. Got Big Data?

  2. Presentation Topics • IRI, The CoSort Company • Fast Extract (FACT) for Oracle • CoSort – Big Data Manipulation • RowGen - Safe Test Data • Conclusion

  3. The CoSort Company: Innovative Routines International (IRI), Inc. • 29 years of self-funded, sustained growth • Headquarters in Melbourne, Florida • 30 international offices • Core products: • FACT: fast extract for Oracle, reorgs, and ETL • CoSort: big data transformations, conversion, reporting, protection, DB load and ETL tool acceleration, legacy sort migrations • RowGen: safe, realistic test data

  4. Some Users

  5. Fast Extract for Oracle • √ Unloads large tables in parallel to flat files • √ Leverages SQL SELECT syntax • √ Writes CoSort metadata for transforms, reports, etc. • √ Creates SQL*Loader metadata

  6. FACT vs. Oracle Unload • 7X faster • Test platform: HP-UX B.11.11, Oracle 9.2, 50-byte VARCHAR • HP9000 L200044, 4 PA 8500 CPUs @ 440MHz, 8GB RAM

  7. FACT Business Benefits • √ Faster unloads increase data availability for business intelligence • √ Speeds data migrations for faster CRM, ERP, and SCM roll-outs • √ Delays hardware and software upgrade expenditures • √ Helps businesses meet SLAs and other commitments • √ Faster reorgs and ETL free people and computing resources for higher-value operations, improving enterprise agility

  8. Operational Overview

  9. Detailed Schematic • Parallel file manipulation engine for simultaneous, high-volume data: • Transformation • Conversion • Protection • Reporting • Applications: • DW data integration and staging • DB and ETL tool acceleration • Batch and delta reporting • Data privacy and governance • File compares and mapping • Legacy sort and data migrations

  10. Data Transformations • Select/Filter • Sort/Merge • Match/Join • Aggregate • Cross-Calculate • Re-Map/Reformat • Scrub/Cleanse • Substrings • Table Lookup • Type-Convert • PCR Expressions • User Functions • CoSort v9 for Linux on Dell PowerEdge 2950, 2 CPUs: 1000-line query vs. 46 ~15-line SortCL scripts • CoSort v9 can run all these functions in the same job script and I/O pass

  11. Sorting Speeds Oracle Loads • Problem: unsorted inserts into indexes: • Require more internal work (less efficient block splits) • Require more temporary space • Run at half the sorted sustained rate • Solution: • 1. Unload tables to flat files via FACT • 2. Sort the file on the longest index field via CoSort • 3. Load with SQL*Loader where DIRECT=TRUE • 4. Create indexes during load with SORTED INDEXES • After loading, use CREATE INDEX with NOSORT • Do this all in one pass! • Details in FACT/Oracle ETL Whitepaper
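
The pre-sort in step 2 can be illustrated with a minimal Python sketch. It is only a stand-in: it does not call FACT, CoSort, or SQL*Loader, and the function name `presort_for_load` and the sample rows are hypothetical. It shows the idea of ordering an unloaded flat file on the index key so that a later load receives rows in index order.

```python
import csv
import io


def presort_for_load(flat_file_text, key_column, delimiter=","):
    """Return the rows of a delimited flat file sorted on one key column.

    Illustrative stand-in for the pre-load sort step; keys are compared
    as strings, so fixed-width or zero-padded keys are assumed.
    """
    reader = csv.reader(io.StringIO(flat_file_text), delimiter=delimiter)
    rows = list(reader)
    # list.sort is stable, so rows with equal keys keep their unload order.
    rows.sort(key=lambda row: row[key_column])
    out = io.StringIO()
    csv.writer(out, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Hypothetical unloaded rows: (account id, name), in arbitrary order.
unloaded = "1003,Smith\n1001,Jones\n1002,Ng\n"
sorted_text = presort_for_load(unloaded, key_column=0)
print(sorted_text)
```

An in-memory sort like this only works for small files; the slide's point is that a parallel external sort does the same job at table scale.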

  12. Conversion • File Formats: • ACUCOBOL Vision • CLF/ELF Web Logs • CSV • LDIF • L/R/V Sequential • MF I-SAM • MF Variables • Text • Variable Block • XML • Test platform: SunFire 4800, 8 x 1200MHz CPUs, 16 GB RAM, 64-bit Solaris 9 • CoSort also transforms and converts > 100 data types

  13. Field Protection • Field-Level Functions: • AES-256/Decrypt • Filter/Redact • Anonymize/Mask • De/Re-Identify • Pseudonymize • XML Audit (Compliance) Logs • Safe Test Data:
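
To make two of the field-level functions concrete, here is a minimal Python sketch of masking and pseudonymizing a single field. It is not CoSort's implementation: the helper names `mask` and `pseudonymize`, the keyed HMAC-SHA256 approach, the key, and the sample value are all assumptions chosen for illustration.

```python
import hashlib
import hmac


def mask(value, visible=4, fill="*"):
    """Redact all but the last `visible` characters of a field."""
    if visible <= 0:
        return fill * len(value)
    hidden = max(len(value) - visible, 0)
    return fill * hidden + value[-visible:]


def pseudonymize(value, key):
    """Map a value to a stable pseudonym with a keyed hash (HMAC-SHA256).

    The same value and key always give the same pseudonym, so joins on
    the protected field still line up. This is one-way, not encryption.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


# Hypothetical field value and key, for illustration only.
secret_key = b"example-key-not-for-production"
print(mask("123-45-6789"))
print(pseudonymize("123-45-6789", secret_key))
```

Applying the protection per field, rather than per file, is what lets the unprotected columns stay usable for transformation and reporting in the same job.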

  14. Field Protection Unique Benefits • Only CoSort delivers all these security advantages together: • √ Choice of protection methods, libraries or keys • √ Precise RBACs • √ Source and platform portability • √ Integration with Data Transformation & Reporting • √ Speed • √ XML Audit Logs

  15. Batch Reporting • Custom Layouts: • Condition Logic • Embedded HTML • Field Padding • Field Remapping • Variables • User Exits • Clickstream Analytics • CDI/Segmentation • Change Data Capture • Hand-offs to BI Tools • iDashboards Option

  16. Transform, Convert, Protect, Report - Together
     Client            SS#                       Symbol  Symbol  Shares  LastTrade   Shares*LT  Ln.
                                                         CEE                 12.55                1
                                                         CEO                 16.44                2
     Moses Dinan       HKT4rcaFaJrFWuvjHepZtw==  CEQ     CEQ       2000      27.47    54940.00    3
     Jonathan Lawhon   2zQtfMY2KoyLyFnXuKZeSw==  CEW     CEW        825      47.25    38981.25    4
     Cathrine McDougal cZxTLE3gGW0V98pgYvTJ7Q==  CEX     CEX       9000       2.61    23490.00    5
     Eugenio Killen    VOmswdpk3OuJ08eTxaC1jQ==  CFU                855                           6
     Estelle Culbert   v6EvcxAdThRyminIj0VLDg==  CIH     CIH       1500      52.81    79215.00    7
                                                         CMG                  4.84                8
     Valentine Ormond  a+WOaP8znyuC3mkgw9Q9RA==  COV               3250                           9
                                                         CQJ                 50.86               10
                                                         CSU                 42.40               11
     Penny Worthley    9aATT49TjxlLP7P8ncCZXg==  ECG               5000                          12
     Isaiah Nordin     lRQ92+/HuEHXraIABcso1A==  FMU               1000                          13
     Rosalee Torre     Nh14RLmiVG2Sfa6k1JM6qA==  HFU                950                          14
                                                         IJZ                 25.05               15
                                                         IJ]                 24.71               16
     Rey Gaffney       CFEhSs5L6cv1IYz3L9416g==  KEG                400                          17
     Virgil Kerner     T5BtEtioca/UJmvp4aUlgg==  KMI               2100                          18
     Tonya Dove        a9nvad/P0DnQACLsFlWAvQ==  MEO                 50                          19
     Adrianna Brand    8rN6FT/s0ijmWldemST9mw==  OGN               1000                          20
     Clarissa Dicus    yLo9RIHDT3Wg2w2x/4XfLw==  RVY               3333                          21
     Chuck Britton     gn+nfQHsR1m2Y73PvkVPhA==  VQW     VQW       8500      24.05   204425.00   22
     Martin Baynes     7S92fb+kyrMJeYgRtquCeA==  UFU               9000                          23
     Lakesha Croy      Lna+zcnwXRTyHmbXX4EaXw==  UQH               3500                          24
     Kenton Medlin     52ouvtttaeDKV1fg5RPr0A==  UYQ                 90                          25
                                                         U\C                 25.00               26
     Bobbie Watson     2Ng7KIGL1Nm69gzeSr8uww==  WHO                950                          27
                                                         WQW                 103.0               28
     Suzanna Koster    33GbsTFaldxviCEtcTli9g==  WUI                775                          29
     Gretchen Delima   RCHBP7u0yHsNEatXUtky+Q==  YQK               4300                          30
     Petra Kivi        u3iFMokehLXjFPgWe75YnQ==  ZOW              25000                          31
                                                                               ---------------
                                                                                   $401,051.25
     Callouts: Pseudonym • Encrypted • De-Identified • Calc • Sort • Re-Map • Sequence • Customer Data • Join • NYSE Data • Aggregate
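
The join, calculation, and aggregate steps in this report can be sketched in a few lines of Python. This is an illustrative sketch, not a SortCL script: the function name `join_and_value` is hypothetical, and the data is a subset of the slide's rows (the five matched symbols plus one customer-only and one NYSE-only symbol).

```python
from decimal import Decimal

# Customer data: symbol -> shares held (subset of the slide's rows).
holdings = {"CEQ": 2000, "CEW": 825, "CEX": 9000,
            "CFU": 855, "CIH": 1500, "VQW": 8500}

# NYSE data: symbol -> last trade price (subset of the slide's rows).
quotes = {"CEE": Decimal("12.55"), "CEQ": Decimal("27.47"),
          "CEW": Decimal("47.25"), "CEX": Decimal("2.61"),
          "CIH": Decimal("52.81"), "VQW": Decimal("24.05")}


def join_and_value(holdings, quotes):
    """Full outer join on symbol; compute shares * last trade when both exist."""
    rows = []
    for symbol in sorted(holdings.keys() | quotes.keys()):
        shares = holdings.get(symbol)
        last = quotes.get(symbol)
        value = shares * last if shares is not None and last is not None else None
        rows.append((symbol, shares, last, value))
    return rows


rows = join_and_value(holdings, quotes)
# Aggregate: sum the calculated values over the matched rows only.
total = sum((r[3] for r in rows if r[3] is not None), Decimal("0"))
print(f"${total:,.2f}")
```

Only the five matched symbols carry a Shares*LT value, so the sum over them reproduces the report's total line.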

  17. Delta Reporting: Change Data Capture • Problems: • Non-scalability for large transaction volumes • Single-source acquisition and refresh • Complex 3GL and SQL code • Slow roll-ups and cube builds • Solutions: • 1. Capture database and legacy file changes off-line • 2. Compare with CoSort’s parallel sort/join engines • Identify inserts, deletes, and updates simultaneously • Roll up, transform, segment, and report together • Populate multiple cubes and protect sensitive data • Enable fast “what if” testing
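
Identifying inserts, deletes, and updates in a single pass is a classic merge-compare over two key-sorted snapshots. The Python sketch below illustrates that general technique; it is not CoSort's engine, and the function name `diff_sorted`, the `(key, value)` row shape, and the sample snapshots are assumptions. Keys are assumed to be unique within each snapshot.

```python
def diff_sorted(old_rows, new_rows):
    """Merge-compare two key-sorted lists of (key, value) rows in one pass.

    Returns (inserts, deletes, updates); updates are (key, old, new) tuples.
    """
    inserts, deletes, updates = [], [], []
    i = j = 0
    while i < len(old_rows) and j < len(new_rows):
        old_key, old_val = old_rows[i]
        new_key, new_val = new_rows[j]
        if old_key == new_key:
            if old_val != new_val:
                updates.append((old_key, old_val, new_val))
            i += 1
            j += 1
        elif old_key < new_key:
            # Key exists only in the old snapshot: it was deleted.
            deletes.append(old_rows[i])
            i += 1
        else:
            # Key exists only in the new snapshot: it was inserted.
            inserts.append(new_rows[j])
            j += 1
    deletes.extend(old_rows[i:])  # leftovers in old were deleted
    inserts.extend(new_rows[j:])  # leftovers in new were inserted
    return inserts, deletes, updates


# Hypothetical snapshots, already sorted on the key.
old = [(1, "a"), (2, "b"), (3, "c")]
new = [(2, "b"), (3, "x"), (4, "d")]
inserts, deletes, updates = diff_sorted(old, new)
print(inserts, deletes, updates)
```

Because both inputs are read once and in order, the comparison cost grows linearly with snapshot size once the sort is done.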

  18. Dashboard Reporting: iDashboards Option • Translate raw data into visual BI • Unify and direct departments around business goals • Integrate CoSort and other outputs with DB sources • Import/export Excel data and preferences • Applications: • Balanced Scorecard • Supply Chain Management • Process and Quality Control • Sales & Marketing Intelligence • Facility Performance • Project Management • Market Research and Analysis • Enterprise Resource Planning • Financial Intelligence • Executive Reporting • IT Systems Monitoring • SLA Monitoring

  19. Other Applications: Legacy Sort Migrations • ACUCOBOL-GT • Clerity/UniKix • Micro Focus COBOL • MVS JCL • SAG Natural • SAS PROC • SS Unix • Unix /bin/sort • VAX VMS • VSE JCL • Benchmark: ~4GB, 4-key sort of variable-length records • Hardware: IBM p5 570 with 2 CPUs running 64-bit AIX 5.3 • MF COBOL 4 (Workbench) sorted the input file in ~50 minutes • CoSort v9 sorted the same file in 3 minutes, 12 seconds

  20. Other Applications: ETL Tool Accelerations • Exclusive sort plug ‘n’ plays for PowerCenter and DataStage • External CoSort transform and load jobs can use .xml and .dsx file layouts, thanks to MIMB from • Direct calls from ETI, Kalido, OWB, SAS, TeraStream, and other DW applications

  21. FACT-CoSort vs. In-Database Transform • 7X faster • Test platform: HP-UX B.11.23, Oracle 9i • ia64 hp server rx5670, 4 x 1GHz Itanium2 CPUs, 32GB RAM

  22. CoSort Business Benefits • √ Parallel transforms increase data availability for business intelligence • √ Task consolidation reduces runtimes and software expenses, and delays hardware upgrades • √ Built-in field security reduces risk of litigation, fines, brand damage • √ Intuitive syntax reduces development time and training costs • √ Seamless accelerations optimize packaged application ROI • √ Flexible, perpetual-use licensing models

  23. Need Safe Test Data? RowGen: • √ Creates multiple targets for loads, development, and outsourcing • √ Reads your data models and preserves referential integrity • √ Supports virtually all data types, file sizes, value ranges, and conditions • √ Leverages CoSort selection, transformation, and pre-load sorting • √ Includes CoSort formatting functions for custom file/report outputs • √ Re-uses the metadata in your applications
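
Preserving referential integrity in generated test data means every foreign key in a child table points at a row that exists in the parent table. The Python sketch below illustrates that constraint only; it is not RowGen, and the function name `generate_test_rows`, the two-table customer/order model, and the value ranges are hypothetical.

```python
import random


def generate_test_rows(n_customers, n_orders, seed=0):
    """Generate synthetic parent and child rows with valid foreign keys.

    Illustrative only: a fixed seed keeps the output reproducible, and
    every order's customer_id is drawn from the generated customers.
    """
    rng = random.Random(seed)
    customers = [
        {"customer_id": i, "name": f"Customer {i}"}
        for i in range(1, n_customers + 1)
    ]
    orders = [
        {
            "order_id": k,
            # Foreign key: always picked from an existing parent row.
            "customer_id": rng.choice(customers)["customer_id"],
            "quantity": rng.randint(1, 100),  # stays inside the value range
        }
        for k in range(1, n_orders + 1)
    ]
    return customers, orders


customers, orders = generate_test_rows(n_customers=5, n_orders=20)
valid_ids = {c["customer_id"] for c in customers}
print(all(o["customer_id"] in valid_ids for o in orders))
```

Generating child keys from the parent rows, instead of generating both tables independently, is what keeps the test database loadable with constraints enabled.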

  24. Your Data Models, Your Metadata • Build Test DBs for • Oracle • Microsoft SQL Server • DB2 • Sybase • Teradata • Packaged Apps

  25. RowGen Business Benefits • √ Cuts development and testing time/costs • √ Realistic volume and range testing = better quality control • √ Higher quality products increase customer satisfaction/loyalty and decrease support costs • √ Complies with data privacy policies and eliminates production data risk

  26. Conclusion • IRI, The CoSort Company • Fast Extract (FACT) for Oracle • CoSort – Big Data Manipulation • RowGen - Safe Test Data
