SAS on Oracle for Big Data and Cloud Services: Insights into a Strong Partnership (CON8653) Paul Kent, VP Big Data, SAS Randy Wilcox, DBA Team Manager, SAS Solutions onDemand Hermann Baer, Director Product Management, Oracle
AGENDA
• Introduction
• SAS Visual Analytics, SAS High-Performance Analytics on Oracle Engineered Systems
• Oracle & SAS Collaboration
  • Setting the Stage for Big Data
  • Oracle Database 12c
• SAS Solutions OnDemand: SAS Cloud Services
Reflection on a stronger partnership than ever
• SAS High-Performance Analytics and SAS Visual Analytics on Oracle Engineered Systems
• Extensive engineering collaboration
• Sizing, configuration guidance and best practices for deployment
• Support for POVs
• A strong technology and business alliance to develop solutions and products brings tremendous value and confidence
OUR PERSPECTIVE
BIG DATA: Big Data is RELATIVE, not ABSOLUTE. It arises when the volume, velocity and variety of data exceed an organization's storage or compute capacity for accurate and timely decision-making.
ANALYTICS: The process surrounding the development, interpretation, and useful application of statistics to solve a problem. Analytics applied to data provides the 4th V = Value. Three types: Descriptive, Predictive, Prescriptive.
BIG ANALYTICS: The combination of using ANALYTICS on BIG DATA and/or the capability to run advanced or complex analytics on data of any size.
SAS® HIGH-PERFORMANCE ANALYTICS KEY COMPONENTS
Analytical Workload: How analytical leaders architect to exploit data
[Architecture diagram. Labels: Analytics Platform; Analytical Services; Analytical Models and Rules Repository; Fast insights (in-memory); Analytical Data Warehouse; Analytical Visualization; Discovery; ANALYTICS Inc. Enterprise Miner; Event Data Store; EDW; Raw Data Pool (HDFS / NoSQL); Event Data Store (RDBMS); Tx Data Sources; Event Management Platform (Event Stream Processing, R/T Decision Services); Operational Execution; Event Streams]
SAS Business Analytics Framework
[Architecture diagram. Tiers: Client Tier, Web Tier, Server Tier, Metadata Tier, Data Tier. Components: SAS Analysts' Desktops, SAS Web Clients, Web Application Server, SAS Compute Server, SAS Metadata Server, Analytic Data Warehouse / Marts, Relational Data Store]
Analytical Workload: Rapid time to value in standard deployment
[Architecture diagram. Labels: Analytical Services (on Oracle Exalogic / Big Data / OVCA); Analytical Models and Rules Repository; SAS Grid-in-a-Box; SAS Visual Analytics; SAS High Performance Analytics; SAS ANALYTICS Inc. Enterprise Miner; SAS Visual Statistics; (HDFS / NoSQL); Exadata; Big Data Appliance; SAS Analytics Accelerator; SAS Big Data Connector; Database & options; Oracle Event Processing; SAS Event Stream Processing; SAS Enterprise Decision Management; Oracle Business Rules; SAS Business Rules Manager; Real-Time Decisions; Oracle Policy Automation]
SAS Business Analytics Framework
[Architecture diagram, now with Infiniband. Tiers: Client Tier, Web Tier, Server Tier, Metadata Tier, Data Tier. Components: SAS Analysts' Desktops, SAS Web Clients, Web Application Server, SAS Compute Server, SAS Metadata Server, Analytic Data Warehouse / Marts, Relational Data Store]
SAS Business Analytics Framework: Your Cloud
[Architecture diagram, with Infiniband. Tiers: Client Tier, Web Tier, Server Tier, Metadata Tier, Data Tier. Components: SAS Analysts' Desktops, SAS Web Clients, Web Application Server, SAS Compute Server, SAS Metadata Server, Analytic Data Warehouse / Marts, Relational Data Store]
HOW DOES IT WORK: Exalogic/BDA/OVCA (compute) with Exadata (storage)
[Diagram. Labels: Client; Exalogic / BDA / OVCA; Exadata]
HIGH-PERFORMANCE ANALYTICS: Using Different Data and Computing Appliances with Asymmetric HPA
[Diagram. Labels: SAS Server; Computing Appliance (Exalogic/BDA/OVCA); General; Captains; TKGrid; TK; Data Appliance (Exadata); Access Engine; Workers; Controller]
SAS code shown on the slide:
    libname a oracle server="dataAppliance";
    proc hpcorr data=a.flights;
      performance mode=asym host="computingAppliance";
    run;
SAS High-Performance Analytics Performance SAS EP Parallel Data Feeders Table 1: Summation of 5/20/100/200 columns; Baseline: DOP=1 (no parallelism) 120M rows, 400 columns, reg_simtbl_400
SAS High-Performance Analytics Performance SAS EP Parallel Data Feeders Table 2: Scan times for 2 tables (200 columns, 400 columns, 120M rows); Baseline: SAS/ACCESS vs. HPA EP feeder
SAS High-Performance Analytics, SAS Visual Analytics on Oracle Engineered Systems
[Diagram: Big Data Appliance (BDA) deployment]
• bda101: SAS High-Performance Analytics Server Root Node, Hadoop NameNode
• bda102: SAS Visual Analytics Server Tier, SAS Visual Analytics Middle Tier
• bda103-bda118: Hadoop DataNodes, SAS LASR In-Memory Analytics Server
• Clients: SAS Analysts' Desktops, SAS Web Clients
SAS and Oracle Working together to create customer value Analysis Platform Analytics 3.0 Lifecycle Management
SAS and Oracle: Better Together
SAS® Exadata Value Proposition
Randy Wilcox, DBA Team Manager, SAS Solutions onDemand
SAS Solutions OnDemand Overview
• SAS Solutions OnDemand: started in 2000, 450 global staff members
• Advanced Analytics Lab (AAL): created in 2007
• Over 1 PB of data under management
• Multiple ASP lines of business, representing over 400 customer sites (5 to 30,000 users per solution) in more than 70 countries
  • Retail, financial services, health care, pharmaceutical, government, entertainment analytics
  • Marketing and fraud analytic solutions
• Experience supporting customers with unique situations
  • Regulatory constraints: AML, FDA, HIPAA, Safe Harbor, SOC 2 / SOC 3
  • Working with multiple parties
• Best Practices
  • Innovative techniques
  • Documented processes and procedures
SAS Solutions OnDemand Advanced Analytics Lab
• Formed by CEO Jim Goodnight in 2007
• Premier analytic services group
• Mission:
  • Develop innovative analytical processes and techniques, using SAS software, to solve our customers' high-end business problems
  • Support sales and consulting in generating revenue by helping close analytically challenging engagements
  • Produce analytical work products for repeatable processes
• 98% of AAL members hold graduate degrees in analytic fields (34% Ph.D.s)
• 20 approved and 10 pending patents
• Learn with the experts to the degree desired
Staffing To Support Any Customer Need
• Analyst
• Application Developer
• Business Analyst
• Compliance Specialist
• Data Architect / Data Modeler
• Data Custodian
• Data Integration Consultant
• Database Administrator
• Information Technology System Administrator
• Instructional Designer
• Load Tester
• Operations / Maintenance Engineer
• Performance Analyst
• Program Manager
• Project Manager
• Quality Assurance Analyst
• Quality Specialist
• Release Manager
• Repository Administrator
• Retail Duty Manager
• Retail Operational Manager
• SAS Administrator
• Service Desk Consultant
• Solution Architect
• System Administrator
• Technical Account Manager
• Technical Architect
• Technical Communicator
• Technical Lead
• Trainer
• Web Developer
SAS Solutions OnDemand: Exadata at SAS Solutions OnDemand
[Diagram labels: Multitenant, Agility, Business Benefits]
SAS Solutions OnDemand uses three key features of Exadata (Multitenant, Agility, and Performance) to consolidate databases, shorten time to deployment, and drive down cost while realizing performance improvements.
• Deployed quarter racks in multiple data centers
• Utilized ZFS Storage Appliance for a backup solution
SAS Solutions OnDemand Challenges
• Key problems with the legacy environment:
  • Low CPU utilization: typical usage <20%
  • Complex server farm
  • Under-utilized licenses
  • High energy cost with legacy servers
  • Systemic inefficiencies
  • Requires support and coordination from multiple internal organizations and vendors
SAS Solutions OnDemand Exadata Key Requirements
• Multitenant:
  • Consolidation of database instances to Exadata
  • Utilize multiple hosted Exadata racks
  • Instance caging
  • Maintain separation of data across customers
• Agility:
  • Decrease deployment time
  • Remove dependencies on other departments
• Business Continuity:
  • High availability SLAs >99%
  • Superior backup, restore, and recovery
• Oracle DB License Consolidation:
  • Consolidate under-utilized licenses
  • Lower yearly license spend
• Performance Improvement:
  • Not an initial key requirement, but significant performance improvements have been recognized
SAS Solutions OnDemand Multitenant Benefits
BEFORE: Many Disparate Customer Systems
• PROBLEMS:
  • Typical usage <20%
  • Costly
  • Inefficient
CURRENT: Consolidated on Exadata (Exadata X2-2 DB Consolidation, with Data Guard)
• Test and QA
• Production
• Disaster Protection
• BENEFITS:
  • High availability
  • Cloud control/OEM 12c
  • Lowered cost of license per CPU for database
  • Exadata can handle usage spikes and still meet the SLA
  • Optional data compression using HCC lowers costs with no impact on performance
  • Backup / recovery configured once
  • Less data center storage space used
  • Lower energy consumption to host
  • Total cost of ownership significantly lowered
SAS Solutions OnDemand Agility Benefit
[Diagram: provisioning work that involved the IT Team, Network Team, Storage Team and Backup Team, shown against a single DBA team]
• HW / OS provisioning
• Set and deploy all FS for each DB
• Manage NetBackup infrastructure and all DB backups
• Network / Firewall / VLAN configuration
SAS Solutions OnDemand Agility Benefit
Enhanced Business Performance:
• Key Recognized Benefits:
  • Onboarding a new database went from days to hours
  • OEM 12c Cloud Control manages the entire stack
  • The DBA team is able to complete the entire process
  • Storage, network, hardware and OS setup steps eliminated
  • By using ZFS, dependency on corporate backup/recovery services was reduced to DR only
  • TCO decreased for hosting services
Service Levels: Improved and consistent delivery to the business
Innovation: Superior capabilities to drive high value business results
Time to Value: Reduced time to stand up and deliver database services
Customer Example One: Anti-Money Laundering
Customers 1, 2 … X: Customer "x", Anti Money Laundering / Fraud
BEFORE
• Customer has 8-core dedicated standalone
• Customer uses 1730 GB
CURRENT ENVIRONMENT
• One ¼ rack Exadata X2-2
• Each ¼ rack has 2 db nodes, 12 cores per node, 192 GB RAM per node
• Customer has 2 cores from each node = 4 cores
• 3 storage cells; raw capacity: 21.6 TB (HP), 108 TB (HC)
• Customer uses 500 GB
• Strategic use of partitioning and Hybrid Columnar Compression
• Data extract selections are made faster by use of the Exadata storage indexes
• Up to 45x performance increase with Exadata storage indexes
• Significant reduction in storage by utilizing Hybrid Columnar Compression on aging partitions
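The "Hybrid Columnar Compression on aging partitions" step can be sketched as a small Python helper that emits one `ALTER TABLE ... MOVE PARTITION ... COMPRESS FOR ...` statement per partition that has aged past a cutoff. This is only an illustration: the table name, partition names, dates and the chosen compression level are hypothetical and do not come from the presentation.

```python
from datetime import date

def hcc_statements(table, partitions, cutoff, level="ARCHIVE HIGH"):
    """Return DDL that recompresses every partition lying entirely before cutoff.

    partitions maps a partition name to its exclusive upper bound
    (the VALUES LESS THAN date), so a bound equal to the cutoff still qualifies.
    """
    aged = sorted(
        (bound, name) for name, bound in partitions.items() if bound <= cutoff
    )
    return [
        f"ALTER TABLE {table} MOVE PARTITION {name} COMPRESS FOR {level}"
        for _, name in aged
    ]

# Hypothetical quarterly partitions of a hypothetical transactions table.
partitions = {
    "p2013q2": date(2013, 7, 1),
    "p2012q4": date(2013, 1, 1),
    "p2013q1": date(2013, 4, 1),
}
stmts = hcc_statements("aml_transactions", partitions, cutoff=date(2013, 4, 1))
for stmt in stmts:
    print(stmt)
```

Only the two oldest partitions are selected here; the most recent one stays uncompressed so that current data remains cheap to update.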
Customer Example Two: Marketing Automation
Customers 1, 2 … X: Customer "x", SAS Marketing Automation
BEFORE
• Customer 2 x 6 cores of Linux
• Customer uses 2850 GB
CURRENT ENVIRONMENT
• Partial ¼ rack Exadata X2-2
• Each ¼ rack has 2 db nodes, 12 cores per node, 192 GB RAM per node
• Customer uses 2 cores on each node for a total of 4 cores
• 3 storage cells; raw capacity: 21.6 TB (HP), 108 TB (HC)
• Customer uses 700 GB
• Used partitioning to run long ETL and analytic jobs in the background prior to daily promotion to production
• Instant ETL updates with ZERO downtime by utilizing partitioning for background processing and the exchange partition function for promotion
• Saved considerable space by eliminating indexes that are no longer required due to Exadata's superior processing power
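As a rough illustration of the promotion step described above: the ETL job loads a staging table in the background, and a single `ALTER TABLE ... EXCHANGE PARTITION` statement then swaps it into the production table. The sketch below only builds that statement as a string. The table and partition names are hypothetical, and the `INCLUDING INDEXES WITHOUT VALIDATION` clauses are an assumed choice, not something stated on the slide.

```python
def exchange_partition_sql(target_table, partition, staging_table):
    """Build the DDL that swaps a fully loaded staging table into one partition."""
    return (
        f"ALTER TABLE {target_table} "
        f"EXCHANGE PARTITION {partition} "
        f"WITH TABLE {staging_table} "
        "INCLUDING INDEXES WITHOUT VALIDATION"
    )

# Hypothetical names: the staging table is loaded by the background ETL job,
# then promoted into the production table in one step.
stmt = exchange_partition_sql("campaign_fact", "p_current", "campaign_fact_stage")
print(stmt)
```

The exchange is a data dictionary operation rather than a copy of rows, which is why the promotion itself can appear instant to users of the production table.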
Customer Example Three: Fraud Detection
Customers 1, 2 … X: Customer "x", Anti Money Laundering / Fraud
BEFORE
• Customer has 8 nodes of a commercial Postgres-based cluster
• Each node has 2 x 6 cores and 96 GB of RAM
• Customer uses 1800 GB per database, with 2 databases in place at production level per data center
CURRENT ENVIRONMENT
• One ¼ rack Exadata X3-2
• Each ¼ rack has 2 db nodes, 12 cores per node, 256 GB RAM per node
• Customer has 6 cores from each node = 12 cores
• 3 storage cells; raw capacity: 21.6 TB (HP), 108 TB (HC)
• Customer uses 600 GB per DB
• Daily ETL runs < 10 hours vs. > 20 hours
• Interface in use by 33,000 users now returns all queries in less than 30 seconds vs. many selections timing out at 3 minutes
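The before/after figures quoted in the three customer examples can be turned into percentage reductions with a few lines of Python. The storage and core counts below are taken from the slides; the core count "before" for Example Three is derived as 8 nodes x 2 x 6 cores = 96, and storage for Example Three is per database. The helper name is illustrative only.

```python
def reduction_pct(before, after):
    """Percentage reduction from before to after, rounded to one decimal."""
    return round(100 * (before - after) / before, 1)

# (storage before GB, storage after GB, cores before, cores after)
examples = {
    "Example One: Anti-Money Laundering": (1730, 500, 8, 4),
    "Example Two: Marketing Automation": (2850, 700, 2 * 6, 4),
    "Example Three: Fraud Detection": (1800, 600, 8 * 2 * 6, 12),
}

results = {}
for label, (gb_before, gb_after, cores_before, cores_after) in examples.items():
    results[label] = (
        reduction_pct(gb_before, gb_after),
        reduction_pct(cores_before, cores_after),
    )
    storage, cores = results[label]
    print(f"{label}: storage -{storage}%, cores -{cores}%")
```

On these numbers, each customer's storage footprint shrinks by roughly two thirds to three quarters, and the core allocation by half or more.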
SAS Solutions OnDemand Performance Improvement Benefits
• Increased performance by removing indexes and letting the Exadata Storage Engine do its work. A side benefit is more space for additional customers and databases, leading to an increased ROI.
• Implemented an Information Lifecycle Management policy to partition data where possible and to compress data using Hybrid Columnar Compression based on usage and historic attributes.
• Implemented Transparent Data Encryption (TDE) as a standard for all customers.
  • Very few other database vendors could compete against this option.
  • Little performance impact, as the data was encrypted in the DB nodes BUT decrypted by hardware at the storage nodes.
• Utilized Instance Caging, Database Resource Management and IO Resource Management to guarantee a level of performance to all customers.
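Instance caging limits how many CPUs a database instance may use by setting the `cpu_count` parameter while a Resource Manager plan is active. A minimal sketch, assuming 12-core database nodes as in the customer examples: the tenant names, the per-tenant core counts and the choice of `DEFAULT_PLAN` are hypothetical, and the helper functions are illustrative rather than part of any Oracle or SAS tooling.

```python
NODE_CORES = 12  # cores per database node, as quoted in the customer examples

def cage_statements(cpu_count, plan="DEFAULT_PLAN"):
    """DDL that cages one instance: cap its CPUs and enable a resource plan."""
    return [
        f"ALTER SYSTEM SET cpu_count = {cpu_count} SCOPE = BOTH",
        f"ALTER SYSTEM SET resource_manager_plan = '{plan}' SCOPE = BOTH",
    ]

def fits_without_oversubscription(allocations, node_cores=NODE_CORES):
    """True when the caged instances together stay within the node's cores."""
    return sum(allocations.values()) <= node_cores

# Hypothetical tenants sharing one node, two cores each.
allocations = {"cust_aml": 2, "cust_marketing": 2}
print(fits_without_oversubscription(allocations))
for stmt in cage_statements(allocations["cust_aml"]):
    print(stmt)
```

Keeping the sum of all `cpu_count` values at or below the node's core count gives each tenant a guaranteed share; allowing the sum to exceed it trades that guarantee for higher utilization.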
The Business Case for Exadata: Delivering IT & Business Benefits at a Lower Cost of Ownership
• Business benefits result from multitenancy, agility and improved IT performance:
  • Superior services and processing
  • Superior business intelligence
Value of Quantified Benefits
• Business Benefits: Increased Revenue, Retention, Growth, Cost Management, Direct Costs, Expenses, Asset Management, Workforce Productivity
• IT Value-Add: SLAs, Performance, Speed, Frequency, Granularity, Time-to-Market
• IT Cost Savings: Consolidation of Storage, Servers, Data Center, Labor
Oracle Exadata Benefits for SAS End Users
SAS Solutions OnDemand Technologies Used
• Centralized management of all Oracle databases with Oracle Enterprise Manager 12c.
• Utilized Oracle Advanced Security Option (ASO) for Transparent Data Encryption with unique wallets/keys for each database. Also utilized the ASO for SSL encryption of all client connections.
• Utilized the SCAN listener to hand off to dedicated local listeners, each on its own port, for each database. The compute tier for the solution had access to our Exadata DMZ only over the SCAN listener port and the dedicated local listener port.
• Backups are to ZFS and mainly use RMAN backup sets and opportunistic Data Pump exports.
• Database partitioning and Hybrid Columnar Compression are used in our data lifecycle management strategy. We are still testing offloading image copies to ZFS.
• Utilized Oracle Database Appliance as a Tier 2 database offering.
SAS Solutions OnDemand Lessons Learned
• If you do not have RAC and Grid experience, sign up for training as soon as you place your order.
• Use Oracle's onboarding services for Exadata if you are a first-time buyer.
• Make sure you understand the performance implications of High Performance disks versus High Capacity disks for your intended usage.
• Investigate how data is being placed onto the disk. The default ASM templates do not explicitly place any file types in the HOT area of the disk.
• If you are already a premium support customer, look into the platinum support offerings available for Oracle Engineered Systems.
SAS Solutions OnDemand Future Direction
• Evaluating Oracle Database 12c Multitenant
  • Reduced TCO through the management of many databases as one
  • Lower resource utilization
  • Lower administration costs
  • Rapid cloning for development and debugging
• Tiered DBaaS offering
  • Define Container Databases with different degrees of availability: Single Instance, RAC, disaster recovery with Data Guard
  • Move a customer's pluggable database between tiers with ease
• Improved Information Lifecycle Management (ILM) with the use of Automatic Data Optimization
[Diagram: customer pluggable databases CUST 1 through CUST 7]
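The tiered DBaaS idea can be modeled in a few lines of Python: each container database represents one availability tier, and moving a customer means taking its pluggable database out of one container and placing it in another. This is a toy model under stated assumptions, not Oracle's API. The class, function and container names are hypothetical, and in a real deployment the move would be carried out with the database's own unplug and plug operations.

```python
from dataclasses import dataclass, field

@dataclass
class ContainerDatabase:
    """One availability tier, holding the customer PDBs assigned to it."""
    name: str
    availability: str
    pdbs: set = field(default_factory=set)

def move_pdb(pdb, source, target):
    """Move a customer's pluggable database from one tier to another."""
    if pdb not in source.pdbs:
        raise ValueError(f"{pdb} is not hosted in {source.name}")
    source.pdbs.remove(pdb)
    target.pdbs.add(pdb)

# Availability levels follow the slide; container names are made up.
single = ContainerDatabase("cdb_single", "Single Instance", {"CUST1", "CUST2"})
rac = ContainerDatabase("cdb_rac", "RAC", {"CUST3"})
dr = ContainerDatabase("cdb_dr", "RAC with Data Guard", set())

move_pdb("CUST2", single, rac)  # customer upgrades to a higher tier
print(sorted(single.pdbs), sorted(rac.pdbs))
```

Modeling the tier as a property of the container rather than of each customer database reflects the "many databases as one" point above: availability is configured once per container and inherited by whatever is plugged into it.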
SAS Solutions OnDemand Questions?
SAS Solutions OnDemand Contact Learn more about our services: http://www.sas.com/solutions/ondemand/index.html Email: Randy.Wilcox@sas.com Blog: http://randywilcoxdba.wordpress.com/
SAS High-Performance Analytics - Choice
Distributed
• Exalogic, BDA, OVCA
• Oracle Linux
• Infiniband
SMP (SAS 9.4)
• SPARC M5-32, Solaris 11.1
• Single domain test: 48 cores, 2 TB RAM
• SMP: In-Memory Analytic Server (LASR)
• Lift 100 GB table from Exadata to LASR -> "hp" PROCs running in multi-threaded fashion
SAS Marketing Automation - Oracle SuperCluster Optimized
Test environment at Oracle Solution Center
• Oracle and SAS Institute jointly tested SAS Marketing Automation with the Oracle SPARC SuperCluster
• Each of the SPARC T4-4 compute nodes was partitioned into two domains, one running Oracle Solaris 10 for SAS Marketing Automation, and the other running Oracle Solaris 11 and Oracle Database 11g
• Oracle Exa Storage Cells accelerated the database performance
• Infiniband network maximized I/O throughput between nodes
OPN Partner and Oracle Internal and Confidential
SAS Marketing Automation on Oracle SuperCluster - Comparison Results
Field Collateral
• Empowering SAS Grid Computing and SAS Marketing Automation on Oracle SuperCluster (Presentation)
• Improving SAS Customer Intelligence Solution Performance with Oracle SuperCluster (Paper)