420 likes | 560 Views
Learn successful migration processes, tools, and challenges in migrating on-premises enterprise data warehouses to Azure SQL Data Warehouse. Understand modern data warehousing techniques and migration types with real customer case studies.
E N D
Migration experience from an on-premises enterprise data warehouse to Azure SQL Data Warehouse Kavitha Jonnakuti Principal Program Manager Shamik Ghosh Principal Program Manager BRK3327
Agenda • Successful customer migrations to Azure SQL Data Warehouse • Why customers migrate and from where? • Process • Tools • Partners • Top challenges, red-herring and mitigations
Modern data warehousing The modern data warehouse extends the scope of the data warehouse to serve big data that’s prepared with techniques beyond relational ETL Load and ingest Process Serve Transfer and store Process and clean Serve and analyze
Modern data warehousing – Data Ingestion Data loading r Ingest storage Load flat files into data lake on a schedule Logs, files, and media (unstructured) Azure Storage/Data Lake Store Business/custom apps (structured) Azure Data Factory
Modern data warehousing – Data Ingestion Data loading r Ingest storage Load flat files into data lake on a schedule Logs, files, and media (unstructured) Azure Storage/Data Lake Store Orchestration Business/custom apps (structured) Extract and transform relational data Azure Data Factory Azure Data Factory
Modern data warehousing – Data cleansing and prep Data loading r Ingest storage Data processing Read data from files using DBFS Load flat files into data lake on a schedule Logs, files, and media (unstructured) Azure Databricks Azure Storage/Data Lake Store Orchestration Clean and join with stored data Business/custom apps (structured) Extract and transform relational data Azure Data Factory Azure Data Factory
Modern data warehousing – Analysis and Post processing Data loading r Ingest storage Data processing Serving storage Applications Load processed data into tables optimized for analytics Read data from files using DBFS Load flat files into data lake on a schedule Logs, files, and media (unstructured) Azure Databricks Azure Storage/Data Lake Store Azure SQL DW Orchestration Clean and join with stored data Dashboards Business/custom apps (structured) Load to SQL DW Extract and transform relational data Azure Data Factory Azure Data Factory
Anheuser Busch build global analytics platform on Azure Global Analytics Platform Before One EDW for all functional data Challenge Siloed data across organization with finance, marketing, sales and support teams building separate data platforms and divergent analytics solutions across multiple platforms Impact One platform for all data Advanced analytics helps optimize onboard water storage, saving $200k per ship annually Power BI Visual insights Azure SQL Data Warehouse Multiple data platforms Insights from aggregated data
Reckitt Benckiser empowering 40,000 employees to work smarter using Big Data Azure SQL Data Warehouse + Power BI Before Challenge Global consumer goods company struggled with poor performance on their existing business intelligence solution Impact Easily accessible data insights with Azure SQL Data Warehouse and supporting Azure services Increased cost savings on licensing and maintenance which paid for the new platform in the first year Hybrid business intelligence solution with Microsoft Power BI and Azure Front-end data consumption Insights on desktop & mobile ? Power BI Azure Analysis Services & SSRS Cache Back-end data storage Azure SQL Data Warehouse Disconnect between data storage and consumption Integrated front-end and back-end data
Top data warehouse migration motivations • Cost savings/avoidance • End-of-life hardware • Application modernization • Application growth, solution expansion • Platform consolidation
Migration types: Full • Full migration of entire DW solution • Original DW decommissioned after new DW goes live • Involves • Schema and data • Applications; usually minimal if using common BI tools • Infrastructure and security • ETL conversion • People
Migration types: Offload • Migrate some workloads off primary data warehouse (DW) • Long term retirement roadmap for original DW; typically years • Involves • Schema and data • Applications; usually minimal to no change depending on approach • Infrastructure and security • Gradual ETL migration; start with data only needed on offload DW • People
Migration lifecycle SQL Server Migration Assistant (SSMA) Process applicable to POC, pilot and broad production rollout
Migration steps Assess Prepare Migrate Optimize • Validate schema, data and application • Schema & data type optimization • Infrastructure and security • Administration tools, scripts, operations and knowledge • Success or go/no-go criteria • Post migration tasks • Documentation • Security signoff • Migration execution • Post migration fixes/re-configurations • Data synchronization/catchup (for pilots) • Go/no-go decision (for pilots) • Functional validation • Performance baseline • Performance tuning (query, schema, data, storage) • Post migration optimizations • HA/DR • Security • Verify feasibility (data volume, architecture, workload pattern, latency) • Identify and understand required changes to DW, ETL, application, infrastructure and security • Understand how to leverage new features • Get and validate security requirements/restrictions
SSMA – Schema Migration Engine Customized Conversions Export as SSDT project for refactoring
Demo Oracle to Azure SQL Data Warehouse migration Shamik Gosh
Rich partner network Trusted ecosystem to accelerate time to value SYSTEM INTEGRATION DATA INTEGRATION & MIGRATION BI & ANALYTICS
Azure SQL Data Warehouse - StrengthsMigrations Best practices
Enterprise Grade Cloud Data Warehouse – Strengths On-demand Secure Scalable 4,000+ core compute scale Infinite data storage Compute scale independent of storage Adaptive caching Instant data movement 128 active concurrent queries, 1024 connections per cluster Automated provisioning in seconds Re-size compute in minutes Create multiple instances Pause and Resume warehouses Consume compute and storage independently Virtual Networks Firewalls Azure Active Directory, MFA Transparent Data Encryption Object level security Azure Key Vault Auditing and Threat Detection Vulnerability Assessment Compliance
Enterprise Grade Cloud Data Warehouse - Strengths Backup / Restore Provisioning Administration Tuning Automatic columnar index optimization Auto-create statistics Automatic Intelligent Insights One-click scale-up during peak demand Maintenance Windows Multi-level cost-based query optimization Graphical execution plans Resource utilization/contention Data distribution/skew Automated backup snapshots every 8 hours Geo-redundant backups every 24 hours User-defined restore points Provision and load through integrated portal Portal and programmatic provisioning Provision in 33 global regions
Explorations and ADS (App Design Spec) – Considerations • Performance – Query & Loading • Architecture – traditional EDW, cloud DW/Lambda • Workloads – interactive, direct query, standard reports, data scientists • Scale • DWU is a performance measure (CPU, memory, throughput) • Measure baseline and scale • Security • Data privacy & security • Clients/workloads • ISV vs in-house development
Explorations and ADS – Red-herrings • Concurrency • Over indexing on concurrent query execution • Focus on query performance and scale, including scale out • Feature compatibility • Goal should not be product functional parity – focus on business requirements • Azure Platform goodness
Best Practices – POC/pilot/production • Data loading performance • Load data via Polybase, SSIS, ADF (~6TB /hour) • Start with medium or large RC (or static RC equivalents) for loading and work down • Optimize distribution for joins – even data distribution must consider query pattern • Query performance • Statistics are critical; extra care with very large tables • Start with SmallRC (or low static Resource Class) for queries and work up
Best Practices – POC/pilot/production • Scale • Start with DW 1500c, move up/down • Identify bottleneck: IO, CPU, data movement • Account for cache warm up patterns in Gen2 • Clients/workloads • Connectivity • Efficiency; bad queries don’t automagically get better • Count (sheer number of workloads, ETL packages, code, etc…) • Pay careful attention to ETL • Security • Row level security • VNET
Session resources • Migration Guide Public Preview: datamigration.microsoft.com • Azure Database Migration Service (Limited Preview) • Preview signup: aka.ms/migrating • Feedback alias: dmsfeedback@microsoft.com • Channel 9 Video: Oracle migrations; Azure SQL Database migrations • SQL Server Migration Assistant • Blog:Released: SQL Server Migration Assistant (SSMA) v7.4 • Feedback alias: ssmahelp@microsoft.com • Data Experimentation Assistant: Download • Find a partner: http://migration/Pages/SearchPartners.aspx • Azure SQL Data Warehouse expert help: asksqldw@service.microsoft.com • Data Migration Team: DMPMs@microsoft.com • Data Migration Team Blog: blogs.msdn.microsoft.com/datamigration
SQL Data Warehouse Ignite sessions BRK2204 - Azure Data Factory - Enabling modern data integration in the cloud BRK3186 - Turbocharged analytics with SQL Data Warehouse Gen2 BRK3149 - In the security trenches of Azure SQL Database and Azure SQL Data Warehouse BRK3326 - Azure Data Lake Storage Gen 2: Enhancing big data analytics on Azure BRK3327 - Migration experience from an on-premises enterprise data warehouse to Azure SQL Data Warehouse BRK2408 - Thousands of Azure data warehousing success stories BRK2303 - Rubikloud's journey to build the modern data warehouse with Azure SQL Data Warehouse and Spark THR2156 - Azure SQL Data Warehouse: Tuning your DW for optimal performance THR2181 - Azure SQL Data Warehouse tips and tricks
Please evaluate this sessionYour feedback is important to us! Please evaluate this session through MyEvaluations on the mobile appor website. Download the app:https://aka.ms/ignite.mobileApp Go to the website: https://myignite.techcommunity.microsoft.com/evaluations