1 / 25

What is the Azure SQL Datawarehouse ?

Learn about Azure SQL Data Warehouse, a massively parallel processing distributed database system that combines SQL Server with Azure cloud scale-out capabilities. Discover how it decouples storage from compute, enables predictable and scalable performance, and complies with various security requirements.

kristib
Download Presentation

What is the Azure SQL Datawarehouse ?

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Vitor Fava What is the Azure SQL Datawarehouse?

  2. Whatis a DataWarehouse? Whatisthe SQL DataWarehouse? SQL DataWarehouse Architecture Managing a SQL DataWarehouse Agenda

  3. MVP Data Platform SQL Server Database Consultant at Pythian fava@pythian.com http://vfava.wordpress.com http://www.youtube.com/vitortff https://groups.google.com/group/sqlmaniacs Vitor Fava

  4. Whatis a DataWarehouse?

  5. Whatisthe SQL DataWarehouse?

  6. Combines the SQL Server relational database with Azure cloud scale-out capabilities; Decouples storage from compute; Enables increasing, decreasing, pausing, or resuming compute; Integrates across the Azure platform; Utilizes SQL Server Transact-SQL (T-SQL) and tools; Complies with various legal and business security requirements such as SOC and ISO; Whatisthe SQL DataWarehouse?

  7. Predictable and scalable performance With Data Warehouse Units

  8. SQL Data Warehouse is a massively parallel processing (MPP) distributed database system; Behind the scenes, SQL Data Warehouse spreads your data across many shared-nothing storage and processing units; The data is stored in a Premium locally redundant storage layer on top of which dynamically linked Compute nodes execute queries; SQL Data Warehouse takes a "divide and conquer" approach to running loads and complex queries; Requests are received by a Control node, optimized for distribution, and then passed to Compute nodes to do their work in parallel; Massively parallel processing architecture (MPP)

  9. Grow or shrink storage size independent of compute; Grow or shrink compute power without moving data; Pause compute capacity while leaving data intact, only paying for storage; Resume compute capacity during operational hours; Massively parallel processing architecture (MPP)

  10. Massively parallel processing architecture (MPP)

  11. A distribution is the basic unit of storage and processing for parallel queries that run on distributed data; When SQL Data Warehouse runs a query, the work is divided into 60 smaller queries that run in parallel; A data warehouse with maximum compute resources has one distribution per Compute node; A data warehouse with minimum compute resources has all the distributions on one compute node. Distributions

  12. Hash-distributed tables; Round-robin distributed tables; Replicated Tables; Distributions

  13. Each row belongs to one distribution; A deterministic hash algorithm assigns each row to one distribution; The number of table rows per distribution varies as shown by the different sizes of tables; Hash-distributed tables

  14. Hash-distributed tables

  15. A round-robin distributed table distributes data evenly across the table but without any further optimization; A distribution is first chosen at random and then buffers of rows are assigned to distributions sequentially; It is quick to load data into a round-robin table, but query performance can often be better with hash distributed tables; Round-Robin distribution tables

  16. Round-Robin distribution tables

  17. A replicated table provides the fastest query performance for small tables; A table that is replicated caches a full copy of the table on each compute node; Replicated tables are best utilized with small tables; Replicated tables

  18. Replicated tables

  19. Uses T-SQL syntax for many operations; It also supports a broad set of traditional SQL constructs, such as stored procedures, user-defined functions, table partitioning, indexes, and collations; Contains various newer SQL Server features, including clustered columnstore indexes, PolyBase integration, and data auditing (complete with threat detection); Certain T-SQL language elements that are less common for data warehousing workloads, or are newer to SQL Server, may not be currently available; Built on SQL Server

  20. Tools toimport data

  21. Azure Portal; SQL Server Data Tools Powershell; SQLCMD; Management Tools

  22. It works with the MPP architecture to guarantee the speed in order fulfillment; You can scale storage and processing power independently; We can pause the service when it is not needed and lower the costs; Support for T-SQL, which facilitates the use of the service; Conclusion

  23. Just like Jimi Hendrix …  We love to get feedback Please complete the session feedback forms

  24. SQLBits - It's all about the community... Please visit Community Corner, we are trying this year to get more people to learn about the SQL Community, equally if you would be happy to visit the community corner we’d really appreciate it.

  25. Vitor Fava fava@pythian.com Obrigado / ThankYou !!!!!

More Related