Happy World Azure Synapse Day!
12/11 - Azure Synapse Day!

What is Azure Synapse Analytics?

In summary, it is a single web-based cloud service that combines traditional data warehousing with big data analytics! Everything is contained in one place, where you can extract, transform, prepare, manage and serve data for BI and machine learning purposes at significant scale!

What are the components of Azure Synapse Analytics?

Azure Synapse Studio:

A single console lets you orchestrate and manage all aspects of Azure Synapse Analytics. From this tool you can ingest data using pipelines, transform data using mapping data flows, analyse data using SQL or Spark jobs, or simply visualise data through the integrated Power BI service.

Orchestration:

Azure Synapse comes with a fully integrated data orchestration service that is almost identical to Azure Data Factory. If you’ve come across Linked Services, Datasets, Integration Pipelines and Mapping Data Flows this will be familiar to you too! It’s basically an ETL service that allows for serverless data integration and data transformation at scale all within one environment.
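To make the relationship between those building blocks concrete, here is a minimal conceptual sketch in Python. It is not the Azure SDK or the real pipeline JSON schema; every class and name below (LinkedService, Dataset, CopyActivity, Pipeline, "landing_blob" and so on) is a hypothetical stand-in, chosen only to show that a pipeline runs activities, activities read and write datasets, and datasets point at linked services.

```python
from dataclasses import dataclass, field


@dataclass
class LinkedService:
    """A connection definition, e.g. to a storage account or database."""
    name: str
    kind: str  # hypothetical label such as "BlobStorage" or "SqlPool"


@dataclass
class Dataset:
    """A named view of data that lives behind a linked service."""
    name: str
    linked_service: LinkedService
    path: str


@dataclass
class CopyActivity:
    """One pipeline step that moves data from a source to a sink dataset."""
    name: str
    source: Dataset
    sink: Dataset


@dataclass
class Pipeline:
    """An ordered collection of activities."""
    name: str
    activities: list = field(default_factory=list)

    def describe(self):
        """Return one human-readable line per activity."""
        return [
            f"{a.name}: {a.source.linked_service.name}/{a.source.path}"
            f" -> {a.sink.linked_service.name}/{a.sink.path}"
            for a in self.activities
        ]


# Hypothetical example: copy a CSV file from blob storage into a SQL pool table.
blob = LinkedService("landing_blob", "BlobStorage")
pool = LinkedService("sales_pool", "SqlPool")
raw_sales = Dataset("raw_sales", blob, "sales/2020.csv")
sales_table = Dataset("sales_table", pool, "dbo.Sales")
pipeline = Pipeline("load_sales", [CopyActivity("copy_sales", raw_sales, sales_table)])
```

Calling pipeline.describe() returns one line per activity showing where the data comes from and where it lands, which is essentially what the pipeline canvas in Synapse Studio shows visually.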

Compute:

At its core, Azure Synapse Analytics allows for massively parallel processing (MPP) using pools.

On the SQL side, there are SQL pools. A SQL pool lets you execute intensive SQL queries massively in parallel. The control node of the MPP engine splits a query across the compute nodes you have provisioned, which run it in parallel over the 60 distributions that the data is sharded into, before the results are collated and returned to you at speed.
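The control node / compute node / distribution flow can be illustrated with a small Python simulation. This is a toy model rather than how the engine is actually implemented: the CRC32 hash, the round-robin placement of distributions on nodes, and the function names are all assumptions made for illustration. The only figure taken from the service is the 60 distributions.

```python
from collections import defaultdict
import zlib

NUM_DISTRIBUTIONS = 60  # data in a provisioned SQL pool is spread over 60 distributions


def distribution_for(key):
    """Map a distribution-key value to a distribution (toy hash, an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_DISTRIBUTIONS


def assign_to_nodes(num_compute_nodes):
    """Spread the 60 distributions over the compute nodes, round-robin."""
    nodes = defaultdict(list)
    for d in range(NUM_DISTRIBUTIONS):
        nodes[d % num_compute_nodes].append(d)
    return dict(nodes)


def parallel_sum(rows, num_compute_nodes):
    """Simulate SELECT customer, SUM(amount) ... GROUP BY customer."""
    # 1. Each row lands in a distribution according to the hash of its key.
    shards = defaultdict(list)
    for customer, amount in rows:
        shards[distribution_for(customer)].append((customer, amount))

    # 2. Each compute node aggregates only the distributions it owns.
    partials = []
    for _node, owned in assign_to_nodes(num_compute_nodes).items():
        local = defaultdict(float)
        for d in owned:
            for customer, amount in shards.get(d, []):
                local[customer] += amount
        partials.append(local)

    # 3. The control node merges the partial results into the final answer.
    totals = defaultdict(float)
    for local in partials:
        for customer, amount in local.items():
            totals[customer] += amount
    return dict(totals)
```

Running parallel_sum with 1, 4 or 60 compute nodes returns the same totals; what changes is how many distributions each node has to work through, which is why adding compute shortens heavy queries.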

These SQL pools can be either serverless (on-demand) or provisioned. The serverless query service suits data exploration or ad-hoc analysis of your data. Its pricing is based on the amount of data each query processes rather than on the number of DWUs allocated to the instance. For more demanding workloads you can provision the amount of compute you require in data warehouse units (DWUs), which are charged per unit per hour rather than per query.
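A rough way to compare the two pricing models is to work out where they break even. The sketch below assumes the on-demand charge scales with the terabytes of data scanned and the provisioned charge scales with DWUs multiplied by hours. The rates in the example are hypothetical placeholders, not Azure list prices, so substitute the current figures for your region.

```python
def serverless_cost(tb_processed, price_per_tb):
    """On-demand: pay for the data the queries scan (assumed per-TB rate)."""
    return tb_processed * price_per_tb


def provisioned_cost(dwu, hours, price_per_dwu_hour):
    """Provisioned: pay per DWU per hour while the pool is running."""
    return dwu * hours * price_per_dwu_hour


def break_even_tb(dwu, hours, price_per_dwu_hour, price_per_tb):
    """TB scanned at which on-demand costs the same as the provisioned pool."""
    return provisioned_cost(dwu, hours, price_per_dwu_hour) / price_per_tb


# Hypothetical placeholder rates, for illustration only.
PRICE_PER_TB = 5.0
PRICE_PER_DWU_HOUR = 0.5

pool_cost = provisioned_cost(dwu=100, hours=8, price_per_dwu_hour=PRICE_PER_DWU_HOUR)
threshold = break_even_tb(100, 8, PRICE_PER_DWU_HOUR, PRICE_PER_TB)
```

With these placeholder rates, a 100 DWU pool running for 8 hours costs the same as scanning 80 TB on demand. Below that volume the serverless option is cheaper; above it the provisioned pool wins.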

And what about Spark pools? These are very similar to provisioned SQL pools, in that you specify the amount of compute you require and when. Spark pools are used when you want to run Python, Scala or even C# scripts in a notebook for some intensive machine learning analysis!

And Finally…Power BI Integration:

Power BI integration is built almost fully into Azure Synapse Studio. This allows you to create new interactive Power BI reports and datasets directly from inside Azure Synapse Studio.
