Azure Synapse Analytics - First Impression - Part 1

Azure Synapse Analytics is in public preview and has generated a lot of excitement since Microsoft Ignite 2019.

Before looking at Synapse Analytics itself, it helps to ask why it exists. Here are some of the issues and points of confusion that came before it.

  1. Azure came up with a Massively Parallel Processing (MPP) service called SQL Data Warehouse (similar to AWS Redshift and Google BigQuery). The first issue was the name of the product itself, and thanks to Microsoft for changing it. People frequently confused building a data warehouse on Azure (using SQL Server with a traditional star/snowflake schema) with the product itself, an MPP service that can handle data at petabyte scale.
  2. The second issue was the lack of a studio experience in the cloud where users could develop, query, and manage all relevant resources and services in one place. For example, an ETL/ELT pipeline could be a mix of ADF, notebooks for Spark (HDInsight), integration runtimes, datastores (MPP/SQL database), query capability (dedicated/serverless), reports, and machine learning. All of these services existed, but not under one roof. Synapse Analytics Studio brings them together and makes them easy to use and visualize. I think Microsoft should be given credit for this effort.
  3. The third issue was the missing Azure SQL Database serverless offering (SQL on-demand), similar to AWS Athena. The service became available in November 2019, and serverless and managed services are now part of Synapse Studio.

What is still confusing?

  1. Some people think Synapse Analytics is just SQL Data Warehouse, since Microsoft's documentation refers to Synapse Analytics as "previously SQL Data Warehouse". Synapse Analytics is much more than a data warehouse, and this is still confusing for many users. James Serra has written a nice article that clears up the confusion.
  2. Only one icon is available for Synapse Analytics. There should be a separate icon for each service. In AWS, for example, Redshift has one icon and Athena has another. Similarly, separate icons should exist for SQL pool and SQL on-demand.
  3. Integration of Delta/Spark tables with SQL on-demand. The SQL on-demand view displays Spark tables (Hive), but queries against these tables fail to execute.

What will I cover in the next article?

My experience using Spark notebooks.
