Unbiased view of bringing Synapse Analytics and Azure Databricks together

Elizabeth Antoine

Regional Analytics Leader @ Microsoft | Board Director @ Avivo: Live Life | Executive MBA

Published: October 15, 2021

Disclaimer. This article represents the personal experience and understanding of the authors. It does not represent the official position of Microsoft.

There are many challenges that prevent organizations from realizing their advanced analytics mission:

There are so many advanced analytics solutions and offerings out there, many of which are difficult to understand and implement.
Siloed data across teams and departments inhibits the development of unified data pipelines.
Scaling challenges and performance constraints often represent a cost and implementation barrier for advanced analytics teams.

As Azure Synapse brings the worlds of data warehousing, big data, and data integration into a single unified analytics platform, there is continued investment in improving performance for Apache Spark workloads in Azure Synapse.

Spark in Azure Synapse Analytics is the OSS Apache Spark distribution with additional Microsoft proprietary optimizations. It is also deeply integrated into Azure Synapse, benefits from a unified security, networking, monitoring, CI/CD, and management experience, and meets strict JEDI compliance requirements.

Azure Databricks provides the premium Spark experience targeting data engineering, data science, and data analysis on Azure, and contains unique Databricks IP that is not available in the OSS Apache Spark distribution. Capabilities unique to Azure Databricks include a Databricks-optimized high-performance Spark engine, managed Delta Lake, and, with MLflow, an enterprise data science workspace with collaborative notebooks.

We have made our first attempt to create a decision tree that gives an unbiased view of bringing Synapse and Azure Databricks together. You can access this interactive Decision Tree by following this link: Azure Synapse And Azure Databricks

Below are some of the things we took into consideration while creating this decision tree:

Differences and preferences originate in the technology itself: the two services were built for different things
Synapse is meant to solve the “stitching together” problem, but its core is nevertheless a data warehouse
Databricks is built for massive processing on read (applying transformations while reading)

Write path:

Synapse – ELT
Databricks – large ETL volumes

Read path:

Synapse – lots of “smallish” queries with a substantial number of joins
Databricks – large volumes of data with few joins and processing on read
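
The write-path and read-path preferences above can be condensed into a toy heuristic. The following is a minimal sketch only, not the logic of the interactive Decision Tree: the `Workload` fields, the `suggest_engine` function, and the simple point counting are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical simplifications of the criteria listed above.
    transform_pattern: str    # "ELT" or "ETL" (write path)
    large_volumes: bool       # True when queries touch large volumes of data
    many_joins: bool          # True when queries contain a substantial number of joins
    processing_on_read: bool  # True when transformations are applied while reading


def suggest_engine(w: Workload) -> str:
    """Return "Synapse", "Databricks", or "Either" by counting matching criteria."""
    synapse = 0
    databricks = 0

    # Write path: Synapse leans towards ELT, Databricks towards large ETL volumes.
    if w.transform_pattern == "ELT":
        synapse += 1
    elif w.transform_pattern == "ETL" and w.large_volumes:
        databricks += 1

    # Read path: "smallish" queries with many joins lean towards Synapse;
    # large volumes with few joins lean towards Databricks.
    if w.many_joins and not w.large_volumes:
        synapse += 1
    if w.large_volumes and not w.many_joins:
        databricks += 1

    # Processing on read is listed as a Databricks strength.
    if w.processing_on_read:
        databricks += 1

    if synapse > databricks:
        return "Synapse"
    if databricks > synapse:
        return "Databricks"
    return "Either"


if __name__ == "__main__":
    reporting = Workload("ELT", large_volumes=False, many_joins=True, processing_on_read=False)
    print(suggest_engine(reporting))
```

A tie is reported as "Either" on purpose: real workloads are usually mixed, which is the case for bringing the two services together rather than choosing one.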

A feature-by-feature comparison generally doesn’t make a lot of sense, but:

Time travel is only available in Databricks
Databricks provides a more sophisticated security model on Spark than Synapse
Native Column-Level Security, Row-Level Security, and Dynamic Data Masking (without building views and with full integration with AAD) are only available in the Synapse Dedicated SQL Pool
Synapse provides some sort of DR on top of Storage (DDL / definition-wise).
Note: Synapse storage is the same price as storage accounts

You can access this interactive Decision Tree by following this link: Azure Synapse And Azure Databricks. You can also provide feedback or submit questions in a public GitHub repository. Thank you and have a very pleasant day!

@Elizabeth and @Andrei

Simplicity is the ultimate sophistication. -- Leonardo da Vinci

Pradeep Dadlani

Data Strategy and Architecture | Data Platforms | Data Management

2 years ago

Interesting article this one Elizabeth Antoine. I have always read/seen the two tech stacks along the lines as described. Good to see someone who is closely involved present an unbiased view.

Elizabeth Antoine

Regional Analytics Leader @ Microsoft | Board Director @ Avivo: Live Life | Executive MBA

3 years ago

We (Andrei Zaichikov, Elizabeth Antoine, Eleni Santorinaiou) are often asked how up to date https://albero.cloud/ is, and we saw the same question in the comments. We are happy to announce that our post-Ignite review is finished. We have added all updates from Ignite and improved loads of things for you. 21 issues are closed. Major updates include a rework of our Synapse & Databricks Decision Tree, Auto-Scaling capabilities added directly to the Main DT, a rework of the Modern Data Analytics DT, and more. In addition, we have added a list of all major public datasets available in Azure (look into the Useful Materials section).

Andrei Zaichikov

Director, Enterprise Technology Strategy, EMEA at Pure Storage

3 years ago

Ramki, Rodney, as for the cost: this is indeed something we are considering, but as Elizabeth mentioned, it is extremely complex product-wise. It also has one more dimension, which is people. Insufficient skills can lead to excessive consumption that has nothing to do with the functional fit of the service. On the contrary, if a service fits the purpose, it will be used more effectively and optimally. Unfortunately, quite often the fake “simplicity” of the picture hides actual complexity and future troubles. This is what we would like to avoid by providing more robust and comparable criteria. Thank you once again for your feedback.
