Unbiased view of bringing Synapse Analytics and Azure Databricks together
Elizabeth Antoine
Regional Analytics Leader @ Microsoft | Board Director @ Avivo: Live Life | Executive MBA
Disclaimer: This article represents the personal experience and understanding of the authors. It does not represent the official position of Microsoft.
There are many challenges that prevent organizations from realizing their advanced analytics mission:
As Azure Synapse brings the worlds of data warehousing, big data, and data integration into a single unified analytics platform, there is continued investment in improving performance for Apache Spark workloads in Azure Synapse.
Spark in Azure Synapse Analytics is the OSS Apache Spark distribution with additional Microsoft proprietary optimizations. It is also deeply integrated into Azure Synapse, benefits from a unified security, networking, monitoring, CI/CD, and management experience, and meets strict JEDI compliance requirements.
Azure Databricks provides the premium Spark experience targeting data engineering, data science, and data analysis on Azure, and contains unique Databricks IP that is not available in the OSS Apache Spark distribution. Capabilities unique to Azure Databricks include a Databricks-optimized high-performance Spark engine, managed Delta Lake, and, with MLflow, an enterprise data science workspace with collaborative notebooks.
We have made our first attempt to create a decision tree that gives an unbiased view of bringing Synapse and Azure Databricks together. You can access this interactive Decision Tree by following this link: Azure Synapse And Azure Databricks
Below are some of the things we took into consideration while creating this decision tree:
Write path:
Read path:
A feature-by-feature comparison generally doesn’t make a lot of sense, but:
You can access this interactive Decision Tree by following this link: Azure Synapse And Azure Databricks and provide your feedback / submit questions in a public GitHub Repository. Thank you and have a very pleasant day!
@Elizabeth and @Andrei
Simplicity is an ultimate sophistication. -- Leonardo Da Vinci
Data Strategy and Architecture | Data Platforms | Data Management
2 years ago: Interesting article, Elizabeth Antoine. I have always read about and seen the two tech stacks along the lines described. Good to see someone who is closely involved present an unbiased view.
Regional Analytics Leader @ Microsoft | Board Director @ Avivo: Live Life | Executive MBA
3 years ago: We (Andrei Zaichikov, Elizabeth Antoine, Eleni Santorinaiou) are often asked how up to date https://albero.cloud/ is, and we saw the same question in the comments. We are happy to announce that our post-Ignite review is finished. We have added all updates from Ignite and improved many things for you. 21 issues are closed. Major updates include a rework of our Synapse & Databricks Decision Tree, Auto-Scaling capabilities added directly to the Main DT, a rework of the Modern Data Analytics DT, and more. In addition, we have added a list of all major public datasets available in Azure (see the Useful Materials section).
Director, Enterprise Technology Strategy, EMEA at Pure Storage
3 years ago: Ramki, Rodney, as for the cost: this is indeed something we are considering, but as Elizabeth mentioned, it is extremely complex product-wise. It also has one more dimension, which is people. Insufficient skills can lead to excessive consumption, which has nothing to do with the functional fit of the service. On the contrary, if the service fits the purpose, it will be used more effectively and optimally. Unfortunately, the fake “simplicity” of a picture quite often hides actual complexity and future troubles. This is what we would like to avoid by providing more robust and comparable criteria. Thank you once again for your feedback.