On March 8, 2023, SAP announced its next-generation data toolbox: SAP Datasphere.
Michael Johnson
Director of SAP Solutions at American Digital Corporation | SAP SME & Advisor
So what is SAP Datasphere?
SAP Datasphere is a unified experience for data integration, data cataloging, semantic modeling, data warehousing, data federation, and data virtualization. Data professionals can now efficiently distribute mission-critical business data, with business context and logic preserved, across their organization's data landscape.
So what REALLY is SAP Datasphere?
Well, given that this announcement happened about an hour ago, details are still hazy, and we've barely had time to digest the information. But here's what we know so far.
SAP Datasphere is part of SAP Business Technology Platform (BTP) and is built on SAP HANA Cloud. It's a comprehensive group of data services within BTP that enable a business data fabric architecture.
In plain terms, it's the evolution of SAP Data Warehouse Cloud. It will include new features, including a global data catalog, integrations and partnerships with Databricks, Confluent, DataRobot, and Collibra, and an advanced analytic model. In addition, it will provide access to data from different sources to create a cohesive data experience. SAP stated that as of March 8, 2023, all existing SAP Data Warehouse Cloud customers would be automatically updated to SAP Datasphere. In the future, SAP DWC will be known as Datasphere.
So why does rebranding matter?
It's not just a rebranding; it's also an expansion of the toolbox and capabilities. For one, this will be an open ecosystem, allowing other tools, partners, and users to connect to it and build on it.
There are also four crucial Business Data Fabric cornerstones
SAP Datasphere can pull various data from data lakes, lakehouses, data stores, etc., into a unified data access point in real time, with the proper data governance in a self-service model. SAP Datasphere follows a federated-first approach, meaning you leave data where it resides, build your models, and later decide whether you want to replicate full tables and create view persistency to cater to source system workloads, data egress, and performance.
This sounds great. What's the catch?
As mentioned, this announcement is extremely new, and details are thin, but here are some points that might be of interest.
Performance - How do we ensure Datasphere queries don't impact source system performance? SAP Datasphere follows a federated-first approach, meaning you leave the data where it resides, build your data models, and later decide whether to replicate and/or create view persistency (caches) to cater to source system workload and performance.
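To make the federated-first idea a bit more concrete, here's a minimal Python sketch. To be clear, this is not SAP's API: `FederatedView`, `fetch_remote`, `persist`, and `remote_calls` are hypothetical names made up for illustration, and the "source system" is just an in-memory list. It only shows the trade-off described above: every federated read hits the source, while a persisted view is served from a local snapshot.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FederatedView:
    """Hypothetical view: reads from the source unless it has been persisted."""
    name: str
    fetch_remote: Callable[[], list]          # stands in for a query against the source system
    _snapshot: Optional[list] = field(default=None, repr=False)
    remote_calls: int = 0                     # how often the source system was queried

    def _query_source(self) -> list:
        self.remote_calls += 1
        return list(self.fetch_remote())

    def persist(self) -> None:
        """Store a local copy of the result (a simplified stand-in for view persistency)."""
        self._snapshot = self._query_source()

    def drop_persistence(self) -> None:
        """Go back to pure federation."""
        self._snapshot = None

    def read(self) -> list:
        # Persisted: serve the local snapshot and leave the source system alone.
        if self._snapshot is not None:
            return list(self._snapshot)
        # Federated: the data stays where it resides and is queried on demand.
        return self._query_source()


source_table = [{"order_id": 1, "amount": 120.0}, {"order_id": 2, "amount": 80.0}]
view = FederatedView("open_orders", lambda: source_table)

view.read()
view.read()                # two federated reads, two source queries
view.persist()             # one more source query to build the snapshot
view.read()
view.read()                # served locally, no extra load on the source
print(view.remote_calls)   # 3
```

The flip side is freshness: rows added to the source after `persist()` won't show up until the snapshot is refreshed or dropped, which is why persisting is something you decide per model rather than by default.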
What about my existing investments in hyperscalers or other data lakes that sit alongside my SAP environment? SAP Datasphere can directly connect to hyperscaler data lakes like AWS S3, ADLS, and GCS in conjunction with Data Flow. When a corresponding SQL layer is overlaid, SAP also allows direct federation on data residing in those data lakes. So this means SAP Datasphere can be understood as an optional overlay on data lakes as well as SAP and non-SAP systems, thus allowing access to data wherever it resides.
SAP Datasphere can access data anywhere, which is cool, but what if our data is in various on-prem systems? Can you move on-prem data to another on-prem target via replication without the roundtrip to the cloud? SAP Datasphere focuses on replicating data into the cloud and distributing it further. As of today, SAP can only do this with the trip to the cloud, which makes data egress a point to consider if you want to move data directly from on-prem to on-prem. But we've been told SAP is planning for hybrid scenarios in which workloads would be executed on-prem while orchestrated in the cloud, avoiding that roundtrip in the future.
Speaking of on-prem, there is a downside to using this tool with on-premise systems. For on-prem S/4HANA and ECC systems, SAP will use SAP Landscape Transformation (DMIS portion) and CDS views to integrate data, but only in near real time, not real time. These will handle initial and delta replication. For other non-SAP on-prem solutions, SAP will enable seamless integration without requiring an on-prem agent. So it seems there's a bit of a disadvantage with SAP Data
What are some things SAP Datasphere is not?
Like most SAP product announcements, time will tell what its true power, capabilities, functionality, and adoption will be. This newborn is still too young for us to fully understand its place in the market. But I will be watching with great interest, and I hope you will be too!