Is Salesforce’s “Zero Copy” Data Cloud a Composable CDP?
Salesforce is touting the “Zero Copy architecture” of its Data Cloud CDP everywhere. In fact, Marc Benioff himself said the following during Salesforce's earnings call last week:
“We need to be able to, through our Zero Copy, automatically integrate into our Data Cloud
Sounds good, right? But if you take a second to think, this sentence itself is an oxymoron. If you don’t copy the data over, as "Zero Copy" implies, how can that data “integrate into” your data cloud? Makes no sense.
The other clue is that despite "Zero Copy", Salesforce never calls its CDP "Composable"… interesting!
So here’s the answer that makes all of the above make sense. Salesforce’s “Zero Copy” does not mean that data is not copied. To the contrary, in digging into the actual architecture, it turns out that Salesforce’s “Zero Copy” actually means that data is copied at least once, and likely several times, from their customer's Data Warehouse to Salesforce’s Data Cloud.
“Zero Copy” does NOT mean no copies. It means Many Copies!
However, Salesforce’s Data Cloud does hold promise… just not in the way Benioff promotes it.
So let's dive into what Zero Copy really is, and what Salesforce Data Cloud should be.
Quick Background on Composable CDPs
If you know about Composable CDPs you are welcome to move to the next section. But if not, please read on because "Zero Copy" is essentially Salesforce's (non) answer to Composability.
By now, you have probably heard that Gartner’s first-ever Magic Quadrant for Customer Data Platforms was announced last week. This Magic Quadrant was noteworthy for validating the Composable CDP model as the future of the CDP category. If you haven’t yet read the Magic Quadrant, you can get a free copy here.
Here’s a summary of what a Composable CDP is:
Why is Composability taking the CDP market by storm? As IT and Data Teams centralize all customer data around the cloud data warehouse (CDW), it becomes critical to avoid having multiple copies of data both across the customer’s own infrastructure and even more across 3rd party vendors like Salesforce. This is because data copies create security risks
ActionIQ is proud to be the only Composable CDP in the Magic Quadrant. In addition, we are the only vendor to offer a Hybrid capability, where a combination of Composable and Bundled architecture can co-exist with an ActionIQ CDP giving enterprises a future-proof solution that works wherever they may be in their data warehouse maturity journey.
The Salesforce (Non) Answer to Composability: “Zero” Copy
Back to our friends at Salesforce. As mentioned above, “Zero Copy”, as Salesforce is using the term, is not what it sounds like. Instead, it is just a faster way to copy data.
So what’s actually happening here? What Salesforce is calling “Zero Copy” is a technology that has been around forever but with a different name. It used to be called Copy-on-Write and there’s even a Wikipedia article about it. From that article (truncated for clarity):
“Copy-on-write (COW) is a technique used in computer programming to efficiently implement a "duplicate" or "copy" operation”
Essentially, Salesforce’s “Zero Copy,” a.k.a. Copy-on-Write, means the following: if you are trying to copy data from one database (Snowflake, specifically) to Salesforce’s internal Snowflake instance, “Zero Copy” creates a “pointer” from the destination to the source. But here’s the catch: as the destination (Salesforce) starts to integrate that data, it copies it over anyway. So the copy is faster to start, but the copying process as a whole takes longer to complete. At the end of the day, it’s still a copy!
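To make the copy-on-write idea concrete, here is a minimal sketch in Python. It is a toy illustration of the general technique only, not Salesforce's or Snowflake's implementation; the `CowTable` class and the sample rows are hypothetical names made up for this example.

```python
class CowTable:
    """Toy copy-on-write table: shares the source rows until the first write."""

    def __init__(self, source_rows):
        # Only a reference ("pointer") to the source is stored; nothing is copied yet.
        self._rows = source_rows
        self._owns_copy = False

    def read(self, index):
        return self._rows[index]

    def write(self, index, value):
        if not self._owns_copy:
            # The deferred full copy happens here, on the first modification.
            self._rows = list(self._rows)
            self._owns_copy = True
        self._rows[index] = value

    def shares_storage_with(self, source_rows):
        return self._rows is source_rows


warehouse_rows = ["alice", "bob", "carol"]   # stand-in for the source table
clone = CowTable(warehouse_rows)

shared_before = clone.shares_storage_with(warehouse_rows)  # still pointing at the source
clone.write(1, "BOB")                                      # first write triggers the copy
shared_after = clone.shares_storage_with(warehouse_rows)   # now a separate copy
```

Creating the clone is instant because it only stores a reference, but the first transformation forces a full copy of the rows. The copy is postponed, not eliminated.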
This is a nice-to-have feature, but architecturally it changes nothing since data is still copied and transformed.
And by the way, it only works if your CDW is Snowflake since “Zero Copy” is Snowflake’s technology.
So Salesforce’s “Zero” Copy is still a full data copy, meaning:
So while Composability promises a "Never Copy" architecture, Salesforce's Zero Copy means Many Copies: once into Salesforce's Snowflake instance (landing zone), then copied & transformed into the Salesforce data lake, and then many times over to the different Salesforce Apps.
But the story doesn’t end there. Once Salesforce copies all your enterprise data into their data cloud, there’s nowhere to go but Salesforce itself. Salesforce internally has a Snowflake instance to receive the Zero Copies but then has another, additional data lake built on Iceberg. And then every other Salesforce product has its own storage and backend. It’s a true multi-copy environment!
So let’s say you accept the fact that Salesforce will copy everything over. Then what? Well, once your data gets copied over to the Salesforce Data Cloud, it has nowhere to go but to be further copied or activated exclusively via Salesforce applications (something that Gartner called out in the CDP MQ). This means perpetual lock-in to legacy Salesforce apps.
No wonder Salesforce is giving the Data Cloud away for free, left and right: Data Cloud is using your own Enterprise data to create the ultimate lock-in play for Salesforce.
The bottom line? Salesforce Data Cloud is neither Zero Copy nor a CDP. And to date, I have yet to meet a successful production customer of Salesforce’s “Data Cloud”. It’s a great vision, but like many Salesforce promises, it’s still very much a vision. And unfortunately, this vision is using an architecture that’s already legacy (copy-on-write) and leads to a place where their customers will have no choice outside the Salesforce suite of products.
Salesforce’s Data Cloud Could Be Useful… Just Not How Salesforce Wants It
Now, don’t get me wrong: many, if not most, of our Enterprise clients use Salesforce products. Some of them are great. And, if Salesforce surprises us by executing well on a home-built product, Salesforce Data Cloud could actually be useful if you are already a big multi-cloud Salesforce customer, but not in the way Salesforce advertises it.
The biggest opportunity for Salesforce Data Cloud is to become a data integration layer
If you are not a Salesforce customer you may be surprised to hear that so many silos exist within Salesforce's "Cloud". The reason is that most of Salesforce’s products are the result of acquisitions; and even the ones developed in-house were historically independent from each other, using different databases and data models/schemas.
If Salesforce now promises to integrate all that data in one place, great! That’s not only helpful, but overdue. However, this is not an Enterprise CDP; rather, it’s a data source: the Salesforce customer data source! And this data belongs in the Enterprise’s Cloud Data Warehouse, where it can be integrated with other enterprise customer data, and not the other way around.
But instead, Benioff wants to hold the Salesforce-sourced customer data hostage and demand that the rest of the Enterprise customer data be handed over to him so that he can lock it up within the Salesforce cloud. And that’s just wrong, architecturally and commercially, not to mention the lack of honesty in this approach.
What about “Warehouse Sync” or Warehouse “Connectors”?
Salesforce is using Zero Copy to get away with lacking a Composable architecture. But Bundled CDP vendors, also unable to evolve their own architecture, have come up with other creative terminology to avoid solving the problem for their customers.
A popular one is using the term “Warehouse Sync” which, like “Zero Copy,” has nothing to do with Composability or not copying data. Warehouse Sync is just a trivial way to extract data from the data warehouse and copy it over to the Bundled CDP. Think about the old way you used to “sync” your iPod to iTunes on your computer. Nothing more or less than that.
Warehouse Sync still requires the CDP using it to:
I still remember the days when we called Warehouse Sync, simply, a warehouse “integration”. Even if that integration is bidirectional, it changes nothing because the CDP still needs to maintain a full copy of customer data to do its job.
When any CDP vendor in the Magic Quadrant other than ActionIQ talks about Warehouse Data/Connector/Sync or even Composability, it means a simple, old-school integration that copies data from the DW to the CDP. And vendors will go to great lengths to hide this simple fact, so in your evaluations you need to make sure to ask very direct and pointed questions to understand if, at the end of the day, data is being copied/moved or not.
Short of a fully Composable architecture, you will find that most fancy new terms like "Data Warehouse Sync" just represent a good ol’ Data Warehouse Integration that just copies data around.
ActionIQ is the only Composable CDP in the Gartner Magic Quadrant
We understand how big of a problem data copies can be. From security risks, to higher costs, to governance nightmares, our customers tell us about the challenges regularly. It’s why we invested so much energy into solving it. We’re proud that ActionIQ is the only CDP in Gartner’s Magic Quadrant that gives the customer a true choice of using their CDW for storage and processing. If you want to keep all your customer data in your CDW, we just become a UI & orchestration/activation layer on top of that, copying (truly, this time!) zero data. And this works for real, today, at some of our Enterprise customers: Atlassian, Albertsons, Brightspeed, and DoorDash, to name just a few.
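As a rough illustration of what “a layer on top of the CDW” can mean, the sketch below runs an audience definition as a query inside the warehouse and returns only the matching ids. It is not ActionIQ's implementation: an in-memory `sqlite3` database stands in for the cloud data warehouse, and the `customers` table, its columns, and the `build_audience` function are hypothetical names chosen for the example.

```python
import sqlite3

# Assumption: an in-memory sqlite3 database stands in for the customer's CDW.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE customers (id INTEGER, email TEXT, lifetime_value REAL)"
)
warehouse.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "a@example.com", 120.0),
        (2, "b@example.com", 40.0),
        (3, "c@example.com", 300.0),
    ],
)


def build_audience(conn, min_lifetime_value):
    """Evaluate the segment inside the warehouse and return only matching ids."""
    cursor = conn.execute(
        "SELECT id FROM customers WHERE lifetime_value >= ? ORDER BY id",
        (min_lifetime_value,),
    )
    return [row[0] for row in cursor]


# The customer table itself never leaves the warehouse; only the query result does.
high_value_ids = build_audience(warehouse, 100.0)
```

The design point is that the segment logic is pushed down to where the data already lives, so the layer on top does not need its own persistent copy of the customer table.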
And for those enterprises that aren’t ready with all of their data neatly loaded into a cloud data warehouse? ActionIQ offers the only Hybrid Composable CDP, which is ideal for most Enterprises: you use your CDW as the main storage and processing layer for the Customer Data that’s there, and use ActionIQ’s built-in storage for the data that’s not. The marketing and CX end-users won’t notice the difference. All they’ll see is access to more data. As more of your customer data makes its way into the CDW, you can seamlessly have ActionIQ migrate into leveraging the CDW more and more with no disruption to the business users, eventually becoming 100% Composable.
If you want to learn more, you should really talk to our team. As you can tell, we’re pretty passionate about this stuff. And for serious Enterprise buyers we readily offer demos and free PoCs to put our product where our mouth is!
Databricks Engineer | Marketing & Personalization | Partner & Consultant
(4 months ago) Thanks for a great article. I am missing a few details on ActionIQ and how you actually succeed with “Zero Copy”. Personally, I would have liked a few more details on how you make this work with the mix of real-time and batch data. Can you solve use-cases on web where the page load needs to be as fast as possible and as a minimum below 100ms? Would have loved your perspective on some of your competitors like Hightouch? Just a few thoughts…
Data Engg & MLOps for Risk models @ BMO | Scaling Data teams | ETL, Data Products, Data Platforms, MLOps, GenAI, Project management
(10 months ago) I am starting to learn about CDP. Why should companies buy a CDP as a standalone product? I am assuming we have enterprise data lakehouses and data engineering team to support.
MBA | Senior Architect,Commercial Execution & Strategy (MarTech, Omnichannel Service & Web Technologies)
(10 months ago) Agree with most of your pointers. Having 1st hand exposure to Data Cloud implementation now, my view is that the concept of DC is fabulous but the product and the engineering of the offerings has miles to go before it is fully baked... Next 8-9 months will be critical for DC, with a few of the large enterprises having proposed go-lives scheduled, and the proof will be in the pudding...
SVP, Engineering @ AI Squared
(1 year ago) Tasso Argyros - Great article. When we enable data activation to third-party tools, it essentially involves replicating data into those business applications. Given this, it seems that a completely "zero copy" Customer Data Platform (CDP) architecture might not be feasible. Is this a fair understanding?
Vice President - IT Risk Officer Electronic and Algorithmic Trading
(1 year ago) Good points