Is Salesforce’s “Zero Copy” Data Cloud a Composable CDP?

Salesforce is touting the “Zero Copy architecture” of its Data Cloud CDP everywhere. In fact, Marc Benioff himself said the following during Salesforce's earnings call last week:

“We need to be able to, through our Zero Copy, automatically integrate into our Data Cloud, all of those systems and then seamlessly provide that data back into these amazing tools.”

Sounds good, right? But if you take a second to think about it, the sentence contradicts itself. If you don’t copy the data over, as "Zero Copy" implies, how can that data “integrate into” your Data Cloud? It makes no sense.

The other clue is that despite "Zero Copy", Salesforce never calls its CDP "Composable"… interesting!

So here’s the answer that makes all of the above make sense. Salesforce’s “Zero Copy” does not mean that data is not copied. On the contrary, once you dig into the actual architecture, it turns out that Salesforce’s “Zero Copy” means that data is copied at least once, and likely several times, from the customer's data warehouse to Salesforce’s Data Cloud.

“Zero Copy” does NOT mean no copies; it means Many Copies!

However, Salesforce’s Data Cloud does hold promise… just not in the way Benioff promotes it.

So let's dive into what Zero Copy really is, and what Salesforce Data Cloud should be.

Quick Background on Composable CDPs

If you know about Composable CDPs you are welcome to move to the next section. But if not, please read on because "Zero Copy" is essentially Salesforce's (non) answer to Composability.

By now, you have probably heard that Gartner’s first-ever Magic Quadrant for Customer Data Platforms was announced last week. This Magic Quadrant was noteworthy for validating the Composable CDP model as the future of the CDP category. If you haven’t yet read the Magic Quadrant, you can get a free copy here.

Here’s a summary of what a Composable CDP is:

  • A Composable CDP needs to provide all the core functionality of a traditional, Non-Composable (“Bundled”) CDP, i.e. audiencing, journeys, orchestration, etc., through a no-code interface. Marketers shouldn’t have to know SQL to build an audience or customer experience.
  • A Composable CDP should leverage any data warehouse (Databricks, Snowflake, Teradata, Redshift, etc.) for all storage and processing; i.e. the storage and processing component of the CDP can be decoupled from the UI and the rest of the CDP functionality. This is important to prevent data copies (for real, not just using funny words), reduce costs, and support the spirit of data centralization that the cloud data warehouse represents.
  • Finally, secondary CDP functions like Identity Resolution or Customer Data Infrastructure (CDI) can also be decoupled from the CDP if the customer desires. For example, AWS Entity Resolution allows Identity Resolution to take place in the data warehouse versus in the CDP. Brands should be able to choose what works best for them and their constantly maturing data strategy.
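To make the first two points concrete, here is a minimal sketch of what "no-code on top, warehouse underneath" could look like. It is an illustration under stated assumptions, not ActionIQ's or any vendor's implementation: the function names `build_audience_sql` and `run_in_warehouse`, the `customers` table, and its columns are all hypothetical, and an in-memory SQLite database stands in for a real cloud data warehouse.

```python
import sqlite3


def build_audience_sql(min_orders):
    """Hypothetical: the SQL a no-code audience builder might generate."""
    sql = (
        "SELECT customer_id FROM customers "
        "WHERE order_count >= ? ORDER BY customer_id"
    )
    return sql, (min_orders,)


def run_in_warehouse(conn, sql, params):
    """Run the query where the data lives; only matching IDs come back."""
    return [row[0] for row in conn.execute(sql, params)]


# Stand-in for the customer's warehouse (assumption: SQLite instead of
# Snowflake, Databricks, etc.).
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE customers (customer_id INTEGER, order_count INTEGER)"
)
warehouse.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, 5), (2, 0), (3, 12)]
)

sql, params = build_audience_sql(min_orders=3)
audience = run_in_warehouse(warehouse, sql, params)
print(audience)  # [1, 3]
```

The marketer never sees the SQL, and the customer table never leaves the warehouse; only the resulting audience is returned to the layer on top.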

Why is Composability taking the CDP market by storm? As IT and Data Teams centralize all customer data around the cloud data warehouse (CDW), it becomes critical to avoid having multiple copies of data, both across the customer’s own infrastructure and, even more so, across third-party vendors like Salesforce. This is because data copies create security risks and also complicate data governance. Experience teaches us that keeping data models and data quality consistent across multiple copies is nearly impossible and creates huge business issues downstream.

ActionIQ is proud to be the only Composable CDP in the Magic Quadrant. In addition, we are the only vendor to offer a Hybrid capability, where a combination of Composable and Bundled architecture can co-exist within an ActionIQ CDP, giving enterprises a future-proof solution that works wherever they may be in their data warehouse maturity journey.

The Salesforce (Non) Answer to Composability: “Zero” Copy

Back to our friends at Salesforce. As mentioned above, “Zero Copy”, as Salesforce is using the term, is not what it sounds like. Instead, it is just a faster way to copy data.

So what’s actually happening here? What Salesforce is calling “Zero Copy” is a technology that has been around forever but with a different name. It used to be called Copy-on-Write and there’s even a Wikipedia article about it. From that article (truncated for clarity):

“Copy-on-write (COW) is a technique used in computer programming to efficiently implement a "duplicate" or "copy" operation.”

Essentially, Salesforce’s “Zero Copy,” a.k.a. Copy-on-Write, means the following: if you are trying to copy data from one database (Snowflake, specifically) to Salesforce’s internal Snowflake instance, “Zero Copy” creates a “pointer” from the destination to the source. But here’s the catch: as the destination (Salesforce) starts to integrate that data, it copies it over anyway. So the copy starts faster, but the copying process takes longer to complete. And at the end of the day, it’s still a copy!
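For readers who have not met copy-on-write before, here is a minimal Python sketch of the general technique. It is a toy illustration only, not Salesforce's or Snowflake's actual implementation; the class name `CowTable` and its methods are hypothetical.

```python
class CowTable:
    """Toy copy-on-write view over rows owned by a source system."""

    def __init__(self, source_rows):
        # The "pointer": share the source's rows, copy nothing yet.
        self._rows = source_rows
        self._copied = False

    def read(self, index):
        return self._rows[index]

    def write(self, index, value):
        # The first modification forces a full private copy.
        if not self._copied:
            self._rows = list(self._rows)
            self._copied = True
        self._rows[index] = value

    def shares_source_storage(self):
        return not self._copied


source = ["alice@example.com", "bob@example.com"]
destination = CowTable(source)
print(destination.shares_source_storage())  # True

# "Integrating" the data means changing it, which triggers the copy.
destination.write(0, "ALICE@EXAMPLE.COM")
print(destination.shares_source_storage())  # False
print(source[0])  # alice@example.com
```

Creating the view is instant, but as soon as the destination transforms anything, it ends up holding its own copy of the data.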

This is a nice-to-have feature, but architecturally it changes nothing since data is still copied and transformed.

And by the way, it only works if your CDW is Snowflake since “Zero Copy” is Snowflake’s technology.

So Salesforce’s “Zero” Copy is still a full data copy, meaning:

  • Salesforce Data Cloud needs a full copy of your Data Warehouse data;
  • Salesforce Data Cloud needs to ETL (transform) that data into its own data model, to integrate it with other Salesforce data;
  • Salesforce Apps that are not yet ported to Data Cloud (like Marketing Cloud, Email Studio, etc.) need to again copy the data from Data Cloud into their own application framework.

So while Composability promises a "Never Copy" architecture, Salesforce's Zero Copy means Many Copies: once into Salesforce's Snowflake instance (landing zone), then copied & transformed into the Salesforce data lake, and then many times over to the different Salesforce Apps.

But the story doesn’t end there. Once Salesforce copies all your enterprise data into their data cloud, there’s nowhere to go but Salesforce itself. Salesforce internally has a Snowflake instance to receive the Zero Copies but then has another, additional data lake built on Iceberg. And then every other Salesforce product has its own storage and backend. It’s a true multi-copy environment!

So let’s say you accept the fact that Salesforce will copy everything over. Then what? Well, once your data gets copied over to the Salesforce Data Cloud, it has nowhere to go except to be copied further or activated exclusively via Salesforce applications (something that Gartner called out in the CDP MQ). This means perpetual lock-in to legacy Salesforce apps.

No wonder Salesforce is giving the Data Cloud away for free, left and right: Data Cloud is using your own Enterprise data to create the ultimate lock-in play for Salesforce.

The bottom line? Salesforce Data Cloud is neither Zero Copy nor a CDP. And to date, I have yet to meet a successful production customer of Salesforce’s “Data Cloud”. It’s a great vision, but like many Salesforce promises, it’s still very much a vision. And unfortunately, this vision is using an architecture that’s already legacy (copy-on-write) and leads to a place where their customers will have no choice outside the Salesforce suite of products.

Salesforce’s Data Cloud Could Be Useful… Just Not How Salesforce Wants It

Now, don’t get me wrong: many, if not most, of our Enterprise clients use Salesforce products. Some of them are great. And, if Salesforce surprises us by executing well on a home-built product, Salesforce Data Cloud could actually be useful if you are already a big multi-cloud Salesforce customer, but not in the way Salesforce advertises it.

The biggest opportunity for Salesforce Data Cloud is to become a data integration layer for all the Salesforce-sourced customer data (not the broader Enterprise customer data) that lives in the dozens of data silos that exist within the different Salesforce clouds.

If you are not a Salesforce customer you may be surprised to hear that so many silos exist within Salesforce's "Cloud". The reason is that most of Salesforce’s products are the result of acquisitions; and even the ones developed in-house were historically independent from each other, using different databases and data models/schemas.

If Salesforce now promises to integrate all that data in one place, great! That’s not only helpful, but overdue. However, this is not an Enterprise CDP; rather, it’s a data source: the Salesforce customer data source! And this data belongs in the Enterprise’s Cloud Data Warehouse, where it can be integrated with other enterprise customer data, and not the other way around.

But instead, Benioff wants to hold the Salesforce-sourced customer data hostage and demand that the rest of the Enterprise customer data be handed over so that he can lock it up within the Salesforce cloud. And that’s just wrong: architecturally and commercially wrong, not to mention the lack of honesty in this approach.

What about “Warehouse Sync” or Warehouse “Connectors”?

Salesforce is using Zero Copy to get away with lacking a Composable architecture. But Bundled CDP vendors, also unable to evolve their own architecture, have come up with other creative terminology to avoid solving the problem for their customers.

A popular one is the term “Warehouse Sync,” which, like “Zero Copy,” has nothing to do with Composability or not copying data. Warehouse Sync is just a trivial way to extract data from the data warehouse and copy it over to the Bundled CDP. Think about the old way you used to “sync” your iPod to iTunes on your computer. Nothing more or less than that.

Warehouse Sync still requires the CDP using it to:

  • Copy all the data into the CDP, thus creating security and governance headaches
  • Provide an additional storage and processing system – separate from the centralized Cloud Data Warehouse (Snowflake/Databricks/etc). Usually this layer is a lot less scalable and open than the CDW itself

I still remember the days when we called Warehouse Sync, simply, a warehouse “integration”. Even if that integration is bidirectional, it changes nothing because the CDP still needs to maintain a full copy of customer data to do its job.

When any CDP vendor in the Magic Quadrant other than ActionIQ talks about Warehouse Data/Connector/Sync, or even Composability, it means a simple, old-school integration that copies data from the DW to the CDP. And vendors will go to great lengths to hide this simple fact, so in your evaluations you need to ask very direct and pointed questions to understand whether, at the end of the day, data is being copied/moved or not.

Short of a fully Composable architecture, you will find that most fancy new terms like "Data Warehouse Sync" just represent a good ol’ Data Warehouse Integration that just copies data around.

ActionIQ is the only Composable CDP in the Gartner Magic Quadrant

We understand how big of a problem data copies can be. From security risks, to higher costs, to governance nightmares, our customers tell us about the challenges regularly. It’s why we invested so much energy into solving it. We’re proud that ActionIQ is the only CDP in Gartner’s Magic Quadrant that gives the customer a true choice of using their CDW for storage and processing. If you want to keep all your customer data in your CDW, we just become a UI & orchestration/activation layer on top of that, copying (truly, this time!) zero data. And this works for real, today, at some of our Enterprise customers: Atlassian, Albertsons, Brightspeed, and DoorDash, to name just a few.

And for those enterprises that aren’t ready with all of their data neatly loaded into a cloud data warehouse? ActionIQ offers the only Hybrid Composable CDP, which is ideal for most Enterprises: you use your CDW as the main storage and processing layer for the customer data that’s there, and use ActionIQ’s built-in storage for the data that’s not. The marketing and CX end-users won’t notice the difference. All they’ll see is access to more data. As more of your customer data makes its way into the CDW, you can seamlessly have ActionIQ leverage the CDW more and more with no disruption to the business users, eventually becoming 100% Composable.

If you want to learn more, you should really talk to our team. As you can tell, we’re pretty passionate about this stuff. And for serious Enterprise buyers we readily offer demos and free PoCs to put our product where our mouth is!

Christian Gert Hansen

Databricks Engineer | Marketing & Personalization | Partner & Consultant

4 months

Thanks for a great article. I am missing a few details on ActionIQ and how you actually succeed with “Zero Copy”. Personally, I would have liked a few more details on how you make this work with the mix of real-time and batch data. Can you solve use-cases on web where the page load needs to be as fast as possible and as a minimum below 100ms? Would have loved your perspective on some of your competitors like Hightouch? Just a few thoughts…

Abul F.

Data Engg & MLOps for Risk models @ BMO | Scaling Data teams | ETL, Data Products, Data Platforms, MLOps, GenAI, Project management

10 months

I am starting to learn about CDP. Why should companies buy a CDP as a standalone product? I am assuming we have enterprise data lakehouses and data engineering team to support.

Abhirup Bose

MBA | Senior Architect,Commercial Execution & Strategy (MarTech, Omnichannel Service & Web Technologies)

10 months

Agree with most of your pointers. Having 1st hand exposure to Data Cloud implementation now, my view is that the concept of DC is fabulous but the product and the engineering of the offerings has miles to go before it is fully baked...Next 8-9 months will be critical for DC, with few of the large enterprises have proposed go-lives scheduled and the proof will be in the pudding...

Subin T P

SVP, Engineering @ AI Squared

1 year

Tasso Argyros - Great article. When we enable data activation to third-party tools, it essentially involves replicating data into those business applications. Given this, it seems that a completely "zero copy" Customer Data Platform (CDP) architecture might not be feasible. Is this a fair understanding?

Alex M.

Vice President - IT Risk Officer Electronic and Algorithmic Trading

1 year

Good points
