Why Databases Won't Charge for Storage in the Future
The database is being unbundled. Historically, a database like Snowflake sold both data storage & a query engine (& the computing power to execute the query).
But customers are pushing for a deeper separation of compute & storage. In Snowflake's own words:
"A lot of big customers want to have open file formats to give them the options…So data interoperability is very much a thing and our AI products can generally act on data that is sitting in cloud storage as well."
"We do expect a number of our large customers are going to adopt Iceberg formats and move their data out of Snowflake, where we lose that storage revenue and also the compute revenue associated with moving that data into Snowflake."
Instead of locking the data in one database, customers prefer to have it in open formats like Apache Arrow, Apache Parquet, Apache Iceberg.
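To make "one copy in an open format" concrete, here is a minimal sketch, assuming the pyarrow Python package and a toy events table (both illustrative, not from the post): the dataset lives as a Parquet file that any engine can read, rather than as rows locked inside a proprietary store.

```python
# Minimal sketch: write a dataset once as Parquet, an open columnar format.
# The table contents and file name are hypothetical examples.
import pyarrow as pa
import pyarrow.parquet as pq

events = pa.table({
    "user_id": [1, 2, 3],
    "event": ["signup", "query", "query"],
})

# The Parquet file on disk (or in object storage) becomes the system of
# record; no single database owns it.
pq.write_table(events, "events.parquet")
```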
As data use inside of an enterprise has expanded, so has the diversity of demands on that data.
Rather than copying the data each time for a different purpose, whether it's exploratory analytics, BI, or AI pipelines, teams keep a single copy in an open format that different query engines can read.
This saves money: storage is about $280-300M overall for Snowflake. In the company's words:
"As a reminder, about 10% to 11% of our overall revenue is associated with storage."
But it also simplifies architectures.
It also ushers in an epoch where query engines will compete for different workloads on price & performance: Snowflake may be better for large-scale BI; Databricks’ Spark for AI data pipelines; MotherDuck for interactive analytics.
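A sketch of what that competition could look like in practice, assuming DuckDB and pyarrow are installed and reusing the hypothetical events.parquet file from the snippet above: two different engines read the same copy, and the customer pays only for the compute each one consumes.

```python
# Two engines, one copy of the data: DuckDB for interactive SQL, Arrow for
# a Python pipeline. Neither engine stores its own copy.
import duckdb
import pyarrow.parquet as pq

# Engine 1: DuckDB runs SQL directly against the Parquet file.
counts = duckdb.sql(
    "SELECT event, COUNT(*) AS n FROM 'events.parquet' GROUP BY event"
).fetchall()

# Engine 2: an Arrow-based pipeline reads the identical bytes.
table = pq.read_table("events.parquet")

print(counts, table.num_rows)
```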
Data warehouse vendors have marketed the separation of storage & compute in the past. But that message was about scaling the system to handle bigger data within their own products.
Customers demand a deeper separation - a world in which databases don’t charge for storage.
Data Solutions Architect, Singapore PEP · 5 months ago
The market is right even when it's (environmentally) wrong: no one can win against it. In the process, R.I.P. computational efficiency, sacrificed on the altar of open formats. Additional caching mechanisms will come to the rescue once the tradeoffs of the open formats become unacceptable, just as happened with many data federation solutions in the past.
Experienced Data Leader | Builder | Father · 11 months ago
This has been one of the intentions of the big data revolution, so why are we saying that we're inching towards such an era in cloud data warehousing? Presto, Trino… even Redshift to some extent have had independently scalable compute and external storage layers. I guess I'm just confused? Maybe this is specific to Snowflake's strategy and not their technology. I worked on a Snowflake implementation in 2019-2020 that I didn't design. The design called for Snowflake-managed storage, but I made extensive use of external schemas pointed at S3 and saved storage costs. Using Athena right now is extremely cost effective, and its compute is completely decoupled from its storage layer.
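For readers less familiar with the pattern this commenter describes, here is a hedged sketch of compute decoupled from storage using Athena's API via boto3; the bucket, database, and column names are hypothetical placeholders, not values from the comment.

```python
# Sketch: register Parquet files in S3 as an external table and query them
# with Athena. Storage stays in S3; Athena only rents the compute.
import boto3

athena = boto3.client("athena", region_name="us-east-1")

ddl = """
CREATE EXTERNAL TABLE IF NOT EXISTS analytics.events (
    user_id BIGINT,
    event   STRING
)
STORED AS PARQUET
LOCATION 's3://example-data-lake/events/'
"""

# The same files remain readable by Spark, Trino, DuckDB, or a Snowflake
# external table, because no engine holds the only copy.
athena.start_query_execution(
    QueryString=ddl,
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://example-query-results/"},
)
```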
Tomasz Tunguz, a question: is the main reason customers are calling for the decoupling the cost of storage, interoperability for different use cases (as in 3), or making the current combination of storage and compute more efficient (as in 2)? The reason for the question is that Databricks and Snowflake have indeed made a lot of changes and have been shifting their architectures towards these open formats. As you mention, this is to handle more data more efficiently in their own platforms. But this might be its own reward for customers, since more efficiency brings down cost in current operations (where storage is not the main cost). So can you tell us a bit more about the motivations you see?
This assumes that all data must be moved to "storage" first. What about keeping data in the source system and querying it when needed? Right now we are already making one copy of all the data we think we need. Storage is cheaper now and fast enough for most uses; streaming data might be the exception. But there is still duplication, and it can create inconsistent data (between storage and source). An interesting future would be one where data (mostly) remains in source systems. The data with the most demanding speed requirements is moved to storage; the rest remains. Everything gets a metadata tag, queries run against the metadata, and the data is retrieved from either location. What would be needed for this? Faster transfer? A different type of indexing and tagging of source-system data? Better search? Different source systems? Snowflake's invention was to hash everything in a different way, which meant they could distribute data differently while keeping query and retrieval time the same as before (or faster). Could something similar be done with a hybrid storage approach?
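A speculative sketch of the hybrid approach this comment imagines, with all names invented for illustration: a metadata catalog tags each dataset, and a query is routed either to a fast copy in analytical storage or back to the source system.

```python
# Hypothetical metadata-driven routing: hot datasets get a storage copy,
# everything else is fetched from the source system on demand.
from dataclasses import dataclass
from typing import Optional

@dataclass
class DatasetTag:
    name: str
    latency_sensitive: bool          # hot data earns a copy in analytical storage
    storage_location: Optional[str]  # e.g. a Parquet path, if a copy exists
    source_location: str             # the operational system of record

CATALOG = {
    "orders": DatasetTag("orders", True, "s3://lake/orders/", "postgres://erp/orders"),
    "audit_log": DatasetTag("audit_log", False, None, "postgres://erp/audit_log"),
}

def resolve(dataset: str) -> str:
    """Route a query to the storage copy when one exists, else to the source."""
    tag = CATALOG[dataset]
    if tag.latency_sensitive and tag.storage_location:
        return tag.storage_location
    return tag.source_location

print(resolve("orders"))     # -> s3://lake/orders/
print(resolve("audit_log"))  # -> postgres://erp/audit_log
```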
Great article Tomasz Tunguz! This is the biggest trend in the data space right now and is going to have a huge impact. It solves one of the biggest challenges that enterprises face: data lock-in. Freeing your core data from the clutches of vendors enables better value generation from it, via a variety of tools that can process that single copy of data. As an analytics vendor, open table formats allow us to employ specialized compute engines tailored for specialized workloads such as event data, time series, and graphs, instead of being forced to work with a lowest-common-denominator SQL engine; and we can do that without having to make copies of the data into proprietary stores. Further, it lets us monetize better by being able to own and charge for compute too. Game changing!