Shape Data BEFORE It Gets Expensive—Here’s How
by Julian Giuca


There used to be two.

Now there are three places you can shape your data.

We know the source, where the data is generated. It’s stateless and cheap.

And we know that data at rest is expensive to query, and iceberg slow.

But you don’t just have to clean your data at rest.

There’s a new, third way—a way to clean, transform, and route data before it becomes a problem.

In the pipeline.


Where Data is Generated: Fast, Cheap, but Isolated

Telemetry data begins at the source. Tools like FluentBit, Vector, and OTel Collector capture logs, metrics, and traces at their point of origin.

The advantage?

  • Data is processed early and efficiently at a low cost.
  • Cleanup is cheap—trimming logs, normalizing formats, and pre-filtering before they travel further.

The challenge?

  • Statelessness—each node only knows about itself.
  • Complexity and risk—changing configurations is high-risk and requires deep expertise.
  • Dependency on SRE teams—adjustments aren’t self-service, slowing iteration.
  • Lack of correlation—events across distributed systems remain fragmented.
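To make the tradeoff concrete, here is a minimal, hypothetical Python sketch of source-side cleanup. It is not the config of any particular agent, and the field names and thresholds are invented for illustration. It trims, normalizes, and pre-filters each record cheaply, but it can only ever see one record from one host at a time.

```python
# Hypothetical sketch of source-side cleanup (illustrative field names,
# not any specific agent's configuration): trim, normalize, and
# pre-filter each log record before it leaves the host.
import json

DROP_LEVELS = {"DEBUG", "TRACE"}   # noise we never want to ship
MAX_MESSAGE_LEN = 2_000            # cap oversized payloads

def clean_at_source(raw_line: str) -> dict | None:
    """Return a normalized record, or None if it should be dropped."""
    try:
        record = json.loads(raw_line)
    except json.JSONDecodeError:
        return None  # malformed lines never leave the host

    # Normalize: consistent level casing and a single timestamp key.
    record["level"] = str(record.get("level", "INFO")).upper()
    record["ts"] = record.pop("timestamp", record.get("ts"))

    # Pre-filter: drop chatty levels before they cost anything downstream.
    if record["level"] in DROP_LEVELS:
        return None

    # Trim: cap message size so one noisy service can't inflate egress.
    message = record.get("message", "")
    if len(message) > MAX_MESSAGE_LEN:
        record["message"] = message[:MAX_MESSAGE_LEN]

    return record
```

Everything in that sketch is a per-record decision. The moment you need to ask "is this error happening across the whole fleet?", a single agent has no answer.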


Where Data is Stored: Expensive and Slow

Once telemetry data reaches a data lake or observability platform like Snowflake, Splunk, or Datadog, it gains full system-wide context.

The advantage?

  • Full correlation—data from different sources can be analyzed together.
  • Historical visibility—teams can look back across months or years of telemetry data.

The challenge?

  • High cost—storage and query costs climb steeply as data volume grows.
  • Slow analysis—retrieving insights can take minutes, hours, or even days.
  • Complex transformations—modifying data requires specialized teams, and requests often take weeks.


The Missing Middle: Stateful Data Pipelines

Between fast but limited sources and slow but powerful storage, there is a critical missing layer—one that allows teams to shape, filter, and route data before it becomes expensive.

This is where Datable fits in.

Take a look at this diagram below, which I put together in five minutes.


A telemetry data pipeline outlining the third, new way to transform data with Datable.



There's a new, third way to clean up your data—in the pipeline, before it reaches costly storage and analytics platforms.

A stateful data pipeline provides:

  • Smart filtering and enrichment—so only valuable data moves forward.
  • Low-complexity, low-risk changes—any team can safely adjust data pipelines without waiting on Ops.
  • Multi-destination routing—send critical data to expensive analytics tools and low-priority data to cost-effective storage.
  • Faster correlation and decision-making—teams don’t have to choose between fragmented raw data and high-cost insights.
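As a rough illustration of what that looks like in code, here is a minimal, hypothetical sketch of a stateful pipeline stage in Python. The class and field names are my own invention, not Datable's API; the point is that the stage keeps context across records and routes each one by value.

```python
# Hypothetical sketch of a stateful pipeline stage (invented names, not
# Datable's API): enrich each record with pipeline-wide context, then
# route it to a destination based on how valuable it is.
from collections import defaultdict

class StatefulRouter:
    def __init__(self) -> None:
        # State no single source node has: error counts per service,
        # accumulated across every stream flowing through the pipeline.
        self.error_counts: dict[str, int] = defaultdict(int)

    def process(self, record: dict) -> tuple[str, dict]:
        service = record.get("service", "unknown")
        if record.get("level") == "ERROR":
            self.error_counts[service] += 1

        # Enrichment: attach cross-stream context to the record itself.
        record["errors_seen_for_service"] = self.error_counts[service]

        # Multi-destination routing: the expensive analytics platform only
        # gets records worth paying for; the rest goes to cheap storage.
        if record.get("level") in {"ERROR", "FATAL"}:
            return ("analytics", record)   # e.g. Datadog or Splunk
        return ("archive", record)         # e.g. object storage or a data lake
```

Because this stage sits between the agents and the storage, any team can adjust a filter or routing rule in one place instead of touching every host's config or reprocessing what already landed in the lake.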


Shaping Data in the Pipeline

Most organizations are stuck between two extremes. They either:

  1. Handle everything at the source—and struggle with complexity and a lack of big-picture insights.
  2. Push everything to data lakes—and face skyrocketing costs, delays, and operational bottlenecks.

A stateful processing layer balances both.

It allows security teams, SREs, BI analysts, and engineers to work without stepping on each other’s data—or toes.

It’s time to rethink where and how you shape your data. This three-layer model isn’t just a theory: it’s the Moneyball approach to doing observability.

The math maths, and the problems shrink.

Start shaping your data in the pipeline with Datable.

DM me and I'll set you up myself.
