The shortest month of the year is now over, which means that spring is starting and the first rays of sun are breaking through the clouds. Between Mardi Gras and crêpes, the data world is also heating up with significant moves that could reshape enterprise AI infrastructure.
As always, here's what you can expect:
- One data tool that we believe is worth digging into as a data person
- A selection of the key articles we read this month along with quick teasers
- Show and Tell: some exciting updates from the CastorDoc team
- A data meme to brighten your day
IBM just acquired DataStax, a strategic move that signals its ambition to strengthen its position in the operational database layer that powers AI applications.
DataStax, known for its NoSQL database technology, occupies a specific place in the data stack: it's not a data warehouse (like Snowflake) or a transformation tool (like Coalesce), but rather an operational database designed for high-throughput, low-latency workloads involving unstructured data. Their technology excels at handling the massive volumes of distributed data that power modern applications - which explains why companies like FedEx, Capital One, and Verizon rely on their solutions.
IBM is addressing one of the fundamental challenges in implementing generative AI at scale: organizing and utilizing the vast amounts of unstructured information that companies already possess.
- The New Age of Invention: David Jayatillake explores how AI-oriented tooling has lowered the barrier to software creation. For him, tools like Cursor and Windsurf have made it possible for people with basic coding knowledge to build sophisticated applications in hours rather than months. He believes we're entering an era where custom software development becomes accessible to many more people.
- A Guide to Choosing the Right Data Transformation Tool: Coalesce’s guide outlines key considerations when choosing data transformation technologies, emphasizing that the decision goes beyond technical features to include team dynamics, scalability requirements, and business objectives. Choosing between open source and commercial offerings involves evaluating not just upfront costs but total ownership costs, management overhead, and the trade-offs between control and convenience.
- The Limits of Data: C. Thi Nguyen provides a critique of our data-centric approach to decision-making. He argues that our institutional reliance on metrics systematically filters out context-sensitive information that can't be easily quantified. His key insight is that data collection methodologies are designed to be portable at scale, but this comes at the cost of eliminating nuance that might be essential for good decisions.
- Our CEO’s Newsletter of the month: Where Does Data End? In his latest post, Tristan tackles the impossible challenge of "governing all company data" by questioning what counts as company data when it exists everywhere from warehouses to emails to SaaS tools. He argues that complete data visibility isn't just difficult - it's fundamentally impossible, with every organization having "known unknowns" that can't be tracked. His solution? Focus pragmatically on the data warehouse first, acknowledge coverage limitations, and recognize when poor coverage signals an infrastructure problem rather than a governance issue.
- Article of the Month: Why Most Data Catalogs Fail. What separates successful data catalogs from expensive failures? After helping 150+ companies, CastorDoc's Customer Success Director reveals the surprising answer: it's not about technology but business transformation. The key insight? Start with high-impact assets rather than cataloging everything, connect data to business domains, secure executive sponsorship, and create shared responsibility between technical and business teams.