The GREAT data debates: warehouse or lake, dashboards: alive or dead & more
Prukalpa
Co-Founder at Atlan – Home for Data Teams | Forbes30 & Fortune40 lists | TED Speaker
Welcome to this week's edition of the Metadata Weekly newsletter.
Every week I bring you my recommended reads and share my (meta?) thoughts on everything metadata! In case you prefer email over social, the newsletter is on Substack. You can subscribe here.
What a week it was!
The highlight of my week was jamming with Jen Stirrup, founder of Data Relish, Jordan Volz, Director of Field Engineering at Continual, and 60+ data leaders about the biggest data trends in 2022! We got a pulse on how these data leaders feel about the GREATEST data debates from Twitter, and I’m dedicating today’s newsletter to some of the highlights from our discussion.
Data Warehouse or Data Lakes?
Our audience was split on the Data Warehouse vs Data Lakes debate, but almost everyone agreed on the great convergence: Data Lakes and Warehouses are converging (a trend I’d pointed out in 2021), and the Data Lakehouse is emerging as a new architecture paradigm.
Will warehouses and lakes converge fully, or will there always be a need for both architecture paradigms? I don’t know all the answers, but I highly recommend this thought-provoking a16z podcast episode, “The Great Data Debate”, where Bob Muglia, Tristan Handy, George Fraser, Martin Casado, and Michelle Ufford debate the question.
Dashboards are Dead or Long Live Dashboards?
Last year, Data Twitter was buzzing with articles announcing dashboards are dead. But our audience (85%) overwhelmingly voted for “Long live dashboards”, saying that dashboards are here to stay. I fully agree. Dashboards have an important place in the data stack and aren’t going anywhere.
Honestly, two of the reasons dashboards get a lot of flak are:
1. Dashboards aren’t mapped to decision flows.
Jen pointed this out as the biggest reason for poor dashboard adoption, and I tend to agree. We fall into a “dazzling dashboard” trap and focus on making beautiful dashboards, rather than useful dashboards that are mapped to “decision flows”!
Different people make decisions differently. If you want to build dashboards to help them make better decisions, you first have to understand how and why they make those decisions. You could sit on a treasure trove of data and make the fanciest graphs, but if you’re not separating the signal from the noise and presenting the right data in the right way, it’s all moot.
At SocialCops (a data-for-good company I had started), we actively tried to avoid this “dazzling dashboard” trap. One of our core values is “Problem first, solution second.” One way this translates into reality is that we spend 80% of our time figuring out a problem and 20% designing the solution (often, a dashboard). The more time we spend scoping out a problem statement, the less time we end up spending on implementation and the more effective the implementation becomes.
This conversation reminded me of a blog post we’d written at SocialCops outlining our 8-step methodology for building dashboards that people will actually use.
You’ll see that we don’t even get to the data until Step 6! I highly recommend this read to anyone who’s working on designing dashboards.
2. Dashboards aren’t a silver bullet.
Dashboards aren’t the best medium for EVERY kind of data consumption. They are a great presentation medium, but they aren’t the best medium for data exploration. Joseph Moon’s tweet thread on data interfaces, which compares them to slides, docs, and presentations, is great.
Data Mesh: Revolutionary or Ridiculous?
This was the funniest poll, with half our audience voting revolutionary and half voting ridiculous — a fair representation of the very PASSIONATE views the data mesh evokes in data people.
Here’s my take: The more I study the data mesh, the more I’m beginning to realize that a lot of the original principles proposed by Zhamak about the Data Mesh are actually great.
The challenge is that a lot of the original material around the data mesh is filled with jargon, making it hard for most people to comprehend. To add to this, the vendor hype is killing me, with everyone and their mom rebranding to talk about the data mesh, adding to market confusion. So here’s my humble appeal to data leaders: Try and focus on the core principles of the data mesh, and apply what makes sense to you and your organization!
Check out the recording from the event here. Watch for some behind-the-scenes fun and some takes on the greatest data debates from 2021, and tell me which side you are on!
Fave Links from Last Week
Same Slack, Different Day by David Jayatillake
“We can’t just expect to spin our chair around and tap someone on the shoulder to ask for this metadata from a colleague; it was never a scalable solution, to begin with. These questions and answers are too valuable to be asked over and over again without automatically becoming documentation for the next person to access. It has often been the case that you would get two slightly different answers from two experienced colleagues - if it’s codified then the differences can be addressed with consistency or divergence understood.... The end goal should be that, with stewardship of this metadata and strong DataOps processes, it should be comparably easy to onboard a data professional as any other and at least as easily as with a software engineer.”
My take: David nailed it with this post! In my opinion, more data leaders should build documentation, discoverability, and knowledge sharing into their culture from their early days.
Personally, my biggest scars as a leader come from a quarter when we doubled our team size and thought it would solve all our problems. Our older analysts on the team were neck-deep in work and didn’t have time to onboard newer analysts. Newer analysts didn’t understand our data, or how to work on our projects and so they couldn’t contribute. When they did try to contribute, we needed to redo all their work because they used the wrong data or made the wrong assumptions. Our older analysts got frustrated because all the new team members were creating more work for them, rather than relieving them of work. Our newer analysts were demotivated because they weren’t getting any wins. This led to a group of people in the office with significant amounts of negative energy, which even brought down the morale of some of our highest performers. Ultimately, our oldest analyst ended up quitting. We ended up letting go of our newer analysts and had to rebuild our team from scratch. I personally went through periods of serious self-doubt, and it took me years to come out of that phase.
I know this is a very personal story — but I’m sharing it here with the hope that it can help other data leaders. Please don’t take this problem lightly. It might seem like something as simple as “onboarding”... but it can have serious ramifications on your team and culture.
More from My Reading List
Upcoming Events
Datanova 2022: The Data Mesh Summit by Starburst | 9th-10th February
This week, Starburst is hosting Datanova 2022: The Data Mesh Summit. Super excited to tune in for talks from Steve Wozniak, Zhamak Dehghani, Adrian Estala, Barr Moses, and more!
P.S. On day 2 of the event, I’ll be speaking about the key components of a metadata platform, how it can help meet enterprise governance standards, and how it powers the Data Mesh architecture. You can learn more and sign up here.
I'll see you next week with more interesting stuff around the modern data stack. Meanwhile, you can subscribe to the newsletter on Substack and connect with me on LinkedIn here.