Start as early as possible...the data product discovery & design
Image generated using MidJourney by a human

"The best time to plant a tree is twenty years ago. The second best time is now."

I am a big believer that, in the world of data, we are not starting early enough on data product discovery: linking it to business outcomes and use cases, and understanding user personas. Either this does not take place at all and the entire focus is on the latest and greatest technology, or it is done by chance rather than as a continuous habit of data product teams.

Truth be told, I was not doing this as a practice either until 2019, and I have learnt a lot since 2021. There are good product management & discovery practices that can be applied to data, or adapted for data, analytics & AI as well; here is one good example.

Having said that, in an enterprise focusing on internal use cases, for internal employees at least, there could be multiple teams working in the data space. The setup can range from highly centralised to decentralised teams, or a hybrid model (my favourite). Even with a hybrid approach, freedom and empowerment come hand in hand with oversight & visibility (maybe a good form of data governance).

In order to support this freedom, I think we have been missing a discovery, design and collaboration tool that sits very close to the data catalogue and data modelling space. Currently, I don't see such functionality in data catalogue or data modelling software. However, I have recently explored a few tools in this space, and colour me optimistic & hopeful (hence this blog). I am curious to know what others think about this topic. To give a comparison: in the absence of such a tool, this work is either not happening or happening in a combination of Excel files, Google Sheets, Airtable, Miro, Mural and the like, and then there is a “coordinator” who chases all the teams to aggregate the information.
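To make the comparison a little more concrete, here is a minimal sketch of what a shared discovery record could look like if it lived in one structured place instead of scattered spreadsheets. Everything in it is an assumption for illustration: the `DiscoveryRecord` fields, the helper names and the sample entries are hypothetical and are not taken from any specific tool.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DiscoveryRecord:
    """One data product idea captured during discovery (hypothetical fields)."""
    product: str
    owning_team: str
    business_outcome: str
    use_cases: list[str] = field(default_factory=list)
    personas: list[str] = field(default_factory=list)


def missing_fields(record: DiscoveryRecord) -> list[str]:
    """Return the discovery fields that are still empty for a record."""
    gaps = []
    if not record.business_outcome.strip():
        gaps.append("business_outcome")
    if not record.use_cases:
        gaps.append("use_cases")
    if not record.personas:
        gaps.append("personas")
    return gaps


def overview_by_team(records: list[DiscoveryRecord]) -> dict[str, list[str]]:
    """Group product names by owning team: the view a coordinator builds by hand today."""
    overview = defaultdict(list)
    for record in records:
        overview[record.owning_team].append(record.product)
    return dict(overview)


# Made-up sample entries, purely illustrative.
records = [
    DiscoveryRecord(
        product="Customer 360",
        owning_team="Sales Analytics",
        business_outcome="Reduce churn",
        use_cases=["Churn alerts"],
        personas=["Account manager"],
    ),
    DiscoveryRecord(
        product="Supplier Scorecard",
        owning_team="Procurement Data",
        business_outcome="",
    ),
]

for record in records:
    print(record.product, "is missing:", missing_fields(record))
print(overview_by_team(records))
```

The point of the sketch is only that once each team records the same few fields, the oversight & visibility part becomes a simple query rather than a chasing exercise.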

Curious to know your thoughts...


Disclaimers:

These are my personal observations and do not represent my current or previous employers. Any resemblance is purely coincidental.


I would like to hear other people's views on writing mega scripts in Python that perform extraction, transformation and loading. Enhancing these packages and debugging them for data issues is more expensive than writing a new one. Adding serverless components further increases the complexity. Is that typical for data pipelines, or are people using other tools in the cloud?

The data space has gone back 20 years in terms of current tooling and skills availability. The tooling that all 3 clouds offer is not up to the mark compared with some of what we had access to in data engineering, embedded reporting and data mining. There were a lot of visual tools that simplified training people and replacing resources and skills, as well as overall coding and debugging. Things like error handling and data lineage were built into data pipelines back then. So overall, I think the data space has moved backwards, and nobody teaches or learns data modeling anymore anyway. Most of the expert abstract modelers who could navigate between geo-spatial, hierarchical, network or relational schemes are pretty much retired or too high up in the food chain to be hands-on. I see gaps all around in the data space that we need to fill quickly, with either better training or, preferably, better tools.

Joe Reis

Data Engineer and Architect | Best selling author and course creator | Recovering Data Scientist | Global Keynote Speaker | Professor | Podcaster & Writer | Advisor & Investor

1 year ago

This is such a great thread right now. Thanks Omar!
