Small (data) is beautiful

Companies are investing in big data projects with the expectation that aggregating data into one big central repository will automatically allow them to solve all of their business problems. They merge existing easy-to-use data sets, maybe add new ones, and develop an entire taxonomy with limited access rights around this central data repository.

And now what? What do you do with this data? Will your data scientists dissect the data without knowing how relevant it is, since they are disconnected from business operations? Or will your business analysts first be trained on how to access the data, but then only scratch the surface, since confirmation bias may lead them to focus on data that they know and agree with and to ignore everything else?

Maybe it would be better to start small. Before embracing big investments in big data, it would be worthwhile to find out if your organization can handle small data. Do you have the analytical talent to examine and interrogate a small data set of, let’s say, half a million records, which can even be managed in a spreadsheet? Can your analysts find relationships across different dimensions and across time that may impact your company’s revenues and profits? Are they able to distinguish correlation from causation? Can they recognize patterns in the data? Are they unbiased enough to identify the unknown unknowns and create hypotheses for further testing? Are they skilled enough to apply basic statistics as well as advanced regression, simulation, clustering, and other techniques to test these hypotheses? And are they willing to repeat all the steps if the analysis and modeling results don’t confirm their initial assumptions?

If you don’t have the talent to do all this, maybe big data is not for you … at least not until your analytics teams can fully understand (small) data and its implications for business results. Otherwise your big data project may be like teenage sex as described by The Register: “everyone's talking about it, only a few know how to do it, they all think everyone else is at it and so pretend they are too”.
