Rethinking Synthetic Data: A Distraction or a Stepping Stone?
Ray Poynter
At the intersection of work, fun & discovery (all views are my personal views unless indicated otherwise).
Synthetic Data has been a buzzword in the market research world for a couple of years, promising a revolution in how we approach data collection and analysis. However, I’d like to suggest that its current trajectory is more of a distraction than the game-changer we’ve been waiting for.
Augmented Synthetic Data: Modest Gains, Manageable Risks

Augmented Synthetic Data, where a small portion of the data is created to complement 'real' data, is starting to find its way from the lab to real-world applications. In my opinion it is a slight improvement over traditional methods such as weighting data or working with small samples. I suggest it offers something like a 5–10% improvement over business as usual for quantitative studies. While this is useful, it's not a groundbreaking shift.
One problem for Augmented Synthetic Data at the moment is the lack of clear guidelines; once these are in place, acceptance will likely grow. However, I foresee that the impact of Augmented Synthetic Data will be incremental rather than transformational.
Pure Synthetic Data: Still in the Lab

On the other hand, pure Synthetic Data, where most or all of the data for a study is entirely artificial, is still very much a lab experiment. I don't see this changing significantly in 2025. While it's a fascinating area of development, its practical application for quantitative studies remains limited for now.
A Distraction from Bigger Trends?

I think the Synthetic Data conversation may be diverting attention from a larger, more transformative trend: the rise of research approaches designed for AI, rather than the use of AI to speed up research approaches designed for humans.
Think of Synthetic Data as the "faster horses" approach to insights. In the past, we designed questionnaires, collected responses, analysed the data, and interpreted the findings. Today, AI is helping us optimise each stage, making them faster, cheaper, and often better.
But is this optimisation enough? Or are we missing the key opportunity to reimagine the entire process?
Rethinking Insights with AI

Consider Kantar's Link AI as an example. Instead of gathering participant feedback on TV ads, clients can now upload their creative, wait 15 minutes, and receive metrics equivalent to those from traditional studies, with no participants required. This isn't Synthetic Data; it completely bypasses the need for participant-generated data.
Now, zoom out even further. AI can already create TV ad scripts, produce visuals, and generate actors' voices and soundtracks from a creative brief. Imagine an AI system where the client provides a brief, the AI generates multiple ad concepts, tests them (using tools like Link AI), fine-tunes the creative based on the diagnostics, and repeats the process until it produces a winning ad. No interviews, no questionnaires, no cross-tabs or dashboards.
Let’s take this one step further. What if the client could skip even these intermediate steps? Instead of focusing on ads or metrics, they simply describe their business objectives to the AI. The system would then design and execute a campaign, delivering results directly aligned with those objectives. No Synthetic Data. No questionnaires. Just solutions?
Clearly, there will still be a role for newly collected data in the future, and there will be, for example, TV campaigns created and tested by AI. But I think the bulk of the future insights industry will not involve participants (live or synthetic), and I do not see a role for tables, questionnaires, etc. in the majority of future projects. I also think that the projects that do require participants, questionnaires, and analysis will be specialist (and expensive).
Rethink, Don't Refine

Synthetic Data has its place, but it's not the revolution some expected. As well as working on incremental gains, we should embrace AI's potential to fundamentally transform the market research and insights landscape. The future isn't about making horses faster, nor even about building the car; it is about working out how to get people to the places they want to be.
What are your thoughts?
Curious what you think, personally, Ray Poynter. It's interesting that you're asking for old-school perspectives, which, given the topic, seems strangely antiquated. I wonder whether you're simultaneously running some kind of parallel synthetic study.
Great Researcher, Insighter and Provider of Solutions at Winton Research and Insights Pty Ltd
2 months ago
I like very much your suggestion that we stop refining and start rethinking synthetic data. In fact, so much change is happening in research and insighting generally that I am going to start applying it methodically and cautiously to every project and process, especially where "old" thinking tends to be holding us back or delaying the decision point. Indeed, I'm also inclined to follow Lewis Carroll (Alice in Wonderland) when he has Alice say "sometimes I've believed as many as six impossible things before breakfast" as a starting point in the exploration process. I thought embracing AI would be impossible for an old bugga like me, but I believed I could do it anyway, and I have. So, as with envisioning the car rather than faster horses, I will believe in innovating and creating a data-free research future.
At the MRS Awards, at a table full of clients, the big issue for 2025 among all of them was concern about data quality. As such, I think "Trust" is the big issue for research right now, and I'm not sure synthetic or augmented data has a future unless we can seriously address the trust issue: for panel data, for synthetic data, and for augmented-data-based consultancy.
Provider of ID-verified US consumer survey sample | Data quality advocate | Market research revolutionary | marketresearchsucks.com
2 months ago
The Link AI example makes a lot of sense, but when it comes to quant survey concept tests, who is going to have enough data to build a model of fast-food product preferences, for example? I feel that when people talk about the promise of completely synthetic responses, they're flirting with artificial general intelligence, i.e., "you can ask our synthetic panel anything!" AGI is much more complicated, but that's how the marketing feels right now, and I can't say I like it.