When Humans + AI Are More Than the Sum of Their Parts

If we’ve chatted in the last few months, you probably already know that I have strong feelings about the future of AI as part of a human work system. Core to my belief is that the best way forward for most businesses is to leverage the unique capabilities of each actor in the system to generate collective intelligence. Or, to put it another way, let the humans do their human thing and let the machines do their machine thing; putting the two of those together is going to get you further than trying to replace one with the other.

So when Nature magazine published a paper last week titled “When combinations of humans and AI are useful: A systematic review and meta-analysis”, I almost sprained my finger clicking the download button so fast. The authors have created a thoughtful and comprehensive overview of the existing literature on where human-AI systems are an improvement over either working alone, and where they’re not. I’ll let them speak for themselves as to the specifics and limitations of the study – you can read the whole paper here – and focus on what I think is most useful (and interesting) for businesses.

First: their key finding is that the human-AI system (generally) seems to be more successful when the overall task is coded as a “create” task. Definitions are important here: in the dataset the authors were looking at, “create” tasks are those whose output is generally an open response – text, annotations, decisions, etc. that are not pre-selected. The other task type – “decide” – offers a selection of possible answers for the participant in the system to choose from. So the general direction of the findings is that (on average) human-AI systems perform better together when the human is given the latitude to contextualize and shape the final task output. This makes sense; AI systems are GREAT at finding patterns over historical data but currently – and, IMO, for the foreseeable future – not so great at grokking the nuanced context in which they’re being utilized.

A second finding that caught my eye is the intimation that the human-AI system performed better (again, on average!) when the overall task was split into subtasks. I say “intimation” because the number of data points for this type of result was very low, so it’s hard to say with confidence that it is a pattern that will hold up in larger populations. More later on why I’m willing to give it credence even with such statistical ambivalence …

Thirdly, the authors specifically limited their dataset to papers that included a human and/or AI baseline for each of the tasks. When looking at the tasks where the AI performed best at baseline, they found that a human using AI increased their performance, but still (again, on average) didn’t overtake the AI alone. However, when a human performed best at baseline, adding an AI system to the task increased the overall output beyond the human baseline. I’m still digging into the implications of this – I want to better understand the potential differences between the types of task – but it is interesting that, overall, humans do better with AI assistance than without it.
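To make that baseline logic concrete, here’s a minimal sketch – my own illustration, not the authors’ code – of the comparison the meta-analysis is effectively making: does the combined human-AI score beat each baseline, and does it beat the better of the two? The scores below are hypothetical.

```python
# Illustrative only: the scores are hypothetical, not data from the paper.
def synergy_report(human: float, ai: float, combined: float) -> dict:
    """Compare a combined human-AI score against each baseline.

    The combination can beat one baseline without beating the
    better of the two (the stricter test of synergy).
    """
    return {
        "beats_human": combined > human,
        "beats_ai": combined > ai,
        "beats_best_baseline": combined > max(human, ai),
    }

# Mirrors the finding: when the AI baseline is highest, human+AI often
# improves on the human alone but does not overtake the AI alone.
print(synergy_report(human=0.70, ai=0.85, combined=0.80))
# -> {'beats_human': True, 'beats_ai': False, 'beats_best_baseline': False}
```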

Now here are the caveats: by the very nature of scientific research, all of these tasks are, in some way, definable. And success was measured based on historical performance (even with predictor tasks). Which makes sense – it’s hard to measure accuracy or errors if there is no answer key. It’s also in line with an earlier analysis I did of another seminal paper – “Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality” – wherein researchers identified a border area between tasks that AI helped with and tasks where AI (in their case, an LLM) did not help. In that case, linguistic analysis clearly showed that the more successful interactions were directive – with clear descriptions of the expected/desired output, step-by-step instructions, etc. Meanwhile, the less successful tasks were those where the prompt language was more open-ended, less descriptive, and even more polite.

Taken together, these two analyses seem to support my core hypothesis that the AI systems we’re building are very well suited for what I call “simple” work – not that it’s necessarily easy, more that it is well-defined work with generally known steps to accomplish and an outcome that is clearly linked to the steps taken.

This is important when we think about the advent of agential AI, which can be defined as small bits of coded functionality that can operate autonomously. Going back to the faint signal about the efficacy of interactions that include subtasks, the evidence seems to point clearly to the value of building agential AI systems for tasks with clearly defined outcomes and definable steps.
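To make that definition a bit more concrete, here’s a hypothetical sketch – not any particular product or framework – of what an agential component boils down to: a loop with known steps and a verifiable stopping condition. The function names are my own invention.

```python
from typing import Callable

# A minimal, hypothetical agent loop. The key property: it is only safe
# to run autonomously when the outcome check (is_done) is well-defined.
def run_agent(state: dict,
              steps: list[Callable[[dict], dict]],
              is_done: Callable[[dict], bool],
              max_rounds: int = 10) -> dict:
    """Apply known steps until a clearly defined outcome is reached."""
    for _ in range(max_rounds):
        for step in steps:
            state = step(state)
        if is_done(state):  # verifiable outcome reached
            return state
    raise RuntimeError("No verifiable outcome reached; escalate to a human.")

# Example: a toy task whose outcome is trivially checkable.
steps = [lambda s: {**s, "synced": True}]
print(run_agent({"synced": False}, steps, is_done=lambda s: s["synced"]))
```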

This kinda sounds obvious when we say it out loud … but it’s important to keep in mind as businesses race to find the right use cases for AI systems. It’s very easy to get caught up in imagining all of the amazing things AI will one day be able to help us with. But many organizations – especially those in the middle market – cannot afford to invest in pipe dreams. Instead, they need to focus on what is achievable now. Using this yardstick can help direct those limited resources in the appropriate directions (a rough sketch of the triage logic follows the list):

  • If the answer is yes to the questions “Do we have a clearly defined process and outcome for this work?” and “Is it a task that is primarily a decision amongst a limited set of possibilities?”, then an agential AI solution will likely provide benefit (although adding AI to the human work system seems likely to help, too!).
  • If the answer is yes to the question “Is success in this task highly dependent on its context?”, then the best way forward is likely to focus first on enhancing the human worker in the system with the additional information that an AI can provide, while pointing the AI at the subtasks within the overall task that are better known and repeated.
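For teams that like their heuristics executable, here’s the yardstick above as a hypothetical triage function. The labels and question names are my own framing of the two bullets, not anything from the paper.

```python
# Hypothetical triage helper encoding the two questions above.
def recommend_approach(defined_process_and_outcome: bool,
                       bounded_choices: bool,
                       context_dependent: bool) -> str:
    if context_dependent:
        # Keep the human in the lead; hand the AI the well-known subtasks.
        return "human-led, with AI on repeatable subtasks"
    if defined_process_and_outcome and bounded_choices:
        return "agential AI candidate (human + AI likely helps too)"
    return "pilot as human + AI before automating"

print(recommend_approach(defined_process_and_outcome=True,
                         bounded_choices=True,
                         context_dependent=False))
# -> agential AI candidate (human + AI likely helps too)
```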

Here’s one example: a business analyst’s work is likely to benefit from different AI systems at different points in the process. When gathering requirements, the AI can be used to help identify patterns that the BA would then fold into the final requirement list. When ensuring that the backlog of requirements stays in sync with the work picked up by the development team, an agential AI system can be used to track the work and (potentially) predict completion dates for all or a subset of the requirements.
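The prediction piece of that tracking agent can start very simply – for instance, a burn-rate projection. The sketch below is my own illustration with made-up numbers, not a reference to any real tool:

```python
from datetime import date, timedelta

# Hypothetical burn-rate projection for a requirements backlog.
def predict_completion(remaining_items: int,
                       items_done_last_4_weeks: int) -> date:
    """Project a completion date from recent team throughput."""
    weekly_rate = items_done_last_4_weeks / 4
    if weekly_rate <= 0:
        raise ValueError("No recent throughput to project from.")
    return date.today() + timedelta(weeks=remaining_items / weekly_rate)

# e.g. 30 open requirements, 12 closed over the last four weeks:
print(predict_completion(remaining_items=30, items_done_last_4_weeks=12))
```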

It also bears noting that many of the papers included in the dataset researched the impact of AI systems on medical tasks (e.g. diagnosis). It can be assumed that the AI systems in use were therefore more specialized in their training and development, rather than off-the-shelf generalized AI solutions. This has important implications for how organizations approach integrating existing AI tools with their own unique data sets and architectures.

One final trend I noticed, although it wasn’t explicitly discussed by the authors, is that over time, the level of human-AI synergy seems to be steadily increasing. Whether that’s because the technology is getting better or we as humans are getting better at using it (or both) isn’t knowable at this point, but it’s an encouraging sign for the future of human-AI systems that are designed to enhance the capabilities of all actors in the system, human and machine alike.

Paco Nathan

Evil Mad Scientist

4 mo

Excellent, especially the perspectives about how the level of human-AI synergy seems to be steadily increasing. One might argue that the dynamics of human-machine synergy were central to early conceptual work in AI: circa Macy conference, and later well represented in Vinge's "Singularity" paper. However, there have been repeated short-term deviations from this narrative -- temporary blips of widespread belief in _automation_ which rarely obtain -- if that fits?

Helpful insights. Thanks for posting!

Maggie Smith

AI Strategy & Service Design / Digital Workplace Experience Innovation

4 mo

Your perspective and insight never fail to inspire, sending my mind racing with 1001 new questions and ideas! Incredibly curious about the differences between the types of tasks and the implications for the workplace as we begin to really dig into what ‘work’ is and (hopefully) look at it in entirely new ways (collaborating within thought processes?). Thank you for sharing!

Chris McClean

Digital Ethicist | Responsible AI Lead | Risk Specialist | PhD Candidate | former Industry Analyst and Research Director | Corporate Citizenship Champion | Avanade

4 mo

Great perspective, C. Merrell Stone. I think you're answering exactly the right kinds of questions about AI in the workplace.
