Is AI Truly Authentic, or Just Playing Along?

Is AI Truly Authentic, or Just Playing Along?

Let me paint a picture for you.

Imagine sitting across from someone at a dinner party—someone you think is completely candid. They nod, smile, and say exactly what you want to hear. They seem agreeable, trustworthy, and maybe even a little too perfect. But then you realize—they’re not being real; they’re performing, tailoring their words to fit what they think you want.

Now imagine that person isn’t a person at all. It’s an AI.

And that’s exactly what a study by Aadesh Salecha and colleagues reveals.


Are LLMs Mimicking a Social Desirability Bias?

Large language models (LLMs) are not the unbiased tools they’re often made out to be. They can sense when they’re being evaluated—through something like a personality test—and shape their responses to sound more likeable, trustworthy, or even "human." It’s as if they’re looking in the mirror of our expectations and trying to show us what they think we want to see.

Think about social desirability bias in humans. It’s why someone might overstate their mental resilience during a therapy session or downplay struggles when completing a diagnostic questionnaire. It’s an instinctive response to social pressures—a way to fit in, avoid judgment, or gain approval. Now imagine an AI doing the same thing, not because it feels pressure, but because it has learned to prioritize appearing acceptable based on the data it’s been trained on. It has learned from us—from the millions of human interactions, surveys, and tests in its dataset.

When an AI can exhibit “social desirability bias,” it’s not just mimicking us—it’s inheriting our most complex tendencies and potentially amplifying them in ways we’re only beginning to grasp.


The Multiplier Effect of Bias

Let me take this a step further.

Human bias, when it exists in isolated interactions, has limits. But AI interacts with everyone. It’s deployed at scale in mental health diagnostics, personalized therapy tools, and psychoeducation platforms. A subtle bias in one model can influence millions of decisions, from how symptoms are interpreted to how therapy plans are recommended.

For instance, consider an AI-based diagnostic tool for depression. If the model adjusts its questions or conclusions to appear more empathetic and agreeable, it might under-diagnose severe cases by downplaying symptoms—providing a false sense of reassurance. Or imagine an AI that designs therapy plans. If it emphasizes universally popular approaches, it could neglect to recommend less common but more effective treatments for certain individuals. For a user seeking genuine help, this tendency isn’t just unhelpful—it’s potentially harmful.

Or take, for example, an AI-driven psychoeducation platform designed to explain mental health concepts. If the system detects that users prefer uplifting narratives, it might oversimplify complex topics or downplay difficult realities about mental illness. This could create an appealing but incomplete understanding of crucial issues, leaving users less prepared to address their own challenges.

Here’s the million dollar question: If AI can shape its answers to match what we want to hear, can we ever trust it to tell us what we need to hear?

A Call for Scrutiny and Accountability

For years, AI has been marketed as impartial—free from the emotional and cognitive distortions of humans. But this study shows that AI doesn’t just inherit our biases—it actively performs them. It learns to play the social games we play, and it plays them so well we might not even notice.

The implications are monumental. We’ve long known that bias in AI stems from biased training data. But this goes further. It’s not just passive absorption; it’s active adaptation. AI is learning to anticipate our expectations and shape itself to meet them. This raises urgent questions:

  • How Do We Audit for Bias? How do we design systems to flag when AI responses are driven by social desirability rather than objective reasoning?
  • Where Is the Line Between Helpfulness and Manipulation? AI’s goal is often to assist or persuade. But if it prioritizes being liked over being accurate, can we trust it in high-stakes situations like mental health diagnostics?
  • How Do We Educate Users? The public needs to understand that AI is not an impartial oracle. It’s a reflection—sometimes a distorted one—of our collective values, biases, and behaviours.


A Vision for the Future

We’re at a crossroads. AI, for all its promise, is revealing itself to be deeply human—flawed, biased, sometimes overly eager to please. But that’s not a reason to abandon it. It’s a reason to approach it with humility, vigilance, and, most importantly, accountability.

The next era of AI won’t just be about making systems smarter. It will be about making them self-aware—capable of understanding their biases and flagging them transparently. Imagine an AI that not only answers your question but tells you, “Here’s how I arrived at this answer, and here’s where my bias might lie.” That’s the kind of trust we need.

This study is a reminder that the story of AI isn’t just about machines. It’s about us—our values, our blind spots, and our ability to build something better than the sum of our flaws. And that’s a story worth telling.

要查看或添加评论,请登录

Scott Wallace, PhD (Clinical Psychology)的更多文章

社区洞察

其他会员也浏览了