Fear & Loathing in Voice Authentication - the 2nd Key Idea on Deepfakes

For this week's entry, I have invited Tim Savage as guest author to shed some light on how, if at all, fraudsters use deepfakes today.

Tim's extensive experience and knowledge throughout his decades-long fight against fraudsters have allowed him to compile important insights, which he shares here for our collective goal to feel safe and keep our financial assets secure.

How are we vulnerable today when it comes to our identities, our life savings? Where does voice authentication fit into all of this?

Tim will also be active in the comments section, so please ask your questions there! -HT

Key Idea 2 – Fraudsters have started using audio deepfakes, but most fraud is still done the old-fashioned way

By Tim Savage

In last week’s article (Key Idea 1), we briefly touched upon how deepfakes already appeared to be used to nefarious ends, but spent most of that article evaluating if we humans (our ears, specifically) had any chance of telling a deepfake voice from a real one (we can’t).

For this article we’ll dive a bit deeper on how fraudsters, specifically, are using deepfake voices (or custom Text-to-Speech) for financial crimes.

Before we get started, a small PSA: while we will spend time looking at extreme cases where crimes took place and real people were harmed, we must caution against becoming too worried about deepfakes.

As the headline implies, and as this article hopes to highlight, for the most part you will remain safe. Our goal with this article (and this series) is to inform and raise awareness around deepfakes and the tools and technologies that can help defend us against them, not to build hype or instill fear among our readers. Not every odd interaction in your daily life means a crime is being committed or attempted, and not every crime has a deepfake behind it.


AI-generated image using GPT

With the advent of easily accessible, high-quality deepfake audio creation tools, how are individuals and organizations (e.g., banks) affected in terms of their exposure to fraud? Fraudster groups are very adept at two things: (1) exploiting vulnerabilities (even the good nature) of their targets, and (2) running a lean and highly profitable business (these people are pros!).

So how does the underworld capitalize on the shiny new toy that’s been handed to them, that is deepfake audio?

Scamming (Grand-)Parents

Imagine one quiet evening at home, as you’re cozying up to go to bed, when suddenly you get a call from your child (or grandchild), clearly panicked and in a state of fear, asking you to bail them out of jail, or to pay a ransom because someone has kidnapped them.

As last week’s blog post illustrated, the human ear is easily fooled by synthetic voice software. In a high-stress, emotionally charged moment triggered by the sound of a loved one’s panicked voice, victims are especially vulnerable. It takes just one moment of trust, offered out of love and a desire to do the right thing, to be deceived into losing a large sum of money.

The person calling you is not your child.

Instead, it’s a fraudster who has obtained your child’s voice, successfully synthesized it, and is playing it back to you, asking for money to be urgently transferred to their account. A repugnant, but highly lucrative, scam. These scams have become so prevalent that three-letter government agencies (including the FTC) are recommending setting up a form of two-factor verification with your loved ones (and agencies in Europe and Canada are doing the same!).

Social Engineering of Employees

To the professional fraudster, fooling a (grand-)parent is easy, their heartstrings and loving nature ripe for exploitation, but how about fooling a bank employee at their place of work, while on duty? What emotional space can the professional fraudster exploit in that scenario?

Deepfakes have been used to mimic the voices of business executives to convince their employees to fraudulently transfer millions of dollars (see also this article). Under the pretense of an urgent business deal needing to be executed, or by changing critical financial details, fraudsters can successfully dupe employees via internal phone calls or team collaboration software.

Gaining Access to Privileged/Protected Accounts

In the first two scenarios we explored how the professional fraudster leverages human vulnerability, in particular their victim’s emotions as they hear the voice of a child, or their boss’s voice.

But what happens when the professional fraudster engages with banks and other organizations that have added layers of more sophisticated authentication processes, creating a safety net of sorts for their very human employees as they interact with customers?

To skip right to the end: as of the writing of this article, third party fraud attacks against accounts protected by voice authentication are virtually unheard of. To understand why, we must remember that fraud is like water, and follows the path of least resistance.

Going back to the beginning, every successful fraud attack (e.g., against a bank) contains two key ingredients: (1) an identity claim, and (2) overcoming an authentication challenge. Once a fraudster has identified their victim, they need to pass between one and three types of challenge: knowledge, possession and/or inherence.

Answers to knowledge-based challenges, such as Social Security numbers, dates of birth and mothers’ maiden names, can be purchased in bulk (recall various headlines on data breaches over the years) for a nominal price.

Possession-based challenges add another step to overcome, but the professional fraudster has several tactics commonly employed today, whether it’s infecting a victim’s device with malware, performing a SIM swap or compromising an email account.

Inherence is the last frontier, where in the context of audio deepfakes, the fraudster would need to: obtain the victim’s voice, effectively synthesize it, and then orchestrate an attack against either an IVR or a contact center agent in an attempt to bypass a voice authentication system.

Compromising an account protected by voice authentication requires a critical ingredient: the victim’s voice. With the exception of content creators and public figures, the overwhelming majority of customers do not have high quality samples of their voices on the internet or on the dark web.

So how could our professional fraudster obtain a usable sample of your voice? The most probable method is a traditional phishing call: even speaking to an unknown caller about a benign subject exposes you to additional risk. Thankfully, decades of telemarketer calls from unfamiliar caller IDs have conditioned many of us to be skeptical of unknown callers, but the warning still bears repeating.

Other modalities besides voice can be used for inherence, but they too can be susceptible to various forms of fakery (a discussion beyond the scope of this post). They also tend to require the user to possess specialized hardware, which itself can introduce risks and dependencies, limiting their practicality.

Regardless, the complexities introduced in attempting to circumvent inherence-type challenges make them the most difficult (or time-consuming) for our professional fraudster to attempt. Possession is the next most-difficult type, with knowledge being the easiest to circumvent. Our ranking might look something like the below (though we invite further investigation and discussion on how to evaluate and judge all the different methods).

Which is the most secure form of authentication? Our proposed ranking places inherence at the top

In the end, the professional fraudster is running a tight, profitable operation: they don’t have time to fiddle with software and audio clips when other tactics are more effective. As we will see next, in this scenario the professional chooses the way of water.

Since we’ve now declared that inherence is the strongest type of challenge when it comes to securely authenticating an individual, it’s our winning choice, right? Actually, the correct answer is to use all of them together.

More is more when it comes to authentication: the more difficult it is for a fraudster to compromise an account, the less likely they are to invest their efforts attacking it
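The "use all of them together" principle can be sketched as a layered check, where access is granted only if every configured challenge passes. This is a minimal illustration; the class names, fields and checks below are hypothetical, not any vendor's API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Challenge:
    factor: str                      # "knowledge", "possession", or "inherence"
    verify: Callable[[dict], bool]   # takes the caller's presented evidence

def authenticate(challenges: list[Challenge], evidence: dict) -> bool:
    """Layered authentication: ALL configured challenges must pass."""
    return all(c.verify(evidence) for c in challenges)

# Toy account data on file (hypothetical)
on_file = {"mothers_maiden_name": "rivera", "device_id": "phone-123", "voiceprint": "vp-42"}

challenges = [
    Challenge("knowledge",  lambda e: e.get("mothers_maiden_name") == on_file["mothers_maiden_name"]),
    Challenge("possession", lambda e: e.get("device_id") == on_file["device_id"]),
    Challenge("inherence",  lambda e: e.get("voiceprint") == on_file["voiceprint"]),
]

# A fraudster holding only breached knowledge data fails the layered check:
breached_only = {"mothers_maiden_name": "rivera"}
print(authenticate(challenges, breached_only))  # False
print(authenticate(challenges, on_file))        # True
```

The point of the sketch is structural: breached knowledge data alone (the cheapest ingredient on the dark web) gets the fraudster past only one of three gates.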

Returning to our professional fraudster, who is likely to have what they need to bypass knowledge and possession challenges, how do they fare against voice authentication today?

The fraudster’s MO is simple: avoid accounts protected by voice biometrics and instead attack accounts that aren’t. If the account is protected, it is not worth the time or effort required to obtain the victim’s voice, synthesize it and execute a deepfake voice attack. It is much quicker, and easier, to move on to the next victim on the list who isn’t protected by voice authentication. The slower gazelles satisfy the appetite of the lions just as well as the fast ones.

This notion could not be more clearly illustrated than in the next graphic, where fraudster attack patterns are compared with and without the presence of voice authentication.

In an A/B test where half the population of an organization is protected by voice authentication and the other half uses traditional authentication methods, it's clear which side the fraudsters favor as victims

In an organization where 50% of customer accounts are protected by voice authentication, that segment only sees 3% of fraud attacks. The remaining 97% of the time, fraudsters simply move on to victims with less protection.
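Put differently, those headline figures imply a large per-account skew. A quick back-of-the-envelope calculation using the 50%/3% numbers quoted above:

```python
# Figures quoted above: 50% of accounts protected, 3% of fraud attacks hit that segment
protected_share = 0.50
attack_share_protected = 0.03

# Relative per-account attack rates for each segment
rate_protected = attack_share_protected / protected_share                 # 0.06
rate_unprotected = (1 - attack_share_protected) / (1 - protected_share)   # 1.94

print(round(rate_unprotected / rate_protected, 1))  # roughly 32x more attacks per unprotected account
```

In other words, under these figures an unprotected account attracts on the order of 32 times more fraud attempts than a voice-protected one.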

Voice authentication is fraudster repellent

Gazelles, take note: to reduce your fraud risk you need to run faster (use the stronger authentication means available).

Organizations, take note: If you want to attract more gazelles to graze on your grasslands you need to ensure they will have the means to run faster than the lions (require your customers to use strong authentication methods).

Tying these last points to our earlier review of how fraudsters are or aren't using audio deepfakes, we can also posit that:

In the hands of fraudsters, deepfakes are mainly used to target humans, not systems

This begs the question: what do customers and organizations’ decision-makers do with this information?

For customers it is easy: use the strongest authentication standard available to you. If your bank does not offer robust controls and a three-layered approach, choose one that does.

For organizations: implement and incentivize stronger authentication methods. Menu costs are not as trivial for organizations as they are for customers; the price of moving away from business as usual can be high. Buying and implementing new technologies, changing operating processes and disrupting customer experiences can make for a long and arduous road.

We are not there yet, but the lions are getting faster and more capable as deepfake voice technology advances, and the threat it represents is rising. This places more fraud risk on individuals and their accounts, and it creates fertile ground for more lions to appear. We as consumers need to protect ourselves and choose organizations that prioritize strong authentication. Nobody wants to be an account holder at, nor work for, the Bank of Yesterday.

tl;dr: fraudsters would love it if we all stopped using voice authentication. In one real-world scenario, 97% of call center fraud was committed against individuals with no voice authentication protection

I want to extend a special thanks to my colleague Tim Savage, fraud fighter extraordinaire. While Tim continues to contribute to this article series (oh yes, there's more coming), he stepped up to take the writing reins on this week's piece. Hope you all enjoy it as much as I have!
