AI and ML to help Alzheimer's patients: combining GFP-GANs, time-travel rephotography, deepfakes and speech synthesis.
If you have a loved one suffering from Alzheimer's, you know that old photos are part of a 'memory box' you assemble to help them on this arduous journey. Putting one together for my mother before she became fully aphasic (unable to speak) took me an incredibly long time, but to this day she still goes back to 'her box' to look at images of better times.
In this article, I explore how we can restore, animate and create deepfake avatars complete with synthesized voices out of damaged photographs, today. This can transform how we think of this 'memory box' used to support people suffering from dementia.
AI is changing everything, including opening new avenues to dementia treatment and care.
Step 1 - Restoring the image.
"They Shall Not Grow Old" did not use mind-bending AI technology to achieve its goal of bringing WW1 to life. Yet it unflinchingly conveyed the horrors of WW1 to millennials used to 4K HDR, an audience that does not relate emotionally to grainy, black-and-white footage.
Now that GANs are a dime a dozen on GitHub, better-than-Hollywood results are available to computer enthusiasts and will undoubtedly make their way into commercial software. Expect a tidal wave of "AI-enhanced" photo restoration suites producing better outcomes than highly paid experts.
Cutting through the jargon: GAN is a fancy acronym for 'generative adversarial network', a machine learning (ML) model in which two neural networks compete with each other. One (the generator) creates images, the other (the discriminator) tries to tell them apart from real ones, and both get better at their job as a result. In other words, make machines fight machines to create... better machines! :)
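For the curious, here is what 'machines fighting machines' can look like in code. This is a toy sketch in plain Python, not GFP-GAN or any real restoration model: the 'images' are single numbers drawn around 4.0, the generator is a straight line (a*z + b) and the discriminator is a single sigmoid unit. Every name, starting value and learning rate below is an assumption picked for illustration.

```python
import math
import random


def sigmoid(t: float) -> float:
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def discriminator_step(w, c, real, fake, lr):
    """One gradient step on -log D(real) - log(1 - D(fake)), with D(x) = sigmoid(w*x + c)."""
    grad_w = grad_c = 0.0
    for xr, xf in zip(real, fake):
        d_real = sigmoid(w * xr + c)
        d_fake = sigmoid(w * xf + c)
        grad_w += -(1.0 - d_real) * xr + d_fake * xf
        grad_c += -(1.0 - d_real) + d_fake
    n = len(real)
    return w - lr * grad_w / n, c - lr * grad_c / n


def generator_step(a, b, w, c, noise, lr):
    """One gradient step on -log D(G(z)), with G(z) = a*z + b (the generator tries to fool D)."""
    grad_a = grad_b = 0.0
    for z in noise:
        d_fake = sigmoid(w * (a * z + b) + c)
        common = -(1.0 - d_fake) * w  # derivative of the loss w.r.t. the fake sample
        grad_a += common * z
        grad_b += common
    n = len(noise)
    return a - lr * grad_a / n, b - lr * grad_b / n


def train(steps=2000, batch=32, lr=0.05, seed=0):
    """Alternate discriminator and generator updates; returns (a, b, w, c)."""
    rng = random.Random(seed)
    a, b, w, c = 1.0, 0.0, 0.0, 0.0  # arbitrary starting point
    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]  # stand-in for 'real photos'
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]
        w, c = discriminator_step(w, c, real, fake, lr)
        a, b = generator_step(a, b, w, c, noise, lr)
    return a, b, w, c
```

Calling train() and printing b shows where the generator's output is centred after the tug of war. A toy this small may well oscillate around the real data rather than settle on it, which gives a feel for why training real GANs is known to be finicky.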
The results speak for themselves:
Step 2 - Bring the image back to today's timeline.
Once the photo is restored, it still holds that 'old-fashioned' feel of yesteryear. Furthermore, it's probably low resolution and suffering from artifacts brought on by the restoration process or digital compression. Wrinkles, dated background and outdated technologies in the frame such as rotary phones all 'keep' the image firmly stuck in the past.
Thankfully, this is where AI can help once again - through a process called Time-Travel Rephotography. Yes, you read that right - and as I take you on various journeys through Artificial Intelligence and the wonders of Deep Learning, you'd better get used to this kind of terminology. We may not have flying cars - but we do live in the future - it's just not evenly distributed, as William Gibson famously said.
This Two Minute Papers video demonstrates that subsurface scattering can be added back to a photo through simulation for this beautiful, 'modern look'. A quick background change later, and Abraham Lincoln is holding a tablet to oversee the Civil War while Thomas Edison is pictured driving a Tesla (bad joke, I know).
Did you know? Subsurface scattering is how light penetrates a translucent surface, such as human skin, and scatters inside it before coming back out. Simulating it is what gives skin that soft, lifelike glow.
We now have grandma or great-grandpa looking all 2022-like. But let's avoid giving them face tattoos, please :)
Step 3 - Bringing still pictures to life.
Now that we have a perfect image of our loved ones, it's time to animate them. A few short years ago, a deepfake based on a single image would have been unthinkable. But again, progress in disentangling appearance and motion has made it possible to animate a Synthesia.io-style deepfake avatar out of a single photograph.
And this is not limited to Hollywood and their large-budget computer farms either - have a play with Wombo or Reface on your mobile phone to see what's possible today with a device that fits in your hand. Results may vary of course, but still - this is the kind of stuff that not so long ago would have seen you committed if you claimed it would be possible this century.
Step 4 - But what about speech?
This is where things are going to go from impressive to borderline creepy, so please fasten your seatbelts.
First, if we are given a sample of the person's voice, we have the option to simply use an off-the-shelf tool such as Respeecher to recreate the person's voice. My grandmother used to be an opera singer, something my mother was so proud of. With today's technology and the help of the French National Archives lending me a copy of one of her performances, even I can impersonate grandma and provide my mother with a new performance indistinguishable from the little that's left of her memories.
But it doesn't stop here - singing takes time and I might need to generate a lot of content. I might stumble over words, leading to time-consuming retakes. This is where reality surpasses (science) fiction: it's now entirely possible to synthesise an artist's performance, complete with all the subtleties of diction, or in the case of rappers, 'flow' and 'style'.
Don't believe me? Check out this performance by Eminem (warning: extremely strong language!) that Eminem never gave.
This was created by 30/40 Hertz, an AI enthusiast who is bringing artists back from the dead or creating 'impossible' music collaborations. Unbelievable as it may seem, the audio in this video is entirely computer-generated. It was not recorded by Eminem, but generated by an AI.
There are evidently incredibly complex moral and ethical consequences to this technology, as studios, known to vulture their way into releasing endless 'posthumous' albums, will simply be able to purchase the rights to an artist's 'image and performance style' and sadly produce Tupac albums for all eternity.
Conclusion - putting it all together.
You've now learned that, as of 2022, it's perfectly possible for someone with basic AI programming skills to gaffer tape together various tools downloaded from GitHub and generate an animated, talking version of long-lost family members to stimulate the minds of those that need it the most.
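To make the 'gaffer tape' idea a little more concrete, here is a rough sketch of how the four steps could be chained in Python. Everything in it is hypothetical: the stage names, the Memory class and the file name are placeholders of mine, and each stage only records that it ran. In a real build, each placeholder would hand over to whichever restoration, rephotography, animation or voice tool you picked.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Memory:
    """One item of the 'memory box' as it travels through the pipeline."""
    source: str  # path to the original photograph (hypothetical)
    steps_applied: List[str] = field(default_factory=list)


Stage = Callable[[Memory], Memory]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage; a real one would call an actual model here."""
    def stage(memory: Memory) -> Memory:
        return Memory(memory.source, memory.steps_applied + [name])
    return stage


# The four steps of this article, in order (names are illustrative only).
PIPELINE: List[Stage] = [
    make_stage("restore"),       # Step 1 - restore the damaged photo
    make_stage("rephotograph"),  # Step 2 - bring it to today's timeline
    make_stage("animate"),       # Step 3 - single-image animation
    make_stage("add_voice"),     # Step 4 - synthesised speech
]


def run_pipeline(source: str, stages: List[Stage] = PIPELINE) -> Memory:
    """Pass one photograph through every stage in sequence."""
    memory = Memory(source)
    for stage in stages:
        memory = stage(memory)
    return memory
```

Keeping each step behind the same tiny interface means a tool can be swapped out when a better one lands on GitHub without touching the rest of the chain.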
Give it a few years and all this will be automated in AI video-editing production suites that will slowly replace traditional NLEs (non-linear editors, think Premiere or Final Cut).
The real work is now in changing attitudes towards dementia, and getting care homes to implement new technologies rather than shy away from them, perhaps paradoxically using machines to provide a more 'humane' approach to mental health.
#techforgood #ai #gan #machinelearning #ml
About the Author
I'm Stephan Tual. My passion is in communicating the impossible while building lasting communities. I was the architect behind the marketing and partner strategy for Ethereum.
I'm currently helping companies like yours navigate complex and sometimes overwhelming next-generation technologies around the buzzword minefield: AI, ML, Deep Learning, Blockchain, etc.
I regularly speak at both small and large conferences or community events.
Website: https://stephantual.com.
Don't be shy, and let me know how to improve. Trust me, I've been on the receiving end of the most vicious criticism you can ever think of. Some constructive advice will not hurt me, and in fact I look forward to it! I want to put together the most interesting and accessible compendium on AI and ML, tracking these technologies as they progressively make their way into our daily lives. Thank you so much for your continued support.