In Tallinn, work is underway on a solution to overcome video call fatigue around the world!
Patrick Marmaroli
Senior Audio Engineer at Microsoft | Tech Entrepreneur & Signal Processing Specialist | Founder at Vocametrix & Hidacs
When people talk about video calls or meetings, they tend to think first of the video image, as the name suggests, but the sound is perhaps even more important. By now, almost everyone knows first-hand that a meeting with poor sound, where participants cannot hear one another, is a real headache. This is one of the reasons why we have a phrase like 'video call fatigue'. When you are talking and half the words don't get through, or something crashes halfway through, it's exhausting!
300 million users, 300 million different conditions
That's why at Microsoft we have people like Patrick Marmaroli, whose job in Microsoft's Tallinn office is to make video calls sound as good as possible. "The sound quality of a call can be affected by a number of factors, including the acoustic conditions of the workplace (background noise and echo), the quality of the equipment (microphones, speakers, headphones), the distance from the equipment and even the speed of the internet," explains the senior audio engineer. With more than 300 million people using Teams every month, making calls from their homes, offices, cafes, trains, streets, forests and hundreds of other locations around the world, Microsoft's audio engineers have to consider every possible scenario.
AI handles many of these scenarios. AI models can already effectively remove echoes and suppress background noise in real time, because they can distinguish between noise and speech. "If you eat something that crackles during a conversation, the other party can't hear it," Marmaroli gives a simple example.
In 2022, Microsoft introduced DeepVQE, an AI-powered audio signal processing model integrated into Teams, and Voice Clarity, a system for processing audio from Windows microphones, to ensure optimal speech quality even when the user is on the move or away from their laptop. Previously only available for Surface devices, Voice Clarity will become a built-in audio processing feature in Windows 11 this year, bringing it to a wider user base.
Microsoft will only allow certified devices to process calls
At the same time, headphone and speaker manufacturers are working to ensure the best possible sound quality. Some manufacturers are even integrating in-chip signal processing algorithms tailored to the specifications of their products. "In situations where better sound quality is provided by the device itself, Teams can offload the signal processing to the device's own algorithms, which reduces the load on the computer processor and provides fine-tuned enhancements for the user," explains Marmaroli.
However, there are two sides to every coin - if the sound quality is still poor for whatever reason, this can have a negative impact not only on the user, but also on Microsoft's reputation. For this reason, Microsoft only allows certified devices that meet the Microsoft Teams Audio Test Specifications to manage their own signal processing.
This certification programme will also be run from the former Skype building on Academy Road in Tallinn. The Media Benchmarking team, led by audio engineer Ergo Esken, will carry out audio tests and develop test setups and measurement methods to ensure compliance with telecommunications standards, drawing on years of experience dating back to the Skype days.
"Our audit lab tests are designed to establish standardised procedures, ensuring consistency across all testing facilities and, as closely as possible, reflecting real-world operating conditions," explains Marmaroli. According to him, the main objective is to assess the device's ability to effectively record and reproduce voice, reduce noise, eliminate echo, manage double-talk and respond to acoustic changes in both reverberant and ideal acoustic conditions. Put simply, the system has to stand up to any situation.
This thorough certification process in Tallinn helps to set high global standards for audio quality in video calls and online meetings, so that communication is clear and effective across different platforms and devices.
--
Translated from Estonian, source: Tallinnas käib töö lahenduse kallal, mis aitaks üle maailma videokõneväsimusest jagu saada! - DigiPRO (geenius.ee)