What's new with audio technologies? - Part II: microphones and audio processing

In a previous edition of this newsletter, I focused on speaker technologies. In this edition, I cover microphones and the major audio processing technologies.

How humans perceive sound: 101

At the most basic level, the human ear converts sound waves into electrical signals that your brain processes. Sound waves are characterized by their amplitude, frequency and direction. Your eardrum (or "tympanic membrane") vibrates, the vibrations are conducted through small bones, and a fluid-filled hearing organ (the cochlea) converts them and passes the information to a nerve connected to your brain. The human ear can detect sounds coming from different directions, and your brain can focus on one sound source among many (this is called the "cocktail party effect"). As you age, your ability to hear high-frequency (high-pitched) sounds diminishes.
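
To make the amplitude and frequency vocabulary concrete, here is a minimal Python sketch (purely illustrative) that synthesizes a pure tone: amplitude maps to loudness and frequency to pitch.

```python
import numpy as np

sample_rate = 44_100     # samples per second (CD quality)
duration_s = 1.0
frequency_hz = 440.0     # pitch: the musical note A4
amplitude = 0.5          # loudness, on a linear scale where 1.0 = full scale

t = np.arange(int(sample_rate * duration_s)) / sample_rate
tone = amplitude * np.sin(2 * np.pi * frequency_hz * t)

# Age-related hearing loss mostly affects the top of the audible range;
# try raising frequency_hz toward 15_000.0 to test your own ears.
```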

Microphones also convert sound waves into electrical signals. The sound is then processed using processors and audio algorithms. Let's take a look at these two technologies.

Microphone technologies

There are half a dozen microphone technologies, all based on converting the pressure pattern of sound waves into electricity.

The four most widely deployed technologies are:

  • Dynamic microphones: this is the most common type of microphone, where a diaphragm displaces a coil of wire within the magnetic field of a magnet to generate electricity. It is essentially the reverse of how an electric motor works (we all learned the bicycle dynamo effect at school, where a coil turning around magnets lights up your bike's lamp).
  • Condenser microphones: a diaphragm (front plate) is placed very close to a backplate; when the distance between the two changes, the capacitance changes, which in turn alters the electrical characteristics of the circuit. "Capacitance" measures how much electric charge can be stored for a given voltage applied between the front plate and backplate, so this type of microphone needs to be continuously powered for the measurement of sound pressure to work (a worked numerical sketch follows after this list). They are mostly found in recording studios and are considered high-end solutions.
  • Piezoelectric microphones (also called "contact microphones") use materials such as quartz or certain ceramics whose intrinsic property is to generate electricity when subjected to mechanical stress or vibration. What's interesting about these is that they can be used as speakers as well, since applying an electrical signal to the material makes it oscillate. They are mostly used where they can be in contact with a resonating surface (for example, attached to a musical instrument, or in ultrasound medical imaging applications).
  • MEMS microphones are a newer generation of thin microphones that can be easily mounted on standard electronic boards. They are widely used in cell phones and tablets given their miniature size. They rely on a capacitance measurement similar to condenser microphones but require much less voltage and are hence better suited for battery-operated applications. The micro-fabrication techniques used in MEMS microphones make them particularly cost-, power- and size-efficient.
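
To illustrate the condenser/MEMS capacitance principle mentioned above, here is a small Python sketch using the ideal parallel-plate formula C = ε0 × A / d. The diaphragm size, gap and bias voltage are illustrative values, not figures from a real capsule datasheet.

```python
import math

EPSILON_0 = 8.854e-12  # vacuum permittivity, in F/m

def plate_capacitance(area_m2: float, gap_m: float) -> float:
    """Ideal parallel-plate capacitance C = eps0 * A / d (air gap)."""
    return EPSILON_0 * area_m2 / gap_m

# Illustrative capsule: 1 cm diameter diaphragm, 20 um gap, 48 V bias.
area = math.pi * 0.005 ** 2      # m^2
gap = 20e-6                      # m
bias_v = 48.0

c_rest = plate_capacitance(area, gap)
charge = c_rest * bias_v         # charge stays ~constant behind a large resistor

# A sound wave pushes the diaphragm 0.1 um closer to the backplate:
c_moved = plate_capacitance(area, gap - 0.1e-6)
v_moved = charge / c_moved       # V = Q / C

print(f"Capacitance at rest: {c_rest * 1e12:.1f} pF")       # ~34.8 pF
print(f"Voltage swing: {(v_moved - bias_v) * 1e3:.0f} mV")  # ~ -240 mV
```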

Audio processing: what's new

Once a microphone outputs an electrical signal, it is typically amplified and then processed by a computer or a specialized processor (such as a DSP, or Digital Signal Processor).

Typical algorithms include:

  • Noise suppression: the definition of noise can be broad (in automotive applications, for example, you have wind noise, road noise from the tyres, internal fan noise, etc.), but generally speaking the idea is to isolate an undesirable audio wave that is mixed with another one and remove or attenuate it. Noise-suppressing earbuds or headsets have integrated microphone(s) capturing the audio environment; before playing sound through their speakers, they can separate a constant, recurring audio pattern (like jet engine noise) from the rest of the environment and reshape the output audio signal accordingly (a crude sketch follows after this list).
  • Acoustic echo cancellation: we have all been in situations, especially in the early days of internet calling, where we could hear an echo of our own voice. This is called acoustic echo. Without headsets, what happens in a two-way call is that your mic picks up your voice, your voice is transmitted and played on the far-end recipient's speaker, and their mic picks it up and sends it back to you with a very annoying delay. This doesn't happen with headsets because the sender's voice goes directly into the recipient's ear and cannot be picked up by the headset mic. Well, guess what: it is possible to use algorithms to suppress the echo caused by this feedback loop even without headsets, and most video-calling applications use sophisticated algorithms to achieve that goal (a toy adaptive-filter version is sketched after this list).
  • Beamforming: using multiple microphones is common in several applications. A single microphone captures limited directional information, so if you want to record stereo or 360-degree sound you need more than one mic. In enterprise-grade internet calling solutions, microphone arrays are also often employed to select the microphone that best captures the voice of one person in a large room (and to identify the location of whoever is talking). Often paired with face-tracking technologies nowadays, beamforming algorithms combine or segment the sounds coming from multiple mics in real time (see the delay-and-sum sketch after this list).
  • Waveform reshaping: audio waves can also be processed to create desired effects between the recording and playback phases. For example, certain frequencies can be amplified or shifted to create musical effects or richer sounds, or to redistribute parts of the original audio content over multiple speakers. DJ and audio mixing tables are prime examples, but hearing aids can also be tuned to compensate for certain human ear deficiencies (for instance, shifting high-frequency content down to lower frequencies for older listeners). A tiny equalizer sketch follows after this list.
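
To make these ideas more tangible, here are a few toy Python sketches (illustrative only, not production code). First, noise suppression: a crude spectral-subtraction approach that assumes the start of the recording contains only the steady background noise.

```python
import numpy as np
from scipy.signal import stft, istft

def spectral_subtract(noisy: np.ndarray, fs: int,
                      noise_seconds: float = 0.5) -> np.ndarray:
    """Crude spectral subtraction. Assumes the first `noise_seconds`
    of the recording contain only the steady background noise."""
    nperseg = 512
    _, _, spec = stft(noisy, fs=fs, nperseg=nperseg)

    # Average magnitude of the noise-only frames = our noise estimate.
    hop = nperseg // 2
    n_noise_frames = max(1, int(noise_seconds * fs / hop))
    noise_mag = np.abs(spec[:, :n_noise_frames]).mean(axis=1, keepdims=True)

    # Subtract the noise magnitude, keep the original phase, floor at zero.
    mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
    cleaned_spec = mag * np.exp(1j * np.angle(spec))

    _, cleaned = istft(cleaned_spec, fs=fs, nperseg=nperseg)
    return cleaned
```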
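
Next, acoustic echo cancellation. Real cancellers are far more sophisticated, but the core idea can be sketched with a normalized least-mean-squares (NLMS) adaptive filter that learns the loudspeaker-to-microphone path and subtracts the predicted echo, assuming we have access to the far-end reference signal.

```python
import numpy as np

def nlms_echo_cancel(mic: np.ndarray, far_end: np.ndarray,
                     taps: int = 256, mu: float = 0.5) -> np.ndarray:
    """Subtract an adaptively estimated echo of `far_end` from `mic`.
    `mic` contains the near-end voice plus an echo of `far_end`."""
    w = np.zeros(taps)            # estimated echo-path impulse response
    out = np.zeros(len(mic))
    eps = 1e-8                    # avoids division by zero on silence
    for n in range(taps, len(mic)):
        x = far_end[n - taps:n][::-1]       # most recent far-end samples
        echo_hat = w @ x                    # predicted echo at the mic
        e = mic[n] - echo_hat               # residual: ideally, near-end voice
        w += (mu / (x @ x + eps)) * e * x   # NLMS weight update
        out[n] = e
    return out
```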
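
For beamforming, here is a basic delay-and-sum sketch for a uniform linear array: time-align every microphone toward a chosen direction, then average, so that the look direction adds coherently while other directions partially cancel.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air at room temperature

def delay_and_sum(mics: np.ndarray, fs: int, spacing_m: float,
                  steer_deg: float) -> np.ndarray:
    """Steer a uniform linear array toward `steer_deg` (0 = broadside).

    `mics` has shape (n_mics, n_samples). Convention: for a far-field
    source at a positive angle, mic 0 is hit first and mic k receives
    the wavefront `delays_s[k]` seconds later.
    """
    n_mics, n_samples = mics.shape
    delays_s = (np.arange(n_mics) * spacing_m
                * np.sin(np.radians(steer_deg)) / SPEED_OF_SOUND)

    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    spectra = np.fft.rfft(mics, axis=1)
    # Advance each channel by its geometric arrival delay (a phase shift in
    # the frequency domain, which allows fractional-sample delays), so the
    # look direction adds coherently while other directions partially cancel.
    aligned = spectra * np.exp(2j * np.pi * freqs[None, :] * delays_s[:, None])
    return np.fft.irfft(aligned.mean(axis=0), n=n_samples)
```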
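
Finally, waveform reshaping: a tiny FFT-based equalizer sketch that boosts one frequency band. Real mixing consoles and hearing aids use far more refined filter banks, but the principle is the same.

```python
import numpy as np

def boost_band(signal: np.ndarray, fs: int,
               lo_hz: float, hi_hz: float, gain_db: float) -> np.ndarray:
    """Scale the magnitude of all FFT bins between lo_hz and hi_hz."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    spectrum[band] *= 10 ** (gain_db / 20.0)   # dB to linear gain
    return np.fft.irfft(spectrum, n=len(signal))

# Example: warm up a voice track by boosting 150-400 Hz by 3 dB:
# warmer = boost_band(voice, 44_100, 150.0, 400.0, 3.0)
```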

I would argue that one of the most interesting audio technologies that has still not fully taken off (outside the computer gaming industry) is audio object encoding, which was much discussed around 2000 during the genesis of the MPEG-4 audio/video encoding standard.

The idea is to pair the recording of sound with geometric information about the location of the sound sources, and/or rules as to how the sound should be reshaped as objects move within a scene (like the speed and direction of a moving sound source). The scene may include a listener moving around, or surrounding audio sources that are themselves moving.

VR and video games do this routinely now (for example, when wearing a VR headset and turning your head, the sound is reshaped in real time to rebalance the volume between your two ears, emphasizing the directionality of the audio environment and giving a "special" feeling to the experience). A simplified sketch of this follows below.
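
As a greatly simplified sketch of that head-tracking idea, here is constant-power stereo panning driven by the source azimuth relative to the listener's yaw. Real VR engines use full HRTFs (head-related transfer functions), not just level differences between the ears.

```python
import numpy as np

def pan_for_head(mono: np.ndarray, source_azimuth_deg: float,
                 head_yaw_deg: float) -> np.ndarray:
    """Return an (n_samples, 2) stereo buffer using constant-power panning:
    a source toward the listener's right gets more energy in the right ear."""
    relative_deg = source_azimuth_deg - head_yaw_deg
    # Map [-90, +90] degrees to a pan angle in [0, pi/2]; cos^2 + sin^2 = 1
    # keeps perceived loudness constant across the arc.
    pan = (np.clip(relative_deg, -90.0, 90.0) + 90.0) / 180.0 * (np.pi / 2)
    left_gain, right_gain = np.cos(pan), np.sin(pan)
    return np.stack([mono * left_gain, mono * right_gain], axis=1)

# As the listener turns (head_yaw_deg changes), recompute the gains every
# audio block and the source appears anchored in the virtual scene.
```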

However, such algorithms are much less common in real-life applications. Imagine, for example, a car cockpit experience that amplifies alert signals on your car's numerous speakers depending on the direction of a safety hazard, or a home theatre audio system that tracks the position of a listener using a camera, wireless signals or other means to deliver an optimal audio rendering experience in your living room.

Conclusion

Acoustic technologies have evolved quietly over the years with the advent of mature MEMS fabrication, processor, and algorithmic technologies. Nowadays, it isn't rare to find multiple microphones, multiple speakers and a combination of sophisticated algorithms working together, for example in the automotive sector, videoconferencing, or computer gaming/VR applications. Given all these technologies, what's funny is that some audiophiles still prefer old-fashioned vinyl records over MP3 audio files, with vinyl sales growing steadily at around 3% per year. The moral of the story: don't be too quick to bury old technologies!


#audio #speaker #VR #MPEG-4 #MEMs #microphone #display #activesafety #audioprocessing


I hope you enjoyed this article. As always, feel free to contact me at [email protected] if you have comments or questions about this article (I am open to providing consulting services). More at www.lohier.com, and also see my book. Click here to subscribe to this newsletter and quickly access former editions.

Timothy Coats

Thinker, Problem Solver, Solutions Developer, Father & Husband, Student of History

1y

Frantz, that's a great summary! As someone who spent their formative professional years in active noise control, this brought back many memories...and a few nightmares from my past. I am excited to see where technology can take us in recreating the immersive effects in audio as we venture into AR/VR.

Cyril Laury

Group Electronics Technology Director at Forvia

1y

Nice overview again Frantz! I would just add that depending on the intended application, there are different performance parameters that come into consideration:
  • Microphone directivity: in a car, where the microphone is facing the user, unidirectional microphones are preferred to omnidirectional ones; the latter can be used in headphones for noise cancellation or voice recording.
  • Microphone dynamic range: it is often difficult to find (at least for a reasonable price) a microphone that provides good sensitivity AND a wide dynamic range, able to withstand powerful sounds without clipping.
  • Integration: also often overlooked, proper microphone integration is key to ensuring strong performance, especially in cars, where the parts surrounding a microphone can conduct a lot of noise to it if not well damped.
All of this explains why, even though MEMS microphones offer a lot of interesting characteristics (size, sensitivity, power consumption), they are not ubiquitous and we still see other technologies remaining.
