Talking Audio Forensics - the Theory and the Key Terms Digital Forensics Audio Specialists Use
Dr Todd Hutchison
CEO | Digital Forensic Investigator (Multimedia) | Adjunct Associate Professor in Business and Law (ECU) | International Bestselling Author | Project, Risk and Contract Management, and Behavioural Specialist
One of the difficulties for lawyers and clients working with digital forensic experts is the terminology and technical language, which can make even court-prepared investigation reports hard to read.
This article aims to provide a basic understanding of the fundamental physics of sound and how it is measured, and to explain the key terminology digital forensic practitioners may use when discussing the modification of audio-based evidence.
Sound Waves
Sound travels as waves of energy that displace air particles, creating an elastic force within and between the particles that propagates through the air. When these waves hit an object, they cause the molecules within it to vibrate at specific frequencies. This includes the human eardrum (known as the tympanic membrane), which passes the vibrations through the middle ear bones, or ossicles, to the inner ear.
Air is not the only medium sound can pass through. It also travels through other media, such as water and solids, although these can attenuate or distort the sound. Interestingly, space contains large empty regions with no molecules to vibrate, so there is no sound in space as we know it on Earth.
Microphones
A microphone is an acoustic-to-electric transducer that converts the energy of a sound wave into an electrical equivalent. Its internal diaphragm vibrates in response to the pressure changes caused by the sound wave, and the microphone uses one of various methods to turn that movement into an electrical signal, which is then fed to an analogue-to-digital converter to generate a stream of digital numbers. There are many microphone types, each better suited to voice or to particular instruments or environments. The main model types, which vary in their method of conversion, are capacitor (electrostatic), dynamic (electromagnetic) and piezoelectric (mechanical); they will not be covered in detail here.
Once the analogue-to-digital conversion has occurred, the stream of digital numbers represents the pressure differences of the waves received. A similar process in reverse can later use those numbers to drive a loudspeaker, recreating the sound so that the human ear picks it up as a recorded sound.
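As a rough illustration of how a pressure wave becomes a stream of numbers, the sketch below samples a sine wave and rounds each sample to a 16-bit integer. The function names, the 8 kHz sample rate and the 16-bit depth are assumptions chosen for the example; a real converter also involves filtering and other stages not shown here.

```python
import math

def sample_and_quantise(freq_hz, sample_rate_hz, n_samples, bit_depth=16):
    """Sample a unit-amplitude sine wave and quantise it to signed integers."""
    max_code = 2 ** (bit_depth - 1) - 1  # 32767 for 16-bit audio
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz  # time of sample n, in seconds
        pressure = math.sin(2 * math.pi * freq_hz * t)  # normalised to -1.0..1.0
        codes.append(round(pressure * max_code))  # nearest integer code
    return codes

def reconstruct(codes, bit_depth=16):
    """Reverse step: scale the integer codes back to the -1.0..1.0 range."""
    max_code = 2 ** (bit_depth - 1) - 1
    return [c / max_code for c in codes]

# One full cycle of a 1 kHz tone sampled at 8 kHz gives eight numbers.
codes = sample_and_quantise(freq_hz=1000, sample_rate_hz=8000, n_samples=8)
print(codes)
print(reconstruct(codes))
```

The integers are what gets stored in the audio file; scaling them back and sending them to a loudspeaker is the reverse process described above.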
Microphones also vary in the direction from which they are most sensitive, known as the ‘polar pattern’, which describes a three-dimensional orientation in space relative to the sound source. They are normally categorised as either omnidirectional, attempting to capture sound waves from any direction, or unidirectional, with a polar pattern concentrated toward a specific direction. The diagram below [i] shows these patterns, from the cardioid microphone, which captures in a heart-like shape, to the others.
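The difference between the two categories can also be put in numbers. The sketch below uses the common idealised textbook models, which are an assumption here and not taken from the cited diagram: an omnidirectional pattern has the same sensitivity at every angle, while a cardioid follows 0.5 × (1 + cos θ), where θ is the angle between the microphone's front and the sound source.

```python
import math

def omnidirectional(angle_deg):
    """Idealised omnidirectional pattern: equal sensitivity at every angle."""
    return 1.0

def cardioid(angle_deg):
    """Idealised cardioid pattern: full sensitivity in front, none behind."""
    return 0.5 * (1.0 + math.cos(math.radians(angle_deg)))

# Relative sensitivity in front (0), to the side (90) and behind (180).
for angle in (0, 90, 180):
    print(angle, round(omnidirectional(angle), 3), round(cardioid(angle), 3))
```

Real microphones only approximate these shapes, and their patterns usually change with frequency.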
Soundwave Structure
As the sound moves away from the source, the pressure, velocity and displacement of the medium vary over time. The distance a single wave cycle travels is known as a ‘wavelength’, and the number of cycles every second is the ‘frequency’, measured in ‘hertz’ (Hz). The wave also has a measurable height referred to as its ‘amplitude’, being the strength of the displacement of the medium’s particles, which is experienced as loudness.
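Wavelength and frequency are linked through the speed of sound in the medium: wavelength equals speed divided by frequency. The sketch below assumes a speed of 343 m/s, a commonly quoted figure for dry air at about 20 °C that is not stated in this article; the function names are illustrative.

```python
SPEED_OF_SOUND_AIR_M_S = 343.0  # assumed: dry air at about 20 degrees Celsius

def wavelength_m(frequency_hz, speed_m_s=SPEED_OF_SOUND_AIR_M_S):
    """Distance covered by one wave cycle, in metres."""
    return speed_m_s / frequency_hz

def period_s(frequency_hz):
    """Time taken by one wave cycle, in seconds."""
    return 1.0 / frequency_hz

for f in (20, 1000, 20000):
    print(f, "Hz:", wavelength_m(f), "m per cycle,", period_s(f), "s per cycle")
```

Under that assumption, a low 20 Hz tone has a wavelength of about 17 metres, while a 20 kHz tone has one of under 2 centimetres.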
The following diagram illustrates a sine wave, showing a wavelength and the peak amplitude point.
Whilst the number of cycles per second is measured in hertz (symbol Hz), sound pressure is measured in ‘pascals’ (symbol Pa), and the sound pressure level (SPL) is a logarithmic measure of the effective sound pressure of a sound relative to a reference value, expressed in ‘decibels’ (symbol dB).
The lowest sound audible to average human hearing, at frequencies between 1 and 2 kHz, is close to 0 dB, which has become the reference point for measuring sound levels. This corresponds to a sound pressure of 20 μPa in air (the reference pressure used in water is 1 μPa). In a similar way, the 0 VU point on a VU (Volume Unit) meter is often referred to as 0 dB, allowing signal levels to be measured relative to that reference level.
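The relationship between pascals and decibels can be sketched with the standard sound pressure level formula, SPL = 20 × log10(p / p0), using the 20 μPa reference for air given above. The function and constant names are illustrative.

```python
import math

REFERENCE_PRESSURE_AIR_PA = 20e-6  # 20 micropascals, the reference in air

def spl_db(pressure_pa, reference_pa=REFERENCE_PRESSURE_AIR_PA):
    """Sound pressure level in dB for an effective sound pressure in pascals."""
    return 20.0 * math.log10(pressure_pa / reference_pa)

def pressure_from_spl(level_db, reference_pa=REFERENCE_PRESSURE_AIR_PA):
    """Inverse: effective sound pressure in pascals for a level in dB."""
    return reference_pa * 10.0 ** (level_db / 20.0)

print(spl_db(20e-6))          # the reference pressure itself
print(spl_db(1.0))            # a pressure of 1 Pa
print(pressure_from_spl(60))  # pressure for a 60 dB level
```

A pressure equal to the reference gives 0 dB, and every tenfold increase in pressure adds 20 dB.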
Human Hearing
The human ear can most commonly hear between 20 Hz and 20 kHz (lab tests have shown ranges between 12 Hz and 28 kHz) [ii], and the higher frequencies are typically lost gradually as a person ages. The ear is an extremely versatile hearing device, with in-built mechanisms that reduce its own sensitivity as the sound level rises, allowing it to handle an enormous range of sound power levels. The logarithmic chart of hearing ranges in animals shown below [iii] illustrates that each species in the animal kingdom has its own limits:
The sounds heard by humans can be expressed as sound pressure levels, such as the typical values for different real-life scenarios in the diagram below [iv]:
As the frequency response of human hearing changes with amplitude, weighting curves have been established for measuring sound pressure. These are required because the human ear does not have a consistent gain factor at different frequencies, so a 1 kHz tone sounds louder than a 100 Hz tone at the same sound pressure level. The audio weighting curves aim to compensate for this variation.
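The article does not say which weighting curve is meant. As an illustration only, the sketch below assumes the widely used A-weighting curve and its standard closed-form approximation, which returns the correction in dB applied at a given frequency; the function name is illustrative.

```python
import math

def a_weighting_db(frequency_hz):
    """Approximate A-weighting correction in dB at the given frequency."""
    f2 = frequency_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalises the curve to about 0 dB at 1 kHz.
    return 20.0 * math.log10(numerator / denominator) + 2.00

for f in (100, 1000, 10000):
    print(f, "Hz:", round(a_weighting_db(f), 1), "dB")
```

The correction is close to 0 dB at 1 kHz and roughly -19 dB at 100 Hz, so a meter using this curve discounts low frequencies in line with the ear's lower sensitivity to them.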
There are three key aspects to understanding sound:
1. subjectively perceived loudness, which relates to the volume;
2. objectively measured sound pressure, which relates to the voltage present; and
3. theoretically calculated sound intensity, which relates to acoustic power.
Basically, when a person’s voice is captured by a microphone, it produces an audio voltage that is proportional to the sound pressure. From that pressure a sound intensity can be calculated, which represents the acoustic strength, or acoustic power, of the signal.
Sound intensity is expressed in decibels, as noted earlier, which reflects the wide dynamic range of the human ear by using a logarithmic scale with a base of 10 (the decibel scale). A logarithmic scale shows orders of magnitude: a fixed difference between two readings corresponds to a fixed multiplying factor.
Basically, the quietest audible sound, which represents perceived near total silence, is 0 dB. A sound 10 times more powerful is therefore 10 dB, a sound 100 times more powerful than near total silence is 20 dB, a sound 1,000 times more powerful is 30 dB, and so on.
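The arithmetic above follows from the decibel formula for power ratios, dB = 10 × log10(ratio). A minimal sketch, with illustrative function names:

```python
import math

def power_ratio_to_db(ratio):
    """Decibel difference for a given ratio of sound powers."""
    return 10.0 * math.log10(ratio)

def db_to_power_ratio(db):
    """Inverse: how many times more powerful a sound is for a dB difference."""
    return 10.0 ** (db / 10.0)

for ratio in (10, 100, 1000):
    print(ratio, "times the power:", power_ratio_to_db(ratio), "dB")

print("3 dB is a power ratio of about", round(db_to_power_ratio(3), 2))
```

Each factor of 10 in power adds 10 dB, and a 3 dB step works out to a power ratio very close to 2.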
This also means that a 3 dB increase is a doubling of the sound energy, whereas a 3 dB decrease is a halving of the sound energy. This leads to a series of basic rules for working with decibels [v]:
Sound meter measurements show typical noise levels such as [vi]:
- whisper - 15 dB
- library - 45 dB
- normal conversation - 60 dB
- toilet flushing - 75-85 dB
- noisy restaurant - 90 dB
- peak noise on a hospital ward - 100 dB
- baby crying - 110 dB
- jet engine - 120 dB
- Porsche 911 Carrera RSR Turbo 2.1 - 138 dB
- balloon popping - 157 dB
Human hearing can be damaged by continued exposure to loud noise or by short bursts, and the next diagram [vii] indicates the time limits for safely bearing a noise without hearing protection equipment:
Digital Forensics
Understanding this basic theory makes it easier to understand what the digital forensics practitioner is attempting to do. Audio-based digital forensics often requires increasing the perceived loudness of a voice or other sound so that the listener can better understand speech, or enhancing a particular sound. The aim is to make the desired sound more audible to the human ear.
The work may also involve reducing unwanted sounds, often referred to as noise reduction, or selectively removing or enhancing specific sounds.
Digital forensic work is not like audio engineering, which has to bring signals together and manipulate the inputs to create a high-quality recording. It is generally about working on recorded signals that are analysed and modified to suit the needs of the investigation. That said, digital forensic practitioners may need to record sites or specific sounds, or re-enact events, to test theories or support the repeatability of events for evidence purposes.
Although there are many terms used in audio [viii], the following is a quick guide (shown alphabetically) to the key terminology needed to understand the basic modifications a digital forensic practitioner may make to sound files, most of which alter the signal in some way:
- Amplifying – a process of increasing the signal’s volume;
- Attenuation – a process of reducing the signal level to reduce its volume;
- Clipping – a process where the amplitude of a signal is limited to a maximum amplitude threshold;
- Compression – a process of reducing the dynamic range by reducing a signal’s output volume in relation to its input. This basically compresses a sound once it reaches a certain level;
- Equalisation – a process to boost or cut particular frequencies;
- Filtering – a process of amplifying, passing or attenuating a frequency or frequency range;
- Format conversion – a process to change the file type (format) that enables the audio track to be played on different devices;
- Gain – a process that enables increasing the signal’s volume;
- Limiter – a process of limiting the loudness of a sound once it reaches a set threshold;
- Masking – a perceptual phenomenon where one sound is affected by the presence of another (effectively cancelling it out), which may require one frequency to be manipulated in order to hear the other;
- Noise reduction – a process of removing certain frequencies or altering sounds with the intent to reduce or eliminate them; and
- Panning – a process of spreading the monaural signal in a stereo or multi-channel sound (e.g., moving between the left and right sound source).
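Several of the terms above can be shown on a short list of sample values. The sketch below is a simplified illustration, not how forensic software implements these processes: the function names, thresholds and ratio are assumptions, the compressor acts on each sample directly where real compressors track the signal level over time, and the moving average stands in for a very crude low-pass filter.

```python
import math

def apply_gain_db(samples, gain_db):
    """Amplify (positive dB) or attenuate (negative dB) every sample."""
    factor = 10.0 ** (gain_db / 20.0)  # amplitude ratio, so 20 rather than 10
    return [s * factor for s in samples]

def hard_clip(samples, threshold=1.0):
    """Clipping: limit every sample to the range -threshold..threshold."""
    return [max(-threshold, min(threshold, s)) for s in samples]

def compress(samples, threshold=0.5, ratio=4.0):
    """Compression: scale down the part of each sample above the threshold."""
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        out.append(math.copysign(level, s))  # restore the original sign
    return out

def moving_average(samples, width=3):
    """Filtering: average each sample with up to width - 1 previous samples."""
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out

signal = [0.0, 0.25, -0.5, 0.9, -1.0]
print(apply_gain_db(signal, 6.0))             # amplified by 6 dB
print(hard_clip(apply_gain_db(signal, 6.0)))  # peaks flattened at 1.0
print(compress(signal))                       # loud samples pulled down
print(moving_average(signal))                 # rapid changes smoothed
```

Note that amplifying before clipping flattens the loudest peaks, which is one reason a compressor or limiter is often preferred when quiet speech needs to be raised without distorting louder passages.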
More advanced digital forensics can be used for comparative analysis, such as tonal, accent and voice frequency analysis for voice recognition, or for examining audio files that constitute evidence to see whether they have been manipulated through editing or modification of the signals.
Digital forensics relies on audio specialists to enhance sounds and enable sound sources to be used as admissible evidence in court.
About the Author
Adjunct Associate Professor Todd Hutchison is an international bestselling business author, a nationally accredited trainer, certified speaking professional (CSP), and a digital forensics specialist. He teaches in a Masters of Engineering at Curtin University.
Todd is a qualified and police-licensed investigator who works as a digital forensic practitioner specialising in video and audio for Forensics Australia, and is an executive of law firm Balfour Meagher.
He graduated in broadcast engineering and also has related qualifications in camera operations, audio equipment operations, photography and music. He was the first degree graduate in legal project management worldwide. He is a world champion (grade 2) in music as a member of the WA Police Pipe Band, competing at the World Pipe Band Championships in Scotland, and a graduate of the famous Western Australian Academy of Performing Arts.
Todd also serves as the Global Chair of the International Institute of Legal Project Management.
He has trained police, military, lawyers, investigators and auditors.
References
[i] HomeRecordingPro, The Different Types of Microphone, URL: https://homerecordingpro.com/microphone-types/
[ii] Wikipedia, Hearing Range, URL: https://en.wikipedia.org/wiki/Hearing_range
[iii] Wikipedia, Animal Hearing Frequency Range, URL: https://en.wikipedia.org/wiki/Hearing_range#/media/File:Animal_hearing_frequency_range.svg
[iv] DEWESoft, Sound Pressure Measurement, URL: https://training.dewesoft.com/online/course/sound-pressure-measurement
[v] Pulsar (2019) What are Decibels, the Decibel Scale and Noise Measurement Units, URL: https://pulsarinstruments.com/en/post/understanding-decibels-decibel-scale-and-noise-measurement-units
[vi] Pulsar (2019) What are Decibels, the Decibel Scale and Noise Measurement Units, URL: https://pulsarinstruments.com/en/post/understanding-decibels-decibel-scale-and-noise-measurement-units
[vii] Pulsar (2019) What are Decibels, the Decibel Scale and Noise Measurement Units, URL: https://pulsarinstruments.com/en/post/understanding-decibels-decibel-scale-and-noise-measurement-units
[viii] Musician on a Mission, Audio Terms Every DIY Musician Needs to Know, URL: https://www.musicianonamission.com/audio-terms-by-subject/