The Scientific Quest to Use Voice As A Biomarker
Barbara Wilson Arboleda
Voice Rehabilitation, Expert in Power Voice and Rock/Pop Singing, Technophile, Project and Process Manager, and Budding Data Analyst
I’ve come across several articles recently enthusiastically proclaiming voice to be a specific and reliable biomarker for diagnosing diseases, particularly in the context of machine learning applications. However, the authors of these articles do not seem to be primarily voice researchers themselves. It makes me wonder if they are aware that quantifying the nature of voice signals remains an undiscovered Holy Grail in the field of voice pathology.
When I was in grad school, acoustic measures such as jitter, shimmer, and noise-to-harmonic ratio were considered reliable ways to quantify clinicians’ perceptual evaluation of voice. As research progressed, numerous articles demonstrated that these measures were unreliable for defining specific disorders. Moreover, there were significant anomalies in which dysphonic voices produced normal results.
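For readers who have not met these measures, here is a minimal sketch of how local jitter and local shimmer are commonly defined: the average change between consecutive cycles, divided by the average value. The function names and the sample numbers are my own illustration, not taken from any particular analysis package. Note that both functions assume the individual cycles have already been found, and that step is exactly where irregular voices cause trouble.

```python
def relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def local_jitter(periods):
    """Cycle-to-cycle variation in period length (periods in seconds)."""
    return relative_perturbation(periods)


def local_shimmer(amplitudes):
    """Cycle-to-cycle variation in peak amplitude (any linear unit)."""
    return relative_perturbation(amplitudes)


# Hypothetical cycle measurements, invented for illustration only.
periods = [0.010, 0.011, 0.010, 0.011]
amplitudes = [1.0, 0.9, 1.0]

print(f"jitter:  {local_jitter(periods):.1%}")
print(f"shimmer: {local_shimmer(amplitudes):.1%}")
```

Everything here rests on the list of cycles passed in. If the cycle boundaries are misidentified, the resulting numbers describe the detection error as much as the voice.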
Then came Cepstral Peak Prominence (CPP), which represented an advancement in analysis because it does not rely on identifying the boundaries of individual sound signal cycles. CPP can be obtained using either a sustained vowel or connected speech, but concerns remain about its reliability in analyzing rough versus breathy voices.
Even if we assume these measurements can differentiate between disordered and normal voices, which is questionable, none of these measures can identify the underlying cause of any specific dysphonia. My philosophy is that "dysphonia is not diagnostic." Once you understand the multitude of factors that can contribute to voice disorders, you can see the overlapping characteristics that exist among them. Despite my trained ear, I am occasionally still surprised by what I see during a patient's laryngeal examination.
If we were to believe the enthusiastic researchers attempting to define voice biomarkers, we would think depression, Parkinson's Disease, coronary artery disease, and other conditions can be diagnosed with a few simple tasks spoken into a cell phone. I'm not yet convinced.
Let's take Parkinson's Disease as an example to illustrate the inherent challenges of this endeavor. Nearly 90% of people with Parkinson's will eventually experience voice and/or speech disorders. The typical presentation of individuals with PD includes hypophonia (quiet speech), breathiness, voice roughness, loss of inflection, reduced precision of speech sounds, and possibly an increased rate of speech.
People with Parkinson’s Disease also tend to be older, and older individuals are more likely to have vocal fold thinning, which can also cause breathiness and reduced loudness. While the aging voice may not always exhibit reduced inflection or articulation, not all people with PD do either. When faced with a binary choice between a person with Parkinson's Disease and a "healthy control," these characteristics may seem diagnostic. However, when dealing with the population of individuals in a typical voice clinic, where many disorders share these characteristics, the results become less certain.
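To put rough numbers on that difference, here is a small sketch of how the same test can look strong in a balanced study and weak in a clinic. Every figure below is hypothetical and chosen only to illustrate the arithmetic; none comes from a published study. The calculation is the standard positive predictive value: of the people the model flags, what fraction actually have the condition.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of positive results that are true positives."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical study: half PD, half healthy controls, few look-alike voices.
study = positive_predictive_value(sensitivity=0.9, specificity=0.9, prevalence=0.5)

# Hypothetical clinic: PD is uncommon, and other conditions (such as the aging
# voice) share its features, so specificity is assumed to drop.
clinic = positive_predictive_value(sensitivity=0.9, specificity=0.7, prevalence=0.05)

print(f"study:  {study:.0%} of flagged people have PD")
print(f"clinic: {clinic:.0%} of flagged people have PD")
```

Under these assumed figures, most people flagged in the clinic setting would not have Parkinson's Disease, even though nothing about the model itself has changed.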
As a technophile, I love exploring technological solutions to complex problems. Still, I'm careful not to let this enthusiasm lead to premature adoption of inaccurate models. I hope to see more voice clinicians and clinical voice scientists involved in this work. Their historical perspective would help prevent the adoption of hasty conclusions that could have significant impacts on patient care.