Web-Based Speech Tool May Help ID Parkinson’s in Real World
A new tool that analyzes speech using real-world, web-based recordings — participants are recorded via a webcam and a microphone connected to a personal computer or laptop — identified Parkinson’s disease patients with 74% accuracy, a study demonstrated.
Moreover, this speech tool — called Parkinson’s Analysis with Remote Kinetic-tasks, or PARK — performed equally well when sex and age were assessed separately, making it a useful diagnostic test for both men and women, of all ages, the study found.
Most importantly, according to the researchers, more than 90% of the participants were tested at home, in situations that “contain real-world variability.”
“Using this tool, we can collect data from almost anyone anywhere with an audio-enabled device and help the participants screen for [Parkinson’s disease] remotely, contributing to equity and access in neurological care,” the scientists wrote.
The study, “Detecting Parkinson Disease Using a Web-Based Speech Task: Observational Study,” was published in the Journal of Medical Internet Research.
In most cases, a Parkinson’s diagnosis requires an in-person visit to a healthcare facility, in which a clinician assesses disease symptoms as the patient performs certain tasks, such as limb movements and walking, as well as tests for memory, cognitive ability, and speech.
Assessing speech is essential because it can be an early indicator of the disease — notably, up to 90% of Parkinson’s patients show some speech impairment.
However, “access to neurological care for Parkinson disease (PD) is a rare privilege for millions of people worldwide, especially in resource-limited countries,” the researchers wrote.
In fact, they noted, India had just 1,200 neurologists in 2013 for its population of 1.3 billion people, and in Africa, there are more than 3.3 million people per neurologist.
Another key factor is the difficulty many people have, no matter what country they live in, in getting to a clinic or other healthcare facility for testing. That difficulty is often compounded for the elderly, those with other medical conditions, and those living in rural areas.
In recent years, there have been significant advances and promising results in speech analysis to identify Parkinson’s disease, especially given the new technologies that are now available. However, many studies have been limited by the small number of participants, and by discrepancies in age between patients and healthy individuals used as controls.
Furthermore, data collection at a clinic, with its controlled environment, can limit the amount and diversity of vocal patterns assessed, and may not reflect real-world speech, scientists say.
Now, researchers based at the University of Rochester, in New York, have developed an internet-based speech analysis tool in which participants can record a short reading task at home to be assessed for Parkinson’s.
To evaluate the tool’s ability to identify people with Parkinson’s-related speech impairment, the team analyzed audio recordings of 726 individuals from the U.S. and elsewhere. The researchers used the PARK web-based tool to collect speech recordings.
The participants included 262 patients (36.1%) and 464 healthy controls.
Among those with the neurological disorder, the mean age was about 66, and 38.5% were women. In the control group, the mean age was close to 58 years, and 64.6% were women. Most individuals with Parkinson’s were between the ages of 40 and 80. Meanwhile, those who did not have the disease included participants ages 20–40 and individuals ages 80–90.
Using a webcam and a microphone connected to a computer, participants read the passage, “The quick brown fox jumps over the lazy dog. The dog wakes up and follows the fox into the forest, but again the quick brown fox jumps over the lazy dog.” Of note, “The quick brown fox jumps over the lazy dog” is a popular pangram, a sentence containing all the letters in the English alphabet.
To compare with at-home testing conditions, the researchers had 54 participants (7.4%) provide data in the lab under the guidance of a coordinator. This allowed them to test the “performance of the models trained with noisy home environment data against high-quality laboratory-environment data.”
Digital data were uploaded to a computer, and specific audio features from the participants’ speech were analyzed. Those features included the voice’s overall pitch, or tone and frequency, along with variations in pitch and in loudness, which is also known as amplitude. Also extracted were mel-frequency cepstral coefficients, or MFCCs, which are parameters based on the regular frequency patterns underlying vocalizations and which are also used in speech recognition systems.
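As a rough illustration of what “variations in pitch and loudness” can mean in practice, the sketch below estimates a pitch and a loudness value for each short frame of a synthetic tone, then reports how much each varies. It is not the PARK feature pipeline: the signal is artificial, the function names, sample rate, and frame length are assumptions made for this example, and MFCCs are left out entirely.

```python
import math
import statistics

SAMPLE_RATE = 8000  # samples per second (assumed for this toy signal)
FRAME_LEN = 400     # 50 ms per frame at the assumed sample rate


def synth_frame(freq_hz, amplitude):
    """One frame of a pure tone, standing in for a voiced speech segment."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(FRAME_LEN)]


def rms_loudness(frame):
    """Root-mean-square amplitude, a simple proxy for loudness."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def estimate_pitch_hz(frame, fmin=75.0, fmax=400.0):
    """Pick the strongest autocorrelation peak within a plausible voice range."""
    min_lag = int(SAMPLE_RATE / fmax)
    max_lag = int(SAMPLE_RATE / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return SAMPLE_RATE / best_lag


def relative_variation(values):
    """Standard deviation divided by the mean (coefficient of variation)."""
    return statistics.pstdev(values) / statistics.mean(values)


# Hypothetical "recording": four frames whose pitch and loudness wobble.
frames = [synth_frame(f, a)
          for f, a in [(200, 1.0), (160, 0.8), (200, 1.0), (160, 0.8)]]

pitches = [estimate_pitch_hz(frame) for frame in frames]
loudness = [rms_loudness(frame) for frame in frames]

features = {
    "mean_pitch_hz": statistics.mean(pitches),
    "pitch_variation": relative_variation(pitches),
    "loudness_variation": relative_variation(loudness),
}
print(features)
```

Real recordings are far messier than a pure tone, so a practical system would need voiced-segment detection and noise handling before numbers like these became meaningful.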
The data were analyzed by four machine learning models (a support vector machine, or SVM; XGBoost; LightGBM; and random forest) to classify participants as having Parkinson’s or not.
The primary outcome was the area under the curve (AUC), in which a value of 1 indicates perfect separation between those with and without Parkinson’s. In contrast, a value of 0.5 denotes an inability to distinguish between the two groups. Percent accuracy was also calculated.
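To make the two metrics concrete, here is a minimal sketch that assumes the AUC in question is the usual area under the receiver operating characteristic curve. Under that assumption, the AUC equals the chance that a randomly chosen person with Parkinson’s receives a higher model score than a randomly chosen person without it. The labels and scores are made up for illustration and are not study data, and the 0.5 decision threshold is likewise an assumption.

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of examples classified correctly at a fixed score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Made-up example: 1 = Parkinson's, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
perfect = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]        # every patient outranks every control
uninformative = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]  # scores carry no information
mixed = [0.9, 0.4, 0.6, 0.5, 0.2, 0.7]          # partial separation

for name, scores in [("perfect", perfect),
                     ("uninformative", uninformative),
                     ("mixed", mixed)]:
    print(name, round(roc_auc(labels, scores), 3), round(accuracy(labels, scores), 3))
```

Unlike accuracy, the AUC does not depend on where the decision threshold is set, which is one reason the two numbers can tell different stories about the same model.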
Overall, the AUC for all four models ranged from 0.745 to 0.753, and accuracy ranged from 72% to a high of 74.1% with XGBoost.
Because a person’s voice is influenced by age and sex, the team assessed these categories separately. Generally, female voices are higher in pitch during youth and gradually become lower with age, whereas male voices start out higher in pitch, drop much faster with age, then rise around age 45.
The AUC for males ranged from 0.725 to 0.795, with accuracies ranging from 66.5% to 71.7%. For females, the AUC ranged from 0.659 to 0.717, with accuracies of 76.3% to 78.8%. An analysis of those older than 50 yielded an AUC of 0.739 to 0.755, with accuracies of 70.4% to 72.3%.
Finally, the speech features that most influenced the results were dominated by MFCC-related features and by a group of validated features of impaired speech that are used to detect Parkinson’s from the pronunciation of vocal sounds such as “ahh.” In the age-specific analysis, similar features, along with pitch-related characteristics, accounted for the differences between those with and without the disease.
The researchers said the results of this study show a new way forward for assessing people for Parkinson’s disease.
“We analyzed the running speech task from the data collected by using a web-based data collection platform that can be accessed by anyone anywhere in the world and requires only an internet-connected device with an integrated camera and microphone,” the researchers wrote.
“Our model performed equally well on data collected in a controlled laboratory environment and in the wild across different gender and age groups,” they concluded.
However, the team said more work still needs to be done to improve the PARK tool.
“To be practically deployable in clinical settings, the performance needs to be improved further,” the researchers said. The team will focus on making the data set balanced, designing better features, making the model more resilient to noise, and working to better assess age and gender variables.
In the future, “we believe it could potentially be generalized for real-world deployments,” the team said.