Today, we know Siri and Alexa as the technology behind our smart speakers. With simple commands, lights can be turned on, music played and volume adjusted. But the capability of voice recognition has yet to be fully realized, and amazing applications are on the horizon, according to Matthew Perez ’13, a doctoral candidate in computer science at the University of Michigan.
Through his research in artificial intelligence, Perez is creating smarter computers that can recognize speech patterns in patients with neurocognitive diseases like Parkinson’s and Huntington’s. It’s complicated work, given speech irregularities, variability among patients and a scarcity of data, but the potential breakthroughs are huge.
With new technology, physicians can monitor their patients more frequently and receive more in-depth analysis of how their speech is changing, as opposed to traditional methods, which require in-person visits every few months. “Someone who has a motor impairment – they’re usually unable to produce speech as fluidly or eloquently as people who are healthy are able to do,” Perez explains. “What we try to do is look for biomarkers that can help doctors and medical clinicians track and monitor disease progression over time.”
What’s interesting about Perez’s work is how impactful he can be from behind the scenes as a programmer, developing models with data, which he describes as the “lifeblood of artificial intelligence.” “My focus is in artificial intelligence, but I feel very fortunate to be able to work with professionals from the medical field in order to come up with innovative solutions to real world problems,” he says. “Collaboration is key especially for specialized applications like disordered speech analysis.”
The medical and healthcare industries can greatly benefit from the vast data and analytics generated by machine learning, he adds. He points to examples of machine learning that are already in practice. For instance, Woebot is a chatbot that facilitates conversation for those with mental health issues. Additionally, research at Stanford University found that artificial intelligence algorithms can scan a chest X-ray in mere seconds, with readings consistent with the interpretations of radiologists.
Still, despite the enormous promise, researchers must move slowly as they train machines to analyze data and incorporate artificial intelligence in healthcare. Errors can cost lives and have grave consequences, Perez says. “It’s important to make sure we understand our models and ensure things are done properly.”
Perez didn’t plan to end up in the area he’s currently pursuing. After graduating from Punahou, he attended the University of Notre Dame with the intention of becoming an electrical engineer, like his father. But his first college coding class made him realize he preferred computer science. As an undergraduate, he conducted research with a team working on concussion detection using vocal biomarkers. They were ultimately hoping to create an iOS app that could use an athlete’s voice to determine whether he or she had suffered a concussion.
During his junior year, Perez interned at Garmin, the GPS and wearable technology company, where he got a taste of what it would be like to be a programmer. “I was doing a lot of coding and debugging work, and I liked it there,” he says. “But I was more interested in working on projects that would be five or 10 years down the road.” After talking with his professor, he decided to pursue his doctorate.
Perez is now three years into a program that typically spans five to six years. He wants to continue doing research after completing his doctorate, possibly for a large technology company like Google, which recently announced its Project Euphonia initiative to make voice interfaces, like Google Assistant, respond better to those with voice impairments and other disabilities.
The Google project incorporates a variety of research, nonprofit collaborations and voice data from volunteers with impaired speech. Perez says many of the skills from his own research are transferable. If you can teach a machine to identify disordered speech, you can also teach a machine to better recognize the speech of people with accents and speech impediments.
“What really interests me about research is the opportunity to work with amazing people with a range of skillsets and approach interesting problems,” Perez says.