For People with Paralysis, Brain-to-Voice AI Streams Natural Speech

Summary: Researchers have created a brain-computer interface (BCI) that rapidly generates natural-sounding speech from neural activity, giving voice to people with severe paralysis. The method decodes signals from the brain’s motor cortex and uses AI to convert them into audible speech with minimal delay, in less than one second.

Unlike previous systems, this approach preserves accuracy, allows continuous speech, and can recreate a personalized voice. Thanks to this breakthrough, scientists are now close to enabling individuals with speech loss to speak in real time using only their brain activity.

Important Information:

  • Near-Real-Time Speech: The new BCI technology delivers intelligible speech within about a second of a speech attempt.
  • Personal Voice: The system recreates the user’s own voice using pre-injury recordings.
  • Device Flexibility: The approach works across several brain-sensing methods, including non-invasive options.

Source: UC Berkeley

A team of researchers from UC Berkeley and UC San Francisco has found a way to restore natural speech to people with severe paralysis, marking a major milestone in the field of brain-computer interfaces (BCIs).

This work solves the long-standing challenge of latency in speech neuroprostheses: the time lag between when a subject attempts to speak and when sound is produced. Drawing on recent advances in AI-based modeling, the researchers developed a streaming method that synthesizes brain signals into audible speech in near-real time.

To measure latency, the researchers used speech-detection methods to identify the brain signals indicating the start of a speech attempt. Credit: Neuroscience News

The technology, reported in Nature Neuroscience, represents a significant step forward in restoring communication for people who have lost the ability to speak. The study was supported by the National Institute on Deafness and Other Communication Disorders (NIDCD) of the National Institutes of Health.

“Our streaming approach brings the same rapid speech decoding capacity of devices like Alexa and Siri to neuroprostheses,” said Gopala Anumanchipalli, Robert E. and Beverly A. Brooks Assistant Professor of Electrical Engineering and Computer Sciences at UC Berkeley and co-principal investigator of the study.

“Using a similar type of algorithm, we found that we could decode neural data and, for the first time, enable near-synchronous voice streaming,” he added. “The result is more naturalistic, fluent speech synthesis.”

“This new technology has tremendous potential for improving quality of life for people living with severe paralysis affecting speech,” said neurosurgeon Edward Chang, senior co-principal investigator of the study.

Chang is in charge of the UCSF clinical trial that uses high-density electrode arrays to record neural activity directly from the brain surface.

“It’s exciting that the latest AI advances are greatly accelerating BCIs for practical real-world use in the near future.”

The researchers also showed that their approach can work well with a variety of other brain-sensing interfaces, including microelectrode arrays (MEAs), in which electrodes penetrate the brain’s surface, and non-invasive surface electromyography (sEMG) recordings that use sensors on the face to measure muscle activity.

“By demonstrating accurate brain-to-voice synthesis on other silent-speech datasets, we showed that this technique is not limited to one specific type of device,” said Kaylo Littlejohn, co-lead author of the study and a Ph.D. student in UC Berkeley’s Department of Electrical Engineering and Computer Sciences.

“The same algorithm can be used across different modalities, provided a good signal is there.”
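As a toy illustration of that modality flexibility, the sketch below shows one shared decoder operating on windows from different sensor types once each is projected into a common feature space. All channel counts, the adapters, and the stand-in decoder are assumptions for illustration, not the study’s architecture:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed channel counts for three sensing modalities (illustrative only).
MODALITY_CHANNELS = {"ecog": 253, "mea": 96, "semg": 16}
FEATURE_DIM = 32

def featurize(window):
    """Project a window from any modality into a shared feature space
    via a per-modality adapter (here, a simple averaging projection)."""
    channels = window.shape[0]
    adapter = np.ones((channels, FEATURE_DIM)) / channels
    return window @ adapter

def decode(features):
    """One shared decoder consumes the common features (toy stand-in)."""
    return features * 2.0

for name, n_ch in MODALITY_CHANNELS.items():
    window = rng.normal(size=(n_ch,))
    out = decode(featurize(window))
    print(name, out.shape)  # every modality yields the same decoded frame shape
```

The design point is that only the adapter changes per modality; everything downstream of the shared feature space is reused.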

Converting neural data into speech

According to study co-lead author Cheol Jun Cho, a UC Berkeley Ph.D. student in electrical engineering and computer sciences, the neuroprosthesis works by sampling neural data from the motor cortex, the part of the brain that controls speech production, and then using AI to decode that activity into speech.

“We are essentially intercepting signals where the thought is translated into articulation, in the middle of that motor control,” he said.

In other words, the system decodes what happens after a thought has occurred: after the user has decided what to say, which words to use, and how to move the vocal-tract muscles.
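The pipeline described here, short windows of motor-cortex activity mapped to acoustic output by a learned decoder, can be sketched roughly as follows. The window length, channel count, feature dimension, and the linear stand-in decoder are all illustrative assumptions, not the authors’ implementation:

```python
import numpy as np

WINDOW_MS = 80          # decode in short increments for low latency
CHANNELS = 253          # e.g., a high-density electrode grid (assumed)
N_ACOUSTIC_FEATS = 40   # e.g., mel-spectrogram bins (assumed)

rng = np.random.default_rng(0)

# Stand-in for a trained neural-to-acoustic decoder: a random linear
# projection here; in the real system this is a deep network.
W = rng.normal(size=(CHANNELS, N_ACOUSTIC_FEATS))

def decode_window(neural_window):
    """Map one window of neural activity (channels,) to acoustic features."""
    return neural_window @ W

def stream_decode(neural_stream):
    """Decode a (n_windows, channels) stream one window at a time."""
    return np.stack([decode_window(w) for w in neural_stream])

# Simulate one second of neural data chunked into 80-ms windows.
n_windows = 1000 // WINDOW_MS  # 12 windows per second
stream = rng.normal(size=(n_windows, CHANNELS))
features = stream_decode(stream)
print(features.shape)  # one acoustic frame per neural window
```

A vocoder would then turn each acoustic frame into audible sound as soon as it is produced, which is what makes the output streamable.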

To collect the data needed to train their algorithm, the researchers first had Ann, their subject, look at a prompt on the screen — like the phrase “Hey, how are you?” — and then silently attempt to speak the sentence.

“This gave us a mapping between the chunked windows of neural activity that she generates and the target sentence, without her ever needing to vocalize at any point,” Littlejohn said.

Because Ann has no residual vocalization, the researchers had no target audio, or output, onto which they could map the neural data, the input. They overcame this challenge by using AI to fill in the missing details.

“We used a pretrained text-to-speech model to generate audio and simulate a target,” Cho said. “And we also used Ann’s pre-injury voice, so when we decode the output, it sounds more like her.”
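The training setup Cho describes can be sketched as below: pair neural recordings from silent attempts with audio synthesized by a text-to-speech model, which stands in for the missing ground truth. Every function, name, and shape here is a hypothetical stand-in, not the study’s code:

```python
import numpy as np

rng = np.random.default_rng(1)

def fake_tts(sentence):
    """Stand-in for a pretrained text-to-speech model: returns a
    deterministic pseudo-audio vector whose length scales with the text."""
    return np.full(len(sentence) * 10, 0.1)

def record_silent_attempt(sentence):
    """Stand-in for recording neural windows during a silent speech
    attempt: one 253-channel window per few characters (assumed)."""
    n_windows = max(1, len(sentence) // 4)
    return rng.normal(size=(n_windows, 253))

prompts = ["Hey, how are you?", "Good morning."]

# Pair each silent-attempt recording (input) with TTS-simulated
# target audio (output) to form supervised training examples.
training_pairs = [(record_silent_attempt(s), fake_tts(s)) for s in prompts]

neural, target = training_pairs[0]
print(neural.shape, target.shape)  # neural windows paired with target samples
```

A real decoder would then be trained to minimize the mismatch between its synthesized audio and these simulated targets.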

Speech streaming in near real time

In their previous BCI study, decoding had a long latency: roughly an 8-second delay to produce a single sentence. With the new streaming approach, audible output is generated in near-real time, as the subject is attempting to speak.

To measure latency, the researchers used speech-detection methods to identify the brain signals indicating the start of a speech attempt.

“We can see, relative to that intent signal, that within one second we are getting the first sound out,” Anumanchipalli said. “And the device can continuously decode speech, so Ann can keep speaking without interruption.”
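A minimal sketch of this latency measurement, using a simulated intent signal and audio envelope. The sampling rate, thresholds, and timings are assumptions for illustration only:

```python
import numpy as np

FS = 200  # samples per second (assumed)

def first_crossing_time(signal, threshold):
    """Return the time (s) of the first sample exceeding threshold."""
    return int(np.argmax(np.abs(signal) > threshold)) / FS

t = np.arange(2 * FS)  # two seconds of samples

# Simulated "intent" signal: quiet until sample 100 (0.5 s).
neural = np.where(t >= 100, 1.0, 0.0)
# Simulated synthesized-audio envelope: silent until sample 260 (1.3 s).
audio = np.where(t >= 260, 0.2, 0.0)

onset = first_crossing_time(neural, threshold=0.5)
first_audio = first_crossing_time(audio, threshold=0.1)
latency = first_audio - onset

print(onset, latency)  # onset at 0.5 s; latency of about 0.8 s, under one second
```

The key idea is that latency is defined relative to the detected intent signal, not relative to any external cue.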

Crucially, this speed did not come at the cost of accuracy: the faster interface delivered the same high level of decoding accuracy as their previous, non-streaming approach.

“That’s promising to see,” said Littlejohn. “Previously, it was not known whether intelligible speech could be streamed from the brain in real time.”

One open question in AI, Anumanchipalli noted, is whether large-scale models are truly learning and adapting or simply pattern-matching and repeating portions of their training data. So the researchers also tested the real-time model’s ability to synthesize words that were not part of the training vocabulary — in this case, 26 rare code words from the NATO phonetic alphabet, such as “Alpha,” “Bravo” and “Charlie.”

“We wanted to see if we could really decode Ann’s speech patterns and generalize to the unseen words,” he said.

“We found that our model does this well, which shows that it is indeed learning the building blocks of sound or voice,” he added.
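The distinction being tested, pattern-matching versus learning building blocks, can be illustrated with a toy contrast. Both decoders and vocabularies below are invented for illustration and are not the study’s models:

```python
# A decoder that memorizes whole training words cannot handle new ones;
# a decoder built from sub-word units can. Letters stand in here for
# the phoneme-like units a real model would learn.
TRAIN_VOCAB = {"hey", "how", "are", "you", "good", "morning"}
UNITS = {c: c for c in "abcdefghijklmnopqrstuvwxyz"}

def memorizing_decoder(word):
    """Can only repeat words seen verbatim during training."""
    return word if word in TRAIN_VOCAB else None

def compositional_decoder(word):
    """Assembles any word from learned sub-word units."""
    return "".join(UNITS[c] for c in word.lower())

held_out = ["alpha", "bravo", "charlie"]  # NATO words absent from training

print([memorizing_decoder(w) for w in held_out])    # [None, None, None]
print([compositional_decoder(w) for w in held_out]) # ['alpha', 'bravo', 'charlie']
```

Success on held-out words is evidence for the compositional behavior, which is what the NATO-alphabet test probes.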

Ann, who also participated in the 2023 study, shared with researchers how her experience with the new streaming synthesis approach compared to the earlier study’s text-to-speech decoding method.

“She conveyed that streaming synthesis was a more volitionally controlled modality,” Anumanchipalli said. Hearing her own voice in near-real time increased her sense of embodiment.

Future directions

This most recent study sets the stage for future advancements while bringing researchers one step closer to achieving naturalistic speech with BCI devices.

“This proof-of-concept framework is a significant advance,” Cho said. “We are optimistic that we can now make advances at every level. For instance, we will continue to push the algorithm to see how we can generate speech better and faster.”

The researchers continue to work on adding expressivity to the output voice in order to reflect the variations in tone, pitch, or loudness that occur during speech, such as when someone is excited.

“That’s ongoing work, to try to see how well we can actually decode these paralinguistic features from brain activity,” said Littlejohn. “This is a long-standing problem even in classical audio synthesis, and solving it would close the gap to fully natural speech.”

Funding: In addition to the NIDCD, the Japan Science and Technology Agency’s Moonshot Research and Development Program, the Joan and Sandy Weill Foundation, Ron Conway, Graham and Christina Spencer, the William K. Bowes, Jr. Foundation, the Rose Hills Innovator and UC Noyce Investigator programs, and the National Science Foundation provided funding for this research.

About this AI and BCI research news

Author: Marni Ellery
Source: UC Berkeley
Contact: Marni Ellery – UC Berkeley
Image: The image is credited to Neuroscience News

Original Research: Closed access.
“A brain-to-voice streaming neuroprosthesis to restore naturalistic communication” by Gopala Anumanchipalli et al. Nature Neuroscience


Abstract

A brain-to-voice streaming neuroprosthesis to restore naturalistic communication

Natural spoken communication happens instantaneously. Speech delays longer than a few seconds can disrupt the natural flow of conversation, making it difficult for people with paralysis to participate in meaningful dialogue and potentially leading to feelings of isolation and frustration.

Here we used high-density surface recordings of the speech sensorimotor cortex in a clinical trial participant with severe paralysis and anarthria to drive a continuously streaming naturalistic speech synthesizer.

We designed deep-learning recurrent neural network transducer models that decode neural activity in 80-ms increments to achieve online, large-vocabulary, intelligible and fluent speech synthesis personalized to the participant’s preinjury voice.

Offline, the models demonstrated implicit speech-detection capabilities and could continuously decode speech indefinitely, enabling uninterrupted use of the decoder and further increasing speed.

Our framework also successfully generalized to other silent-speech interfaces, including single-unit recordings and electromyography.

Our findings provide a speech-neuroprosthetic paradigm to restore naturalistic spoken communication to paralyzed individuals.
