Medical Devices Speaker Series 2023: Develop Advanced Brain-Machine Interfaces with Biomedical Signal Processing and AI Tools from MathWorks - MATLAB
    Video length is 30:48


    From the series: Medical Devices Speaker Series 2023

    Vikash Gilja, UC San Diego Jacobs School of Engineering

    Biomedical signal processing is at the heart of Brain-Machine Interfaces (BMIs), offering immense potential to enhance the quality of life for individuals with neurological disorders and disabilities. This presentation by Vikash Gilja from UC San Diego explores how these advanced systems necessitate the integration of sophisticated biomedical signal processing techniques and AI algorithms.

    Discover how MathWorks solutions can help you:

    • Acquire and analyze neural signals: Utilize powerful tools for biomedical signal processing and filtering to extract meaningful information from brain activity.
    • Develop and implement AI algorithms: Leverage machine learning and deep learning techniques to decode neural signals and translate them into control commands.
    • Design and simulate BMI systems: Model and test your BMI in a virtual environment before implementation, ensuring safety and efficacy.
    • Generate production-ready code: Automatically generate code from your algorithms, streamlining the development process and reducing the potential for errors.

    This presentation was given at the 2023 MathWorks Medical Devices Speaker Series, which is a forum for researchers and industry practitioners using MATLAB® and Simulink® to showcase and discuss their state-of-the-art innovations in the areas of medical device research, prototyping, and compliance with FDA/MDR regulations for device certification.

    About the presenter:

    Vikash Gilja is an associate professor in the Electrical and Computer Engineering Department and is a member of the Neuroscience Graduate Program faculty at the University of California, San Diego (UCSD). He directs the Translational Neural Engineering Laboratory, which develops and expands the capabilities of neural prosthetic systems. To realize this goal, the lab develops machine learning–based algorithms that interpret multimodal physiology data in real time, and it validates these algorithms by conducting experiments in multiple clinical settings and animal models. Gilja received a B.S. degree in brain and cognitive sciences and B.S./M.Eng. degrees in electrical engineering and computer science from MIT in 2003 and 2004, and his Ph.D. in computer science from Stanford University in 2010.

    Published: 18 May 2023

    --really like to introduce Dr. Vikash Gilja, here. He's an associate professor in the Electrical and Computer Engineering Department and is a member of the Neuroscience Graduate Program faculty at the University of California, San Diego, where we are right now. And today Dr. Gilja is going to talk about his work and research in the Translational Neural Engineering Laboratory-- from brains to machines. Right?

    Perfect. All right, Dr. Gilja. It is all yours.

    Thank you for the invitation and the introduction. My overall talk was originally titled "Brain-Machine Interface, Concept to Client," but, given the time and the audience, I wanted to reduce down to one case example, which is an interplay between what are known as "motor prostheses" and "speech prostheses" that are under development, to give you a sense of how basic animal models and clinical translation developments are occurring in this field.

    Some disclosures. I do some work in industry as well. And we'll come back to this in a moment. This is an emerging industry at the intersection of research and translation.

    So a little bit about me and my lab. So we focus squarely on neuroengineering. And what I mean by that is, as a lab we tend to focus on techniques that involve signal processing, machine learning, some embedded systems to build experiments and to test our ideas, and then we interact with folks in other areas of engineering to build complete systems. We draw from the basic neurosciences, drawing upon experimental techniques and interactions with domain experts, with the overall goal of developing systems for clinical translation with partners in the clinical domain. I refer to this broadly as "applied neuroscience."

    I sit in an electrical engineering department-- electrical engineering. We have physics and applied physics; in the same way, there's an emerging field of applied neuroscience that involves iteration between the basic neurosciences and engineering, with a target of clinical translation.

    So specifically we're going to be talking about brain-controlled motor prostheses. These are systems that measure activity from the brain-- use that activity to infer intended movements. And there's been quite a bit of work in this domain, in the last few decades, for upper-limb motor control-- so, motor control for virtual computer mice, for robotic limbs. And more recently, we're starting to focus on speech.

    And so, overall, my question is, how do we advance these clinical prostheses? In my lab we focus on this on both the application side and the measurement side. Today I'm only going to be focusing on one of these topics, but we do some work on probe design and optimization as well as on expanding that application space.

    One of the places where we focus on expanding the application space is actually on the other side of campus, at our medical center. There is an epilepsy monitoring unit where we have the unique opportunity to see the human brain in action. We can bring experiments and tasks into the unit, and many of the patients are kind enough to volunteer for our studies.

    And then, on the measurement side, there are a few different levels of measurement that we focus on. Specifically, I'm interested in measurement techniques that get us close to the underlying neurons. And we do that with intracortical electrodes-- electrodes that penetrate about a millimeter into the brain. We also do this with what is known as "subdural electrocorticography." This is one of the ways we test in clinic. These are devices that sit on the cortical surface.

    So some of the overall focus in the lab is to better develop these probes and the techniques to leverage them. But today we're going to be talking about vocal prostheses. And I'll be giving you a little bit of the background on what's happening in the clinical domain and how and why we're building a songbird model for speech-prosthesis development.

    So-- brain-controlled motor prostheses. How do these systems work? Much of the advance in this field has occurred because of an electrode-array device called a "Utah array." And this device allows us to access hundreds of neurons in primary motor cortex.

    So here is a research participant I was very fortunate to work with, code-named Q-- quartermaster from the Bond movies. She's a bit of a gadgeteer. And so we implanted this electrode array in her primary motor cortex, in the area that is related to control of the upper limb and the hand.

    We can take activity from each one of these electrodes. Here I'm showing a broadband neural-signal trace from just one of those channels. We have 100 of those channels.

    We can apply signal-processing techniques to extract features of interest, then apply decoding algorithms to estimate intent and then, in real time, use that intent to drive actuators-- in this case, a virtual computer mouse. A little shout-out to MATLAB: all of the signal conditioning through decoding in this particular piece of work was done with Simulink Real-Time. So this was a mix of MATLAB and Simulink, to get to this system build where, within tens of milliseconds, we could be measuring neural activity, inferring intent, and driving that computer mouse on screen.

    Today, with some of the later applications I'm going to show you, we use a mix of Simulink and Python tools. We use TensorFlow and PyTorch for some of our more modern models. So we're integrating Python as well.
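
    As a rough illustration of the closed-loop flow just described-- bin neural activity, decode intent, update the actuator within tens of milliseconds-- here is a minimal MATLAB-style sketch. The helper functions read_neural_frame and update_cursor are hypothetical placeholders, and the linear decoder is a stand-in; the actual systems in the talk were built with Simulink Real-Time, not this script.

        % Minimal sketch of one closed-loop decode cycle (illustrative only).
        % read_neural_frame and update_cursor are hypothetical placeholders.
        binMs  = 20;                       % bin width in milliseconds
        nChan  = 96;                       % e.g., one Utah array
        W      = randn(2, nChan);          % placeholder linear decoder (vx, vy)
        cursor = [0; 0];                   % on-screen cursor position

        while true
            counts = read_neural_frame(binMs);      % [nChan x 1] spike counts in this bin
            vel    = W * double(counts);            % decode intended velocity
            cursor = cursor + vel * (binMs/1000);   % integrate velocity to position
            update_cursor(cursor);                  % drive the cursor on screen
        end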

    So how do these systems work? Very high-level overview. There is a long history of studying motor-cortex neurons in animal model. The animal model of choice for many decades has been the monkey. You can train monkeys to complete motor tasks. You can record their neural activity in those motor areas. This all started with single electrodes, in the '60s, and then advanced to these multielectrode arrays.

    And so again, if we take a look at the signal that we would record, the amplified signal that we record on one of these electrodes, what you see is this low-frequency activity and these punctate spikes. The spikes are the electrical signature of neurons close to the electrode site. And so we can extract times when those neurons are firing. And the reason we do this is, if we've positioned our electrodes in motor cortex, the statistics of those spikes relate to movement. So they're informative about movement.
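
    For intuition, extracting those firing times from the broadband trace usually amounts to band-pass filtering and detecting threshold crossings. The sketch below assumes a 30 kHz recording and uses a filter band and threshold rule that are common in the literature, not necessarily the lab's settings; `raw` is a placeholder trace.

        % Sketch: threshold-crossing spike detection on one broadband channel.
        % Band and threshold are typical literature choices, not the lab's settings.
        fs  = 30000;                              % sample rate (Hz), assumed
        raw = randn(fs*10, 1);                    % placeholder: 10 s of broadband data
        hp  = bandpass(raw, [300 6000], fs);      % isolate the spike band
        thr = -4.5 * median(abs(hp)) / 0.6745;    % robust noise estimate -> threshold
        idx = find(hp(2:end) < thr & hp(1:end-1) >= thr) + 1;  % downward crossings
        spikeTimesSec = idx / fs;                 % spike times in seconds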

    Here-- just as a simple example-- is not moving versus moving. There are more subtle aspects of movement that are found in the statistical relationship between movement and the firing of these neurons. So, critically, we come down to these rasters, these spike times, that are on the order of milliseconds. And again, these days we're recording from many electrodes simultaneously. So we'll get hundreds of these rasters in modern recordings.

    So these ticks-- we can apply machine-learning techniques to map from these ticks to real-time movement intent. And this technique has been used for computer cursor control and then applied to general computer use, using tablet-based operating systems. So, here, just general web browsing as an example, and then typing with an onscreen keyboard.
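
    A toy version of that mapping step is to fit a linear decoder from binned spike counts to cursor velocity with least squares. Real systems typically use Kalman filters or neural networks and careful calibration data; everything below is a random placeholder, only the shape of the computation is meant to be illustrative.

        % Toy offline fit: binned spike counts -> cursor velocity (least squares).
        % X and V are random placeholders standing in for calibration data.
        X    = randn(5000, 96);               % [time bins x channels] spike counts
        V    = randn(5000, 2);                % [time bins x 2] cursor velocities
        Xa   = [X, ones(size(X,1), 1)];       % add a bias column
        W    = Xa \ V;                        % decoder weights, [97 x 2]
        Vhat = Xa * W;                        % decoded velocities (offline check)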

    Now, if you've ever tried to type with an onscreen keyboard with a computer mouse, you might realize that it's a pretty slow way to provide input. And for that reason, the field is starting to explore new techniques. And we'll get to that in a moment.

    But, for these studies, an enabling technology is something called the "Utah array," developed in the late '80s and early '90s. This was a key research device that is available for clinical use. It is approved for less than 30 days of monitoring, but there are multiple academic-based clinical trials that operate under FDA IDEs that allow for one-plus years of implantation and testing with these devices.

    And so this is a summary of a subset of current clinical BMI trials that are using the Utah array. I was involved with BrainGate in the past. And I will again note that BrainGate is a heavy user of MathWorks products, both MATLAB and Simulink. In many of these cases, Simulink Real-Time is being used to operate these devices.

    So an important question is, are these proof-of-concept examples good enough? And I'm going back to a survey study we ran back in 2015, but revisiting the results based on modern examples of BMI, because performance has been increasing year over year. And so what we did back in 2015 was, we surveyed a population of high-cervical-- individuals with high cervical injuries. These are quadriplegics-- individuals with quadriplegia.

    We gave them a survey where we described various brain-machine interface devices. Don't worry about the details of the text. This is what we presented them-- the device description, what surgery and maintenance would look like, what daily usage would look like. And then we asked them their likelihood of adopting the technology if it enabled a certain level of restoration-- a certain capability.

    Interestingly, if we look two years ago, the state of the art in typing with these types of BMIs that we've been describing was about 90 characters per minute. I will note-- this year, there's a preprint of a new technique that is taking us to about 270 characters per minute. And we'll get there in a moment, and I'll describe that.

    But, in the survey, we asked this particular question, and we found that about 20% to 40% of high-cervical spinal-cord-injured individuals were interested in adopting these implanted devices, either this intracortical device or that subdural grid that I mentioned earlier, for functional restoration. The performance is at a level where a substantial percentage of the target patient population is expressing interest in the technology. And we're starting to exceed the performance limits that we set in the survey back in 2015, as the technology is advancing. An important lesson that we learned from this survey-- wireless is really important. And there's also quite a bit of interest in mouse control and hand restoration.

    So that's kind of a quick overview of where we're at, as a field. There's quite a bit of development now happening on the industry side of the field. And I like to tell trainees that it's a really exciting time to be in this field, due to this development. This could be the beginning of an acceleration.

    As I mentioned in my disclosures earlier, I've interacted with these two companies. I currently advise Paradromics. But there are a number of companies in this space, including Blackrock Microsystems, which I mentioned earlier, that are developing neural-interface technologies for clinical application.

    And so the result of this development is, those electrode counts that I showed earlier are poised to grow. And that's important, because the more neurons we can record from in these areas of the brain, the more information we have about the user's intent, and the higher-bandwidth conduit we can create with this technology.

    The examples that I went through were all around the upper limb. There's growing interest in building real-time speech-and-vocalization prostheses. And what these would do is map brain activity to speech.

    And you can think of this as a controller-plant problem, where, in real time, we want to infer this intent and generate the signal. Timing is incredibly important, because if you take a healthy individual and you delay their speech by as little as 50 milliseconds, you'll induce vocal stuttering and a few other deficits.

    So, to really solve this problem, we need to be thinking about low latency. But what's been done to date are slower systems-- systems that go not from brain to speech but from brain to text. One example of such a system was published in 2021, and it achieved about 18 words per minute, with a very low word error rate.

    For reference, most folks type at about 40 words per minute with a standard keyboard, and we speak-- again, it varies from individual to individual-- at approximately 150 words per minute.

    So the key idea here was that they were still recording from this hand-and-arm area, but, instead of asking the individual to attempt movements that resemble using a computer mouse or touchpad, they asked the individual to attempt to handwrite characters. And they applied neural networks to assess the probability that the individual was intending to generate a specific character. They could then also apply language models to correct errors along the way.
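
    The published pipeline is more sophisticated than this, but the core idea-- combine per-step character probabilities from a neural decoder with a language-model prior-- can be sketched as a small Viterbi search. The alphabet, probabilities, and bigram model below are made up; this is a conceptual illustration, not the authors' system.

        % Conceptual sketch only (not the published system): combine per-step
        % character probabilities with a bigram language model via Viterbi.
        alphabet = {'a','b','c'};                              % toy alphabet
        K = numel(alphabet);  T = 4;                           % symbols, time steps
        emisP  = rand(T, K);  emisP = emisP ./ sum(emisP, 2);  % decoder outputs per step
        bigram = ones(K) / K;                                  % uniform LM placeholder

        logD = -inf(T, K);  back = zeros(T, K);
        logD(1,:) = log(emisP(1,:));
        for t = 2:T
            for k = 1:K
                [best, arg] = max(logD(t-1,:) + log(bigram(:,k))');
                logD(t,k) = best + log(emisP(t,k));
                back(t,k) = arg;
            end
        end
        bestPath = zeros(1, T);
        [~, bestPath(T)] = max(logD(T,:));
        for t = T-1:-1:1, bestPath(t) = back(t+1, bestPath(t+1)); end
        decoded = [alphabet{bestPath}]                         % most probable character string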

    So this is relatively long latency, relative to speech, but on the order of seconds between intent and output. So, still a very useful communication output. Again, this is all based on these microelectrode arrays being implanted in primary motor cortex.

    Now, where we're going as a field is, we're moving from upper limb to speech. And this is something that has become accessible because we are now working with a patient population. And this is a first-order summary of the spatial map of primary motor cortex, here.

    We've been implanting close to an area called the "hand knob." It has that name because it looks knoblike, and the activity tends to be informative about hand and arm movements. If we go a little bit more laterally, we can implant in areas that are involved in speech articulation.

    Speech itself is a highly coordinated motor act. It's probably the most highly coordinated motor act that we as humans are capable of-- precise movement of many muscles, about 45 muscles in the face and 100-plus muscles across the body, to coordinate speech production. But because it's a motor act, many of the same lessons that we learned as a field in animal model do apply here. And we can take this same idea-- and the authors of this paper did, here. This is not peer-reviewed yet; it's a preprint.

    But, in that preprint, they're indicating that they can get 62 words per minute, because they're asking the individual to speak. It's a more natural communication form. Error rate is higher, at present. But importantly, as we increase electrode count, we anticipate as a field that error rate will go down.

    They are not open, in terms of their English vocabulary, but they have a very large corpus of potential words in the system. And this system design is roughly the same-- the big differences are where we're implanting and the output. Up here, the output was English characters. Down here, it is English phonemes-- sort of the basic acoustic units of speech. And both of these are applying language models that are appropriate for that output.

    So this is where our field is headed. And where we'd like to take it forward is towards real-time speech synthesis. Why is this important? If we want to functionally restore communication at native rates-- that's what I'm doing right now-- we need to be able to synthesize accurately and quickly. And this is important to get to that 150 words per minute. This is also important for conversation.

    In those examples I was showing you, there are up-to-multisecond delays between intent and output. Imagine trying to hold a conversation with that multisecond delay. Your overall throughput drops very quickly. We've all experienced a little bit of this, living our Zoom lives. Right? A little bit of audio delay really impairs your ability to communicate.

    And so, as a field, we're driving towards this. But how do we get there? We're limited in what we can do in a clinical domain. And if we go back to upper limb, one of the ways that the upper-limb motor prostheses evolved very quickly was, we had a strong basis in neurophysiology that led to animal-model neural engineering work that directly linked us to translational neural engineering. And this has created an ecosystem, an ecosystem where things can move very quickly.

    I myself published a work in animal model, in 2012. And then, three years later, we were able to do a direct translation of that work in a clinical-translation setting. And another MATLAB shout-out-- the animal-model work was done with Simulink Real-Time. We actually forwarded code directly to the translational setting, with those tools. And many of those tools are being used to this day.

    But overall, as a field, animal-model studies translate to the clinic very quickly. Historically, in this field, it's been three to four years between a publication in animal model and something very similar being replicated in the translational space.

    But, unfortunately, monkeys have fairly limited vocal capacity. And for this reason, Tim Gentner, here in Psychology, who has a long history of studying songbird behavior, and I have teamed up to build a songbird model for speech-prosthesis development.

    It's important to realize that they're not homologous; they're analogous. Birds evolved the ability to vocalize in parallel. But the controls problem is very, very similar to the controls problem that humans have.

    So again, nonhuman primates have limited vocal capacity. Birds are on a short list of complex vocal learners. And obviously we're doing work with people, but many of these other species are very hard to work with. And so the songbird provides a nice opportunity.

    And it's incredibly well studied. I don't know if you can hear the audio-- that song. That's birdsong. There are many songbird species that have different capabilities, and that allows us to test system designs and hypotheses with great rigor.

    Additionally, as a parallel to the human work, we can collect data over very long durations. We can place electrodes in brain areas that correspond, or are analogous, to brain areas in the human, and so we can test new ideas in building that control system. And then, importantly, we can translate techniques and personnel between these two ecosystems, as was done previously in another animal model.

    So what we're doing overall is, we're building up this system where we can take neural activity from the bird and, in real time, decode their intent and play it back. My joke on this is, if we've done everything right, the bird should not detect any change. So this is a very intricate engineering problem, to convince the bird that we've changed nothing.

    To do this-- so, for our data collection, we pair animals together. They communicate. Generally the males sing for the females, as part of mating.

    We can collect data over many hours. It's a free behavior. Here I'm showing a pressure waveform and a spectrogram of their vocal output.

    If I zoom in, in time, on that vocal output-- if you've ever seen a human-speech spectrogram, you'll note that there are many similarities. The frequency range is quite similar. You see harmonic stacks.

    Now, they don't have vowels and consonants, but they have acoustic structures that are very similar to vowels and consonants. In some of the species we work with, we also see signs of structure that resembles the tree structure of grammar that we have in human language.

    We also obviously record neurophysiology along with this. These are techniques developed by a wonderful project scientist, Dr. Arneodo, and a PhD student, Pablo Tostado. And so we're taking silicon probes-- these have tens to hundreds of electrodes-- and we're implanting in two nuclei that are in the motor pathway of these animals: a region called "HVC," which is a premotor region, and a region called "RA," which is a primary motor region. And we can record hundreds of neurons. Again, these are those ticks that I was showing you earlier, here now in our songbird model. And from those ticks, we can look at the underlying statistics of those neurons and how they relate to this song behavior.

    We can take that activity and synthesize song. These are some of our earlier efforts in synthesizing from that premotor region. More recently, we're starting to access that deeper primary-motor region, and we're actually seeing improved results. And we're going to pair those two regions together.

    But just kind of a summary of the types of models that we use-- we can apply feedforward neural networks, recurrent neural networks, taking recent spike history, and predicting with high precision what the vocal output is. Here, we also apply biomechanical models. Dr. Arneodo has developed a pretty intricate biomechanical model for the vocal systems of these animals, and we can integrate that as well in real time, for synthesis. So we can skip that spectrogram step.
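
    As a minimal, purely illustrative sketch of the feedforward variant-- the sizes, layers, and data below are assumptions, not the lab's architecture-- one can map a short window of binned spike counts to a single spectrogram frame with MATLAB's Deep Learning Toolbox:

        % Illustrative only: predict one spectrogram frame from recent binned
        % spike counts. Sizes, layers, and data are placeholders.
        nChan = 64;  histBins = 10;  nFreq = 128;
        X = randn(5000, nChan*histBins);          % flattened spike-count history per sample
        Y = randn(5000, nFreq);                   % target spectrogram frames

        layers = [
            featureInputLayer(nChan*histBins)
            fullyConnectedLayer(256)
            reluLayer
            fullyConnectedLayer(nFreq)
            regressionLayer];

        opts  = trainingOptions('adam', 'MaxEpochs', 5, 'Verbose', false);
        net   = trainNetwork(X, Y, layers, opts); % Deep Learning Toolbox
        frame = predict(net, X(1,:));             % one decoded spectrogram frame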

    Here are just some examples of offline decoding, with various models. I won't get into the underlying theory here, but this is also a reason why Simulink is a really nice tool, because we want to be able to swap out models in our testing pretty rapidly. These are models with very different behavior, but we have modeling choices that are much more subtle than this.

    So I'm going to play four examples. The first is the animal's original song. The second is this biomechanical model, where we're bringing in the biophysics of vocal production. The third is an LSTM model that has that concept of dynamics but no knowledge of biomechanics. And the last is the feedforward neural network. And you can see visually, from the spectrogram reconstructions, how they perform.

    That's the original song. This is the biomechanical model-- then the output of the recurrent neural network, and the feedforward neural network. So, obviously, different levels of quality. And we can quantify this, as well.
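
    One simple way to quantify such a comparison-- assuming the original and decoded spectrograms are same-size matrices, which is an assumption here rather than a description of the lab's reported metrics-- is mean squared error and overall correlation:

        % Simple quantification sketch: compare original vs. decoded spectrograms.
        % S_true and S_dec are placeholders; the metrics actually reported may differ.
        S_true = abs(randn(128, 400));            % placeholder original spectrogram
        S_dec  = abs(randn(128, 400));            % placeholder decoded spectrogram
        mseVal = mean((S_true(:) - S_dec(:)).^2); % mean squared error
        c      = corrcoef(S_true(:), S_dec(:));   % Pearson correlation
        fprintf('MSE = %.3f, correlation = %.3f\n', mseVal, c(1,2));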

    So we're recording from these two nuclei simultaneously now. And just to give you some intuition of how we can learn from this data, we can apply statistical machine-learning techniques to look at low-D representations in this activity. So we have this chicken scratch of spikes. We also have the behavior. So here, I've taken pressure waveforms-- well, Pablo has taken pressure waveforms-- and color-coded them based on vocal units that we can identify.

    Then, on the neural-activity side, we can apply techniques that assume that these high-dimensional recordings live on a low-dimensional manifold-- in this case, a low-dimensional plane. And this will allow us to denoise the data if this assumption is true. And so we can take the activity from that motor region RA. We can project--

    And importantly, what you're seeing here, across these different vocal performances that are shown in the pressure waveform, is that we get repeated trajectories in the neural-activity space-- showing that we're getting a consistent neural pattern, and that neural pattern is linked to behavior, as is highlighted by the color coding. And this is what allows us to develop machine-learning models that will translate neural activity to vocal output and do so with very low latency.

    Here's just another view, as a time series, taking those three dimensions, plotting them as a time series, so you more clearly see the consistency. If we look at that premotor region, now, in that 3D space, we don't get as clear of a view, but we see faster dynamics when we look at the time-series view. And these types of views are helping us to develop novel machine-learning techniques to improve this mapping.
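
    A sketch of the kind of low-dimensional view described here: smooth the binned spike counts, project onto the top three principal components, and plot the trajectory over time. PCA is used as a stand-in-- the lab's analyses may rely on other latent-variable methods-- and the data below are synthetic.

        % Sketch: low-dimensional neural trajectory via PCA (synthetic data).
        counts = poissrnd(2, 1000, 120);                  % [time bins x neurons]
        sm     = smoothdata(counts, 1, 'gaussian', 10);   % smooth over time
        [~, score] = pca(zscore(sm));                     % project onto principal components
        plot3(score(:,1), score(:,2), score(:,3), '-');
        xlabel('PC 1'); ylabel('PC 2'); zlabel('PC 3');
        title('Neural trajectory in a low-dimensional space');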

    And so the whole goal here is to better understand this premotor and motor activity, so that we can get to closed loop and use the techniques that we develop to assist in clinical translation-- to make these devices a reality. This work is highly collaborative. This is the team of scientists, faculty, and grad students involved. And I just want to acknowledge our support, including a wonderful MathWorks UCSD Microgrant that is supporting Simulink development. Yes.

    Questions for Dr. Gilja?

    I've got two. I don't know if that's too much. Really fascinating stuff. My first question was on the brain-to-text. I know there's been other work on doing motor intent from EMG data.

    Mhm?

    And yours is going straight through ECoG?

    It's straight from intracortical. So we're recording single neurons, many single neurons, in the brain.

    How do you see the comparison between the two? Yeah.

    Yeah. They address different clinical needs. So, to access EMG, there needs to be an active motor pathway. So, for individuals with spinal-cord injury, that may be limited. Particularly for individuals with ALS, there are limitations there.

    I see this as a suite of technologies. And what we're kind of amping up together, as a field-- EMG, noninvasive brain measurement, intracortical-- is a suite of tools that can be used in assistive tech.

    And my second quick one is, I heard you mention Simulink Real-Time.

    Mhm?

    For that, if I remember right, that uses the Speedgoat hardware. Were you using Speedgoat, or were you using your own target?

    Yeah. For the earlier work, we actually developed our own target. The reality is that Speedgoat machines are at a pretty high price point, and they took a while to source. So there was a lot of careful decision-making and testing that got us there, so we could replicate it more quickly.

    Cool.

    Thank you very much, Dr. Gilja.
