Predictive robots: our future assistants

One of the biggest challenges in developing Human Machine Interfaces is natural interaction. Solutions such as gesture and voice control have already enabled significant progress in this area. Recently, the focus has also shifted towards controlling machines with thoughts: Brain Machine Interfaces measure the brain’s EEG signals and use these to derive control commands for computers, machines or robots. The DFKI is one of the pioneers in the use of EEG data for interaction with robotic systems. With its latest research project, EXPECT, the research centre is aiming to develop an adaptive, self-learning platform for human-robot collaboration. It should not only enable various forms of active interaction but also be capable of deriving human intention from gestures, speech, eye movements and brain activity: the machine should be able to anticipate what the human is going to do next. Professor Elsa Andrea Kirchner, project leader at the Robotics Innovation Center, provides insights into the project and the state of research on Brain Machine Interfaces.

Controlling machines with a brain chip is already being tested by companies such as Neuralink and Synchron. Is this indeed the future of Human Machine Interfaces?

Elsa Andrea Kirchner: The problem with interacting with the brain through implanted chips is that you can only reach the part of the brain where the chip is located. You would need to implant many of these chips to obtain a good picture of what the brain is doing. For some purposes, this procedure is certainly useful, such as for stimulating the brain in Parkinson’s disease. 

What is the alternative?

E. A. K.: Brain activity can also be measured externally, using electrodes attached to the head. However, when doing this, you are always measuring a sum of activities, so the resolution is lower than with an implanted electrode. Additionally, there is more noise because brain currents are measured through the skin, bones and hair. You therefore need powerful devices to record them, and you need good signal processing and machine learning to interpret the data correctly. 
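
To illustrate what “good signal processing and machine learning” can look like in its simplest form, here is a minimal Python sketch. Synthetic numbers stand in for real recordings, and the frequency band, filter order and classifier are generic textbook defaults rather than anything from the EXPECT project:

```python
# Minimal sketch (not EXPECT code): band-pass filter scalp EEG epochs and
# classify them with a simple linear model.
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

fs = 250                                    # assumed sampling rate in Hz
rng = np.random.default_rng(0)

# Synthetic stand-in for real data: 200 epochs x 8 channels x 2 s of EEG.
epochs = rng.normal(size=(200, 8, 2 * fs))
labels = rng.integers(0, 2, size=200)       # e.g. "movement" vs. "rest"

# Band-pass filter (8-30 Hz covers the mu/beta bands often used in motor BCIs).
b, a = butter(4, [8, 30], btype="bandpass", fs=fs)
filtered = filtfilt(b, a, epochs, axis=-1)

# Simple band-power features: log variance per channel.
features = np.log(filtered.var(axis=-1))

# Train a linear classifier and check it on held-out epochs.
clf = LinearDiscriminantAnalysis()
clf.fit(features[:150], labels[:150])
print("held-out accuracy:", clf.score(features[150:], labels[150:]))
```

Real systems use far more elaborate feature extraction and models; the point here is only the chain the interview refers to: filter, extract features, classify.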

What is your approach in the EXPECT project?

E. A. K.: Quite often, when interacting with another person, we can tell what they want to do or expect from us. For example, a colleague hands me a tool because I am looking at it and he is standing right next to it. This is also something you want to achieve in interactions with machines. You don’t always want to explicitly tell the machine every single step; you want the machine to understand it on its own. There are many ways to achieve this, and one of these ways is the direct use of brain activity.

Does every person think the same way? Does an EEG always show the same thing, regardless of who is thinking the thought “Robot, open the gripper”?

E. A. K.: Our brains are organised very similarly. So, we have the same areas in similar locations. But we also know that there are significant differences between people. This means that, once you have measured a person’s brain activity with an EEG and analysed it using machine learning, you cannot simply transfer the trained model to another person. The performance might decrease by 20 or 30 percent. So, we have a number of challenges to overcome in training the models so that they can be used for several individuals, and this is one of the EXPECT project’s objectives.
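
The transfer problem described here can be pictured with a toy example: a classifier trained on one (synthetic) subject loses accuracy on a second subject whose feature distribution is shifted, and recovers after a short calibration session. All data and numbers are fabricated for illustration:

```python
# Hypothetical sketch of the cross-subject problem, not DFKI code.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(1)

def make_subject(offset, n=300):
    """Two-class Gaussian features; `offset` mimics inter-subject differences."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 16)) + y[:, None] * 1.0 + offset
    return X, y

X_a, y_a = make_subject(offset=0.0)          # subject the model was trained on
X_b, y_b = make_subject(offset=0.8)          # new subject, shifted features

clf = SGDClassifier(random_state=0)
clf.fit(X_a, y_a)
print("same subject:", clf.score(X_a, y_a))
print("new subject, no adaptation:", clf.score(X_b, y_b))

# Short calibration on the new subject (e.g. 40 labelled trials).
clf.partial_fit(X_b[:40], y_b[:40])
print("new subject, after calibration:", clf.score(X_b[40:], y_b[40:]))
```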

Your platform for human-robot collaboration relies on various forms of active interaction, not just thoughts. Why is that?

E. A. K.: Imagine a stroke patient who, for example, is unable to move their right arm. Even in such a case, there is still some planning of movement happening in the brain. We can recognise that and move the arm using an exoskeleton. However, we encounter some problems with this approach. First, our interpretation is not 100 percent accurate. Second, when a person thinks about a body movement, they may not necessarily want to execute it.

Most patients still have a tiny amount of muscle activity even after a stroke, which we can utilise. First, we interpret the EEG signal and recognise that the patient is thinking about a movement. At the same time, we monitor the patient’s muscles and if we detect activity, we know that they truly want to perform the movement.

This combination of different signals is crucial because if an exoskeleton suddenly moves the arm without the person wanting it, they may feel like they’ve lost control and the exoskeleton has taken over their free will.
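
The gating logic described here – assist only when the EEG indicates an intention and the residual muscle activity confirms it – can be sketched in a few lines. The thresholds and names below are purely illustrative assumptions, not values from the DFKI systems:

```python
# Illustrative sketch of EEG + EMG gating for exoskeleton assistance.
from dataclasses import dataclass

@dataclass
class FusionGate:
    eeg_threshold: float = 0.8    # confidence needed from the EEG classifier
    emg_threshold: float = 0.1    # minimal residual EMG amplitude (arbitrary units)

    def should_assist(self, eeg_confidence: float, emg_amplitude: float) -> bool:
        """Trigger assistance only when both modalities agree."""
        intends_movement = eeg_confidence >= self.eeg_threshold
        confirms_movement = emg_amplitude >= self.emg_threshold
        return intends_movement and confirms_movement

gate = FusionGate()
print(gate.should_assist(eeg_confidence=0.9, emg_amplitude=0.05))  # False: no EMG confirmation
print(gate.should_assist(eeg_confidence=0.9, emg_amplitude=0.20))  # True: both agree
```

The design choice is deliberately conservative: a missing EMG confirmation blocks the movement, which is exactly the loss-of-control problem the combination is meant to avoid.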

In what other cases does it make sense to use various means of interaction?

E. A. K.: Take speech recognition, for example. Often, the environment is too noisy for it. In our projects, we are working on combining EEG and speech to ensure accurate speech recognition. Simultaneously, speech recognition can help us better interpret the EEG. During the training phase, one might say, “Please fetch the hammer” while measuring the EEG. Later, you only have to think it and the robot will understand it based on brain activity.
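
One way to read this training setup is that the recognised spoken command acts as the label for the EEG recorded at the same moment; afterwards the EEG decoder works on its own. A hypothetical sketch with synthetic features (the command names and feature dimensions are invented):

```python
# Illustrative sketch: speech recognition supervises the EEG decoder during calibration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
commands = ["fetch_hammer", "open_gripper"]

# Calibration phase: 100 trials in which the user speaks a command while EEG is recorded.
spoken = rng.integers(0, 2, size=100)                         # output of speech recognition
eeg_features = rng.normal(size=(100, 32)) + spoken[:, None]   # synthetic EEG features

decoder = LogisticRegression(max_iter=1000).fit(eeg_features, spoken)

# Operation phase: the user only *thinks* the command; EEG alone is decoded.
new_trial = rng.normal(size=(1, 32)) + 1.0
print("decoded command:", commands[int(decoder.predict(new_trial)[0])])
```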

The main goal of your project is for the machine to anticipate the human’s intention. In which applications does this make sense?

E. A. K.: Sometimes you find yourself working with people who already know what they should do before you tell them anything. We find this to be a very positive experience. The same applies when we work with a machine: there are situations where it would be better if the system knew my intention.

Imagine wearing an exoskeleton and trying to repair something overhead. The exoskeleton supports your arm and keeps it raised, which is initially helpful. But at some point you’ll finish your work and want to lower your arm again. Although the sensors recognise this, for a moment you still have to work against the exoskeleton. If we can recognise the planning of arm movements in the brain, the system can prepare for it and react faster. We have already tested this with individuals, and they could really feel the differences.

How does that work exactly?

E. A. K.: We can look into the brain and examine the timeframe during which it plans the movement before sending a signal to the muscles. This can take up to 1.5 seconds, sometimes even longer. Within this preparation phase, we can recognise that the person wants to move. And this can only be done through brain signals, not gestures, muscle activity, eye movements, or speech.
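
The timing argument can be made concrete with synthetic signals: a slow readiness-like drift in the EEG crosses a detection threshold well before the EMG burst that marks the actual muscle activation. The signal shapes and thresholds below are assumptions chosen only to visualise the lead time, not measured data:

```python
# Toy sketch of pre-movement detection: EEG drift precedes the EMG burst.
import numpy as np

fs = 250                                   # Hz
t = np.arange(0, 4, 1 / fs)                # 4 s trial, movement around t = 3 s
rng = np.random.default_rng(2)

# Synthetic EEG: noise plus a slow negative drift starting ~1.5 s before movement.
eeg = rng.normal(0, 0.2, t.size)
eeg[t > 1.5] -= 3.0 * (t[t > 1.5] - 1.5)   # readiness-potential-like drift

# Synthetic EMG envelope: quiet until the movement itself at t = 3 s.
emg = rng.normal(0, 0.05, t.size)
emg[t > 3.0] += 1.0

def first_crossing(signal, threshold, above=True):
    """Time of the first sample beyond the threshold."""
    idx = np.argmax(signal > threshold) if above else np.argmax(signal < threshold)
    return t[idx]

eeg_onset = first_crossing(eeg, -1.0, above=False)   # drift detected
emg_onset = first_crossing(emg, 0.5, above=True)     # muscle activity detected
print(f"EEG-based detection at {eeg_onset:.2f} s, EMG at {emg_onset:.2f} s")
print(f"lead time: {emg_onset - eeg_onset:.2f} s")
```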

Where are you currently in your research project?

E. A. K.: Within the EXPECT project, we are focusing on the possibilities of training on multimodal data and on how to use it – for example, to switch between signals when the quality of one of them deteriorates. So, it’s not about the general approach, but rather about how we can use different methods to adapt to changing signal qualities.

We can train with different signals than those we later use. For example, if a patient’s muscle activity is not reliable initially, we can use EEG for training and later use muscle activity and eye tracking to improve performance.
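
A simple way to picture this switching is a quality-weighted fusion of per-modality decisions: when one signal degrades, its weight drops and the others take over. The quality scores and probabilities below are invented for the example:

```python
# Hedged sketch of "switch when quality drops": weight each modality's
# decision by a running quality estimate, so a degrading signal (e.g. a
# loose EEG electrode) contributes less.
import numpy as np

def fuse(probabilities: dict, qualities: dict) -> float:
    """Quality-weighted average of per-modality P(movement intended)."""
    weights = np.array([qualities[m] for m in probabilities])
    probs = np.array([probabilities[m] for m in probabilities])
    if weights.sum() == 0:
        return 0.0                      # no trustworthy signal: do nothing
    return float(np.average(probs, weights=weights))

# All modalities clean: EEG, EMG and eye tracking agree.
print(fuse({"eeg": 0.9, "emg": 0.8, "eye": 0.7},
           {"eeg": 1.0, "emg": 1.0, "eye": 1.0}))

# EEG quality collapses: the decision now rests mostly on EMG and eye tracking.
print(fuse({"eeg": 0.2, "emg": 0.8, "eye": 0.7},
           {"eeg": 0.05, "emg": 1.0, "eye": 1.0}))
```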

You use gestures, speech, eye movements, and brain activity – is there a technology that will dominate in the future?

E. A. K.: That’s a difficult question to answer. It depends on how and what you want to communicate. But BCI is better suited for the future than other systems. However, I believe that the quality of interfaces will change in a way that makes them much more natural for us. Ideally, you wouldn’t see, feel, or notice an interface. And I believe in multimodal interfaces because that’s how we communicate as humans – with speech, facial expressions, and gestures.

What developments in semiconductor technology are particularly exciting for you?

E. A. K.: A group of researchers and engineers at the University of Duisburg-Essen is working on terahertz technology. This technology is excellent for recognising the environment. You can see a wall, a window or a corner and even determine whether it’s made of wood, stone or plastic. There are many ideas on how this technology can be used to measure biosignals without physical contact – for example, by measuring muscle movement through the reflection of terahertz waves. And from these movements we can infer what the hand and fingers are doing.

The use of graphene to realise epidermal electronics is also interesting. It allows us to measure electrical activity in the muscle with very high resolution. This is fascinating not only for interaction but also for understanding muscle-related diseases.

Is there anything missing in today’s available semiconductor solutions for your platform?

E. A. K.: Imagine you want to perform a complex analysis of brain activity. This requires a large amount of data and powerful machine learning models. This might be quite challenging to achieve on-site, but research is already focused on implementing these large AI models into small embedded devices. This is also crucial for us because if you’re walking around in nature with your exoskeleton and don’t have internet access, relying on cloud-based AI processing can be problematic.

To integrate the AI model into the system, we need highly energy-efficient computing power. The model must also continue learning during operation. For example, imagine a patient whose signal becomes better over time. The model should recognise this, so that the system relies more on muscle activity than on EEG.
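
One possible reading of this in-operation learning, assuming the system occasionally finds out whether a movement really took place (for example from the exoskeleton’s own sensors), is a running reliability score per modality: as the patient’s muscle signal improves, trust in EMG gradually overtakes trust in EEG. The update rule and numbers below are illustrative only:

```python
# Rough sketch of in-operation adaptation via per-modality reliability tracking.
class ReliabilityTracker:
    def __init__(self, alpha: float = 0.05):
        self.alpha = alpha
        self.reliability = {"eeg": 0.5, "emg": 0.5}   # start with equal trust

    def update(self, predictions: dict, actual: bool) -> None:
        """Nudge each modality's score towards 1 if it was right, 0 if wrong."""
        for modality, predicted in predictions.items():
            correct = 1.0 if predicted == actual else 0.0
            r = self.reliability[modality]
            self.reliability[modality] = (1 - self.alpha) * r + self.alpha * correct

tracker = ReliabilityTracker()
# Simulated recovery: EMG becomes consistently correct, EEG only sometimes.
for step in range(200):
    tracker.update({"emg": True, "eeg": step % 3 == 0}, actual=True)
print(tracker.reliability)   # EMG trust rises well above EEG trust
```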

What do you consider important in designing an optimal Human Machine Interface?

E. A. K.: The most important thing is to be open-minded. That means not saying, “I’m a BCI researcher, so I want to work with brain activity.” You should always think about what you want to achieve and how humans would interact.

Secondly, always keep in mind that we’re talking about a diverse society. Facial expressions may vary among different nationalities, for example. So, when developing such a system, consider not only the technology but also the social context. For me, it’s also crucial to talk to the people who will use the system in the future.

Machines that can read human thoughts – is this the first step towards the dystopia portrayed in Hollywood where machines gain power over humans?

E. A. K.: At the moment, we are not yet at the point where we can truly read thoughts completely. And, to be honest, I don’t believe the danger lies in the machine controlling the human. I think the risk is more likely to be another human having access to someone’s thoughts.

We had a project, for example, where we wanted to develop an EEG-based approach to detect high workload in individuals within a company, to prevent burnout, for instance.

In this case, you need to prevent someone with malicious intent from accessing the data in the cloud. But even if this data is only used to optimise a person’s production environment, such as slowing down a robot, it can still have consequences for the person. Because the employer might say, “I’d rather hire a younger person who can work faster.”

So, understanding a person can also be used to worsen the situation or harm individuals. For example, we can discriminate against people because we discover that they cannot recognise certain things or have a very low attention span. This is already a very real possibility if someone has access to a person’s EEG.

Finally, a look into the future – you can now be visionary: How will we interact with machines in 25 years?

E. A. K.: I expect that interacting with machines will be very similar to interacting with other people by then. We will interact with systems and speak to them very naturally. These systems will understand what we want. Multimodal interaction will be taken for granted. I believe that in the future, it will be very difficult to discern from the outside whether we are interacting with another human or a machine.