Researchers at TU Graz are using virtual reality and large language models to support people with autism spectrum disorder in training social skills. The system is intended to make treatment options more widely accessible.
An increasing number of people worldwide are affected by autism spectrum disorder (ASD); according to studies, one in 44 children is diagnosed with it. A central symptom is so-called “social blindness”, i.e. difficulty recognising emotions in others and reacting appropriately to social situations. Suitable therapy usually takes the form of one-to-one or small-group support, which is cost-intensive and only available to a limited extent. Researchers at the Institute of Human-Centred Computing at Graz University of Technology (TU Graz) are using computer game technology to create an effective supplement that is inexpensive and available at any time. Initial studies show that this approach helps people with ASD to navigate everyday life more confidently.
Everyday situations without social consequences
The specially developed virtual environment Simville combines virtual reality, large language models (LLMs), speech recognition and speech generation to make social training location-independent and therefore more accessible for those affected. In this computer world, users rehearse realistic everyday situations, such as conversations with work colleagues or meeting people in a café. Because this takes place in a controlled environment, users can act freely without having to fear social consequences. These training scenarios leave them better prepared for similar interactions in everyday life.
“Our system is not meant to replace conventional therapies, but to complement and enhance them in a meaningful way,” says Christian Poglitsch from the Institute of Human-Centred Computing at TU Graz, who implemented the project as part of his doctoral thesis. The immersive yet playful approach is central to Simville. Tasks, storytelling and immediate feedback after acting out a scene motivate participants to practise regularly. In addition, the number of stimuli acting on the user can be adjusted: beginners start with only a few, increase the number as their training progresses, and can dial it back down if they feel overwhelmed.
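The adjustable stimulus level described above can be sketched as a simple controller that ramps up with successful sessions and backs off when the user reports feeling overwhelmed. This is purely illustrative: the class name, level range and step size are assumptions, not details of the actual Simville implementation.

```python
# Illustrative sketch (not the Simville code) of an adjustable stimulus
# level: beginners start low, the level rises with successful training
# sessions and drops again if the user becomes overwhelmed.

class StimulusController:
    def __init__(self, level: int = 1, max_level: int = 10):
        self.level = level          # current number/intensity of stimuli
        self.max_level = max_level  # hard ceiling for the scenario

    def after_session(self, overwhelmed: bool) -> int:
        """Update the stimulus level once per completed training session."""
        if overwhelmed:
            self.level = max(1, self.level - 1)               # back off
        else:
            self.level = min(self.max_level, self.level + 1)  # ramp up
        return self.level


controller = StimulusController()
print(controller.after_session(overwhelmed=False))  # 2: one good session
print(controller.after_session(overwhelmed=True))   # 1: dial back down
```

A real system would likely map this level onto concrete scene parameters (number of avatars, background noise, conversation pace) rather than a single integer.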
Language model conveys emotions
By integrating LLMs together with speech recognition and generation, users can speak naturally to the avatars in the game world. What is said is converted into text by the speech recognition system; a large language model then generates a reaction tailored to the situation, and the avatar delivers it in spoken language. The team used the model Gemini 12B from Google to create and play out the response. “What was fascinating was that the model was also able to convey a certain emotion. Depending on the context of what is being said, you can definitely hear the right undertone,” says Christian Poglitsch.
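The pipeline described here — speech in, text out, LLM reply, speech back — can be sketched as three stages with an emotion tag passed to the synthesis step, which would explain the audible undertone Poglitsch mentions. Everything below is a hypothetical stand-in: the function names are invented, and the STT, LLM and TTS calls are stubs where a real system would call external services.

```python
# Hedged sketch of a conversational avatar loop of the kind the article
# describes: speech -> text -> LLM reply (with an emotion hint) -> speech.
# All names are illustrative; the real Simville implementation is not public.

def speech_to_text(audio: bytes) -> str:
    """Stub for a real speech-recognition call; here 'audio' is already text."""
    return audio.decode("utf-8")

def llm_reply(persona: str, user_text: str) -> dict:
    """Stub for the language-model call. A real system would prompt the model
    with the avatar's persona and the utterance; this returns a canned reply
    plus an emotion tag that the synthesis stage can use for the undertone."""
    if "sorry" in user_text.lower():
        return {"text": "No problem at all, thanks for telling me.",
                "emotion": "reassuring"}
    return {"text": f"Nice to meet you! I'm the {persona} here.",
            "emotion": "friendly"}

def text_to_speech(text: str, emotion: str) -> bytes:
    """Stub for speech synthesis; the emotion tag would steer the prosody."""
    return f"[{emotion}] {text}".encode("utf-8")

def avatar_turn(audio_in: bytes, persona: str) -> bytes:
    """One full dialogue turn: recognise, respond, speak."""
    user_text = speech_to_text(audio_in)
    reply = llm_reply(persona, user_text)
    return text_to_speech(reply["text"], reply["emotion"])


out = avatar_turn(b"Sorry I'm late!", persona="barista")
print(out.decode("utf-8"))  # [reassuring] No problem at all, thanks for telling me.
```

Keeping the emotion as an explicit tag between the LLM and TTS stages is one plausible design; another is letting the model's wording alone carry the tone, as the quote suggests the deployed model partly does.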
Test subjects feel safer
Initial studies show that training with Simville has positive effects. In a study with 25 participants, many felt noticeably more confident in social situations after just a few sessions. Simville is now being incorporated into the international ETAP project led by Furtwangen University, where the simulation is combined with extensive sensor technology so that the intensity of the experience can be raised or lowered based on the user’s reactions. In addition, the Game Lab Graz at TU Graz would like to make Simville available as a demonstrator so that affected people can train with it themselves.