A reinforcement learning framework for guiding the agent to perform exploration based on clustering

21/05/2025 Frontiers Journals

Designing an exploration strategy is a challenging problem in reinforcement learning (RL), especially when the environment has a large state space or sparse rewards. During exploration, the agent tries to discover unexplored (novel) areas or high-reward (quality) areas. However, most existing methods explore using only the novelty of states.
To address this limitation, a research team led by Prof. Wu-Jun LI published their new research on 15 April 2025 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
The team proposed a novel framework, clustered reinforcement learning (CRL), for efficient exploration in RL. The framework was evaluated on four continuous control tasks and six hard-exploration Atari-2600 games. Compared with existing methods, it can effectively guide the agent to explore efficiently.
In the research, the team analyzes why existing exploration strategies, which use only the novelty of states to guide the agent, have limited effectiveness. To use the novelty and quality of states simultaneously, they cluster the collected states and give the agent a bonus reward that reflects both the novelty and the quality of the neighboring area (cluster) of its current state. Because the bonus rewards used by existing exploration strategies capture only the novelty of states, the proposed method can also be combined with those strategies to boost their performance. In experiments on four continuous control tasks and six hard-exploration Atari-2600 games, the proposed method performs better than the existing exploration strategies.
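The idea of a cluster-level bonus can be illustrated with a minimal Python sketch. It is not the formula from the paper, which this release does not state. It assumes a plain k-means clustering, a count-based novelty term 1/sqrt(n), the mean observed reward as the quality term, and a linear mix controlled by η, chosen so that η = 1.0 corresponds to novelty alone as in Fig. 1. The names kmeans, cluster_bonuses, eta and beta are hypothetical.

```python
import math
import random


def kmeans(states, k, iters=20, seed=0):
    """Plain Lloyd's k-means over a list of equal-length tuples.

    Assumption: the release only says "clustering"; k-means is used
    here purely for illustration.
    """
    rng = random.Random(seed)
    centroids = rng.sample(states, k)  # k distinct states as initial centroids
    labels = [0] * len(states)
    for _ in range(iters):
        # Assignment step: each state joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(s, centroids[c]))
            for s in states
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [s for s, lab in zip(states, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, labels


def cluster_bonuses(labels, rewards, k, eta=0.5, beta=1.0):
    """Return one bonus per cluster mixing novelty and quality.

    Assumed forms (not taken from the paper):
      novelty = 1 / sqrt(number of states in the cluster)
      quality = mean reward observed in the cluster
      bonus   = beta * (eta * novelty + (1 - eta) * quality)
    """
    counts = [0] * k
    reward_sums = [0.0] * k
    for lab, r in zip(labels, rewards):
        counts[lab] += 1
        reward_sums[lab] += r
    bonuses = []
    for c in range(k):
        n = max(counts[c], 1)  # an empty cluster is treated as maximally novel
        novelty = 1.0 / math.sqrt(n)
        quality = reward_sums[c] / n
        bonuses.append(beta * (eta * novelty + (1.0 - eta) * quality))
    return bonuses


if __name__ == "__main__":
    states = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
    rewards = [0.0, 0.0, 0.0, 1.0, 1.0]  # reward only in the far region
    _, labels = kmeans(states, k=2)
    print(cluster_bonuses(labels, rewards, k=2, eta=1.0))  # novelty alone
    print(cluster_bonuses(labels, rewards, k=2, eta=0.5))  # novelty and quality
```

In this sketch, the bonus for a state would be the bonus of the cluster it falls into, added to the environment reward during training. With eta = 1.0 the bonus depends only on how often a cluster has been visited, while lowering eta shifts the bonus toward clusters where reward has actually been observed.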
DOI: 10.1007/s11704-024-3194-1
Research Article, Published: 15 April 2025
Xiao MA, Shen-Yi ZHAO, Zhao-Heng YIN, Wu-Jun LI. Clustered Reinforcement Learning. Front. Comput. Sci., 2025, 19(4): 194313, https://doi.org/10.1007/s11704-024-3194-1
Attached files
  • Fig. 1 Comparison between clustering-based bonus rewards using novelty alone (η = 1.0) and clustering-based bonus rewards with η = 0.5. Here, the collected states (blue dots) are grouped into 5 clusters; the agent receives a reward of 1 in the orange area and no reward elsewhere.
Regions: Asia, China
Keywords: Applied science, Computing
