Exploiting large language model with reinforcement learning for generative job recommendations

05/02/2026 Frontiers Journals

With the rapid advancement of Large Language Models (LLMs), an increasing number of researchers are focusing on Generative Recommender Systems (GRSs). Unlike traditional recommendation systems, which rank items from a fixed candidate set, GRSs generate recommendations directly, which makes them more effective at exploring user interests.
Existing LLM-based GRSs primarily utilize Supervised Fine-Tuning (SFT) to enable LLMs to generate candidate items. Additionally, these systems employ similarity-based grounding methods to map the generated results to real-world items. However, SFT-based training is insufficient for LLMs to fully capture the complex interactive behaviors embedded in recommendation scenarios, and similarity-based grounding struggles with the challenges of long-text matching.
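To illustrate what similarity-based grounding means in practice, the following is a minimal sketch, not the method of any particular system. It assumes a simple bag-of-words cosine similarity, and the names `bag_of_words`, `cosine` and `ground_by_similarity` are hypothetical. Real systems typically use learned text embeddings, but the principle is the same: the generated text is mapped to whichever real item scores highest.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, whitespace-tokenised term counts (an illustrative stand-in for an embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def ground_by_similarity(generated_jd, real_jds):
    """Return (index, score) of the real JD most similar to the generated JD."""
    query = bag_of_words(generated_jd)
    scores = [cosine(query, bag_of_words(jd)) for jd in real_jds]
    best = max(range(len(real_jds)), key=scores.__getitem__)
    return best, scores[best]


real_jds = [
    "python backend engineer building web services",
    "registered nurse for hospital night shifts",
]
index, score = ground_by_similarity("senior python engineer for backend services", real_jds)
```

With long documents such as full job descriptions, a single similarity score has to summarise many requirements at once, which is one way to picture the long-text matching difficulty mentioned above.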
To address these problems, a research team led by Hui XIONG published their new research on 15 January 2026 in Frontiers of Computer Science, which is co-published by Higher Education Press and Springer Nature.
The research team proposed GIRL (Generative Job Recommendation based on Large Language Models). Specifically, they designed a reward model that evaluates the degree of match between Curricula Vitae (CVs) and Job Descriptions (JDs). To fine-tune the LLM-based recommender, they introduced a Reinforcement Learning (RL) method based on Proximal Policy Optimization (PPO). Furthermore, they proposed a model-based grounding method to improve the accuracy of JD grounding.
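For readers unfamiliar with PPO, the sketch below shows the standard clipped surrogate objective in plain Python. It is a generic illustration and not the GIRL implementation: the release does not describe how reward-model scores are turned into advantages, so the `advantages` input, the `clip_eps` value of 0.2 and the function name `ppo_clipped_objective` are all assumptions made for this example.

```python
import math


def ppo_clipped_objective(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate over a batch of sampled actions.

    new_logprobs / old_logprobs: log-probabilities of the same sampled outputs
    under the current policy and the policy that generated them.
    advantages: assumed here to come from reward-model scores minus a baseline.
    """
    terms = []
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        ratio = math.exp(new_lp - old_lp)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Taking the minimum removes the incentive to move the policy far
        # from the one that produced the samples.
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


# Policy unchanged: the objective equals the mean advantage.
unchanged = ppo_clipped_objective([-1.0, -2.0], [-1.0, -2.0], [0.5, 1.5])

# Ratio of about 2 with a positive advantage: the gain is capped at 1 + clip_eps.
capped = ppo_clipped_objective([math.log(2.0)], [0.0], [1.0])
```

In training, this quantity is maximised with respect to the policy parameters. A higher reward-model score for a generated JD leads to a larger advantage and therefore a stronger, but bounded, push towards generating similar JDs.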
The proposed method was evaluated on two real-world datasets, and the experimental results show that GIRL outperforms seven baseline methods in recommendation effectiveness. Future research directions include exploring more advanced grounding techniques, expanding the datasets to improve generalization, and optimizing the reinforcement learning strategy to further improve model performance.

DOI: 10.1007/s11704-025-40843-1
Attached files
  • The processing flow of GIRL
Regions: Asia, China
Keywords: Applied science, Computing
