Finding Hidden Catalytic Knowledge from Literature Data

08/06/2026 Tohoku University

New research from Tohoku University’s Advanced Institute for Materials Research (WPI-AIMR) explains how decades of scattered literature data can be transformed into computable design rules for catalysts. Combining human intelligence, regression models, and AI agents can accelerate the discovery of efficient, low-cost catalysts for clean energy technologies such as fuel cells, water splitting, and CO₂ reduction, and can surface findings that were hidden in the literature data all along.

Catalysts, which speed up chemical reactions, are crucial for many important technologies and manufacturing processes. However, finding the right catalyst for the job is tricky. The first step is usually to consult previously published scientific literature, but building a cohesive summary of all this data can be overwhelming. Even studies that investigate the same catalyst might use different experimental conditions and measure different variables, making comparisons difficult. How do we find the best catalyst candidate if the data is all over the place? It would be like trying to compare cake recipes that each use different ingredient amounts, bake times, and oven temperatures.

“There is an enormous amount of information in the wealth of scientific literature published so far on catalysts,” remarks Distinguished Professor Hao Li (WPI-AIMR). “But taking all of these disparate, individual studies and summarizing them into actionable information – such as gleaning the blueprints for rational catalyst design – is incredibly difficult.”

This study summarizes three current methods for reorganizing, re-analyzing, and remodeling information that is “hidden” in the literature. The first is using human brainpower to summarize data manually. The second is data analysis, such as fitting a regression model to large datasets to quantify the relationship between a catalyst’s structure and its performance. The third is using artificial intelligence (AI) to further assess the findings and propose new candidate materials. Ideally, researchers will use all three together.
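As a rough illustration of the second method, the sketch below fits a straight line to a handful of pooled literature records and prints how far each study sits from the trend. It is a minimal example under stated assumptions, not the authors' workflow: the record layout, the study labels, the single "descriptor" variable, and every number are hypothetical placeholders, and real analyses would typically use many descriptors and far more data.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def residuals(xs, ys, slope, intercept):
    """Difference between each reported value and the fitted line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical placeholder records: (study label, structural descriptor,
# reported performance). These values are invented for illustration only.
records = [
    ("study_A", 0.10, 1.0),
    ("study_B", 0.20, 1.9),
    ("study_C", 0.30, 3.2),
    ("study_D", 0.40, 3.9),
]

xs = [descriptor for _, descriptor, _ in records]
ys = [performance for _, _, performance in records]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")

# Large residuals mark studies that deviate from the pooled trend and
# may deserve a closer look by a human reader.
for (study, _, _), r in zip(records, residuals(xs, ys, slope, intercept)):
    print(f"{study}: residual {r:+.3f}")
```

The residual listing is the part that connects to manual review: a study that sits far from the fitted line is a candidate for closer inspection rather than automatic rejection.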

“Doing everything by hand is too slow, but relying solely on AI without careful cross-checking can be faulty, so we need a careful balance,” says Li.

Re-analyzing data from multiple studies may reveal new information, or even anomalies that take a combination of human intelligence and AI to explain with an underlying theory. In this way, even old data can reveal new tricks.

Systematic methods for improving catalyst performance, such as those proposed in this paper, can benefit society by speeding the development of sustainable energy solutions, reducing reliance on expensive noble metals, and advancing progress toward a carbon-neutral society.

These findings were published in EES Catalysis on May 14, 2026.
Title: Finding the Hidden Catalytic Knowledge from Literature Data
Authors: Yuhang Wang, Yong Wang, Hao Li
Journal: EES Catalysis
DOI: 10.1039/D6EY00079G
Attached files
  • (a) Systematic statistical results of the selectivity of the main product in CO2RR by Sn-based catalysts. (b) Comparison of CO Faradaic efficiency for various DACs reported in the past three years. Comparison of experimental Faradaic efficiencies of Cu-based single-atom alloys (SAAs) in CO2RR for (c) C2+ product formation and (d) HER, compiled from the DigCat database. ©Hao Li et al.
  • Overview of the overall catalyst design process constructed for electrochemical hydrogen peroxide synthesis in 2e- water oxidation (2e- WOR). ©Hao Li et al.
  • (a) Schematic diagram of the overall architecture of the StableOx-Cat AI agent, where MOs represents metal oxides and MCP represents the Model Context Protocol. (b) Schematic diagram of the CRESt system, which is based on a large visual language model (LVLM)-driven agent framework. (c) Schematic diagram of the process of predicting and analyzing the structure-performance relationship of electrochemical nitrogen reduction reaction (eNRR) literature based on LLMs enhanced by ChemPrompt. ©Hao Li et al.
Regions: Asia, Japan
Keywords: Applied science, Artificial Intelligence, Science, Chemistry, Energy, Physics
