AI and climate change: How to reliably record greenhouse gas emissions

An LMU team has developed a method for pulling data from corporate sustainability reports more accurately.

Large companies in the EU are legally required to report their greenhouse gas (GHG) emissions. Yet pulling this information manually from long PDF sustainability reports is slow and error-prone. Many teams try to speed up the process with automation, for example by using large language models (LLMs), AI systems that read text and produce answers.

Dr. Malte Schierholz, project coordinator and postdoctoral researcher at the Social Data Science and AI Lab (SODA Lab), urges caution, though: “With automatic extraction methods, it’s easy to fully trust the LLM’s output and overlook measurement errors that occur frequently.” Because increased automation is both promising and risky, the research group Greenhouse Gas Insights and Sustainability Tracking (GIST) set out to build a reliable point of reference for collecting emission data.

A gold standard for recording emissions data

In a paper published in Scientific Data, the group introduces a gold-standard benchmark dataset for extracting GHG emissions. The dataset is based on sustainability reports sampled from companies in the MSCI World Small Cap index and the German DAX. “The basic task was to extract GHG emissions values from PDF files into a table,” says Schierholz. “What first sounded straightforward turned out to be surprisingly complex.”

In a multi-stage process, sustainable finance experts from LMU and Deutsche Bundesbank worked with methodologists to define strict annotation rules, ran multiple rounds of extraction and verification, and convened expert discussion groups. “If you want a dataset that’s both accurate and allows for comparisons between companies, you need clear rules and plenty of feedback loops throughout the data annotation process,” says Jacob Beck, who led the annotation effort. “In the end, some ambiguous cases still required expert group discussion.”

Many companies do not provide sufficient documentation

Sustainable Finance researcher Dr. Andreas Dimmelmeier (GreenDIA consortium) was not surprised: “The hard-to-resolve cases stem not only from complex and partly inconsistent reporting protocols, but also from missing context and incomplete disclosures in company reports. Many companies in our sample did not disclose emissions according to established reporting and calculation frameworks.”

The team also observed that about half of the reports contained no usable greenhouse gas data at all. When emissions were reported, they most often referred to direct emissions and indirect emissions from energy consumption. Data on other indirect emissions, such as those arising in the supply chain or from travel and transport, was rarely complete.

The dataset, together with scripts and supplementary materials, offers a transparent, rigorously curated foundation for evaluating automated approaches to sustainability reporting. By making the assumptions and decisions explicit, it enables fair method comparisons and clearer communication of annotation uncertainty. The GIST group hopes this resource will help researchers and practitioners measure progress more honestly and close critical data gaps on the path to net zero.
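
To illustrate how a gold-standard dataset can serve as a point of reference, the following minimal Python sketch compares automatically extracted values with expert-annotated ones. It is not the GIST group's evaluation code. The record layout (company, year, scope), the function name `score_extraction`, the unit (tonnes of CO2 equivalent) and the 1% tolerance are all assumptions made for illustration only.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key: (company, reporting year, emission category).
Key = Tuple[str, int, str]


def score_extraction(
    gold: Dict[Key, Optional[float]],
    predicted: Dict[Key, Optional[float]],
    rel_tol: float = 0.01,
) -> Dict[str, float]:
    """Compare extracted emission values against gold-standard annotations.

    A value of None means "not reported". The relative tolerance is an
    assumed setting, not one taken from the published benchmark.
    """
    correct = wrong_value = missed = spurious = 0
    for key, gold_value in gold.items():
        pred_value = predicted.get(key)
        if gold_value is None:
            # Experts found no value: the extractor should also return nothing.
            if pred_value is None:
                correct += 1
            else:
                spurious += 1
        elif pred_value is None:
            # Experts found a value, but the extractor returned nothing.
            missed += 1
        elif abs(pred_value - gold_value) <= rel_tol * abs(gold_value):
            correct += 1
        else:
            wrong_value += 1
    total = len(gold)
    return {
        "correct": correct,
        "wrong_value": wrong_value,
        "missed": missed,
        "spurious": spurious,
        "accuracy": correct / total if total else 0.0,
    }
```

Counting wrong, missed and spurious values separately, rather than reporting a single accuracy figure, makes it visible whether a method tends to misread numbers or to report values that experts judged to be absent from the report.
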
Jacob Beck, Anna Steinberg, Andreas Dimmelmeier, Laia Domenech Burin, Emily Kormanyos, Maurice Fehr & Malte Schierholz: Addressing data gaps in sustainability reporting: A benchmark dataset for greenhouse gas emission extraction. Scientific Data 2025
https://doi.org/10.1038/s41597-025-05664-8
