Conventional LCA faces a persistent bottleneck: background databases like ecoinvent, while comprehensive, often fail to cover all activity types within a sector. Downstream fine chemicals may lack usable data even when upstream bulk chemicals are well represented. Compiling life cycle inventories requires matching each input and output to corresponding background data—a process that is costly, time-consuming, and demands extensive domain expertise. Existing machine learning solutions have tackled these challenges but remain confined to single sectors, trained on hand-crafted features for agriculture, construction, or chemicals in isolation. Due to these limitations, there is a pressing need for a generalizable approach that can bridge sectors and scale across the full spectrum of industrial activities.
A research team from Tsinghua University, Shanghai E-Carbon Digital Technology Co., Ltd., Shanghai HiQ Smart Data Co., Ltd. (HiQLCD), and The University of Hong Kong, led by Professor Shanying Hu and Zhijun Gui, with Kai Zhao and Biao Luo as the first authors, has developed LCA-TextNet, a deep learning model that predicts life cycle impact assessment (LCIA) results from knowledge-based textual descriptions. The work, accepted for publication (DOI: 10.1016/j.ese.2026.100724) on June 16, 2026, in Environmental Science and Ecotechnology, represents a significant step toward cross-sector generalization in artificial intelligence-driven environmental assessment.
The researchers trained a Transformer-based architecture on over 16,000 activity datasets from the ecoinvent database (version 3.10), using seven categories of textual information as inputs. Rather than relying on domain-specific features like molecular descriptors for chemicals or building parameters for construction, the model maps high-dimensional text embeddings into a unified semantic space. The results are striking: LCA-TextNet achieves high accuracy (R² > 0.8) across 70% of sectors and for 17 of 25 environmental impact indicators. The model performs particularly well in data-rich, semantically coherent sectors such as waste treatment and recycling and wood products. However, performance varies—sectors like transport, water supply, and land use proved more challenging due to small sample sizes and heterogeneous textual descriptions. To address real-world deployment, the team introduced an "applicability domain" assessment that flags out-of-distribution predictions, helping users know when to trust the model and when to defer to expert review. When tested on newly introduced data from ecoinvent version 3.12—a realistic scenario where the model encounters previously unseen activities—incremental learning reduced the climate change mean absolute error by 70%, from 2.0 to 0.6 kg CO₂ equivalent per unit.
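To illustrate the kind of check an applicability-domain assessment can perform, the minimal Python sketch below flags a query as out-of-distribution when its mean distance to its nearest training embeddings exceeds a cutoff derived from the training set. This release does not describe the team's actual method, so the k-nearest-neighbour rule, the 95% cutoff, the function names, and the random vectors standing in for text embeddings are all assumptions made for illustration.

```python
import math
import random

def knn_mean_distance(point, reference, k, skip_self=False):
    """Mean Euclidean distance from `point` to its k nearest reference vectors."""
    dists = sorted(math.dist(point, ref) for ref in reference)
    if skip_self:
        dists = dists[1:]  # drop the zero distance of a training point to itself
    return sum(dists[:k]) / k

def fit_threshold(train, k=5, quantile=0.95):
    """Cutoff taken at the given quantile of the training points' own k-NN scores."""
    scores = sorted(knn_mean_distance(p, train, k, skip_self=True) for p in train)
    return scores[min(len(scores) - 1, int(quantile * len(scores)))]

def in_domain(point, train, threshold, k=5):
    """True when the query lies about as close to the training data as training points do."""
    return knn_mean_distance(point, train, k) <= threshold

# Random stand-ins for embeddings of training activities (assumption, not real data).
rng = random.Random(0)
train = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
threshold = fit_threshold(train)

centroid = [sum(col) / len(train) for col in zip(*train)]  # query inside the data cloud
far_away = [25.0] * 8                                       # query far outside it

print(in_domain(centroid, train, threshold))
print(in_domain(far_away, train, threshold))
```

In this toy setup the centroid query passes the check and the distant query is flagged, which is the behavior a user would rely on to decide whether a prediction needs expert review.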
"The biggest bottleneck in LCA has always been data—not just the lack of it, but the sheer effort required to turn scattered process knowledge into structured inventory data," the authors said. "We realized that while inventory data are scarce, descriptive text is everywhere. Product names, process descriptions, technical comments—this information is readily available. Moreover, human inventory compilation itself is essentially a process of interpreting knowledge-rich text and translating it into structured, quantitative data. So we asked: can we train an artificial intelligence model to read that text and estimate environmental impacts directly? LCA-TextNet is our answer. It doesn't replace rigorous LCA—but it gives you a fast, reliable first estimate when you have nothing else to go on."
The framework offers a practical pathway for integrating natural language understanding into environmental modeling. When life cycle inventory data are available, LCA-TextNet can serve as a background data completion tool, filling gaps for inventory items that lack matching database entries. When inventory data are entirely unavailable, the model can directly predict LCIA results from textual functional-unit descriptions—enabling rapid screening-level assessments for early-stage designs, policy studies, and environmental, social, and governance (ESG) reporting. By exploiting the asymmetry between abundant descriptive information and scarce quantitative data, LCA-TextNet transforms textual knowledge into actionable environmental intelligence, potentially accelerating green transitions across industries where traditional LCA would otherwise stall. The model's code is publicly available on the HiQLCD GitHub repository (https://github.com/HiQ-LCD/LCATEXTNet), allowing researchers with valid LCA database licenses to retrain and apply the framework within the scope permitted by data licenses.
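The background data completion use case can be pictured with a short Python sketch: inventory items with a matching database entry use the stored factor, and the remaining items fall back to a text-based estimate and are reported as such. Everything here is hypothetical. The factors are made-up placeholders rather than ecoinvent values, `stub_predictor` merely stands in for a trained text model, and none of the names reflect the API of the published repository.

```python
from typing import Callable, Dict, List, Tuple

def climate_impact(
    inventory: List[Tuple[str, float]],
    background: Dict[str, float],
    predict_from_text: Callable[[str], float],
) -> Tuple[float, List[str]]:
    """Sum amount x factor over the inventory, estimating missing factors from text."""
    total = 0.0
    estimated = []  # descriptions whose factor came from the predictor
    for description, amount in inventory:
        factor = background.get(description)
        if factor is None:
            # No matching background entry: fall back to the text-based estimate.
            factor = predict_from_text(description)
            estimated.append(description)
        total += amount * factor
    return total, estimated

# Placeholder factors in kg CO2-eq per unit (illustrative, not database values).
background = {
    "electricity, medium voltage": 0.5,
    "steam, from natural gas": 0.25,
}

def stub_predictor(description: str) -> float:
    """Hypothetical stand-in for a text-based impact model."""
    return 2.0

inventory = [
    ("electricity, medium voltage", 10.0),
    ("steam, from natural gas", 4.0),
    ("specialty solvent, fine chemical", 1.5),
]

total, estimated = climate_impact(inventory, background, stub_predictor)
print(total, estimated)
```

Returning the list of estimated items alongside the total keeps the screening result transparent about which contributions rest on a model prediction rather than on database values.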
###
References
DOI
10.1016/j.ese.2026.100724
Original Source URL
https://doi.org/10.1016/j.ese.2026.100724
Funding information
The research was supported by the National Natural Science Foundation of China (No. U24B6016).
About Environmental Science and Ecotechnology
Environmental Science and Ecotechnology (ISSN 2666-4984) is an international, peer-reviewed, and open-access journal published by Elsevier. The journal publishes significant views and research across the full spectrum of ecology and environmental sciences, such as climate change, sustainability, biodiversity conservation, environment & health, green catalysis/processing for pollution control, and AI-driven environmental engineering. The latest impact factor of ESE is 14.3, according to the Journal Citation Reports™ 2024.