By integrating multisource UAV imaging and deep learning, a new transformer-based model enables earlier and more reliable identification of high-yield lines, offering breeders a powerful tool to accelerate the development of climate-resilient wheat varieties.
Wheat is a staple crop for more than 40% of the world’s population, and boosting its productivity remains one of the most effective ways to combat food insecurity. Traditionally, estimating yield has depended on destructive, labor-intensive sampling methods that involve threshing and weighing seeds. Such approaches are slow, costly, and prone to human error, limiting their use in large-scale breeding programs. In recent years, UAV-based remote sensing has emerged as a high-throughput, non-destructive alternative, enabling the extraction of vegetation indices (VIs) and morphological traits. While machine learning models using these data have shown promise, many rely on simple concatenation strategies that dilute the predictive power of multivariate datasets, leading to modest accuracy. To overcome these challenges, researchers turned to advanced transformer-based AI models capable of leveraging time series data and multiple data modalities.
A study (DOI: 10.1016/j.plaphe.2025.100039) published in Plant Phenomics on 30 March 2025 by Zhaoyu Zhai’s team at Nanjing Agricultural University reports a method for highly accurate, non-destructive prediction of wheat yield, which could accelerate the selection of high-yield varieties and support food security.
The study employed a novel transformer-based model with a variate-independent tokenization approach to analyze UAV-derived multivariate time series data for wheat yield prediction. Specifically, the researchers extracted 14 vegetation indices and 28 morphological traits from RGB and multispectral imagery to evaluate spectral and structural variations across different wheat varieties and nitrogen treatments. By visualizing indices such as CVI, ARI, NDVI, and NCPI over time, they confirmed the reliability of these parameters in capturing growth dynamics, with high-yield varieties showing consistently stronger signals. Morphological traits, including plant height and canopy structure derived from synthetic point clouds, further correlated with yield outcomes and revealed both expected and unexpected patterns, such as yield decline due to lodging in certain varieties under high nitrogen application.
Building on this foundation, the researchers benchmarked the proposed model against recurrent neural networks, LSTMs, and other transformer variants. An analysis of feature representations using centered kernel alignment (CKA) indicated that the model captured multivariate relationships better than these baselines. It also achieved the highest R² (0.862) and the lowest mean absolute error (0.057) among the compared methods. Its robustness was further validated under varying nitrogen treatments, where medium-nitrogen conditions yielded the best predictive accuracy, a result attributed to balanced canopy development.
Importantly, the attention mechanism revealed that height-related traits and specific vegetation indices such as SIPI and PSRI were the most influential predictors, with the flowering and maturation stages contributing most strongly to yield estimation. The model also generalized reliably, maintaining R² values above 0.8 on unseen data from different years and varieties.
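The variate-independent tokenization described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the common reading in which each variate's entire time series is projected into a single token, so that attention operates across traits rather than across time steps. The function names, the number of time points, the embedding size, and the random weights are all hypothetical; only the count of 42 variates (14 vegetation indices plus 28 morphological traits) comes from the study.

```python
import numpy as np

def variate_tokens(series, weight, bias):
    """Embed each variate's full time series as one token.

    series: (T, V) array, T time points by V variates.
    weight: (T, D) projection shared by all variates (random here, an assumption).
    Returns a (V, D) array: one D-dimensional token per variate.
    """
    return series.T @ weight + bias

def attention_over_variates(tokens, w_q, w_k):
    """Softmax attention weights between variate tokens, shape (V, V)."""
    q = tokens @ w_q
    k = tokens @ w_k
    scores = q @ k.T / np.sqrt(q.shape[1])
    # Subtract the row maximum for numerical stability before exponentiating.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
T, V, D = 6, 42, 16  # V = 14 VIs + 28 traits; T and D are illustrative only
series = rng.normal(size=(T, V))             # stand-in for UAV-derived time series
weight = rng.normal(size=(T, D)) / np.sqrt(T)
bias = np.zeros(D)

tokens = variate_tokens(series, weight, bias)
attn = attention_over_variates(
    tokens, rng.normal(size=(D, D)), rng.normal(size=(D, D))
)
print(tokens.shape, attn.shape)
```

Because each row of `attn` describes how strongly one trait attends to every other trait, weights of this kind are the sort of quantity that can be inspected to rank influential predictors, although the study's actual attribution procedure may differ.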
Collectively, these results confirm that combining vegetation indices and morphological traits, integrated through a refined transformer architecture, substantially enhances the accuracy and interpretability of wheat yield prediction at the plot level.
This new approach offers a scalable, cost-effective tool for breeders seeking to identify high-yield, climate-adapted wheat varieties. By enabling precise, non-destructive, and rapid yield predictions, the model reduces dependence on manual measurements and accelerates the breeding cycle. Beyond wheat, the framework can be adapted to other staple crops, supporting global efforts to enhance food production under increasingly variable environmental conditions. The ability to interpret variable importance also improves transparency in AI-assisted agriculture, helping breeders understand which traits most influence yield outcomes.
###
References
DOI
10.1016/j.plaphe.2025.100039
Original URL
https://doi.org/10.1016/j.plaphe.2025.100039
Funding information
This work was supported by the Natural Science Foundation of Jiangsu Province (Grant No. BK20231004), the National Natural Science Foundation of China (Grant No. 32401697), and the National Key Research and Development Program of China (2022YFE0116200).
About Plant Phenomics
Plant Phenomics is dedicated to publishing novel research that will advance all aspects of plant phenotyping from the cell to the plant population levels using innovative combinations of sensor systems and data analytics. Plant Phenomics aims also to connect phenomics to other science domains, such as genomics, genetics, physiology, molecular biology, bioinformatics, statistics, mathematics, and computer sciences. Plant Phenomics should thus contribute to advance plant sciences and agriculture/forestry/horticulture by addressing key scientific challenges in the area of plant phenomics.