By learning directly from unlabeled datasets, Plant-MAE reduces reliance on exhaustive annotations, achieving state-of-the-art performance across multiple crops and environments.
Plant phenotyping links genotype to phenotype and environment, offering insights essential for crop breeding, genomic analysis, and sustainable farming. Traditional methods, based on manual measurements or 2D imaging, are time-consuming, prone to errors, and incapable of capturing full plant architecture. In contrast, 3D point cloud technology models plants with unprecedented accuracy, enabling researchers to examine complex traits such as canopy architecture and organ morphology. Yet, annotating these datasets is difficult and costly, limiting the progress of deep learning approaches. Self-supervised learning has recently emerged as a promising solution in fields such as robotics and autonomous driving. Its adoption in plant science could address major bottlenecks in phenotyping.
A study (DOI: 10.1016/j.plaphe.2025.100049), published in Plant Phenomics on 6 May 2025 by Ruifang Zhai’s team at Huazhong Agricultural University, introduces Plant-MAE, a self-supervised model that not only accelerates phenotypic trait measurement but also provides a scalable pathway for integrating genomics, breeding, and precision agriculture into real-world farming.
In this study, researchers designed an experimental framework to evaluate the Plant-MAE model for plant organ segmentation. To ensure generalizability, they first divided a large collection of crop point clouds into two distinct sets: a pretraining dataset of 3,463 point clouds from eight crops, used for mask reconstruction to learn latent features, and an SVM dataset of 435 point clouds from five other crops, used to validate pretraining effectiveness and select optimal weights. Importantly, the pretraining data were unlabeled, demonstrating the model’s reliance on self-supervised learning. For the segmentation task, datasets of maize, tomato, and potato were employed, with additional validation on public datasets such as Pheno4D and Soybean-MVS to test generalization.

To manage computational costs, voxel downsampling and farthest point sampling standardized point clouds to 5,000, 2,048, or 10,000 points depending on the task, while data augmentation techniques, including cropping, jittering, scaling, and rotation, expanded the dataset’s diversity. Pretraining ran for 500 epochs with a batch size of 520 using AdamW optimization, while fine-tuning used 300 epochs with a batch size of 20. Model stability was enhanced through scale normalization and a hierarchical encoder. Performance was evaluated using multiple metrics, including precision, recall, F1 score, and mean intersection over union (mIoU).

Results showed that Plant-MAE achieved strong segmentation accuracy across diverse data acquisition methods, including terrestrial laser scanning (TLS), image-derived point clouds, and laser triangulation. For maize and potato TLS data, the model delivered high accuracy, surpassing the baseline Point-M2AE, although it exhibited lower recall for tassels due to class imbalance. On tomatoes and cabbages, Plant-MAE accurately segmented organs even under dense canopies, with performance exceeding 80% across all metrics.
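The two preprocessing steps named above, voxel downsampling followed by farthest point sampling, can be sketched in a few lines of NumPy. This is an illustrative implementation, not the authors' code; the voxel size and the synthetic input cloud are placeholder assumptions.

```python
import numpy as np

def voxel_downsample(points: np.ndarray, voxel_size: float) -> np.ndarray:
    """Keep one representative point per occupied voxel (the first seen)."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    _, first_idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(first_idx)]

def farthest_point_sample(points: np.ndarray, n_samples: int) -> np.ndarray:
    """Greedily pick n_samples points that are maximally spread out."""
    selected = np.zeros(n_samples, dtype=np.int64)   # seed with point 0
    dist = np.full(len(points), np.inf)              # distance to nearest pick so far
    for i in range(1, n_samples):
        dist = np.minimum(
            dist, np.linalg.norm(points - points[selected[i - 1]], axis=1)
        )
        selected[i] = np.argmax(dist)                # farthest from all picks
    return points[selected]

# Standardize a raw scan to 2,048 points, one of the sizes used in the study.
raw_cloud = np.random.default_rng(42).random((20_000, 3))  # stand-in for a scan
cloud = voxel_downsample(raw_cloud, voxel_size=0.05)
cloud = farthest_point_sample(cloud, n_samples=2_048)
```

Voxel downsampling cheaply thins dense regions, after which farthest point sampling distributes the remaining budget evenly over the plant so thin organs are not lost.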
On the Pheno4D dataset, the model achieved near-perfect segmentation, further validating its robustness. Comparative experiments revealed that Plant-MAE consistently outperformed PointNet++, Point Transformer, and other state-of-the-art models, confirming its superior accuracy, stability, and adaptability across crops and environments.
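The metrics used throughout the evaluation (precision, recall, F1, and mIoU) are standard for point-wise segmentation and can be computed per organ class as follows. This is a generic sketch, not the paper's evaluation code.

```python
import numpy as np

def segmentation_metrics(y_true: np.ndarray, y_pred: np.ndarray, n_classes: int):
    """Per-class precision/recall/F1 and mean IoU for point-wise labels."""
    per_class, ious = [], []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        ious.append(tp / (tp + fp + fn) if tp + fp + fn else 0.0)
        per_class.append({"precision": precision, "recall": recall, "f1": f1})
    return per_class, float(np.mean(ious))  # mIoU averages IoU over classes
```

Class imbalance shows up directly in this formulation: a small organ such as a tassel contributes few true points, so even a handful of missed points drags its per-class recall down while overall accuracy stays high.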
By overcoming the annotation bottleneck, Plant-MAE paves the way for high-throughput, cross-crop phenotyping. Its ability to generalize across species and environments enables breeders and agronomists to monitor crop growth, assess stress responses, and quantify traits more effectively. This supports precision agriculture, where informed decisions on irrigation, fertilization, and pest management can improve yield and reduce resource waste. Furthermore, the model’s adaptability to field conditions ensures practical utility beyond controlled laboratory environments, offering a tool for real-time monitoring in diverse agricultural landscapes.
###
References
DOI
10.1016/j.plaphe.2025.100049
Original URL
https://doi.org/10.1016/j.plaphe.2025.100049
Funding information
This work was supported by the National Key Research and Development Program of China (2023YFF1000100, 2022YFD2002304).
About Plant Phenomics
Plant Phenomics is dedicated to publishing novel research that will advance all aspects of plant phenotyping, from the cell to the plant population level, using innovative combinations of sensor systems and data analytics. Plant Phenomics also aims to connect phenomics to other science domains, such as genomics, genetics, physiology, molecular biology, bioinformatics, statistics, mathematics, and computer sciences. Plant Phenomics should thus contribute to advancing plant sciences and agriculture, forestry, and horticulture by addressing key scientific challenges in the area of plant phenomics.