A CBBL research team led by Professor Balachandran Manavalan from the Department of Integrative Biotechnology at Sungkyunkwan University has developed DeepTYLCV, an accurate and interpretable artificial intelligence model for predicting the virulence of Tomato Yellow Leaf Curl Virus (TYLCV). The study, conducted with co-first authors Dr. Nattanong Bupi, Hariharan Sangaraju, and Duong Thanh Tran, was published in the leading plant science journal Plant Communications (Impact Factor: 11.6; JCR: 6/273; Top 2.2% in the Plant Sciences category).
TYLCV is one of the most destructive viral pathogens affecting tomato production worldwide. Severe TYLCV strains can cause leaf curling, yellowing, stunted growth, and major yield losses. In recent years, highly virulent strains have continued to spread across regions and, in some cases, overcome genetic resistance in tomato cultivars. These challenges highlight the urgent need for accurate, early, scalable, and sequence-based disease surveillance.
Prof. Manavalan’s team works extensively at the interface of biology and artificial intelligence, developing AI-based solutions for peptide therapeutics, prediction of RNA/DNA modifications, protein function analysis, toxicity prediction, plant science, and biomedical applications. In 2023, the team developed IML-TYLCV, the first genome-based TYLCV severity prediction tool, which was published in the high-impact journal Research (IF: 10.9). However, IML-TYLCV was trained mainly on Korean isolates, which limits its applicability to globally diverse TYLCV strains. This limitation motivated the development of DeepTYLCV, a more robust AI framework for predicting TYLCV virulence across global viral isolates.
Unlike conventional field diagnosis or image-based AI models, which depend on visible symptoms and can be influenced by environmental conditions, DeepTYLCV uses viral genome-derived sequence information. This enables the model to identify mild and severe strains before symptom-based confirmation and provides a scalable strategy for monitoring emerging viral variants.
DeepTYLCV integrates protein language model embeddings with a hybrid architecture that combines a Transformer encoder and a multi-scale convolutional neural network, enabling the model to capture both global sequence patterns and local virulence-associated motifs. By combining deep sequence representations with optimized conventional feature descriptors, DeepTYLCV achieved superior predictive performance compared with the previous IML-TYLCV model.
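For readers curious how a global and a local view of a sequence can be combined, the following is a minimal, dependency-free sketch and not the DeepTYLCV implementation. It assumes per-residue embeddings are already available (random numbers stand in for protein language model output), reduces the Transformer encoder to a single self-attention step without learned projections, and runs the attention and convolution branches in parallel. The branch wiring, kernel widths (3, 5, 7), filter counts, and all function names are illustrative assumptions; the release does not specify them.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with identity projections (a simplification).

    Every position attends to every other position, which gives a global view.
    """
    dim = len(x[0])
    out = []
    for query in x:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim) for key in x]
        weights = softmax(scores)
        out.append([sum(weights[j] * x[j][c] for j in range(len(x))) for c in range(dim)])
    return out


def conv1d_maxpool(x, kernel):
    """Slide one filter over the sequence, apply ReLU, and keep the strongest response."""
    width, dim = len(kernel), len(x[0])
    best = 0.0  # ReLU floor: negative responses are clipped to zero
    for start in range(len(x) - width + 1):
        response = sum(kernel[j][c] * x[start + j][c] for j in range(width) for c in range(dim))
        best = max(best, response)
    return best


def hybrid_features(x, kernels_by_width):
    """Concatenate mean-pooled attention output with multi-scale convolution responses."""
    attended = self_attention(x)
    dim = len(x[0])
    global_part = [sum(row[c] for row in attended) / len(attended) for c in range(dim)]
    local_part = [conv1d_maxpool(x, k) for kernels in kernels_by_width.values() for k in kernels]
    return global_part + local_part


def severe_probability(features, weights, bias=0.0):
    """Logistic output head; the weights here are untrained placeholders."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def random_filter(width, dim, rng):
    """Hypothetical helper: one random filter of the given width."""
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(width)]


rng = random.Random(0)
seq_len, dim = 12, 4  # toy sizes; real embeddings are far larger
embeddings = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(seq_len)]
# Assumed kernel widths 3, 5, and 7 with two filters each
kernels_by_width = {w: [random_filter(w, dim, rng) for _ in range(2)] for w in (3, 5, 7)}

features = hybrid_features(embeddings, kernels_by_width)
head_weights = [rng.uniform(-1, 1) for _ in features]
probability = severe_probability(features, head_weights)
print(len(features), round(probability, 3))
```

The attention branch lets every position influence every other, which is one way to capture global sequence patterns, while filters of different widths respond to short motifs of different lengths. A trained model would learn the filters and output weights from labeled isolates rather than drawing them at random.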
A key strength of this study is its experimental validation. The research team performed blind predictions for 15 TYLCV isolates, including both international reference isolates and Korean field isolates. These predictions were validated using tomato plant infection assays, symptom severity scoring, and viral accumulation analysis. Remarkably, DeepTYLCV achieved 100% concordance between predicted and experimentally observed virulence classes, demonstrating its practical value for identifying emerging severe TYLCV variants.
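Concordance here is simply the fraction of isolates whose predicted class matches the experimentally observed class. A minimal sketch of that calculation follows; the function name and the five toy labels are illustrative placeholders, not data from the study.

```python
def concordance(predicted, observed):
    """Return the fraction of isolates whose predicted class equals the observed class."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("at least one isolate is required")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(predicted)


# Toy placeholder labels (NOT study data): one of five predictions disagrees
toy_predicted = ["severe", "mild", "severe", "severe", "mild"]
toy_observed = ["severe", "mild", "mild", "severe", "mild"]
toy_score = concordance(toy_predicted, toy_observed)
print(f"{toy_score:.0%}")
```

By this measure, 15 matching predictions out of 15 isolates correspond to the 100% concordance reported for DeepTYLCV.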
This work provides a powerful example of how AI, viral genomics, and experimental plant pathology can be integrated to support precision agriculture and plant disease management. DeepTYLCV may serve as a valuable tool for early viral surveillance, resistance breeding programs, and rapid assessment of newly emerging TYLCV strains.
This research was supported by the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT, Republic of Korea (Grant No. RS-2024-00344752). Additional support was provided by the BK21 FOUR Project of the Department of Integrative Biotechnology, Sungkyunkwan University (SKKU), Republic of Korea.