Molecular property prediction has long been a critical challenge in drug development. Traditional quantum chemistry methods are inefficient when dealing with large-scale molecular databases, failing to meet the demands of modern drug discovery. Existing AI models for this task often face a trade-off between model size and performance, with larger models generally achieving better results at the cost of increased computational requirements and reduced accessibility.
To address this issue, a research team led by Xuefeng CUI and Wei ZHAO from Shandong University published their latest findings on 15 May 2025 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
The team proposed a lightweight deep learning framework called MetaGIN for efficient and accurate molecular property prediction. Their results demonstrate that MetaGIN achieves state-of-the-art performance on multiple benchmark datasets while using significantly fewer parameters than existing models, effectively balancing efficiency and accuracy in molecular property prediction. On the PCQM4Mv2 dataset, MetaGIN achieved a mean absolute error of 0.0851, surpassing many more complex models and striking a favorable trade-off between model size and performance.
In this study, the team analyzed how three-dimensional molecular structure affects molecular properties and proposed a novel "3-hop convolution" technique to capture these complex spatial relationships. To predict molecular properties more accurately, they applied the MetaFormer architecture to graph neural networks, combining 3-hop convolution with graph propagation modules.
Specifically, MetaGIN first applies 3-hop convolution to capture the local structural information of a molecule, then integrates global information through the graph propagation module. Considering local and global molecular features simultaneously in this way improves prediction accuracy.
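To make the "3-hop" idea concrete, the following is a minimal NumPy sketch of multi-hop neighborhood aggregation on a molecular graph. It is an illustration of the general technique, not the authors' implementation: the function name, the mean aggregator, and the per-hop feature concatenation are assumptions for this example, and the real MetaGIN model learns its aggregation inside a MetaFormer-style network.

```python
import numpy as np

def khop_convolution(adj, features, k=3):
    """Aggregate node features from neighbors exactly 1, 2, ..., k hops away.

    adj      : (n, n) binary adjacency matrix without self-loops
    features : (n, d) node feature matrix
    returns  : (n, d * k) matrix, concatenating one mean-aggregated
               feature block per hop distance (hypothetical design,
               used here only to illustrate multi-hop aggregation)
    """
    n = adj.shape[0]
    frontier = np.eye(n, dtype=bool)   # nodes at the current hop distance
    visited = np.eye(n, dtype=bool)    # nodes already within a closer hop
    hop_feats = []
    for _ in range(k):
        # expand the frontier by one hop, excluding already-visited nodes,
        # so each row marks neighbors at exactly this hop distance
        frontier = (frontier @ adj > 0) & ~visited
        visited |= frontier
        counts = frontier.sum(axis=1, keepdims=True)
        # mean of the features of exactly-h-hop neighbors (zeros if none)
        agg = (frontier @ features) / np.maximum(counts, 1)
        hop_feats.append(agg)
    return np.concatenate(hop_feats, axis=1)
```

For example, on a 4-node path graph 0-1-2-3 with one-hot node features, node 0's hop-1 block is the feature of node 1 and its hop-3 block is the feature of node 3. In a full model, a global step (the role played by MetaGIN's graph propagation module) would then mix information across all nodes rather than only the local k-hop neighborhood.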
The research team conducted extensive experiments on the PCQM4Mv2 dataset and the MoleculeNet benchmark. Results show that MetaGIN outperforms most existing methods on multiple datasets while remaining lightweight, with fewer than 10 million parameters.
Future work will focus on pre-training techniques specifically designed for small molecules, which could further enhance the model's performance and generalization capabilities across various molecular property prediction tasks.
Additionally, the source code for the MetaGIN model is publicly accessible at: https://github.com/xwxztq/MetaGIN.
DOI: 10.1007/s11704-024-3784-y