Supercritical fluids are essential in cleaning, extraction, and chromatography, but determining their critical temperature and pressure experimentally is challenging, especially for polar or thermally unstable compounds. A study published in Frontiers of Chemical Science and Engineering introduces a deep-learning-enhanced quantitative structure-property relationship (QSPR) framework that directly incorporates complete molecular structures to boost predictive accuracy.
The team compiled a diverse data set of 1359 organic compounds (alkanes, alkenes, alcohols, acids, esters, etc.). Using density functional theory (B3LYP/6-311G(d,p)), they optimized each molecular structure and extracted three-dimensional electron density grids. They calculated 400 molecular descriptors with RDKit and used the maximal information coefficient (MIC) to select the 20 most relevant descriptors for the critical temperature (Tc) and critical pressure (pc).
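The descriptor-selection step amounts to ranking every descriptor by its relevance to the target property and keeping the top few. The sketch below is a minimal illustration, not the authors' code: it uses the absolute Pearson correlation as a simple stand-in for MIC (which also captures nonlinear dependence), and the function names and toy values are assumptions made for this example.

```python
from math import sqrt
from statistics import mean


def abs_pearson(xs, ys):
    """Absolute Pearson correlation; a simple stand-in for MIC in this sketch."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information
    return abs(cov) / sqrt(vx * vy)


def select_top_k(descriptors, target, k, score=abs_pearson):
    """Rank descriptor columns by relevance to the target and keep the best k.

    descriptors: dict mapping descriptor name -> list of values (one per compound)
    target: list of property values (e.g. Tc), in the same compound order
    """
    ranked = sorted(
        descriptors,
        key=lambda name: score(descriptors[name], target),
        reverse=True,
    )
    return ranked[:k]


# Toy values for illustration only; they are not taken from the study.
toy_descriptors = {
    "mol_weight": [16.0, 30.1, 44.1, 58.1, 72.2],
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0],
    "noise": [0.3, -0.8, 0.5, -0.1, 0.2],
}
toy_tc = [190.6, 305.3, 369.8, 425.1, 469.7]

selected = select_top_k(toy_descriptors, toy_tc, k=2)
print(selected)
```

Because the scoring function is passed in as an argument, a real MIC implementation could replace `abs_pearson` without changing the selection logic; in the study's setting, `k` would be 20 and the input would hold 400 descriptor columns.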
Three models were built and compared:
- Traditional ANN (descriptor-only). Validation set: for Tc, R² = 0.865, MAPE = 4.14 %; for pc, R² = 0.913, MAPE = 4.77 %.
- 3D ResNet (CNN) (molecular structure only). The model suffered severe overfitting due to the high-dimensional input (108×98×44) and limited data; validation R² dropped to 0.419 for Tc and 0.658 for pc.
- CNN-enhanced ANN: a pretrained (frozen) ANN predicts from descriptors, while a trainable ResNet learns the residual error. This hybrid strategy achieved the highest validation R² of the three models. Validation results:
  - Tc: R² = 0.888, Pearson r = 0.947, MAPE = 5.03 %, MSE = 1682.
  - pc: R² = 0.919, Pearson r = 0.960, MAPE = 6.37 %, MSE = 11.7.
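The residual-learning idea behind the hybrid model can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' architecture: a fixed linear function stands in for the frozen ANN, a least-squares line stands in for the trainable ResNet, and all names and numbers are hypothetical.

```python
def frozen_base(descriptor):
    """Stand-in for the pretrained descriptor model; its parameters stay fixed."""
    return 100.0 + 5.0 * descriptor


def fit_residual_line(features, residuals):
    """Least-squares line residual = a * feature + b (stand-in for the ResNet)."""
    n = len(features)
    mf = sum(features) / n
    mr = sum(residuals) / n
    var = sum((f - mf) ** 2 for f in features)
    a = sum((f - mf) * (r - mr) for f, r in zip(features, residuals)) / var
    b = mr - a * mf
    return a, b


# Hypothetical toy data: one descriptor and one structure-derived feature per compound.
descriptors = [10.0, 20.0, 30.0, 40.0]
structure_feature = [1.0, 2.0, 3.0, 4.0]
targets = [152.0, 204.0, 256.0, 308.0]

# Step 1: the base model is frozen, so only its predictions are needed.
base_preds = [frozen_base(d) for d in descriptors]

# Step 2: the correction model is trained on what the base model got wrong.
residuals = [t - p for t, p in zip(targets, base_preds)]
a, b = fit_residual_line(structure_feature, residuals)


def hybrid_predict(descriptor, feature):
    """Final prediction = frozen base prediction + learned residual correction."""
    return frozen_base(descriptor) + (a * feature + b)


base_error = sum(abs(t - p) for t, p in zip(targets, base_preds))
hybrid_error = sum(
    abs(t - hybrid_predict(d, f))
    for t, d, f in zip(targets, descriptors, structure_feature)
)
print(base_error, hybrid_error)
```

The design point is that the structure-based model only has to explain what the descriptor model leaves unexplained, which is a smaller learning problem than predicting the property from the full 3D grid alone.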
Tenfold cross-validation confirmed robustness (average R² = 0.875 for Tc, 0.924 for pc). All models were compared with the Joback group contribution method, which gave R² = 0.815 (MAPE = 4.92 %) for Tc and R² = 0.916 (MAPE = 5.64 %) for pc, and failed to predict 74 compounds. The CNN-enhanced ANN achieved a higher R² than both the Joback method and the descriptor-only ANN for both properties.
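For readers unfamiliar with the evaluation procedure, the following sketch shows how R² and MAPE can be averaged over k folds. It is a minimal example under assumptions: the splitting scheme, the linear stand-in model, and the synthetic data are all hypothetical, and the study's actual fold assignment is not described here.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (targets must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    for fold in range(k):
        val = indices[fold::k]  # every k-th sample, offset by the fold number
        val_set = set(val)
        train = [i for i in indices if i not in val_set]
        yield train, val


def fit_line(xs, ys):
    """Ordinary least-squares line y = a * x + b (hypothetical stand-in model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx


def cross_validate(xs, ys, k=10):
    """Refit the model on each training split and average the validation metrics."""
    r2s, mapes = [], []
    for train, val in k_fold_indices(len(xs), k):
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        y_val = [ys[i] for i in val]
        preds = [a * xs[i] + b for i in val]
        r2s.append(r2_score(y_val, preds))
        mapes.append(mape(y_val, preds))
    return sum(r2s) / k, sum(mapes) / k


# Synthetic data: a linear trend plus a small deterministic wobble.
xs = [float(i) for i in range(20)]
ys = [3.0 * x + 50.0 + ((7 * i) % 5 - 2) * 0.5 for i, x in enumerate(xs)]

mean_r2, mean_mape = cross_validate(xs, ys, k=10)
print(f"mean R2 = {mean_r2:.3f}, mean MAPE = {mean_mape:.2f} %")
```

In practice the samples would usually be shuffled before splitting so that no fold is biased by the order of the data set.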
The authors note that while their broad-compound model has slightly lower accuracy than models specialized for single classes, it provides reliable predictions across a very wide chemical space. A current limitation is the need for DFT optimization, which raises the barrier for non-specialists; future work will explore faster methods or transfer learning.
This study demonstrates that integrating complete molecular structure information via deep learning can significantly enhance QSPR models, offering a powerful tool for predicting supercritical properties of organic compounds in engineering applications.
DOI: 10.1007/s11705-026-2638-6