For years, lie detection depended heavily on polygraphs and other physiological tests, but these methods have long faced questions of reliability, intrusiveness, and limited usability outside controlled settings. Real-world deception is more complicated: it may appear through unnatural pauses, pitch shifts, blink frequency, microexpressions, gaze changes, posture, or semantic inconsistency in language. Because these signals vary greatly from person to person, systems based on only one modality often struggle to remain robust. That challenge has pushed researchers toward multimodal approaches that can combine verbal and nonverbal evidence, capture richer behavioral patterns, and better resist noise, bias, or missing cues, and it motivates deeper research into multimodal deception detection.
Researchers from Great Bay University, Nanyang Technological University, Wuhan University, Hefei University of Technology, Macau University of Science and Technology, and the Shenzhen Campus of Sun Yat-sen University published this survey (DOI: 10.1007/s11633-025-1625-x) in Machine Intelligence Research in 2026. The paper reviews the foundations and recent progress of multimodal deception detection, including benchmark datasets, feature design, evaluation methods, fusion strategies, and model architectures. It also highlights how the field has shifted from handcrafted features and conventional classifiers toward deep learning, transfer learning, and more adaptive systems designed for increasingly realistic scenarios.
One of the survey’s strongest contributions is the way it connects technical progress with the growing complexity of real-world data. It shows that the field has evolved from small, controlled datasets to larger and more diverse resources such as DOLOS, MDPE, and SEUMLD. DOLOS, for example, includes 1,675 video clips from 213 participants and adds fine-grained annotations of facial and vocal behaviors, while MDPE and SEUMLD extend the landscape with larger Chinese-language multimodal datasets. The paper also explains why evaluation must go beyond raw accuracy: in imbalanced datasets such as Box of Lies, accuracy alone can be misleading, making metrics such as F1 score and area under the curve more meaningful. Methodologically, the survey traces a clear transition from traditional feature engineering to deep learning systems that model temporal dynamics, multimodal fusion, transfer learning, and even unsupervised strategies, showing a field that is rapidly becoming more ambitious, more data-driven, and more application-oriented.
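The point about imbalanced evaluation can be made concrete with a toy illustration (the numbers below are hypothetical, not drawn from the survey's datasets): if most clips in a corpus are truthful, a degenerate classifier that always predicts "truthful" scores high accuracy while detecting no deception at all, which the F1 score immediately exposes.

```python
# Hypothetical imbalanced corpus: 90 of 100 clips truthful (0), 10 deceptive (1).
y_true = [1] * 10 + [0] * 90
y_pred = [0] * 100  # degenerate classifier: always predicts "truthful"

# Accuracy: fraction of correct predictions.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# F1 for the deceptive (positive) class, computed from scratch.
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(accuracy)  # 0.9 -- looks strong on paper
print(f1)        # 0.0 -- the classifier never catches a single lie
```

The same logic explains the survey's preference for F1 and area under the curve over raw accuracy on skewed benchmarks such as Box of Lies.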
“This survey makes one message especially clear: teaching machines to detect deception is no longer about searching for a single telltale sign, but about understanding how voice, face, language, and behavior interact in context,” the paper suggests. “The next leap will depend not just on stronger models, but on better datasets, fairer evaluation, and systems that remain transparent and cautious when used in high-stakes settings.” That conclusion gives the article a wider significance: it is not simply cataloging tools, but reframing deception detection as a multidisciplinary problem shaped by computer vision, speech processing, psychology, and ethics at once.
The implications of this review extend far beyond academic benchmarking. Multimodal deception detection could influence security screening, forensic analysis, digital communication assessment, and other scenarios where trust is hard to verify quickly. At the same time, the survey stresses that technical progress must be matched by ethical restraint. Systems built on facial video, speech, and physiological data raise serious concerns about privacy, fairness, legality, and possible misuse, especially when applied to culturally diverse populations or judicial settings. In that sense, the paper points to a dual future for the field: more powerful multimodal AI on one side, and a growing need for human oversight, accountability, and responsible deployment on the other.
###
References
DOI
10.1007/s11633-025-1625-x
Original Source URL
https://doi.org/10.1007/s11633-025-1625-x
Funding Information
This work was supported by the National Natural Science Foundation of China (Nos. 62576076, 62441619, 62572359 and U22A201181), Guangdong Basic and Applied Basic Research Foundation, China (No. 2023A1515140037), and sponsored by the CCF-Tencent Rhino-Bird Open Research Fund.
About Machine Intelligence Research
Machine Intelligence Research (original title: International Journal of Automation and Computing) is published by Springer and sponsored by the Institute of Automation, Chinese Academy of Sciences. The journal publishes high-quality papers on original theoretical and experimental research, targets special issues on emerging topics, and strives to bridge the gap between theoretical research and practical applications.