Background
Self-driving vehicles rely heavily on interactions with humans, other vehicles, and the surrounding environment. However, the interactive analysis of self-driving is complicated by multiple perception sources, heterogeneous data, and complex real-world scenes. As a result, the behavior of self-driving vehicles is often unclear to us: we do not understand their decisions, and it is difficult to bring them into synergy with human intentions.
Professor Nan Ma of Beijing University of Technology and her research team published a paper titled “Interactive Cognition of Self-driving: A Multi-dimensional Analysis Model and Implementation” in the 2025 issue of Research. We introduce the significance of research in the field of self-driving interactive cognition, detailing its components and underlying infrastructure. Furthermore, we demonstrate how self-driving interactive cognition, inspired by the Wiener model, embodies intelligence in complex environments, stressing the importance of interactive cognition in such environments and the need to evaluate machine interactive cognition scientifically. We then establish a multi-dimensional analysis model of self-driving interactive cognition based on perceptual information acquisition, multi-channel and cross-modal data registration, attention mechanisms, visual recognition and understanding, and embodied dynamic control. Building on this model, we construct a multi-view spatio-temporal graph convolutional network (MV-STGCN) for action recognition to realize vehicle-to-human body language interactive cognition. Most importantly, we propose a novel Nonlinear-CRITIC-TOPSIS-based method to efficiently analyze the interactive cognition of different action recognition algorithms, such as MV-STGCN. Future self-driving vehicles are bound to demonstrate multi-channel, cross-modal intelligent perception and human-vehicle-friendly interaction, and we are committed to better realizing humanoid driving analysis and the embodied intelligence of self-driving vehicles. “Self-driving + interactive cognition” could turn future vehicles into trustworthy interactive wheeled robots that better serve human society.
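To make the action-recognition component more concrete, the sketch below shows a single spatio-temporal graph-convolution block of the kind that underlies skeleton-based models such as MV-STGCN. It is an illustrative PyTorch example, not the authors' implementation; the layer sizes, the 17-joint skeleton, and the identity adjacency placeholder are all assumptions.

```python
import torch
import torch.nn as nn

class STGCNBlock(nn.Module):
    """One spatio-temporal graph-convolution block over skeleton sequences."""

    def __init__(self, in_ch, out_ch, A, t_kernel=9):
        super().__init__()
        self.register_buffer("A", A)                            # (V, V) normalized joint adjacency
        self.spatial = nn.Conv2d(in_ch, out_ch, kernel_size=1)  # per-joint channel mixing
        self.temporal = nn.Conv2d(out_ch, out_ch,
                                  kernel_size=(t_kernel, 1),
                                  padding=(t_kernel // 2, 0))   # convolution along the time axis
        self.bn = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU()

    def forward(self, x):                                  # x: (batch, channels, frames, joints)
        x = self.spatial(x)                                # transform each joint's features
        x = torch.einsum("nctv,vw->nctw", x, self.A)       # aggregate features over adjacent joints
        x = self.temporal(x)                               # aggregate features over neighboring frames
        return self.relu(self.bn(x))

# Toy usage: 2 clips, 3D joint coordinates, 30 frames, a 17-joint skeleton
# (joint count and identity adjacency are placeholders).
V = 17
block = STGCNBlock(in_ch=3, out_ch=64, A=torch.eye(V))
features = block(torch.randn(2, 3, 30, V))                 # -> (2, 64, 30, 17)
print(features.shape)
```

A multi-view variant would run one such branch per camera view and fuse the per-view features before classification.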
Considering the vehicle's embodied intelligence as an essential basis, we first establish an analysis matrix of interactive cognition to achieve humanoid driving analysis. From the perspective of perceptual intelligence, the matrix includes the following dimensions: the analysis of self-driving vehicle sensors such as cameras, radar, and navigation to obtain perceptual data {D_i}; the analysis of multi-channel and cross-modal data registration {S_i}; the analysis of the attention mechanism for perceived information {A_i}; and the analysis of visual recognition and understanding {L_i}. Behavioral intelligence incorporates the following dimensions: the analysis of steering, braking, and acceleration of the vehicle itself with embodied control {C_i}; the analysis of body language interaction between vehicles and humans {P_i}; the analysis of vehicle language interactive cognition {V_i}; and the analysis of synergistic interactive cognition between vehicles and environments {I_i}. Together, these dimensions form the analysis matrix of self-driving interactive cognition, from which a multi-dimensional analysis model of self-driving interactive cognition is further constructed, as shown in Figure 1.
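As a purely illustrative sketch (not the paper's scoring scheme), the snippet below shows how the eight dimension groups could be collapsed into one row of such an analysis matrix. The indicator values, the number of indicators per dimension, and the mean aggregation are all assumptions.

```python
import numpy as np

# Per-dimension indicator scores for one vehicle / one algorithm (hypothetical values).
dimensions = {
    "D": [0.84, 0.80],        # perceptual data acquisition (camera, radar, navigation)
    "S": [0.76],              # multi-channel, cross-modal data registration
    "A": [0.71, 0.69],        # attention mechanism for perceived information
    "L": [0.88],              # visual recognition and understanding
    "C": [0.79, 0.81, 0.77],  # embodied control: steering, braking, acceleration
    "P": [0.91],              # vehicle-human body language interaction
    "V": [0.64],              # vehicle language interactive cognition
    "I": [0.73],              # vehicle-environment synergistic interactive cognition
}

# Collapse each dimension's indicators into a single score; stacking such rows
# for several vehicles or algorithms yields the analysis matrix.
row = np.array([np.mean(scores) for scores in dimensions.values()])
print(dict(zip(dimensions, row.round(3))))
```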
In the foreseeable evolution of automotive systems, human-driven and autonomous vehicles are expected to coexist for decades. Autonomous vehicles, as mobile intelligent agents, increasingly exhibit learning capacities that extend beyond conventional computational intelligence to encompass interactive and memory intelligence, including trial-and-error learning from near-misses and accidents. We develop a multi-dimensional analysis model of self-driving interactive cognition that enables rigorous evaluation of perceptual and behavioral intelligence, with particular emphasis on learning competence. We further introduce a CRITIC-TOPSIS method based on Spearman's rank correlation coefficient to quantify multi-dimensional interactive cognitive abilities. Since 2016, our team has collaborated with several industrial and academic partners—including BAIC Research Institute, Dongfeng Yuexiang Technology, and China Automotive Engineering Research Institute Co., Ltd.—to advance the theory of interactive cognition for autonomous driving and to develop a suite of intelligent interaction systems. These systems support robust interaction and coordination between vehicles and humans, as well as inter-vehicle cooperation, across diverse scenes, environments, sensor modalities, and vehicle platforms, which will be essential for fostering public trust and accelerating the societal adoption of autonomous vehicles.
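The following is a minimal sketch of a CRITIC-TOPSIS evaluation in which the inter-criteria correlation is computed with Spearman's rank correlation coefficient rather than Pearson's. The decision matrix, the benefit-criteria assumption, and the specific normalizations are illustrative and are not taken from the paper.

```python
import numpy as np
from scipy.stats import spearmanr

def critic_weights_spearman(X):
    """CRITIC criterion weights using Spearman rank correlation between criteria."""
    Z = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))  # min-max normalize (benefit criteria)
    sigma = Z.std(axis=0, ddof=1)                # contrast intensity of each criterion
    rho, _ = spearmanr(Z)                        # (n_criteria x n_criteria) rank-correlation matrix
    info = sigma * (1.0 - rho).sum(axis=0)       # information carried by each criterion
    return info / info.sum()

def topsis_scores(X, w):
    """TOPSIS relative closeness to the ideal solution; higher is better."""
    R = X / np.linalg.norm(X, axis=0)            # vector normalization
    V = R * w                                    # weighted normalized decision matrix
    best, worst = V.max(axis=0), V.min(axis=0)   # ideal and anti-ideal solutions
    d_best = np.linalg.norm(V - best, axis=1)
    d_worst = np.linalg.norm(V - worst, axis=1)
    return d_worst / (d_best + d_worst)

# Hypothetical scores for four algorithms on the eight dimensions
# (D, S, A, L, C, P, V, I); values are placeholders, not measured results.
X = np.array([
    [0.82, 0.75, 0.70, 0.88, 0.79, 0.91, 0.64, 0.73],   # e.g. an MV-STGCN-style model
    [0.78, 0.80, 0.66, 0.81, 0.74, 0.85, 0.69, 0.70],
    [0.71, 0.72, 0.74, 0.77, 0.83, 0.80, 0.58, 0.66],
    [0.69, 0.68, 0.61, 0.84, 0.71, 0.78, 0.62, 0.71],
])
w = critic_weights_spearman(X)
print(topsis_scores(X, w))   # closeness coefficients used to rank the algorithms
```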
The complete study is accessible via DOI: 10.34133/research.0903