In recent years, autonomous driving has garnered escalating attention for its potential to relieve drivers' burdens and improve driving safety. Vision-based 3D occupancy prediction, which infers the spatial occupancy status and semantics of 3D voxel grids around the autonomous vehicle from image inputs, is an emerging perception task well suited to cost-effective perception systems for autonomous driving. Although numerous studies have demonstrated the advantages of 3D occupancy prediction over object-centric perception tasks, a dedicated review of this rapidly developing field is still lacking.
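As a rough illustration only (not taken from the review), the task can be thought of as mapping multi-view camera images to a semantic voxel grid around the ego vehicle. The sketch below is a minimal stub under assumed shapes and a toy class list; the function name, grid resolution, and labels are hypothetical.

```python
import numpy as np

# Hypothetical illustration of the input/output structure of vision-based
# 3D occupancy prediction; all shapes and class names are illustrative only.
NUM_CAMERAS = 6                        # e.g., a surround-view camera rig
IMG_H, IMG_W = 900, 1600               # per-camera image resolution
GRID_X, GRID_Y, GRID_Z = 200, 200, 16  # voxel grid around the ego vehicle
CLASSES = ["empty", "road", "car", "pedestrian", "vegetation"]  # toy label set

def predict_occupancy(images: np.ndarray) -> np.ndarray:
    """Map multi-view images (N, H, W, 3) to a semantic voxel grid
    (X, Y, Z) whose integer entries index into CLASSES.

    A real model would replace this stub with a learned network
    (image backbone + 2D-to-3D lifting + voxel decoder).
    """
    assert images.shape == (NUM_CAMERAS, IMG_H, IMG_W, 3)
    # Placeholder: mark every voxel as "empty".
    return np.zeros((GRID_X, GRID_Y, GRID_Z), dtype=np.int64)

# Usage: feed one frame of surround-view images, get a voxel-wise label map.
frame = np.zeros((NUM_CAMERAS, IMG_H, IMG_W, 3), dtype=np.uint8)
occupancy = predict_occupancy(frame)
print(occupancy.shape)  # (200, 200, 16)
```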
To address this gap, a research team led by Di Huang published a new review on 15 January 2026 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
Existing reviews of occupancy prediction lack comprehensive coverage of the latest work and do not provide a proper categorization and organization of existing methods. This paper broadly covers recent advances in vision-based 3D occupancy prediction and systematically summarizes current methods into fine-grained categories from novel perspectives.
This paper is the first comprehensive review tailored to vision-based 3D occupancy prediction methods for autonomous driving. It structurally summarizes these methods from three perspectives: feature-enhanced, computation-friendly, and label-efficient approaches. It also proposes inspiring future outlooks for vision-based 3D occupancy prediction and provides a regularly updated GitHub repository that collects related papers, datasets, and code.
DOI: 10.1007/s11704-024-40443-5