A new method for automated concrete bridge damage detection using an efficient Vision Transformer-enhanced anchor-free YOLO (You Only Look Once) has been proposed by researchers from the University of Auckland, New Zealand, and Chongqing Jiaotong University, China. The study, published in Engineering, aims to address the challenges faced by existing deep learning techniques in detecting bridge damage captured by unmanned aerial vehicles (UAVs).
Concrete bridges are susceptible to deterioration due to environmental conditions, natural hazards, increasing traffic, and aging. Periodic inspections are crucial for assessing bridge conditions and providing early warnings of defects that may impact safety. However, traditional visual inspections are time-consuming, subjective, and error-prone. UAVs equipped with high-definition cameras have been increasingly used for bridge inspections, offering significant cost savings compared to manual inspections. Integrating UAVs with computer vision-based damage detection algorithms can further improve inspection efficiency.
Existing deep learning-based damage detection methods face several challenges. Defect scale variance, motion blur, and strong illumination significantly affect the accuracy and reliability of damage detectors. Anchor-based damage detectors struggle to generalize to real-world scenarios, and convolutional neural networks (CNNs) lack the capability to model long-range dependencies across the entire image. To address these issues, the researchers developed an efficient Vision Transformer-enhanced anchor-free YOLO method.
The researchers established a concrete bridge damage dataset, augmented with motion blur and varying brightness to better adapt to real-world conditions. They applied four key enhancements to the YOLOv5l algorithm: four detection heads to alleviate multi-scale damage detection issues, decoupled heads to address the conflict between classification and bounding box regression tasks, an anchor-free mechanism to reduce computational complexity and improve generalization, and a novel Vision Transformer (ViT) block, C3MaxViT, to enable CNNs to model long-range dependencies.
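As a rough illustration of the photometric augmentation described above, the sketch below applies a motion-blur kernel and a brightness scaling to an image using OpenCV and NumPy. The kernel sizes, brightness factors, and file name are illustrative assumptions, not the authors' published settings:

```python
import cv2
import numpy as np

def motion_blur(image: np.ndarray, kernel_size: int = 9) -> np.ndarray:
    """Simulate UAV camera shake with a horizontal motion-blur kernel."""
    kernel = np.zeros((kernel_size, kernel_size), dtype=np.float32)
    kernel[kernel_size // 2, :] = 1.0 / kernel_size  # average along one row
    return cv2.filter2D(image, -1, kernel)

def vary_brightness(image: np.ndarray, factor: float = 1.3) -> np.ndarray:
    """Simulate strong or weak illumination by scaling pixel intensities."""
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)

# Hypothetical usage: generate blurred and darkened/brightened variants
# of one inspection image ("bridge_crack.jpg" is a placeholder name).
original = cv2.imread("bridge_crack.jpg")
augmented = [motion_blur(original, k) for k in (5, 9, 15)]
augmented += [vary_brightness(original, f) for f in (0.6, 1.4)]
```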
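The paper's exact head design is not reproduced here, but a decoupled, anchor-free detection head in the style popularized by YOLOX conveys the idea behind two of the enhancements: classification and box regression run through separate branches, and each feature-map location predicts a box directly rather than refining preset anchors. The PyTorch module below is a hypothetical sketch; the class name, channel widths, and class count are assumptions:

```python
import torch
import torch.nn as nn

class DecoupledAnchorFreeHead(nn.Module):
    """One detection head: separate classification and regression branches."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.stem = nn.Conv2d(in_channels, in_channels, kernel_size=1)
        # Classification branch: per-location damage class scores.
        self.cls_branch = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_channels, num_classes, kernel_size=1),
        )
        # Regression branch: anchor-free box geometry plus objectness.
        self.reg_branch = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1), nn.SiLU(),
        )
        self.box_pred = nn.Conv2d(in_channels, 4, kernel_size=1)  # l, t, r, b
        self.obj_pred = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor):
        x = self.stem(x)
        reg = self.reg_branch(x)
        return self.cls_branch(x), self.box_pred(reg), self.obj_pred(reg)

# In a four-head detector, one such head would attach to each of the four
# feature-map scales from the backbone and neck (channel widths illustrative).
heads = nn.ModuleList(DecoupledAnchorFreeHead(c, num_classes=4)
                      for c in (128, 256, 512, 1024))
```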
The proposed method was compared against state-of-the-art damage detection methods. Experimental results demonstrated an increase of 8.1% in mean average precision at an intersection-over-union threshold of 0.5 (mAP50) and an improvement of 8.4% in mAP@[0.5:0.05:0.95]. Ablation studies revealed that the four detection heads, the decoupled head design, the anchor-free mechanism, and C3MaxViT contributed improvements of 2.4%, 1.2%, 2.6%, and 1.9% in mAP50, respectively.
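For readers unfamiliar with these metrics: a predicted box counts as correct under mAP50 when its intersection over union (IoU) with a ground-truth box is at least 0.5, while mAP@[0.5:0.05:0.95] averages precision over ten IoU thresholds from 0.5 to 0.95. A minimal IoU computation, for illustration only:

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Example: a prediction half-overlapping a ground-truth box.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 50 / (100 + 100 - 50) ≈ 0.333
```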
The study’s main contributions include the creation of a multi-scale concrete bridge damage dataset augmented with motion blur and varying brightness levels, the development of a Vision Transformer-enhanced anchor-free YOLO method based on the YOLOv5l algorithm, and the proposal of a novel C3MaxViT block to model long-range dependencies. The proposed method shows promise in improving the accuracy and efficiency of automated concrete bridge damage detection, particularly in challenging real-world conditions. Future research could focus on establishing larger datasets, exploring dataset augmentation with synthetic defects, and developing models capable of handling low-light and extreme exposure conditions.
The paper “Automated Concrete Bridge Damage Detection Using an Efficient Vision Transformer-Enhanced Anchor-Free YOLO” is authored by Xiaofei Yang, Enrique del Rey Castillo, Yang Zou, Liam Wotherspoon, Jianxi Yang, and Hao Li. The full text of the open-access paper is available at https://doi.org/10.1016/j.eng.2025.02.018. For more information about Engineering, visit https://www.sciencedirect.com/journal/engineering.