Combining improved YOLOX and improved SECOND for road vehicle fusion detection

LIU Kai; LUO Suyun; WEI Dan

doi:10.12299/jsues.24-0068

Volume 39 Issue 2

Jun. 2025

Citation: LIU Kai, LUO Suyun, WEI Dan. Combining improved YOLOX and improved SECOND for road vehicle fusion detection[J]. Journal of Shanghai University of Engineering Science, 2025, 39(2): 148-156, 180. doi: 10.12299/jsues.24-0068

School of Mechanical and Automotive Engineering, Shanghai University of Engineering Science, Shanghai 201620, China

Received Date: 2024-03-13
Available Online: 2025-09-30
Publish Date: 2025-06-30

Abstract

Common detection algorithms for road vehicles often suffer from missed detections, false positives, and large deviations in the predicted orientation angle. To address these problems, a fusion detection algorithm combining an improved YOLOX and an improved SECOND was designed. Using images and point clouds as inputs, two sub-networks were employed for vehicle detection. For image detection, a convolutional block attention module (CBAM), focal loss, and the efficient intersection over union (EIoU) loss function were used to improve the detection performance of the existing YOLOX. For point cloud detection, a residual sparse convolutional middle layer was designed to strengthen the feature expression and contextual association of the SECOND algorithm, effectively reducing the missed detection rate for vehicles. The predicted orientation angle was optimized by constructing a multi-bins strategy. Experiments conducted on the KITTI dataset show that the algorithm surpasses the original algorithm, improving 3D average precision by 1.00%, 1.38%, and 2.66% for easy, moderate, and hard targets, respectively. The accuracy of the detected target rotation angle is also greatly improved.

Keywords:
- vehicle detection
- residual sparse convolutional middle layer
- multi-bins
- fusion detection
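
The abstract does not say how the multi-bins strategy is constructed. As a rough illustration of the general idea behind multi-bin orientation estimation, the sketch below splits the angle range into equal bins, so that a network could classify the bin and regress only a small residual from the bin center. The function names, the choice of four bins, and the [0, 2π) angle range are assumptions made for illustration; they are not taken from the paper.

```python
import math

TWO_PI = 2.0 * math.pi


def encode_angle(theta, num_bins=4):
    """Map an angle in radians to (bin index, residual from that bin's center).

    Hypothetical helper: the bin layout here is an assumption, not the
    authors' configuration.
    """
    theta = theta % TWO_PI                      # wrap into [0, 2*pi)
    bin_width = TWO_PI / num_bins
    # Guard against floating-point rounding at the upper edge of the range.
    bin_idx = min(int(theta // bin_width), num_bins - 1)
    center = (bin_idx + 0.5) * bin_width
    return bin_idx, theta - center              # residual lies within half a bin


def decode_angle(bin_idx, residual, num_bins=4):
    """Recover the angle in [0, 2*pi) from a bin index and its residual."""
    bin_width = TWO_PI / num_bins
    return ((bin_idx + 0.5) * bin_width + residual) % TWO_PI


# Round trip for a heading slightly past 90 degrees.
bin_idx, residual = encode_angle(math.pi / 2 + 0.1)
restored = decode_angle(bin_idx, residual)
```

In such a scheme the regression target is bounded by half a bin width, which is one common motivation for bin-based angle encodings: the network no longer has to regress across the discontinuity where the angle wraps around.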
