Instance segmentation method based on improved Mask R-CNN for the stacked automobile parts

ZHU Xinlong; CUI Guohua; CHEN Saixuan; YANG Lin

doi:10.12299/jsues.21-0309

Volume 36 Issue 2

Jun. 2022

ZHU Xinlong, CUI Guohua, CHEN Saixuan, YANG Lin. Instance segmentation method based on improved Mask R-CNN for the stacked automobile parts[J]. Journal of Shanghai University of Engineering Science, 2022, 36(2): 168-175. doi: 10.12299/jsues.21-0309

School of Mechanical and Automotive Engineering, Shanghai University of Engineering Science, Shanghai 201620, China

Received Date: 2021-12-27
Available Online: 2022-11-16
Publish Date: 2022-06-30

Abstract

To address the slow speed, low accuracy, and poor robustness of recognition, detection, and segmentation of stacked automobile parts, a fast detection and instance segmentation method based on an improved Mask R-CNN algorithm is proposed. First, the feature extraction network of Mask R-CNN is optimized: the ResNet + Feature Pyramid Network (FPN) backbone is replaced with MobileNets + FPN, which reduces the number of network parameters, compresses the model size, and increases detection speed. Then, a Spatial Transformer Network (STN) module is added after the ROI Align structure of Mask R-CNN to maintain the detection accuracy of the model. Experimental results show that the model size is compressed, the detection speed is doubled, and the mean Average Precision (mAP) is improved. Tests on new samples not seen during training show that the improved model is faster, lighter, and more accurate than Mask R-CNN, and that it can quickly and accurately detect and segment stacked automobile parts, which verifies its practical feasibility.
Keywords:
- instance segmentation
- stacked
- MobileNets model
- spatial transformer networks (STN)
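
The parameter reduction that the abstract attributes to the MobileNets backbone comes from depthwise separable convolutions, which split a standard convolution into a per-channel (depthwise) filter and a 1×1 (pointwise) channel mixer. The sketch below is only an illustration of that arithmetic, not the authors' implementation: the function names are hypothetical, the 3×3 layer with 256 input and 256 output channels is an assumed example rather than a configuration from the paper, and bias terms are ignored.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard convolution: one k x k x in_channels filter per output channel."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise separable convolution (biases ignored)."""
    # Depthwise step: one k x k filter for each input channel.
    depthwise = kernel_size * kernel_size * in_channels
    # Pointwise step: a 1 x 1 convolution that mixes channels.
    pointwise = in_channels * out_channels
    return depthwise + pointwise


def reduction_ratio(kernel_size, in_channels, out_channels):
    """Separable / standard weight count; equals 1/out_channels + 1/kernel_size**2."""
    return (depthwise_separable_params(kernel_size, in_channels, out_channels)
            / standard_conv_params(kernel_size, in_channels, out_channels))


if __name__ == "__main__":
    # Assumed example layer (not taken from the paper): 3x3 kernel, 256 -> 256 channels.
    k, m, n = 3, 256, 256
    print("standard conv weights: ", standard_conv_params(k, m, n))
    print("separable conv weights:", depthwise_separable_params(k, m, n))
    print("ratio: %.4f" % reduction_ratio(k, m, n))
```

For this assumed layer the separable form needs roughly one ninth of the weights, which is consistent in spirit with the smaller and faster model reported in the abstract. The actual savings for the full network depend on the architecture used in the paper.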
