Journal of Software, 2021, 32(4): 1201-1227

Deep Learning for Multi-scale Object Detection: A Survey
CHEN Ke-Qi, ZHU Zhi-Liang, DENG Xiao-Ming, MA Cui-Xia, WANG Hong-An
(School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China; Beijing Key Laboratory of Human-computer Interaction (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China; School of Software, East China Jiaotong University, Nanchang 330013, China)
Received:August 10, 2020    Revised:September 20, 2020
Abstract: Object detection has long been a central research topic in computer vision; its task is to return the class and bounding-box coordinates of one or more specific objects in a given image. With the rapid progress of neural network research, the R-CNN detector marked the entry of object detection into the deep learning era, bringing large gains in both speed and accuracy over traditional algorithms. However, the scale problem remains a persistent challenge even for deep learning methods: detection accuracy drops significantly for objects that are extremely large or extremely small. In recent years, many researchers have therefore studied how to achieve better multi-scale object detection. Although a series of surveys have summarized and analyzed deep-learning-based object detectors in terms of algorithm pipeline, network structure, training, and datasets, very few concentrate on multi-scale object detection. This paper therefore first reviews the foundations of the two main streams of deep-learning-based detectors: two-stage detectors represented by the R-CNN family and one-stage detectors represented by YOLO and SSD. It then focuses on typical strategies for multi-scale object detection, including image pyramids and in-network feature pyramids. Finally, it summarizes the current state of multi-scale object detection and outlines future research directions.
CLC number: TP393
Foundation items: National Key Research and Development Program of China (2016YFB1001200); National Natural Science Foundation of China (61872346)
| Author Name | Affiliation | E-mail |
| --- | --- | --- |
| CHEN Ke-Qi | School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China; Beijing Key Laboratory of Human-computer Interaction (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China | |
| ZHU Zhi-Liang | State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China; Beijing Key Laboratory of Human-computer Interaction (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China; School of Software, East China Jiaotong University, Nanchang 330013, China | |
| DENG Xiao-Ming | State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China; Beijing Key Laboratory of Human-computer Interaction (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China | |
| MA Cui-Xia | School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China; Beijing Key Laboratory of Human-computer Interaction (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China | cuixia@iscas.ac.cn |
| WANG Hong-An | School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190, China; State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China; Beijing Key Laboratory of Human-computer Interaction (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China | |
Reference text:

CHEN Ke-Qi, ZHU Zhi-Liang, DENG Xiao-Ming, MA Cui-Xia, WANG Hong-An. Deep Learning for Multi-scale Object Detection: A Survey. Journal of Software, 2021, 32(4): 1201-1227