Journal of Software:2019.30(11):3567-3577

Sketch-based Image Retrieval Using Cross-domain Modeling and Deep Fusion Network
YU Deng, LIU Yu-Jie, XING Min-Min, LI Zong-Min, LI Hua
(College of Computer and Communication Engineering, China University of Petroleum (East China), Qingdao 266580, China; Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China)
Received: June 01, 2017    Revised: September 18, 2017
Abstract: This paper introduces a new approach to free-hand sketch representation for sketch-based image retrieval (SBIR), in which sketches serve as queries for searching a dataset of natural photos. The task is widely regarded as challenging for three main reasons: (1) Compared with natural photos, sketches are highly abstract, so descriptors extracted with existing methods capture less context and fit sketch content poorly. (2) Different people draw widely different sketches of the same object, which makes sketch-photo matching harder. (3) Mapping sketches and photos into a common visual domain is itself difficult. This study addresses the cross-domain problem by mapping sketches and natural photos into the same visual domain at multiple layers. For the first time, a multi-layer deep fusion convolutional neural network (CNN) framework is introduced to learn multi-layer representations of free-hand sketches and natural color photos. In retrieval experiments on the Flickr15k benchmark, the learned representation significantly outperforms both hand-crafted features and deep CNN features trained on sketches or photos.
CLC number: TP391
Foundation items: National Natural Science Foundation of China (61379106, 61379082, 61227802); Natural Science Foundation of Shandong Province (ZR2013FM036, ZR2015FM011)
Reference text:

YU Deng, LIU Yu-Jie, XING Min-Min, LI Zong-Min, LI Hua. Sketch-based Image Retrieval Using Cross-domain Modeling and Deep Fusion Network. Journal of Software, 2019, 30(11): 3567-3577