Journal of Software:2020.31(7):1959-1968

Posture Prior Driven Double-branch Network Model for Accurate Human Parsing
GAO Ming-Da, SUN Yu-Bao, LIU Qing-Shan, SHAO Xiao-Wen
(Jiangsu Key Laboratory of Big Data Analysis Technology (School of Automation, Nanjing University of Information Science and Technology), Nanjing 210044, China; Jiangsu Province Atmospheric Environment and Equipment Technology Collaborative Innovation Center (School of Automation, Nanjing University of Information Science and Technology), Nanjing 210044, China)
Received: April 30, 2019    Revised: July 11, 2019
Abstract: Human parsing aims to segment a human image into multiple part regions with fine-grained semantics, yielding a semantic understanding of the image. However, because human poses are complex, existing human parsing algorithms tend to misclassify limb parts and to segment small target regions imprecisely. To address these problems, a double-branch network model that incorporates human pose estimation information is proposed for accurate human parsing. The model first uses a backbone network to represent the human image and takes the pose prior predicted by a human pose estimation model as attention information for the backbone, forming a multi-scale feature representation driven by the human body structure prior. The extracted features are then fed separately into a fully convolutional network parsing branch and a detection parsing branch. The fully convolutional network parsing branch produces the global segmentation result, while the detection parsing branch focuses on detecting and segmenting small-scale targets; fusing the predictions of the two branches gives a more accurate final parsing result. Experimental results verify the effectiveness of the proposed algorithm: on the mainstream human parsing datasets LIP and ATR, the method achieves mIoU scores of 52.19% and 68.29%, respectively, improving parsing accuracy and producing more accurate segmentation of human limb parts and small target parts.
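The mIoU figures quoted in the abstract refer to mean intersection-over-union, the usual metric for semantic segmentation and human parsing. The abstract does not describe the evaluation code, so the following is only a minimal sketch under the standard definition (per-class intersection divided by union, averaged over classes). The function name `mean_iou`, the choice to skip classes that appear in neither map, and the toy 2x3 label maps are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence


def mean_iou(pred: Sequence[Sequence[int]],
             target: Sequence[Sequence[int]],
             num_classes: int) -> float:
    """Mean IoU over the classes that occur in either label map.

    Assumption: classes absent from both maps are skipped rather than
    counted as 0 or 1; benchmark toolkits may handle this differently.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Correct pixel: counts once in both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Wrong pixel: enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy label maps (hypothetical): 0 = background, 1 and 2 = two body parts.
target = [[0, 0, 1],
          [0, 2, 2]]
pred = [[0, 1, 1],
        [0, 2, 0]]
print(mean_iou(pred, target, num_classes=3))  # 0.5
```

In this toy case each of the three classes has an IoU of 0.5, so the mean is 0.5. Dataset-level scores such as those reported for LIP and ATR are normally computed by accumulating the per-class counts over all test images before dividing, rather than by averaging per-image values.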
CLC number: TP391
Foundation items: National Natural Science Foundation of China (61825601, 61532009, 61672292); Jiangsu Provincial Project (BRA2019077, DZXX-037)
Author Name | Affiliation | E-mail
GAO Ming-Da | Jiangsu Key Laboratory of Big Data Analysis Technology (School of Automation, Nanjing University of Information Science and Technology), Nanjing 210044, China; Jiangsu Province Atmospheric Environment and Equipment Technology Collaborative Innovation Center (School of Automation, Nanjing University of Information Science and Technology), Nanjing 210044, China |
SUN Yu-Bao | Jiangsu Key Laboratory of Big Data Analysis Technology (School of Automation, Nanjing University of Information Science and Technology), Nanjing 210044, China; Jiangsu Province Atmospheric Environment and Equipment Technology Collaborative Innovation Center (School of Automation, Nanjing University of Information Science and Technology), Nanjing 210044, China | sunyb@nuist.edu.cn
LIU Qing-Shan | Jiangsu Key Laboratory of Big Data Analysis Technology (School of Automation, Nanjing University of Information Science and Technology), Nanjing 210044, China; Jiangsu Province Atmospheric Environment and Equipment Technology Collaborative Innovation Center (School of Automation, Nanjing University of Information Science and Technology), Nanjing 210044, China |
SHAO Xiao-Wen | Jiangsu Key Laboratory of Big Data Analysis Technology (School of Automation, Nanjing University of Information Science and Technology), Nanjing 210044, China; Jiangsu Province Atmospheric Environment and Equipment Technology Collaborative Innovation Center (School of Automation, Nanjing University of Information Science and Technology), Nanjing 210044, China |
Reference text:

GAO Ming-Da, SUN Yu-Bao, LIU Qing-Shan, SHAO Xiao-Wen. Posture Prior Driven Double-branch Network Model for Accurate Human Parsing. Journal of Software, 2020, 31(7): 1959-1968