Journal of Software:2017.28(2):234-245

Layout Analysis Algorithm of Questionnaire Image
DUAN Lu,SONG Yong-Hong,ZHANG Yuan-Lin
(School of Software, Xi'an Jiaotong University, Xi'an 710049, China)
Received:October 28, 2015    Revised:December 22, 2015
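Chinese abstract (translated): Existing layout analysis algorithms for questionnaire images cannot automatically identify information-filling areas and cannot handle questionnaire images without a fixed format. To address these problems, a questionnaire image layout analysis algorithm that combines connected regions with a neural network is proposed. First, the central valid image of the scanned questionnaire image is obtained. Next, a fast skew correction method designed for questionnaire images is proposed and applied to the central valid image. Horizontal projection is then used for line segmentation to obtain the questionnaire rows. After that, the first connected region of each questionnaire row is extracted to determine whether a table region, i.e. a table questionnaire row, exists. If table questionnaire rows exist, the distribution of the table regions is analyzed and the table type is determined to obtain the candidate answer areas; otherwise, the text questionnaire rows are analyzed directly to obtain the candidate answer areas. Finally, a neural network classifies the candidate areas by type to obtain the final answer-filling areas. Experimental results on questionnaire images show that the algorithm accurately identifies the information-filling areas in a variety of questionnaire images.
Chinese keywords (translated): line segmentation  connected region  table processing  neural network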
Abstract: Existing layout analysis algorithms for questionnaire images cannot automatically identify the information-filling areas and cannot handle questionnaires without a fixed format. To address these problems, a new approach to questionnaire layout analysis based on connected regions and neural networks is proposed. First, the central valid image is obtained by preprocessing the scanned image. Then, a fast skew correction algorithm for questionnaire images is applied to it. Next, the questionnaire rows are obtained by horizontal projection profile segmentation. After that, the first connected region of each row is extracted to determine whether the row contains a table region. Table rows and ordinary text rows are then analyzed separately to generate the candidate answer areas. Finally, a neural network determines the type of each candidate area, yielding the final answer-filling areas. Experiments show that the proposed algorithm accurately identifies the information-filling areas in a variety of questionnaire images.
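The row-segmentation step named in the abstract can be illustrated with a minimal sketch of horizontal projection profile segmentation. The abstract does not give the authors' implementation, so everything here is an assumption made for illustration: the function name `segment_rows`, the `min_ink` threshold, and the representation of the binarized image as a list of pixel rows (1 = ink, 0 = background).

```python
def segment_rows(binary, min_ink=1):
    """Split a binary image into (top, bottom) row bands.

    `binary` is a list of pixel rows (1 = ink, 0 = background).
    `min_ink` is a hypothetical threshold: a pixel row belongs to a
    band when it holds at least that many ink pixels.
    """
    # Horizontal projection: count the ink pixels in each pixel row.
    profile = [sum(row) for row in binary]

    bands, start = [], None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                       # a band begins here
        elif ink < min_ink and start is not None:
            bands.append((start, y - 1))    # band ended on the previous row
            start = None
    if start is not None:                   # band touches the bottom edge
        bands.append((start, len(profile) - 1))
    return bands


# Toy 6x5 "page" with two text bands separated by blank pixel rows.
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
]
print(segment_rows(page))  # [(1, 2), (5, 5)]
```

Each returned pair gives the inclusive top and bottom pixel rows of one band. On a real scan the image would need to be deskewed first, as the abstract describes, because a tilted page smears the gaps between rows in the projection profile.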
Foundation items:National Natural Science Foundation of China (61238303)
Reference text:
DUAN Lu,SONG Yong-Hong,ZHANG Yuan-Lin. Layout Analysis Algorithm of Questionnaire Image. Journal of Software, 2017, 28(2):234-245