### Journal of Software, 2014, 25(10):2266-2281

Research on Abnormal Chinese Character Recognition
WANG Kai, LI Cheng-Xue, WANG Qing-Ren, ZHAO Hong, ZHANG Jian
(College of Computer and Control Engineering, Nankai University, Tianjin 300071, China)
Received: June 22, 2013    Revised: September 9, 2013
Abstract: Recognizing text in complex images is an important research direction in content-based image retrieval and has been well studied in past decades. Methods designed for normal characters, however, become inapplicable when the characters in an image suffer from abnormalities such as skew, uneven illumination, noise, and anti-aliasing (edge softening). This paper proposes an effective method for recognizing abnormal Chinese characters, named SC-HOG. First, sparse coding is applied to the abnormal character image to obtain basis vectors and sparse coefficients, and the image is reconstructed from them to filter out noise and reduce anti-aliasing. Second, histogram of oriented gradients (HOG) features are extracted from the restored image to capture the edge gradients of the character, which weakens the influence of skew and uneven illumination. Finally, the resulting feature vector is fed into a trained classifier to recognize the character. Experiments on both synthetic and real data sets verify the effectiveness of SC-HOG: on synthetic data, the method is robust to skew, uneven illumination, noise, and anti-aliasing; on real data, it also achieves good results on sample sets of born-digital images and scene images.
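The three stages named in the abstract (sparse-coding reconstruction, HOG extraction, classification) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not specify the dictionary, the sparse solver, the HOG parameters, or the classifier. Here a fixed 2-D DCT basis stands in for a learned dictionary, ISTA stands in for the sparse solver, the HOG is simplified (8 centered orientation bins, one global L2 normalization, no block normalization), and a nearest-template rule stands in for the trained classifier. All function names (`dct_dictionary`, `sparse_code`, `reconstruct`, `hog`, `sc_hog`, `classify`) and the synthetic bar "glyphs" are hypothetical.

```python
import numpy as np

def dct_dictionary(n=8):
    """Orthonormal 2-D DCT basis for n x n patches (assumed stand-in for a learned dictionary)."""
    k = np.arange(n)
    c = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))  # c[u, x]
    c[0] *= np.sqrt(1.0 / n)
    c[1:] *= np.sqrt(2.0 / n)
    return np.kron(c, c).T  # columns are atoms, shape (n*n, n*n)

def sparse_code(X, D, lam=0.2, n_iter=50):
    """ISTA for min_A 0.5*||X - A D^T||^2 + lam*||A||_1, one row of A per patch."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth term
    A = np.zeros((X.shape[0], D.shape[1]))
    for _ in range(n_iter):
        Z = A - (A @ D.T - X) @ D / L                           # gradient step
        A = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)   # soft threshold
    return A

def reconstruct(img, D, lam=0.2, n=8):
    """Rebuild the image from sparse codes of its non-overlapping n x n patches."""
    h, w = img.shape  # assumed to be multiples of n
    P = img.reshape(h // n, n, w // n, n).transpose(0, 2, 1, 3).reshape(-1, n * n)
    R = sparse_code(P, D, lam) @ D.T
    return R.reshape(h // n, w // n, n, n).transpose(0, 2, 1, 3).reshape(h, w)

def hog(img, cell=8, bins=8):
    """Simplified HOG: per-cell unsigned-orientation histograms, global L2 norm."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    width = 180.0 / bins
    ang = np.rad2deg(np.arctan2(gy, gx))
    # Bins are centered on 0, 22.5, ... degrees so axis-aligned edges sit mid-bin.
    idx = np.minimum((((ang + width / 2) % 180.0) / width).astype(int), bins - 1)
    h, w = img.shape
    feats = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            feats.append(np.bincount(idx[i:i + cell, j:j + cell].ravel(),
                                     weights=mag[i:i + cell, j:j + cell].ravel(),
                                     minlength=bins))
    f = np.concatenate(feats)
    return f / (np.linalg.norm(f) + 1e-8)

def sc_hog(img, D):
    """Sparse-coding reconstruction followed by HOG extraction."""
    return hog(reconstruct(img, D))

def classify(img, D, templates):
    """Nearest-template rule (assumed stand-in for the trained classifier)."""
    f = sc_hog(img, D)
    return min(templates, key=lambda label: np.linalg.norm(f - templates[label]))

# Toy data: two synthetic 32 x 32 "glyphs" (bars), not real Chinese characters.
D = dct_dictionary(8)
vertical = np.zeros((32, 32)); vertical[4:28, 14:18] = 1.0
horizontal = np.zeros((32, 32)); horizontal[14:18, 4:28] = 1.0
templates = {"vertical": sc_hog(vertical, D), "horizontal": sc_hog(horizontal, D)}

# Degrade one glyph: lower contrast, brightness offset, additive Gaussian noise.
rng = np.random.default_rng(0)
noisy = 0.6 * vertical + 0.2 + rng.normal(0.0, 0.1, vertical.shape)
print(classify(noisy, D, templates))
```

The soft threshold `lam` controls how aggressively small coefficients (mostly noise) are discarded before reconstruction. Gradients cancel a constant brightness offset and the L2 normalization cancels a global contrast change, which is the sense in which the HOG stage reduces sensitivity to illumination in this sketch.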
Foundation items: National Natural Science Foundation of China (61201424); Natural Science Foundation of Tianjin (12JCYBJC10100); Fundamental Research Funds for the Central Universities (65012131)
Cite this article:

WANG Kai, LI Cheng-Xue, WANG Qing-Ren, ZHAO Hong, ZHANG Jian. Research on Abnormal Chinese Character Recognition. Journal of Software, 2014, 25(10):2266-2281