[1] WANG Tao, JIANG Jiahe. Text recognition in any direction based on semantic segmentation[J]. Applied Science and Technology, 2018, 45(03):55-60. [doi:10.11991/yykj.201705006]

Text recognition in any direction based on semantic segmentation

Applied Science and Technology [ISSN:1009-671X/CN:23-1191/U]

Volume:
45
Issue:
2018, No. 03
Pages:
55-60
Column:
Automation technology
Publication date:
2018-05-05

Article Info

Title:
Text recognition in any direction based on semantic segmentation
Author(s):
WANG Tao, JIANG Jiahe
School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
Keywords:
natural image; text in any direction; text detection; semantic segmentation; FCN; rectangular convolution kernel; pooled fusion; CRF
CLC number:
TP391
DOI:
10.11991/yykj.201705006
Document code:
A
Abstract:
Existing text detection and localization methods can handle only text lines in a single direction. To address this limitation, a new method for detecting text in natural images based on semantic segmentation is proposed. First, the limitations of existing detection methods and of current semantic segmentation methods in text line detection are analyzed. Then a fully convolutional network (FCN) model augmented with rectangular convolution kernels is trained to obtain a classification map of text line regions. Finally, the high-precision segmentation ability of a fully connected conditional random field (CRF) is used to separate the characters within the text line regions output by the network front end. The framework can process text in any direction, language, and font. The proposed method achieves good segmentation results and strong performance on two text detection datasets, MSRA-TD500 and ICDAR2015.
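The role of a rectangular convolution kernel can be illustrated with a small, dependency-free sketch. It is not the authors' network: the kernel shapes (3x3 and 1x9), the fixed averaging weights, and the toy image are all assumptions chosen for illustration, whereas the paper's FCN learns its weights and is followed by a CRF stage. The sketch only shows that, with the same number of weights, an elongated kernel aligned with a text line covers more of the line in a single response than a square kernel does.

```python
# Illustrative sketch only: kernel shapes, weights, and the toy image are
# assumptions, not values taken from the paper.

def conv2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def mean_kernel(height, width):
    """Averaging kernel of the given (possibly non-square) shape."""
    weight = 1.0 / (height * width)
    return [[weight] * width for _ in range(height)]


# Toy 7x9 "image": one horizontal, text-line-like bar on row 3, columns 1..7.
image = [[0.0] * 9 for _ in range(7)]
for col in range(1, 8):
    image[3][col] = 1.0

square = conv2d_valid(image, mean_kernel(3, 3))  # square kernel, 9 weights
wide = conv2d_valid(image, mean_kernel(1, 9))    # rectangular kernel, 9 weights

# Strongest response of each kernel anywhere in the image.
peak_square = max(max(row) for row in square)
peak_wide = max(max(row) for row in wide)

print("peak response, 3x3 kernel:", round(peak_square, 3))
print("peak response, 1x9 kernel:", round(peak_wide, 3))
```

In this toy case the 1x9 kernel responds more strongly to the horizontal bar than the 3x3 kernel. Text at other orientations would call for differently shaped or oriented kernels, which is presumably why the method combines them inside a trained network.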

Memo

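Received: 2017-05-09.
Funding: National Natural Science Foundation of China (61273141)
About the authors: WANG Tao (born 1991), male, master's student; JIANG Jiahe (born 1965), male, associate professor
Corresponding author: WANG Tao, E-mail: zhwang.tao@163.com
Last Update: 2018-06-14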