Acta Scientiarum Naturalium Universitatis Pekinensis


An Unsupervised Method for Chinese Speech Text Localization in Comic Images

LIU Dong, LI Luyuan, WANG Yongtao, TANG Zhi   

  1. Institute of Computer Science and Technology, Peking University, Beijing 100080;
  • Received: 2013-06-21  Online: 2014-01-20  Published: 2014-01-20




Abstract: To meet the growing demand for reading Chinese comics on mobile devices, the authors propose an unsupervised method for localizing Chinese speech text in comic images, in contrast to existing learning-based methods. The method consists of three major stages: 1) speech balloons (the white regions that surround the text characters) are detected using the connectivity of the white region within each balloon, and the characters inside each balloon are localized; 2) the detected characters are clustered into character strings (rows or columns of horizontally or vertically aligned characters) based on character shape and typesetting consistency, and their font features are extracted; 3) based on the extracted font features, the remaining character strings are detected with a Bayesian classifier. The proposed method is tested on a dataset of 900 comic images and achieves satisfactory results.
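The first stage relies on the white interior of a speech balloon forming one connected region. The following is only a minimal sketch of that idea, not the authors' implementation: it assumes the page has already been binarized into a grid of 1 (white) and 0 (dark), uses 4-connectivity, and treats dark pixels inside a white region's bounding box as candidate character strokes. The function names and the `min_area` threshold are hypothetical.

```python
from collections import deque


def white_regions(img, min_area=4):
    """Return 4-connected regions of white pixels (value 1) as sets of (row, col).

    min_area is a hypothetical size threshold, not a value from the paper.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited white pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            region = set()
            while queue:
                cy, cx = queue.popleft()
                region.add((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_area:
                regions.append(region)
    return regions


def bounding_box(region):
    """Return (top, left, bottom, right) of a region, inclusive."""
    ys = [p[0] for p in region]
    xs = [p[1] for p in region]
    return min(ys), min(xs), max(ys), max(xs)


def enclosed_dark_pixels(img, region):
    """Dark pixels inside the region's bounding box: candidate character strokes."""
    top, left, bottom, right = bounding_box(region)
    return [(y, x)
            for y in range(top, bottom + 1)
            for x in range(left, right + 1)
            if img[y][x] == 0]


# Toy page: a white "balloon" surrounded by dark pixels, with two dark strokes inside.
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
balloons = white_regions(page)
strokes = enclosed_dark_pixels(page, balloons[0])
```

On a real page, further checks would be needed to separate balloons from other white areas such as panel backgrounds; the abstract does not say which criteria the authors use.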

Key words: Chinese comic image, speech balloon detection, speech text localization, character clustering, Bayesian classifier

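Abstract (Chinese version): Based on the characteristics of Chinese comic images, an unsupervised method for automatically localizing speech text in Chinese comic images is proposed to meet the needs of reading Chinese comics on mobile devices. Unlike existing learning-based methods, the method requires no training set and is robust. It consists of three main steps: 1) speech balloons are detected using the connectivity of the blank region (the balloon) that surrounds the text in a comic image, and complete characters are detected within the balloons; 2) based on character shape and the consistency of typesetting rules, characters are clustered into character rows or columns, and font features are extracted; 3) the font features of multiple comic pages are combined, and a Bayesian classifier is used to detect the remaining characters across those pages. Experiments on a dataset of 900 comic pages show that the method effectively localizes speech regions in Chinese comic images and achieves fairly satisfactory results.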
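The third stage classifies the remaining candidates from font features gathered in the second stage. The sketch below shows one way such a decision could look, using a Gaussian naive Bayes rule. It is an illustration under stated assumptions rather than the authors' classifier: the two features (width and height of a candidate), the existence of a separate non-text sample set, and the equal prior are all hypothetical, since the abstract does not specify them.

```python
import math


def fit_gaussian(samples):
    """Per-feature mean and variance of equal-length feature vectors."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    # Floor the variance so a constant feature does not cause division by zero.
    variances = [max(sum((s[d] - means[d]) ** 2 for s in samples) / n, 1e-6)
                 for d in range(dims)]
    return means, variances


def log_likelihood(x, model):
    """Log density of x under independent per-feature Gaussians."""
    means, variances = model
    return sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
               for xi, m, v in zip(x, means, variances))


def is_text(x, text_model, other_model, prior_text=0.5):
    """Bayes decision: True if the text class has the higher posterior."""
    log_text = math.log(prior_text) + log_likelihood(x, text_model)
    log_other = math.log(1 - prior_text) + log_likelihood(x, other_model)
    return log_text > log_other


# Hypothetical (width, height) features: characters already found in balloons
# versus components assumed to be non-text.
text_model = fit_gaussian([(20, 20), (21, 19), (19, 21), (20, 22)])
other_model = fit_gaussian([(5, 40), (60, 8), (3, 3), (80, 70)])

looks_like_text = is_text((20, 21), text_model, other_model)
looks_like_other = is_text((70, 10), text_model, other_model)
```

Because the text model is fitted from characters that the earlier stages found on the same pages, no externally labeled training set is involved, which matches the unsupervised framing of the method.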

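Key words (Chinese version): Chinese comic image, speech balloon detection, speech text localization, character clustering, Bayesian classification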
