A Model of Attention-based Image Recognition and Its Application in Face Detection

Acta Scientiarum Naturalium Universitatis Pekinensis

WANG Shuguang, CHENG Minde

School of Mathematical Sciences, Peking University, Beijing, 100871

Received: 1999-03-08 Online: 2000-05-20 Published: 2000-05-20

Abstract

Abstract: An attention-based image recognition model is proposed. The basic idea is that, when analyzing a complex scene or recognizing an object, a fixation control mechanism first locates the key feature regions (salient features) in the visual field and then moves the fixation point over these regions serially, in a certain order. At each fixation, the local pattern near the fixation point is memorized or matched. The memory of a complex object consists of two parts: the memory of the local patterns, that is, the "parts" that make up the object, and the memory of the spatial relations among those local patterns, that is, the structural relations among the parts. Correspondingly, recognition also consists of two parts: matching the local patterns and matching the structural relations among them. An object is recognized when enough local patterns are matched and their spatial relations are correct. The model was applied to face detection in complex backgrounds. The experimental results show that the model handles invariant recognition well, in that the recognition result does not depend on translation, rotation, or scale change of the target, and that it is robust and fast. The model is both cognitively meaningful and practical.

Key words: visual attention, visual model, face detection, invariant recognition
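The two-part recognition step described in the abstract (match local patterns, then check their spatial relations) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how spatial relations are compared, so the sketch assumes they are checked by testing whether the matched parts agree with a single similarity transform (translation, rotation, scale). A shared part label stands in for a successful local-pattern match, and `FACE_MODEL`, `transform`, `recognize`, `min_parts`, and `tol` are hypothetical names chosen for illustration.

```python
import cmath
import math
from itertools import combinations

# Hypothetical stored "memory" of a face: part label -> (x, y) in model units.
FACE_MODEL = {
    "left_eye": (-1.0, 0.0),
    "right_eye": (1.0, 0.0),
    "nose": (0.0, -1.0),
    "mouth": (0.0, -2.0),
}


def transform(parts, scale, angle_deg, shift):
    """Apply a similarity transform (scale, rotation, translation) to parts."""
    a = cmath.rect(scale, math.radians(angle_deg))
    b = complex(*shift)
    out = {}
    for label, (x, y) in parts.items():
        z = a * complex(x, y) + b
        out[label] = (z.real, z.imag)
    return out


def recognize(model, fixations, min_parts=3, tol=0.1):
    """Return True if enough matched parts share one similarity transform.

    model, fixations: dicts mapping a part label to an (x, y) position.
    A shared label stands in for a successful local-pattern match
    (an assumption of this sketch). tol is the allowed position error,
    expressed in model units.
    """
    shared = sorted(set(model) & set(fixations))
    m = {k: complex(*model[k]) for k in shared}
    o = {k: complex(*fixations[k]) for k in shared}
    best = 0
    for k0, k1 in combinations(shared, 2):
        if m[k0] == m[k1] or o[k0] == o[k1]:
            continue  # degenerate pair: no transform can be estimated
        # Solve o = a * m + b from the two reference parts.
        a = (o[k1] - o[k0]) / (m[k1] - m[k0])
        b = o[k0] - a * m[k0]
        # Count parts whose predicted position agrees with the observation.
        hits = sum(abs(a * m[k] + b - o[k]) <= tol * abs(a) for k in shared)
        best = max(best, hits)
    return best >= min_parts


if __name__ == "__main__":
    # The same face, scaled 2.5x, rotated 30 degrees, and shifted.
    seen = transform(FACE_MODEL, 2.5, 30.0, (40.0, 15.0))
    print(recognize(FACE_MODEL, seen))
```

Because the transform is estimated from the observed parts themselves, the decision does not depend on where the object sits in the image, how it is rotated, or how large it appears, which is the sense of invariance the abstract refers to. Requiring only `min_parts` consistent parts, rather than all of them, lets the sketch tolerate a missing or mismatched part.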

CLC number: TP391.41

WANG Shuguang, CHENG Minde. A Model of Attention-based Image Recognition and Its Application in Face Detection[J]. Acta Scientiarum Naturalium Universitatis Pekinensis.


Link to this article: https://xbna.pku.edu.cn/CN/

https://xbna.pku.edu.cn/CN/Y2000/V36/I3/307
