Acta Scientiarum Naturalium Universitatis Pekinensis

Dynamic Programming Based Gene Chip Recognition

LIU Jingwei, CHENG Qiansheng

  1. Department of Information Science, School of Mathematical Sciences, Peking University, Beijing 100871
  • Received: 2001-11-26 Online: 2002-09-20 Published: 2002-09-20

Abstract: Dynamic programming (DP) is applied to the recognition of gene chip data. A definition of the global maximum self-similarity of gene chip data is proposed, together with an automatic recognition method based on maximum self-similarity and the alignment of local high-dimensional segments. Two template construction methods are compared for their effect on gene recognition and classification: building the template directly from maximum similarity, and averaging genes along the maximum-similarity alignment path. Experiments on tumor gene recognition show that the DP algorithm based on maximum similarity (DP-MS) can achieve a 100% recognition rate. The method can be applied to the recognition and classification of gene chip data and to the inference of genetic diseases.

Key words: Smith-Waterman algorithm, dynamic programming (DP), gene chip, gene recognition
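
The abstract does not give the details of DP-MS, so the following is only a minimal sketch of the general idea it names: a Smith-Waterman style local alignment between numeric expression profiles, followed by assigning a sample to the template with the highest alignment score. The scoring rule (`tolerance - |x - y|`), the `gap_penalty` value, the function names, and the example data are all illustrative assumptions, not the authors' method.

```python
from typing import Dict, Sequence


def local_alignment_score(
    a: Sequence[float],
    b: Sequence[float],
    tolerance: float = 1.0,
    gap_penalty: float = 0.5,
) -> float:
    """Smith-Waterman style local alignment score of two numeric profiles.

    Aligning values x and y scores tolerance - |x - y|: positive when they
    are closer than `tolerance`, negative otherwise. This scoring rule and
    the default parameters are assumptions made for illustration.
    """
    prev = [0.0] * (len(b) + 1)  # previous row of the DP table
    best = 0.0
    for x in a:
        curr = [0.0]  # first column is always zero (local alignment)
        for j, y in enumerate(b, start=1):
            match = prev[j - 1] + (tolerance - abs(x - y))
            skip_a = prev[j] - gap_penalty      # gap in b
            skip_b = curr[j - 1] - gap_penalty  # gap in a
            cell = max(0.0, match, skip_a, skip_b)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def classify(sample: Sequence[float], templates: Dict[str, Sequence[float]]) -> str:
    """Return the label of the template with the highest alignment score."""
    return max(
        templates,
        key=lambda label: local_alignment_score(sample, templates[label]),
    )


if __name__ == "__main__":
    # Hypothetical expression profiles, invented for this example.
    templates = {
        "tumor": [5.0, 6.0, 7.0, 8.0],
        "normal": [1.0, 1.0, 2.0, 1.0],
    }
    print(classify([0.0, 5.1, 6.0, 7.2, 3.0], templates))
```

Because the recurrence clamps every cell at zero, only the best-matching local segment contributes to the score, which is what distinguishes local alignment from a global one.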
