Acta Scientiarum Naturalium Universitatis Pekinensis

Verbose Query Processing Based on Syntactic Features

YAO Lan (姚兰), LIN Hongfei (林鸿飞), LIN Yuan (林原), MA Yunlong (马云龙)

  1. Information Retrieval Laboratory, Dalian University of Technology, Dalian 116024
  • Received: 2012-03-01 Online: 2013-03-20 Published: 2013-03-20

A Parsing Approach for Verbose Queries

YAO Lan, LIN Hongfei, LIN Yuan, MA Yunlong   

  1. Information Retrieval Laboratory, Dalian University of Technology, Dalian 116024;
  • Received: 2012-03-01 Online: 2013-03-20 Published: 2013-03-20

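Abstract: The traditional “bag of words” idea is extended so that a document is treated as a “bag of sentences” made up of its sentences. Dependency parsing yields the dependency relations between the words in the “bag of sentences” and between the words in the query. The degree to which these two sets of dependency relations match is used to compute the similarity between a verbose query and each document returned by the initial retrieval, and the initial results are re-ranked accordingly. Experiments on standard TREC data sets show that the method fairly effectively addresses two problems: verbose queries drifting away from the query topic, and relevant documents being ranked low when recall is low. In the low-recall case in particular, both the MAP and the P@N of the retrieval results improve significantly.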
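The re-ranking step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: dependency relations are taken as already-parsed (head, dependent) word pairs, a sentence is scored by the fraction of query pairs it contains, a document by its best sentence, and the names `sentence_match`, `document_score` and `rerank` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumption: a dependency parser has already turned each sentence (and the
# query) into a set of (head word, dependent word) pairs.
Dep = Tuple[str, str]


def sentence_match(query_deps: Set[Dep], sentence_deps: Set[Dep]) -> float:
    """Fraction of the query's dependency pairs that also occur in one sentence."""
    if not query_deps:
        return 0.0
    return len(query_deps & sentence_deps) / len(query_deps)


def document_score(query_deps: Set[Dep], bag_of_sentences: List[Set[Dep]]) -> float:
    """Score a document by its best-matching sentence (an assumed aggregation)."""
    return max((sentence_match(query_deps, s) for s in bag_of_sentences), default=0.0)


def rerank(
    query_deps: Set[Dep],
    initial_ranking: List[str],
    parsed_docs: Dict[str, List[Set[Dep]]],
) -> List[str]:
    """Re-order the initial result list by descending dependency-match score.

    sorted() is stable, so documents with equal scores keep their initial order.
    """
    return sorted(
        initial_ranking,
        key=lambda doc_id: -document_score(query_deps, parsed_docs[doc_id]),
    )


# Toy data (invented for illustration, not taken from the paper).
query_deps = {("effects", "pollution"), ("pollution", "air"), ("study", "effects")}
parsed_docs = {
    "d1": [{("quality", "air")}, {("report", "annual")}],
    "d2": [{("effects", "pollution"), ("pollution", "air")}, {("city", "large")}],
    "d3": [{("pollution", "air")}],
}
initial_ranking = ["d1", "d2", "d3"]
reranked = rerank(query_deps, initial_ranking, parsed_docs)
```

In this toy run `d2` matches two of the three query pairs, `d3` one and `d1` none, so a document that shares only isolated words with the query drops below those that preserve its word-to-word relations.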

Key words: dependency relations, verbose queries, result reconstruction, query expansion

Abstract: The authors extend the traditional “bag of words” idea by regarding every document as a “bag of sentences”. Dependency relations among words are obtained from the “bag of sentences” and from verbose queries by dependency parsing. The similarity score between a verbose query and each document is computed from the degree to which their dependency relations match, and the initial results are then re-ranked. Experiments on a standard TREC corpus show that the new approach improves retrieval effectiveness for verbose queries and in low-recall cases. In low-recall cases, MAP and P@N improve significantly.
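For readers unfamiliar with the two reported measures, the sketch below computes P@N and MAP using their standard definitions. The function names and the toy rankings are illustrative assumptions, not the authors' evaluation code or data.

```python
from typing import Dict, List, Set


def precision_at_n(ranking: List[str], relevant: Set[str], n: int) -> float:
    """P@N: share of the top-n ranked documents that are relevant."""
    if n <= 0:
        return 0.0
    return sum(1 for doc_id in ranking[:n] if doc_id in relevant) / n


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """AP: mean of the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """MAP: average of AP over all queries in the run."""
    if not runs:
        return 0.0
    return sum(
        average_precision(ranking, qrels.get(query_id, set()))
        for query_id, ranking in runs.items()
    ) / len(runs)


# Toy example (invented): one query whose relevant documents start out ranked low.
relevant = {"d2", "d3"}
before = ["d1", "d2", "d3"]
after = ["d2", "d3", "d1"]
ap_before = average_precision(before, relevant)  # (1/2 + 2/3) / 2
ap_after = average_precision(after, relevant)    # (1/1 + 2/2) / 2
```

Moving the two relevant documents from ranks 2 and 3 to ranks 1 and 2 raises AP for this query and also raises P@1, which is the kind of effect a re-ranking step is meant to produce.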

Key words: dependency relationship, verbose queries, result reconfiguration, query expansion

CLC number: