Acta Scientiarum Naturalium Universitatis Pekinensis (北京大学学报(自然科学版))

面向知识库的中文自然语言问句的语义理解

许坤,冯岩松,赵东岩,陈立伟,邹磊   

  1. 北京大学计算机科学技术研究所, 北京100080;
  • 收稿日期:2013-06-17 出版日期:2014-01-20 发布日期:2014-01-20

Automatic Understanding of Natural Language Questions for Querying Chinese Knowledge Bases

XU Kun, FENG Yansong, ZHAO Dongyan, CHEN Liwei, ZOU Lei   

  1. Institute of Computer Science and Technology, Peking University, Beijing 100080
  • Received: 2013-06-17; Online: 2014-01-20; Published: 2014-01-20

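摘要 (Abstract, translated from the Chinese): This paper designs a framework for converting natural language questions into structured queries. Starting from the syntactic structure of a question, the method proposes a set of heuristics for identifying entities and relations, uses a corpus to build a mapping from entities to the knowledge base, disambiguates predicates, and then converts the question into a machine-understandable structured query language. A standard test set of 42 questions in three categories (person, location, and organization) is extracted from Baidu Knows (百度知道). Experimental results show that the proposed framework can effectively convert Chinese natural language questions into structured queries, laying a solid foundation for next-generation intelligent question answering systems.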

关键词: 自然语言问句, 知识库, 查询语义图

Abstract: A framework for transforming natural language questions into computer-understandable structured queries is presented. The authors propose using a query semantic graph to represent the semantics of Chinese questions, and adopt predicate and entity disambiguation to match the query graph to the schema of a knowledge base. The authors collect a benchmark of 42 frequently asked questions randomly sampled from 3 categories of Baidu Knows: person, location, and organization. Experimental results show that the proposed framework can effectively convert natural language questions into SPARQL queries, and that it lays a good foundation for the next generation of intelligent question answering systems.

Key words: natural language question, knowledge base, query semantic graph
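
To make the last step of the pipeline concrete, the following is a minimal sketch of how a query semantic graph might be serialized as a SPARQL query once its entities and predicates have been mapped to the knowledge base. It is not the authors' implementation: the `Edge` class, the `to_sparql` function, and the `http://example.org/` identifiers are all hypothetical names introduced here for illustration, and the graph is assumed to be a flat list of triples.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Edge:
    """One edge of a (hypothetical) query semantic graph.

    Each field is either a SPARQL variable such as "?x" or an IRI
    that disambiguation has already mapped to the knowledge base.
    """
    subject: str
    predicate: str
    obj: str


def to_sparql(target: str, edges: Sequence[Edge]) -> str:
    """Serialize the graph as a SPARQL SELECT query over `target`."""
    # Every edge becomes one triple pattern in the WHERE clause.
    patterns = "\n".join(
        f"  {e.subject} {e.predicate} {e.obj} ." for e in edges
    )
    return f"SELECT DISTINCT {target} WHERE {{\n{patterns}\n}}"


# Assumed example: a question of the form "Who founded <organization>?"
# after entity and predicate disambiguation. The IRIs are placeholders.
graph = [
    Edge("?x", "<http://example.org/founderOf>", "<http://example.org/SomeOrg>"),
]
print(to_sparql("?x", graph))
```

Keeping the graph as plain triples means that a question with several constraints simply contributes more edges, and the serialization step stays unchanged.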
