Acta Scientiarum Naturalium Universitatis Pekinensis, 2016, Vol. 52, Issue (1): 134-140. DOI: 10.13209/j.0479-8023.2016.017

Multiple-Choice Question Answering Based on Textual Entailment

WANG Baoxin, ZHENG Dequan, WANG Xiaoxue, ZHAO Shanshan, ZHAO Tiejun   

  1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001
  • Received: 2015-06-19 Online: 2016-01-20 Published: 2016-01-20
  • Contact: ZHENG Dequan, E-mail: dqzheng(at)mtlab.hit.edu.cn

基于文本蕴含的选择类问题解答技术研究

王宝鑫, 郑德权, 王晓雪, 赵姗姗, 赵铁军

  1. 哈尔滨工业大学计算机科学与技术学院, 哈尔滨 150001
  • 通讯作者: 郑德权, E-mail: dqzheng(at)mtlab.hit.edu.cn
  • Funding:
    Supported by the National Natural Science Foundation of China (61173073) and the 863 Program (2015AA015405)

Abstract:

This paper proposes a method for computing textual entailment strength and uses it to rank candidate answers. It takes multiple-choice questions as its research object, because their explicit candidate answers simplify question classification, and it targets the phenomenon of a long text entailing a short text. In the absence of large-scale question-answer pairs, two methods are used to answer geography multiple-choice questions from the college entrance examination, with the Chinese Wikipedia corpus as the knowledge source: one is based on sentence similarity and the other on the proposed textual entailment. The proposed method reaches an accuracy of 36.93%, which is 2.44% higher than sentence similarity based on word embeddings and 7.66% higher than sentence similarity based on the Vector Space Model, confirming the effectiveness of the textual entailment approach.

Key words: textual entailment, multiple-choice question, word embedding, sentence similarity
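
The candidate-ranking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the entailment-strength formula, so the `coverage` function below is a hypothetical stand-in (the share of hypothesis tokens found in a longer evidence sentence), and `cosine` is a plain bag-of-words similarity in the spirit of a Vector Space Model baseline. The function names, the toy English tokens, and the evidence sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(sentence, hypothesis):
    """Bag-of-words cosine similarity between two token lists."""
    cs, ch = Counter(sentence), Counter(hypothesis)
    dot = sum(cs[t] * ch[t] for t in cs)
    norm = math.sqrt(sum(v * v for v in cs.values())) * math.sqrt(
        sum(v * v for v in ch.values())
    )
    return dot / norm if norm else 0.0


def coverage(sentence, hypothesis):
    """Hypothetical stand-in for entailment strength (NOT the paper's formula):
    the share of hypothesis tokens that also occur in the longer sentence."""
    if not hypothesis:
        return 0.0
    vocab = set(sentence)
    return sum(1 for t in hypothesis if t in vocab) / len(hypothesis)


def answer(question, candidates, evidence, score):
    """Return the label of the candidate whose best score over all
    evidence sentences is highest."""
    def best(candidate_tokens):
        hypothesis = question + candidate_tokens
        return max((score(sent, hypothesis) for sent in evidence), default=0.0)

    return max(candidates, key=lambda label: best(candidates[label]))


# Toy data (assumed for illustration; the paper uses Chinese exam questions
# and the Chinese Wikipedia corpus).
question = ["longest", "river", "in", "china"]
candidates = {"A": ["yellow", "river"], "B": ["yangtze", "river"]}
evidence = [
    ["the", "yangtze", "river", "is", "the", "longest", "river", "in", "china"],
    ["the", "yellow", "river", "flows", "through", "nine", "provinces"],
]

print(answer(question, candidates, evidence, cosine))
print(answer(question, candidates, evidence, coverage))
```

Both scoring functions share the same ranking loop, so comparing a similarity-based and an entailment-based method only requires swapping the `score` argument.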

摘要:

利用选择类问题具有明确候选项的特点, 简化问题分类过程, 并针对长文本语义蕴含短文本语义的语言现象, 提出一种根据文本蕴含强度大小对候选答案进行排序的方法。在没有大规模问答对的情况下, 采用维基百科中文语料库, 以全国各省市高考地理选择题作为实验数据, 通过句子相似度和文本蕴含两种方法来解答地理选择题。实验表明, 基于文本蕴含方法的准确率为36.93%, 比基于词嵌入的句子相似度方法提高2.44%, 比基于向量空间模型的句子相似度方法提高7.66%, 验证了该文本蕴含强度计算方法的有效性。

关键词: 文本蕴含, 选择题, 词嵌入, 句子相似度
