Research on the Sense Guessing of Chinese Unknown Words Based on “Semantic Knowledge-base of Modern Chinese”

Acta Scientiarum Naturalium Universitatis Pekinensis, 2016, Vol. 52, Issue (1): 10-16. DOI: 10.13209/j.0479-8023.2016.009

SHANG Fenfen^1,2, GU Yanhui^1,2, DAI Rubing^3, LI Bin^3, ZHOU Junsheng^1,2, QU Weiguang^1,2

1. School of Computer Science and Technology, Nanjing Normal University, Nanjing 210023
2. Jiangsu Research Center of Information Security & Privacy Technology, Nanjing 210023
3. School of Chinese Language and Culture, Nanjing Normal University, Nanjing 210097

Received: 2015-06-19 Online: 2016-01-20 Published: 2016-01-20
Contact: GU Yanhui, E-mail: gu(at)njnu.edu.cn

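Funding: Supported by the National Natural Science Foundation of China (61272221, 61472191), the National Social Science Foundation of China (11CYY030, 10CYY021), the Jiangsu Provincial Social Science Foundation (12YYA002), and the Natural Science Foundation of the Jiangsu Higher Education Institutions (14KJB520022).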

Abstract

To address the problem of sense guessing for Chinese unknown words, dictionaries at different semantic levels were first built from the “Semantic Knowledge-base of Modern Chinese”. A sense-guessing model was constructed from each dictionary, and the individual models were then integrated into an ensemble, which predicted the senses of unknown words with better performance. The prediction models were used to predict and annotate the senses of unknown words in the People’s Daily corpus from 2000. The result is a corpus resource in which unknown words carry sense annotations.
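The abstract does not specify the individual models or how they are combined, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes each semantic level is obtained by truncating a class path, that each per-level model votes using known words sharing the unknown word's final character, and that the ensemble sums the per-level votes. The lexicon entries, class labels, and function names are all hypothetical and are not taken from the paper or from the “Semantic Knowledge-base of Modern Chinese”.

```python
from collections import Counter

# Toy lexicon: known word -> semantic class path, coarse to fine.
# Hypothetical entries and labels, NOT from the Semantic Knowledge-base of Modern Chinese.
TOY_LEXICON = {
    "教师": ("entity", "person", "occupation"),
    "医师": ("entity", "person", "occupation"),
    "厨师": ("entity", "person", "occupation"),
    "导师": ("entity", "person", "relation"),
    "出师": ("event", "action", "military"),
}


def build_level_dictionary(lexicon, depth):
    """Truncate each class path to `depth` levels: one dictionary per semantic level."""
    return {word: path[:depth] for word, path in lexicon.items()}


def level_model(level_dict, unknown_word):
    """Assumed per-level model: normalized votes from known words that share
    the unknown word's last character."""
    votes = Counter(
        cls for word, cls in level_dict.items() if word[-1] == unknown_word[-1]
    )
    total = sum(votes.values())
    return {cls: n / total for cls, n in votes.items()} if total else {}


def ensemble_guess(lexicon, unknown_word, max_depth=3):
    """Assumed ensemble: score each finest-level class by summing the votes
    that its prefixes receive from the model at every level."""
    per_level = [
        level_model(build_level_dictionary(lexicon, d), unknown_word)
        for d in range(1, max_depth + 1)
    ]
    candidates = per_level[-1]  # finest-level classes that received any vote
    if not candidates:
        return None  # no known word shares the final character
    scores = {
        cls: sum(per_level[d - 1].get(cls[:d], 0.0) for d in range(1, max_depth + 1))
        for cls in candidates
    }
    return max(scores, key=scores.get)


print(ensemble_guess(TOY_LEXICON, "药师"))
```

In this toy setup the coarse-level models reinforce fine-grained classes whose ancestors are well supported, which is one simple way to let dictionaries at different semantic levels inform a single prediction.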

Key words: Chinese unknown words, sense guessing, semantic annotation, ensemble learning


CLC Number: TP391

SHANG Fenfen, GU Yanhui, DAI Rubing, LI Bin, ZHOU Junsheng, QU Weiguang. Research on the Sense Guessing of Chinese Unknown Words Based on “Semantic Knowledge-base of Modern Chinese”[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2016, 52(1): 10-16.


URL: https://xbna.pku.edu.cn/EN/10.13209/j.0479-8023.2016.009

https://xbna.pku.edu.cn/EN/Y2016/V52/I1/10
