Unsupervised Syntactically Controllable Paraphrase Network for Adversarial Example Generation

doi:10.13209/j.0479-8023.2020.079

Acta Scientiarum Naturalium Universitatis Pekinensis, 2021, Vol. 57, Issue 1: 83-90.

YANG Erguang¹, LIU Mingtong¹, ZHANG Yujie¹,†, MENG Yao², HU Changjian², XU Jin’an¹, CHEN Yufeng¹

1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044
2. Lenovo Research, AI Laboratory, Beijing 100085

Received: 2020-06-09 Revised: 2020-08-15 Online: 2021-01-20 Published: 2021-01-20
Contact: ZHANG Yujie, E-mail: yjzhang(at)bjtu.edu.cn

无监督的句法可控复述模型用于对抗样本生成

杨二光¹, 刘明童¹, 张玉洁^1,†, 孟遥², 胡长建², 徐金安¹, 陈钰枫¹

1. 北京交通大学计算机与信息技术学院, 北京 100044 2. 联想研究院人工智能实验室, 北京 100085

通讯作者: 张玉洁, E-mail: yjzhang(at)bjtu.edu.cn
Funding: Supported by the National Natural Science Foundation of China (61876198, 61976015, 61976016)

Abstract

Prior work on generating adversarial examples with syntactically controlled paraphrase networks requires large-scale parallel paraphrase corpora for training, so model performance is severely limited by the domain and scale of the available parallel corpus. To address this problem, this paper proposes an unsupervised syntactically controlled paraphrase model for adversarial example generation that requires only monolingual data. Specifically, the model is trained as a variational autoencoder that maps a sentence and a syntactic parse tree into a semantic variable and a syntactic variable, respectively. By learning to reconstruct the input sentence from the syntactic and semantic variables, the model learns to generate syntactically varied paraphrases without using any parallel data. Experimental results on unsupervised paraphrase generation and adversarial example generation show that the proposed model achieves new state-of-the-art results on unsupervised paraphrase generation and generates effective adversarial examples. These examples can be used to improve the robustness and generalization of natural language processing (NLP) models.
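
The training objective implied by this description can be sketched as a standard evidence lower bound with two latent variables. The abstract does not give the exact loss, so everything below is an assumption made for illustration: the symbols x (sentence), y (parse tree), z_sem and z_syn (semantic and syntactic variables), the factorized posterior, and the choice of priors are hypothetical and may differ from the formulation in the paper.

```latex
% Illustrative sketch only; notation and factorization are assumptions,
% not taken from the paper.
% x: input sentence, y: its syntactic parse tree
% z_sem ~ q(z_sem | x): semantic variable
% z_syn ~ q(z_syn | y): syntactic variable
\begin{aligned}
\mathcal{L}(x, y)
  &= \mathbb{E}_{q(z_{\mathrm{sem}} \mid x)\, q(z_{\mathrm{syn}} \mid y)}
     \bigl[ \log p(x \mid z_{\mathrm{sem}}, z_{\mathrm{syn}}) \bigr] \\
  &\quad - \mathrm{KL}\bigl( q(z_{\mathrm{sem}} \mid x) \,\|\, p(z_{\mathrm{sem}}) \bigr)
         - \mathrm{KL}\bigl( q(z_{\mathrm{syn}} \mid y) \,\|\, p(z_{\mathrm{syn}}) \bigr)
\end{aligned}
```

Under this reading, the first term is the reconstruction of the input sentence from both variables, and the two KL terms regularize each posterior toward its prior. Syntactic control at generation time would then plausibly amount to keeping z_sem from the source sentence while encoding a different target parse tree into z_syn, although the abstract does not spell out the inference procedure.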

Key words: unsupervised learning, syntactically controllable paraphrase network, adversarial example

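Abstract (translated from the Chinese):

When adversarial examples are generated with a syntactically controllable paraphrase generation model, model performance is limited by the domain and scale of the parallel paraphrase corpus. To address this problem, an unsupervised syntactically controllable paraphrase generation model that can be trained on monolingual data alone is proposed for generating adversarial examples. The model is learned in a variational autoencoding manner: a sentence and a syntax tree are first mapped to a semantic variable and a syntactic variable, respectively, and the original sentence is then reconstructed from these two variables. During reconstruction, the model learns to generate syntactically varied paraphrases without using any parallel corpus. Experimental results on unsupervised paraphrase generation and adversarial example generation show that the proposed method achieves the best performance on unsupervised paraphrase generation and can generate effective adversarial examples, which can be used to improve the robustness and generalization ability of neural natural language processing (NLP) models.

Key words (translated from the Chinese): unsupervised learning, syntactically controllable paraphrase generation model, adversarial example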

YANG Erguang, LIU Mingtong, ZHANG Yujie, MENG Yao, HU Changjian, XU Jin’an, CHEN Yufeng. Unsupervised Syntactically Controllable Paraphrase Network for Adversarial Example Generation[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2021, 57(1): 83-90.

杨二光, 刘明童, 张玉洁, 孟遥, 胡长建, 徐金安, 陈钰枫. 无监督的句法可控复述模型用于对抗样本生成[J]. 北京大学学报自然科学版, 2021, 57(1): 83-90.

URL: https://xbna.pku.edu.cn/EN/10.13209/j.0479-8023.2020.079

https://xbna.pku.edu.cn/EN/Y2021/V57/I1/83
