Acta Scientiarum Naturalium Universitatis Pekinensis, 2024, Vol. 60, Issue (1): 43-52. DOI: 10.13209/j.0479-8023.2023.075


Can ChatGPT Be Served as the Sentiment Expert? An Evaluation of ChatGPT on Sentiment and Metaphor Analysis

ZHANG Yazhou1,2, WANG Mengyao1, RONG Lu3, YU Yang1, ZHAO Dongming4, QIN Jing2,†   

  1. School of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002
  2. School of Nursing, The Hong Kong Polytechnic University, Hong Kong 999077
  3. Human Resources Office, Zhengzhou University of Light Industry, Zhengzhou 450002
  4. Artificial Intelligence Laboratory, China Mobile Communication Group Tianjin Co, Tianjin 300020
  • Received: 2023-05-17 Revised: 2023-07-31 Online: 2024-01-20 Published: 2024-01-20
  • Contact: QIN Jing, E-mail: harry.qin(at)polyu.edu.hk

Can ChatGPT Serve as a Sentiment Expert? Investigating Its Potential in Sentiment and Metaphor Analysis

张亚洲1,2, 王梦遥1, 戎璐3, 俞洋1, 赵东明4, 秦璟2,†   

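  • Funding:
    Supported by the Young Scientists Fund of the National Natural Science Foundation of China (62006212), the China Postdoctoral Science Foundation (2023M733907), the Open Fund of the Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education (CPSDSC202103), and the Project of Strategic Importance Grant of The Hong Kong Polytechnic University (1-ZE2Q)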

Abstract:

To explore ChatGPT's capacity for sentiment analysis and its potential for understanding subjective and metaphorical language, this paper evaluates ChatGPT on five benchmark datasets covering sentiment, humor, and metaphor, and discusses its strengths and limitations on each task by comparing it with state-of-the-art models in the field. The paper also compares ChatGPT with human performance in sentiment analysis and finds that ChatGPT differs from human results by 9.52%, 16.64%, and 6.69% on the sentiment, humor, and metaphor tasks, respectively. The results suggest that although ChatGPT achieves the best performance in dialogue generation, it still has room for improvement in sentiment understanding. Finally, the paper investigates ChatGPT's sensitivity to prompt templates in sentiment understanding scenarios by refining those templates.

Key words: ChatGPT, sentiment analysis, humor detection, metaphor recognition
