User Profiling Based on Multimodal Fusion Technology

doi:10.13209/j.0479-8023.2019.097

Acta Scientiarum Naturalium Universitatis Pekinensis, 2020, Vol. 56, Issue 1: 105-111. DOI: 10.13209/j.0479-8023.2019.097


ZHANG Zhuang¹, FENG Xiaonian², QIAN Tieyun¹,†

1. School of Computer Science, Wuhan University, Wuhan 430072
2. China Power Finance Co., Ltd, Beijing 100005

Received: 2019-05-21 Revised: 2019-09-22 Online: 2020-01-20 Published: 2020-01-20
Contact: QIAN Tieyun, E-mail: qty(at)whu.edu.cn

Funding: Supported by the National Natural Science Foundation of China (61572376)

Abstract


Existing user profiling studies do not fully exploit the information available in each modality. This paper proposes a cross-modal joint representation learning network and, building on it, develops a multimodal fusion model for user profiling. First, a stacking ensemble method is used to combine several cross-modal joint representation learning networks. Then, an attention mechanism is introduced so that the model automatically learns how much the representation of each modality contributes to the prediction task. The proposed model has a carefully designed loss function and network structure, and it produces a joint representation through both feature-level and decision-level fusion, which allows related features from different modalities to be combined. Extensive experiments on real-world datasets show that the proposed model outperforms the baselines.

Key words: user profiling, model combination, stacking, cross-modal joint representation learning, multi-layer and multi-level model fusion
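
As a rough illustration of the two fusion levels named in the abstract, the following Python sketch shows attention-weighted feature-level fusion over per-modality vectors and a logistic meta-learner for stacking-style decision-level fusion. It is not the authors' implementation. The modality names (`text`, `image`, `relation`), the dot-product scoring against a fixed `query` vector, and the functions `attention_fuse` and `stack_decisions` are all assumptions made for illustration; in a trained model the query and the meta-learner weights would be learned from data.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(
    modality_vectors: Dict[str, List[float]],
    query: List[float],
) -> Tuple[List[float], Dict[str, float]]:
    """Feature-level fusion: attention-weighted sum of modality vectors.

    Each modality is scored by the dot product of its vector with a
    query vector (hypothetical; a real model would learn this scoring).
    Returns the fused vector and the attention weight per modality.
    """
    names = list(modality_vectors)
    scores = [
        sum(v * q for v, q in zip(modality_vectors[name], query))
        for name in names
    ]
    weights = softmax(scores)
    fused = [
        sum(w * modality_vectors[name][i] for w, name in zip(weights, names))
        for i in range(len(query))
    ]
    return fused, dict(zip(names, weights))


def stack_decisions(
    base_probs: List[float],
    meta_weights: List[float],
    bias: float,
) -> float:
    """Decision-level fusion: logistic meta-learner over base-model outputs.

    base_probs holds one predicted probability per base model; the
    meta_weights and bias are placeholders for learned parameters.
    """
    z = bias + sum(w * p for w, p in zip(meta_weights, base_probs))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Toy two-dimensional representations for three assumed modalities.
    vectors = {
        "text": [1.0, 0.0],
        "image": [0.0, 1.0],
        "relation": [1.0, 1.0],
    }
    fused, weights = attention_fuse(vectors, query=[5.0, 0.0])
    print("attention weights:", weights)
    print("fused representation:", fused)
    print("stacked prediction:", stack_decisions([0.8, 0.6], [1.5, 0.5], -1.0))
```

The attention weights sum to one, so the fused vector stays on the same scale as the inputs, and the weights themselves can be inspected to see which modality the (here fixed) query favours.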


ZHANG Zhuang, FENG Xiaonian, QIAN Tieyun. User Profiling Based on Multimodal Fusion Technology[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2020, 56(1): 105-111.



URL: https://xbna.pku.edu.cn/EN/10.13209/j.0479-8023.2019.097

https://xbna.pku.edu.cn/EN/Y2020/V56/I1/105

