Acta Scientiarum Naturalium Universitatis Pekinensis ›› 2017, Vol. 53 ›› Issue (2): 262-272. DOI: 10.13209/j.0479-8023.2017.033
• Original Article •
Chuanming YU1, Bolin FENG1, Yuheng ZUO1, Baiyun CHEN1, Lu AN2,†
Received: 2016-07-22
Revised: 2016-09-24
Online: 2017-03-20
Published: 2017-03-20
Contact: Lu AN
Chuanming YU, Bolin FENG, Yuheng ZUO, Baiyun CHEN, Lu AN. An Individual-Group-Merchant Relation Model for Identifying Online Fake Reviews[J]. Acta Scientiarum Naturalium Universitatis Pekinensis, 2017, 53(2): 262-272.
URL: https://xbna.pku.edu.cn/EN/10.13209/j.0479-8023.2017.033
Indicator | Estimate | Std. error | z value | Pr(>\|z\|) |
---|---|---|---|---|
Intercept | -3.5174 | 0.5886 | -5.976 | 2.29×10⁻⁹*** |
RUR | -5.4879 | 1.9562 | -2.805 | 5.03×10⁻³** |
URW | 2.5755 | 1.2074 | 2.133 | 3.291×10⁻²* |
RR | 50.6619 | 11.9434 | 4.424 | 2.22×10⁻⁵*** |
RCS | 2.3308 | 1.0796 | 2.159 | 3.085×10⁻²* |
RTW | 3.5696 | 0.8075 | 4.421 | 9.85×10⁻⁶*** |
Table 1 Individual behavior indicators’ evaluation
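The z-value and Pr(>|z|) columns in Table 1 look like the summary of a generalized linear model fit. Assuming (this page does not confirm it) that the estimates are log-odds coefficients of a binomial logistic regression, a fake-review probability for one user could be sketched as follows. The function name `fake_probability` and any indicator values passed to it are hypothetical; how RUR, URW, RR, RCS and RTW are measured or scaled is not given here.

```python
import math

# Estimates copied from Table 1. Treating them as logistic-regression
# log-odds coefficients is an assumption, not something this page states.
COEFFICIENTS = {
    "Intercept": -3.5174,
    "RUR": -5.4879,
    "URW": 2.5755,
    "RR": 50.6619,
    "RCS": 2.3308,
    "RTW": 3.5696,
}


def fake_probability(indicators):
    """Map indicator values to a probability with the logistic function.

    `indicators` is a dict such as {"RR": 0.02, "RTW": 0.5}; indicators
    that are left out contribute nothing to the linear predictor.
    """
    z = COEFFICIENTS["Intercept"]
    for name, value in indicators.items():
        z += COEFFICIENTS[name] * value
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical usage: the indicator values below are made up for illustration.
baseline = fake_probability({})
example = fake_probability({"RR": 0.05, "RTW": 0.4})
print(baseline, example)
```

Under this reading, positive estimates (URW, RR, RCS, RTW) raise the predicted probability as the indicator grows, while the negative RUR estimate lowers it.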
Object | k | P | R | F1 |
---|---|---|---|---|
Individual | 90 | 0.7992 | 0.7543 | 0.7737 |
Individual | 92 | 0.8044 | 0.7719 | 0.7863 |
Individual | 94 | 0.8103 | 0.7894 | 0.7989 |
Individual | 96 | 0.8168 | 0.8070 | 0.8116 |
Individual | 98 | 0.8245 | 0.8245 | 0.8245 |
Individual | 100 | 0.8152 | 0.8245 | 0.8195 |
Individual | 102 | 0.8052 | 0.8245 | 0.8137 |
Individual | 104 | 0.8165 | 0.8421 | 0.8262 |
Merchant | 30 | 0.5471 | 0.4909 | 0.5094 |
Merchant | 32 | 0.5690 | 0.5272 | 0.5424 |
Merchant | 34 | 0.5907 | 0.5636 | 0.5744 |
Merchant | 36 | 0.5774 | 0.5636 | 0.5698 |
Merchant | 38 | 0.5636 | 0.5636 | 0.5636 |
Merchant | 40 | 0.5866 | 0.6 | 0.5926 |
Merchant | 42 | 0.5721 | 0.6 | 0.5833 |
Merchant | 44 | 0.5553 | 0.6 | 0.5717 |
Group | 66 | 0.9786 | 0.8292 | 0.8874 |
Group | 68 | 0.9630 | 0.8292 | 0.8863 |
Group | 70 | 0.9637 | 0.8536 | 0.9010 |
Group | 72 | 0.9644 | 0.8780 | 0.9154 |
Group | 74 | 0.9654 | 0.9024 | 0.9298 |
Group | 76 | 0.9668 | 0.9268 | 0.9441 |
Group | 78 | 0.9505 | 0.9268 | 0.9385 |
Group | 80 | 0.9512 | 0.9512 | 0.9512 |
Table 2 Experimental results of the IGMRM model
Merchant ID | f(m) | Rank |
---|---|---|
12**73 | 7.461918×10⁻¹ | 1 |
7**09 | 5.370122×10⁻¹ | 2 |
14**68 | 3.934663×10⁻¹ | 3 |
10**30 | 2.677272×10⁻⁵ | 4 |
7**72 | 1.714189×10⁻⁵ | 5 |
7**96 | 1.453285×10⁻⁵ | 6 |
12**22 | 1.339835×10⁻⁵ | 7 |
12**29 | 1.184025×10⁻⁵ | 8 |
12**76 | 1.018929×10⁻⁵ | 9 |
10**92 | 8.861665×10⁻⁶ | 10 |
Table 3 Merchants’ fake degree rank
User ID (IP) | f(u) | Rank |
---|---|---|
111.*.*.140 | 0.379819 | 1 |
222.*.*.179 | 0.3325661 | 2 |
122.*.*.223 | 0.3296828 | 3 |
42.*.*.160 | 0.3249425 | 4 |
42.*.*.100 | 0.3037860 | 5 |
222.*.*.110 | 0.2878484 | 6 |
42.*.*.74 | 0.2876034 | 7 |
125.*.*.19 | 0.2819457 | 8 |
42.*.*.246 | 0.2724028 | 9 |
61.*.*.244 | 0.2243581 | 10 |
Table 4 Users’ fake degree rank
Group ID (member IPs) | f(g) | Rank |
---|---|---|
"111.*.*.140" "122.*.*.223" "125.*.*.19" "222.*.*.179" "222.*.*.110" "42.*.*.38" "42.*.*.160" "42.*.*.74" "58.*.*.60" "61.*.*.244" | 0.1246993 | 1 |
"111.*.*.140" "122.*.*.223" "125.*.*.19" "222.*.*.179" "222.*.*.110" "42.*.*.246" "42.*.*.160" "42.*.*.74" "58.*.*.60" "61.*.*.244" | 0.1246467 | 2 |
"111.*.*.140" "122.*.*.223" "125.*.*.19" "222.*.*.179" "222.*.*.110" "42.*.*.100" "42.*.*.160" "42.*.*.74" "58.*.*.60" "61.*.*.244" | 0.1246448 | 3 |
"111.*.*.140" "122.*.*.223" "125.*.*.19" "222.*.*.179" "222.*.*.110" "42.*.*.38" "42.*.*.246" "42.*.*.160" "42.*.*.74" "61.*.*.244" | 0.1245663 | 4 |
"111.*.*.140" "122.*.*.223" "125.*.*.19" "222.*.*.179" "222.*.*.110" "42.*.*.38" "42.*.*.100" "42.*.*.160" "42.*.*.74" "61.*.*.244" | 0.1245570 | 5 |
"111.*.*.140" "122.*.*.223" "125.*.*.19" "222.*.*.179" "222.*.*.110" "42.*.*.38" "42.*.*.246" "42.*.*.160" "42.*.*.74" "58.*.*.60" | 0.1245103 | 6 |
"111.*.*.140" "122.*.*.223" "125.*.*.19" "222.*.*.179" "222.*.*.110" "42.*.*.246" "42.*.*.100" "42.*.*.160" "42.*.*.74" "61.*.*.244" | 0.1245025 | 7 |
"111.*.*.140" "122.*.*.223" "125.*.*.19" "222.*.*.179" "222.*.*.110" "42.*.*.38" "42.*.*.100" "42.*.*.160" "42.*.*.74" "58.*.*.60" | 0.1245014 | 8 |
"111.*.*.140" "122.*.*.223" "125.*.*.19" "222.*.*.179" "222.*.*.110" "42.*.*.246" "42.*.*.100" "42.*.*.160" "42.*.*.74" "58.*.*.60" | 0.1244561 | 9 |
"111.*.*.140" "122.*.*.223" "125.*.*.19" "222.*.*.179" "222.*.*.110" "42.*.*.38" "42.*.*.100" "42.*.*.160" "58.*.*.60" "61.*.*.244" | 0.1244181 | 10 |
Table 5 Groups’ fake degree rank
Model | P | R | F |
---|---|---|---|
LR | 0.4555 | 0.7857 | 0.5263 |
KNN | 0.7481 | 0.7926 | 0.7675 |
IGMRM | 0.8165 | 0.8421 | 0.8262 |
Table 6 Comparison among three methods
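The P, R and F columns are presumably precision, recall and F-measure, although this page does not define them. A minimal sketch of the standard definitions follows, assuming a binary fake/genuine labeling; the function name `precision_recall_f1` and the counts in the usage line are hypothetical and are not taken from the paper's data.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative counts. A ratio with a zero denominator is
    reported as 0.0 instead of raising an error.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(p, r, f1)
```

If the published metrics were averaged over several folds or runs, the tabulated F need not equal the harmonic mean of the tabulated P and R.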