2020 MCM Problem C, Outstanding Winner: Beat The Market: Comprehensive Exploration of Amazon Reviews and Ratings

Basic Information

Source name: 2020 MCM Problem C, Outstanding Winner: Beat The Market: Comprehensive Exploration of Amazon Reviews and Ratings

Source size: 4.52M

File format: .pdf

Development language: Python

Last updated: 2022-01-14

Source Description

Beat The Market: Comprehensive Exploration of Amazon Reviews and Ratings

Contents
1 Introduction
1.1 Problem Background
1.2 Clarification and Restatement
1.3 Our Work
2 Problem 1: Data Preprocessing and Mining
2.1 Data Cleaning
2.2 Text Mining by LDA Topic Model[1]
2.3 Overall Data Characteristics
3 Problem 2 (a): Ratings and Reviews Based Data Measures
3.1 Weighted Rating Ratio
3.2 Avg and Std of Weighted Sentimental Scores for Reviews[2][3][4]
3.3 Users’ Preference Vector
4 Problem 2 (b): Reputation Metric
4.1 Reputation Metric
4.2 Analysis on Reputation Variation Patterns
5 Problem 2 (c): Nested Two-Layer LSTM
5.1 The Structure of the Nested Two-layer LSTM Model
5.2 Analysis on Potential Success of Products
6 Problem 2 (d): Causal Effectiveness Between Reviews
6.1 Ripple Effects of Extreme Ratings
6.2 Causal Inference for Ratio of Low Rating and Review Length
7 Problem 2 (e)[5]: Correlation between Affective Words and Star Ratings
7.1 Analysis for Alignment of Rate and Review Scored by Certain Affective Words
7.2 Micro Observation of Asymmetricity between Rate and Review[6]
8 Strengths and Weaknesses
8.1 Strengths
8.2 Weaknesses
9 Conclusion
10 The Letter to the Marketing Director of Sunshine Company
Appendices
Appendix A LDA Topic Model for Microwave and Pacifier
Appendix B Product Contrast of Microwave and Pacifier Over Time
Appendix C Code
C.1 Data Preprocess and Overall Analysis
C.2 LDA Analysis
C.3 Sentimental Score
C.4 LSTM
C.5 Reputation Model