Making serendipitous recommendation by providing less obvious items
Student thesis: Doctoral Thesis
Recommendation systems play an important role in presenting relevant items to customers on e-commerce websites. Traditionally, they focus on improving accuracy, measured for example by RMSE (Root Mean Square Error) and precision. While accuracy is a useful and natural measure of recommendation quality, accurate recommendations are not necessarily useful if the suggested items are obvious and already well known to the users. In order to provide less obvious suggestions, recent research has begun to explore alternative strategies. Several criteria for recommendations have been proposed, including diversity [129, 3, 69, 122, 50, 2], novelty [122, 50, 61, 115] and, more recently, serendipity [36, 1, 56, 88, 57, 52, 76]. Intuitively, a serendipitous recommendation is one that gives the user a pleasant surprise. A good serendipitous recommendation has two basic requirements: the suggestion should be unexpected to the user (unexpectedness), and yet it must still be useful (usefulness). Since serendipitous recommendations are not restricted to popular items or to items similar to the user's profile or previous choices, a good serendipitous recommendation system not only broadens the user's choices but also gives e-retailers a valuable tool for cross-selling their off-the-beaten-track products. The above discussion motivates us to propose three serendipitous recommendation algorithms in this thesis, namely Inno, Inno++ and Robin.
• Inno is a serendipitous recommendation algorithm which generates recommendations from innovators. Typical user-based collaborative filtering (UserCF) makes recommendations from like-minded users. When like-minded users are identified, the number of co-rated items and the consistency of the ratings on those co-rated items are considered, while the purchase precedence and activeness of the users are not taken into account.
Yet these two factors contain valuable information that can contribute to recommendation serendipity. First, by analogy with Diffusion of Innovations Theory (DIT), the earlier a like-minded user purchased an item, the more likely that user is to be a trend leader in the respective area of interest. Hence, recommendations generated from such like-minded users are likely to fit the given user's preferences and be useful to him or her. Such like-minded users are called innovators in DIT, and they should have a higher level of influence on their followers than a typical like-minded user. Second, innovators are typically more active and more adventurous users who are willing to try out new products of various genres that the given user may not have heard of, and these items can bring unexpectedness to the user. Therefore, recommendations generated from innovators may be both unexpected and useful for the user, which contributes to the serendipity of the recommendations. Experimental results show that Inno, which takes the purchase precedence and activeness of the users into account, outperforms both typical UserCF and a representative innovator-based method which does not consider these two factors.
• Inno++: Although our experimental results show that Inno performs well in terms of accuracy, serendipity and diversity, it does not take into account the two important components of serendipity: unexpectedness and utility. Inspired by Inno, we further propose an extension of it, namely Inno++. Like Inno, Inno++ makes recommendations from innovators, but it also takes unexpectedness and utility into account when generating recommendations. Inno++ first constructs a set of candidate items from the innovators, and then calculates the unexpectedness and utility of each candidate item.
To model unexpectedness, Inno++ combines the concepts of item popularity (or rareness) and dissimilarity: the less popular (or rarer) a suggested item is and the greater its distance from the user's profile, the more unexpected it is assumed to be. To model usefulness, Inno++ adopts the PureSVD latent factor model, whose effectiveness in capturing user interests has already been demonstrated. Unexpectedness and utility are then combined into an integrated score that is used to generate recommendations. Experimental results show that Inno++ outperforms Inno in both accuracy and serendipity.
• Robin: While Inno++ takes unexpectedness and utility into account in generating recommendations, it combines them linearly. We postulate that combining these two factors within a formal learning process would improve the recommendation quality. To verify this assumption, Robin is proposed. In order to put unexpectedness and utility into a learning framework, we introduce the notion of an unexpectedness weight into the utility model. With the unexpectedness weight, Robin gives a higher weight to items that are less popular and less similar to the user profile. Moreover, we investigate how the weights given to item rareness and item dissimilarity affect the recommendation quality, and which weights are optimal.
One of the main issues for serendipitous recommendation algorithms is the definition of serendipity. This paper adopted the definition of serendipity which covers unexpectedness and utility. However, unexpectedness and usefulness are not clearly defined in this work. Although some works [1, 76] have further elaborated the definitions of unexpectedness and usefulness, they do not reach a consensus. For example, Adamopoulos et al. consider that the greater a recommendation's distance from the user's profile (dissimilarity), the more unexpected it is assumed to be, whereas Lu et al. argue that the less popular (or rarer) a suggested item is, the more unexpectedness it brings to the user.
However, we consider that unexpectedness should be related to both dissimilarity and rareness. Based on this discussion, we propose a novel definition of unexpectedness, which forms the basis of the proposed algorithms. The effectiveness of the three proposed schemes has been evaluated experimentally with various metrics (precision, recall, F1-score, serendipity and diversity) on popular benchmark datasets. The results are further compared with those of existing representative serendipitous and non-serendipitous recommendation algorithms. The experimental results are encouraging: the three proposed schemes not only produce superior results in terms of serendipity, but also lead in terms of accuracy (precision, recall, F1-score) and diversity.
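As a purely illustrative aid, the idea of scoring an item by its rareness and its dissimilarity to the user profile, and then combining that unexpectedness linearly with a utility value, might be sketched as follows. This is a minimal sketch under stated assumptions, not the formulation used in the thesis: the abstract does not give the formulas, so the rareness proxy (share of users who have not rated the item), the use of mean cosine distance, the mixing parameters `alpha` and `lam`, and all function names are hypothetical. The utility values are simply supplied by hand here, whereas the thesis obtains them from a PureSVD model.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two feature vectors (1 = orthogonal)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # assumption: treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def rareness(n_raters, n_users):
    """Assumed proxy for rareness: share of users who have NOT rated the item."""
    return 1.0 - n_raters / n_users


def dissimilarity(item_vec, profile_vecs):
    """Assumed proxy: mean cosine distance from the item to the profile items."""
    if not profile_vecs:
        return 1.0
    total = sum(cosine_distance(item_vec, p) for p in profile_vecs)
    return total / len(profile_vecs)


def unexpectedness(item_vec, n_raters, profile_vecs, n_users, alpha=0.5):
    """Weighted mix of rareness and dissimilarity; alpha is a hypothetical weight."""
    return (alpha * rareness(n_raters, n_users)
            + (1.0 - alpha) * dissimilarity(item_vec, profile_vecs))


def integrated_score(unexp, utility, lam=0.5):
    """Linear combination of unexpectedness and utility; lam is hypothetical."""
    return lam * unexp + (1.0 - lam) * utility


# Toy data: the user's profile holds two items with the same feature vector.
profile = [[1.0, 0.0], [1.0, 0.0]]
n_users = 10

# Candidate items: (feature vector, number of raters, hand-picked utility).
candidates = {
    "A": ([1.0, 0.0], 8, 0.9),  # popular and similar to the profile
    "B": ([0.0, 1.0], 1, 0.6),  # rare and dissimilar to the profile
}

scores = {}
for name, (vec, n_raters, utility) in candidates.items():
    u = unexpectedness(vec, n_raters, profile, n_users)
    scores[name] = integrated_score(u, utility)

# Rank candidates by integrated score, highest first.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
print(ranked)
```

In this toy example the rare, dissimilar item B ranks above the popular, similar item A even though A has the higher utility, which is the behaviour a linear combination of this kind is meant to produce. A learned weighting, as postulated for Robin, would replace the fixed `alpha` and `lam` with parameters fitted to data.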
- Electronic commerce, Recommender systems (Information filtering)