Rating- and ranking-oriented collaborative filtering techniques for recommender systems

推薦系統中評分導向與排名導向的協同過濾技術

Student thesis: Master's Thesis

Award date: 15 Jul 2014

Abstract

Recommender systems are now applied to almost every kind of web service, including e-commerce (Amazon.com), video services (Youtube.com), and social networks (Facebook.com). Collaborative filtering (CF) is one of the most popular strategies for recommender systems. CF predicts a user's interests by analyzing the past preferences of many users. Depending on the objective of the recommender system, CF can generally be classified into two categories: rating-oriented CF and ranking-oriented CF. Rating-oriented CF aims to predict the exact rating values that users would give to resources. Ranking-oriented CF, by contrast, focuses on producing the best ordering of resources according to user preferences, regardless of the precision of the predicted rating values.
The first issue we explore concerns rating-oriented CF, where we incorporate new regression models into the matrix-factorization-based model. The traditional matrix factorization approach to rating-oriented CF uses gradient descent to minimize an objective function that is in effect a ridge regression model. To incorporate other regression models into the matrix-factorization-based model, we adopt a different optimization algorithm, alternating regression. We investigate models with least absolute shrinkage and selection operator (Lasso) regularization and with elastic net regularization, and find that the model with elastic net regularization yields sparse latent matrices while performing comparably to the traditional model.
The second problem addressed in this thesis is the analysis of ranking-oriented CF from the perspective of the loss function. This perspective is helpful because it exposes the properties of an algorithm directly, regardless of how complicated its modelling is.
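The alternating-regression idea described above for rating-oriented CF can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the coordinate-descent solver, the default penalties `l1` and `l2`, and the random initialization are all assumptions made for illustration. With the item factors held fixed, each user's latent vector is the solution of a small elastic-net regression, and vice versa, so the two sets of regressions are solved in turn.

```python
import random

def soft_threshold(x, t):
    """Shrink x toward zero by t; the source of exact zeros in Lasso-type fits."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def elastic_net_cd(X, y, w, l1, l2, sweeps=10):
    """Coordinate descent on 0.5*||y - Xw||^2 + l1*||w||_1 + 0.5*l2*||w||^2.

    X is a list of feature rows, y the targets; w is updated in place.
    """
    k = len(w)
    for _ in range(sweeps):
        for j in range(k):
            rho = 0.0  # correlation of column j with the residual excluding j
            z = 0.0    # squared norm of column j
            for row, target in zip(X, y):
                partial = sum(row[m] * w[m] for m in range(k) if m != j)
                rho += row[j] * (target - partial)
                z += row[j] * row[j]
            denom = z + l2
            w[j] = soft_threshold(rho, l1) / denom if denom > 0 else 0.0
    return w

def factorize(ratings, n_users, n_items, k=2, l1=0.1, l2=0.1, rounds=20, seed=0):
    """Alternate elastic-net regressions over user and item factors.

    ratings is a list of (user, item, rating) triples with 0-based indices.
    All defaults are illustrative assumptions, not values from the thesis.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_items)]
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    for _ in range(rounds):
        # Item factors fixed: each user's vector is one regression problem.
        for u in range(n_users):
            elastic_net_cd([Q[i] for i, _ in by_user[u]],
                           [r for _, r in by_user[u]], P[u], l1, l2)
        # User factors fixed: each item's vector is one regression problem.
        for i in range(n_items):
            elastic_net_cd([P[u] for u, _ in by_item[i]],
                           [r for _, r in by_item[i]], Q[i], l1, l2)
    return P, Q

def objective(ratings, P, Q, l1, l2):
    """Regularized squared error over the observed ratings."""
    fit = sum(0.5 * (r - sum(a * b for a, b in zip(P[u], Q[i]))) ** 2
              for u, i, r in ratings)
    flat = [v for row in P + Q for v in row]
    return fit + l1 * sum(abs(v) for v in flat) + 0.5 * l2 * sum(v * v for v in flat)
```

Setting `l1=0` reduces each subproblem to ridge regression, which corresponds to the traditional objective; a positive `l1` adds the Lasso-style penalty that can drive individual latent entries to exactly zero. Each coordinate update minimizes the objective in that coordinate, so the objective never increases from one round to the next.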
To gain insight into the popularity bias problem, we also study the tendency of a CF algorithm to recommend the most popular resources. We demonstrate that some previous models work well because their algorithms boost popular resources, a property we call popularity tendency. We find that some sophisticated models outperform the popularity-based algorithm because they incorporate personalized recommendation (personalization) while retaining popularity tendency (non-personalization). To control the trade-off between personalization and popularity tendency, we propose two new models, one with the generalized logistic loss function and one with the hinge loss function. By changing the parameters of the proposed models, the trade-off between personalization and popularity tendency can be adjusted until the optimal trade-off is found.
Finally, we propose a recommendation strategy based on predicting the popularity of resources. Some resources gain popularity in their communities while others do not. Popularity prediction means predicting the future popularity of a resource from the feedback users give during an early period. The resources predicted to be most popular can then be recommended to users. Traditional popularity prediction models used only statistics of the reviews, for example the number of reviews and the average word length, and did not consider the content of the reviews, which can reflect user preferences. We propose performing sentiment analysis on the review content: a sentiment score is extracted from each review and treated as one of its features. A polynomial regression model that includes the sentiment score feature is developed to predict the popularity of resources.
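The popularity-prediction step can be sketched as a small polynomial regression. This is an illustration under stated assumptions, not the model from the thesis: the two input features (an early review count and a mean sentiment score, assumed here to lie roughly in [-1, 1] and to come from some sentiment analyzer that is not shown), the polynomial degree, and all function names are hypothetical.

```python
def poly_features(n_reviews, sentiment, degree=2):
    """Build [1, n, n^2, ..., s, s^2, ...] for one resource (assumed feature set)."""
    feats = [1.0]
    for d in range(1, degree + 1):
        feats.append(float(n_reviews) ** d)
    for d in range(1, degree + 1):
        feats.append(float(sentiment) ** d)
    return feats

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit(X, y, ridge=1e-8):
    """Least squares via the normal equations (X^T X + ridge*I) w = X^T y.

    The tiny ridge term only guards against a singular system.
    """
    k = len(X[0])
    A = [[sum(row[a] * row[b] for row in X) + (ridge if a == b else 0.0)
          for b in range(k)] for a in range(k)]
    rhs = [sum(row[a] * t for row, t in zip(X, y)) for a in range(k)]
    return solve_linear(A, rhs)

def predict(w, feats):
    """Predicted popularity for one feature vector."""
    return sum(wi * fi for wi, fi in zip(w, feats))
```

Given feature rows built with `poly_features` for resources whose later popularity is known, `fit` returns the weights, and `predict` scores new resources so that the highest-scoring ones can be recommended. Leaving out the sentiment terms gives a statistics-only baseline for comparison.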
For each of the problems above, we conduct experiments on real datasets, and the experimental results show that our proposed methods improve performance.

    Research areas

  • Recommender systems (Information filtering)