Rating- and ranking- oriented collaborative filtering techniques for recommender systems
推薦系統中評分導向與排名導向的協同過濾技術
Student thesis: Master's Thesis
Author(s)
Related Research Unit(s)
Detail(s)
Awarding Institution | |
---|---|
Supervisors/Advisors | |
Award date | 15 Jul 2014 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/theses/theses(522be720-4996-4ada-8f0f-d65601fe53ca).html |
---|---|
Abstract
Recommender systems are now applied to almost all kinds of web services, including e-commerce (Amazon.com), video services (Youtube.com), and social networks (Facebook.com). Collaborative filtering (CF) is one of the most popular strategies for recommender systems. CF makes predictions about a user's interests by analyzing the past preferences of many users. In general, CF can be classified into two categories according to the objective of the recommender system: rating-oriented CF and ranking-oriented CF. Rating-oriented CF aims to predict the exact rating values that users would give to resources. Ranking-oriented CF, on the other hand, focuses on producing the best ordering of resources based on user preferences, regardless of the precision of the predicted rating values.
The first issue we explore concerns rating-oriented CF, where we incorporate new regression models into the matrix-factorization-based model. The traditional matrix factorization approach for rating-oriented CF uses gradient descent to minimize an objective function that is in fact a ridge regression model. To incorporate other regression models into the matrix-factorization-based model, we adopt a different optimization algorithm, alternating regression. We investigate models with least absolute shrinkage and selection operator (Lasso) regularization and with elastic net regularization, and find that the model with elastic net regularization yields sparse latent matrices while achieving performance comparable to the traditional model.
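The alternating-regression idea described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function names, the coordinate-descent solver, the toy hyperparameters, and the exact scaling of the penalty terms are all assumptions. Each user vector is refit by an elastic net regression against the current item vectors, and then each item vector is refit against the current user vectors.

```python
import numpy as np

def soft_threshold(x, t):
    """Soft-thresholding operator that handles the L1 part of the elastic net."""
    return np.sign(x) * max(abs(x) - t, 0.0)

def elastic_net_cd(X, y, w, l1, l2, n_sweeps=5):
    """Coordinate descent on 0.5*||y - Xw||^2 + l1*||w||_1 + 0.5*l2*||w||^2.

    Assumes l2 > 0 so every coordinate update is well defined.
    """
    w = w.copy()
    for _ in range(n_sweeps):
        for j in range(X.shape[1]):
            # Residual with coordinate j's own contribution added back.
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r
            w[j] = soft_threshold(rho, l1) / (X[:, j] @ X[:, j] + l2)
    return w

def objective(R, mask, P, Q, l1, l2):
    """Squared error on observed ratings plus the elastic net penalty on P and Q."""
    err = (R - P @ Q.T)[mask]
    penalty = l1 * (np.abs(P).sum() + np.abs(Q).sum())
    penalty += 0.5 * l2 * ((P ** 2).sum() + (Q ** 2).sum())
    return 0.5 * (err ** 2).sum() + penalty

def factorize(R, mask, k=2, l1=0.1, l2=0.1, n_iters=20, seed=0):
    """Alternate between per-user and per-item elastic net regressions.

    R is a (users x items) rating matrix; mask is a boolean array marking
    which entries of R were actually observed.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.uniform(0.1, 1.0, size=(n_users, k))  # user latent factors
    Q = rng.uniform(0.1, 1.0, size=(n_items, k))  # item latent factors
    for _ in range(n_iters):
        for u in range(n_users):
            obs = mask[u]
            if obs.any():
                P[u] = elastic_net_cd(Q[obs], R[u, obs], P[u], l1, l2)
        for i in range(n_items):
            obs = mask[:, i]
            if obs.any():
                Q[i] = elastic_net_cd(P[obs], R[obs, i], Q[i], l1, l2)
    return P, Q
```

Setting `l1` to zero reduces each sub-problem to ridge regression, while a positive `l1` can drive individual latent entries exactly to zero, which is the mechanism by which an L1 term can produce sparse factors.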
The second problem addressed in this thesis is the analysis of ranking-oriented CF from the perspective of the loss function. This perspective is helpful because it exposes the properties of an algorithm directly, regardless of how complicated its modelling is. To gain insight into the popularity bias problem, we also study the tendency of a CF algorithm to recommend the most popular resources. We demonstrate that some previous models work well because their algorithms boost popular resources; we call this property popularity tendency. We find that some sophisticated models work better than the popularity-based algorithm because they incorporate personalized recommendation (personalization) while keeping popularity tendency (non-personalization). To control the trade-off between personalization and popularity tendency, we propose two new models, one with the generalized logistic loss function and one with the hinge loss function. By changing the parameters of the proposed models, the trade-off between personalization and popularity tendency can be adjusted so that the optimal trade-off can be found.
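The abstract does not give the functional forms of the two losses, so the following is only an illustrative sketch under stated assumptions. It applies each loss to a pairwise score difference `diff` (the predicted score of a preferred resource minus that of a less preferred one). The hinge loss is the standard form; the `alpha`-scaled logistic loss is one possible parameterization of a "generalized" logistic loss and is an assumption, not necessarily the one used in the thesis. The names `margin` and `alpha` are hypothetical.

```python
import numpy as np

def hinge_loss(diff, margin=1.0):
    """Standard hinge loss: zero once the preferred item leads by `margin`."""
    return np.maximum(0.0, margin - diff)

def generalized_logistic_loss(diff, alpha=1.0):
    """Assumed form: (1/alpha) * log(1 + exp(-alpha * diff)).

    np.logaddexp(0, x) computes log(1 + exp(x)) in a numerically stable way.
    """
    return np.logaddexp(0.0, -alpha * diff) / alpha

# Compare the two loss shapes on a few score differences.
diffs = np.array([-2.0, 0.0, 2.0])
print(hinge_loss(diffs))
print(generalized_logistic_loss(diffs, alpha=2.0))
```

In this sketch, `margin` and `alpha` change how strongly mis-ordered pairs are penalized relative to correctly ordered ones; how such parameters map onto the personalization and popularity trade-off is the subject of the thesis itself and is not reproduced here.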
Finally, we propose a special recommendation strategy based on predicting the popularity of resources. Some resources gain popularity in a community while others do not. Popularity prediction means predicting the future popularity of a resource from the feedback that users give during the early period. The resources predicted to be most popular can then be recommended to users. Traditional popularity prediction models use only statistics of the reviews, for example the number of reviews and the average word length. They do not consider the content of the reviews, which can reflect user preferences. We propose performing sentiment analysis on the review content, that is, extracting a sentiment score from each review and treating it as one of its features. A polynomial regression model that includes the sentiment score feature is developed to predict the popularity of resources.
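As a rough illustration of the pipeline just described, the sketch below scores reviews with a toy word lexicon and fits a polynomial regression by least squares. Everything specific here is an assumption made for the example: the lexicon, the averaging rule, the choice of degree, and the omission of cross terms are hypothetical and are not taken from the thesis.

```python
import numpy as np

# Hypothetical toy lexicon; a real system would use a proper sentiment analyzer.
LEXICON = {"great": 1.0, "good": 0.5, "bad": -0.5, "terrible": -1.0}

def sentiment_score(review):
    """Average lexicon score of the words in a review (0.0 if none match)."""
    scores = [LEXICON[w] for w in review.lower().split() if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

def poly_features(X, degree=2):
    """Bias column plus element-wise powers 1..degree of each feature (no cross terms)."""
    X = np.asarray(X, dtype=float)
    cols = [np.ones((X.shape[0], 1))] + [X ** d for d in range(1, degree + 1)]
    return np.hstack(cols)

def fit_poly(X, y, degree=2):
    """Least-squares fit of the polynomial regression weights."""
    A = poly_features(X, degree)
    w, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return w

def predict_poly(X, w, degree=2):
    """Predicted popularity for each row of X."""
    return poly_features(X, degree) @ w
```

Each row of `X` would hold the early-period features of one resource, for example the number of reviews and the mean `sentiment_score` of those reviews, and `y` would hold the popularity observed later.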
For each problem mentioned above, we conduct experiments on real datasets, and the experimental results show that our proposed methods improve performance.
- Recommender systems (Information filtering)