
GS2-RS: Generating Self-Serendipity Preference in Recommender Systems for Addressing Cold Start Problems

Yuanbo Xu, Yongjian Yang, En Wang
Abstract

Classical accuracy-oriented Recommender Systems (RSs) typically face the cold-start problem and the filter-bubble problem: users suffer familiar, repeated, and even predictable recommendations, which leave them bored and unsatisfied. To address these issues, serendipity-oriented RSs have been proposed to recommend appealing and valuable items that deviate significantly from users' historical interactions, satisfying users by introducing unexplored but relevant candidate items. In this paper, we devise a novel serendipity-oriented recommender system (Generative Self-Serendipity Recommender System, GS2-RS) that generates users' self-serendipity preferences to enhance recommendation performance. Specifically, the model extracts users' interest and satisfaction preferences, generates virtual but convincing neighbors' preferences from the users themselves, and derives their self-serendipity preferences. These preferences are then injected into the rating matrix as additional information for RS models. Note that GS2-RS can not only tackle the cold-start problem but also provide diverse yet relevant recommendations to relieve the filter-bubble problem. Extensive experiments on benchmark datasets illustrate that the proposed GS2-RS model significantly outperforms state-of-the-art baseline approaches on serendipity measures while maintaining stable accuracy.

Introduction

Recent decades have witnessed the magnificent success of recommender systems in both industry and academia. As an important tool for filtering enormous amounts of information, a proper recommender system aims to select relevant candidate items for target users while simultaneously extracting their personalized preferences from their historical shopping logs. To achieve this goal, collaborative filtering (CF) (Zou et al. 2020) and matrix factorization (MF) (Chen et al. 2020b) have been the most popular algorithms and have been developed for decades. However, as shown in the recent research literature, conventional CF and MF models often suffer from cold-start situations (Chae et al. 2020), filter bubbles (Kapoor et al. 2015), and overfitting (Feldman, Frostig, and Hardt 2019), which result in inaccurate, irrelevant, and repeated recommendations. To make things worse, these recommendations might irritate users and hurt their shopping experience.

Figure 1: Illustration of our proposed GS2-RS framework for recommendations.

In terms of data sparsity (the sparsity of the user-item interaction matrix), the Cold-Start (CS) problem and the Filter-Bubble (FB) problem are the vital issues that limit the performance of existing recommendation models. The CS problem is a common but serious challenge for recommender systems. E-commerce websites, such as Amazon, Yelp, and Taobao, usually have millions of users and items. However, the feedback interactions between them (clicks, browses, purchases, ratings, etc.; in this paper, we focus on ratings) account for only a minimal fraction. Obviously, without enough historical interactions, it is difficult for CF models to understand the preferences of users who do not have enough ratings, thereby leading to inaccurate recommendations. To relieve this cold-start situation, some existing recommendation models prefer to recommend the most popular items to cold-start users to achieve a better expected recommendation accuracy. Nevertheless, such similar, repeated recommendations to most users without personalization result in another critical challenge in recommender systems, the filter-bubble problem, especially for MF models. With these homogeneous recommendations, the popular items' interaction weights increase in MF models. Naturally, those items get more chances to be explored by users than other items should, dominating the recommendation results, which is a typical instance of the Matthew Effect. What is worse, the CS and FB problems usually appear in pairs and reinforce each other, which seriously impacts recommendation quality.

To address the filter-bubble problem, most researchers so far have focused on exploiting auxiliary information such as users' attributes (Bi et al. 2020), users' social relations (Fu et al. 2021), and items' description text (Chae et al. 2020) or reviews (Wang, Ounis, and Macdonald 2021). Along this line, serendipity-oriented recommender systems have been proposed to recommend unexpected but valuable items to users (Yang et al. 2018; Ziarani and Ravanmehr 2021). However, these models are effective and useful only when such auxiliary information is available. Besides, over-utilizing this auxiliary information may cause privacy disclosure problems (Burbach et al. 2018). For the cold-start problem, some studies employed novel data frameworks to extend existing recommendation models, such as the multi-layer perceptron (He et al. 2017), recurrent neural networks (Xu et al. 2019), or the latest graph convolutional networks (Zhang et al. 2020), to find latent neighbors or similar latent representations of cold-start users/items. However, these neural network-based models require huge computing power and are costly to deploy. Our research focuses on the original rating-based RS setting, a Top-K recommendation task with only one user-item rating matrix, without requiring any auxiliary information or changing the existing recommendation models. In this context, data completion has been the most popular approach to tackling the CS problem (Chae et al. 2020; Kim and Suh 2019; Ziarani and Ravanmehr 2021). These models fill the sparse user-item rating matrix by inferring the data distribution from existing ratings to relieve the cold-start problem. Nevertheless, data completion models are often too coarse to give a reasonable recommendation (for example, they cannot extract users' preferences on items), so they cannot address the filter-bubble problem.

In this paper, we propose a novel recommender system framework that can tackle the cold-start problem and the filter-bubble problem at the same time. Different from existing neural network-based models or data completion models, our core idea is to generate virtual but convincing user preferences on items, drop the impossible items from the candidate item set, and thereby enhance most existing recommendation models. Specifically, GS2-RS, which stands for Generative Self-Serendipity Recommender System, consists of the following three modules: 1) preference modelling: we analyze the user-item rating matrix to form users' historical interest and satisfaction preferences, then train separate Conditional Generative Adversarial Nets (CGANs) to generate users' virtual preferences (interest and satisfaction) on candidate items; 2) self-serendipity fusion and matrix injection: we devise a gate mechanism to combine users' interests and satisfactions into their self-serendipity preferences, and GS2-RS then drops the impossible items from the candidate item set by filling 0s into the user-item rating matrix, which builds an enhanced rating matrix; 3) recommendation: GS2-RS feeds this enhanced rating matrix to any existing recommendation model. Our proposed model can achieve a personalized, customized recommendation by tuning the gate threshold in the matrix injection stage.

To the best of our knowledge, our work is the first attempt to employ GANs to build a serendipity-oriented recommender system that tackles both the CS and FB problems at the same time (as shown in Figure 1). The contributions of our proposed model are summarized as follows. (i) We propose a novel serendipity-oriented recommendation framework (GS2-RS), which can tackle both the cold-start and filter-bubble problems without any auxiliary information. (ii) GS2-RS utilizes GANs to generate users' serendipity preferences from only the user-item rating matrix and makes explainable, personalized recommendations with a delicate matrix injection method. Notably, GS2-RS can be treated as a preprocessing step for any existing recommendation model. (iii) We conduct extensive empirical studies on public datasets and find that GS2-RS achieves superior recommendation performance on both accuracy and serendipity metrics.

GS2-RS: Generative Self-Serendipity Recommender System

Preliminary and Problem Statement

Given a recommender system with $M$ users and $N$ items, the user-item rating matrix $R$ is an $M \times N$ low-rank sparse matrix, where each entry $r_{ij}$ stands for the rating that user $i$ gave item $j$, ranging over $\{1,2,3,4,5\}$. Note that in real-world scenarios, $R$ is usually extremely sparse (more than 90% of the data is unknown to learning models), meaning there are many "?" entries in $R$.
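For concreteness, such a matrix might be represented as below; this is a minimal illustrative sketch in which np.nan stands in for the unknown "?" entries (the toy values are invented, not taken from any dataset).

```python
import numpy as np

# Toy user-item rating matrix R with M=4 users and N=5 items.
# np.nan plays the role of "?"; observed ratings range over {1,...,5}.
R = np.array([
    [5.0, np.nan, 3.0, np.nan, np.nan],
    [np.nan, 4.0, np.nan, np.nan, 1.0],
    [2.0, np.nan, np.nan, 5.0, np.nan],
    [np.nan, np.nan, 4.0, np.nan, np.nan],
])

sparsity = np.isnan(R).mean()          # fraction of unknown entries
print(f"sparsity: {sparsity:.2%}")     # real matrices exceed 90%
```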

We focus on the original rating-based RS setting: for a Top-K recommendation task, the input of our model is only the original user-item rating matrix $R$. However, the sparsity of $R$ usually leads to the cold-start (CS) problem and the filter-bubble (FB) problem (Chae et al. 2020), as shown in Figure 2. Cold-start problem: the extreme sparsity may mislead RS models into inferring inaccurate preferences of users on candidate items, which causes item- and user-based CS problems. Filter-bubble problem: some models pay so much attention to existing ratings that the more ratings an item has, the more chances it has to be recommended to users. In Figure 2, as an FB problem, the item with ratings [1,4,4] dominates the other items in some existing models, such as CF models (Chen et al. 2020a). Both problems degrade recommender systems' performance and hurt users' shopping experience.

Figure 2: A simple example to explain the CS and FB problems directly.

To tackle the above problems, we are inspired by (Chae et al. 2020), which utilizes GANs to generate users' virtual neighbors for enhancing CF models. However, that model cannot consider the FB problem because it directly generates the virtual neighbors' ratings without inferring users' preferences. Along this line, we first introduce users' two types of preferences, interest and satisfaction. Interest: for a user $u$ and an item $i$ in the candidate itemset, $i \in I_u^{\text{can}}$, if $r_{ui} \neq ?$, meaning that $u$ had the interest to buy $i$, we define the user's interest preference $r^{\text{in}}_{ui} = 1$, with $r^{\text{in}}_{ui} \in R^{\text{in}}$, $i \in I_u^{\text{in}}$. Satisfaction: for a user $u$ and an item $i \in I_u^{\text{in}}$, if $r_{ui} \geq \alpha_{ui}$, we define the user's satisfaction preference $r^{\text{sa}}_{ui} = 1$, $i \in I_u^{\text{sa}}$; else $r^{\text{sa}}_{ui} = 0$, with $r^{\text{sa}}_{ui} \in R^{\text{sa}}$. Note that the threshold $\alpha_{ui}$ can be defined as either $u$'s or $i$'s average rating, or another contextual value. In our proposed model, we first extract users' historical preferences, as shown in Figure 3.
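This extraction reads directly as matrix operations. The following is a minimal sketch, assuming ratings are stored in a dense array with np.nan for "?" and taking $\alpha_{ui}$ to be the user's average rating (one of the options mentioned above):

```python
import numpy as np

def extract_preferences(R):
    """Split the rating matrix R into the interest matrix R_in and the
    satisfaction matrix R_sa, per the definitions above. Unobserved
    entries ("?") are np.nan in R and stay unknown in both outputs."""
    observed = ~np.isnan(R)
    # alpha_ui: here the user's average observed rating, broadcast per row.
    alpha = np.nanmean(R, axis=1, keepdims=True)
    # Interest: 1 wherever the user interacted with the item at all.
    R_in = np.where(observed, 1.0, np.nan)
    # Satisfaction: 1 if the rating reaches the threshold, else 0.
    R_sa = np.where(observed, (R >= alpha).astype(float), np.nan)
    return R_in, R_sa

R = np.array([[5.0, np.nan, 3.0],
              [np.nan, 4.0, 1.0]])
R_in, R_sa = extract_preferences(R)   # R_sa row 0: [1, ?, 0]; row 1: [?, 1, 0]
```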

Figure 3: Historical preference extraction. $\alpha_{ui}$ is user $u$'s average rating. Note that for $R^{\text{sa}}$'s fourth row, we take both $u$'s and $i$'s average ratings ((2+2)/2 = 2; (5+2)/2 = 3.5) into consideration as the thresholds.

Meanwhile, we analyze users' preferences at a fine-grained level. We introduce users' serendipity items: items with high relevance but low shopping purpose (Yang et al. 2018). In common situations, these items can produce a wonderful purchasing experience once users buy them, so recommending serendipity items achieves better diversity and satisfaction. By considering serendipity items, the filter-bubble problem can be relieved effectively.

Multiple Preference Modelling

This section introduces how we deduce users' multiple preferences, including users' virtual interest and satisfaction preferences, based on Conditional Generative Adversarial Nets (CGAN) (Mirza and Osindero 2014), a framework for training generative models on complicated, high-dimensional real-world data such as images. Specifically, CGAN is an extension of the original GAN (Goodfellow et al. 2014): it allows a generative model $\mathcal{G}$ to produce data according to a specific condition vector $\mathbf{c}$ by treating the desired condition vector as an additional input alongside the random noise input $\mathbf{z}$. Thus, CGAN's objective function is formulated as follows:

$$V(\mathcal{D},\mathcal{G})=\mathbb{E}_{\mathbf{x}\sim p_{\text{data}}(\mathbf{x})}[\ln\mathcal{D}(\mathbf{x}\,|\,\mathbf{c})]-\mathbb{E}_{\mathbf{z}\sim p_{\mathbf{z}}(\mathbf{z})}[\ln\mathcal{D}(\mathcal{G}(\mathbf{z}\,|\,\mathbf{c}))], \quad (1)$$

where $\mathbf{x}$ is ground truth data from the data distribution $p_{\text{data}}$, $\mathbf{z}$ is a noise input vector sampled from the known prior $p_{\mathbf{z}}$, and $\mathcal{G}(\mathbf{z})$ is synthetic data from the generator distribution $p_{\mathbf{g}}$. $\mathbf{c}$ corresponds to a condition vector such as a one-hot vector of a specific class label. The optimal objective is $\min_{\mathcal{G}}\max_{\mathcal{D}}V(\mathcal{D},\mathcal{G})$. Ideally, the fully trained $\mathcal{G}$ is expected to generate data so realistic that the discriminative model $\mathcal{D}$'s estimated probability that the data came from the ground truth rather than from $\mathcal{G}$ equals 0.5.
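To make the conditioning concrete, the sketch below shows the usual way $\mathcal{G}(\mathbf{z}\,|\,\mathbf{c})$ and $\mathcal{D}(\mathbf{x}\,|\,\mathbf{c})$ are realized, by concatenating the condition vector with the other input. It is written in PyTorch; the layer widths and activations are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """G(z | c): conditions by concatenating the noise z with c."""
    def __init__(self, noise_dim, cond_dim, out_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim), nn.Sigmoid(),  # outputs in (0, 1)
        )

    def forward(self, z, c):
        return self.net(torch.cat([z, c], dim=-1))

class ConditionalDiscriminator(nn.Module):
    """D(x | c): scores how likely x is real, given condition c."""
    def __init__(self, in_dim, cond_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),  # probability-like score
        )

    def forward(self, x, c):
        return self.net(torch.cat([x, c], dim=-1))
```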

Figure 4: CGAN model training.

After extracting users' interest and satisfaction preferences $R^{\text{in}}, R^{\text{sa}}$, we build two CGANs for each type of preference. The training procedure is shown in Figure 4. Note that we apply the same CGAN framework to interest and satisfaction preferences, respectively. For the sake of simplicity, we take interest preferences as the example and briefly introduce how this framework can be properly deployed on satisfaction preferences. Formally, we train our two CGANs as follows:

$$\begin{aligned}V(\mathcal{D}_{\text{r}}^{\text{in}},\mathcal{G}_{\text{r}}^{\text{in}},\mathcal{D}_{\text{f}}^{\text{in}},\mathcal{G}_{\text{f}}^{\text{in}})\simeq{}&\frac{1}{|U|}\sum_{u\in U}\Big(\ln\mathcal{D}_{\text{r}}^{\text{in}}(\mathbf{r}_{u}^{\text{in}}\,|\,\mathbf{c}_{u}^{\text{in}})-\ln\mathcal{D}_{\text{r}}^{\text{in}}\big(\mathcal{G}_{\text{r}}^{\text{in}}(\mathbf{z}_{u}^{\text{in}}\,|\,\mathbf{c}_{u}^{\text{in}})\bullet\mathbf{f}_{u}^{\text{in}}\big)\Big)\\&+\frac{1}{|U|}\sum_{u\in U}\Big(\ln\mathcal{D}_{\text{f}}^{\text{in}}(\mathbf{f}_{u}^{\text{in}}\,|\,\mathbf{c}_{u}^{\text{in}})-\ln\mathcal{D}_{\text{f}}^{\text{in}}\big(\mathcal{G}_{\text{f}}^{\text{in}}(\mathbf{z}_{u}^{\text{in}}\,|\,\mathbf{c}_{u}^{\text{in}})\big)\Big),\end{aligned} \quad (2)$$

where $\mathbf{c}_{u}^{\text{in}}$ denotes $u$'s interest condition vector, $\mathbf{z}_{u}^{\text{in}}$ denotes $u$'s noise vector, $\mathbf{r}_{u}^{\text{in}}$ denotes $u$'s interest vector, and $\mathbf{f}_{u}^{\text{in}}$ denotes the interest indicator vector. Note that $\mathbf{f}_{u}^{\text{in}}$ has the same dimension as $\mathbf{r}_{u}^{\text{in}}$, and each entry $f_{ui}^{\text{in}}$ of $\mathbf{f}_{u}^{\text{in}}$ is $1/0$ to indicate whether there is an interest value for item $i$.

In Formula 2, $\mathcal{G}_{\text{r}}^{\text{in}}$ and $\mathcal{G}_{\text{f}}^{\text{in}}$ are employed to produce $u$'s synthetic interest vector and interest indicator vector, denoted $\overline{\mathbf{r}}_{u}^{\text{in}}$ and $\overline{\mathbf{f}}_{u}^{\text{in}}$, while $\mathcal{D}_{\text{r}}^{\text{in}}$ and $\mathcal{D}_{\text{f}}^{\text{in}}$ are employed to distinguish the real interest vector $\mathbf{r}_{u}^{\text{in}}$ and indicator vector $\mathbf{f}_{u}^{\text{in}}$ from the synthetic vectors $\overline{\mathbf{r}}_{u}^{\text{in}}$ and $\overline{\mathbf{f}}_{u}^{\text{in}}$, respectively. Specifically, there are two designs in this framework: first, each $\mathcal{G}$'s output values are restricted to the range $(0, 1)$ for computation via Layer Normalization; second, a masking operation, $\mathcal{G}_{\text{r}}^{\text{in}}(\mathbf{z}_{u}^{\text{in}}\,|\,\mathbf{c}_{u}^{\text{in}})\bullet\mathbf{f}_{u}^{\text{in}}$, is employed to avoid useless computations. This element-wise dot operation forces the discriminators to focus on the observed values $r^{\text{in}}_{ui}$ in the interest vector $\mathbf{r}_{u}^{\text{in}}$, which contribute to the computing results. Besides, the masking operation relieves the data sparsity issue. All $\mathcal{D},\mathcal{G}$ are co-trained and deployed as DNNs, where stochastic gradient descent (SGD) with minibatches and the back-propagation algorithm are employed.
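As a sketch of how the masked rating branch of Formula 2 translates into code, assuming the generator/discriminator interfaces sketched earlier (the epsilon for numerical stability is an added assumption):

```python
import torch

def interest_rating_loss(D_r, G_r, r_real, f_real, c, z, eps=1e-8):
    """Batch estimate of the first term of Formula 2.
    r_real: real interest vectors; f_real: 0/1 indicator vectors;
    c: condition vectors; z: noise vectors. The element-wise product
    with f_real masks the synthetic vector so the discriminator only
    sees positions where a real interest value exists."""
    r_fake = G_r(z, c)                       # G(z | c)
    score_real = D_r(r_real, c)              # D(r | c)
    score_fake = D_r(r_fake * f_real, c)     # D(G(z|c) . f | c), masked
    # ln D(r|c) - ln D(G(z|c) . f | c), averaged over users in the batch
    return (torch.log(score_real + eps) - torch.log(score_fake + eps)).mean()
```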

Figure 5: Usage of GS2-RS for enhancing SOTA recommender systems.

Similar to the interest preferences, satisfaction preferences can be modeled as follows:

$$\begin{aligned}V(\mathcal{D}_{\text{r}}^{\text{sa}},\mathcal{G}_{\text{r}}^{\text{sa}},\mathcal{D}_{\text{f}}^{\text{sa}},\mathcal{G}_{\text{f}}^{\text{sa}})\simeq{}&\frac{1}{|U|}\sum_{u\in U}\Big(\ln\mathcal{D}_{\text{r}}^{\text{sa}}(\mathbf{r}_{u}^{\text{sa}}\,|\,\mathbf{c}_{u}^{\text{sa}})-\ln\mathcal{D}_{\text{r}}^{\text{sa}}\big(\mathcal{G}_{\text{r}}^{\text{sa}}(\mathbf{z}_{u}^{\text{sa}}\,|\,\mathbf{c}_{u}^{\text{sa}})\bullet\mathbf{f}_{u}^{\text{sa}}\big)\Big)\\&+\frac{1}{|U|}\sum_{u\in U}\Big(\ln\mathcal{D}_{\text{f}}^{\text{sa}}(\mathbf{f}_{u}^{\text{sa}}\,|\,\mathbf{c}_{u}^{\text{sa}})-\ln\mathcal{D}_{\text{f}}^{\text{sa}}\big(\mathcal{G}_{\text{f}}^{\text{sa}}(\mathbf{z}_{u}^{\text{sa}}\,|\,\mathbf{c}_{u}^{\text{sa}})\big)\Big). \end{aligned} \quad (3)$$

After training our CGANs, we are ready to generate users' virtual preferences, including interest and satisfaction. With these virtual preferences, we aim to deduce users' self-serendipity preferences for accurate recommendation.

Self-serendipity Fusion and Zero Matrix Injection

The usage of GS2-RS is shown in Figure 5. With the four well-trained generators $\mathcal{G}_{\text{r}}^{\text{in}}$, $\mathcal{G}_{\text{f}}^{\text{in}}$, $\mathcal{G}_{\text{r}}^{\text{sa}}$, $\mathcal{G}_{\text{f}}^{\text{sa}}$, we feed in the target user $u$'s condition vectors $\mathbf{c}_{u}^{\text{in}}$, $\mathbf{c}_{u}^{\text{sa}}$ and noise vectors $\mathbf{z}_{u}^{\text{in}}$, $\mathbf{z}_{u}^{\text{sa}}$, and obtain the synthetic vectors $\overline{\mathbf{r}}_{u}^{\text{in}}$, $\overline{\mathbf{f}}_{u}^{\text{in}}$, $\overline{\mathbf{r}}_{u}^{\text{sa}}$, $\overline{\mathbf{f}}_{u}^{\text{sa}}$, formulated as follows:

$$\mathcal{G}_{\text{r}/\text{f}}^{*}(\mathbf{z}_{u}^{*},\mathbf{c}_{u}^{*})=\langle\overline{\mathbf{r}}_{u}^{*},\overline{\mathbf{f}}_{u}^{*}\rangle, \quad (4)$$

where $*$ denotes either $\text{in}$ or $\text{sa}$. For each target user, our proposed model can generate several synthetic $\langle\overline{\mathbf{r}}_{u}^{*},\overline{\mathbf{f}}_{u}^{*}\rangle$ pairs. Generally, we treat each vector pair as a virtual user $u'$ for the target user $u$, called a self neighbor. Each virtual user $u'$'s preferences can be formulated as follows:

$$\mathbf{r}_{u'}^{*}=\overline{\mathbf{r}}_{u}^{*}\odot\overline{\mathbf{f}}_{u}^{*}. \quad (5)$$

Note that the number of virtual users can be tuned for better performance; we discuss this in the experiment section. For the sake of simplicity, we only generate $t=2$ self neighbors here for explanation, as shown in Figure 5. Then we feed the virtual users $u'$'s preference vectors and the original user $u$'s preference vector $\mathbf{r}^{*}_{u}$ into existing CF models to achieve the self preference $\overline{\mathbf{r}}^{*}_{u}$ (Formula 6), for interest and satisfaction, respectively. This operation generates two enhanced preference matrices $\bar{R}^{\text{in}}$, $\bar{R}^{\text{sa}}$. Note that these two matrices can be obtained with different operations over the original vectors and their virtual neighbors, such as averaging, a threshold mechanism, collaborative filtering, etc.

$$\overline{\mathbf{r}}^{*}_{u}=\text{CFmodel}(\mathbf{r}^{*}_{u},\mathbf{r}^{*}_{u'_{1}},\ldots,\mathbf{r}^{*}_{u'_{t}}). \quad (6)$$
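A minimal sketch of Formulas 5 and 6 together, with $t=2$ self neighbors mirroring the running example. A simple nan-aware average stands in for "CFmodel" (the paper allows averaging, thresholding, or CF variants here), and treating masked-out entries as unknown rather than zero is our reading of the "?" entries in the fused vectors:

```python
import numpy as np

def virtual_preference(r_bar, f_bar):
    """Formula 5: a self neighbor's preference is r-bar masked by f-bar."""
    r = r_bar * f_bar
    return np.where(f_bar > 0, r, np.nan)   # masked-out entries stay "?"

def fuse_self_preference(r_u, neighbors):
    """Formula 6 with averaging standing in for CFmodel: fuse the user's
    own vector with t generated self-neighbor vectors, ignoring "?"."""
    stacked = np.vstack([r_u] + list(neighbors))
    return np.nanmean(stacked, axis=0)

r_u = np.array([1.0, np.nan, 0.0])                    # user's own preferences
n1 = virtual_preference(np.array([0.8, 0.6, 0.5]),    # generated r-bar
                        np.array([1.0, 1.0, 0.0]))    # generated f-bar
n2 = virtual_preference(np.array([0.9, 0.7, 0.4]),
                        np.array([1.0, 0.0, 1.0]))
r_bar_u = fuse_self_preference(r_u, [n1, n2])         # fused self preference
```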

Then a serendipity fusion operation is employed to mark each potential candidate item for target users. First, recall the definition of serendipity items: items with high relevance but low shopping purpose (Yang et al. 2018). With this consideration, intuitively, items with high satisfaction but low interest should be marked as serendipity items, meaning the target user would have a much better experience after buying them. To achieve the serendipity item set, we set thresholds $\theta^{\text{in}}$ and $\theta^{\text{sa}}$ for interest and satisfaction, respectively. An item with $r^{\text{sa}}\geq\theta^{\text{sa}}$ but $r^{\text{in}}<\theta^{\text{in}}$ is marked as $s_{ui}=1$. For each user, an indicator vector $\mathbf{s}_{u}$ marks his serendipity items.
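The marking step can be sketched as follows (a direct reading of the definition; requiring both preferences to be known is our assumption for the unknown cases, which the injection rules below handle separately):

```python
import numpy as np

def mark_serendipity(r_in, r_sa, theta_in=0.5, theta_sa=0.5):
    """s_ui = 1 for items with high satisfaction but low interest.
    r_in, r_sa: fused preference vectors with np.nan for "?"."""
    known = ~np.isnan(r_in) & ~np.isnan(r_sa)
    return (known & (r_sa >= theta_sa) & (r_in < theta_in)).astype(int)

r_in = np.array([0.9, 0.2, np.nan, 0.3])
r_sa = np.array([0.8, 0.7, 0.6, 0.1])
s_u = mark_serendipity(r_in, r_sa)   # -> [0, 1, 0, 0]
```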

One of our contributions is solving the cold-start problem. The cause of the cold-start issue is the sparsity of the user-item matrix, so we employ zero matrix injection to relieve it. Unlike existing matrix injection methods, we do not inject potential candidate items into the user-item matrix. Instead, we pick impossible items to filter the candidate items, for two reasons: 1) deducing ratings from users' preferences is usually tricky and inaccurate, with many uncontrollable factors; 2) in real-world scenarios, a target user's potential items make up a relatively small fraction of the thousands or millions of unobserved items, so adding zeros for impossible items relieves the sparsity issue greatly.

When we filter the items with the two vectors $\overline{\mathbf{r}}^{\text{in}}_{u}$ and $\overline{\mathbf{r}}^{\text{sa}}_{u}$ for user $u$'s zero injections, there are several situations with different values of $\bar{r}^{\text{in}}_{ui}$ and $\bar{r}^{\text{sa}}_{ui}$ ($i$ is the location index in both vectors). Note that $\overline{\mathbf{r}}^{\text{in}}_{u}$ and $\overline{\mathbf{r}}^{\text{sa}}_{u}$ are computed by Formula 6, and each element's value is one of $[\bar{r}^{*}_{ui}, 0, ?]$ with $0<\bar{r}^{*}_{ui}\leq 1$. Generally, we obey the following principles to inject $r^{\text{h}}_{ui}=0$ into the enhanced matrix $\mathbf{R}^{\text{h}}$ (a code sketch follows the list):

  • If $\bar{r}^{\text{in}}_{ui}<\theta^{\text{in}}$ and $\bar{r}^{\text{sa}}_{ui}<\theta^{\text{sa}}$, inject $r^{\text{h}}_{ui}=0$;

  • If one of $\bar{r}^{\text{in}}_{ui}<\theta^{\text{in}}$ or $\bar{r}^{\text{sa}}_{ui}<\theta^{\text{sa}}$ holds while the other preference is unknown ($=?$), inject $r^{\text{h}}_{ui}=0$;

  • Else, set $r^{\text{h}}_{ui}=r_{ui}$.
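A sketch of the three rules above (np.nan encodes "?"; the thresholds default to the 0.5 used in our experiments):

```python
import numpy as np

def zero_inject(R, r_in_bar, r_sa_bar, theta_in=0.5, theta_sa=0.5):
    """Build the enhanced matrix R^h from the original ratings R and the
    fused preference matrices. Injected 0s mark impossible items; all
    other entries keep their original value (rating or "?")."""
    low_in = r_in_bar < theta_in            # False where r_in_bar is nan
    low_sa = r_sa_bar < theta_sa
    unk_in = np.isnan(r_in_bar)
    unk_sa = np.isnan(r_sa_bar)

    # Rule 1: both preferences low.  Rule 2: one low, the other unknown.
    inject = (low_in & low_sa) | (low_in & unk_sa) | (low_sa & unk_in)
    R_h = R.copy()
    R_h[inject] = 0.0                       # Rule 3: everything else unchanged
    return R_h
```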

Intuitively, the impossible item set for recommendation consists of items with low interest and low satisfaction (below the thresholds). We also inject 0s for items with an unknown preference (?) on one side and a low interest/satisfaction preference on the other. These 0s relieve the user-item matrix's sparsity. Moreover, the enhanced user-item matrix $\mathbf{R}^{\text{h}}$ gives more meaningful feedback to indicate users' preferences on items and reduces the unknown feedback, which relieves the cold-start problem. For items/users with few or zero feedbacks (new items/users in the recommender system) in the original user-item matrix $\mathbf{R}$, our proposed model can inject some zeros as an initialization, which relieves the new user/item cold-start problem. Note that these zeros are temporary, not fixed: once the original user-item matrix has been updated to a large extent (many new $r_{ui}$ replacing ?), we can employ GS2-RS again to update $\mathbf{R}^{\text{h}}$.

With $\mathbf{R}^{\text{h}}$ as input (replacing the original $\mathbf{R}$), and the interest matrix $\bar{\mathbf{R}}^{\text{in}}$, satisfaction matrix $\bar{\mathbf{R}}^{\text{sa}}$, and self-serendipity matrix $\mathbf{S}$ as side-information inputs, we can achieve different types of recommendation results $\mathbf{L}$ (Top-K, CTR, next purchase, etc.) with different RS models:

$$\mathbf{L}=\text{RSmodel}(\mathbf{R}^{\text{h}},\bar{\mathbf{R}}^{\text{in}},\bar{\mathbf{R}}^{\text{sa}},\mathbf{S}). \quad (7)$$

The complete GS2-RS model is described in Algorithm 1:

Algorithm 1 Generative Self-Serendipity RS model
Input: Original user-item matrix $\mathbf{R}$.
Output: Recommendation results $\mathbf{L}$.
1:  Initialization: condition vectors $\mathbf{c}^{*}$, parameters, and thresholds.
Step 1: Preference Modelling
2:  Calculate the interest matrix $\mathbf{R}^{\text{in}}$ and satisfaction matrix $\mathbf{R}^{\text{sa}}$ with threshold $\alpha_{ui}$ (Figure 3);
3:  Train the CGANs with Formula 2 and Formula 3;
4:  Output the generators $\mathcal{G}_{\text{r}}^{\text{in}}$, $\mathcal{G}_{\text{f}}^{\text{in}}$, $\mathcal{G}_{\text{r}}^{\text{sa}}$, $\mathcal{G}_{\text{f}}^{\text{sa}}$;
Step 2: Self-serendipity Fusion & Zero Injection
5:  for each target user $u$ do
6:     Set the self-neighbor number $t$;
7:     for each self-neighbor $u'$ do
8:        Generate $\overline{\mathbf{r}}_{u}^{\text{in}}$, $\overline{\mathbf{f}}_{u}^{\text{in}}$, $\overline{\mathbf{r}}_{u}^{\text{sa}}$, $\overline{\mathbf{f}}_{u}^{\text{sa}}$ with $\mathcal{G}_{\text{r}}^{\text{in}}$, $\mathcal{G}_{\text{f}}^{\text{in}}$, $\mathcal{G}_{\text{r}}^{\text{sa}}$, $\mathcal{G}_{\text{f}}^{\text{sa}}$;
9:        Calculate $\mathbf{r}^{\text{in}}_{u'}$, $\mathbf{r}^{\text{sa}}_{u'}$ with Formula 5;
10:    Calculate $\overline{\mathbf{r}}^{\text{in}}_{u}$, $\overline{\mathbf{r}}^{\text{sa}}_{u}$ with Formula 6;
11:    Compare $\overline{\mathbf{r}}^{\text{in}}_{u}$ with $\overline{\mathbf{r}}^{\text{sa}}_{u}$ element-wise against thresholds $\theta^{\text{in}},\theta^{\text{sa}}$;
12:    Mark the self-serendipity indicator $s_{ui}$;
13:    Inject 0s to form $r^{\text{h}}_{ui}$;
14: Output $\mathbf{R}^{\text{h}}$, $\bar{\mathbf{R}}^{\text{in}}$, $\bar{\mathbf{R}}^{\text{sa}}$, and $\mathbf{S}$;
Step 3: Recommendations
15: Input $\mathbf{R}^{\text{h}}$ instead of $\mathbf{R}$ into SOTA RS models;
16: Input $\bar{\mathbf{R}}^{\text{in}}$, $\bar{\mathbf{R}}^{\text{sa}}$, and $\mathbf{S}$ as side information into SOTA RS models;
17: Calculate $\mathbf{L}$ with Formula 7;
18: return Recommendation results $\mathbf{L}$.

Recommendation Analysis

This section introduces how our proposed model enhances the SOTA recommender systems and solves the filter-bubble problem.

Enhancing CF/MF/NN based Recommenders

Existing recommender systems generally fall into three categories: collaborative filtering-based RS (CF models), matrix factorization-based RS (MF models), and neural network-based RS (NN models). GS2-RS can enhance all of them, for the following reasons. For CF models, GS2-RS provides more valuable feedback in $\mathbf{R}^{\text{h}}$ (the 0s injected during zero matrix injection), which helps the model compute distances between different users/items and select more similar users to filter the items.

For MF models, WRMF or another matrix factorization method is employed to learn users'/items' latent vectors from the user-item matrix $\mathbf{R}$ and then make recommendations. Note that all existing matrix factorization algorithms' performance is affected dramatically by the matrix's sparsity, while GS2-RS relieves the sparsity problem by replacing $\mathbf{R}$ with $\mathbf{R}^{\text{h}}$. Moreover, the 0s in $\mathbf{R}^{\text{h}}$ can also be treated as ratings, which constrain the learned latent user/item vectors for more accurate performance. Note that MF models are often employed as preprocessing for NN models to produce the input of the neural network framework, and this input is vital for NN models. Hence, our proposed model can also enhance NN models by offering a superior input. We give a discussion in the experimental section.
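To illustrate why the injected 0s constrain the factorization, the sketch below runs plain SGD matrix factorization over all non-"?" entries of $\mathbf{R}^{\text{h}}$, so the 0s act as observed (negative) signals rather than being skipped. This is a simplified stand-in for WRMF, not its full confidence-weighted objective.

```python
import numpy as np

def sgd_mf_epoch(R_h, U, V, lr=0.01, reg=0.1):
    """One SGD sweep over the observed entries of R_h (ratings and
    injected 0s alike); np.nan entries ("?") are skipped."""
    for u, i in zip(*np.where(~np.isnan(R_h))):
        err = R_h[u, i] - U[u] @ V[i]
        u_old = U[u].copy()
        U[u] += lr * (err * V[i] - reg * U[u])
        V[i] += lr * (err * u_old - reg * V[i])
    return U, V

rng = np.random.default_rng(0)
R_h = np.array([[5.0, 0.0, np.nan], [0.0, 4.0, 1.0]])
U = rng.normal(scale=0.1, size=(2, 8))   # latent user factors
V = rng.normal(scale=0.1, size=(3, 8))   # latent item factors
for _ in range(100):
    U, V = sgd_mf_epoch(R_h, U, V)
```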

Enhancing Personalized Recommendations for the Filter-Bubble problem

Personalization is an important factor in recommender systems because a boring, homogeneous recommendation is not what any individual expects. To enhance personalized recommendations, GS2-RS employs users' preferences (interest, satisfaction, and self-serendipity) to decide the recommendation order in the recommendation results $\mathbf{L}$. With the interest matrix $\bar{\mathbf{R}}^{\text{in}}$, satisfaction matrix $\bar{\mathbf{R}}^{\text{sa}}$, and self-serendipity matrix $\mathbf{S}$, we can build a fine-grained user profile for personalized recommendations. For example, we can check the percentage of interest and satisfaction items in a target user's historical records to determine the user's preference distribution, and then rerank the recommendation list in accordance with this distribution (see the sketch below). Moreover, self-serendipity items should also be considered for achieving personalized, accurate recommendations. Details are introduced in the experimental section. With this side information, the filter-bubble problem can be adequately addressed.
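A toy version of this reranking idea might look as follows (the additive serendipity boost and its weight are illustrative choices, not the paper's exact rule):

```python
import numpy as np

def rerank_with_serendipity(scores, s_u, boost=0.1, k=10):
    """Given base scores from any RS model run on R^h, nudge the marked
    serendipity items (s_u == 1) upward before taking the Top-K."""
    adjusted = scores + boost * s_u
    return np.argsort(-adjusted)[:k]

scores = np.array([0.90, 0.20, 0.55, 0.70])   # hypothetical model outputs
s_u = np.array([0, 1, 1, 0])                  # serendipity indicator for u
top2 = rerank_with_serendipity(scores, s_u, k=2)   # -> items [0, 3]
```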

Experiments

This section validates our proposed framework from three aspects: 1) how GS2-RS enhances overall recommendation performance; 2) how GS2-RS solves the cold-start and filter-bubble problems; and 3) how the threshold affects GS2-RS's performance.

Datasets

We utilize two publicly accessible datasets: Movielens (http://grouplens.org/datasets/movielens/) and Amazon (http://www.kaggle.com/snap/amazon-fine-food-reviews/). Details are given in Table 1. Grid search and 5-fold cross validation are used to find the best parameters. In our proposed GS2-RS, the thresholds are set as $\alpha_{u}=\sum_{i}r_{ui}/\#num(r_{ui})$ (the user's average observed rating) and $\theta^{\text{in}}=\theta^{\text{sa}}=0.5$. The learning rate is 0.01.

Table 1: Dataset Statistics
Datasets #Users #Items #Feedbacks Sparsity
Movielens 6,040 3,952 1,000,209 95.81%
Amazon 16,619 37,762 256,287 99.95%

Baselines

To validate GS2-RS, we select several classic and SOTA RS models as baselines: 1) Collaborative Filtering (CF) (Koren and Bell 2015) and 2) Weighted Matrix Factorization (WMF) (Koenigstein, Ram, and Shavitt 2012; Chen et al. 2020c), two widely applied RS models; 3) Neural Collaborative Filtering (NCF) (He et al. 2017), a general neural network-based recommendation framework, which employs GMF as its preprocessing; 4) Joint Variational Autoencoder (JoVA) (Askari, Szlichta, and Salehi-Abari 2021), an ensemble of two VAEs that captures user-user and item-item correlations simultaneously for recommendation; 5) Augmented Reality CF (AR-CF) (Chae et al. 2020), a GAN-based CF model applied directly on ratings.

Metrics

We employ standard metrics to validate overall recommendation performance, including Precision, Recall, Normalized Discounted Cumulative Gain (NDCG), and Mean Reciprocal Rank (MRR).

For cold-start effectiveness, we utilize the Exposure Ratio (ER) as the metric. Formally, the exposure ratio is computed as B/A, where B is the number of cold-start items exposed to at least one user and A is the total number of cold-start items. For filter-bubble effectiveness, we utilize diversity (DI) and serendipity (SE) (Yang et al. 2018) as metrics. Note that both metrics are applied to a recommendation item set, as follows ($\mathbf{I}^{\text{real}}$ is the ground truth):

$\text{DI}=\#categorynum(\mathbf{I}^{\text{rec}})/\#num(\mathbf{I}^{\text{rec}})$;

$\text{SE}=\#num(\mathbf{I}^{\text{rec}}\cap\mathbf{I}^{\text{real}}\cap(\mathbf{I}^{\text{sa}}-\mathbf{I}^{\text{in}}))/\#num(\mathbf{I}^{\text{real}})$.
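These metrics read directly as set operations; below is a minimal sketch (treating the item collections as Python sets is our implementation assumption):

```python
def diversity(rec_items, item_category):
    """DI: distinct categories in the recommendation set over its size."""
    categories = {item_category[i] for i in rec_items}
    return len(categories) / len(rec_items)

def serendipity(rec_items, real_items, sa_items, in_items):
    """SE: recommended ground-truth hits that are satisfying yet outside
    the interest set (I^sa - I^in), over the ground-truth size."""
    hits = set(rec_items) & set(real_items) & (set(sa_items) - set(in_items))
    return len(hits) / len(real_items)

def exposure_ratio(cold_items, exposed_items):
    """ER = B / A: exposed cold-start items over all cold-start items."""
    return len(set(cold_items) & set(exposed_items)) / len(cold_items)
```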

Overall Performance for Enhancing Recommendation

The overall performance on the two datasets is shown in Table 2. Generally speaking, GS2-RS outperforms all compared models on both datasets for all metrics. The improvement can be attributed to two aspects: 1) the gains of the two GAN-based models (AR-CF and GS2-RS) over the other models indicate that applying GANs to sparse matrices clearly benefits recommendation; 2) GS2-RS enhances performance by generating users' virtual preferences, which is more effective than directly generating their virtual ratings on items, as indicated by GS2-RS's improvement over AR-CF.

Figure 6: Enhancing RS models as preprocessing. (a) Precision on Movielens; (b) NDCG on Movielens.
Figure 7: Exposure ratio for cold-start items. (a) Movielens; (b) Amazon.

Moreover, as claimed, GS2-RS can be applied as preprocessing for SOTA RS models, as shown in Figure 6. We observe that our proposed model enhances the SOTA models' Precision and NDCG performance on Movielens. GS2-RS universally and consistently provides the best accuracy, and we believe the benefits are credited to GS2-RS's ability to take advantage of performance gains from the generated virtual (but plausible) user preferences as qualified training data.

Table 2: The overall performance metrics of the compared methods. Boldface denotes the winner and underline the second winner in each column. Perform.+ denotes the performance gain percentage over the second-best model. The improvement is significant with t-test p < 0.05.
Datasets Movielens Amazon
Metrics Pre.@10 Rec.@10 NDCG@10 MRR Pre.@10 Rec.@10 NDCG@10 MRR
CF 0.0193 0.0893 0.2210 0.4056 0.0088 0.0049 0.1129 0.3656
WMF 0.1541 0.1644 0.3059 0.4731 0.1032 0.1321 0.2048 0.4111
NCF 0.1873 0.1831 0.2710 0.3594 0.1321 0.1224 0.2321 0.3321
JoVA 0.2010 0.2001 0.3010 0.4687 0.1177 0.1331 0.3015 0.4014
AR-CF 0.1707 0.1127 0.3971 0.4332 0.1421 0.1644 0.4015 0.4233
GS2-RS 0.2230 0.2006 0.4449 0.4837 0.1534 0.1756 0.4333 0.5210
Perform.+ 10.94% 0.24% 12.03% 2.24% 7.95% 6.81% 7.92% 23.08%

Performance for Solving Cold-Start Problem

The problem with cold-start items is that they are difficult to recommend, so we employ the Exposure Ratio (ER) to evaluate cold-start performance. We compare NCF, JoVA, and AR-CF against GS2-RS, as shown in Figure 7. T% denotes the bottom percentage of items by number of interactions with users; we treat these items as cold-start items. We observe that NCF and JoVA have difficulty solving the cold-start problem, while AR-CF and our model outperform them considerably. Meanwhile, because our proposed model generates users' preferences rather than ratings, GS2-RS performs better than AR-CF. Considered jointly with the accuracy results in Figure 6, these results demonstrate the effectiveness of our model in solving cold-start issues while maintaining stable recommendation performance.

Performance for Solving Filter-Bubble Problem

We utilize Diversity and Serendipity to evaluate the performance in solving the filter-bubble problem, as shown in Figure 8. From the results, we observe that 1) GS2-RS improves diversity and serendipity significantly, especially on the sparse dataset Amazon; 2) JoVA achieves second-best accuracy but relatively low diversity and serendipity compared with the other baselines. With high diversity and serendipity, we can offer users attractive and exciting recommendations instead of boring, repeated ones, greatly relieving the filter-bubble problem.

Figure 8: Diversity/Serendipity for the FB problem. (a) Diversity (DI)@10; (b) Serendipity (SE)@10.
Figure 9: Effect of threshold $\theta$ on accuracy and serendipity.

Threshold Effect Analysis

We validate the effect of $\theta$, the most vital threshold of GS2-RS, ranging over (0.0, 1.0) with step 0.1. Note that we set $\theta^{\text{in}}=\theta^{\text{sa}}=\theta$ for validation, as shown in Figure 9. From the results, we observe that GS2-RS achieves the best performance at $\theta=0.5$. As $\theta$ descends to 0, GS2-RS cannot filter any items and degrades to a basic GAN-based model. As $\theta$ ascends to 1, GS2-RS drops every item, which damages its performance.

Related Work

As well-known basic problems in recommendation, the cold-start and filter-bubble problems have been explored by many researchers recently. For the cold-start problem, (Chen et al. 2020d) proposed a tagging algorithm that tags unobserved items to relieve the cold-start issue. (Bi et al. 2020) utilized cross-domain information to reduce data sparsity for the cold-start problem and achieved SOTA recommendation performance. Meanwhile, some researchers explore the usage of GANs (Goodfellow et al. 2014) for the cold-start problem: (Chae et al. 2020) generated virtual neighbors for target users and made accurate recommendations by reducing cold-start items. (Wang 2021) generated user embeddings for recommendation and improved NDCG@100 significantly. However, these frameworks do not consider generating users' preferences, especially fine-grained preferences, as our framework GS2-RS does.

For the filter-bubble problem, some researchers (Burbach et al. 2018; Koren and Bell 2015) tried to increase the diversity of the recommendation list to tackle it. (Kapoor et al. 2015) adapted recommendations to users' novelty preferences, which added diversity and relieved the filter-bubble problem. Recently, the idea of serendipity has been proposed to solve the filter-bubble problem by offering novel, diverse, and high-satisfaction recommendations. (Ziarani and Ravanmehr 2021) gave general explanations of why serendipity items can tackle filter-bubble situations. (Yang et al. 2018) proposed a matrix factorization-based model that enhances serendipity for superior recommendations. However, the remaining challenge is how to utilize serendipity within a recommender system framework to solve the cold-start and filter-bubble problems simultaneously.

Concluding Remarks

We have introduced GS2-RS, a novel framework for addressing the cold-start and filter-bubble problems with the CGAN framework and a matrix injection method. Through empirical experiments on public datasets, we demonstrated that GS2-RS is effective at dealing with both problems: GS2-RS significantly advances accuracy, diversity, and serendipity compared to SOTA RS models. In the future, we plan to extend GS2-RS to incorporate images (with GCNs), social networks (with KGs), or context (with NLP).

References

  • Askari, Szlichta, and Salehi-Abari (2021) Askari, B.; Szlichta, J.; and Salehi-Abari, A. 2021. Variational Autoencoders for Top-K Recommendation with Implicit Feedback. In SIGIR ’21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2061–2065.
  • Bi et al. (2020) Bi, Y.; Song, L.; Yao, M.; Wu, Z.; Wang, J.; and Xiao, J. 2020. A Heterogeneous Information Network based Cross Domain Insurance Recommendation System for Cold Start Users. In SIGIR ’20: The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2211–2220.
  • Burbach et al. (2018) Burbach, L.; Nakayama, J.; Plettenberg, N.; Ziefle, M.; and Valdez, A. C. 2018. User preferences in recommendation algorithms: the influence of user diversity, trust, and product category on privacy perceptions in recommender algorithms. In Proceedings of the 12th ACM Conference on Recommender Systems, RecSys, 306–310.
  • Chae et al. (2020) Chae, D.; Kim, J.; Chau, D. H.; and Kim, S. 2020. AR-CF: Augmenting Virtual Users and Items in Collaborative Filtering for Addressing Cold-Start Problems. In SIGIR ’20: The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 1251–1260.
  • Chen et al. (2020a) Chen, C.; Zhang, M.; Zhang, Y.; Ma, W.; Liu, Y.; and Ma, S. 2020a. Efficient Heterogeneous Collaborative Filtering without Negative Sampling for Recommendation. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI, 19–26.
  • Chen et al. (2020b) Chen, J.; Wang, C.; Zhou, S.; Shi, Q.; Chen, J.; Feng, Y.; and Chen, C. 2020b. Fast Adaptively Weighted Matrix Factorization for Recommendation with Implicit Feedback. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI, 3470–3477.
  • Chen et al. (2020c) Chen, J.; Wang, C.; Zhou, S.; Shi, Q.; Chen, J.; Feng, Y.; and Chen, C. 2020c. Fast Adaptively Weighted Matrix Factorization for Recommendation with Implicit Feedback. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI, 3470–3477.
  • Chen et al. (2020d) Chen, X.; Du, C.; He, X.; and Wang, J. 2020d. JIT2R: A Joint Framework for Item Tagging and Tag-based Recommendation. In SIGIR ’20: The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 1681–1684.
  • Feldman, Frostig, and Hardt (2019) Feldman, V.; Frostig, R.; and Hardt, M. 2019. The advantages of multiple classes for reducing overfitting from test set reuse. In Proceedings of the 36th International Conference on Machine Learning, ICML, volume 97, 1892–1900.
  • Fu et al. (2021) Fu, B.; Zhang, W.; Hu, G.; Dai, X.; Huang, S.; and Chen, J. 2021. Dual Side Deep Context-aware Modulation for Social Recommendation. In Proceedings of the 30th International Conference on World Wide Web, WWW, 2524–2534.
  • Goodfellow et al. (2014) Goodfellow, I. J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A. C.; and Bengio, Y. 2014. Generative Adversarial Nets. In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems, 2672–2680.
  • He et al. (2017) He, X.; Liao, L.; Zhang, H.; Nie, L.; Hu, X.; and Chua, T. 2017. Neural Collaborative Filtering. In Proceedings of the 26th International Conference on World Wide Web, WWW, 173–182.
  • Kapoor et al. (2015) Kapoor, K.; Kumar, V.; Terveen, L. G.; Konstan, J. A.; and Schrater, P. R. 2015. ”I like to explore sometimes”: Adapting to Dynamic User Novelty Preferences. In Proceedings of the 9th ACM Conference on Recommender Systems, RecSys, 19–26.
  • Kim and Suh (2019) Kim, D.; and Suh, B. 2019. Enhancing VAEs for collaborative filtering: flexible priors & gating mechanisms. In Proceedings of the 13th ACM Conference on Recommender Systems, RecSys, 403–407.
  • Koenigstein, Ram, and Shavitt (2012) Koenigstein, N.; Ram, P.; and Shavitt, Y. 2012. Efficient retrieval of recommendations in a matrix factorization framework. In 21st ACM International Conference on Information and Knowledge Management, CIKM’12, 535–544.
  • Koren and Bell (2015) Koren, Y.; and Bell, R. M. 2015. Advances in Collaborative Filtering. In Recommender Systems Handbook, 77–118. Springer.
  • Mirza and Osindero (2014) Mirza, M.; and Osindero, S. 2014. Conditional Generative Adversarial Nets. CoRR, abs/1411.1784.
  • Wang (2021) Wang, W. 2021. Learning to Recommend from Sparse Data via Generative User Feedback. In Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI, 4436–4444.
  • Wang, Ounis, and Macdonald (2021) Wang, X.; Ounis, I.; and Macdonald, C. 2021. Leveraging Review Properties for Effective Recommendation. In Proceedings of the 30th International Conference on World Wide Web, WWW, 2209–2219.
  • Xu et al. (2019) Xu, Y.; Yang, Y.; Han, J.; Wang, E.; Ming, J.; and Xiong, H. 2019. Slanderous user detection with modified recurrent neural networks in recommender system. Information Sciences, 505: 265–281.
  • Yang et al. (2018) Yang, Y.; Xu, Y.; Wang, E.; Han, J.; and Yu, Z. 2018. Improving Existing Collaborative Filtering Recommendations via Serendipity-Based Algorithm. IEEE Transactions on Multimedia., 20(7): 1888–1900.
  • Zhang et al. (2020) Zhang, S.; Yin, H.; Chen, T.; Nguyen, Q. V. H.; Huang, Z.; and Cui, L. 2020. GCN-Based User Representation Learning for Unifying Robust Recommendation and Fraudster Detection. In SIGIR ’20: The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 689–698.
  • Ziarani and Ravanmehr (2021) Ziarani, R. J.; and Ravanmehr, R. 2021. Serendipity in Recommender Systems: A Systematic Literature Review. J. Comput. Sci. Technol., 36(2): 375–396.
  • Zou et al. (2020) Zou, L.; Xia, L.; Gu, Y.; Zhao, X.; Liu, W.; Huang, J. X.; and Yin, D. 2020. Neural Interactive Collaborative Filtering. In SIGIR ’20: The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 749–758.