Against Filter Bubbles: Diversified Music Recommendation via Weighted Hypergraph Embedding Learning
Abstract
Recommender systems serve a dual purpose for users: sifting out inappropriate or mismatched information while accurately identifying items that align with their preferences. Numerous recommendation algorithms are designed to provide users with a personalized array of information tailored to their preferences. Nevertheless, excessive personalization can confine users within a “filter bubble”. Consequently, achieving the right balance between accuracy and diversity in recommendations is a pressing concern. To address this challenge, exemplified by music recommendation, we introduce the Diversified Weighted Hypergraph music Recommendation algorithm (DWHRec). In the DWHRec algorithm, the initial connections between users and listened tracks are represented by a weighted hypergraph. Simultaneously, associations between artists, albums and tags with tracks are also appended to the hypergraph. To explore users’ latent preferences, a hypergraph-based random walk embedding method is applied to the constructed hypergraph. In our investigation, accuracy is gauged by the alignment between the user and the track, whereas the array of recommended track types measures diversity. We rigorously compared DWHRec against seven state-of-the-art recommendation algorithms using two real-world music datasets. The experimental results validate DWHRec as a solution that adeptly harmonizes accuracy and diversity, delivering a more enriched musical experience. Beyond music recommendation, DWHRec can be extended to cater to other scenarios with similar data structures.
keywords:
Filter Bubbles; Diversified Recommendation; Hypergraph Convolution1 Introduction
Recommender systems (RecSys), as one of the most commonly used methods for technical information filtering, yield tangible benefits for both service providers and users [4, 17, 22]. In our current digital era, online service providers deploy these systems to deliver personalized suggestions, aiming to heighten user satisfaction by catering to a broad spectrum of needs across diverse user profiles [22, 58]. Through the utilization of recommender systems, service providers can better discern user preferences, boost product visibility, and enhance user engagement [21, 39]. For users, recommender systems offer liberation from the overwhelming sea of information and facilitate the discovery of content aligned with their tastes [40]. Many recommendation algorithms [15, 56, 14] relentlessly pursue precision in their recommendations, striving to push the boundaries of accuracy to the utmost.
Nevertheless, the excessive pursuit of recommendation accuracy may lead to personalized outcomes that exceed user expectations, thereby exposing them to the pitfalls of filter bubbles [3, 8]. The idea that filter bubbles [36] shape individuals’ thoughts by influencing the information they receive is rooted in communication, informatics, sociology and psychology [8, 20]. Numerous studies consistently highlight a reality: filter bubbles undeniably exert an impact on users within social networks [9, 52, 31]. An effective approach to bursting the bubble is to implement diversity-aware strategies [41, 46, 23]. Studies have documented that recommendation diversity is crucial in many cases, and poor diversity characteristics undermine traditional recommender systems [4, 1].
In this paper, we study the diversification problem of music recommendation, which is one of the most prevalent applications penetrating our daily lives. It emerges as a critical application of web services, witnessing millions of listening events occurring between a vast user base and tracks in the music repository every moment [17, 48, 40]. To obtain an effective diversified music RecSys, we need to consider the characteristics of the music recommendation data. Firstly, it has rich auxiliary information for each music track, including its Albums, Artists and annotated Tags. Each auxiliary information reveals one kind of relationship among the music tracks. For example, music belonging to the same album tends to have similar theme, which reveals the similarity among all music tracks within the same album. Secondly, music recommendation data is not just one-time interactions. Unlike user behavior in E-commerce, where there is usually at most one-time interaction between each user-item pair, users can interact multiple times with a single music track. This characteristic causes each single-time interaction between user-item pairs not to be convincing to speculate the user’s preference. We are unable to assume the user’s preference toward a music track if the user only listens to the track one time. On the contrary, only a one-time interaction usually indicates a disfavor toward the music track because the user never listens to it again. Thus, we need to discern the user’s preference in a more fine-grained manner.
To fully consider the aforementioned characteristics in diversified music recommendation, we introduce the Weighted Hypergraph Convolution for Recommendation (DWHRec). DWHRec first integrates the relationships between users, tracks, albums, artists and tags by constructing a unified hypergraph. Specifically, we build types of hyper-edges within DWHRec to represent the interactions between user-track, tag-track, album-track, and artist-track separately. For example, each album-track type hyper-edge connects all the music tracks within the same album. By constructing the hyper-edges within the same hyper-graph, DWHRec is able to fully utilize all the auxiliary information to construct relationships between music tracks. Besides, to describe user’s preference in a more fine-grained manner, we add weight to each hyper-edge based on the edge type. For example, the weight of each user-item hyperedge is calculated based on the interaction number, more interactions lead to larger edge weight. In this way, we can discern user’s behaviors regarding the interaction number on each item.
After fully representing the data as a hyper-graph, we propose two methods on different steps to diversify the recommendation. During the embedding step, we first perform random walks on the constructed weighted hypergraph to collect walk sequences, and then utilize a skip-gram model to obtain the user/track embedding. Compared with graph convolution with fixed aggregation neighborhood, random-walk-based methods can reach the neighborhood at a farther distance, benefiting the diversified representation of center nodes. Besides, our constructed hypergraph contains rich item relationship information, and random walking on the weighted hypergraph can acquire more diversified walk paths compared with walking on the user-item bipartite graph. During the ranking step, we also design a novel diversifying function to rank the list. DWHRec first calculates the relevance score by the dot product of user/track embedding and then re-scores the relevance with a trade-off factor to introduce diversity. Then, the final ranking list considers both relevance and diversity, easily trading off with the designed factor. In conjunction with the two designed diversifying methods, DWHRec achieves the best performance on both accuracy and diversity on the real-world music recommendation dataset, yielding its effectiveness on the diversified music recommendation task. To summarize, the main contributions of this study can be outlined as follows:
-
1.
Introduction of DWHRec: A novel method that integrates users, tracks, albums, artists, and tags into a unified hypergraph, enhancing the representation of relationships in music recommendation data.
-
2.
We propose random walks on the weighted hypergraph to generate more diversified walk paths based on different sources of information. Combined with the skip-gram model, DWHRec can obtain more diversified user/track embeddings.
-
3.
In the ranking stage, we introduce a novel diversifying function that calculates relevance scores adjusted by a trade-off factor to balance relevance and diversity in the final recommendation list.
-
4.
Extensive experiments are conducted on real-world music datasets to demonstrate the effectiveness of DWHRec and the influences of different modules. Comparative experiments show that DWHRec is capable of striking an appropriate balance between recommendation accuracy and diversity.
The remainder of this study is organized as follows. Firstly, we present a brief review of existing related works in the fields of music recommendation systems, diversified recommendation and hypergraph-based recommendation (Section 2). Secondly, we construct a diversified music recommendation algorithm based on weighted hypergraph embedding (Section 3). Thirdly, in the experimental session, we compare the proposed DWHRec with several state-of-the-art recommendation algorithms, and detailed experimental results are reported (Sections 4–5). Finally, in the concluding part of this study, we summarize the findings and envision possible extensions (Section 6).
2 Related Work
As stated afore, we leverage music recommendations to investigate filter bubbles and employ a diversified recommendation approach to alleviate the problem. Therefore, it is essential to introduce the related concepts and the current state of research. Our study aligns with three main research directions: music recommendation, diversified recommendation and hypergraph-based recommendation.
2.1 Music Recommendation
In this study, our focus is on music recommender systems. With the proliferation of digital music, the evolution of music recommendation proves beneficial for users in picking desirable music pieces from an extensive repository [47]. The various music recommendation methods developed thus far can be broadly classified into two fundamental categories: content-based and collaborative-filtering approaches.
Music is an artistic presentation of sound and is characterized by numerous acoustic features [48, 49, 6, 34, 45]. Huang [19] initially extracted audio signatures as music features from audio data, rated new music by utilizing a vector quantization method, and ultimately generated music recommendations. In contrast to recommendations that solely focus on the music content itself, Bu [5] incorporated features extracted from the Mel-frequency cepstral coefficients as a bridge to compare similarities between tracks, embedding them into a unified hypergraph architecture.
Music typically conveys some form of emotion in addition to its acoustic characteristics. Shan [47] explored the discovery of affinity in film music and proposed a generalized framework for implementing emotion-based music recommendations. To tailor recommendations to better suit the user’s current context, Hariri [13] mined popular tags for tracks from social media websites, employed a topic modeling approach to learn latent topics representing various contexts from these tags, and then transformed each track into a set of latent topics capturing the general characteristics of that track.
Another strategy, which is the focus of our attention in this study, relies solely on the past behavior of the user, excluding the content of the music itself, often difficult to access. The direct approach refers to detecting associations between users, tracks, albums and artists through available information, uncovering the latent structure between them, and deducing the tracks that users may be attracted to, providing recommendations [46]. Mao [28] corrected user preference relations found from users’ ratings with a quality model and proposed a regularization framework to calculate the recommendation probability of tracks. Knowledge graphs, as tools to reason on data to extract new and implicit information, were naturally applied to music recommendations that mine associations between users and tracks [35]. La Gatta et al. [22] suggested that hypergraph data models might be more capable of seamlessly representing all possible and complex interactions between users and tracks with related characteristics. Furthermore, music play has a natural characteristic, i.e., sequence. Specifically, Cheng [7] exploited this property to seek the relevance of tracks and attempted to leverage the information encoded in music play sequence into the matrix factorization methods [21, 25] to improve the recommendation performance.
In summary, content-based music recommendation aims to establish similarities between tracks, with less consideration for personalization, while collaborative filtering-based recommendation pays more attention to detecting potential associations between users and tracks, being more inclined to match the users’ historical preferences.
2.2 Diversified Recommendation
Over the years, numerous recommendation algorithms have been developed to construct recommender systems that process massive amounts of data, aiming to identify potential user preferences and provide optimal recommendations [42, 2, 10, 62]. Ongoing studies on recommender systems often prioritize accuracy as the sole objective [53, 51, 61]. However, most personalized recommender systems compromise the accuracy of recommended items in their pursuit of increased diversity in recommendations [12, 16]. Simultaneously, it has been pointed out that accuracy may not be the exclusive goal of a recommender system [30, 26, 54]. Recommender systems would benefit from giving more consideration to other crucial aspects, such as diversity and novelty. Research has shown that, frequently, the diversity of recommendations is significant, and traditional recommender systems tend to exhibit poor diversity characteristics [4, 1, 18, 16].
Diversified recommendations can enhance the diversity and serendipity of a recommendation list, providing surprises and satisfaction to users, as these items can remain highly relevant to users [24, 43]. Furthermore, diversified recommendations can assist in exploring long-tail items, which is another challenging area and hot spot within recommender systems research [24].
Many current diversity-oriented recommender systems adopt a fixed strategy to adjust the diversity degree for all users. Typically, they define a score function that balances diversity and accuracy with a hyper-parameter. Subsequently, the generated recommendation list is re-ranked based on the calculated scores [26, 43]. For instance, Bradley [4] focused on a variation of a quality metric achieved through a linear combination of similarity and diversity, where the relative weight of the similarity and diversity factors can be altered by adjusting a hyper-parameter. Simultaneously, a non-linear alternative form of the quality metric, computed as the simple harmonic mean of similarity and diversity, has been proposed.
Beyond the pre-defined score function mentioned above, multi-objective optimization technology offers a viable alternative for balancing accuracy and diversity [27]. Unlike single-objective optimization, which can optimize only one objective function, multi-objective optimization can concurrently optimize several objective functions to obtain the Pareto optimal solution set. Two indicators, intra-user diversity and mean absolute error were selected to evaluate recommended diversity and accuracy, respectively [27].
Taken together, this simple way of defining diversity does not apply to more complex scenarios, such as music recommendation, where relationships between multiple entities need consideration.
2.3 Hypergraph-based Recommendation
Recently, numerous studies have delved into hypergraphs [57, 37, 59]. For instance, Bu [5] introduced a unified hypergraph framework for music recommendation, incorporating diverse social media information and acoustic-based content into the algorithm. Theodoridis [50] extended Bu’s framework [5] to include group sparsity constraints, enabling the exploitation of the group structure within the data. Treating music recommendation as a hypergraph-based ranking problem, Tan [49] integrated rich social media information to identify music tracks tailored to individual user preferences. A novel music recommendation framework that leverages a hypergraph data model and hypergraph embedding techniques was delivered by La Gatta [22]. By reexamining user mobility and social relationships, a hypergraph embedding method is specifically designed for a large-scale location-based social network dataset, facilitating automatic feature learning [57]. Introducing a generic user-item-attribute-context data model summarizing various information resources and higher-order relationships for constructing a multipartite hypergraph fulfilling multi-objective recommendation needs, a solution was proposed, utilizing hypergraph ordering [29].
While these approaches have garnered considerable success in resource recommendation applications, they have yet to establish a centralized hypergraph. Our goal is to construct a hypergraph with items at the center, surrounded by other resources, aiming to enhance the strength of connections between resources.
3 Weighted Hypergraph-based Diversified music recommendation
The next key point for discussion involves around the method of seeking and establishing connections among users, tracks, albums, artists and tags. Utilizing the hypergraph, we present a music recommendation algorithm based on hypergraph embedding, meticulously designed to strike a balance the weights of accuracy and diversity. This section provides a detailed illustration of our approach.
3.1 Preliminaries
We summarize the notation in Table 1. The graph comprises two essential components: vertices and edges. A hypergraph, denoted as , consists of a vertex set and a hyperedge set . Each hyperedge encompasses an arbitrary number of vertices . For a weighted hypergraph, represented as , the vertices and hyperedges are accompanied by a weighting function : , indicating the strength of connections. A higher weight signifies closer proximity between the connected vertices. Additionally, each hyperedge is linked to a non-negative number , referred to as the weight of the hyperedge .
A hyperedge is said to contain a vertex when . The degree of a hyperedge , denoted by , is defined as the cardinality of , i.e., . If every hyperedge has a degree of , the hypergraph reduces to a normal graph. A hyperpath between vertices and exists when there is an alternative sequence of distinct vertices and hyperedges , , , , , , such that for . A hypergraph is connected if there is a path for every pair of vertices.
Finally, we obtain the vertex-hyperedge incidence matrix, . The hypergraph can be succinctly represented by of size matrix. Let indicate that a vertex is part of a hyperedge , and otherwise. The incidence matrix, , is defined by its elements:
(1) |
Logically, we define the degree of a vertex as
(2) |
and the degree of a hyperedge as the number of connected vertices, denoted as,
(3) |
Symbol | Description |
Hypergraph. | |
The set of vertices. | |
The set of hyperedges. | |
The weight on hyperedges and vertices. | |
A certain vertex within the vertex set. | |
A certain hyperedge within the hyperedge set. | |
The degree of a vertex or a hyperedge. | |
The incidence matrix. | |
The elements in the incidence matrix. | |
The set of users. | |
The set of tracks. | |
The set of artists. | |
The set of albums. | |
The set of tags. | |
A certain user within the user set. | |
A certain track within the track set. | |
A certain album within the album set. | |
A certain artist within the artist set. | |
A certain ta within the tag set. | |
The listening history. | |
The quantity of elements. |
3.2 Hypergraph Composition
The hypergraph structure is a fundamental component of our proposed algorithm. Consequently, it is essential to furnish a detailed configuration explanation for the elements within the hypergraph.
We consider five types of objects and four types of relationships. Specifically, the objects comprise users , three resource types (i.e., tracks , albums and artists ) and tags that users attach to them. The relationships within the constructed hypergraph are partitioned into actions on resources and inclusion relationships among resources. Actions relations on resources engage two types of interactions: users listening to tracks () and users tagging tracks (). It is worth noting that the relationship between users and the tagging of tracks represents the collective annotation of a particular track by all users, rather than a specific user. Inclusion relationships among resources are defined by connections between tracks and releases and between tracks and artists, signified by and , respectively.
The hypergraph is defined as a collection of vertices and hyperedges . The vertex set, , comprises distinct entities, specifically:
(1) Users (): the set of users;
(2) Albums (): the set of albums;
(3) Artists (): the set of artists;
(4) Tracks (): the set of tracks;
(5) Tags (): the set of tags that users attach to tracks.
So, the set of vertices , is defined as the union of , , , and , i.e., . Hyperedges are introduced to represent the four relationships among the aforementioned objects. In the unified hypergraph, there exist four types of hyperedges, each corresponding to a specific relationship type. To correspond with the relation set , we define the set of hyperedges as , (). For convenience and better comprehension, we provide an illustrative representation of these relationships in Table 2.
Name | Notations | |
Relations | Hyperedge Types | |
Listening tracks | ||
Tagging tracks | ||
Tracks belong to an album | ||
Tracks belong to an artist |
The construction of the four types of relations and hyperedges is listed as follows.
Definition 1 (Hyperedge ).
The first type of hyperedge, denoted as , represents the tracks that the user has listened to in the past. It consists of two parts: the user vertex and track vertices. The play count, a significant metric for listening events, is denoted as , where is the user and is the track. In , the weight of the user vertex is set to , while the weight of other track vertex is determined as follows:
(4) |
where represents the track belonging to the same hyperedge as user .
Definition 2 (Hyperedge ).
The second type of hyperedge, denoted as , represents the tags that the user has annotated to a track. It comprises two components: the track vertex and the attached tag vertices. In , the number of times the tag attached to the track is denoted as . The weight of the track vertex is set to , and the weight of other tag vertex is set to:
(5) |
where represents the tag belonging to the same hyperedge as track .
Definition 3 (Hyperedge ).
Certain connections exist between different tracks within the same album, prompting the natural construction of a hyperedge for tracks belonging to the same album. All the tracks mentioned here that share an album constitute the albums in the dataset. The third type of hyperedge, denoted as , represents tracks belonging to the same album. It contains two components: the album vertex and the tracks vertices it includes. In , the notation denotes the number of times the track is played in the album . The weight of the album vertex is set to , and the weight of other track vertex is set to:
(6) |
where refers to the track that belongs to the same hyperedge as album .
Definition 4 (Hyperedge ).
Similarly, hyperedges can be also created for tracks and artists. Multiple tracks are composed or performed by the same artist. The hyperedge captures the relationship between the artist and several tracks, designed to prevent the omission of important tracks and serving as a complement to . The vertices of encompass all tracks belonging to the artist and the artist itself. In , the frequency with which the track by the artist has been played is denoted as . The weight of the artist vertex is set to , and the weight of other track vertex is regarded as:
(7) |
where is the track belonging to the same hyperedge as artist .
3.3 Recommendation via Hypergraph
Figure 1 vividly and succinctly depicts the entire workflow of the DWHRec algorithm, as detailed in Algorithm 1, which outlines the core framework of the system. DWHRec takes two inputs: data and hyper-parameters. The data includes the user’s listening history , tracks , tags associated with the tracks, albums and artists . DWHRec receives and processes these inputs through a series of steps to generate lists of recommendations for all users. The hyper-parameters consist of the iteration counts and the number of steps in the random walks stage, as well as the vector dimension and window size in the embedding stage and the recommended list length .

DWHRec can be segmented into four stages (Algorithm 1). Firstly, a weighted hypergraph is constructed based on information from , , , and (line ). A more detailed description of this process is presented in Algorithm 2. Secondly, utilizing the constructed hypergraph, the random walks method is designed to detect potential associations between users and candidate tracks (line ). A more detailed explanation of this process is provided in Algorithm 3. Thirdly, considering the paths left by the vertex walks as sentences and the vertices as words, the skip-gram word embedding model is applied to vectorize the users and tracks vertices, obtaining their vector representations (line ). Finally, the designed scoring function takes the vectors of users and tracks as input, calculating the scores for candidate tracks. These candidate tracks are graded, and the recommendation list is generated by ranking them according to the highest rating (lines ). Through these steps, DWHRec achieves the goal of providing users with as diversified recommendations as possible.
Input: Data about listening history behaviors , tracks , tags , albums and artists , hyper-parameters include the number of iterations for random walks, the number of steps taken during each iteration of random walks, the size of the vertex vectors generated by word embedding model, the window size and the length of the recommendation list .
Output: Top- recommendation list for all users.
Method: DWHRec.
3.3.1 Hypergraph Construction
Our objective is to provide users with specific track recommendations in the form of a personalized list. When making recommendations, the goal is to cater to users’ individual preferences by suggesting tracks that are as familiar or similar as possible to their historical listening behaviors. To assist users in discovering tracks they are likely to enjoy, we model the information using a hypergraph data structure. This structure captures relationships between various entities, interconnecting all the information embedded in the data source.
As mentioned earlier, a fundamental step in DWHRec is the construction of the hypergraph, and Algorithm 2 gives the pseudo-code for this process.
Input: Information about listening history behaviors , tracks , tags , albums and artists ;
Output: ;
Method: HypergraphConstruction.
The hypergraph is defined by a triple that comprises the set of vertices and hyperedges , along with their corresponding weights . Naively, hypergraph construction process can be approximately divided into three steps, constructing the vertices set , constructing the hyperedges set and assigning the hyperedges weights . In practice, hyperedges and weights are often combined, with an initial weight being attached to a hyperedge during its construction.
An empty hypergraph is initialized with an empty set of hyperedges and vertices set , all assigned a weight of (line ). Further, the algorithm proceeds to construct the hyperedges , , and to represent user-track, tag-track, album-track and artist-track associations, respectively, assigning each corresponding hyperedge an initial weight (lines ). The vertices set is formed by concatenating the sets of users , artists , albums , tracks and tags (line ). Additionally, the hyperedges set is the union of the hyperedges , where ranges from to (line ). Finally, a hypergraph is created (line ).
In the constructed hypergraph, tracks assume a central position. Through these tracks, connections are established among users, tags, albums and artists, facilitating the integration of information. This structure enables the exploration of users’ latent preferences by intertwining information from multiple aspects.
3.3.2 Random Walks on Hypergraph
After the construction of the hypergraph, the desire is to utilize it for recommendations. However, the information stored in the hypergraph cannot be used directly, necessitating further processing. Random walks are conducted on the hypergraph to facilitate the establishment of connections between various entities.
Input: Hypergraph , the number of iterations for random walks, the number of steps taken during each iteration of random walks.
Output: Walks list .
Method: RandomWalk.
The pseudocode of the random walks process is approximately rendered in Algorithm 3. The algorithm accepts a hypergraph as input, along with two hyper-parameters: , representing the number of iterations, and , indicating the number of random walk steps. It outputs a list containing the walking results for all vertices. The variable is a list that holds the results of all vertices walks, initialized as empty (line ). For each vertex , a random walk operation is performed, and the order of the walks is recorded (lines ).
Before commencing the actual walking process, some preparation are necessary. Specifically, a variable is declared and initialized to empty, tracking the order in which travels (line ). Let the currently visiting vertex be (line ). A hyperedge is randomly picked from the hyperedges containing the vertex and marked as the currently accessed hyperedge (line ).
After the preparation is completed, the actual walking can commence. The vertex becomes the first vertex visited, and it is appended to ’s visited path (line ). For the remaining steps, the following manipulations are performed (lines ). The transition probability for a hyperedge to jump to another hyperedge is evaluated first. Based on this probability, the current hyperedge could either remain unchanged or switch to another hyperedge where the current vertex is (line ). This mechanism enables the algorithm to explore more hyperedges deeply, avoiding getting stuck in a loop inside the hyperedges. After determining the hyperedge for the next walk, the next vertex for the walk needs to be further identified. The process involves selecting a new vertex to join the random walk for the next walk based on the probabilities associated with the vertices in the hyperedge (line ). Through these procedures, a single random walk path for vertex is recorded after jumps (lines ). The process of jumps repeated times iteratively forms the complete walking trajectory for vertex (lines ). Apparently, the walking paths of all vertices constitute the (lines ).
The random walks process on the hypergraph generates walking paths for vertices. Adjacent vertices in the hypergraph are likely to be adjacent in the paths, and non-adjacent vertices are also included in the walking paths, even though they may be separated by larger distances.
3.3.3 Embedding and Recommendation
The random walks process transforms the hypergraph structure into walking paths for vertices. However, further exploration is still required, as the latent preference information of users for tracks remains concealed within these paths.
To be aware of users’ track preferences, we employ the skip-gram model [38] to learn the efficient estimation of word representations in vector space. Word2Vec [32, 44, 33] is a class of neural network models that, within the context of a given unlabeled corpus, generates vectors for words in the corpus to characterize their semantic information. A widely used model within Word2Vec is skip-gram, extensively employed in natural language processing (NLP) as an unsupervised model for learning semantic knowledge from massive text corpora. In our case, the vertices of the hypergraph are considered as words, while the random walks are treated as sentences.
After the application of the skip-gram model, the users and tracks in the hypergraph are demonstrated as vectors. Further, the user’s enjoyment of a track can be proxied by calculating the dot product between these two vectors. The well-known dot product metric can be delegated to portray user ’s favoring of track , denoted as:
(8) |
The users’ preferences for the tracks are then converted into products between the user vertex vector and the vectors associated with the tracks.
Diversity characteristics serve as a measure of how diversified the observed range of users’ music is—the diversity of music listened to, intuitively related to openness [11]. In general, diversity is defined as the degree of differentiation between items in the recommended list, and is formally expressed as:
(9) |
where is a hyper-parameter that controls the acceptable level of diversity, acting as a trade-off between accuracy and diversity.
In summary, DWHRec algorithm utilizes a hypergraph structure to model the associations between users and tracks. Subsequently, through multiple iterations of random walks on the hypergraph, it generates traversal paths for both users and tracks. Next, the skip-gram word embedding model is employed to analyze these traversal paths, representing users and tracks as vectors. Finally, by computing the diversity degree, the algorithm measures the extent of user preferences for tracks, thereby generating a diversified recommendation list.
4 Experimental Setup
The experimental work to validate the effectiveness of the algorithm is necessary, and for this purpose, extensive, intricate and prolonged preparations have been carried out. To ensure the relative fairness and transparency of the experiments, we will provide a relatively detailed description of the experimental setup.
4.1 Dataset Collection
Last.fm111http://www.last.fm/ is a well-known data gathering site widely used in the field of music recommendation research [5, 50, 35, 43]. It builds a detailed profile of each user’s musical tastes by recording the tracks they have historically interacted with. Last.fm collects these tracks from Internet radio stations or the user’s computer, for example, by transferring (“scrubbing”) them to Last.fm’s database via a music player (e.g. Spotify222https://open.spotify.com/) or a plug-in installed in the user’s music player.
Users were discovered by crawling the Last.fm social graph using the “user.getFriends” endpoint and by crawling the listening users of a certain group of artists, which were obtained via chart.getTopArtists endpoint [43]. After removing duplicates from the approximately crawled user names, a grand total of almost unique users were harvested. Due to the substantial number of listening records per user, unique users were randomly chosen. Through other Last.fm Application Programming Interfaces (APIs), a significant number of unique listening events (LEs) from these users were available.
4.2 Dataset Description
Play count measures how frequently the observed user engages in music listening [11]. For tracks with a high play count, we can assume that the user has likely enjoyed them. However, tracks that are played rarely or have not been played cannot be easily dismissed as unappealing to users. Some tracks that users cannot access have a play count of . Additionally, there are tracks that users do not currently prefer, but it does not imply that they will not prefer them in the long run as their preferences evolve. The interaction between a user and a track (or artist, album) is reflected in the fact that the user has listened to a particular track, i.e., such a listening event exists. Referring to the article [43], we also use a simple key consisting of artist and track name tuples to distinguish individual tracks.
The basic statistical information contained in the dataset is shown in Table 3. For this experiment, two Last.fm datasets were utilized, named lastfm-100k and lastfm-200k, according to the approximate size of the datasets. In the filtered datasets, lastfm-100k contains users with close to entries for listening events (LEs). In comparison, the number of users and LEs in lastfm-200k is almost twice as large, with users and more than thousand LEs. The number of tracks in the two datasets is and , respectively. Regarding the average actions per user, average actions per track and sparsity, the two datasets are quite similar, with values of vs , vs , and % vs %, respectively.
Name | Quantity | |||||
Users | Tracks | LEs | Average actions per user | Average actions per track | Sparsity | |
lastfm-100k | 501 | 25,279 | 99,861 | 199.3 | 4.0 | 99.2% |
lastfm-200k | 1,001 | 42,668 | 200,173 | 200.0 | 4.7 | 99.5% |
4.3 Experimental Setup
The algorithm DWHRec generates the recommendation list by initially calculating the scores of users and candidate tracks according to the customized scoring function, and then ranking the candidate tracks depending on the scores. To ensure that each user and each track in the testset carries its own vector representation, extra care is required in constructing the dataset. For each user, we ranked their listening records and spanned the top records (it was observed that the vast majority of the users’ listening records ranked above had single-digit listening times). In the experimental section, we stochastically split the dataset into training and test sets, with the training set comprising and the testset containing the remaining . We conducted ten experiments, iterating over all ten permutations, and the final result displayed is the mean of the ten outcomes. For baselines in our experiment, we evaluated their performance using the RecBole toolkit [64, 63, 55].
DWHRec contains six hyper-parameters, the number of iterations for random walks, the number of steps for the walks, the representation dimension of the vector, the word embedding window size , the length of the recommendation list and the weighting factor of the scoring function. In the experiment, the first five hyper-parameters are assigned values using a manually set approach, sequentially set as , , and . Regarding the diversity weighting factor , an adaptive weight is employed, where the diversity weight for the -th item to be recommended is given by .
4.4 Evaluation Metrics
Five evaluation metrics, categorized into accuracy and diversity, were applied to compare different algorithms [60]. Recall and Hit Ratio, two accuracy metrics unrelated to ordering, measured the fundamental accuracy. Ranking accuracy was assessed using two metrics: Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain (NDCG). The Aggregate Diversity (AGGR-DIV) metric was utilized to measure the diversity of recommendations.
The AGGR-DIV metric measures the degree of diversity among items in the recommendation list across all users. It is computed as the total number of unique genres on tracks in the recommendation list, considering all users. AGGR-DIV is defined as follows:
(10) |
Where represents the recommendation list for user , and represents the set of tags annotated to the tracks in . In Equation (10), is the cumulative discounted probability of over , and represents the number of occurrences of in . The notation represents the number of occurrences of in , taking a value of if is present in and otherwise. and are defined as follows:
For the track , if contains the tag , the count of is denoted as ; otherwise, the count is zero. Then, the proportion of in the tags associated with is equal to the frequency of occurrence of , denoted as . The probability of in , expressed as , is calculated in a discounted form of ,
where represents that is the -th track in that contains . By traversing and accumulating , we obtain the probability for . A higher AGGR-DIV value indicates better diversity in the recommended results.
4.5 Baselines
We compared our approach with several state-of-the-art recommendation algorithms. Below is a brief description of these algorithms.
-
1.
Popularity Based (PB) [39]: The popularity-based recommender always recommends to users the most frequently listened-to tracks by all users in the dataset.
-
2.
Bayesian Personalized Ranking (BPR) [40]: It is the most widely used method for recommendation based on matrix factorization with pairwise ranking loss.
-
3.
Neural Collaborative Filtering–Matrix Factorization (NeuMF) [15]: The author is dedicated to developing neural network-based techniques to tackle the critical issue in recommendation, specifically collaborative filtering based on implicit feedback.
-
4.
Deep Matrix Factorization (DMF) [56]: The algorithm constructs a user-item matrix that incorporates explicit ratings and non-preference implicit feedback. It introduces a novel matrix factorization model with a neural network architecture to learn a generic low-dimensional space representation of users and items.
-
5.
Light Graph Convolution Network (LightGCN) [14]: LightGCN learns user and item embeddings by linearly propagating them on the user-item interaction graph. It uses the weighted sum of embeddings learned at all layers as the final embedding.
-
6.
Diversified GNN-based Recommender System (DGRec) [58]: It is currently the state-of-the-art diversified recommendation algorithm. The authors suggest diversifying GNN-based recommender systems by directly improving the embedding generation procedure.
-
7.
Hypergraph Embeddings for Music Recommendation (HEMR) [22]: It is a recommendation algorithm specifically designed for music recommendations, which is the most similar baseline to our method by utilizing hypergraphs.
5 Experimental Results
Conducting comparative experiments between DWHRec and seven other recommendation algorithms on two datasets, using five metrics to validate recommendation performance, aims to assess the model’s effectiveness. We delve into the analysis of the impact of hyper-parameter variations on the model through sensitivity experiments. In ablation experiments, we dissect the influence of different types of hyperedges on the model.
5.1 Overall Comparison
Seven advanced algorithms, namely PB, BPR, NeuMF, DMF, LightGCN, HEMR and DGRec, are employed as comparative models. DWHRec is compared with these models using the same experimental datasets, lastfm-100k and lastfm-200k (Table 3). The quality of the recommendation results is evaluated based on metrics, including accuracy indicators such as map, recall, hit ratio and ndcg, as well as diversity metrics represented by aggr-div.
Table 4 shows the results of the eight recommendation algorithms on two datasets utilizing evaluation metrics such as map, recall, hit ratio, ndcg and aggr-div. The table is divided into two sections: the upper half showcases results for the lastfm-100k dataset, while the lower half displays results for the lastfm-200k dataset. Both datasets share identical row and column names. The row names correspond to the abbreviations for eight comparative models, while the column names represent metrics for a recommended length of . The values in the table indicate the performance of each model under specific metrics. Bold entries highlight the best results, entries with an underline (“_”) signify the second-best results, and entries with a wavy underline (“~”) indicate the third-best results.
Models | Evaluation Metrics | ||||
MAP@20 | Recall@20 | Hit Ratio@20 | NDCG@20 | AGGR-DIV@20 | |
lastfm-100k | |||||
PB | 0.0002 | 0.0014 | 0.0271 | 0.0077 | 0.0145 |
BPR | 0.0002 | 0.0010 | 0.0198 | 0.0067 | 0.0106 |
NeuMF | 0.0014 | 0.0070 | 0.1178 | 0.0425 | 0.0106 |
DMF | 0.0019 | 0.0074 | 0.1186 | 0.0482 | 0.0116 |
LightGCN | 0.0038 | 0.0143 | 0.2052 | 0.0842 | 0.0113 |
HEMR | 0.0009 | 0.0048 | 0.0629 | 0.0195 | 0.1181 |
DGRec | 0.0006 | 0.0027 | 0.0437 | 0.0162 | 0.0108 |
DWHRec | 0.0149 | 0.0252 | 0.2559 | 0.1643 | 0.1721 |
lastfm-200k | |||||
PB | 0.0002 | 0.0022 | 0.0415 | 0.0114 | 0.0153 |
BPR | 0.0036 | 0.0109 | 0.1276 | 0.0583 | 0.0103 |
NeuMF | 0.0047 | 0.0204 | 0.3095 | 0.1226 | 0.0113 |
DMF | 0.0005 | 0.0025 | 0.0488 | 0.0173 | 0.0103 |
LightGCN | 0.0017 | 0.0060 | 0.0874 | 0.0362 | 0.0113 |
HEMR | 0.0009 | 0.0052 | 0.0672 | 0.0206 | 0.1296 |
DGRec | 0.0015 | 0.0071 | 0.1047 | 0.0389 | 0.0111 |
DWHRec | 0.0142 | 0.0220 | 0.2211 | 0.1501 | 0.1842 |
On the lastfm-100k dataset in Table 4, it is evident that DWHRec outperformed other algorithms significantly by a considerable margin across the five metrics provided. DWHRec consistently occupied a leading position under these metrics. LightGCN also demonstrated strong performance, securing the second position in four other metrics except for aggr-div@20. HEMR attained the second position in ndcg@20, while PB secured the third position in the same metric. For NeuMF, its performance closely approached that of DMF. The remaining algorithms distinctly lagged behind both DMF and NeuMF in performance.
On the lastfm-200k dataset in Table 4, changes in trends were observed. DWHRec exhibited outstanding performance, notably excelling in map@20, ndcg@20 and aggr-div@20 metrics, surpassing other algorithms by a significant margin. In terms of map@20 and recall@20 metrics, DWHRec, NeuMF and BPR occupied the top tier, securing the first three positions. The performance of NeuMF and DWHRec on recall@20 was remarkably close, with minimal differences. LightGCN’s performance, however, was not as robust on this dataset. The distinction between DGRec and LightGCN was relatively small. In terms of four accuracy metrics, BPR outperformed both LightGCN and DGRec, with the exception of the aggr-div@20 metric.
Figure 2 proclaims a brief overview of the comparative results, featuring two key metrics: ndcg for accuracy and aggr-div for diversity. The horizontal axis of the figure depicts the length of the recommendation list (), spanning ten data points from to . To enhance visibility, all lines are uniformly represented with dashed lines (“--”), each accompanied by distinct colors and point shapes. Notably, the color of each line matches the color of the points along it. Blue-gray triangles denote PB (“\trianglepafill”). Light pink stars represent BPR (“\starletfill”). NeuMF is marked with red cross symbols (“”). DMF is identified by olive circular points (“\circletfill”). Light blue colored squares fill the LightGCN (“\squadfill”). The golden pentagons make up the HEMR (“\pentagofill”). Purple plus signs characterize the DGRec (“\linevh”). DWHRec is adorned with orange diamonds (“\rhombusfill”).
Figure 2 is composed of four parts labeled a, b, c and d. Figures 2(a) and 2(b) depict the ndcg and aggr-div results for the lastfm-100k dataset, while Figures 2(c) and 2(d) portray the corresponding results for the lastfm-200k dataset. All four figures share a common horizontal axis representing the values of . The vertical axis is organized into two groups: the ndcg metric for Figures 2(a) and 2(c), and the aggr-div metric for Figures 2(b) and 2(d).




Across the range from to , the ndcg results for all algorithms increased as grew (Figures 2(a) and 2(c)). However, the growth rates of ndcg values varied among the algorithms. In the lastfm-100k dataset (Figure 2(a)), HEMR exhibited the fastest growth rate. DWHRec, DMF, NeuMF, LightGCN and PB accelerated at comparable rates, forming the second tier. The remaining algorithms demonstrated slower growth rates. Whereas in the lastfm-200k dataset (Figure 2(c)), NeuMF displayed the sharpest growth rate, with HEMR closely following its pace and showing a commendable growth spurt. Over the range of from to , the aggr-div results for all algorithms decreased with the increase in (Figures 2(b) and 2(d)). As increases, the aggr-div values for all algorithms remain closely aligned, except for HEMR and DWHRec.
Throughout the entire experiment, DWHRec consistently achieved favorable results in evaluation metrics such as map@20, recall@20, ndcg@20 and aggr-div@20, yielding significant differences from the other seven algorithms (Table 4 and Figure 2). Notably, DWHRec delivered commendable aggr-div outcomes, which is an encouraging result.
5.2 Hyper-parameter Sensitivity
The DWHRec model includes multiple hyper-parameters, roughly categorized into two main types. The hyper-parameter represents the number of iterations for random walks, and denotes the number of steps a vertex can take in a single iteration. Hyper-parameters and control the random walks state of vertices in the hypergraph. Hyper-parameter is the size of the sliding window, and is the dimension of the representation vector in the word embedding process. Hyper-parameters and intervene in the word embedding process, thereby influencing the generation of vertex vectors. Through sensitivity experiments on these hyper-parameters, we aim to explore how their variations impact the final recommendation performance.
For the sensitivity experiments on hyper-parameters, some preparatory work was undertaken. Specifically, the lastfm-100k dataset was selected as the experimental subject. Four recommendation lengths were chosen, namely , , and . Here, represents a scenario with a sufficiently long recommendation length, signifies an extremely short length, while and represent typical scenarios. During the testing of a specific hyper-parameter, other hyper-parameters were kept constant. Two metrics, recall and aggr-div, were used to characterize the performance of the recommendations. The experimental results are shown in Figures 3–6.
In practice, when conducting sensitivity experiments for hyper-parameters, attention to certain details is crucial. Both and are hyper-parameters that affect the random walks process (Algorithm 3). To investigate the impact of , is kept constant at a value of . Likewise, when testing , is fixed at . The other two hyper-parameters, and , are set to values of and , respectively. Both and are hyper-parameters that influence the word embedding process. To explore the impact of on performance, is set to a common value of . Conversely, when investigating the influence of , is set to a constant value of . As for the remaining two hyper-parameters, and , they are kept as small as possible and constant, with specific values of and , respectively. The aforementioned configurations are designed to objectively and accurately assess the impact of hyper-parameters on model performance.
The experimental results illustrate that the choice of has a noticeable impact on the model’s performance (Figure 3), whether in terms of recall (Figure 3(a)) or aggr-div (Figure 3(b)). Under the recall metric (Figure 3(a)), a longitudinal view reveals that for the same set of values, a larger value corresponds to a larger recall value. Observing along the horizontal axis shows that, for the four selected values of , as increases, the recall values also increase. However, different values of result in varying degrees of improvement in recall. Given a change of in as one unit, when takes the minimum, intermediate and maximum values, the average increase in recall metric values is %, % and %, respectively, for each unit increase in . For the aggr-div metric (Figure 3(b)), a similar trend is observed on the horizontal axis as in the recall metric. When is set to , , and , the aggr-div values increase by %, % and %, respectively, as increases by one unit. In contrast to the recall metric, in the aggr-div indicator, the average growth rate tends to increase with larger values. Additionally, in the vertical direction, the pattern of aggr-div values is opposite to that of the recall metric. For the same value, smaller values result in larger aggr-div values.


The impact of the hyper-parameter on the model’s recommendation performance is very similar to that of (Figure 4). The effects of changing within the same set of values and the influence of different values on the metrics are consistent with the conclusions under values. The main difference introduced by the two hyper-parameters is reflected in the rate of change of the metric values. Compared to , yields a greater increase in recall values (Figure 4(a)) when is at its minimum, with a growth rate of %. However, when takes intermediate and maximum values, the growth rates of recall values are lower than the performance under , at % and %, respectively. Alternatively, concerning the aggr-div metric (Figure 4(b)), the impact of values is more profound than that of values. The growth rates are %, % and % for the three different scenarios of values, respectively.


The impact of hyper-parameter on the model’s performance is quite intricate (Figure 5). Let vary by as a unit. When increases by one unit, it brings an average negative growth of % to the recall metric (Figure 5(a)) for a recommended list length of . However, when is , it brings an average positive growth rate of %. In the scenario where takes the middle value, the improvement of has a minimal impact on the recall value, with an average growth change of only %. Across different values of , the changing trend of on the aggr-div metric remains consistent (Figure 5(b)). Among them, the impact of is very close when takes the minimum and maximum values. With each unit increase in , aggr-div values decrease by % and %, respectively. The effect of is most pronounced when takes the middle value, causing a decrease of % in aggr-div for each additional unit.


The impact of on the model is reflected differently in recall and aggr-div (Figure 6). Overall, with the increase of , the recall metric experiences a negative impact (Figure 6(a)), while aggr-div receives a positive influence (Figure 6(b)). The impact of on the recall metric strengthens with the increase of the recommended list length. Assuming varies in units of , when is equal to , an increase of one unit in leads to an average decrease of %. On the aggr-div metric, an increase in results in more diversified recommendations. The average growth rate of aggr-div, induced by the variation in , exhibits an inverted U-shaped trend with the increase in . When takes values within the middle range, the average growth rate reaches %.
Throughout the entire sensitivity experiment, we aim to uncover certain patterns: (1) In conclusion, all four hyper-parameters can influence the model’s performance, impacting either the recall or aggr-div metrics, or both. Specifically, when , , or are small, increasing their values leads to noticeable changes. However, as the hyper-parameter values continue to increase and reach a certain threshold, the model’s performance tends to plateau, and in some cases, it may even produce the opposite effect. (2) To summarize, increasing the value of and can enhance the predictive performance of the model within a certain range. A larger improvement is observed in the recall metric when is smaller, while a greater enhancement is seen in the aggr-div metric when is larger. (3) The impact of on the recall metric is complex and overall quite subtle. However, increasing the value of has a noticeable negative impact on the aggr-div metric. Conversely, can balance the recall and aggr-div metrics. Increasing can lead to a decrease in recommendation accuracy while simultaneously enhancing the diversity of recommendations. (4) Under the same set of hyper-parameter values, the larger is, the larger the recall metric value is. This is because guessing the user’s preferred items in a shorter list is more challenging. As the recommended list lengthens, it undoubtedly increases the likelihood of the recommended items within the user’s preferences. (5) When each parameter surpasses its respective threshold, the model exhibits strong stability.


5.3 Ablation Study
DWHRec model is divided into three key steps: hypergraph construction, random walks and word vector embedding. Among them, random walks and word vector embedding are designed to obtain computationally convenient user and track vectors. The core of the model lies in its hypergraph structure. Investigating the roles of various types of hyperedges can help clarify their relationships and contributions to the model.
In Section 3.2, four types of hyperedges are introduced in the DWHRec algorithm, denoted as , , and . The primary objective of DWHRec is to recommend a more diverse set of tracks to users. Users and tracks are the two most important atoms. While the listening events of users to tracks, represented by , serving as a crucial bridge connecting the two, is indispensable. Therefore, to examine the roles of various types of hyperedges in the model, an ablation study was conducted on the lastfm-100k dataset by alternatively eliminating one of , and .
The concise experimental results are presented in Table 5, while Figure 7 offers a performance comparison of the model based on the recall and aggr-div metrics for all values. Table 5 is a subset of Figure 7, specifically representing the case when . The prefix symbol “-” in both Table 5 and Figure 7 signifies “remove” or “subtract”, meaning the action of elimination or deletion.
Method | Recall@50 | AGGR-DIV@50 |
DWHRec | 0.0396 | 0.0934 |
- | 0.0731 | 0.0143 |
- | 0.0275 | 0.0772 |
- | 0.0200 | 0.0551 |
Observing Table 5 and Figure 7, we made several findings: (1) Regarding the recall@50 metric, removing and led to a decrease in performance, but removing resulted in a significant improvement. The experiments indicated that compared to the DWHRec model with all types of hyperedges as a baseline, removing , and resulted in an increase of %, % and %. In which case, the model’s performance, after removing and respectively, was only % and % of the full model’s performance. (2) In the case of the aggr-div@50 metric, removing any type of hyperedge seemed to have a clear impact. Experiments revealed that after removing , and , the model provided %, % and % of the complete model’s performance, with gaps of %, % and %, respectively. (3) Removing any hyperedge resulted in an apparent drop in aggr-div values, while interventions on recall remained relatively complex.


The ablation experiments analyzed the impact of different types of hyperedges on the model. The experiments indicated that both the album hyperedge and the artist hyperedge could improve the model’s performance, whether in terms of recommendation accuracy or diversity. Comparatively, the impact of was slightly higher than that of . Artists might not have released albums but would certainly create tracks. Therefore, the information captured by was more comprehensive, while might have missed some important information. The tagging hyperedge tended to interfere with the model’s ability to make accurate predictions. Since tags are directly related to the category of tracks, they could lead to a significant increase in diversity. Overall, all four types of hyperedges contributed to achieving a balance between recommendation accuracy and diversity.
6 Conclusions
We developed a novel algorithm, DWHRec, for diversified music recommendations using a weighted hypergraph embedding technique. This algorithm operates in two stages: constructing a hypergraph from user listening history, tracks, albums, artists and tags; and then recommending tracks by exploring connections through a random walks hypergraph embedding. We validated DWHRec’s effectiveness by comparing it with seven advanced algorithms using a dataset from Last.fm, demonstrating significant improvements across multiple evaluation metrics. Additionally, we assessed the sensitivity to hyper-parameters and the impact of different types of hyperedges on accuracy and diversity. Our findings suggest that DWHRec is not only effective for music recommendation but also adaptable to other fields with similar data structures. The hypergraph approach, particularly the use of various hyperedges, plays a crucial role in enhancing recommendation accuracy and ensuring diversity.
Acknowledgement
This work was supported by the Fundamental Research Funds for the Central Universities (Grant No. 2023ZY-SX019).
References
- Adomavicius & Kwon [2012] Adomavicius, G., & Kwon, Y. (2012). Improving aggregate recommendation diversity using ranking-based techniques. IEEE Transactions on Knowledge and Data Engineering, 24, 896–911. doi:10.1109/TKDE.2011.15.
- Balabanović & Shoham [1997] Balabanović, M., & Shoham, Y. (1997). Fab: Content-based, collaborative recommendation. Communications of the ACM, 40, 66–72.
- Bozdag et al. [2014] Bozdag, E., Gao, Q., Houben, G., & Warnier, M. (2014). Does offline political segregation affect the filter bubble? an empirical analysis of information diversity for dutch and turkish twitter users. Computers in Human Behavior, 41, 405–415. doi:10.1016/j.chb.2014.05.028.
- Bradley & Smyth [2001] Bradley, K., & Smyth, B. (2001). Improving recommendation diversity. In Proceedings of the Conference on Artificial Intelligence and Cognitive Science (pp. 141–152). volume 85.
- Bu et al. [2010] Bu, J., Tan, S., Chen, C., Wang, C., Wu, H., Zhang, L., & He, X. (2010). Music recommendation by unified hypergraph: Combining social media information and music content. In Proceedings of the ACM International Conference on Multimedia (pp. 391–400). doi:10.1145/1873951.1874005.
- Cheng & Tang [2016] Cheng, R., & Tang, B. (2016). A music recommendation system based on acoustic features and user personalities. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 203–213). volume 9794. doi:10.1007/978-3-319-42996-0\_17.
- Cheng et al. [2017] Cheng, Z., Shen, J., Zhu, L., S, M. K., & Nie, L. (2017). Exploiting music play sequence for music recommendation. In Proceedings of the International Joint Conference on Artificial Intelligence (pp. 3654–3660). volume 17.
- Curkovic [2019] Curkovic, M. (2019). Need for controlling of the filter bubble effect. Science and Engineering Ethics, 25, 323. doi:10.1007/s11948-017-0005-1.
- Dahlgren [2021] Dahlgren, P. M. (2021). A critical review of filter bubbles and a comparison with selective exposure. Nordicom Review, 42, 15–33. doi:10.2478/nor-2021-0002.
- Deshpande & Karypis [2004] Deshpande, M., & Karypis, G. (2004). Item-based top-n recommendation algorithms. ACM Transactions on Information Systems, 22, 143–177. doi:10.1145/963770.963776.
- Farrahi et al. [2014] Farrahi, K., Schedl, M., Vall, A., Hauger, D., & Tkalcic, M. (2014). Impact of listening behavior on music recommendation. In Proceedings of the International Society for Music Information Retrieval (pp. 483–488).
- Fleder & Hosanagar [2007] Fleder, D. M., & Hosanagar, K. (2007). Recommender systems and their impact on sales diversity. In Proceedings of the ACM Conference on Electronic Commerce (pp. 192–199). doi:10.1145/1250910.1250939.
- Hariri et al. [2012] Hariri, N., Mobasher, B., & Burke, R. (2012). Using social tags to infer context in hybrid music recommendation. In Proceedings of the International Workshop on Web Information and Data Management (pp. 41–48). doi:10.1145/2389936.2389946.
- He et al. [2020] He, X., Deng, K., Wang, X., Li, Y., Zhang, Y., & Wang, M. (2020). Lightgcn: Simplifying and powering graph convolution network for recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 639–648). doi:10.1145/3397271.3401063.
- He et al. [2017] He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T.-S. (2017). Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web (pp. 173–182). doi:10.1145/3038912.3052569.
- Heitz et al. [2022] Heitz, L., Lischka, J. A., Birrer, A., Paudel, B., Tolmeijer, S., Laugwitz, L., & Bernstein, A. (2022). Benefits of diverse news recommendations for democracy: A user study. Digital Journalism, (pp. 1–21). doi:10.1080/21670811.2021.2021804.
- Hu et al. [2008] Hu, Y., Koren, Y., & Volinsky, C. (2008). Collaborative filtering for implicit feedback datasets. In Proceedings of the IEEE International Conference on Data Mining (pp. 263–272). doi:10.1109/ICDM.2008.22.
- Huang et al. [2019] Huang, H., Shen, H., & Meng, Z. (2019). Item diversified recommendation based on influence diffusion. Information Processing & Management, 56, 939–954. doi:10.1016/j.ipm.2019.01.006.
- Huang & Jenor [2004] Huang, Y., & Jenor, S. (2004). An audio recommendation system based on audio signature description scheme in mpeg-7 audio. In Proceedings of the IEEE International Conference on Multimedia and Expo (pp. 639–642). volume 1. doi:10.1109/ICME.2004.1394273.
- Knobloch-Westerwick & Westerwick [2023] Knobloch-Westerwick, S., & Westerwick, A. (2023). Algorithmic personalization of source cues in the filter bubble: Self-esteem and self-construal impact information exposure. New Media & Society, 25, 2095–2117. doi:10.1177/14614448211027963.
- Koren et al. [2009] Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42, 30–37. doi:10.1109/MC.2009.263.
- La Gatta et al. [2023] La Gatta, V., Moscato, V., Pennone, M., Postiglione, M., & Sperlí, G. (2023). Music recommendation via hypergraph embedding. IEEE Transactions on Neural Networks and Learning Systems, 34, 7887–7899. doi:10.1109/TNNLS.2022.3146968.
- Li et al. [2021] Li, X., Chen, C.-H., Zheng, P., Jiang, Z., & Wang, L. (2021). A context-aware diversity-oriented knowledge recommendation approach for smart engineering solution design. Knowledge-Based Systems, 215, 106739. doi:10.1016/j.knosys.2021.106739.
- Liu & Zheng [2020] Liu, S., & Zheng, Y. (2020). Long-tail session-based recommendation. In Proceedings of the ACM Conference on Recommender Systems (pp. 509–514). doi:10.1145/3383313.3412222.
- Loepp et al. [2019] Loepp, B., Donkers, T., Kleemann, T., & Ziegler, J. (2019). Interactive recommending with tag-enhanced matrix factorization (tagmf). International Journal of Human-Computer Studies, 121, 21–41. doi:10.1016/j.ijhcs.2018.05.002.
- Lu & Tintarev [2018] Lu, F., & Tintarev, N. (2018). A diversity adjusting strategy with personality for music recommendation. In Proceedings of the Joint Workshop on Interfaces and Human Decision Making for Recommender Systems (pp. 7–14).
- Ma et al. [2023] Ma, T., Wang, X., Zhou, F., & Wang, S. (2023). Research on diversity and accuracy of the recommendation system based on multi-objective optimization. Neural Computing and Applications, 35, 5155–5163. doi:10.1007/s00521-020-05438-w.
- Mao et al. [2016] Mao, K., Chen, G., Hu, Y., & Zhang, L. (2016). Music recommendation using graph based quality model. Signal Processing, 120, 806–813. doi:10.1016/j.sigpro.2015.03.026.
- Mao et al. [2019] Mao, M., Lu, J., Han, J., & Zhang, G. (2019). Multiobjective e-commerce recommendations based on hypergraph ranking. Information Sciences, 471, 269–287. doi:10.1016/j.ins.2018.07.029.
- McCrae & John [1992] McCrae, R. R., & John, O. P. (1992). An introduction to the five-factor model and its applications. Journal of Personality, 60, 175–215. doi:10.1111/j.1467-6494.1992.tb00970.x.
- Michiels et al. [2022] Michiels, L., Leysen, J., Smets, A., & Goethals, B. (2022). What are filter bubbles really? a review of the conceptual and empirical work. In Proceedings of the ACM Conference on User Modeling, Adaptation and Personalization (pp. 274–279). doi:10.1145/3511047.3538028.
- Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013a). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, . doi:10.48550/arXiv.1301.3781.
- Mikolov et al. [2013b] Mikolov, T., Le, Q. V., & Sutskever, I. (2013b). Exploiting similarities among languages for machine translation. CoRR, abs/1309.4168. doi:10.48550/arXiv.1309.4168.
- Niyazov et al. [2021] Niyazov, A., Mikhailova, E., & Egorova, O. (2021). Content-based music recommendation system. In Proceedings of the Conference of Open Innovations Association (pp. 274–279). doi:10.23919/FRUCT52173.2021.9435533.
- Oramas et al. [2016] Oramas, S., Ostuni, V. C., Noia, T. D., Serra, X., & Sciascio, E. D. (2016). Sound and music recommendation with knowledge graphs. ACM Transactions on Intelligent Systems and Technology, 8, 1–21. doi:10.1145/2926718.
- Pariser [2011] Pariser, E. (2011). The Filter Bubble: What the Internet Is Hiding from You. Penguin UK.
- Pavone et al. [2022] Pavone, M., Saberi, A., Schiffer, M., & Tsao, M. W. (2022). Technical note–online hypergraph matching with delays. Operations Research, 70, 2194–2212. doi:10.1287/opre.2022.2277.
- Perozzi et al. [2014] Perozzi, B., Al-Rfou, R., & Skiena, S. (2014). Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 701–710). doi:10.1145/2623330.2623732.
- Rakshit et al. [2023] Rakshit, P., Saha, S., Chatterjee, A., Mistri, S., Das, S., & Dhar, G. (2023). A popularity-based recommendation system using machine learning. In Proceedings of the Machine Learning in Information and Communication Technology (pp. 143–150). doi:10.1007/978-981-19-5090-2\_14.
- Rendle et al. [2012] Rendle, S., Freudenthaler, C., Gantner, Z., & Schmidt-Thieme, L. (2012). Bpr: Bayesian personalized ranking from implicit feedback. arXiv preprint arXiv:1205.2618, . doi:10.48550/arXiv.1205.2618.
- Resnick et al. [2013] Resnick, P., Garrett, R. K., Kriplean, T., Munson, S. A., & Stroud, N. J. (2013). Bursting your (filter) bubble: Strategies for promoting diverse exposure. In Proceedings of the Conference on Computer Supported Cooperative Work Companion (pp. 95–100). doi:10.1145/2441955.2441981.
- Resnick et al. [1994] Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., & Riedl, J. (1994). Grouplens: An open architecture for collaborative filtering of netnews. In Proceedings of the ACM Conference on Computer Supported Cooperative Work (pp. 175–186). doi:10.1145/192844.192905.
- Robinson et al. [2020] Robinson, K., Brown, D., & Schedl, M. (2020). User insights on diversity in music recommendation lists. In Proceedings of the International Society for Music Information Retrieval (pp. 446–453).
- Rong [2014] Rong, X. (2014). Word2vec parameter learning explained. CoRR, abs/1411.2738. doi:10.48550/arXiv.1411.2738.
- Sakurai et al. [2022] Sakurai, K., Togo, R., Ogawa, T., & Haseyama, M. (2022). Deep reinforcement learning-based music recommendation with knowledge graph using acoustic features. ITE Transactions on Media Technology and Applications, 10, 8–17. doi:10.3169/mta.10.8.
- Schedl & Hauger [2015] Schedl, M., & Hauger, D. (2015). Tailoring music recommendations to users by considering diversity, mainstreaminess, and novelty. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 947–950). doi:10.1145/2766462.2767763.
- Shan et al. [2009] Shan, M., Kuo, F., Chiang, M., & Lee, S. (2009). Emotion-based music recommendation by affinity discovery from film music. Expert Systems with Applications, 36, 7666–7674. doi:10.1016/j.eswa.2008.09.042.
- Shao et al. [2009] Shao, B., Wang, D., Li, T., & Ogihara, M. (2009). Music recommendation based on acoustic features and user access patterns. IEEE Transactions on Audio, Speech, and Language Processing, 17, 1602–1611. doi:10.1109/TASL.2009.2020893.
- Tan et al. [2011] Tan, S., Bu, J., Chen, C., Xu, B., Wang, C., & He, X. (2011). Using rich social media information for music recommendation via hypergraph model. ACM Transactions on Multimedia Computing, Communications, and Applications, 7S, 1–22. doi:10.1145/2037676.2037679.
- Theodoridis et al. [2013] Theodoridis, A., Kotropoulos, C., & Panagakis, Y. (2013). Music recommendation using hypergraphs and group sparsity. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 56–60). doi:10.1109/ICASSP.2013.6637608.
- Wang et al. [2022] Wang, R., Wu, Z., Lou, J., & Jiang, Y. (2022). Attention-based dynamic user modeling and deep collaborative filtering recommendation. Expert Systems with Applications, 188, 116036. doi:10.1016/j.eswa.2021.116036.
- Wolfowicz et al. [2023] Wolfowicz, M., Weisburd, D., & Hasisi, B. (2023). Examining the interactive effects of the filter bubble and the echo chamber on radicalization. Journal of Experimental Criminology, 19, 119–141. doi:10.1007/s11292-021-09471-0.
- Wu et al. [2023] Wu, L., He, X., Wang, X., Zhang, K., & Wang, M. (2023). A survey on accuracy-oriented neural recommendation: From collaborative filtering to information-rich recommendation. IEEE Transactions on Knowledge and Data Engineering, 35, 4425–4445. doi:10.1109/TKDE.2022.3145690.
- Xie et al. [2022] Xie, R., Liu, Q., Liu, S., Zhang, Z., Cui, P., Zhang, B., & Lin, L. (2022). Improving accuracy and diversity in matching of recommendation with diversified preference network. IEEE Transactions on Big Data, 8, 955–967. doi:10.1109/TBDATA.2021.3103263.
- Xu et al. [2023] Xu, L., Tian, Z., Zhang, G., Zhang, J., Wang, L., Zheng, B., Li, Y., Tang, J., Zhang, Z., Hou, Y., Pan, X., Zhao, W. X., Chen, X., & Wen, J.-R. (2023). Towards a more user-friendly and easy-to-use benchmark library for recommender systems. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 2837–2847). doi:10.1145/3539618.3591889.
- Xue et al. [2017] Xue, H., Dai, X., Zhang, J., Huang, S., & Chen, J. (2017). Deep matrix factorization models for recommender systems. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (pp. 3203–3209). volume 17. doi:10.24963/ijcai.2017/447.
- Yang et al. [2019] Yang, D., Qu, B., Yang, J., & Cudre-Mauroux, P. (2019). Revisiting user mobility and social relationships in lbsns: A hypergraph embedding approach. In Proceedings of the World Wide Web Conference (pp. 2147–2157). doi:10.1145/3308558.3313635.
- Yang et al. [2023a] Yang, L., Wang, S., Tao, Y., Sun, J., Liu, X., Yu, P. S., & Wang, T. (2023a). Dgrec: Graph neural network for recommendation with diversified embedding generation. In Proceedings of the 16th ACM International Conference on Web Search and Data Mining (pp. 661–669). doi:10.1145/3539597.3570472.
- Yang et al. [2023b] Yang, M., Liu, Z., Yang, L., Liu, X., Wang, C., Peng, H., & Yu, P. (2023b). Group identification via transitional hypergraph convolution with cross-view self-supervised learning. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (pp. 2969–2979). doi:10.1145/3583780.3614902.
- Zanon et al. [2022] Zanon, A. L., da Rocha, L. C. D., & Manzato, M. G. (2022). Balancing the trade-off between accuracy and diversity in recommender systems with personalized explanations based on linked open data. Knowledge-Based Systems, 252, 109333. doi:10.1016/j.knosys.2022.109333.
- Zhang et al. [2023] Zhang, M., Wu, S., Yu, X., Liu, Q., & Wang, L. (2023). Dynamic graph neural networks for sequential recommendation. IEEE Transactions on Knowledge and Data Engineering, 35, 4741–4753. doi:10.1109/TKDE.2022.3151618.
- Zhao et al. [2016] Zhao, D., Xiu, J., Yang, Z., & Liu, C. (2016). An improved user-based movie recommendation algorithm. In Proceedings of the IEEE International Conference on Computer and Communications (pp. 874–877). doi:10.1109/CompComm.2016.7924828.
- Zhao et al. [2022] Zhao, W. X., Hou, Y., Pan, X., Yang, C., Zhang, Z., Lin, Z., Zhang, J., Bian, S., Tang, J., Sun, W., Chen, Y., Xu, L., Zhang, G., Tian, Z., Tian, C., Mu, S., Fan, X., Chen, X., & Wen, J.-R. (2022). Recbole 2.0: Towards a more up-to-date recommendation library. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management (pp. 4722–4726). doi:10.1145/3511808.3557680.
- Zhao et al. [2021] Zhao, W. X., Mu, S., Hou, Y., Lin, Z., Chen, Y., Pan, X., Li, K., Lu, Y., Wang, H., Tian, C., Min, Y., Feng, Z., Fan, X., Chen, X., Wang, P., Ji, W., Li, Y., Wang, X., & Wen, J. (2021). Recbole: Towards a unified, comprehensive and efficient framework for recommendation algorithms. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management (pp. 4653–4664). doi:10.1145/3459637.3482016.