
A local geometry of hyperedges in hypergraphs, and its applications to social networks

Dong Quan Ngoc Nguyen and Lin Xing Department of Applied and Computational Mathematics and Statistics
University of Notre Dame
Notre Dame, Indiana 46556, USA
[email protected] http://nd.edu/ dnguye15 [email protected]
(Date: September 29, 2020)
Abstract.

In many real world datasets arising from social networks, there are hidden higher order relations among data points which cannot be captured using graph modeling. It is natural to use the more general notion of hypergraphs to model such social networks. In this paper, we introduce a new local geometry of hyperedges in hypergraphs which allows us to capture higher order relations among data points. Furthermore, based on this new geometry, we also introduce a new methodology, the nearest neighbors method in hypergraphs, for analyzing datasets arising from sociology.

1. Introduction

One of the challenges in the modern age is to classify data arising from many sources; for example, following the rapid developments of several areas in mathematics, the large number of publications in mathematics creates a tremendous amount of data, which carries useful information such as relationships (or collaborations) among authors and their publications, and their influences on the development of mathematics. Analyzing such data is often not straightforward, and is a very difficult task because of the extremely fast growth of relations among data, and of the data itself.

One of the main problems in machine learning is the weight (respectively, label) prediction problem, in which each instance is associated with a set of weights (respectively, labels), and the aim is to propose predictive models for predicting weights (respectively, labels) of unobserved data. These learning problems have important applications in many areas such as protein function classification [4], text categorization [17], and semantic scene classification [2], to cite a few. For networks which can be modeled as graphs, many approaches to the prediction problem have been proposed (see, for example, [7], [5], [6], [13], [14], [15]).

In real world datasets, for example, in social networks, it often occurs that higher order relations among data points are present which cannot be captured by graph modeling, since graphs only encode binary relations. So it is natural to consider the weight (respectively, label) prediction problem for a generalization of graphs: hypergraphs. Recall that a hypergraph $X$ is a pair $(\mathcal{V}(X),\mathcal{E}(X))$, where $\mathcal{V}(X)$ is the set of data points (called vertices of $X$), and $\mathcal{E}(X)$ is a subset of the power set of $\mathcal{V}(X)$ which represents higher order relations among data points. Each element in $\mathcal{E}(X)$ is called a hyperedge. Since each hyperedge can consist of a collection of data points, it captures the higher order correlation information among these data points. Recent trends in learning problems have focused on hypergraph datasets (see, for example, [1], [16]).

In this paper, we propose a geometric approach to the weight (respectively, label) prediction problem on hypergraph data, which we then apply to analyzing networks arising from sociology. Traditionally, for datasets which can be modeled as graphs, there is a standard method, $k$ nearest neighbors, which applies to the weight (respectively, label) prediction problem. The classical $k$ nearest neighbors method relies on Euclidean metrics. In contrast with many traditional geometric approaches based on Euclidean metrics (or distances), we propose a new methodology of nearest neighbors in hypergraphs, using metrics modulo equivalence relations, a weaker notion than that of usual metrics, but one which is natural for learning from data. An equivalence relation signifies the correlation among data points, and a metric modulo such an equivalence relation signifies how different data points are with respect to such correlation information. In this work, we suggest that equivalence relations are a natural mathematical notion for modeling correlation information in datasets, and that metrics modulo equivalence relations are more suitable for learning correlation in datasets.

The structure of our paper is as follows. In Subsection 2.1, we introduce a notion of metric spaces with respect to equivalence relations. In Subsection 2.2, based on this notion, we propose our methodology for studying the weight (respectively, label) prediction problem. In Subsection 2.3, we construct a class of metrics modulo equivalence relations on the set of hyperedges of a hypergraph which can be combined with the methodology proposed in Subsection 2.2 for studying prediction problems. In Section 3, we apply our methodology to real world datasets from sociology, and report our experimental results.

2. A class of metrics modulo certain equivalence relations for the set of hyperedges of a hypergraph, and methodologies

In the weight (respectively, label) prediction problem, one is given a dataset consisting of data points $x_{1},\ldots,x_{n}$ such that there is a subset, say $\{x_{i_{1}},\ldots,x_{i_{k}}\}$, each of whose elements is associated to a weight (respectively, label). The aim is to propose a predictive model for predicting the weight (respectively, label) of each element in the set $\{x_{1},\ldots,x_{n}\}\setminus\{x_{i_{1}},\ldots,x_{i_{k}}\}$. In this paper, we propose a new approach, using insights from metric geometry, to the weight (respectively, label) prediction problem for datasets which can be modeled as hypergraphs.

A hypergraph $X$ is a pair $(\mathcal{V}(X),\mathcal{E}(X))$, where $\mathcal{V}(X)$ is the set of points, called vertices of $X$, and $\mathcal{E}(X)$ is a subset of the power set of $\mathcal{V}(X)$, each of whose elements is called a hyperedge of $X$. Each hyperedge $\mathfrak{e}$ is a subset of $\mathcal{V}(X)$. If $\mathfrak{e}$ has exactly $m$ elements, $\mathfrak{e}$ is called an $m$-hyperedge of $X$. In this paper, we are interested in the weight (respectively, label) prediction problem for hypergraphs, and so the following notion is natural for modeling such datasets.
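For concreteness, a hypergraph in this sense translates directly into code. The following Python sketch is our own illustration (not code from the paper); the names `vertices`, `hyperedges`, and `is_m_hyperedge` are ours, and hyperedges are stored as frozensets of vertices:

```python
# Illustrative sketch: a hypergraph as a vertex set together with
# hyperedges stored as frozensets of vertices.
vertices = {1, 2, 3, 4, 5}
hyperedges = [frozenset({1, 2, 3}), frozenset({3, 4}), frozenset({2, 4, 5})]

def is_m_hyperedge(e, m):
    """A hyperedge with exactly m elements is an m-hyperedge."""
    return len(e) == m

print(is_m_hyperedge(hyperedges[0], 3))  # -> True (a 3-hyperedge)
```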

Definition 2.1.

(incomplete hypergraphs)

Let $X=(\mathcal{V}(X),\mathcal{E}(X))$ be a hypergraph, and let $A$ be either an interval $[a,b]$ in $\mathbb{R}$ or a set of labels $\{1,\ldots,q\}$ for some positive integer $q$. $X$ is called an incomplete hypergraph with respect to $A$ if there exists a map $F:\mathcal{E}_{0}(X)\to A$ for some subset $\mathcal{E}_{0}(X)$ of $\mathcal{E}(X)$.

Remark 2.2.

In this paper, we consider two types of incomplete hypergraphs which are specified in the following.

  • (i)

    In the weight prediction problem, $A$ is an interval $[a,b]\subset\mathbb{R}$, and a subset $\mathcal{E}_{0}(X)$ of the set of hyperedges in $X$ is given such that there is a map $W:\mathcal{E}_{0}(X)\to[a,b]$. In this case $W$ is the map $F$ in Definition 2.1. Each value $W(\mathfrak{e})$ for $\mathfrak{e}\in\mathcal{E}_{0}(X)$ is called the weight of $\mathfrak{e}$. The weight prediction problem asks for a predictive model $\overline{W}$ such that $\overline{W}$ is a map from $\mathcal{E}(X)$ to $[a,b]$ for which

    \overline{W}(\mathfrak{e}_{0})=W(\mathfrak{e}_{0})

    for every $\mathfrak{e}_{0}\in\mathcal{E}_{0}(X)$.

  • (ii)

    In the label prediction problem, $A$ is a set of labels $\{1,2,\ldots,q\}$ for some positive integer $q$, and a subset $\mathcal{E}_{0}(X)$ of the set of hyperedges in $X$ is given such that there is a map $L:\mathcal{E}_{0}(X)\to\{1,\ldots,q\}$. In this case $L$ is the map $F$ in Definition 2.1. Each value $L(\mathfrak{e})$ for $\mathfrak{e}\in\mathcal{E}_{0}(X)$ is called the label of $\mathfrak{e}$. The label prediction problem asks for a predictive model $\overline{L}$ such that $\overline{L}$ is a map from $\mathcal{E}(X)$ to $\{1,\ldots,q\}$ for which

    \overline{L}(\mathfrak{e}_{0})=L(\mathfrak{e}_{0})

    for every $\mathfrak{e}_{0}\in\mathcal{E}_{0}(X)$.

In the classical weight (respectively, label) prediction problem for datasets which can be modeled as graphs, one can apply the classical $k$ nearest neighbors method. The main idea of this method is that the dataset $X=\{x_{1},\ldots,x_{n}\}$ is equipped with a Euclidean metric (or distance), say $d$. For each $x\in X$, let $x_{i_{1}},\ldots,x_{i_{k}}$ be the $k$ nearest points in $X$ to $x$ with respect to the metric $d$, so that

d(x,x_{i_{1}})\leq d(x,x_{i_{2}})\leq\cdots\leq d(x,x_{i_{k}}).

The $k$ nearest neighbors method then expects that the weight (respectively, label) of $x$ should be in proximity with those of the $x_{i_{m}}$ for $1\leq m\leq k$.
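For comparison with the modified methods introduced below, the classical Euclidean kNN rule can be sketched as follows. This is our own toy illustration, not code from our experiments; the name `knn_predict_label` and the data are assumptions:

```python
import math
from collections import Counter

def knn_predict_label(points, labels, x, k):
    """Classical kNN: pick the k points nearest to x in Euclidean
    distance and return the most common label among them."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], x))
    nearest = order[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

# Toy example: two clusters of 1-dimensional points.
points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,)]
labels = [1, 1, 1, 2, 2]
print(knn_predict_label(points, labels, (0.15,), k=3))  # -> 1
```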

Our main contribution in this paper is to introduce modified $k$ nearest neighbors methods for learning hypergraphs, especially for the weight (respectively, label) prediction problem. In order to do that, we introduce a class of metrics modulo equivalence relations on the set of hyperedges in a hypergraph. Note that in contrast with the usual notion of a metric, a metric $d$ modulo an equivalence relation $\cong$ allows $d(\mathfrak{e},\mathfrak{f})=0$ as long as $\mathfrak{e}\cong\mathfrak{f}$ (which means that $\mathfrak{e}$ is equivalent to $\mathfrak{f}$ with respect to certain properties, for example, similar weights or labels). Thus such a metric is more suitable for studying the weight (respectively, label) prediction problem since two distinct points in a dataset may have the same weight (respectively, label). In this case, the equivalence relation $\cong$ signifies the relation that certain distinct points in datasets share similar properties with respect to the weight (respectively, label) map, and $d(\mathfrak{e},\mathfrak{f})=0$ encodes the information that the points $\mathfrak{e},\mathfrak{f}$ are expected to have the same weight (respectively, label).

We first introduce a notion of metric spaces modulo equivalence relations which is used in many places in this paper.

2.1. Metric spaces modulo equivalence relations

We present in this subsection a notion of metric spaces modulo equivalence relations. We also explain why this notion is suitable for applying to geometrical structures of hypergraphs. We first recall a notion of equivalence relations on sets.

Definition 2.3.

(Equivalence relation)

Let $X$ be a set. An equivalence relation, denoted by $\cong$, on $X$ is a subset of $X\times X$ such that the following are true:

  • (i)

    (Reflexivity) $(a,a)\in\;\cong$ for every $a\in X$.

  • (ii)

    (Symmetry) $(a,b)\in\;\cong$ if and only if $(b,a)\in\;\cong$.

  • (iii)

    (Transitivity) if $(a,b)\in\;\cong$ and $(b,c)\in\;\cong$, then $(a,c)\in\;\cong$.

When $(a,b)\in\;\cong$, we say that $a$ is $\cong$-equivalent to $b$. Throughout this paper, in order to signify this relation, we write $a\cong b$ whenever $(a,b)\in\;\cong$.

An equivalence relation $\cong$ on a set provides a way to identify similar elements in the set. Equivalently, if one can find a measurement of how similar elements of a set are, then one can use this measurement to introduce an equivalence relation on the set.

Reflexivity in Definition 2.3 says that an element of $X$ should naturally be considered similar to itself. If an element $a$ is similar to an element $b$, then $b$ should naturally be considered similar to $a$, which is exactly the symmetry condition in Definition 2.3. For the last condition, transitivity, if an element $a$ is similar to an element $b$ and $b$ is similar to an element $c$, it is natural to view $a$ as similar to $c$.

An equivalence relation is a weaker notion than the identity relation, and it is more suitable and natural for studying hypergraph data from the metric-geometry viewpoint. The main aim and intuition behind our paper is that in order to study questions in machine learning on a hypergraph $X$, it is not necessary to construct a metric on the hypergraph, i.e., a mapping $d:X\times X\to\mathbb{R}_{\geq 0}$ which satisfies conditions similar to those of the usual absolute value on $\mathbb{R}$, for example, $d(x,y)=0$ if and only if $x=y$. Instead one only needs to construct a metric up to an equivalence relation $\cong$, i.e., a mapping $d:X\times X\to\mathbb{R}_{\geq 0}$ which satisfies all the conditions of a metric except that the identity relation is replaced by the weaker condition that $d(x,y)=0$ if and only if $x\cong y$. It turns out that such a metric exists naturally in a very general class of hypergraphs, and can be exploited to introduce several modified methodologies in machine learning which apply to the weight prediction problem and the multi-label classification problem. In particular, in this paper we introduce a class of metrics for the set of hyperedges of a hypergraph which will be used to provide a modified kNN method for predicting labels of hyperedges. To the best of the authors' knowledge, this paper is the first one to realize the existence of such metrics for the set of hyperedges in a hypergraph. In order to obtain such metrics, we introduce a notion of neighbors of hyperedges in a hypergraph which captures a topological feature of the hyperedges. It seems to the authors that this notion of neighbors of hyperedges has never been introduced before in the literature.

We now define the notion of a metric modulo an equivalence relation on a set.

Definition 2.4.

(Metric modulo an equivalence relation)

Let $X$ be a set, and $\cong$ an equivalence relation on $X$. A mapping $d:X\times X\to\mathbb{R}$ is said to be a metric on $X$ modulo the equivalence relation $\cong$ if the following conditions are satisfied:

  • (i)

    $d(a,b)\geq 0$ for all $a,b\in X$.

  • (ii)

    $d(a,b)=0$ if and only if $a\cong b$.

  • (iii)

    (Symmetry) $d(a,b)=d(b,a)$ for all $a,b\in X$.

  • (iv)

    (Triangle inequality) for any $a,b,c\in X$,

    d(a,b)\leq d(a,c)+d(c,b).

A metric modulo an equivalence relation $\cong$ on a set $X$ behaves almost like a metric. The only difference between the two notions is that in a metric, condition (ii) in Definition 2.4 is replaced by the stronger condition that $d(a,b)=0$ if and only if $a=b$. But in studying some properties, say $(P_{i})_{i}$, associated to a given dataset, where each $P_{i}$ denotes a property of interest, it is often the case that distinct elements of the dataset share exactly the same properties $(P_{i})_{i}$. In this case, it is natural to view the distance between these distinct elements as zero, since they are considered equivalent with respect to the properties $(P_{i})_{i}$. Note that if one uses a usual metric on this dataset, then one cannot identify similarities among distinct elements sharing the same properties. Thus it is more natural to use a metric modulo an equivalence relation to study the structure of the dataset related to a given set of properties. From this point of view, one can apply this idea to a variety of problems such as weight prediction or multi-label classification, as shown in this paper.
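As a toy illustration of Definition 2.4 (our own example, not a construction from the paper): if a map $f$ assigns to each element a property value, then $d(x,y)=|f(x)-f(y)|$ is a metric modulo the relation $x\cong y$ if and only if $f(x)=f(y)$. The sketch below checks the four axioms exhaustively on a small set:

```python
import itertools

# Hypothetical ground set and a feature shared by "equivalent" elements.
X = ["a", "b", "c", "d"]
f = {"a": 0, "b": 0, "c": 2, "d": 5}          # a and b share the same feature

def d(x, y):
    """d(x, y) = |f(x) - f(y)| is a metric on X modulo the relation
    x ~ y iff f(x) = f(y)."""
    return abs(f[x] - f[y])

# The axioms of Definition 2.4, checked exhaustively on the toy set.
for x, y, z in itertools.product(X, repeat=3):
    assert d(x, y) >= 0                        # (i)   non-negativity
    assert (d(x, y) == 0) == (f[x] == f[y])    # (ii)  zero iff equivalent
    assert d(x, y) == d(y, x)                  # (iii) symmetry
    assert d(x, y) <= d(x, z) + d(z, y)        # (iv)  triangle inequality

print(d("a", "b"))  # -> 0, although a and b are distinct elements
```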

2.2. Methodology

In this subsection, we introduce our methodology for the weight (respectively, label) prediction problem for hypergraph datasets. Throughout this subsection, let $X=(\mathcal{V}(X),\mathcal{E}(X))$ be a hypergraph. The main contributions of our work are the constructions of several special metric structures on the set of hyperedges $\mathcal{E}(X)$. The main motivation for equipping $\mathcal{E}(X)$ with such metric structures is that we want to modify the classical $k$ nearest neighbors (kNN) method (which only works if the underlying space is Euclidean) and apply it to the weight (respectively, label) prediction problem for elements of $\mathcal{E}(X)$. In this work, we introduce two different paths toward this modification of the classical kNN method: the first, called modified kNN, directly uses the metrics we construct on $\mathcal{E}(X)$; the second, called embedded kNN, does not use the metrics, but only the local geometry around each hyperedge in $\mathcal{E}(X)$, which will also be introduced in this work. In the following, we describe our modified and embedded kNN methods in detail.

2.2.1. Modified kNN methods

Assume that there is a metric $d$ on $\mathcal{E}(X)$ modulo an equivalence relation $\cong$.

\star Weight Prediction Problem.

In the weight prediction problem, we assume that there is a subset $\mathcal{E}_{0}(X)$ of $\mathcal{E}(X)$ such that there is a map $W:\mathcal{E}_{0}(X)\to[a,b]$ for some interval $[a,b]$ in $\mathbb{R}$. The value $W(\mathfrak{e})$ is called the weight of $\mathfrak{e}$. The aim of the weight prediction problem is to predict the possible values of $W(\mathfrak{e})$ for $\mathfrak{e}\in\mathcal{E}(X)\setminus\mathcal{E}_{0}(X)$. We propose a modified kNN method for predicting weights of elements in $\mathcal{E}(X)\setminus\mathcal{E}_{0}(X)$ as follows.

Fix a positive integer $k\geq 1$. Take an arbitrary element $\mathfrak{e}\in\mathcal{E}(X)\setminus\mathcal{E}_{0}(X)$. Let $\mathrm{DIS}(\mathfrak{e})$ denote the set of all distances from $\mathfrak{e}$ to elements in $\mathcal{E}_{0}(X)$, i.e.,

\mathrm{DIS}(\mathfrak{e})=\{d(\mathfrak{e},\mathfrak{e}_{0})\;|\;\mathfrak{e}_{0}\in\mathcal{E}_{0}(X)\}.

Note that the set $\mathrm{DIS}(\mathfrak{e})$ consists of finitely many distinct nonnegative real numbers, say $d_{1}(\mathfrak{e}),\ldots,d_{r_{\mathfrak{e}}}(\mathfrak{e})$ for some positive integer $r_{\mathfrak{e}}\geq 1$, such that

d_{1}(\mathfrak{e})<d_{2}(\mathfrak{e})<\cdots<d_{r_{\mathfrak{e}}}(\mathfrak{e}).

In experimental analysis, we always choose $k$ such that $k\leq r_{\mathfrak{e}}$ for every $\mathfrak{e}\in\mathcal{E}(X)\setminus\mathcal{E}_{0}(X)$.

We choose the $k$ smallest values in $\mathrm{DIS}(\mathfrak{e})$, say $d_{1}(\mathfrak{e})<d_{2}(\mathfrak{e})<\cdots<d_{k}(\mathfrak{e})$, and let $\mathrm{kNN}(\mathfrak{e})$ denote the set of all elements $\mathfrak{e}_{0}\in\mathcal{E}_{0}(X)$ such that there exists an integer $\ell_{\mathfrak{e}_{0}}$ with $1\leq\ell_{\mathfrak{e}_{0}}\leq k$ for which $d_{\ell_{\mathfrak{e}_{0}}}(\mathfrak{e})=d(\mathfrak{e},\mathfrak{e}_{0})$, i.e.,

(1) \mathrm{kNN}(\mathfrak{e})=\{\mathfrak{e}_{0}\in\mathcal{E}_{0}(X)\;|\;d(\mathfrak{e},\mathfrak{e}_{0})=d_{\ell_{\mathfrak{e}_{0}}}(\mathfrak{e})\text{ for some }1\leq\ell_{\mathfrak{e}_{0}}\leq k\}.

We propose a predictive model $\overline{W}$ for $W$ as follows:

\overline{W}(\mathfrak{e})=\dfrac{\sum_{\mathfrak{e}_{0}\in\mathrm{kNN}(\mathfrak{e})}W(\mathfrak{e}_{0})}{\mathrm{card}(\mathrm{kNN}(\mathfrak{e}))},

where $\mathrm{card}(\mathrm{kNN}(\mathfrak{e}))$ denotes the number of elements in $\mathrm{kNN}(\mathfrak{e})$.
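The modified kNN rule above can be sketched as follows. This is a minimal illustration under our own assumptions: the toy data and the name `modified_knn_weight` are ours, and the toy metric `d` (difference of hyperedge sizes) is not one of the metrics constructed in Subsection 2.3:

```python
def modified_knn_weight(e, observed, W, d, k):
    """Modified kNN of Subsection 2.2.1: 'observed' plays the role of
    E_0(X), W maps observed hyperedges to weights, and d is a metric
    modulo an equivalence relation on hyperedges."""
    # DIS(e): the distinct distances from e to the observed hyperedges.
    dis = sorted({d(e, e0) for e0 in observed})
    cutoff = set(dis[:k])                      # the k smallest distinct values
    knn = [e0 for e0 in observed if d(e, e0) in cutoff]
    return sum(W[e0] for e0 in knn) / len(knn)

# Hypothetical toy data: hyperedges as frozensets.
observed = [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 2, 3, 4})]
W = {observed[0]: 0.2, observed[1]: 0.4, observed[2]: 0.9}
d = lambda e, f: abs(len(e) - len(f))          # toy metric for illustration
# Average weight over the hyperedges at the single smallest distinct distance.
print(modified_knn_weight(frozenset({4, 5}), observed, W, d, k=1))
```

Note that, exactly as in (1), all observed hyperedges realizing one of the $k$ smallest distinct distances belong to $\mathrm{kNN}(\mathfrak{e})$, so the set may contain more than $k$ elements.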

In experimental analysis, we perform the modified kNN method introduced above with $d$ replaced by the metrics modulo certain equivalence relations which we construct on the set of hyperedges $\mathcal{E}(X)$ in Subsection 2.3. In Table 1, we list the metrics used in the modified kNN methods.

Table 1. Metrics modulo certain equivalence relations used in modified kNN methods
The metric $d$
$\mathcal{D}_{h,\mathcal{B},+\infty}$ defined in (2.11)
$\mathcal{D}_{h,\mathcal{B},\epsilon}$ defined in (2.13)

\star Label Prediction Problem

In the label prediction problem, we assume that there is a subset, say $\mathcal{E}_{0}(X)$, of $\mathcal{E}(X)$ such that there is a map $L:\mathcal{E}_{0}(X)\to\{1,\ldots,q\}$ for some integer $q\geq 2$. The value $L(\mathfrak{e}_{0})$ is called the label of $\mathfrak{e}_{0}$. The aim of the label prediction problem is to predict the possible values of $L(\mathfrak{e})$ for $\mathfrak{e}\in\mathcal{E}(X)\setminus\mathcal{E}_{0}(X)$. We propose a modified kNN method for predicting labels of elements in $\mathcal{E}(X)\setminus\mathcal{E}_{0}(X)$ as follows.

For each $\mathfrak{e}\in\mathcal{E}(X)\setminus\mathcal{E}_{0}(X)$, we define the set $\mathrm{kNN}(\mathfrak{e})$ as in (1).

We propose a predictive model $\overline{L}$ for $L$ as follows. Let $\mathcal{L}(\mathrm{kNN})(\mathfrak{e})$ be the multiset of labels $L(\mathfrak{e}_{0})$ for $\mathfrak{e}_{0}\in\mathrm{kNN}(\mathfrak{e})$. If there is a mode, say $L(\mathfrak{e}_{0})$, in $\mathcal{L}(\mathrm{kNN})(\mathfrak{e})$ for some $\mathfrak{e}_{0}\in\mathrm{kNN}(\mathfrak{e})$, we set $\overline{L}(\mathfrak{e})=L(\mathfrak{e}_{0})$; otherwise we let $\overline{L}(\mathfrak{e})$ be the closest integer to the average $\dfrac{\sum_{\mathfrak{e}_{0}\in\mathrm{kNN}(\mathfrak{e})}L(\mathfrak{e}_{0})}{\mathrm{card}(\mathrm{kNN}(\mathfrak{e}))}$.
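This label rule can be sketched as follows (our illustration; `predict_label` is a hypothetical name, "mode" is read as a label with strictly largest multiplicity, and Python's `round` stands in for "closest integer"):

```python
from collections import Counter

def predict_label(neighbor_labels):
    """Label rule of Subsection 2.2.1: return the mode of the labels in
    kNN(e) when one label has strictly largest multiplicity; otherwise
    fall back to the rounded average of the labels."""
    counts = Counter(neighbor_labels).most_common()
    if len(counts) == 1 or counts[0][1] > counts[1][1]:
        return counts[0][0]                    # a unique mode exists
    avg = sum(neighbor_labels) / len(neighbor_labels)
    return round(avg)

print(predict_label([2, 2, 3]))     # unique mode -> 2
print(predict_label([1, 1, 2, 2]))  # tie: falls back to the rounded average
```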

2.2.2. Embedded kNN methods

In this subsection, we introduce our embedded kNN methods. In the construction of each metric in this work, we need to introduce a local geometry at each element of the set $\mathcal{E}(X)$. The local geometry at an element needs to capture the local structure around that element within the set of hyperedges $\mathcal{E}(X)$. Thus the local geometry at a hyperedge gathers the local hypergraph structure around the hyperedge with respect to certain properties of hypergraphs that we want to investigate. In this work, we take the viewpoint that the local geometry at a hyperedge $\mathfrak{e}$ is a set of certain hyperedges in $\mathcal{E}(X)$ which we consider to share similar features with $\mathfrak{e}$ with respect to certain properties. We introduce two different types of local geometry on the set of hyperedges, which we call type I and type II neighborhoods of hyperedges, respectively.

Table 2 lists types of neighborhoods for hyperedges introduced in this paper.

Table 2. Types of neighborhoods used in embedded kNN methods
Types of neighborhoods
Type I neighborhoods $\mathcal{N}_{h,\mathcal{B},+\infty}$, defined in (2.10)
Type II neighborhoods $\mathcal{N}_{h,\mathcal{B},\epsilon}$, defined in (2.12)

We first consider the label prediction problem.

\star Label Prediction Problem

Let $\mathcal{E}_{0}(X)$ be a subset of $\mathcal{E}(X)$ for which each hyperedge in $\mathcal{E}_{0}(X)$ is equipped with a label in $\{1,\ldots,q\}$, i.e., there is a map $L:\mathcal{E}_{0}(X)\to\{1,\ldots,q\}$.

In this case, let $\mathcal{N}_{\text{edge}}$ denote either the type I neighborhood $\mathcal{N}_{h,\mathcal{B},+\infty}$ or the type II neighborhood $\mathcal{N}_{h,\mathcal{B},\epsilon}$ in Table 2. We define the map $T_{\mathcal{E}(X)}:\mathcal{E}(X)\to\mathbb{R}^{q}$ which represents each hyperedge as a point in $\mathbb{R}^{q}$ as follows. For each $\mathfrak{e}\in\mathcal{E}(X)$,

T_{\mathcal{E}(X)}(\mathfrak{e})=(\eta_{1}(\mathcal{N}_{\text{edge}}(\mathfrak{e})),\ldots,\eta_{q}(\mathcal{N}_{\text{edge}}(\mathfrak{e}))),

where for each $1\leq i\leq q$, $\eta_{i}(\mathcal{N}_{\text{edge}}(\mathfrak{e}))$ denotes the number of hyperedges $\mathfrak{f}\in\mathcal{N}_{\text{edge}}(\mathfrak{e})$ such that $L(\mathfrak{f})=i$. We call $T_{\mathcal{E}(X)}$ the feature map of $\mathcal{E}(X)$. Note that the map $T_{\mathcal{E}(X)}$ is well-defined since, by our notion of neighborhoods of hyperedges, each hyperedge $\mathfrak{f}$ in the neighborhood $\mathcal{N}_{\text{edge}}(\mathfrak{e})$ belongs to the set $\mathcal{E}_{0}(X)$, and thus one can associate a label $L(\mathfrak{f})$ to it.

Under the map $T_{\mathcal{E}(X)}$, the set of hyperedges $\mathcal{E}(X)$ can be represented as a subset, say $T_{\mathcal{E}(X)}(\mathcal{E}(X))$, of $\mathbb{R}^{q}$. For the label prediction problem, we view the label of each element $T_{\mathcal{E}(X)}(\mathfrak{e})\in\mathbb{R}^{q}$ for a given hyperedge $\mathfrak{e}$ as the same as that of $\mathfrak{e}\in\mathcal{E}(X)$. In order to predict the label of a hyperedge $\mathfrak{e}$, we perform the classical kNN method on the set $T_{\mathcal{E}(X)}(\mathcal{E}(X))\subseteq\mathbb{R}^{q}$ to predict the label of $T_{\mathcal{E}(X)}(\mathfrak{e})$, which we view as the label of the hyperedge $\mathfrak{e}$.
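The feature map $T_{\mathcal{E}(X)}$ can be sketched as follows. This is our own illustration: the toy neighborhood rule (labeled hyperedges sharing a vertex with $\mathfrak{e}$) is a hypothetical stand-in for $\mathcal{N}_{\text{edge}}$, whose actual definitions are those of Table 2:

```python
def feature_map(e, neighborhood, L, q):
    """T_{E(X)}: represent a hyperedge e by the vector (eta_1, ..., eta_q),
    where eta_i counts the neighbors of e carrying label i."""
    nbrs = neighborhood(e)                     # all neighbors lie in E_0(X)
    return tuple(sum(1 for f in nbrs if L[f] == i) for i in range(1, q + 1))

# Hypothetical labeled hyperedges and a toy neighborhood rule.
labeled = {frozenset({1, 2}): 1, frozenset({2, 3}): 2, frozenset({3, 4}): 2}
neighborhood = lambda e: [f for f in labeled if e & f]
print(feature_map(frozenset({2, 3, 5}), neighborhood, labeled, q=2))  # -> (1, 2)
```

The classical kNN method can then be run on these points of $\mathbb{R}^{q}$.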

\star Weight Prediction Problem

In this case, let $\mathcal{E}_{0}(X)$ be a subset of $\mathcal{E}(X)$ for which each hyperedge in $\mathcal{E}_{0}(X)$ is equipped with a weight in an interval $[a,b]$ in $\mathbb{R}$, i.e., there is a map $W:\mathcal{E}_{0}(X)\to[a,b]$. For an integer $q\geq 2$, we associate a map $L:\mathcal{E}_{0}(X)\to\{1,\ldots,q\}$ to $W$ as follows. We first divide the interval $[a,b]$ into $q$ sub-intervals of equal length, namely $[a_{0},a_{1}]$, $(a_{1},a_{2}]$, …, $(a_{q-1},a_{q}]$, where $a_{0}=a$, $a_{q}=b$, and $a_{i}=a_{i-1}+(b-a)/q$ for each $1\leq i\leq q$. For each $\mathfrak{e}\in\mathcal{E}_{0}(X)$, we let $L(\mathfrak{e})=1$ if $W(\mathfrak{e})\in[a_{0},a_{1}]$, and $L(\mathfrak{e})=i$ if $i$ is the unique integer in $\{2,\ldots,q\}$ such that $W(\mathfrak{e})\in(a_{i-1},a_{i}]$. Thus one obtains the map $L:\mathcal{E}_{0}(X)\to\{1,\ldots,q\}$.
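This binning of weights into labels can be sketched as follows (a minimal illustration; `weight_to_label` is our own name, and floating-point boundary cases are only guarded approximately):

```python
import math

def weight_to_label(w, a, b, q):
    """Bin a weight w in [a, b] into q equal-length sub-intervals:
    label 1 for [a_0, a_1], and label i for (a_{i-1}, a_i] when i >= 2."""
    step = (b - a) / q
    if w <= a + step:
        return 1
    # smallest i with w in (a_{i-1}, a_i], capped at q to guard rounding
    return min(q, math.ceil((w - a) / step))

print(weight_to_label(0.0, 0.0, 1.0, 2))  # -> 1
print(weight_to_label(0.7, 0.0, 1.0, 2))  # -> 2
```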

Repeating the same arguments as in the label prediction problem, one obtains the feature map defined by

T_{\mathcal{E}(X)}(\mathfrak{e})=(\eta_{1}(\mathcal{N}_{\text{edge}}(\mathfrak{e})),\ldots,\eta_{q}(\mathcal{N}_{\text{edge}}(\mathfrak{e}))).

Under the map $T_{\mathcal{E}(X)}$, the set of hyperedges $\mathcal{E}(X)$ can be represented as a subset, say $T_{\mathcal{E}(X)}(\mathcal{E}(X))$, of $\mathbb{R}^{q}$. For the weight prediction problem, we view the weight of each element $T_{\mathcal{E}(X)}(\mathfrak{e})\in\mathbb{R}^{q}$ for a given hyperedge $\mathfrak{e}$ as the same as that of $\mathfrak{e}\in\mathcal{E}(X)$. In order to predict the weight of a hyperedge $\mathfrak{e}$, we perform the classical kNN method on the set $T_{\mathcal{E}(X)}(\mathcal{E}(X))\subseteq\mathbb{R}^{q}$ to predict the weight of $T_{\mathcal{E}(X)}(\mathfrak{e})$, which we view as the weight of the hyperedge $\mathfrak{e}$.

In our experimental analysis, for simplicity, we always let $q=2$.

2.3. A class of metrics for the set of hyperedges of a hypergraph

In this subsection, we introduce a class of metrics modulo certain equivalence relations on the set of hyperedges in a hypergraph. We consider the incomplete hypergraphs introduced in Definition 2.1, which are suitable for studying the weight (respectively, label) prediction problem for the set of hyperedges.

For the rest of this section, we fix an incomplete hypergraph $X=(\mathcal{V}(X),\mathcal{E}(X))$ such that there is a subset $\mathcal{E}_{0}(X)$ of $\mathcal{E}(X)$ and a map $F:\mathcal{E}_{0}(X)\to A$, where $A$ is either an interval $[a,b]$ in $\mathbb{R}$ or a set of labels $\{1,\ldots,q\}$ for some integer $q\geq 2$. In the weight prediction problem, $F$ is the weight map $W$ and $A$ is an interval $[a,b]$, whereas in the label prediction problem, $F$ is the label map $L$ and $A=\{1,\ldots,q\}$.

We first introduce a notion of neighborhoods of hyperedges in $X$. For each $\mathfrak{e}\in\mathcal{E}(X)$, denote by $s(\mathfrak{e})$ the size of $\mathfrak{e}$, i.e., the number of vertices in $\mathfrak{e}$. Let $\mathcal{S}$ denote the set consisting of the sizes of all hyperedges in $X$, i.e., $\mathcal{S}=\{s(\mathfrak{e})\;|\;\mathfrak{e}\in\mathcal{E}(X)\}$. Let $\mathcal{M}:\mathcal{S}\to\mathbb{R}_{>0}\cup\{+\infty\}$ be a mapping which will be used to control the sizes of neighbors of hyperedges.

Definition 2.5.

($(\mathcal{E}_{0}(X),\mathcal{M})$-neighborhoods)

Let $\mathfrak{e}$ be a hyperedge in $\mathcal{E}(X)$. The $(\mathcal{E}_{0}(X),\mathcal{M})$-neighborhood of $\mathfrak{e}$ is defined by

\mathcal{N}_{\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})=\{\mathfrak{f}\in\mathcal{E}_{0}(X)\;|\;\mathrm{card}(\mathfrak{e}\cap\mathfrak{f})\geq\lfloor s(\mathfrak{e})/2\rfloor\text{ and }s(\mathfrak{f})\leq\mathcal{M}(s(\mathfrak{e}))\},

where $\mathfrak{e}\cap\mathfrak{f}$ is the set of vertices that $\mathfrak{e}$ and $\mathfrak{f}$ have in common, and $\lfloor\cdot\rfloor$ denotes the floor function.
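Definition 2.5 translates directly into code. The following sketch is our own illustration (the name `neighborhood` and the toy data are assumptions), using the size-control map $\mathcal{M}(s)=\frac{4}{3}s$ that also appears in Example 2.8:

```python
def neighborhood(e, observed, M):
    """(E_0(X), M)-neighborhood of Definition 2.5: hyperedges f in E_0(X)
    sharing at least floor(s(e)/2) vertices with e and with s(f) <= M(s(e))."""
    bound = M(len(e))
    return [f for f in observed
            if len(e & f) >= len(e) // 2 and len(f) <= bound]

# Toy data (our own): E_0(X) as a list of frozensets, with M(s) = (4/3) s.
observed = [frozenset({1, 2, 3}), frozenset({2, 3, 4, 5, 6}), frozenset({7, 8})]
e = frozenset({1, 2, 3, 4})
print(len(neighborhood(e, observed, lambda s: 4 * s / 3)))  # -> 2
```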

For the construction of a metric on an incomplete hypergraph, for each hyperedge $\mathfrak{e}$ we only focus on $(\mathcal{E}_{0}(X),\mathcal{M})$-neighbors $\mathfrak{f}$ of $\mathfrak{e}$ such that the differences between the values $F(\mathfrak{f})$ and the predicted value of $F(\mathfrak{e})$ are very small; these differences can be controlled using a tuning parameter $h$. The smaller these differences become, the more precise the predicted value of $F(\mathfrak{e})$ is.

We now introduce a notion of neighborhoods of hyperedges, depending on a tuning parameter hh.

Definition 2.6.

Let $h>0$ be a tuning parameter. For each hyperedge $\mathfrak{e}\in\mathcal{E}(X)$, set

\mathrm{AVG}_{\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})=\dfrac{\sum_{\mathfrak{f}\in\mathcal{N}_{\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})}F(\mathfrak{f})}{\mathrm{card}(\mathcal{N}_{\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e}))}.

Set

\mathcal{N}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})=\{\mathfrak{f}\in\mathcal{N}_{\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})\;|\;|F(\mathfrak{f})-\mathrm{AVG}_{\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})|\leq h\}.
Figure 1. Example of hypergraph with equivalent hyperedges.

For each $\mathfrak{e}\in\mathcal{E}(X)$, set

\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})=\mathrm{card}(\mathcal{N}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})),

that is, $\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})$ is the number of elements in $\mathcal{N}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})$.
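The computation of $\mathrm{AVG}_{\mathcal{E}_{0}(X),\mathcal{M}}$ and of the count $\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}$ can be sketched as follows (an illustration on hypothetical neighbor data; the $(\mathcal{E}_{0}(X),\mathcal{M})$-neighborhood is assumed to be precomputed, and `count_C_h` is our own name):

```python
def count_C_h(nbrs, F, h):
    """C_{h,E_0(X),M}(e): among the (E_0(X), M)-neighbors of e, count those
    whose F-value lies within h of the neighborhood average AVG."""
    avg = sum(F[f] for f in nbrs) / len(nbrs)  # AVG_{E_0(X),M}(e)
    return sum(1 for f in nbrs if abs(F[f] - avg) <= h)

# Hypothetical F-values of the neighbors of a single hyperedge e.
F = {"f1": 0.4, "f2": 0.5, "f3": -0.2}
print(count_C_h(["f1", "f2", "f3"], F, h=0.25))  # -> 1
```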

We introduce an equivalence relation on the set of hyperedges $\mathcal{E}(X)$ which allows us to identify certain hyperedges in $\mathcal{E}(X)$. Note that if two hyperedges, say $\mathfrak{e}$ and $\mathfrak{f}$, satisfy $\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})=\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{f})$, then it is natural to view $\mathfrak{e}$ and $\mathfrak{f}$ as similar hyperedges in the weight (respectively, label) prediction problem, since their neighborhood structures around the average of the weights (respectively, labels) are very similar.

Hence it is natural to define a binary relation on $\mathcal{E}(X)$ as follows: two hyperedges $\mathfrak{e}$ and $\mathfrak{f}$ are equivalent, denoted by $\mathfrak{e}\cong\mathfrak{f}$, if $\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})=\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{f})$. One obtains the following.

Proposition 2.7.

The binary relation “\cong” is an equivalence relation.

Example 2.8.

Figure 1 is an example of a hypergraph that indicates which hyperedges are $\cong_{3}$-equivalent.

In this example, 0(X)={𝔞,𝔟,𝔠,𝔡,𝔤,𝔥}\mathcal{E}_{0}(X)=\{\mathfrak{a},\mathfrak{b},\mathfrak{c},\mathfrak{d},\mathfrak{g},\mathfrak{h}\}. The weights of 𝔞,𝔟,𝔠,𝔡,𝔤,𝔥\mathfrak{a},\mathfrak{b},\mathfrak{c},\mathfrak{d},\mathfrak{g},\mathfrak{h} are 0.4,0.5,0.35,0.46,0.2,0.150.4,0.5,0.35,0.46,-0.2,-0.15, respectively. Let (s(𝔢))=43s(𝔢)\mathcal{M}(s(\mathfrak{e}))=\dfrac{4}{3}s(\mathfrak{e}) for each 𝔢(X)\mathfrak{e}\in\mathcal{E}(X). Then ℭh,0(X),(𝔢)=ℭh,0(X),(𝔣)=2\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})=\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{f})=2 for the equivalent hyperedges 𝔢\mathfrak{e} and 𝔣\mathfrak{f} shown in Figure 1, which indicates 𝔢𝔣\mathfrak{e}\cong\mathfrak{f}.

Now we define a mapping 𝒟h,0(X),:(X)×(X)0\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}}:\mathcal{E}(X)\times\mathcal{E}(X)\to\mathbb{R}_{\geq 0} as follows. For hyperedges 𝔢,𝔣\mathfrak{e},\mathfrak{f}, define

(2) 𝒟h,0(X),(𝔢,𝔣)=|h,0(X),(𝔢)h,0(X),(𝔣)|.\displaystyle\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e},\mathfrak{f})=|\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})-\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{f})|.
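As a sketch (hypothetical names, assuming neighborhoods are given as plain sets and F is the weight function), the counting function ℭ and the distance (2) can be computed as:

```python
def count_C(neighbors, F, h):
    # C_h: number of neighbors whose weight is within h of the neighborhood average
    avg = sum(F[f] for f in neighbors) / len(neighbors)
    return sum(1 for f in neighbors if abs(F[f] - avg) <= h)

def D(nbrs_e, nbrs_f, F, h):
    # D_h(e, f) = |C_h(e) - C_h(f)|; D_h(e, f) = 0 exactly when e ≅ f
    return abs(count_C(nbrs_e, F, h) - count_C(nbrs_f, F, h))

# toy data
F = {"a": 0.0, "b": 1.0, "c": 0.1}
nbrs_e = {"a", "c"}   # average 0.05; both neighbors lie within h = 0.1
nbrs_f = {"a", "b"}   # average 0.5; neither neighbor lies within h = 0.1
```

Symmetry and the triangle inequality are inherited from the absolute value.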
Theorem 2.9.

𝒟h,0(X),\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}} is a metric modulo the equivalence relation \cong.

Proof.

It is clear that 𝒟h,0(X),(𝔢,𝔣)0\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e},\mathfrak{f})\geq 0 which shows that (i) in Definition 2.4 holds.

Assume that 𝒟h,0(X),(𝔢,𝔣)=0\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e},\mathfrak{f})=0 for some 𝔢,𝔣(X)\mathfrak{e},\mathfrak{f}\in\mathcal{E}(X). Then it follows from (2) that ℭh,0(X),(𝔢)=ℭh,0(X),(𝔣)\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})=\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{f}), which implies that 𝔢𝔣\mathfrak{e}\cong\mathfrak{f}. Thus (ii) in Definition 2.4 holds.

It is obvious that 𝒟h,0(X),(𝔢,𝔣)=𝒟h,0(X),(𝔣,𝔢)\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e},\mathfrak{f})=\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{f},\mathfrak{e}), and thus (iii) in Definition 2.4 holds.

Let 𝔢,𝔣,𝔤\mathfrak{e},\mathfrak{f},\mathfrak{g} be hyperedges in (X)\mathcal{E}(X). We see that

𝒟h,0(X),(𝔢,𝔤)\displaystyle\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e},\mathfrak{g}) =|h,0(X),(𝔢)h,0(X),(𝔤)|\displaystyle=|\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})-\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{g})|
=|(h,0(X),(𝔢)h,0(X),(𝔣))+(h,0(X),(𝔣)h,0(X),(𝔤))|\displaystyle=|(\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})-\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{f}))+(\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{f})-\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{g}))|
|h,0(X),(𝔢)h,0(X),(𝔣)|+|h,0(X),(𝔣)h,0(X),(𝔤)|\displaystyle\leq|\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e})-\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{f})|+|\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{f})-\mathfrak{C}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{g})|
=𝒟h,0(X),(𝔢,𝔣)+𝒟h,0(X),(𝔣,𝔤).\displaystyle=\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e},\mathfrak{f})+\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{f},\mathfrak{g}).

Therefore (iv) in Definition 2.4 holds, and thus 𝒟h,0(X),\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}} is a metric modulo the equivalence relation \cong.

Depending on the range of the values of \mathcal{M}, one can obtain different types of neighborhoods of hyperedges, and different versions of the metric 𝒟h,0(X),\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}}. Two distinguished cases which we study in the paper are as follows.

Definition 2.10.

(Type I neighborhoods of hyperedges)

Let (s(𝔢))=+\mathcal{M}(s(\mathfrak{e}))=+\infty for each hyperedge 𝔢(X)\mathfrak{e}\in\mathcal{E}(X). For 𝔢(X)\mathfrak{e}\in\mathcal{E}(X), following the construction of the neighborhood 𝒩h,0(X),(𝔢)\mathcal{N}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e}) with (s(𝔢))\mathcal{M}(s(\mathfrak{e})) replaced by ++\infty, one obtains a new type of neighborhood of 𝔢\mathfrak{e}, denoted by 𝒩h,0(X),+(𝔢)\mathcal{N}_{h,\mathcal{E}_{0}(X),+\infty}(\mathfrak{e}), which we call the Type I neighborhood of 𝔢\mathfrak{e}.

Definition 2.11.

(A first metric on (X)\mathcal{E}(X))

Let (s(𝔢))=+\mathcal{M}(s(\mathfrak{e}))=+\infty for each hyperedge 𝔢(X)\mathfrak{e}\in\mathcal{E}(X). Following the construction of the metric 𝒟h,0(X),\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}} with (s(𝔢))\mathcal{M}(s(\mathfrak{e})) replaced by ++\infty, one obtains a metric modulo the equivalence relation \cong which we denote by 𝒟h,0(X),+\mathcal{D}_{h,\mathcal{E}_{0}(X),+\infty}.

The key feature of the metric 𝒟h,0(X),+\mathcal{D}_{h,\mathcal{E}_{0}(X),+\infty} is that it places no restriction on the sizes of the neighborhoods of hyperedges. It therefore retains all weight information from the neighborhoods of hyperedges in its definition, without filtering which weights are possibly important for prediction.

Definition 2.12.

(Type II neighborhoods of hyperedges)

Let (s(𝔢))=ϵs(𝔢)\mathcal{M}(s(\mathfrak{e}))=\epsilon s(\mathfrak{e}) for each hyperedge 𝔢(X)\mathfrak{e}\in\mathcal{E}(X), where ϵ\epsilon is a constant in the interval (0,2](0,2]. For 𝔢(X)\mathfrak{e}\in\mathcal{E}(X), following the construction of the neighborhood 𝒩h,0(X),(𝔢)\mathcal{N}_{h,\mathcal{E}_{0}(X),\mathcal{M}}(\mathfrak{e}) with (s(𝔢))\mathcal{M}(s(\mathfrak{e})) replaced by ϵs(𝔢)\epsilon s(\mathfrak{e}), one obtains a new type of neighborhood of 𝔢\mathfrak{e}, denoted by 𝒩h,0(X),ϵ(𝔢)\mathcal{N}_{h,\mathcal{E}_{0}(X),\epsilon}(\mathfrak{e}), which we call the Type II neighborhood of 𝔢\mathfrak{e}.

Definition 2.13.

(A second metric on (X)\mathcal{E}(X))

Let (s(𝔢))=ϵs(𝔢)\mathcal{M}(s(\mathfrak{e}))=\epsilon s(\mathfrak{e}) for each hyperedge 𝔢(X)\mathfrak{e}\in\mathcal{E}(X), where ϵ\epsilon is a constant in the interval (0,2](0,2]. Following the construction of the metric 𝒟h,0(X),\mathcal{D}_{h,\mathcal{E}_{0}(X),\mathcal{M}} with (s(𝔢))\mathcal{M}(s(\mathfrak{e})) replaced by ϵs(𝔢)\epsilon s(\mathfrak{e}), one obtains a metric modulo the equivalence relation \cong which we denote by 𝒟h,0(X),ϵ\mathcal{D}_{h,\mathcal{E}_{0}(X),\epsilon}.
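The two neighborhood types differ only in the bound (s(𝔢))\mathcal{M}(s(\mathfrak{e})). As an illustration only (the paper's actual membership condition for 𝒩\mathcal{N} is defined earlier; the closeness measure `close` below, the size of the symmetric difference, is a hypothetical stand-in), the role of \mathcal{M} can be sketched as:

```python
import math

def close(e, f):
    # hypothetical stand-in for the paper's closeness condition between
    # two hyperedges: size of their symmetric difference
    return len(set(e) ^ set(f))

def neighborhood(e, observed, bound):
    # observed hyperedges whose closeness to e is within the bound M(s(e))
    return [f for f in observed if close(e, f) <= bound]

e = {1, 2, 3}
observed = [{1, 2}, {1, 2, 3, 4, 5, 6, 7}]

type_I  = neighborhood(e, observed, math.inf)      # Type I:  M(s(e)) = +inf
type_II = neighborhood(e, observed, 1.0 * len(e))  # Type II: M(s(e)) = eps*s(e), eps = 1
```

With \mathcal{M} set to ++\infty every observed hyperedge qualifies; with ϵs(𝔢)\epsilon s(\mathfrak{e}) the neighborhood is capped relative to the size of 𝔢\mathfrak{e}.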

3. Experimental analysis

In this section, we apply the methodology proposed in Subsection 2.2 combined with the class of metrics in Subsection 2.3 to construct predictive models for weight (respectively, label) prediction problem for hypergraph datasets.

We are not aware of any hypergraph datasets containing weights (respectively, labels) of hyperedges. In our experimental analysis, we therefore use datasets of weighted bipartite networks and transform them into hypergraphs.

Let G=(U,V,E)G=(U,V,E) be a bipartite network, where UU and VV are disjoint sets of vertices and EE is the set of edges. Each edge eEe\in E is formed by a unique pair of vertices (u,v)(u,v) with uUu\in U and vVv\in V. A hypergraph X=(𝒱(X),(X))X=(\mathcal{V}(X),\mathcal{E}(X)) induced by the bipartite graph GG is defined as follows.

  • (i)

    Without loss of generality (exchanging the roles of UU and VV if necessary), let 𝒱(X)=U\mathcal{V}(X)=U; that is, the set of vertices in XX is defined to be one of the two vertex sets in GG.

  • (ii)

    For each vVv\in V, let 𝔢v\mathfrak{e}_{v} denote the set of all vertices uu in UU such that (u,v)E(u,v)\in E. We define the set of hyperedges of XX as

    (X)={𝔢v|vV}.\displaystyle\mathcal{E}(X)=\{\mathfrak{e}_{v}\;|\;v\in V\}.

    In this case, each hyperedge 𝔢\mathfrak{e} in (X)\mathcal{E}(X) uniquely corresponds to a vertex in VV; thus (X)\mathcal{E}(X) is in bijection with VV.
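This transformation can be sketched in a few lines of Python (hypothetical names; edges are given as (user, item) pairs):

```python
from collections import defaultdict

def bipartite_to_hypergraph(edges):
    # Build the induced hypergraph: vertices are the left set U, and each
    # right vertex v in V yields the hyperedge e_v = {u in U : (u, v) in E}.
    hyperedges = defaultdict(set)
    for u, v in edges:
        hyperedges[v].add(u)
    vertices = set().union(*hyperedges.values()) if hyperedges else set()
    return vertices, dict(hyperedges)

# toy bipartite rating network: users 1-3 rate items "a" and "b"
E = [(1, "a"), (2, "a"), (2, "b"), (3, "b"), (3, "a")]
V_X, E_X = bipartite_to_hypergraph(E)
# E_X == {"a": {1, 2, 3}, "b": {2, 3}}: one hyperedge per item
```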

In our experimental analysis, we use two real-world social network datasets: the Epinions network and the MovieLens network. Both are bipartite rating networks in which one vertex set consists of users and the other consists of items. The weight of an edge represents a rating score from a user to an item. In the hypergraphs induced by these two networks, the set of vertices 𝒱(X)\mathcal{V}(X) is the set of users, and each hyperedge is formed by the set of all users that rate the same item. Therefore, each hyperedge corresponds to an item in the bipartite graph. An example of transforming a bipartite graph into a hypergraph is shown in Figures 2 and 3. Descriptions of the two datasets are as follows.

  • Epinions. This dataset was collected by Paolo Massa in a 5-week crawl (November/December 2003) from the Epinions.com website (the dataset is available at http://www.trustlet.org/downloaded_epinions.html; see [9]). In Epinions, the vertices are users and reviews; each user rates the helpfulness of a review on a 1–5 scale, where 1 means totally not helpful and 5 means totally helpful.

  • MovieLens. This dataset contains movie ratings collected from https://movielens.org/ (see [10]). In this dataset, the vertices are users and movies; each edge connects a user with a movie and represents a rating score. The rating scores are integers from 0 to 5.

Table 3 contains descriptions of the hypergraphs used in our computation. The weights of hyperedges are computed using the edge weights from the bipartite network datasets and the notion of goodness introduced in [7]. In both hypergraphs, each hyperedge corresponds to an item in the bipartite graph, and the weight of each hyperedge, which is computed using goodness, indicates how good an item is. In order to associate each hyperedge with a label, we divide the range of weights, which is the closed interval from the minimum weight to the maximum weight, into qq intervals of equal length. The label of a hyperedge is ii if its weight belongs to the iith interval. The set of labels is the set of integers from 1 to qq.
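The labeling step can be sketched as follows (hypothetical function name; we assume the maximum weight is assigned to the last interval, which the paper leaves implicit):

```python
def weight_to_label(w, w_min, w_max, q):
    # Divide [w_min, w_max] into q equal-length intervals; the label of a
    # hyperedge is i (1..q) when its weight falls in the i-th interval.
    if w >= w_max:          # boundary convention: the maximum weight gets label q
        return q
    width = (w_max - w_min) / q
    return int((w - w_min) // width) + 1
```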

According to the definitions of neighborhoods, one needs to select the tuning parameter hh for 𝒩h,0(X),+(𝔢)\mathcal{N}_{h,\mathcal{E}_{0}(X),+\infty}(\mathfrak{e}) and 𝒩h,0(X),ϵ(𝔢)\mathcal{N}_{h,\mathcal{E}_{0}(X),\epsilon}(\mathfrak{e}). In our computation, we set the value of hh based on the neighborhood of each hyperedge: for each 𝔢(X)\mathfrak{e}\in\mathcal{E}(X), hh equals the standard deviation of the weights of all hyperedges that belong to the neighborhood of 𝔢\mathfrak{e}. The choices of kk for kkNN are integers from 1 to 20, and the values of ϵ\epsilon are set to 53,43,1,23\frac{5}{3},\frac{4}{3},1,\frac{2}{3}, or 12\frac{1}{2}. The performance of weight prediction is evaluated using mean absolute error (MAE) and root mean squared error (RMSE). The performance of label prediction is evaluated using the error rate, i.e. the proportion of incorrect predictions. We select the values of kk and ϵ\epsilon that correspond to the smallest MAE or error rate.
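A minimal sketch of these evaluation measures and of the per-hyperedge choice of hh (hypothetical names; we assume the population standard deviation, since the paper does not specify sample versus population):

```python
import statistics

def adaptive_h(neighbor_weights):
    # h for a hyperedge: standard deviation of the weights of all
    # hyperedges in its neighborhood (population version assumed)
    return statistics.pstdev(neighbor_weights)

def mae(y_true, y_pred):
    # mean absolute error over predicted weights
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # root mean squared error over predicted weights
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5

def error_rate(labels_true, labels_pred):
    # proportion of incorrectly predicted labels
    return sum(t != p for t, p in zip(labels_true, labels_pred)) / len(labels_true)
```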

Figures 4 and 5 present the results of predicting weights of hyperedges using the Epinions dataset. In this example, the smallest MAE and RMSE values are obtained by setting ϵ\epsilon equal to 53\frac{5}{3}. In general, using the second type of neighborhood (𝒩h,0(X),ϵ(𝔢)\mathcal{N}_{h,\mathcal{E}_{0}(X),\epsilon}(\mathfrak{e})) leads to better prediction results than using the first type of neighborhood (𝒩h,0(X),+(𝔢)\mathcal{N}_{h,\mathcal{E}_{0}(X),+\infty}(\mathfrak{e})).

Tables 4 and 5 present the results of predicting weights of hyperedges using the modified kkNN method and the embedded kkNN method. In our experimental analysis, for simplicity, we set q=2q=2 for the computation of T(X)(𝔢)=(η1(𝒩edge(𝔢)),,ηq(𝒩edge(𝔢)))T_{\mathcal{E}(X)}(\mathfrak{e})=(\eta_{1}(\mathcal{N}_{\text{edge}}(\mathfrak{e})),\ldots,\eta_{q}(\mathcal{N}_{\text{edge}}(\mathfrak{e}))) in the embedded kkNN method, where 𝒩edge(𝔢)\mathcal{N}_{\text{edge}}(\mathfrak{e}) is either the Type I or the Type II neighborhood of 𝔢\mathfrak{e}. Each cell reports a pair of numbers (mean absolute error (MAE), root mean squared error (RMSE)).

Table 6 presents the error rates of predicting labels of hyperedges using modified kkNN methods. Table 7 presents the error rates of predicting labels of hyperedges using embedded kkNN methods.

Refer to caption

This is a subgraph of the Epinions network. UU stands for the set of users from 1 to 9, and BB stands for the set of books from aa to dd.


Figure 2. Example of bipartite graph
Refer to caption
Figure 3. Hypergraph based on the bipartite graph in Figure 2
Refer to caption
Figure 4. MAE of predicting weights of hyperedges using different values of ϵ\epsilon.
Refer to caption
Figure 5. RMSE of predicting weights of hyperedges using different values of ϵ\epsilon.
Table 3. Descriptions of hypergraphs constructed using each dataset
number of hyperedges number of vertices s(𝔢)s(\mathfrak{e}) :Size of hyperedge 𝔢\mathfrak{e}
Epinions 33774 32610 2s(𝔢)9182\leq s(\mathfrak{e})\leq 918
MovieLens 7498 75106 2s(𝔢)6872\leq s(\mathfrak{e})\leq 687
Table 4. Prediction of weights of hyperedges using modified kkNN
Metric 𝒟h,0(X),+\mathcal{D}_{h,\mathcal{E}_{0}(X),+\infty} 𝒟h,0(X),ϵ\mathcal{D}_{h,\mathcal{E}_{0}(X),\epsilon}
Epinions (0.236, 0.301) (0.234, 0.299)
MovieLens (0.172, 0.217) (0.184, 0.232)
Table 5. Prediction of weights of hyperedges using embedded kkNN
Neighborhood 𝒩0(X),+(𝔢)\mathcal{N}_{\mathcal{E}_{0}(X),+\infty}(\mathfrak{e}) 𝒩0(X),ϵ(𝔢)\mathcal{N}_{\mathcal{E}_{0}(X),\epsilon}(\mathfrak{e})
Epinions (0.234, 0.298) (0.232, 0.296)
MovieLens (0.171, 0.217) (0.183, 0.231)
Table 6. Error rates of predicting labels using modified kkNN
Dataset Epinions MovieLens
Metric 𝒟h,0(X),+\mathcal{D}_{h,\mathcal{E}_{0}(X),+\infty} 𝒟h,0(X),ϵ\mathcal{D}_{h,\mathcal{E}_{0}(X),\epsilon} 𝒟h,0(X),+\mathcal{D}_{h,\mathcal{E}_{0}(X),+\infty} 𝒟h,0(X),ϵ\mathcal{D}_{h,\mathcal{E}_{0}(X),\epsilon}
2 labels 0.182 (k=1) 0.176 (k=1) 0.250 (k=1) 0.250 (k=4)
3 labels 0.316 (k=3) 0.306 (k=2) 0.296 (k=1) 0.296 (k=1)
4 labels 0.406 (k=4) 0.412 (k=1) 0.303 (k=6) 0.303 (k=4)
5 labels 0.507 (k=3) 0.512 (k=4) 0.535 (k=5) 0.532 (k=1)
Table 7. Error rates of predicting labels using embedded kkNN
Dataset Epinions MovieLens
Neighborhood 𝒩0(X),+(𝔢)\mathcal{N}_{\mathcal{E}_{0}(X),+\infty}(\mathfrak{e}) 𝒩0(X),ϵ(𝔢)\mathcal{N}_{\mathcal{E}_{0}(X),\epsilon}(\mathfrak{e}) 𝒩0(X),+(𝔢)\mathcal{N}_{\mathcal{E}_{0}(X),+\infty}(\mathfrak{e}) 𝒩0(X),ϵ(𝔢)\mathcal{N}_{\mathcal{E}_{0}(X),\epsilon}(\mathfrak{e})
2 labels 0.182 (k=2) 0.176 (k=2) 0.252 (k=19) 0.249 (k=13)
3 labels 0.323 (k=2) 0.319 (k=5) 0.295 (k=10) 0.296 (k=3)
4 labels 0.406 (k=10) 0.412 (k=7) 0.304 (k=15) 0.303 (k=10)
5 labels 0.511 (k=5) 0.520 (k=2) 0.519 (k=15) 0.527 (k=1)

References

  • [1] S. Agarwal, K. Branson, and S. Belongie, Higher order learning with graphs. In ICML, pages 17–24, 2006.
  • [2] M. R. Boutell, J. Luo, X. Shen, and C. M. Brown, Learning multi-label scene classification. Pattern Recognition, 37(9):1757–1771, 2004.
  • [3] G. Chen, J. Zhang, F. Wang, C. Zhang, and Y. Gao, Efficient multi- label classification with hypergraph regularization, Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, vol. 0, pp. 1658–1665, 2009.
  • [4] A. Elisseeff and J. Weston, A kernel method for multi-labelled classification, In Advances in Neural Information Processing Systems 14, pages 681–687, 2001.
  • [5] F. Kang, R. Jin, and R. Sukthankar, Correlated label propagation with application to multi-label learning. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 1719–1726, 2006.
  • [6] H. Kazawa, T. Izumitani, H. Taira, and E. Maeda, Maximal margin labeling for multi-topic text categorization. In Advances in Neural Information Processing Systems 17, pages 649–656. 2005.
  • [7] Srijan Kumar, Francesca Spezzano, V. S Subrahmanian, and Christos Faloutsos, Edge Weight Prediction in Weighted Signed Networks. 2016 IEEE 16th International Conference on Data Mining (ICDM) (2016): 221–230.
  • [8] Dong Li, Zhiming Xu, Sheng Li, and Xin Sun, Link prediction in social networks based on hypergraph, Proceedings of the 22nd International Conference on World Wide Web, pages 41–42, 2013.
  • [9] P. Massa and P. Avesani, Trust-aware Recommender Systems. Proceedings of the 2007 ACM Conference on Recommender Systems (2007): 17–24.
  • [10] F. Maxwell Harper and Joseph A. Konstan, The MovieLens Datasets: History and Context, ACM Transactions on Interactive Intelligent Systems, 5(4): 19:1–19:19, 2015.
  • [11] J. Silva and R. Willett, Hypergraph-based anomaly detection of high-dimensional co-occurrences, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, pp. 563–569, 2009.
  • [12] Z. Tian, T. Hwang, and R. Kuang, A hypergraph- based learning algorithm for classifying gene expression and arraycgh data with prior knowledge, Bioinformatics, Volume 25, Issue 21, 1 November 2009, Pages 2831–2838.
  • [13] A. Torralba, K. P. Murphy, and W. T. Freeman, Sharing visual features for multiclass and multiview object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(5):854–869, 2007.
  • [14] N. Ueda and K. Saito, Parametric mixture models for multi-labeled text. In Advances in Neural Information Processing Systems 16, pages 721–728. 2003.
  • [15] M.-L. Zhang and Z.-H. Zhou. Multilabel neural networks with applications to functional genomics and text categorization. IEEE Transactions on Knowledge and Data Engineering, 18(10):1338–1351, 2006.
  • [16] Dengyong Zhou, Jiayuan Huang, and Bernhard Schölkopf, Learning with hypergraphs: Clustering, classification, and embedding, In NIPS, 2006.
  • [17] S. Yu, K. Yu, V. Tresp, and H.-P. Kriegel, Multi-output regularized feature projection, IEEE Transactions on Knowledge and Data Engineering, 18(12):1600–1613, 2006.