Dual-Geometric Space Embedding Model for Two-View Knowledge Graphs
Abstract.
Two-view knowledge graphs (KGs) jointly represent two components: an ontology view for abstract and commonsense concepts, and an instance view for specific entities that are instantiated from ontological concepts. As such, these KGs contain heterogeneous structures that are hierarchical, from the ontology-view, and cyclical, from the instance-view. Despite these various structures in KGs, most recent works on embedding KGs assume that the entire KG belongs to only one of the two views but not both simultaneously. For works that seek to put both views of the KG together, the instance and ontology views are assumed to belong to the same geometric space, such as all nodes embedded in the same Euclidean space or non-Euclidean product space, an assumption no longer reasonable for two-view KGs where different portions of the graph exhibit different structures. To address this issue, we define and construct a dual-geometric space embedding model (DGS) that models two-view KGs using a complex non-Euclidean geometric space, by embedding different portions of the KG in different geometric spaces. DGS utilizes the spherical space, hyperbolic space, and their intersecting space in a unified framework for learning embeddings. Furthermore, for the spherical space, we propose novel closed spherical space operators that directly operate in the spherical space without the need for mapping to an approximate tangent space. Experiments on public datasets show that DGS significantly outperforms previous state-of-the-art baseline models on KG completion tasks, demonstrating its ability to better model heterogeneous structures in KGs.
1. Introduction
Knowledge graphs (KGs) are essential data structures that have been shown to improve several semantic applications, including semantic search (Berant and Liang, 2014), question answering (Huang et al., 2019a; Iyer et al., 2022), and recommender systems (Wang et al., 2019). Two-view knowledge graphs (Huang et al., 2019b) consist of both (1) an instance-view component, containing entity-entity relations, and (2) an ontology-view component, containing concept-concept and entity-concept relations. In addition, there are nodes involved in both entity-concept relations and entity-entity relations, which we refer to as bridge nodes, e.g., nodes 5 and 6 in Figure 1(a). Bridge nodes provide the connections to bridge both KG views. The instance-view component contains data that form cyclical structures, such as nodes 8-10 in Figure 1(a). The ontology-view component contains data that form hierarchical structures. As such, the two-view KG contains intrinsic heterogeneous structures.

A key challenge with KGs is that they are often highly incomplete, making KG completion an important area of investigation. In recent years, several research works have focused on improving knowledge graph embeddings (KGE) to address this task (Bordes et al., 2013; Yang et al., 2015). The idea behind embedding methods is to map entities and relations to a latent low-dimensional vector space while preserving the semantics and inherent structures in the KG. However, most embedding methods, including graph neural network (GNN)-based (Chami et al., 2020; Sun et al., 2020; Chami et al., 2019; Iyer et al., 2021) and non-GNN-based methods (Kolyvakis et al., 2019; Trouillon et al., 2016), are limited to operating on KGs belonging to either the instance-view or the ontology-view but not both. Specifically, they either model entities and relations in the zero-curvature Euclidean space (Sun et al., 2019), omitting both cyclic and hierarchical structures, or in the hyperbolic space (Bai et al., 2021; Zhang et al., 2020), omitting cyclic structures and treating relations as hierarchical.
For works that seek to put both views of the KG together, the instance and ontology views are assumed to belong to the same space, an assumption no longer reasonable for two-view KGs. Specifically, all KG relations are modeled using the Euclidean space (Hao et al., 2019), or are all modeled using a product space, being the Riemannian product manifold of the Euclidean, spherical and hyperbolic spaces (Wang et al., 2021). However, our goal in modeling two-view KGs is to use a unified framework to embed hierarchical structures of the KG in the hyperbolic space, cyclic structures of the KG in the spherical space, and nodes involved in both structures in a unified intersection space.
To address the above challenges, in this paper, we present the Dual-Geometric Space embedding model (DGS) for two-view KGs. To summarize, our work makes the following contributions:
• We formulate the problem of modeling a two-view KG in complex non-Euclidean geometric space for KG completion. To our knowledge, we are the first to model two-view KGs in distinct non-Euclidean spaces using a unifying framework, i.e., different views belong to different geometric spaces.
• We propose to model instance-view entities in the spherical space for solely cyclical relations, ontology-view concepts in the hyperbolic space for solely hierarchical relations, and bridge entities involved in both cyclic and hierarchical structures in a specially designed intersection bridge space.
• To the best of our knowledge, we are also the first to design closed spherical space operations that operate directly in the spherical space without mapping to an external approximation space, e.g., the tangent Euclidean space.
• We investigate seven variant and ablation models of DGS and evaluate them on two KG completion tasks. Extensive experiments demonstrate the effectiveness of DGS in populating knowledge in two-view KGs. Further, our model significantly outperforms its single non-Euclidean and Euclidean geometric space counterparts, including the product space, and existing state-of-the-art graph neural network (GNN) embedding methods.
2. Preliminary and Related Work
Here, we formalize two-view KGs, which jointly model the entities and concepts in the instance and ontology views, and discuss the primary state-of-the-art models for two-view KGs along with their limitations.
2.1. Problem Formulation
A two-view KG consists of nodes and edges, where each node denotes either an entity or a concept, and each edge denotes a relation drawn from one of three disjoint sets: entity-entity relations, concept-concept relations, and entity-concept relations. The two-view KG consists of an instance-view component containing entities with entity-entity relations, and an ontology-view component containing concepts and bridge entities with concept-concept and entity-concept relations. We denote a subset of entities to be bridge entities, which communicate between the instance and ontology views; these bridge entities are associated with both entity-entity and entity-concept relations. Entity-concept relations are similar to concept-concept relations such as "is_a". The entity set and concept set, as well as the three relation sets, are pairwise disjoint. Each entity and concept has a representation in the Cartesian coordinate system. The instance view can therefore be considered a set of triples of the form (head entity, relation, tail entity), which may encode any real-world instance associations and are often cyclical in structure. Likewise, the ontology view can be represented by a set of triples of the form (head concept, relation, tail concept), which contain hierarchical associations such as is_a. We use Figure 1(a) to illustrate this formulation. For example, (Soil, composed_of, Mineral) and (Fungus, is_a, Eukaryote) are instance-view and ontology-view triples respectively, and the nodes Bacteria-Soil-Mineral form an example of a cycle between non-bridge entities.
A bridge entity may be involved in two types of triples: entity-concept triples for (hierarchical) entity-concept relations, and entity-entity triples for (cyclical) entity-entity relations. For example, Rabbit is a bridge entity in Figure 1(a): the triple (Rabbit, emit, Carbon Dioxide) represents an entity-entity relation, and the triple (Rabbit, is_a, Herbivore) represents an entity-concept relation.
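To make the formulation concrete, the following minimal sketch (Python, with hypothetical variable names) encodes the Figure 1(a) examples as the three disjoint triple sets and recovers the bridge entities as those that appear in both entity-concept and entity-entity triples.

```python
# Instance view: entity-entity triples (often cyclical).
instance_triples = [
    ("Soil", "composed_of", "Mineral"),
    ("Rabbit", "emit", "Carbon Dioxide"),
]

# Ontology view: concept-concept triples (hierarchical).
ontology_triples = [
    ("Fungus", "is_a", "Eukaryote"),
]

# Cross-view: entity-concept triples linking the two views.
cross_view_triples = [
    ("Rabbit", "is_a", "Herbivore"),
]

# Bridge entities participate in both entity-entity and entity-concept triples.
bridge_entities = {h for h, _, _ in cross_view_triples} & {
    e for h, _, t in instance_triples for e in (h, t)
}
print(bridge_entities)  # {'Rabbit'}
```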
The objective of our research is to learn KG embeddings of nodes and relations in the KG, such that we seamlessly unify multiple curvature geometric spaces to better capture the contextual information and heterogeneous structures in the KGs. We evaluate the quality of the learned embeddings on the KG tasks of triple completion and entity typing, described in Section 4.
2.2. Non-Euclidean Geometric Spaces
In this section, we describe the properties of non-Euclidean geometric spaces, which are curved spaces unlike the zero-curvature Euclidean space; the textbook (Petersen et al., 2006) provides more details. The Euclidean ($\mathbb{E}$), spherical ($\mathbb{S}$), and hyperbolic ($\mathbb{H}$) spaces are Riemannian manifolds, such that each point $\mathbf{x}$ on a manifold $\mathcal{M}$ has a corresponding tangent space $\mathcal{T}_{\mathbf{x}}\mathcal{M}$ that approximates $\mathcal{M}$ around $\mathbf{x}$. Further, each Riemannian manifold is associated with a Riemannian metric that defines the geodesic distance between two points on the manifold and the curvature of the space. The spherical space has positive curvature, suitable for capturing cyclical structures (Zhang et al., 2021), while the hyperbolic space has negative curvature, suitable for capturing hierarchical structures (Nickel and Kiela, 2017). Widely used models of the hyperbolic space include the Poincaré ball model (Cannon et al., 1997), the Lorentz model (Beltrami, 1897), and the Klein model (Beltrami, 1897). As these three models are isometric and isomorphic to one another, WLOG we utilize the Poincaré ball model in our work.
Non-Euclidean Space Optimization
DGS utilizes Riemannian optimization for updating entity and relational embeddings, because Euclidean optimization methods such as SGD compute update directions from the Euclidean gradient in zero-curvature space, which does not align with parameters in our model that must be updated in positive- or negative-curvature spaces. For parameter learning, DGS uses RSGD (Bonnabel, 2013), with the update $\theta_{t+1} = R_{\theta_t}\!\left(-\eta_t \nabla_{R}\mathcal{L}(\theta_t)\right)$, where $\nabla_{R}\mathcal{L}(\theta_t)$ denotes the Riemannian gradient of the loss function $\mathcal{L}$ in the tangent Euclidean space of $\theta_t$, $R_{\theta_t}$ denotes retraction from the tangent space onto the non-Euclidean manifold at $\theta_t$, and $\eta_t$ denotes the learning rate at time $t$.
The retraction operator involves mapping between spaces. For non-Euclidean spaces, retraction is generally performed between the non-Euclidean space and its approximating tangent Euclidean space using logarithmic and exponential mapping functions. For the hyperbolic space of curvature $-c$ ($c > 0$), the logarithmic map $\log^{c}_{\mathbf{x}}$ at center $\mathbf{x}$ maps from the hyperbolic space to the Euclidean tangent space, and the exponential map $\exp^{c}_{\mathbf{x}}$ at center $\mathbf{x}$ maps from the Euclidean tangent space back to the hyperbolic space:
(1) $\log^{c}_{\mathbf{x}}(\mathbf{y}) = \frac{2}{\sqrt{c}\,\lambda^{c}_{\mathbf{x}}}\,\operatorname{artanh}\!\left(\sqrt{c}\,\lVert -\mathbf{x}\oplus_{c}\mathbf{y}\rVert\right)\frac{-\mathbf{x}\oplus_{c}\mathbf{y}}{\lVert -\mathbf{x}\oplus_{c}\mathbf{y}\rVert}$
(2) $\exp^{c}_{\mathbf{x}}(\mathbf{v}) = \mathbf{x}\oplus_{c}\left(\tanh\!\left(\frac{\sqrt{c}\,\lambda^{c}_{\mathbf{x}}\,\lVert\mathbf{v}\rVert}{2}\right)\frac{\mathbf{v}}{\sqrt{c}\,\lVert\mathbf{v}\rVert}\right)$
where $\lambda^{c}_{\mathbf{x}} = 2/(1 - c\lVert\mathbf{x}\rVert^{2})$ is the conformal factor and $\oplus_{c}$ denotes Möbius addition.
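For illustration, the sketch below implements origin-centered versions of these maps and a retraction-style RSGD step on the unit-curvature Poincaré ball. It follows the standard formulations (Bonnabel, 2013; Nickel and Kiela, 2017) rather than DGS's exact implementation, and all function names are ours.

```python
import numpy as np

def mobius_add(x, y):
    """Mobius addition on the unit-curvature Poincare ball."""
    xy = np.dot(x, y)
    x2, y2 = np.dot(x, x), np.dot(y, y)
    return ((1 + 2 * xy + y2) * x + (1 - x2) * y) / (1 + 2 * xy + x2 * y2)

def exp_map(x, v, eps=1e-9):
    """Exponential map at x: tangent vector v -> point on the ball."""
    lam = 2.0 / (1 - np.dot(x, x))
    n = np.linalg.norm(v) + eps
    return mobius_add(x, np.tanh(lam * n / 2) * v / n)

def log_map(x, y, eps=1e-9):
    """Logarithmic map at x: point y on the ball -> tangent vector at x."""
    lam = 2.0 / (1 - np.dot(x, x))
    w = mobius_add(-x, y)
    n = np.linalg.norm(w) + eps
    return (2.0 / lam) * np.arctanh(min(n, 1 - 1e-7)) * w / n

def rsgd_step(x, euclidean_grad, lr):
    """One RSGD update: rescale the Euclidean gradient by the inverse metric
    of the Poincare ball, then retract along the exponential map."""
    riem_grad = ((1 - np.dot(x, x)) ** 2 / 4.0) * euclidean_grad
    return exp_map(x, -lr * riem_grad)
```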
2.3. Two-View KG Models
In this section, we describe the models that are utilized for two-view KGs, which consider the problem setting of modeling ontological and instance views. To address the challenges of these models, we propose DGS in Section 3.
JOIE
The JOIE (Hao et al., 2019) model is an embedding model that considers the problem of two-view KGs. However, JOIE models all triples in the same zero-curvature Euclidean space, omitting the hierarchical and cyclical structures of the KG. Further, there is no special consideration of representation for bridge nodes.
Models leveraging product of spaces
Models including M2GNN (Wang et al., 2021), which extends the product-space embeddings of (Gu et al., 2018) from social networks to the domain of KGs, and the self-supervised mixed-curvature GNN of (Sun et al., 2022), help to address the limitation of JOIE by utilizing the hyperbolic and spherical spaces for representing triples. However, they treat all triples in the KG as belonging to the same non-Euclidean product space, formed by a Riemannian product of the Euclidean, spherical, and hyperbolic spaces. Thus, the embedding space of hierarchical triples is not distinguished from the embedding space of cyclic triples, and there is again no special consideration of the representation for bridge nodes.
3. DGS Architecture
This section describes our proposed dual-geometric space embedding model (DGS), which jointly embeds entities and concepts in a knowledge graph. DGS embeds different link types of the two-view KG in different non-Euclidean manifold spaces to capture the inherent heterogeneous structures of the KG, as described below. We also design general procedures, which are detailed in the Supplements, for efficiently converting arbitrary-dimensional embeddings between the polar and Cartesian coordinate representations, which are utilized by sub-models in DGS. Source code is available at https://github.com/roshnigiyer/dgs.
3.1. Modeling
The key questions in the modeling process are how to: (1) design or adopt an appropriate embedding space for nodes, (2) define the representation of entities/concepts and relations in that embedding space, and (3) define the KG embedding model and loss functions of DGS. Our framework enables entities and concepts to be influenced by bridge nodes, and simultaneously enables bridge nodes to be informed by entities and concepts. In this way, DGS provides a unified framework to jointly model the two views of a KG.
DGS models nodes in the instance view as points in a spherical space of fixed norm, and nodes in the ontology view as points in a hyperbolic space with a learnable norm per concept. The bridge nodes lie in the intersection of the two, a submanifold intersection space shown as the dotted circle in Figure 1(b); it shares the same fixed norm as the spherical space. For modeling compatibility, the bridge space has one degree of freedom fewer than the spherical and hyperbolic spaces, because it is a submanifold intersection of both, and its norm equals the fixed spherical norm because that is the norm at which the two spaces intersect. The concept-specific norms are learnable so that the hierarchy in the KG can be captured by the embedding: in practice, hierarchical root concepts move toward the center of the hyperbolic space, i.e., toward norm 0, as shown in Section 4.
Parameter optimization, detailed in Section 3.2, is performed using RSGD, described in Section 2.2, on hinge loss functions that utilize the non-Euclidean geodesic distances of the spherical and hyperbolic spaces respectively. We construct each hinge loss function such that positive triples are scored higher than negative triples by at least a positive margin hyperparameter.
3.1.1. Modeling Instance View KG in Spherical Space
Representation of entities.
We propose to embed the entities from the instance view on the surface of the spherical ball of the spherical space, in order to better capture the cyclic structures present in this KG view. Entities are represented as points on the surface of the spherical ball, as shown in Figure 1(b); formally, the embedding space is the set of points whose Euclidean norm equals the fixed spherical radius.
For an entity, we propose to model a relation applied to that entity as a rotation operation, and therefore represent the relation as a vector of angles. Below, we show our proposed vector transformation procedure that models the relation as a rotation of the head entity, and prove that the operation is closed in the spherical space. This eliminates the need to project representations to the tangent Euclidean space in order to perform standard transformations. To the best of our knowledge, we are the first to propose a closed vector transformation procedure that operates directly in the spherical space.
Spherical Space Vector Transformation Procedure.
This section describes our proposed vector transformation procedure for the spherical space, which directly performs operations using properties of the spherical space, such as its positive curvature, and outlines the proof of the closedness property of the procedure. For computational efficiency, we also extend the vector transformation procedure to a batch version, detailed in the Supplements. The procedure takes as input an entity embedding and a relation operator, and uses the relation to transform the entity embedding (a runnable sketch follows the steps below):
(1) Given the head entity's Cartesian embedding, convert it to the corresponding representation in polar coordinates, whose radial coordinate equals the fixed spherical norm. Refer to Section A.1 for details on the conversion procedure.
(2) Apply the rotation: add each entry of the relation's angle vector to the corresponding angular coordinate of the head entity's polar representation, wrapping the result into [0, 2π); the radial coordinate is left unchanged.
(3) Convert the rotated polar representation back to Cartesian coordinates. Refer to Section A.2 for details on the conversion procedure.
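A minimal sketch of the procedure is shown below. It uses the standard hyperspherical parameterization as a stand-in for the MCP/MPC conversions of Appendix A (an assumption on our part); the key property it demonstrates is that the radius is never modified, so the rotated point provably remains on the sphere.

```python
import numpy as np

def cart_to_polar(x):
    """Cartesian -> hyperspherical coordinates: (radius, d-1 angles)."""
    r = np.linalg.norm(x)
    angles = np.zeros(len(x) - 1)
    for i in range(len(x) - 2):
        angles[i] = np.arccos(np.clip(x[i] / (np.linalg.norm(x[i:]) + 1e-12), -1, 1))
    angles[-1] = np.arctan2(x[-1], x[-2])
    return r, angles

def polar_to_cart(r, angles):
    """Hyperspherical -> Cartesian coordinates."""
    d = len(angles) + 1
    x = np.full(d, float(r))
    for i in range(d - 1):
        x[i] *= np.cos(angles[i])
        x[i + 1:] *= np.sin(angles[i])
    return x

def spherical_rotate(head, rel_angles):
    """Rotate a head-entity embedding by a vector of per-dimension relation
    angles; the radius is untouched, so the output stays on the same sphere."""
    r, angles = cart_to_polar(head)
    return polar_to_cart(r, np.mod(angles + rel_angles, 2 * np.pi))

# Closedness check on a random point of a radius-2 sphere.
h = np.random.randn(5)
h = 2.0 * h / np.linalg.norm(h)
t = spherical_rotate(h, 0.3 * np.ones(4))
assert np.isclose(np.linalg.norm(t), 2.0)
```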
Theorem.
The vector transformation procedure is closed in the spherical space.
Proof.
The proof follows by examining the three steps of the transformation procedure, noting that the Cartesian and polar coordinate representations denote the same point in the embedding space under the two isomorphic coordinate systems.
(1) The Cartesian coordinate representation is equivalent to the polar coordinate representation of the point under the MCP procedure, detailed in the Supplements. Further, the radius of the polar representation lies in the spherical norm space.
(2) The rotation adds the relation angles to the angular coordinates modulo 2π and leaves the radial coordinate unchanged, so the result remains a valid polar representation in the spherical space. Thus, both the angular and radial constraints are preserved.
(3) The polar coordinate representation is equivalent to the Cartesian coordinate representation of the point under the MPC procedure, detailed in the Supplements.
∎
Instance View Loss Function.
The instance-view model uses a margin-based hinge loss over all triples in the instance view, in which positive triples are encouraged to score higher than negative triples by at least a positive margin hyperparameter. Specifically, the instance loss measures the distance between the predicted tail entity (the relation-rotated head) and the ground-truth tail.
(8) |
(9) |
(10) |
We calculate the spherical geodesic distance (Meng et al., 2019) between points $\mathbf{x}$ and $\mathbf{y}$ on the manifold as follows, where $r$ is the fixed spherical norm:
(11) $d_{\mathbb{S}}(\mathbf{x}, \mathbf{y}) = r \cdot \arccos\!\left(\frac{\langle \mathbf{x}, \mathbf{y}\rangle}{r^{2}}\right)$
The instance-view triple set includes links between all entities, both non-bridge and bridge entities. In this way, the representation of non-bridge entities is influenced by the bridge entities.
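The following sketch illustrates a distance-based score and margin hinge loss consistent with this description; it is a generic formulation rather than a reproduction of Equations 8-10, and the argument r stands for the fixed spherical norm.

```python
import numpy as np

def sphere_dist(x, y, r=1.0):
    """Geodesic distance between two points on the radius-r sphere."""
    return r * np.arccos(np.clip(np.dot(x, y) / r**2, -1.0, 1.0))

def instance_score(rotated_head, tail, r=1.0):
    """Plausibility score of an instance-view triple: negative geodesic
    distance between the relation-rotated head and the candidate tail."""
    return -sphere_dist(rotated_head, tail, r)

def hinge_loss(pos_scores, neg_scores, margin=0.5):
    """Margin hinge loss: positives should outscore negatives by `margin`."""
    return float(np.sum(np.maximum(0.0, margin - pos_scores + neg_scores)))
```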
3.1.2. Modeling Ontological View KG in Hyperbolic Space
Representation of concepts.
We propose to embed concepts from the ontology view inside the Poincaré disk of the hyperbolic space, in order to better capture the hierarchical structures present in this KG view. Concepts are represented as points inside the Poincaré disk, as shown in Figure 1(b); formally, the embedding space is the open ball of points whose Euclidean norm is less than 1. We assume the center of the disk is aligned with the center of the sphere and, for convenience, set the last dimension of concept embeddings to 0.
For concepts in the hyperbolic space, any hyperbolic KG scoring model can be applied in principle, which we denote generically as
(12) $s_{\mathbb{H}}(\mathbf{c}_h, \mathbf{r}, \mathbf{c}_t)$
where $\mathbf{r}$ denotes a relation between two concepts.
We illustrate MuRP (Balažević et al., 2019) as an example scoring function, which uses the hyperbolic geodesic distance and relies on Möbius addition to model the relation. Here $\log_{\mathbf{0}}$ and $\exp_{\mathbf{0}}$ are defined in Section 2.2, $\mathbf{R}$ is a learnable diagonal relation matrix representing the stretch transformation of the relation with vector representation $\mathbf{r}$, and $b_h$, $b_t$ are scalar biases of the head and tail concepts:
(13) $\mathbf{c}_h^{(r)} = \exp_{\mathbf{0}}\!\big(\mathbf{R}\,\log_{\mathbf{0}}(\mathbf{c}_h)\big), \qquad \mathbf{c}_t^{(r)} = \mathbf{c}_t \oplus \mathbf{r}$
(14) $s_{\mathbb{H}}(\mathbf{c}_h, \mathbf{r}, \mathbf{c}_t) = -\,d_{\mathbb{H}}\!\big(\mathbf{c}_h^{(r)}, \mathbf{c}_t^{(r)}\big)^{2} + b_h + b_t$
We calculate the hyperbolic geodesic distance (Nickel and Kiela, 2017) between points $\mathbf{x}$ and $\mathbf{y}$ on the manifold as follows:
(15) $d_{\mathbb{H}}(\mathbf{x}, \mathbf{y}) = \operatorname{arcosh}\!\left(1 + 2\,\frac{\lVert\mathbf{x}-\mathbf{y}\rVert^{2}}{(1-\lVert\mathbf{x}\rVert^{2})(1-\lVert\mathbf{y}\rVert^{2})}\right)$
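As a concrete reference, the sketch below computes the Poincaré geodesic distance and a MuRP-style score using origin-centered exponential/logarithmic maps and Möbius addition; it assumes unit curvature and is a simplified stand-in for the exact implementation.

```python
import numpy as np

def poincare_dist(x, y):
    """Geodesic distance in the unit-curvature Poincare ball (Eq. 15)."""
    num = 2 * np.dot(x - y, x - y)
    den = (1 - np.dot(x, x)) * (1 - np.dot(y, y))
    return np.arccosh(1 + num / den)

def exp0(v):                      # exponential map at the origin
    n = np.linalg.norm(v) + 1e-12
    return np.tanh(n) * v / n

def log0(y):                      # logarithmic map at the origin
    n = np.linalg.norm(y) + 1e-12
    return np.arctanh(min(n, 1 - 1e-7)) * y / n

def mobius_add(x, y):             # Mobius addition, curvature -1
    xy, x2, y2 = np.dot(x, y), np.dot(x, x), np.dot(y, y)
    return ((1 + 2 * xy + y2) * x + (1 - x2) * y) / (1 + 2 * xy + x2 * y2)

def murp_score(c_h, c_t, R_diag, r_vec, b_h, b_t):
    """MuRP-style score: stretch the head by a diagonal relation matrix in
    the tangent space, Mobius-translate the tail by the relation vector,
    and score with the squared geodesic distance plus scalar biases."""
    h_r = exp0(R_diag * log0(c_h))
    t_r = mobius_add(c_t, r_vec)
    return -poincare_dist(h_r, t_r) ** 2 + b_h + b_t
```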
Ontological View Loss Function.
The ontology-view model uses a margin-based hinge loss over all links between concepts in the ontology view and all links between bridge nodes and concepts, in which positive triples are encouraged to score higher than negative triples by at least a positive margin hyperparameter:
(16) |
(17) |
(18) |
The ontology-view triple set includes triples composed of concepts and bridge entities. In this way, the representation of concepts is influenced by the bridge entities, as described in Section 3.1.3.
3.1.3. Modeling the Intersection of the Two Spaces.
Representation of bridge entities.
Bridge nodes are entity nodes that bridge the communication between the instance-view and ontology-view components. Bridge nodes are connected to concepts through entity-concept (ontology) links, but may also be connected to other entities through entity-entity (instance) links. As shown in Figure 1(a), nodes 5 and 6 are bridge nodes that are involved in both cyclic and hierarchical structures through their links to other entities as well as concepts. As such, we propose to embed bridge entities in the intersection space of the Poincaré disk and the surface of the spherical ball, in order to better capture the heterogeneous KG structures associated with these nodes. We refer to this intersection submanifold as the bridge space: the representation of bridge nodes is informed by both the spherical and hyperbolic spaces, and the bridge space has one degree of freedom fewer than either. The bridge space can therefore be derived as a sphere in general; formally, it is the set of points at the fixed spherical norm whose last dimension is 0. Operations on bridge nodes, such as geodesic distances and loss functions, take place in either the spherical space or the hyperbolic space, depending on the type of link involved.
To ensure compatibility with the connected concept nodes, we map each bridge entity to an embedding in the ontology space through a non-linear transformation function, averaged over all of the entity-concept relations it participates in. The logarithmic and exponential mapping functions used in this transformation are described in Section 2.2.
(19) |
(20) |
(21) |
(22) |
where both the weight matrix and the bias are specific to each entity-concept relation and are reserved for the ontology view.
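The sketch below illustrates one plausible form of this mapping, assuming a relation-specific linear transformation and non-linearity applied in the tangent space at the origin, followed by averaging; the parameter names W_r and b_r are hypothetical, and the exact form of Equations 19-22 is not reproduced here.

```python
import numpy as np

def log0(y):
    n = np.linalg.norm(y) + 1e-12
    return np.arctanh(min(n, 1 - 1e-7)) * y / n

def exp0(v):
    n = np.linalg.norm(v) + 1e-12
    return np.tanh(n) * v / n

def bridge_to_ontology(e, rel_params):
    """Map a bridge-entity embedding into the ontology (hyperbolic) space.
    rel_params holds one hypothetical (W_r, b_r) pair per entity-concept
    relation of this entity; the per-relation results are averaged."""
    mapped = []
    for W_r, b_r in rel_params:
        v = np.tanh(W_r @ log0(e) + b_r)   # non-linear transform in the tangent space
        mapped.append(exp0(v))             # project back onto the Poincare ball
    return np.mean(mapped, axis=0)
```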
Bridge Node Loss Function.
The bridge-node model uses a hinge loss that combines the entity's ontology-specific loss and instance-specific loss over all triples associated with the bridge nodes, with positive and negative triples drawn from both the instance and ontology views:
(23) |
(24) |
(25) |
The combined loss function above enables bridge nodes to learn from both of the intersecting spaces, the spherical and the hyperbolic.
3.2. Training
This section details the training framework of DGS, for representing two-view KGs, described in Section 3.1. We describe, for each epoch, the training of each node in the two-view KG which includes (1) the forward propagation step, (2) the loss function, and (3) the backward propagation step to optimize parameters. Algorithm 1 provides a summary of the framework.
Embedding Initialization.
We randomly initialize all entity and relation embeddings in polar coordinates, then convert the entity embeddings to their corresponding Cartesian representations. Each concept embedding is initialized as a randomly sampled position on the Poincaré disk. Refer to Section A in the Supplements for details about the conversion procedures, covering both the polar-to-Cartesian (MPC) and Cartesian-to-polar (MCP) coordinate conversions. For the instance-view model, we also choose a value for the fixed spherical norm, sampled from a uniform distribution and shared by all entities; for the ontology-view model, we choose a norm assigned uniformly at random per node. We fix the curvatures of the spherical and hyperbolic spaces to constant positive and negative values respectively, and leave the non-trivial problem of learning optimal curvatures as future work.
Training Procedure for Instance View KG
Parameter optimization is performed using Riemannian stochastic gradient descent (RSGD) for the spherical space, as follows, for the entity embedding and relational embedding updates respectively. To ensure that the updated entity embedding remains at the fixed spherical norm, we perform a rescaling operation that projects out-of-boundary embeddings back onto the surface of the spherical ball (a simplified sketch follows the update equations below).
(26) |
(27) |
(28) |
(29) |
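A simplified sketch of one instance-view update is shown below, using a projected gradient step as a stand-in for the full Riemannian update; the point it illustrates is the rescaling back onto the radius-r sphere and the modular wrap of the relation angles.

```python
import numpy as np

def rescale_to_sphere(x, r=1.0):
    """Project an out-of-boundary embedding back onto the radius-r sphere."""
    return r * x / (np.linalg.norm(x) + 1e-12)

def instance_view_step(entity, rel_angles, ent_grad, rel_grad, lr=1e-3, r=1.0):
    """One simplified instance-view update: gradient step, then re-enforce
    the spherical-norm constraint on the entity and wrap the relation
    angles back into [0, 2*pi)."""
    entity = rescale_to_sphere(entity - lr * ent_grad, r)
    rel_angles = np.mod(rel_angles - lr * rel_grad, 2 * np.pi)
    return entity, rel_angles
```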
Training procedure for concepts.
Parameter optimization is performed using RSGD for the hyperbolic space, as follows, for the concept embedding and relational embedding updates respectively, where the corresponding concept norm is also learned through RSGD by updating the concept embeddings. The diagonal relation matrix is likewise updated through RSGD, and the scalar biases are updated through stochastic gradient descent.
(30) |
(31) |
(32) |
(33) |
(34) |
After the epoch's update of concept embeddings, we once again reset the value of the last dimension to 0 to satisfy the original framework constraint of the Poincaré disk. We also enforce that the angular dimensions of relational embeddings lie in [0, 2π).
Training Procedure for Bridge Nodes
Parameter optimization is performed using RSGD for the bridge space, as follows, for the bridge entity embedding and relational embedding updates respectively. Ontology optimization and instance optimization are performed alternately in batches for each of the two embedding types, according to the type of link that the embedding is associated with (entity-concept or entity-entity). This enables the representation of bridge nodes to be informed by both views.
(35) |
(36) |
(37) |
(38) |
(39) |
After each epoch's instance optimization of the bridge entities, the value of the last dimension is reset to 0 and the embedding is rescaled back to the fixed spherical norm to ensure the intersection-space constraint of the bridge entity model. Angular dimensions are also enforced to lie in [0, 2π).
4. Experiments
In this section, we evaluate DGS on two KG tasks: the triple completion task on each of the instance and ontology views of the KG, and the entity typing task, which tests the quality of the learned bridge space in communicating between the two views. We also provide a case study on entity typing for different variants of DGS that embed the KG in other combinations of geometric spaces. Further, we provide a visualization of embeddings before and after the learning process, projected onto the 3-D geometric space of DGS.
4.1. Datasets
We utilize the datasets of YAGO26K-906 and DB111K-174 since they have the two-view KG setting unlike other datasets for KG embeddings that consider solely an instance-view (Toutanova and Chen, 2015) or ontology-view (Dettmers et al., 2018). YAGO26K-906 and DB111K-174 are prepared from (Hao et al., 2019), which are extracted from YAGO (Suchanek et al., 2007) and DBpedia (Lehmann et al., 2015) respectively. Refer to (Hao et al., 2019) for the detailed construction process. Table 1 provides dataset statistics and Table 2 provides data splits for both datasets on both KG tasks. It can be observed that the instance-view contains many more triples than the ontology-view and that DB111K-174 contains a larger proportion of entity-concept triples (10.35%) compared to YAGO26K-906 (2.43%).
Dataset | Nodes | | Relations | |
 | # Entities | # Concepts | # Entity-Entity Triples | # Concept-Concept Triples | # Entity-Concept Triples
YAGO26K-906 | 26,078 | 906 | 390,738 | 8,962 | 9,962
DB111K-174 | 111,762 | 174 | 863,643 | 763 | 99,748
YAGO26K-906
Task | Train | Validation | Test
Triple Completion | 332,128 / 7,618 / 8,691 | 19,536 / 448 / 485 | 39,074 / 896 / 1,019
Entity Typing | 211,346 / 4,876 / 5,379 | 23,549 / 543 / 598 | 156,311 / 3,592 / 3,985
DB111K-174
Task | Train | Validation | Test
Triple Completion | 734,096 / 648 / 84,864 | 43,182 / 38 / 5,018 | 86,365 / 77 / 10,131
Entity Typing | 466,538 / 462 / 53,863 | 51,828 / 46 / 5,985 | 345,504 / 337 / 39,900
Each cell lists the number of entity-entity / concept-concept / entity-concept triples in the split.
4.2. Models
4.2.1. Baselines
We compare DGS to state-of-the-art neural network embedding models, which include Euclidean, non-Euclidean, and product space KGE models, as well as GNN-based models for KG completion and entity typing.
• TransE (Bordes et al., 2013), one of the first KGE models, which simply captures the relationship between entities as a translation.
• DistMult (Yang et al., 2015), a matrix factorization KGE model, modeling the relationship between entities via multiplication.
• ComplEx (Trouillon et al., 2016), a KGE model that extends DistMult into the complex number field.
• RotatE (Sun et al., 2019), a recent KGE model, based on the rotation assumption where a relation is a rotation from the subject to the object in the complex vector space.
• JOIE (Hao et al., 2019), a two-view KG embedding model that embeds both views in the Euclidean space, described in Section 2.3.
• HyperKG (Kolyvakis et al., 2019), a KGE model extending translational KGE methods to the Poincaré-ball model of hyperbolic geometry.
• HAKE (Zhang et al., 2020), which extends RotatE by having relations combine modulus scaling with rotation.
• ConE (Bai et al., 2021), a KGE model embedding entities into hyperbolic cones and relations as transformations between cones.
• RefH/RotH/AttH (Chami et al., 2020), which are hyperbolic KGE models that combine hyperbolic spaces using hyperbolic attention, where RefH and RotH are variants of AttH using only reflections and rotations respectively.
• HGCN (Chami et al., 2019), a hyperbolic GCN model utilizing Riemannian geometry and the hyperboloid model.
• HyperKA (Sun et al., 2020), which extends GCNs from the Euclidean to the hyperbolic space using the Poincaré ball model.
4.2.2. DGS Variants
We describe variant models of DGS below.
• DGS-RO-FC, which is DGS with the Riemannian operator (RO) used for vector transformation instead of our proposed closed spherical space vector transformation procedure of Section 3.1.1, and with a fixed center (FC) of the spherical ball at 0, the same center as the Poincaré disk. For the single geometric spaces of the non-bridge nodes and concepts, the Riemannian operator is performed as a retraction operation to the tangent Euclidean spaces. However, we extend the Riemannian operator when performing retraction for the intersection ring of the bridge space, which is formed by intersecting the spherical ball's surface and the Poincaré disk. This is described in the Supplements.
• DGS-SO-FC, which is DGS with the proposed closed spherical space operator (SO), and with a FC of the spherical ball at 0, the same center as the Poincaré disk.
• DGS (ours), which is DGS with the proposed closed SO and with a learnable center (LC) of the spherical ball; for simplicity of constructing the intersection space for bridge nodes, the learnable offset is restricted to the last dimension, allowing only a vertical shift. Note that we do not also need to make the center of the Poincaré disk learnable, as this shift is already introduced by making one of the centers learnable. For learning the spherical center in the model, we follow the same training procedure as Section 3.2, but for the non-bridge entities and bridge entities we perform the operations by temporarily shifting the embeddings so that the center is at 0, then shifting back to the learned center after the updates are performed.
4.2.3. DGS Ablation Models
We study different ablation models of DGS, formed by using different combinations of manifold spaces for each link type of the two-view KG, including the spherical space, the hyperbolic space using the Poincaré model, and the Euclidean space. Four such combinations are reported as the ablation variants in Tables 3 and 4. Since the bridge space is always the intersection of the two chosen spaces, we do not need to specify its geometric space separately.
4.3. Evaluation
Datasets | | YAGO26K-906 | | | | | | DB111K-174 | | | | |
Graphs | | Instance-view KG Completion | | | Ontology-view KG Completion | | | Instance-view KG Completion | | | Ontology-view KG Completion | |
Type | Model | MRR | H@1 | H@10 | MRR | H@1 | H@10 | MRR | H@1 | H@10 | MRR | H@1 | H@10
 | TransE | 0.187 | 13.73 | 35.05 | 0.189 | 14.72 | 24.36 | 0.318 | 22.70 | 48.12 | 0.539 | 47.90 | 61.84
 | DistMult | 0.288 | 24.06 | 31.24 | 0.156 | 14.32 | 16.54 | 0.280 | 27.24 | 29.70 | 0.501 | 45.52 | 64.73
non-GNN based | ComplEx | 0.291 | 24.85 | 37.28 | 0.180 | 14.83 | 22.97 | 0.321 | 27.39 | 46.63 | 0.549 | 47.80 | 62.23
KGE models | RotatE | 0.302 | 25.31 | 42.17 | 0.228 | 16.35 | 27.28 | 0.356 | 29.31 | 54.60 | 0.557 | 49.16 | 68.19
 | JOIE | 0.316 | 24.62 | 51.85 | 0.289 | 18.66 | 39.13 | 0.479 | 35.21 | 72.38 | 0.602 | 52.48 | 79.71
 | HyperKG | 0.215 | 18.35 | 36.02 | 0.174 | 14.50 | 23.26 | 0.302 | 23.31 | 46.72 | 0.542 | 47.59 | 62.11
 | HAKE | 0.293 | 23.04 | 40.19 | 0.301 | 19.27 | 41.09 | 0.391 | 31.10 | 60.46 | 0.638 | 55.69 | 81.07
 | ConE | 0.299 | 23.56 | 41.23 | 0.313 | 20.05 | 41.80 | 0.422 | 33.69 | 68.12 | 0.639 | 55.89 | 81.45
 | RefH | 0.282 | 23.19 | 40.52 | 0.298 | 19.70 | 41.26 | 0.407 | 30.06 | 66.93 | 0.622 | 55.35 | 81.09
 | RotH | 0.295 | 23.50 | 41.03 | 0.308 | 19.97 | 41.78 | 0.418 | 30.18 | 67.05 | 0.639 | 55.82 | 81.44
GNN-based models | AttH | 0.298 | 23.43 | 41.20 | 0.310 | 19.99 | 41.53 | 0.419 | 30.10 | 66.58 | 0.629 | 55.37 | 81.39
 | HGCN | 0.307 | 23.04 | 40.25 | 0.302 | 19.38 | 40.49 | 0.396 | 31.54 | 61.78 | 0.638 | 55.81 | 81.60
 | HyperKA | 0.320 | 26.71 | 52.09 | 0.305 | 18.83 | 40.28 | 0.486 | 35.79 | 72.33 | 0.613 | 53.36 | 80.59
 | M2GNN | 0.347 | 29.63 | 54.28 | 0.341 | 23.70 | 42.19 | 0.506 | 36.52 | 73.11 | 0.644 | 56.82 | 83.01
 | DGS () | 0.338 | 27.15 | 53.20 | 0.318 | 20.36 | 41.02 | 0.491 | 34.58 | 71.40 | 0.606 | 53.29 | 80.17
Ablation variants: | DGS () | 0.314 | 25.11 | 52.02 | 0.358 | 24.61 | 43.28 | 0.502 | 35.79 | 73.61 | 0.663 | 57.59 | 84.16
DGS () | DGS () | 0.327 | 25.32 | 52.89 | 0.343 | 23.95 | 41.62 | 0.498 | 35.11 | 72.37 | 0.640 | 56.17 | 82.68
 | DGS () | 0.322 | 24.91 | 52.36 | 0.297 | 19.43 | 40.61 | 0.484 | 33.29 | 73.54 | 0.619 | 53.72 | 80.51
 | DGS-RO-FC | 0.352 | 29.79 | 55.21 | 0.364 | 25.04 | 43.27 | 0.518 | 37.65 | 73.97 | 0.681 | 59.23 | 84.16
DGS variants | DGS-SO-FC | 0.364 | 30.15 | 55.93 | 0.369 | 25.81 | 44.18 | 0.536 | 38.29 | 74.28 | 0.687 | 59.26 | 84.82
 | DGS (ours) | 0.366 | 30.15 | 56.06 | 0.372 | 25.88 | 44.38 | 0.536 | 38.31 | 74.85 | 0.690 | 59.88 | 84.82
In this section, we detail our evaluation on the tasks of KG triple completion and entity typing. The goal of triple completion is to reconstruct missing relational facts in the KG. Specifically, we test predicting a missing target node from either the instance or the ontology view, i.e., queries of the form (head, relation, ?) and (?, relation, tail), such that each evaluated model is trained on the entire two-view KG. The goal of entity typing is to predict the concept that corresponds to a given entity, i.e., queries of the form (entity, is_a, ?).
Using plausibility scores to rank each test candidate set, for each task we report results on the evaluation metrics of mean reciprocal rank (MRR) and Hits@K (e.g., K = 1, 3, 10). Table 2 reports our data splits for each task. For triple completion, the plausibility score is chosen to be the embedding distance from the source node, transformed by the relation, to the candidate target node; for entity typing, it is the embedding distance from the entity's representation mapped into the concept space to the candidate concept. (Hao et al., 2019) provides more details on the evaluation procedure. For evaluation consistency on both tasks, model training hyperparameters are chosen from dimensionality {50, 100, 200, 300} for all triples, learning rate {5e-4, 1e-3, 1e-2, 1e-1}, and margins {0.5, 1}. Further, different batch sizes and numbers of epochs are used according to the type and size of the graphs.
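For reference, the sketch below shows how MRR and Hits@K are computed from the rank of each gold answer in its scored candidate list (these are the standard definitions, not code from the paper).

```python
import numpy as np

def ranking_metrics(ranks, ks=(1, 3, 10)):
    """Compute MRR and Hits@K from the 1-indexed rank of each gold answer
    within its scored candidate set."""
    ranks = np.asarray(ranks, dtype=float)
    metrics = {"MRR": float(np.mean(1.0 / ranks))}
    for k in ks:
        metrics[f"Hits@{k}"] = float(np.mean(ranks <= k))
    return metrics

# e.g. three test queries whose gold answers ranked 1st, 4th and 12th
print(ranking_metrics([1, 4, 12]))  # MRR ~0.44, Hits@1 ~0.33, Hits@10 ~0.67
```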
KG Triple Completion Results.
Results are reported in Table 3. DGS outperforms all of the baseline models on both datasets. DGS achieves average performance gains over all baselines of 32.36%, 27.59%, and 29.17% on the three reported metrics for instance-view completion across both YAGO26K-906 and DB111K-174, and average gains over all baselines of 23.43%, 28.41%, and 18.11% on the three reported metrics for ontology-view completion across both datasets.
It can be observed that in both the instance and ontology views on both datasets, the hyperbolic-based KGE models outperform their Euclidean and complex space counterparts. Further, hyperbolic KGE models perform better on the ontology view than the instance view, likely due to the prevalence of hierarchy in the ontology. It is also seen that using multiple geometric spaces is more effective than using a single geometric space. For GNN-based models, M2GNN, which uses a product space combining the Euclidean, spherical, and hyperbolic spaces, outperforms the models using only one of the spaces. Compared to M2GNN, the most competitive baseline model in most cases, DGS shows significant average improvements of 3.25%, 5.15%, and 2.73% on the three reported metrics across both YAGO26K-906 and DB111K-174. This is likely because hierarchy in the KG is better modeled purely in the hyperbolic space than with influence from the spherical or Euclidean space, and cyclic links are better modeled purely in the spherical space than with influence from the hyperbolic or Euclidean space. For bridge entities involved in both hierarchical and cyclic links, we see that it is beneficial to model them in an intersection space in order to capture both of these properties.
Datasets | YAGO26K-906 | DB111K-174 | ||||
Metrics | MRR | Acc. | Hit@3 | MRR | Acc. | Hit@3 |
TransE | 0.144 | 7.32 | 35.26 | 0.503 | 43.67 | 60.78 |
DistMult | 0.411 | 36.07 | 55.32 | 0.551 | 49.83 | 68.01 |
ComplEx | 0.519 | 43.08 | 59.28 | 0.583 | 55.07 | 70.17 |
RotatE | 0.673 | 58.24 | 73.95 | 0.721 | 61.48 | 75.67 |
JOIE | 0.899 | 85.72 | 96.02 | 0.859 | 75.58 | 96.02 |
HyperKG | 0.627 | 55.39 | 65.64 | 0.736 | 61.20 | 74.29 |
HAKE | 0.905 | 87.42 | 96.37 | 0.866 | 76.04 | 96.22 |
ConE | 0.912 | 87.51 | 96.39 | 0.869 | 76.85 | 96.38 |
RefH | 0.907 | 87.49 | 96.37 | 0.867 | 76.26 | 96.31 |
RotH | 0.909 | 87.49 | 96.39 | 0.868 | 76.55 | 96.36 |
AttH | 0.910 | 87.50 | 96.38 | 0.868 | 76.50 | 96.33 |
HGCN | 0.905 | 87.44 | 96.37 | 0.867 | 76.11 | 96.27 |
HyperKA | 0.918 | 87.76 | 96.45 | 0.871 | 76.65 | 96.50 |
M2GNN | 0.922 | 88.16 | 97.01 | 0.880 | 77.58 | 96.94 |
DGS () | 0.919 | 87.94 | 96.81 | 0.875 | 77.03 | 96.78 |
DGS () | 0.924 | 88.12 | 97.01 | 0.887 | 77.53 | 96.95 |
DGS () | 0.922 | 88.07 | 96.91 | 0.883 | 77.37 | 96.58 |
DGS () | 0.917 | 87.68 | 96.41 | 0.866 | 76.33 | 96.48 |
DGS-RO-FC | 0.930 | 88.47 | 97.32 | 0.891 | 77.81 | 97.04 |
DGS-SO-FC | 0.938 | 89.02 | 98.11 | 0.895 | 77.92 | 97.40 |
DGS (ours) | 0.939 | 89.07 | 98.18 | 0.895 | 77.94 | 97.47 |
Entity Typing Results.
Results are reported in Table 4. Consistent with the KG completion results, the DGS variants outperform the baseline models on both datasets for entity typing. DGS achieves average performance gains over all baselines of 24.49%, 26.39%, and 18.78% on the three reported metrics for YAGO26K-906, and of 14.86%, 13.74%, and 12.15% for DB111K-174. Compared to the most competitive GNN-based model, M2GNN, DGS has nearly a 2% gain on each of YAGO26K-906 and DB111K-174.
Ablation Studies.
Tables 3 and 4 report our results for the ablation studies, listed under the category Ablation variants. It can be seen that using the spherical space for entity-entity links learns a better representation than its hyperbolic or Euclidean space counterparts, indicating the prevalence of cyclic relations between entities. Utilizing the hyperbolic space for concept-concept and entity-concept links learns a better representation than its spherical or Euclidean space counterparts, indicating the prevalence of hierarchical relations between concepts and between entities and concepts. Interestingly, a fully hyperbolic-space model in general achieves better performance than its fully spherical counterpart, indicating that YAGO26K-906 and DB111K-174 may contain relatively more hierarchical links than cyclic links. Further, learning a better representation in one view benefits the other view more if there are more bridge links. For example, since DB111K-174 contains significantly more entity-concept links than YAGO26K-906, as shown in Table 1, the corresponding ablation variant performs better against baselines on the instance view of DB111K-174 than on the instance view of YAGO26K-906.
Visualizations.
We project the learned embeddings into 3-D space, and plot seven nodes with their relations in Figure 2. Concepts that are higher in the ontological hierarchy such as animal and person are closer to the origin compared to lower-level concepts such as leader, and entities tend to be farther away from the origin.

5. Conclusions
We are among the first to explore utilizing different embedding spaces for different views of a knowledge graph. Entities in the instance view often have cyclical connections and are therefore modeled via spherical embeddings, whereas concepts and bridge entities in the ontology view tend to form a hierarchy and are therefore modeled in hyperbolic space. We propose the notion of a bridge space to seamlessly model the intersection of the two spaces in which bridge entities reside. In addition, we propose a set of closed spherical space operations that eliminate the need to project embeddings to the tangent Euclidean space. The resulting model, DGS, is a unified framework that significantly outperforms all baselines, showing the effectiveness of capturing different structures in the knowledge graph using embedding spaces of different curvatures and properties.
Future directions include supporting geometric spaces of learnable curvature (to better capture the knowledge graph structure) and allowing a learnable offset between the origins of the two geometric spaces (to better accommodate uneven entity distributions and/or unbalanced ontology structures). We also aim to extend our dual-space model to a multi-space geometric embedding model.
6. Acknowledgements
This work was partially supported by NSF III-1705169, 1829071, 1937599, 2106859, 2119643; NIH R35-HL135772; NIBIB R01-EB027650; Amazon Research Awards; Cisco Research Grant USA000EP280889; NEC Gifts; Picsart Gifts; and Snapchat Gifts. The authors would also like to thank anonymous reviewers for their constructive feedback.
References
- Bai et al. (2021) Yushi Bai, Rex Ying, Hongyu Ren, and Jure Leskovec. 2021. Modeling Heterogeneous Hierarchies with Relation-specific Hyperbolic Cones. NeurIPS (2021).
- Balažević et al. (2019) Ivana Balažević, Carl Allen, and Timothy Hospedales. 2019. Multi-relational Poincaré Graph Embeddings. NeurIPS (2019).
- Beltrami (1897) Eugenio Beltrami. 1897. Teoria fondamentale degli spazii di curvatura costante. Annali di Matematica Pura ed Applicata.
- Berant and Liang (2014) Jonathan Berant and Percy Liang. 2014. Semantic Parsing via Paraphrasing. ACL (2014).
- Bonnabel (2013) Silvère Bonnabel. 2013. Stochastic gradient descent on Riemannian manifolds. IEEE (2013).
- Bordes et al. (2013) Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. Translating Embeddings for Modeling Multi-relational Data. NeurIPS (2013).
- Cannon et al. (1997) James W. Cannon, William J. Floyd, Richard Kenyon, and Walter R. Parry. 1997. Hyperbolic Geometry. MSRI.
- Chami et al. (2020) Ines Chami, Adva Wolf, Da-Cheng Juan, Frederic Sala, Sujith Ravi, and Christopher Ré. 2020. Low-Dimensional Hyperbolic Knowledge Graph Embeddings. ACL (2020).
- Chami et al. (2019) Ines Chami, Rex Ying, Christopher Ré, and Jure Leskovec. 2019. Hyperbolic Graph Convolutional Neural Networks. NeurIPS (2019).
- Dettmers et al. (2018) Tim Dettmers, Pasquale Minervini, Pontus Stenetorp, and Sebastian Riedel. 2018. Convolutional 2D knowledge graph embeddings. AAAI.
- Gu et al. (2018) Albert Gu, Frederic Sala, Beliz Gunel, and Christopher Ré. 2018. Learning mixed-curvature representations in product spaces. In ICLR.
- Hao et al. (2019) Junheng Hao, Muhao Chen, Wenchao Yu, Yizhou Sun, and Wei Wang. 2019. Universal Representation Learning of Knowledge Bases by Jointly Embedding Instances and Ontological Concepts. KDD (2019).
- Huang et al. (2019a) Xiao Huang, Jingyuan Zhang, Dingcheng Li, and Ping Li. 2019a. Knowledge Graph Embedding Based Question Answering. WSDM (2019).
- Huang et al. (2019b) Xiao Huang, Jingyuan Zhang, Dingcheng Li, and Ping Li. 2019b. Knowledge Graph Embedding Based Question Answering. WSDM (2019).
- Iyer et al. (2022) Roshni G. Iyer, Thuy Vu, Alessandro Moschitti, and Yizhou Sun. 2022. Question-Answer Sentence Graph for Joint Modeling Answer Selection. arXiv preprint arXiv:2203.03549 (2022).
- Iyer et al. (2021) Roshni G. Iyer, Wei Wang, and Yizhou Sun. 2021. Bi-Level Attention Graph Neural Networks. ICDM (2021).
- Kolyvakis et al. (2019) Prodromos Kolyvakis, Alexandros Kalousis, and Dimitris Kiritsis. 2019. HyperKG: Hyperbolic Knowledge Graph Embeddings for Knowledge Base Completion. arXiv preprint arXiv:1908.04895 (2019).
- Lehmann et al. (2015) Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas, Pablo N. Mendes, Sebastian Hellmann, Mohamed Morsey, Patrick van Kleef, Sören Auer, and Christian Bizer. 2015. DBpedia: A large-scale, multilingual knowledge base extracted from Wikipedia. In Semantic Web Journal. 167–195.
- Meng et al. (2019) Yu Meng, Jiaxin Huang, Guangyuan Wang, Chao Zhang, Honglei Zhuang, Lance Kaplan, and Jiawei Han. 2019. Spherical Text Embedding. NeurIPS (2019).
- Nickel and Kiela (2017) Maximilian Nickel and Douwe Kiela. 2017. Poincaré Embeddings for Learning Hierarchical Representations. NeurIPS (2017).
- Petersen et al. (2006) Peter Petersen, S. Axler, and K.A. Ribet. 2006. Riemannian Geometry. Springer.
- Suchanek et al. (2007) Fabian M. Suchanek, Gjergji Kasneci, and Gerhard Weikum. 2007. Yago: A Core of Semantic Knowledge. WWW.
- Sun et al. (2022) Li Sun, Zhongbao Zhang, Junda Ye, Hao Peng, Jiawei Zhang, Sen Su, and Philip S. Yu. 2022. A Self-supervised Mixed-curvature Graph Neural Network. AAAI (2022).
- Sun et al. (2020) Zequn Sun, Muhao Chen, Wei Hu, Chengming Wang, Jian Dai, and Wei Zhang. 2020. Knowledge Association with Hyperbolic Knowledge Graph Embeddings. ACL (2020).
- Sun et al. (2019) Zhiqing Sun, Zhi-Hong Deng, Jian-Yun Nie, and Jian Tang. 2019. RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space. ICLR (2019).
- Toutanova and Chen (2015) Kristina Toutanova and Danqi Chen. 2015. Observed versus latent features for knowledge base and text inference. In Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality.
- Trouillon et al. (2016) Théo Trouillon, Johannes Welbl, Sebastian Riedel, Éric Gaussier, and Guillaume Bouchard. 2016. Complex Embeddings for Simple Link Prediction. ICML (2016).
- Wang et al. (2019) Hongwei Wang, Fuzheng Zhang, Jialin Wang, Miao Zhao, Wenjie Li, Xing Xie, and Minyi Guo. 2019. RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems. WSDM (2019).
- Wang et al. (2021) Shen Wang, Xiaokai Wei, Cícero Nogueira dos Santos, Zhiguo Wang, Ramesh Nallapati, Andrew Arnold, Bing Xiang, Philip S. Yu, and Isabel F Cruz. 2021. Mixed-Curvature Multi-Relational Graph Neural Network for Knowledge Graph Completion. WWW (2021).
- Yang et al. (2015) Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng. 2015. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. ICLR (2015).
- Zhang et al. (2021) Shuai Zhang, Yi Tay, Wenqi Jiang, Da-cheng Juan, and Ce Zhang. 2021. Switch spaces: Learning product spaces with sparse gating. arXiv preprint arXiv:2102.08688 (2021).
- Zhang et al. (2020) Zhanqiu Zhang, Jianyu Cai, Yongdong Zhang, and Jie Wang. 2020. Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction. AAAI (2020).
Appendix A General Polar-Cartesian Conversion Framework
In this section, we detail our designed general procedures for efficiently converting arbitrary-dimensional embeddings between the polar and Cartesian coordinate representations, which are utilized by sub-models in DGS. These procedures include: (1) the Cartesian-to-polar coordinate transformation (MCP) and (2) the polar-to-Cartesian coordinate transformation (MPC).

A.1. MCP Transformation Procedure.
Using Figure 3, we derive the transformations for the 3-dimensional spherical surface and extend them to the general n-dimensional spherical surface.

A.2. MPC Transformation Procedure.
Using Figure 3, we derive the transformations for the n-dimensional spherical surface analogously.
Theorem.
The MCP procedure and MPC procedure are closed in the spherical space.
Proof.
The above equations for the MCP and MPC procedures directly use properties of right-triangle trigonometry for each embedding dimension, ensuring that the transformation is closed. ∎
Appendix B Batch Vector Transformation Procedure
We illustrate the batch version of the rotation operation presented in Section 3.1.1. Consider an entity that is connected to multiple relations. Instead of applying the vector transformation separately for each relation, we can batch the computation to obtain all results efficiently.
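A vectorized sketch is given below, again using the standard hyperspherical parameterization as a stand-in for the MCP/MPC conversions (an assumption on our part): the head is converted to polar coordinates once, all relation angle vectors are added in a single broadcasted step, and each row is converted back to Cartesian coordinates.

```python
import numpy as np

def batch_spherical_rotate(head, rel_angle_matrix):
    """Rotate one head-entity embedding by a whole batch of relation
    angle vectors at once; every output row keeps the head's norm."""
    d = head.shape[0]
    r = np.linalg.norm(head)
    # Cartesian -> polar, computed a single time for the head.
    angles = np.zeros(d - 1)
    for i in range(d - 2):
        angles[i] = np.arccos(np.clip(head[i] / (np.linalg.norm(head[i:]) + 1e-12), -1, 1))
    angles[-1] = np.arctan2(head[-1], head[-2])
    # One rotation per relation, wrapped into [0, 2*pi).
    rotated = np.mod(angles[None, :] + rel_angle_matrix, 2 * np.pi)   # (B, d-1)
    # Polar -> Cartesian for the whole batch.
    out = np.full((rel_angle_matrix.shape[0], d), r)
    for i in range(d - 1):
        out[:, i] *= np.cos(rotated[:, i])
        out[:, i + 1:] *= np.sin(rotated[:, i])[:, None]
    return out  # each row has norm r, so the batch stays on the sphere
```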
Theorem.
The batch vector transformation procedure is closed.
Proof.
Each relation applies the vector transformation to the entity, which has been shown to be closed: every dimension of the transformed head satisfies the angular constraints, since the angles wrap modulo 2π, and the norm constraint, since the transformation preserves the norm of the spherical space. ∎
Appendix C Riemannian retraction operation for the bridge space of DGS
Figure 4 describes the Riemannian retraction operation from the bridge space to the tangent Euclidean space. The bridge space is formed by intersecting the surface of the spherical ball with the Poincaré disk of the hyperbolic space. As shown in the figure, we separately map the Poincaré disk and the spherical ball to different tangent Euclidean spaces, and enable joint communication between the two spaces to represent the bridge entities. On the left, we perform an isomorphic mapping from the Poincaré disk to the hyperboloid model, as in (Chami et al., 2019), using:
(40) |
(41) |
and then map the result to the first tangent Euclidean space using the logarithmic map. On the right, we use the logarithmic map to map the spherical ball to the second tangent Euclidean space. We then allow for transformation between the two tangent Euclidean spaces through a pair of transformation mappings such that
(42) |
(43) |
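The sketch below illustrates the standard Poincaré-to-hyperboloid isometry used in (Chami et al., 2019) and pole-centered logarithmic maps into the two tangent Euclidean spaces; the linear maps A and B that stand in for the transformation mappings of Equations 42-43 are hypothetical placeholders, not the learned DGS parameters.

```python
import numpy as np

def poincare_to_hyperboloid(y):
    """Isometry from the Poincare ball to the hyperboloid (Lorentz) model."""
    y2 = np.dot(y, y)
    return np.concatenate(([1 + y2], 2 * y)) / (1 - y2)

def hyperboloid_log_at_origin(x):
    """Logarithmic map of a hyperboloid point at the pole (1, 0, ..., 0);
    returns the spatial part of the tangent vector."""
    spatial = x[1:]
    return np.arccosh(x[0]) * spatial / (np.linalg.norm(spatial) + 1e-12)

def sphere_log_at_pole(x, r=1.0):
    """Logarithmic map of a point on the radius-r sphere at the pole r*e_1."""
    pole = np.zeros_like(x)
    pole[0] = r
    theta = np.arccos(np.clip(np.dot(pole, x) / r**2, -1.0, 1.0))
    u = x - np.cos(theta) * pole
    return r * theta * u / (np.linalg.norm(u) + 1e-12)

# Hypothetical learnable linear maps A, B translating between the two
# tangent Euclidean spaces (stand-ins for the mappings of Eqs. 42-43).
d = 4
A, B = np.eye(d), np.eye(d)
v_hyp = hyperboloid_log_at_origin(poincare_to_hyperboloid(0.3 * np.ones(d)))
v_sph = sphere_log_at_pole(np.full(d, 1.0 / np.sqrt(d)))
to_sphere_tangent, to_poincare_tangent = A @ v_hyp, B @ v_sph
```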