Expected Distances on Manifolds of Partially Oriented Flags
Abstract
Flag manifolds are generalizations of projective spaces and other Grassmannians: they parametrize flags, which are nested sequences of subspaces in a given vector space. These are important objects in algebraic and differential geometry, but are also increasingly being used in data science, where many types of data are properly understood as subspaces rather than vectors. In this paper we discuss partially oriented flag manifolds, which parametrize flags in which some of the subspaces may be endowed with an orientation. We compute the expected distance between random points on some low-dimensional examples, which we view as a statistical baseline against which to compare the distances between particular partially oriented flags coming from geometry or data.
1 Introduction
Let be an -dimensional vector space. A flag in is a nested sequence of subspaces . If is the dimension of , then associated to the flag is an increasing sequence . We call the sequence the signature of the flag. For this paper, we will make the assumption that is a real vector space and the flag manifold will be the real manifold whose points parametrize all flags of signature . Flag manifolds are natural generalizations of Grassmann manifolds and have found applications in image analysis [9] and face, pose, and action recognition [3, 8]; as Ye, Wong, and Lim point out [13], they are also implicit in many tasks in numerical and statistical analysis, ranging from mesh refinement to multiresolution analysis to canonical correlation analysis. The key feature of flag manifolds is that they locally look the same at each point, i.e., flag manifolds are homogeneous spaces. Homogeneous spaces naturally admit nice coordinates which allow for tangible calculations.
In this paper we generalize the notion of flag manifold, defining partially oriented flag manifolds. These manifolds are slight generalizations of the partially oriented flag manifolds introduced by Lam [7] and studied by Sankaran and Zvengrowski [11, 12], among others. Flag manifolds are natural models for data which is properly represented as (nested) subspaces, rather than as individual vectors, and partially oriented flag manifolds should prove useful as models for subspace data when some or all of the nested subspaces are equipped with an orientation. These spaces are rather simple to define as homogeneous spaces, but do not seem to have appeared previously in the data science literature.
Given two pieces of data, each represented as a point in some Riemannian manifold or more general metric space, it is natural to use the distance between the two points as a measure of similarity. However, the raw distance is meaningless without further context indicating whether the distance is, say, much smaller than expected. A reasonable statistical baseline for such a comparison is the expected distance between two random points in the space. With an eye towards applications of partially oriented flag manifolds to data problems, we determine the expected geodesic distance between random points in all manifolds of partially oriented flags in ; compare Absil, Edelman, and Koev’s work, which uses a slightly different notion of distance in Grassmannians [1].
More precisely, recall that is the manifold of all -flags in ; that is, each point represents a line inside a plane inside . We will shortly introduce some more involved notation for this space, but for the moment let be the manifold of such flags in which the line has a preferred orientation, but the plane and do not. Then the most interesting expectations we compute are:
Theorem 1.
The expected distance between two random points in is
The expected distance between random -flags is
We define the partially oriented flag manifolds in Section 2 and discuss a number of examples and relationships to more familiar manifolds, including Grassmannians, Stiefel manifolds, and classical flag manifolds. In Section 3 we compute the expected distance between two random points in both analytically and numerically using Monte Carlo integration. While this computation is not new, it serves as a starting point and template for determining the expected distances between random points in all the manifolds of (partially oriented) flags on , which is the focus of Section 4.
2 Flags and Partially Oriented Flags
Let denote the set of all positive integers and let . An ordered partition of the integer is a tuple with such that . (Contrast this with the usual definition of a partition, in which the order does not matter, and hence one can assume the parts appear in decreasing order.) In this case, we say the size of is . Notice that for an ordered partition of size , the symmetric group acts by permuting elements; i.e., if we define . Now suppose that is the collection of indices of the partition . Then a partition of the set is a collection of disjoint nonempty subsets of whose union is all of . We call the partition the trivial partition, and the partition the complete partition. All other partitions of will be called proper partitions.
Let denote the group of real orthogonal matrices. For any fixed ordered partition of , define the group . An easy exercise in group theory shows that if , then . Since elements of are of the form where , we can interpret as a block-diagonal subgroup of .
Now suppose that is a subset of . We define a special orthogonal block of to be the set
For brevity of notation, we will denote this set as . If is a partition of , define a special orthogonal partition of to be the group . If the partition of is trivial, complete, or proper, we apply the same adjective to the corresponding special orthogonal partition. is a block-diagonal subgroup of which is finite if and only if . Since the are subgroups of , the quotients will be homogeneous spaces, which are the central objects of this paper.
Definition 1.
Suppose that is any ordered partition of the integer . Then if is a partition of , we may define the orbit space . If the partition is trivial, then is called a flag manifold. If is a complete partition, we say that is an oriented flag manifold. Finally, if the partition is proper, then is called a partially oriented flag manifold.
Example 2.1.
If and is the trivial partition, then
the Grassmannian of -dimensional linear subspaces of . If instead we take the complete partition , then
the oriented Grassmannian of oriented -dimensional subspaces of , which double covers .
Example 2.2.
If , recall that the flag manifold consists of nested subspaces
where .
If is an ordered partition of the integer , define for each . If is the trivial partition of , then
really is a flag manifold. Notice that if , then yields a different space parametrizing entirely different geometric objects which is nonetheless diffeomorphic to . This helps explain our use of ordered partitions of rather than just partitions: we are thinking of the spaces as data models, with points corresponding to actual data, so we want to avoid the complication of applying a diffeomorphism before we can think of our data as points in a (partially oriented) flag manifold.
Example 2.3.
Take and suppose that is the complete partition of . Then
the Stiefel manifold of orthonormal -frames in .
Example 2.4.
In this example, we’ll consider all special orthogonal partitions when and . If is the complete partition of , then is the trivial group, so . At the other extreme, if is the trivial partition, then and is the copy of the Klein 4-group
If , , and are proper partitions, then the corresponding groups are
Clearly each is a subgroup of of index 2, so in the quotients we have natural double covers captured by the tower in Figure 1.
The double coverings in the previous example are special cases of a more general phenomenon. Given an ordered partition of and any partition of , suppose that is a refinement of . Then the numbers of sets contained in and differ by an integer . It follows that , and hence we have the following proposition:
Proposition 2.
Suppose that is an ordered partition of and is a set partition of . Given any refinement of with , is a -cover of .
Example 2.5.
In light of the proposition and Example 2.1, we re-verify that is a double cover of .
Since is defined by quotients of orthogonal groups, we can find the total volume of . If is of length and , we have , so, for the purposes of computing volume, we may assume that is in decreasing order. To simplify notation, let denote the unit -sphere and set for ; then
so we see:
Proposition 3.
With notation as above,
In this expression, denotes the conjugate of the partition given by taking the Young diagram corresponding to , reflecting across the main diagonal, and then taking the partition corresponding to this new diagram; see Figure 2 for an example. (It is in converting the partition into a Young diagram that it is important we assume is in decreasing order.) Also, we use the notation and .
Figure 2: The Young diagram of the partition (4, 3, 2, 2, 1) (left) and of its conjugate partition (5, 4, 2, 1) (right).
3 SO(3) and Lifts to the 3-Sphere
In this section and the next, we generalize [10] and compute the expected (Riemannian) distance between two random points in a partially oriented flag manifold. We will discuss this computation for all flags in the case . The strategy for each of these computations is similar. The idea is to lift to the unit 3-sphere and carry out computations upstairs. Lifting to gives us a natural coordinate system for , which is then used to describe the Haar measure on . The push-forward measure is then used to find an invariant measure on each of the flags, which amounts to only changing the region of integration in each case. Additionally, we’ll discuss how we can do these calculations via Monte Carlo integration.
3.1 Lifting for Analytic Computation
We begin with a known calculation [10] of the expected distance between random points in . The calculations for partial flags will follow from a refinement of this approach. First we’ll briefly describe the Haar measure on as parametrized by . It is not hard to see that is double covered by the unit 3-sphere. Indeed, has a natural group structure as the unit quaternions, which act on the quaternions by conjugation. This action clearly fixes the real line since the reals commute with quaternions under multiplication. Since conjugation also preserves the quaternionic norm, the orthogonal complement of the real line, namely the purely imaginary quaternions, is likewise invariant under the action of unit quaternions. The restriction of the action to the purely imaginary quaternions is an isometry. To see this, suppose that is a unit quaternion and that and are both purely imaginary quaternions. Then
By identifying the purely imaginary quaternions with $\mathbb{R}^3$, this calculation shows that the unit quaternions act by isometries on $\mathbb{R}^3$. In fact, this action is by rotations: if $v, w \in \mathbb{R}^3$ are unit vectors, which we interpret as purely imaginary quaternions, and we define $q = \cos\theta + (\sin\theta)\, w$, then a straightforward calculation shows that

$$q v \bar{q} = (\cos 2\theta)\, v + (\sin 2\theta)\, w \times v + (1 - \cos 2\theta)\, (w \cdot v)\, w, \qquad (1)$$

which is the Rodrigues formula for rotation of $v$ around the axis $w$ by an angle $2\theta$. Therefore, each unit quaternion acts on $\mathbb{R}^3$ by rotation; moreover, since $q$ and $-q$ both induce the same rotation, it readily follows that $\mathrm{SO}(3)$ is double covered by the unit quaternions.
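For illustration, here is a minimal numerical check of (1) in Python with NumPy (the helper quat_mult and the random choices below are ad hoc, not taken from anything above): it conjugates a purely imaginary quaternion by $q = \cos\theta + (\sin\theta)\,w$ and compares the result with the Rodrigues rotation by $2\theta$.

```python
import numpy as np

def quat_mult(p, q):
    """Hamilton product of quaternions p = (p0, p1, p2, p3) and q = (q0, q1, q2, q3)."""
    p0, pv = p[0], p[1:]
    q0, qv = q[0], q[1:]
    return np.concatenate(([p0 * q0 - pv @ qv],
                           p0 * qv + q0 * pv + np.cross(pv, qv)))

rng = np.random.default_rng(0)
theta = rng.uniform(0, np.pi)
w = rng.normal(size=3); w /= np.linalg.norm(w)   # unit rotation axis, viewed as an imaginary quaternion
v = rng.normal(size=3); v /= np.linalg.norm(v)   # unit vector to be rotated

q = np.concatenate(([np.cos(theta)], np.sin(theta) * w))
q_conj = np.concatenate(([q[0]], -q[1:]))

# Left-hand side of (1): conjugate the imaginary quaternion v by q, then read off the imaginary part.
lhs = quat_mult(quat_mult(q, np.concatenate(([0.0], v))), q_conj)[1:]

# Right-hand side of (1): Rodrigues rotation of v about the axis w by the angle 2*theta.
rhs = (np.cos(2 * theta) * v + np.sin(2 * theta) * np.cross(w, v)
       + (1 - np.cos(2 * theta)) * (w @ v) * w)

print(np.allclose(lhs, rhs))   # True
```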
The double covering of provides us with a natural way to parametrize : each element of corresponds to an antipodal pair of points in , so each rotation corresponds to an (almost always unique) point in the hemisphere of where the first coordinate is positive. We can use hyperspherical coordinates on to parametrize :
where and . The volume form on is
but we need to make a slight adjustment to write down the volume form on . If and is a purely imaginary unit quaternion, then the point has positive first coordinate and lies at a distance from in . However, looking at (1), the corresponding element of is at Frobenius distance from the identity. Therefore, the map scales distances by 2 and hence 3-dimensional volumes by , so we conclude that, with respect to hyperspherical coordinates,
With this in mind, an easy calculation shows that
as expected, since the first column lies on the unit 2-sphere, with an area of 4π, the second column lies on a perpendicular unit circle of circumference 2π, and the third column is determined by the first two.
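To make the bookkeeping explicit, here is the computation in one standard choice of hyperspherical coordinates on the 3-sphere (the notation here is illustrative and may differ in inessential ways from the coordinates above):

$$(x_1, x_2, x_3, x_4) = (\cos\psi,\ \sin\psi\cos\theta,\ \sin\psi\sin\theta\cos\phi,\ \sin\psi\sin\theta\sin\phi), \qquad dV_{S^3} = \sin^2\psi \sin\theta \, d\psi \, d\theta \, d\phi,$$

so that, restricting to the hemisphere $\psi \in [0, \pi/2)$ and scaling 3-dimensional volumes by $2^3$,

$$\operatorname{vol}(\mathrm{SO}(3)) = 8 \int_0^{2\pi}\!\int_0^{\pi}\!\int_0^{\pi/2} \sin^2\psi \sin\theta \, d\psi \, d\theta \, d\phi = 8 \cdot \frac{\pi}{4} \cdot 2 \cdot 2\pi = 8\pi^2 = (4\pi)(2\pi).$$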
Now, we can compute the expected distance between two random points in . Since the Frobenius distance is equivariant with respect to the left action of on itself, we can rotate one of the two points to the identity and, equivalently, compute the expected distance from a random point in to the identity. Moreover, since an element of is a rotation by an angle around an axis , and is exactly the Frobenius distance to the identity, we are computing the expected rotation angle of a random element of . (This can also be computed by integrating the rotation angle against the known density of the circular real ensemble [6].)
In order to compute the expected rotation, we note that if , the angle of rotation is . In other words, up in we are computing the expectation of twice the distance from a random point in the hemisphere of points with positive real part to the identity element :
This calculation agrees with that found in [10].
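For concreteness, the one-dimensional reduction looks as follows, assuming the metric is normalized so that the distance from a rotation to the identity equals its rotation angle $\theta \in [0, \pi]$, and using the well-known density $(1 - \cos\theta)/\pi$ of that angle for a Haar-random rotation:

$$\int_0^\pi \theta \, \frac{1 - \cos\theta}{\pi}\, d\theta = \frac{\pi}{2} + \frac{2}{\pi} \approx 2.2074.$$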
3.2 Monte Carlo Methods
This computation can also be done numerically using Monte Carlo integration. First, we will describe how to randomly generate matrices in . With this, we will be able to easily compute expected distances on . From here, we will be able to use the same algorithms with slight modifications for similar computations on flag and partially oriented flag manifolds.
Following Chikuse [2], we can generate random orthogonal matrices, as in Algorithm 1, by applying Gram–Schmidt to a random Gaussian matrix.
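For illustration, a minimal NumPy sketch of this step (not Algorithm 1 itself; the function name is ad hoc, and QR with the sign correction below is just Gram–Schmidt in disguise):

```python
import numpy as np

def random_special_orthogonal(n=3, rng=np.random.default_rng()):
    """Draw a Haar-distributed matrix from SO(n) by orthonormalizing a Gaussian matrix."""
    A = rng.normal(size=(n, n))
    Q, R = np.linalg.qr(A)        # QR is Gram-Schmidt applied to the columns of A
    Q = Q * np.sign(np.diag(R))   # sign correction makes Q Haar-distributed on O(n)
    if np.linalg.det(Q) < 0:      # flip one column if det = -1; this maps Haar measure
        Q[:, 0] *= -1             # on the other component onto Haar measure on SO(n)
    return Q
```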
The next algorithm depends on the geodesic distance on . Fix , and let be the spectral decomposition of . Then because is an isometry,
This means the geodesic distance between and depends only on the eigenvalues of . Thus when approximating the integral via a Monte Carlo algorithm, we only need to generate one random special orthogonal matrix at each step, which reduces computation time significantly.
In fact, the distance is described in [4] and is given by
where are the eigenvalues of . The factor of ensures that we aren’t double-counting, since the eigenvalues come in complex conjugate pairs. Algorithm 2 describes the Monte Carlo experiment. Using Algorithms 1 and 2 with and gives the approximation , with absolute error .
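A corresponding sketch of the Monte Carlo experiment (illustrative rather than a transcription of Algorithm 2; it reuses random_special_orthogonal from the sketch above, and the sample size is arbitrary):

```python
import numpy as np

def geodesic_distance_to_identity(Q):
    """Geodesic distance from the identity to Q, computed from the eigenvalue angles of Q;
    the division by sqrt(2) avoids counting both members of each conjugate pair."""
    angles = np.angle(np.linalg.eigvals(Q))
    return np.linalg.norm(angles) / np.sqrt(2)

rng = np.random.default_rng(1)
num_samples = 100_000
estimate = np.mean([geodesic_distance_to_identity(random_special_orthogonal(3, rng))
                    for _ in range(num_samples)])
print(estimate)   # with this normalization, the estimate should be close to pi/2 + 2/pi
```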
4 Expected Distances Between Partially Oriented Flags
Following the example computations on , we now compute the expected distance between two random points on each (partially oriented) flag manifold obtained from . In all but one case we’ll be able to find an analytic expression for the expectation, and in all cases we’ll get a numerical result from a Monte Carlo experiment.
Notice that for of length , , and a partition of , we have . This is true in general, but in the special case of we also observe that if partitions and of have the same order, then and are diffeomorphic. Combined with the previous fact, we see that so long as the partitions of and of are the same size. This reduces the number of computations to be done.
In each case, we can use a trick already mentioned above to simplify our calculations. Since each is a homogeneous space, and since we will always choose the Riemannian submersion metric on the quotient which is invariant under the left action, we can always move one point to the orbit of the identity, and hence the expected distance between two random points in is the same as the expected distance between a single random point and the identity orbit (or any other preferred point).
4.1 The 2-Sphere and the Projective Plane
In this section we’ll consider the simplest (oriented) flag manifolds derived from . (For completeness, we can consider the trivial flag manifold, which clearly has expected distance 0.) Suppose that . Let and denote the complete and trivial partitions of , respectively. Then we have the identifications and .
Using standard spherical coordinates, the area form on is , where is the polar angle and is the azimuthal angle. Indeed, we verify from the volume formula in Proposition 3 that , the area of the unit sphere. Since is homogeneous, it suffices to consider the distance between a point and the north pole, which is simply the polar angle . Hence we have
This is what we expect, since a point is just as likely to be in the northern hemisphere as in the southern. The calculation can also be carried out using Monte Carlo integration: generate a 3-vector with independent, normally distributed entries. Normalize to get which is uniformly distributed on the sphere. As above, it suffices to compute the angle between and the north pole. Taking the average of samples gives the numerical estimate; indeed, an experiment with yields the estimate , with an absolute error of .
The story for is similar. The map given by is a Riemannian submersion, so the area form on is . Computing the distance between points in is equivalent to computing the minimum distance between two pairs of antipodal points in , so our computation reduces to determining the expected angle between a random point in the northern hemisphere and the north pole:
To carry this out with Monte Carlo methods, take as before, but compute the average of where is the point at the north pole. An experiment with produces the estimate .
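Both experiments are short enough to record in a single illustrative sketch (sample size and seed arbitrary); under the conventions above, the two averages approach $\pi/2$ and $1$, respectively.

```python
import numpy as np

rng = np.random.default_rng(2)
num_samples = 1_000_000

x = rng.normal(size=(num_samples, 3))
x /= np.linalg.norm(x, axis=1, keepdims=True)       # uniform points on the unit 2-sphere

theta = np.arccos(np.clip(x[:, 2], -1.0, 1.0))      # polar angle to the north pole (0, 0, 1)

print("sphere estimate:", theta.mean())                                        # distance on the sphere
print("projective plane estimate:", np.minimum(theta, np.pi - theta).mean())   # fold antipodes together
```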
4.2 Flags with the Ordered Partition (1, 1, 1)
In this section we consider flags with the ordered partition . First we’ll describe the problem of finding expected distances via Monte Carlo integration. Following this, we’ll take a look at each individual case. Suppose that is any partition of . In this case contains only a finite number of elements in , as we have already seen in Example 2.4. Computing the expected distance on is hardly any different from what we’ve done before: we simply take the minimum of the distances from the points of the orbit to the identity, as in Algorithm 3.
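For illustration, a sketch of this procedure (not Algorithm 3 itself; it reuses the helper functions from the sketches in Section 3.2, and H stands for whatever finite list of matrices represents the orbit of the identity). The example at the end takes H to be the Klein four-group of diagonal sign matrices with determinant 1, which is the orbit relevant to the complete flag manifold of Section 4.2.3.

```python
import numpy as np

def expected_distance_to_orbit(H, num_samples=100_000, rng=np.random.default_rng(3)):
    """Monte Carlo estimate of the expected distance from a Haar-random element of SO(3)
    to the orbit of the identity, where the orbit is given by the finite list H."""
    total = 0.0
    for _ in range(num_samples):
        Q = random_special_orthogonal(3, rng)                                # sampler from Section 3.2
        total += min(geodesic_distance_to_identity(h.T @ Q) for h in H)     # distance to the nearest orbit point
    return total / num_samples

# Example: the complete flag manifold of Section 4.2.3.
H = [np.diag(d) for d in ([1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1])]
print(expected_distance_to_orbit(H))
```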
4.2.1 Oriented Flags
When is the complete partition, the oriented flag manifold is simply SO(3), and we saw in Section 3.1 that the expected distance between two random points in this space is
4.2.2 Partially Oriented Flags
For the case of the partially oriented flags we may choose to be or as in Example 2.4. In any case, will contain two matrices. Without loss of generality, suppose that . We can readily apply Algorithm 3 with to see that the expected distance is .
To get an analytic expression, the strategy is to lift the computation to as in Section 3. Since the composite map is (up to the scale factor 2) a Riemannian submersion, the distance between a pair of points in is equal to twice the minimum distance between the sets of preimages in .
As usual, the expected distance between two random points in is the same as the expected distance between a single random point and any preferred point, which we will take to be the identity coset . In turn, the identity coset consists of the identity and the rotation by angle around the axis , so its preimages in are ; in general, if is rotation by around an axis , then the preimage of in is , where . Then, since multiplication by and multiplication by are isometries,
where the minimum distance will be achieved by the element of which is closer to than to , , or .
Therefore, the expected distance between a random coset and the identity coset (and hence between two random points in ) is simply the expectation of twice the distance from to a random point in the subset of which is closer to than to , , or . In terms of Cartesian coordinates , this is precisely the set ; in hyperspherical coordinates, is equivalent to .
Hence, the volume of the partial flag is given by
where we use symmetry to restrict to and avoid any issues with .
Mathematica will happily compute the above integral to be , which agrees with the volume calculation from Proposition 3, but the limits of integration on the innermost integral make this unpleasant to compute by hand.
We can make things easier by changing coordinate systems. Thinking of as the unit sphere in , we will interpret points on as pairs with . Then the condition is equivalent to – which will be a lens-shaped region in ; the analogous region on is a lune – so it is desirable to use as one of our coordinates. In fact, means that
for and . Notice that each value of determines a torus in except the extremal values and , where the torus collapses to the unit circle in the - or -plane. See Figure 3 for a visualization.
We can write these coordinates – which we call join coordinates because they give a concrete realization of as the topological join of two copies of – in terms of Cartesian coordinates as
and the volume form on is easily computed to be . In these coordinates, the volume of is easy to compute by hand, as it reduces to a simple -substitution:
We want to compute the expectation of twice the spherical distance from to a random point in the region . The distance is simply , and so the expected distance between two random points in is
by integrating out and using the fact that the integrand is even in .
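For reference, one standard normalization of such coordinates (one common convention, which may differ slightly from the normalization used above) realizes $S^3 \subset \mathbb{R}^2 \times \mathbb{R}^2$ as

$$(x_1, x_2, x_3, x_4) = (\cos t \cos\alpha,\ \cos t \sin\alpha,\ \sin t \cos\beta,\ \sin t \sin\beta), \qquad t \in [0, \tfrac{\pi}{2}],\quad \alpha, \beta \in [0, 2\pi),$$

with round metric $dt^2 + \cos^2 t\, d\alpha^2 + \sin^2 t\, d\beta^2$ and volume form $\cos t \sin t\, dt\, d\alpha\, d\beta$. Integrating the volume form recovers $\operatorname{vol}(S^3) = \tfrac{1}{2}(2\pi)(2\pi) = 2\pi^2$, and the factor $\cos t \sin t$ is what makes substitutions such as $u = \sin^2 t$ natural.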
This is slightly tedious but essentially straightforward to compute using the substitutions and , producing the first half of Theorem 1, which we restate in slightly more general form:
Theorem 4.
For , the expected distance between two random points in is
Note that
which once again shows that our Monte Carlo experiment does a good job of approximating the analytic result.
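As an additional cross-check carried out entirely on the 3-sphere (an illustrative sketch relying only on the description above: the distance to the identity coset is twice the spherical distance to the nearest of the four lifted points, two of which are $\pm 1$; by symmetry the axis of the half-turn is immaterial, and the sketch places its lifts at $\pm(0, 1, 0, 0)$):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=(1_000_000, 4))
x /= np.linalg.norm(x, axis=1, keepdims=True)   # uniform points on the unit 3-sphere

# Spherical distance to the nearest of +-(1,0,0,0) and +-(0,1,0,0) is arccos(max(|x1|, |x2|));
# the distance in the partially oriented flag manifold is twice that.
nearest = np.maximum(np.abs(x[:, 0]), np.abs(x[:, 1]))
print((2 * np.arccos(np.clip(nearest, 0.0, 1.0))).mean())
```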
4.2.3 Flags
Finally, we consider the case when is the trivial partition, so contains four matrices as in Example 2.4 and , the manifold of (complete) flags on . We may readily apply Algorithm 3 with to find that the expected distance between two random points in is .
Finding an explicit integral for this calculation only requires a few additional changes to the bounds from the previous section. The orbit of the identity lifts to , which are the vertices of the regular 16-cell dual to the standard hypercube. Other orbits are simply rotated copies of the orbit of the identity, so they also get lifted to regular 16-cells. Hence, computing the expected distance between points in is equivalent to computing the expected distance from to all the other unit quaternions which are closer to than any of the other integer points on .
This set is nothing but the radial projection of the cube
to , namely those points with , , and . This spherical cube is partitioned into spherical tetrahedra, each congruent to the spherical tetrahedron . To integrate over this tetrahedron in hyperspherical coordinates we get the limits of integration , , and . Since the simplex is of the full spherical cube, we see that
using Mathematica’s numerical integration algorithm reassuringly yields a number numerically indistinguishable from , which we know from Proposition 3 is the true volume.
Therefore, the expected distance between two points on is given by the integral
In join coordinates, the corresponding integral turns out to be
It is not clear how to evaluate either of these integrals exactly, though the former can be reduced to the one-dimensional integral given in the second half of Theorem 1, which we restate:
Theorem 5.
The expected distance between two random -flags is
Of course, this can be numerically evaluated to an arbitrary degree of precision; to 20 digits, we can compute that the expected distance between two random points in is . According to the PSLQ algorithm [5], this value does not appear to be in the vector space over generated by and , unlike all the other expected values we have computed.
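The same quaternionic shortcut gives a quick numerical cross-check of Theorem 5 (an illustrative sketch using the description from Section 4.2.3, in which the identity coset lifts to the eight integer points of the 3-sphere, so that the distance is $2\arccos(\max_i |x_i|)$):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=(1_000_000, 4))
x /= np.linalg.norm(x, axis=1, keepdims=True)   # uniform points on the unit 3-sphere

# Twice the spherical distance to the nearest of the eight points +-1, +-i, +-j, +-k.
dist = 2 * np.arccos(np.clip(np.abs(x).max(axis=1), 0.0, 1.0))
print(dist.mean())
```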
5 Conclusion and Open Questions
The manifolds of partially oriented flags described in this paper have the potential to be a natural home for data which is interpretable in terms of nested subspaces, of which some may be oriented. We encourage mathematicians, engineers, and other practitioners who are beginning to see the usefulness of flag manifolds in data analysis to keep these spaces in mind.
While partially oriented flag manifolds have been studied by algebraic topologists [7, 11, 12], we have not been able to find much evidence of their being studied by geometers. Given the fundamental role played by flag varieties in algebraic geometry, it would be interesting to find a more algebraic definition or interpretation of these spaces. We also look forward to further geometric computation on these spaces, whether it be computing expected distances on a larger class of partially oriented flag manifolds or determining other geometric quantities of interest.
For data living in a metric space, the expected distance between two random points in that space is a simple statistical baseline for distance-based similarity measures. While this quantity can often be estimated effectively using Monte Carlo techniques, it would be interesting to see analytic expressions for expected distance on spaces which are nice geometrically or useful as data models.
Acknowledgments
This project grew out of conversations in the Pattern Analysis Lab at Colorado State University, and we are very grateful to all of the participants in the lab for ongoing inspiration. This work was partially supported by the National Science Foundation (ATD #1712788, CCF–BSF:CIF #1830676, CP) and the Simons Foundation (#354225, CS).
References
- [1] Pierre-Antoine Absil, Alan Edelman, and Plamen Koev. On the largest principal angle between random subspaces. Linear Algebra and its Applications, 414(1):288–294, 2006.
- [2] Yasuko Chikuse. Statistics on Special Manifolds, volume 174 of Lecture Notes in Statistics. Springer–Verlag, New York, NY, 2003.
- [3] Bruce Draper, Michael Kirby, Justin Marks, Tim Marrinan, and Chris Peterson. A flag representation for finite collections of subspaces of mixed dimensions. Linear Algebra and its Applications, 451:15–32, 2014.
- [4] Alan Edelman, Tomás A Arias, and Steven T Smith. The geometry of algorithms with orthogonality constraints. SIAM Journal on Matrix Analysis and Applications, 20(2):303–353, 1998.
- [5] Helaman R P Ferguson, David H Bailey, and Steve Arno. Analysis of PSLQ, an integer relation finding algorithm. Mathematics of Computation, 68:351–369, 1999.
- [6] Vyacheslav Leonidovich Girko. Distribution of eigenvalues and eigenvectors of orthogonal random matrices. Ukrainian Mathematical Journal, 37(5):457–463, 1985. Translated from Ukrainskii Matematicheskii Zhurnal 37(5):568–575, 1985.
- [7] Kee Yuen Lam. A formula for the tangent bundle of flag manifolds and related manifolds. Transactions of the American Mathematical Society, 213:305–314, 1975.
- [8] Tim Marrinan, J Ross Beveridge, Bruce Draper, Michael Kirby, and Chris Peterson. Flag manifolds for the characterization of geometric structure in large data sets. In Assyr Abdulle, Simone Deparis, Daniel Kressner, Fabio Nobile, and Marco Picasso, editors, Numerical Mathematics and Advanced Applications – ENUMATH 2013, pages 457–465. Springer International Publishing, Cham, 2015.
- [9] Yasunori Nishimori, Shotaro Akaho, and Mark D Plumbley. Riemannian optimization method on generalized flag manifolds for complex and subspace ICA. AIP Conference Proceedings, 872(1):89–96, 2006.
- [10] Eugene Salamin. Application of quaternions to computation with rotations. Working paper, Stanford AI Lab, 1979.
- [11] Parameswaran Sankaran and Peter Zvengrowski. Stable parallelizability of partially oriented flag manifolds. Pacific Journal of Mathematics, 128(2):349–359, 1987.
- [12] Parameswaran Sankaran and Peter Zvengrowski. Stable parallelizability of partially oriented flag manifolds. II. Canadian Journal of Mathematics. Journal Canadien de Mathématiques, 49(6):1323–1339, 1997.
- [13] Ke Ye, Ken Sze-Wai Wong, and Lek-Heng Lim. Optimization on flag manifolds. Preprint, arXiv:1907.00949 [math.OC], 2019.