Breaking Multivariate Records
Abstract.
For a sequence of i.i.d. -dimensional random vectors with independent continuously distributed coordinates, say that the th observation in the sequence sets a record if it is not dominated in every coordinate by an earlier observation; for , say that the th observation is a current record at time if it has not been dominated in every coordinate by any of the first observations; and say that the th observation breaks records if it sets a record and there are observations that are current records at time but not at time .
For general dimension , we identify, with proof, the asymptotic conditional distribution of the number of records broken by an observation given that the observation sets a record.
Fix , and let be a random variable with this distribution. We show that the (right) tail of satisfies
and
When , the description of in terms of a Poisson process agrees with the main result from [3] that has the same distribution as , where . Note that the lower bound on implies that the distribution of is not (shifted) Geometric for any .
We show that as ; in particular, in probability as .
Key words and phrases:
Multivariate records, Pareto records, record breaking, current records, maxima, Poisson point processes, asymptotics2010 Mathematics Subject Classification:
Primary: 60D05; Secondary: 60F051. Introduction and main result
Let be i.i.d. (independent and identically distributed) copies of a random vector with independent coordinates, each uniformly distributed over the unit interval. In this paper, for general dimension , we prove (Theorem 1.3) that the conditional distribution of the number of records broken by the th observation given that sets a record has a weak limit [call it , the distribution or law of a random variable ] as , and we identify this limit. (See Definition 1.1 for the relevant definitions regarding records.) The case is trivial: with probability . The case is treated in depth in [3], where it is shown that with .
Notation. We let according as is true or false, and for a random variable and an event we write as shorthand for . We write or for natural logarithm. For -dimensional vectors and , write to mean that for . The notation means . The notations and each mean that either or . Whenever we speak of incomparable vectors, we will mean vectors belonging to that are (pairwise-)incomparable with respect to the partial order . We write for the sum of coordinates of and for -norm; thus if , where is the vector .
We begin with some relevant definitions, taken from [5; 4; 3]. We give definitions for record-large values; we consider such records exclusively except in the case of Theorem 1.3′, which deals with record-small values (for which the analogous definitions are obvious).
Let be i.i.d. (independent and identically distributed) copies of a random vector with independent coordinates, each drawn from a common specified distribution. We mainly consider the case (as in [5; 4]) where the common distribution is Exponential, but in Theorem 1.3′ we consider the uniform distribution over the unit interval. As far as counting records or broken records, any continuous distribution gives the same results.
Definition 1.1.
(a) We say that is a Pareto record (or simply record, or that sets a record at time ) if for all .
(b) If , we say that is a current record (or remaining record, or maximum) at time if for all .
(c) If , we say that breaks (or kills) records if sets a record and there exist precisely values with such that is a current record at time but is not a current record at time .
(d) For (or , with the obvious conventions) let denote the number of records with , and let denote the number of remaining records at time .
Remark 1.2.
Definition 1.1 is illustrated in Figures 1–2, adapted from [3, Figure 1] and [4, Figure 2], respectively. In dimension , remaining records can be ordered northwest to southeast, as seen in Figure 1.
In dimensions , the record-setting region is a (typically) more complicated set, as illustrated in Figure 2.
A maximum for a given set is defined in similar fashion to Definition 1.1(b): We say that is a maximum of if for all . In this language, if we have that is a remaining record at time if is a maximum of .
Here is the main result of this paper, illustrated in Figure 3.
Theorem 1.3.
Let be i.i.d. -variate observations, each with independent Exponential coordinates. Let if does not set a record, and otherwise let denote the number of remaining records killed by . Then , conditionally given , converges in distribution as to the law of , where
Here is distributed standard Gumbel and
in a nonhomogeneous Poisson point process in with intensity function
It is easy to see from the form of the intensity function that, for each , the distribution of is unchanged if is replaced by any point with .
By using the decreasing bijection from to , which is also a bijection from to , Theorem 1.3 can be recast equivalently as follows:
Theorem 1.3′. Suppose that independent -variate observations, each uniformly distributed in the unit hypercube , arrive at times , and consider record-small values. Let if the observation is not a new record, and otherwise let denote the number of remaining records killed by the observation. Then , conditionally given , converges in distribution as to the law of , where
Here is distributed Exponential and
in a (homogeneous) unit-rate Poisson point process in the set
Section 2 gives a non-rigorous but informative proof of Theorem 1.3, and the much longer Section 3 gives a rigorous proof. Tail probabilities for are studied (asymptotically) for each fixed in Section 4; moments are considered, too. Asymptotics of as are studied in Section 5; in particular, we find that converges in probability to . Finally, in Section 6 we show that the main Theorem 1.3 of this paper reduces to the main Theorem 1.1 of [4] when .
2. Informal proof of main theorem
In this section we give an informal (heuristic) proof of Theorem 1.3; a formal proof is given in the next section. The informal proof suggests that it should be possible to prove extensions of Theorem 1.3 (under the same hypotheses as the theorem and using the same notation) such as the following, but we haven’t pursued such extensions in this paper.
Possible extension. Let . Over the event , let , denote the values of the records killed by , listed (for definiteness) in increasing order of first coordinate, and let for . Then converges in distribution to
here, over the event , the points , are the maxima counted by , listed in increasing order of first coordinate, and for .
Before beginning our heuristic proof of Theorem 1.3, we gather and utilize important information from [5] in the following remark.
Remark 2.1.
(b) Recall from [5, proof of Theorem 1.4] that, conditionally given , the joint density of
with respect to the product of Lebesgue measure on and uniform distribution on the probability simplex is
(2.2) |
and, as , converges pointwise to
the density (with respect to the same product measure) of the standard Gumbel probability measure and uniform measure on . Thus by Scheffé’s theorem (e.g., [1, Theorem 16.12]), there is total variation convergence of to in obvious notation.
(c) We also claim that
(2.3) |
as . To see this, first observe that as we have [using (2.2) and (2.1)] that
For fixed , as the integrand converges to
so it suffices to invoke the dominated convergence theorem. For we can dominate the integrand by
and this last expression integrates: Indeed, as , it is asymptotically equivalent to , and as , it is asymptotically equivalent to .
Here now is a heuristic proof of Theorem 1.3. Recall that we use the shorthand notation for natural logarithm. From Remark 2.1(a), the conditional density of given converges pointwise to the standard Gumbel density as ; further, the conditional distribution of given and is uniform over all positive -tuples summing to . Next, given and with (call this condition ), the random variables are i.i.d., each with (conditional) density proportional to with respect to Lebesgue measure on
a proper set difference; the proportionality constant is then the reciprocal of . For and fixed real numbers, this implies that the conditional density in question evaluated at a point at distance (say, -distance) at most from is asymptotically equivalent to .
It follows that (still conditionally given ) the set of points with that are within distance of is approximately distributed (when is large) as a nonhomogeneous Poisson point process with intensity function as described in Theorem 1.3, restricted to the ball of radius centered at . The maxima of this restricted Poisson process ought to be the same as the maxima of the unrestricted process when is sufficiently large. Letting , Theorem 1.3 results.
3. Proof of Theorem 1.3
In this section we give a complete formal proof of the main Theorem 1.3. Let . In Subsection 3.1 we prove the existence of a weak limit for , and in Subsection 3.2 we identify as the law of as described in Theorem 1.3.
3.1. Existence of weak limit
In this subsection we prove the following proposition.
Proposition 3.1.
There exists a probability measure on such that converges weakly to .
Indeed, Proposition 3.1 is established by showing that the sequence is tight (Lemma 3.2) and has a vague limit (Lemma 3.3).
Lemma 3.2.
(i) We have as .
(ii) The sequence is tight.
Proof.
By Markov’s inequality, boundedness of first moments is a sufficient condition for tightness. Thus (ii) follows from (i).
We now prove (i). When we have deterministically if , so (i) is obvious.
We therefore assume henceforth that . Recalling Definition 1.1(d) and introducing the dimension into the notation, denote the number of records that have been broken through time by . Then
(3.1) |
For the numerator of (3.1) we observe
where the second equality follows by standard consideration of concomitants: The two random variables and have the same distribution, for any and . Combining the last display with (3.1), one has
(3.2) |
Thus, by (2.1), as we have
as claimed. ∎
Lemma 3.3.
The sequence of probability measures converges vaguely.
Proof.
It is sufficient to show that there exists such that for each we have
(3.3) |
This is proved by means of the next three lemmas and the arguments interspersed among them. ∎
Lemma 3.4.
Let . Let if does not set a record, and otherwise let denote the number of remaining records killed by that are at -distance from . Then
(3.4) | ||||
(3.5) |
where Gamma is distributed Gamma and the asymptotics in (3.5) are as .
Proof.
Having suitably controlled , we turn our attention to , defined as follows for given . Let if does not set a record, and otherwise let denote the number of remaining records killed by that are at -distance from ; thus
(3.7) |
It follows using Remark 2.1(b) that if we can prove, for each , that
(3.8) |
has a limit, call it , as , then, also for each , we have
(3.9) |
To prove that (3.8) has a limit, it suffices by the dominated convergence to prove the following lemma, in which case we can then take
(3.10) |
Lemma 3.5.
Let , , and with . Then
has a limit as , which doesn’t depend on .
Proof.
We use the method of moments [and so, by the way, as ; that is, the conditional distributions in question have a weak limit]. Let (which implies ), and let denote the set of remaining record-values at time that are killed by at time and satisfy . By writing as a sum of indicators, for integer we calculate
(3.11) | ||||
(3.12) | ||||
(3.13) |
where, in (3.12), the second of the three sums is over satisfying and the third sum is over satisfying ; and, in (3.13), is a Stirling number of the second kind and is the number of ways to partition an -set into nonempty sets.
Now, writing , the conditional probability in (3.13) equals
where the integral is over incomparable vectors such that, for , we have and . For the denominator here we calculate
Changing variables from to , the numerator (i.e., integral) can be written as
(3.14) | ||||
(3.15) |
Both integrals are over vectors satisfying the restrictions that
are incomparable, and, | (3.16) | |||
for the integral in (3.14) we have the additional restrictions that , and, correspondingly, in the second integral we use the shorthand
Observe that the integrands in (3.15) can be dominated by
(3.17) |
which is integrable [over the specified range (3.16) of ] because, dropping the restriction of incomparability in the next integral to appear, the integral of (3.17) can be bounded by
(3.18) |
So we may apply the dominated convergence theorem to the integral appearing in (3.15). The assumption of strict positivity is crucial, since it implies that as . If we can show, for fixed and and and that
(3.19) |
then (3.11) has the following limit as :
(3.20) |
where the integral is over satisfying (3.16). Further, using (3.1) this limit is bounded by
Given the rate of growth of this bound as a function of , the method of moments applies: The conditional distribution of given , , and converges weakly to the unique probability measure on the nonnegative integers whose th moment is given by (3.20) for .
It remains to prove (3.19). Indeed, defining to be the coordinate-wise minimum and applying inclusion–exclusion at the second equality,
(3.21) |
as desired. ∎
Since the expression studied in Lemma 3.5 is clearly nondecreasing in , the same is true for its limit —and therefore, by (3.10), for the limit appearing in (3.9).
Lemma 3.6.
Proof.
This follows simply from (3.7), Lemma 3.4, and (3.9). Indeed, the conditional probability in question is, for each , at least as large as the one in (3.9) and so, by (3.9), has a at least . Letting , we find
Conversely, the conditional probability in question is, by finite subadditivity, at most the sum of the one in (3.9) and the one treated in Lemma 3.4. Thus, for each , we have
by (3.9) and Lemma 3.4. Letting , we find from (3.22) and (3.5) that
This completes the proof. ∎
3.2. Identification of the weak limit as
In this subsection we prove the following proposition. Recall that the existence of a weak limit for the probability measures is established in Proposition 3.1 and identified to some extent in Lemma 3.6.
Proposition 3.7.
The weak limit of the measures is the distribution described in Theorem 1.3.
Our approach to proving this proposition is to define and in relation to in the same way that we defined and in relation to , to prove an analogue (namely, Lemma 3.8) of Lemma 3.4, and to use again the method of moments to establish (see Lemma 3.9) that the limit in Lemma 3.5 equals (in notation that should be obvious and which we shall at any rate make explicit in the statement of Lemma 3.9).
Lemma 3.8.
Let . For , let
and at -distance from |
in the nonhomogeneous Poisson point process described in Theorem 1.3, with intensity function
Define to be the mixture , where is distributed standard Gumbel. Then
where is distributed Gamma and the asymptotics are as .
Proof.
Let
Denote the Poisson process by . Apply the so-called Mecke equation by setting
and taking in [6, Theorem 4.1] to find
Multiply by and integrate over to get
as claimed. ∎
Lemma 3.9.
Proof.
We need only show that, for each , the random variable has moment equal to (3.20), where is defined at (3.21).
For this, we proceed in much the same way as for the proof of Lemma 3.8. In this case, let
Applying the multivariate Mecke equation [6, Theorem 4.4], we find
where the unlabeled integrals are over incomparable vectors belonging to .
4. Tail probabilities for
In Subsection 4.1 we show that all the moments of are finite and give a simple upper bound on each, and in Subsection 4.2 we study right-tail (logarithmic) asymptotics for .
4.1. Bounds on the moments of
Here is the main result of this subsection.
Proposition 4.1.
Proof.
Recall from (3.23) that
(4.1) | ||||
where the unlabeled integral is over vectors satisfying the restrictions (3.16).
Next, multiply both sides of this equality by and integrate over to find
Changing variables by scaling,
where now the unlabeled integral is over satisfying the restrictions
are incomparable, and, | |||
Observe that for each we have
and hence, taking the geometric mean of the lower bounds,
(4.2) |
Thus
(4.3) |
where denotes the number of remaining records at time ; recall Definition 1.1(b).
We therefore have the bound
with as in the statement of the proposition. The inequality
(4.4) |
used here is due to Brightwell [2, Theorem 1] [who also gives a lower bound of matching form on ], answering a question raised by Winkler [7].
Continuing by using the simple upper bound , for any we have
(4.5) |
Since as , the proposition follows by the monotone convergence theorem. ∎
Remark 4.2.
4.2. Tail asymptotics for
The next two theorems are the main results of this subsection. They give closely (but not perfectly) matching upper and lower bounds on lead-order logarithmic asymptotics for the tail of . We do not presently know for any how to close the gap.
Theorem 4.4.
Fix . Then
The exponent of here can be written as .
Theorem 4.5.
Fix . Then
Proof of Theorem 4.4.
This follows simply from Proposition 4.1, applying Markov’s inequality with an integer-rounding of the optimal choice (in relation to the bound of the proposition)
to wit:
(4.7) |
∎
Remark 4.6.
When , the truth, according to Corollary 6.1, is
Proof of Theorem 4.5.
We will establish the stronger assertion that
(4.8) |
for by establishing it for any . The idea of the proof is that when is large, will be large because and will be large.
Choose and fix . Define . Then
(4.9) |
For the first factor here, we use (with stated asymptotics as )
We will show that the second factor equals . It then follows that
as claimed.
It remains to show that
We will do this by applying Chebyshev’s inequality to the integrand factor , so to prepare we will obtain a lower bound on and an upper bound on .
After some straightforward simplification, it follows from the case (with ) of the calculation of in the proof of Lemma 3.9 and a change of variables (in what follows) from to that, for any and uniformly for , we have
Thus, uniformly for , we have
(4.10) |
Observe that the asymptotic coefficient of in (4.10) exceeds .
We next turn our attention to the second moment of . From the case (with ) of the calculation of in the proof of Lemma 3.9,
where the unlabeled integral is over vectors satisfying the restrictions (3.16) with .
Uniformly for , the contribution to this second moment equals
(4.11) |
[We remark in passing that the asymptotic bounds (4.10) and (4.11) match.]
The contribution, call it , is
Here we recall that and are incomparable. If, for example, and for and for , then, with , the integrand equals
where , , and are vectors of length , , and , respectively. By this reasoning, equals
where the integral is over vectors of lengths as previously specified, subject to the two restrictions for . The integral here reduces to the following three-dimensional integral:
It follows that is bounded above by times
By the dominated convergence theorem, as we have
where
In particular, uniformly for , we have
(4.12) |
By Chebyshev’s inequality, uniformly for we have
as was to be shown. ∎
Remark 4.7.
(a) When , the lower bound (4.8) with gives
The logarithmic asymptotics are of the correct order (namely, linear), but the coefficient is approximately , which is roughly twice as big as the correct coefficient (recall Remark 4.6) .
(b) Theorem 4.5 can be used to give a lower bound on the moments of by starting with the lower bound
and choosing judiciously. The result for fixed , in short, is that
(4.13) |
The lead-order terms of Proposition 4.1 and (4.13) for the logarithmic asymptotics are of the same order, but the coefficients [ and , respectively] don’t quite match.
Remark 4.8.
It may be of some interest to study the distribution of for fixed , but it is then simpler for measuring the distance of a killed record from the new record to switch from -distance to distance. We call the analogue of for the latter distance .
The goal of this remark is to show that, for fixed and , as we have (in stark contrast to Theorem 4.5); more precisely, we show (a) that
(4.14) |
with [so that the increasing function of belongs to as well] and (b) that
(4.15) |
(a) Here is a sketch of the proof of (4.14). In order to have , the Poisson process in the proof of Lemma 3.8 must have points for some , and at least of them must be maxima among these points. Integrating over , we find
Over the event there must be some -tuple of incomparable observations; thus, by finite subadditivity,
Recall from (4.4) that
Thus for any we have
A nearly optimal choice of here is to round to an integer, and this yields (4.14).
(b) To prove (4.15), we start (in similar fashion as the proof of Theorem 4.5) with a study of . Using the Poisson-process notation as we did in the proof of Lemma 3.8, observe that
where, with and , we set
The random variables and are each Poisson distributed with respective means
Since the number of remaining records in dimension has the same distribution as the total number of records set through time in dimension , it follows that increases stochastically in . Thus
If with , then Chebyshev’s inequality gives (uniformly for such ) the result
as . If , then
Further, according to [2],
as . To summarize our treatment thus far, uniformly for we have
(4.16) |
5. Asymptotics of as
In this section we prove that converges in probability (equivalently, in distribution) to as the dimension tends to infinity by establishing upper (Subsection 5.1) and lower (Subsection 5.2) bounds, each decaying exponentially to , on .
5.1. Upper bound
Here is the main result of this subsection, showing that the decay of to is at least exponentially rapid.
Theorem 5.1.
As we have
Proof.
Passing to the limit as from (4.1) with , recall that for each we have
Thus for any we have
(5.1) |
where the integral is defined, and can be bounded for any , as follows:
Returning to (5.1), for we find
As , this last bound has the asymptotics
Thus, to obtain an approximately optimal bound we choose
This choice gives the following asymptotics as :
The optimal choice of for large minimizes , leading to and
as claimed, since . ∎
5.2. Lower bound
Here is the main result of this subsection, showing that the decay of to is at most exponentially rapid.
Theorem 5.2.
For every we have
Proof.
Remark 5.3.
In the same spirit as their Conjecture 2.3 and the discussion following it, the authors of [4] might have put forth the closely related conjecture that has a limit and that as , with the additional suggestion that perhaps for every . In light of Theorems 1.3 and 5.1–5.2 we see that the related conjecture is true, but the additional suggestion is not.
6. Dimension
Here is the main result of this section.
Corollary 6.1.
Adopt the setting and notation of Theorem 1.3 with . Then
(i) For each , the law of is the mixture , where is the conditional distribution of given with standard Gumbel.
(ii) has the same distribution as , where (with support ) has the Geometric distribution.
Proof.
We first show how (ii) follows from (i). If (i) holds, then the law of is the mixture , where is the mixture with standard Gumbel. Observe that the density of is
(6.1) |
so the density of is
that is, has the Exponential distribution. But then is the law of , as follows either computationally: for we have
or by the probabilistic argument that has the same distribution as the number of arrivals in one Poisson counting process with unit rate prior to the (Exponentially distributed) epoch of first arrival in an independent copy of the process, which (using symmetry of the two processes and the memoryless property of the Exponential distribution) has the same distribution as .
We now proceed to prove (i). It is easy to see that, with probability , the Poisson point process (call it ) described in Theorem 1.3 will have an infinite number of maxima, but with no accumulation points in either of the coordinates, and a finite number of maxima dominated by . [It is worth noting that the argument to follow is unchanged if is changed to any other point with .] Thus, over the event we can list the locations of the maxima , in order from northwest to southeast (i.e., in increasing order of first coordinate and decreasing order of second coordinate), as , where are the maxima dominated by . Given
(6.2) | ||||
(6.3) |
let denote the following disjoint union of rectangular regions:
Then (by the same sort of reasoning as in [3, Proposition 3.1]), for and satisfying (6.2)–(6.3), and introducing abbreviations
we have
(6.4) |
To calculate , we need to integrate this last expression over all satisfying (6.2)–(6.3).
Since and appear only in the first of the two factors in (6.4), we integrate them out first to obtain
(6.5) | ||||
where the integral is over all satisfying
Next we integrate out . For this we use the calculation
So when we integrate out we get
where the integral is over all satisfying
Continuing in this fashion, after integrating out we get
where the integral is over all satisfying
Acknowledgements.
We thank Ao Sun for providing a simplified proof of Lemma 3.2(i), and Daniel Q. Naiman and Ao Sun for valuable assistance in producing the three figures. We thank Svante Janson for helpful discussions about the details of this paper. We are also grateful for discussions with Persi Diaconis, Hsien-Kuei Hwang, Daniel Q. Naiman, Robin Pemantle, Ao Sun, and Nicholas Wormald. Last but not least, we thank two anonymous reviewers for many helpful suggestions.
References
- [1] Patrick Billingsley. Probability and measure. Wiley Series in Probability and Statistics. John Wiley & Sons, Inc., Hoboken, NJ, 2012. Anniversary edition [of MR1324786], With a foreword by Steve Lalley and a brief biography of Billingsley by Steve Koppes.
- [2] Graham Brightwell. Random -dimensional orders: width and number of linear extensions. Order, 9(4):333–342, 1992.
- [3] James Allen Fill. Breaking bivariate records. Combin. Probab. Comput., 30(1):105–123, 2021.
- [4] James Allen Fill and Daniel Q. Naiman. Generating Pareto records, 2019. arXiv:1901.05621.
- [5] James Allen Fill and Daniel Q. Naiman. The Pareto record frontier. Electron. J. Probab., 25:1–24, 2020.
- [6] Günter Last and Mathew Penrose. Lectures on the Poisson process, volume 7 of Institute of Mathematical Statistics Textbooks. Cambridge University Press, Cambridge, 2018.
- [7] Peter Winkler. Random orders. Order, 1(4):317–331, 1985.