Faster Privacy Accounting via Evolving Discretization
Abstract
We introduce a new algorithm for numerical composition of privacy random variables, useful for computing the accurate differential privacy parameters for composition of mechanisms. Our algorithm achieves a running time and memory usage of for the task of self-composing a mechanism, from a broad class of mechanisms, times; this class, e.g., includes the sub-sampled Gaussian mechanism, that appears in the analysis of differentially private stochastic gradient descent. By comparison, recent work by Gopi et al. (2021) has obtained a running time of for the same task. Our approach extends to the case of composing different mechanisms in the same class, improving upon their running time and memory usage from to .
1 Introduction
Differential privacy (DP) (Dwork et al., 2006b, a) has become the preferred notion of privacy in both academia and the industry. Fueled by the increased awareness and demand for privacy, several systems that use DP mechanisms to guard users’ privacy have been deployed in the industry (Erlingsson et al., 2014; Shankland, 2014; Greenberg, 2016; Apple Differential Privacy Team, 2017; Ding et al., 2017; Kenthapadi & Tran, 2018; Rogers et al., 2021), and the US Census (Abowd, 2018). Besides the large volume of data, many of these systems offer real-time private data analytic and inference capabilities, which entail strict computational efficiency requirements on the underlying DP operations.
Definition 1.1 (DP).
Let and . A randomized algorithm is -DP if, for all differing on a single index and all outputs , we have .
The DP guarantees of a mechanism are captured by the privacy parameters and ; the smaller these parameters, the more private the mechanism. Often a mechanism is simultaneously DP for multiple privacy parameters; this is captured by studying the privacy loss of a mechanism —the curve for which the mechanism is -DP.
A fundamental mathematical property satisfied by DP is composition, which prescribes the privacy guarantees of results from executing multiple DP mechanisms. In basic composition (Dwork et al., 2006a), a mechanism that returns the results of executing an -DP mechanism and an -DP mechanism is -DP. Unfortunately, this bound becomes weak when composing a large number of mechanisms. The advanced composition (Dwork et al., 2010) offers stronger bounds: roughly speaking, composing mechanisms that are each -DP yields a mechanism whose privacy parameters are of the order of and —clearly more desirable than the basic composition. In fact, obtaining tight composition bounds has been an active research topic. Kairouz et al. (2015) showed how to obtain the exact privacy parameters of self-composition (mechanism composed with itself), while Murtagh & Vadhan (2016) showed that the corresponding problem for the more general case is #P-complete.
Privacy Loss Distribution (PLD).
Tighter bounds for privacy parameters that go beyond advanced composition are possible if the privacy loss of the mechanism is taken into account. Meiser & Mohammadi (2018); Sommer et al. (2019) initiated the study of numerical methods for accurately estimating the privacy parameters, using the privacy loss distribution (PLD) of a mechanism. The PLD is the probability mass function of the so-called privacy loss random variable (PRV) for discrete mechanisms, and its density function for continuous mechanisms (see Section 2 for formal definitions). PLDs have two nice properties: (i) tight privacy parameters can be computed from the PLD of a mechanism and (ii) the PLD of a composed mechanism is the convolution of the individual PLDs. Property (ii) makes PLD amenable to FFT-based methods.
Koskela et al. (2020) exploited this property to speed up the computation of the PLD of the composition. An important step to retain efficiency using PLDs is approximating the distribution so that it has finite support; this is especially needed in the case when the PLD is continuous or has a large support. PLD implementations have been at the heart of many DP systems and open-source libraries including (Lukas Prediger, 2020; Google, 2020; Microsoft, 2021). To enable scale and support latency considerations, the PLD composition has to be as efficient as possible. This is the primary focus of our paper.
Our starting point is the work of Koskela et al. (2020, 2021); Koskela & Honkela (2021), who derived explicit error bounds for the approximation obtained by the FFT-based algorithm. The running time of this algorithm for -fold self-composition of a mechanism that is -DP is111For any positive , we denote by any quantity of the form . . When , this running time is , where is the additive error in . Gopi et al. (2021) improved this bound to obtain an algorithm with running time of
where is the additive error in . When composing different mechanisms, their running time is
Our Contributions.
We design and study new algorithms for -fold numerical composition of PLDs. Specifically, for reasonable choice of mechanisms, we
-
obtain (Section 3) a two-stage algorithm for self-composing PLDs, with running time
-
extend (Section 5) the two-stage algorithm to a recursive multi-stage algorithm with a running time of
Both the two-stage and multi-stage algorithms extend to the case of composing different mechanisms. In each case, the running time increases by a multiplicative factor. Note that this factor is inevitable since the input “size” is —indeed, the algorithm needs to read the input PLDs. We defer the details of this extension to Appendix B.
Algorithm Overview.
The main technique underlying our algorithms is the evolution of the discretization and truncation intervals of the approximations of the PRV with the number of compositions. To describe our approach, we first present a high-level description of the algorithm of Gopi et al. (2021). Their algorithm discretizes an -length interval into bucket intervals each with mesh size , thus leading to a total of buckets and a running time of for the FFT convolution algorithm. Both these aspects of their algorithm are in some sense necessary: a truncation interval of length would lose significant information about the -fold composition PRV, whereas a discretization interval of length would lose significant information about the original PRV; so relaxing either would lead to large approximation error.
The key insight in our work is that it is possible to avoid having both these aspects simultaneously. In particular, in our two-stage algorithm, the first stage performs a -fold composition using an interval of length discretized into bucket intervals with mesh size , followed by another -fold composition using an interval of length discretized into bucket intervals with mesh size . Thus, in each stage the number of discretization buckets is leading to a final running time of for performing two FFT convolutions.
The recursive multi-stage algorithm extends this idea to stages, each performed with an increasing discretization interval and truncation interval, ensuring that the number of discretization buckets at each step is .
Experimental Evaluation.
We implement our two-stage algorithm and compare it to baselines from the literature. We consider the sub-sampled Gaussian mechanism, which is a fundamental primitive in private machine learning and constitutes a core primitive of training algorithms that use differentially private stochastic gradient descent (DP-SGD) (see, e.g., (Abadi et al., 2016)). For compositions and a standard deviation of and with subsampling probability of , we obtain a speedup of compared to the state-of-the-art. We also consider compositions of the widely-used Laplace mechanism. For compositions, and a scale parameter of for the Laplace distribution, we achieve a speedup of . The parameters were chosen such that the composed mechanism satisfies -DP. See Section 4 for more details.
Related Work.
In addition to Moments Accountant (Abadi et al., 2016) and Rényi DP (Mironov, 2017) (which were originally used to bound the privacy loss in DP deep learning), several other tools can also be used to upper-bound the privacy parameters of composed mechanisms, including concentrated DP (Dwork & Rothblum, 2016; Bun & Steinke, 2016), its truncated variant (Bun et al., 2018), and Gaussian DP (Dong et al., 2019; Bu et al., 2020). However, none of these methods is tight; furthermore, none of them allows a high-accuracy estimation of the privacy parameters. In fact, some of them require that the moments of the PRV are bounded; the PLD composition approach does not have such restrictions and hence is applicable to mechanisms such as DP-SGD-JL (Bu et al., 2021).
In a recent work, Doroshenko et al. (2022) proposed a different discretization procedure based on whether we want the discretized PLD to be “optimistic” or “pessimistic”. They do not analyze the error bounds explicitly but it can be seen that their running time is , which is slower than both our algorithm and that of Gopi et al. (2021).
Another recent work (Zhu et al., 2022) developed a rigorous notion of “worst-case” PLD for mechanisms, under the name dominating PLDs. Our algorithms can be used for compositions of dominating PLDs; indeed, our experimental results for Laplace and (subsampled) Gaussian mechanisms are already doing this implicitly. Furthermore, Zhu et al. (2022) propose a different method for computing tight DP composition bounds. However, their algorithm requires an analytical expression for the characteristic function of the PLDs. This may not exist, e.g., we are unaware of such an analytical expression for subsampled Gaussian mechanisms.
2 Preliminaries
Let denote the set of all positive integers, the set of all non-negative real numbers, and let . For any two random variables and , we denote by their total variation distance.
2.1 Privacy Loss Random Variables
We will use the following privacy definitions and tools that appeared in previous works on PLDs (Sommer et al., 2019; Koskela et al., 2020; Gopi et al., 2021).
For any mechanism , and any , we denote by the smallest value such that is -DP. The resulting function is said to be the privacy curve of the mechanism . A closely related notion is the privacy curve between two random variables , and which is defined, for any as
where is the support of and . The privacy loss random variables associated with a privacy curve are random variables such that is the same curve as , and that satisfy certain additional properties (which make them unique). While PRVs have been defined earlier in Dwork & Rothblum (2016); Bun & Steinke (2016), we use the definition of Gopi et al. (2021):
Definition 2.1 (PRV).
Given a privacy curve , we say that random variables are privacy loss random variables (PRVs) for , if (i) and are supported on , (ii) , (iii) for every , and (iv) , where and denote the PDFs of and , respectively.
Theorem 2.2 (Gopi et al. (2021)).
Let be a privacy curve that is identical to for some random variables and . Then, the PRVs for are given by
and
Moreover, we define for every and define as the smallest such that .
Note that is well defined even for that is not a privacy loss random variable.
If is a privacy curve identical to and is a privacy curve identical to , then the composition is defined as the privacy curve , where is independent of , and is independent of . A crucial property is that composition of privacy curves corresponds to addition of PRVs:
Theorem 2.3 (Dwork & Rothblum (2016)).
Let and be privacy curves with PRVs and respectively. Then, the PRVs for the privacy curve are given by
We interchangeably use the same letter to denote both a random variable and its corresponding probability distribution. For any two distributions , we use to denote its convolution (same as the random variable ). We use to denote the -fold convolution of with itself.
Finally, we use the following tail bounds for PRVs.
Lemma 2.4 (Gopi et al. (2021)).
For all PRV , and , it holds that
2.2 Coupling Approximation
To describe and analyze our algorithm, we use the coupling approximation tool used by Gopi et al. (2021). They showed that, in order to provide an approximation guarantee on the privacy loss curve, it suffices to approximate a PRV according to the following coupling notion of approximation:
Definition 2.5 (Coupling Approximation).
For two random variables , we write if there exists a coupling between them such that .
We use the following properties of coupling approximation shown by Gopi et al. (2021). We provide the proofs in Appendix A for completeness.
Lemma 2.6 (Properties of Coupling Approximation).
Let be random variables.
-
1.
If , then for all ,
-
2.
If and , then (“triangle inequality”).
-
3.
If , then ; furthermore, for all , it holds that .
-
4.
If (for any ) and , then for all and ,
2.3 Discretization Procedure
We adapt the discretization procedure from (Gopi et al., 2021). The only difference is that we assume the support of the input distribution is already in the specified range as opposed to being truncated as part of the algorithm. A complete description of the procedure is given in Algorithm 1.
The procedure can be shown to satisfy the following key property.
Proposition 2.7.
For any random variable supported in , the output of with mesh size and truncation parameter satisfies and , for some with .
3 Two-Stage Composition Algorithm
Our two-stage algorithm for the case of -fold self-composition is given in Algorithm 2. We assume where , , and , which for instance can be achieved by taking , , and .
The algorithm implements the circular convolution using Fast Fourier Transform (FFT). For any and , we define where such that . Given the circular addition is defined as
Similarly, for random variables , we define to be their convolution modulo and to be the -fold convolution of modulo .
Observe that with mesh size and truncation parameter runs in time , assuming an -time access to . The FFT convolution step runs in time ; thus the overall running time of is
The approximation guarantees provided by our two-stage algorithm are captured in the following theorem. For convenience, we assume that is a perfect square (we set ). The complete proof is in Section 3.1.
Theorem 3.1.
For any PRV , the approximation returned by satisfies
when invoked with (assumed to be an integer) and parameters given below (using )
In terms of a concrete running time bound, Theorem 3.1 implies:
Corollary 3.2.
For any DP algorithm , the privacy curve of the -fold (adaptive) composition of can be approximated in time where is the additive error in , is the additive error in , and is an upper bound on
In many practical regimes of interest, is a constant. For ease of exposition in the following, we assume that is a small constant, e.g. and suppress the dependence on . Suppose the original mechanism underlying satisfies -DP. Then by advanced composition (Dwork et al., 2010), we have that satisfies -DP. If , then we have that . Instantiating this with , and gives us that is at most a constant.
Note that, to set the value of and , we do not need the exact value of (or or ). We only need an upper bound on , which can often be obtained by using the RDP accountant or some other method.
For the case when is not a perfect square, using a similar analysis, it is easy to see that the approximation error would be no worse than the error in self-compositions. The running time increases by a constant because of the additional step of -fold convolution to get and the final convolution step to get ; however this does not affect the asymptotic time complexity of the algorithm. Moreover, as seen in Section 4, even with this additional cost, is still faster than the algorithm in Gopi et al. (2021).
3.1 Analysis
In this section we establish Theorem 3.1. The proof proceeds are follows. We establish coupling approximations between consecutive random variables in the following sequence:
Coupling .
Coupling and .
We have from Proposition 2.7 that and that . Thus, by applying Lemma 2.6(4), we have that (for any ; to be chosen later)
(2) | ||||
(3) |
Similarly, we have that
(4) |
Coupling and .
Towards combining the bounds.
Combining Equations 1, 3, 5, 4 and 6 using Lemma 2.6(2), we have that
(7) | ||||
(8) |
We set and . The key step remaining is to bound , , and in terms of ’s, ’s, and ’s.
To do so, we use Lemma 2.4 for ’s to be chosen later.
Bounding .
We have and hence
Bounding .
For , we have
where the second inequality uses Equation 2 and the third inequality uses the fact that the tails of are only heavier than the tails of since is a truncation of .
Bounding .
First, we combine Equations 3, 4 and 5 to get (recall in Equation 8)
Using this, we get
where we use in the third step that tails of are only heavier than tails of since is a truncation of .
Putting it all together.
Plugging in these bounds for , , and in Equation 7, we get that
(9) | ||||
Thus, we can set parameters , , and as follows such that each of the four terms in Equation 9 is at most to satisfy the above:
Setting (assumed to be integers) and , and , completes the proof of Theorem 3.1.
Runtime analysis.
As argued earlier, the total running time of is given as . Substituting in the bounds for , , , and , we get a final running time of
where is as in Corollary 3.2.
3.2 Heterogeneous Compositions
Algorithm 2 can be easily generalized to handle heterogeneous composition of different mechanisms, with a running time blow up of over the homogeneous case. We defer the details to Appendix B.
4 Experimental Evaluation of






We empirically evaluate on the tasks of self-composing two kinds of mechanism acting on dataset of real values as
-
Laplace Mechanism : returns plus a noise drawn from given by the .
-
Poisson Subsampled Gaussian Mechanism with probability : Subsamples a random subset of indices by including each index independently with probability . Returns plus a noise drawn from the Gaussian distribution .
Both these mechanisms are highly used in practice. For each mechanism, we compare against the implementation by Gopi et al. (2021)222github.com/microsoft/prv_accountant (referred to as GLW) on three fronts: (i) the running time of the algorithm, (ii) the number of discretized buckets used, and (iii) the final estimates on which includes comparing lower bound , estimates and upper bounds . We use and in all the experiments.
We run each algorithm for a varying number of self-compositions of a basic mechanism. The noise parameter of basic mechanism is chosen such that the final value after -fold composition is equal to for each value of .333these values were computed using the Google DP accountant github.com/google/differential-privacy/tree/main/python. The exact choice of noise parameters used are shown in Figure 3.
The comparison for the Laplace mechanism is shown in Figure 1 and for the subsampled Gaussian mechanism is shown in Figure 2. In terms of accuracy we find that for the same choice of and , the estimates returned by are nearly identical to the estimates returned by GLW for the subsampled Gaussian mechanism. On the other hand, the estimates for Laplace mechanism returned by both algorithms are similar and consistent with each other, but strictly speaking, incomparable with each other.
5 Multi-Stage Recursive Composition
We extend the approach in to give a multi-stage algorithm (Algorithm 3), presented only when is a power of for ease of notation. Similar to the running time analysis of , the running time of is given as
assuming an -time access to .
Theorem 5.1.
For all PRV and , the approximation returned by satisfies
using a choice of parameters satisfying for all that
Proof Outline.
We establish coupling approximations between consecutive random variables in the sequence:
using a similar approach as in the proof of Theorem 3.1.
Running time analysis.
The overall running time is at most , which can be upper bounded by
where .
In many practical regimes of interest, is at most . For ease of exposition in the following, we assume that is a small constant, e.g. and suppress the dependence on . Suppose the original mechanism underlying satisfies -DP. Then by advanced composition (Dwork et al., 2010), we have that satisfies -DP. If , then we have that . Instantiating this with gives us that is at most .
6 Conclusions and Discussion
In this work, we presented an algorithm with a running time and memory usage of for the task of self-composing a broad class of DP mechanisms times. We also extended our algorithm to the case of composing different mechanisms in the same class, resulting in a running time and memory usage ; both of these improve the state-of-the-art by roughly a factor of . We also demonstrated the practical benefits of our framework compared to the state-of-the-art by evaluating on the sub-sampled Gaussian mechanism and the Laplace mechanism, both of which are widely used in the literature and in practice.
For future work, it would be interesting to tighten the log factors in our bounds. A related future direction is to make the algorithm more practical, since the current recursive analysis is quite loose. Note that could also be performed with an arity larger than ; e.g., with an arity of , one would perform compositions as a three-stage composition. For any constant arity, our algorithm gives an asymptotic runtime of as , however, for practical considerations, one may also consider adapting the arity with to tighten the log factors. We avoided doing so for simplicity, since our focus in obtaining an running time was primarily theoretical.
Acknowledgments
We would like to thank the anonymous reviewers for their thoughtful comments that have improved the quality of the paper.
References
- Abadi et al. (2016) Abadi, M., Chu, A., Goodfellow, I., McMahan, H. B., Mironov, I., Talwar, K., and Zhang, L. Deep learning with differential privacy. In CCS, pp. 308–318, 2016.
- Abowd (2018) Abowd, J. M. The US Census Bureau adopts differential privacy. In KDD, pp. 2867–2867, 2018.
- Apple Differential Privacy Team (2017) Apple Differential Privacy Team. Learning with privacy at scale. Apple Machine Learning Journal, 2017.
- Bu et al. (2020) Bu, Z., Dong, J., Long, Q., and Su, W. J. Deep learning with Gaussian differential privacy. Harvard Data Science Review, 2020(23), 2020.
- Bu et al. (2021) Bu, Z., Gopi, S., Kulkarni, J., Lee, Y. T., Shen, J. H., and Tantipongpipat, U. Fast and memory efficient differentially private-SGD via JL projections. In NeurIPS, 2021.
- Bun & Steinke (2016) Bun, M. and Steinke, T. Concentrated differential privacy: Simplifications, extensions, and lower bounds. In TCC, pp. 635–658, 2016.
- Bun et al. (2018) Bun, M., Dwork, C., Rothblum, G. N., and Steinke, T. Composable and versatile privacy via truncated CDP. In STOC, pp. 74–86, 2018.
- Ding et al. (2017) Ding, B., Kulkarni, J., and Yekhanin, S. Collecting telemetry data privately. In NeurIPS, pp. 3571–3580, 2017.
- Dong et al. (2019) Dong, J., Roth, A., and Su, W. J. Gaussian differential privacy. arXiv:1905.02383, 2019.
- Doroshenko et al. (2022) Doroshenko, V., Ghazi, B., Kamath, P., Kumar, R., and Manurangsi, P. Connect the dots: Tighter discrete approximations of privacy loss distributions. In PETS (to appear), 2022.
- Dwork & Rothblum (2016) Dwork, C. and Rothblum, G. N. Concentrated differential privacy. arXiv:1603.01887, 2016.
- Dwork et al. (2006a) Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., and Naor, M. Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT, pp. 486–503, 2006a.
- Dwork et al. (2006b) Dwork, C., McSherry, F., Nissim, K., and Smith, A. D. Calibrating noise to sensitivity in private data analysis. In TCC, pp. 265–284, 2006b.
- Dwork et al. (2010) Dwork, C., Rothblum, G. N., and Vadhan, S. Boosting and differential privacy. In FOCS, pp. 51–60, 2010.
- Erlingsson et al. (2014) Erlingsson, Ú., Pihur, V., and Korolova, A. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In CCS, pp. 1054–1067, 2014.
- Google (2020) Google. DP Accounting Library. https://github.com/google/differential-privacy/tree/main/python/dp_accounting, 2020.
- Gopi et al. (2021) Gopi, S., Lee, Y. T., and Wutschitz, L. Numerical composition of differential privacy. In NeurIPS, 2021.
- Greenberg (2016) Greenberg, A. Apple’s “differential privacy” is about collecting your data – but not your data. Wired, June, 13, 2016.
- Kairouz et al. (2015) Kairouz, P., Oh, S., and Viswanath, P. The composition theorem for differential privacy. In ICML, pp. 1376–1385, 2015.
- Kenthapadi & Tran (2018) Kenthapadi, K. and Tran, T. T. L. Pripearl: A framework for privacy-preserving analytics and reporting at LinkedIn. In CIKM, pp. 2183–2191, 2018.
- Koskela & Honkela (2021) Koskela, A. and Honkela, A. Computing differential privacy guarantees for heterogeneous compositions using FFT. arXiv:2102.12412, 2021.
- Koskela et al. (2020) Koskela, A., Jälkö, J., and Honkela, A. Computing tight differential privacy guarantees using FFT. In AISTATS, pp. 2560–2569, 2020.
- Koskela et al. (2021) Koskela, A., Jälkö, J., Prediger, L., and Honkela, A. Tight differential privacy for discrete-valued mechanisms and for the subsampled Gaussian mechanism using FFT. In AISTATS, pp. 3358–3366, 2021.
- Lukas Prediger (2020) Lukas Prediger, A. K. Code for computing tight guarantees for differential privacy. https://github.com/DPBayes/PLD-Accountant, 2020.
- Meiser & Mohammadi (2018) Meiser, S. and Mohammadi, E. Tight on budget? Tight bounds for -fold approximate differential privacy. In CCS, pp. 247–264, 2018.
- Microsoft (2021) Microsoft. A fast algorithm to optimally compose privacy guarantees of differentially private (DP) mechanisms to arbitrary accuracy. https://github.com/microsoft/prv_accountant, 2021.
- Mironov (2017) Mironov, I. Rényi differential privacy. In CSF, pp. 263–275, 2017.
- Murtagh & Vadhan (2016) Murtagh, J. and Vadhan, S. The complexity of computing the optimal composition of differential privacy. In TCC, pp. 157–175, 2016.
- Rogers et al. (2021) Rogers, R., Subramaniam, S., Peng, S., Durfee, D., Lee, S., Kancha, S. K., Sahay, S., and Ahammad, P. LinkedIn’s audience engagements API: A privacy preserving data analytics system at scale. Journal of Privacy and Confidentiality, 11(3), 2021.
- Shankland (2014) Shankland, S. How Google tricks itself to protect Chrome user privacy. CNET, October, 2014.
- Sommer et al. (2019) Sommer, D. M., Meiser, S., and Mohammadi, E. Privacy loss classes: The central limit theorem in differential privacy. PoPETS, 2019(2):245–269, 2019.
- Zhu et al. (2022) Zhu, Y., Dong, J., and Wang, Y. Optimal accounting of differential privacy via characteristic function. In AISTATS, pp. 4782–4817, 2022.
Appendix A Proofs of Coupling Approximation Properties
For sake of completeness, we include proofs of the lemmas we use from Gopi et al. (2021).
Proof of Lemma 2.6(2).
There exists couplings and such that and . From these two couplings, we can construct a coupling between : sample , sample from (given by coupling ) and finally sample from (given by coupling ). Therefore for this coupling, we have:
Appendix B Two-Stage Heterogeneous Composition
We can handle composition of different PRVs with a slight modification to as given in Algorithm 4. The approximation analysis remains similar as before. The main difference is that and are to be chosen as
where we denote . In the case of (assumed to be an integer) and , this leads to a final running time of
where can be bounded as
Appendix C Analysis of
We establish coupling approximations between consecutive random variables in the sequence:
Coupling and .
Coupling and .
We have from Proposition 2.7 that and that for some satisfying . Thus, applying Lemma 2.6(4), we have (for to be chosen later) that
(11) |
Coupling and .
Putting things together.
Thus, combining Equations 11 and 12 for , using Lemma 2.6(3), we have that
(13) | ||||
(14) | ||||
More generally, the same analysis shows that for any ,
(15) | ||||
To simplify our analysis going forward, we fix the choice of ’s that we will use, namely, (for that will be chosen later). This implies that
where the last step uses that .
In order to get our final bound, we need to bound ’s in terms of , the mesh sizes ’s, and truncation parameters ’s. For ease of notation, we let . We have for that
(16) |
where in the penultimate step we use that the tails of are no larger than tails of since is a truncation of , and in the last step we use Lemma 2.4 with (eventually we set ).
We show using an inductive argument that for ,
(17) | ||||
(18) |
The base case holds since ; note . From (15), we have
This completes the inductive step (18). Finally, (16) immediately implies the inductive step (17).
Putting this together in (14), and setting , we get
Finally, combining (13) with (10) using Lemma 2.6(2), we get
Thus, we get the desired approximation result for the following choice of parameters
Thus, the overall running time is at most
Extensions of .
Similar to Appendix B, it is possible to extend to handle the case where the number of compositions is not a power of , with a similar asymptotic runtime. It can then be extended to handle heterogeneous compositions of different mechanisms.