Data-driven fixed-point tuning for truncated realized variations

B. Cooper Boniece (Department of Mathematics, Drexel University), José E. Figueroa-López and Yuchen Han (Department of Statistics and Data Science, Washington University in St. Louis)

Abstract

Many methods for estimating integrated volatility and related functionals of semimartingales in the presence of jumps require the specification of tuning parameters for their use in practice. In much of the available theory, tuning parameters are assumed to be deterministic and their values are specified only up to asymptotic constraints. However, in empirical work and in simulation studies, they are typically chosen to be random and data-dependent, with explicit choices often relying entirely on heuristics. In this paper, we consider novel data-driven tuning procedures for the truncated realized variations of a semimartingale with jumps based on a type of random fixed-point iteration. Being effectively automated, our approach alleviates the need for delicate decision-making regarding tuning parameters in practice and can be implemented using information regarding sampling frequency alone. We demonstrate that our methods can lead to asymptotically efficient estimation of integrated volatility and that they exhibit superior finite-sample performance compared to popular alternatives in the literature.

Keywords: High-frequency data, integrated volatility estimation, semimartingales

1 Introduction

The continuous part of the quadratic variation of an Itô semimartingale, commonly known as the integrated volatility, plays an outsize role in financial econometrics, and its estimation from discrete observations in various settings has been a major focus of the literature over the past 20+ years. The semimartingale $X$ commonly represents the log-price of a financial asset, and its integrated volatility serves as a measure of the overall uncertainty inherent in the continuous part of $X$ over a given time period.

Among the variety of available methods for integrated volatility estimation, the truncated realized variation (TRV), introduced in [mancini:2001], was one of the first and remains among the most popular approaches to date that is jump-robust, in the sense that it can still provide reliable estimates of integrated volatility when jumps occur in the process $X$. Other well-known jump-robust methods for estimating integrated volatility include bipower variations and their extensions (barndorff-nielsen:2004; barndorff-nielsen:shephard:winkel:2006; corsi:pirino:reno:2010) and those based on empirical characteristic functions (todorov:tauchen:2012; jacod:todorov:2014; jacod:todorov:2018), among others, giving the practitioner a wide array of choices for estimating integrated volatility in modeling contexts where jumps may be present.

To choose an estimator among this array of options, currently one must first decide between two distinct classes: either asymptotically efficient approaches, like TRV, which require selection of tuning parameters, or alternatively “tuning-free” estimators but at the unfortunate expense of asymptotic efficiency. From the perspective of minimizing variance, asymptotically efficient approaches are preferable, but their use in practice necessitates the critical additional step of specifying the tuning parameter values themselves. This consequential step can significantly impact estimation performance, but current asymptotic theory does not offer direct guidelines for choosing parameters explicitly, which can be an extremely delicate matter in practice. For instance, even in idealized asymptotic settings, appropriate choices often depend on a priori unknown properties of $X$ and can determine whether or not a given estimator retains even the basic requirement of consistency. In the absence of theoretically supported approaches for specifying explicit values of these parameters, the practical use of tuning-parameter-based methods remains entirely reliant on heuristics. The purpose of the present work is to address this gap.

In the case of TRV, the tuning parameter of importance is called the threshold, denoted hereafter as $\varepsilon>0$, indicating a level above which increments are discarded from the estimation procedure. Concretely, given a discretely observed semimartingale $X=\{X_{t}\}_{t\geq 0}$ at times $0=t_{0}<t_{1}<\ldots<t_{n}=T$, the TRV is defined as

\displaystyle TRV_{n}(\varepsilon)=\sum_{i=1}^{n}(\Delta_{i}^{n}X)^{2}\,\mathbf{1}_{\{|\Delta_{i}^{n}X|\leq\varepsilon\}},

where $\Delta_{i}^{n}X:=X_{t_{i}}-X_{t_{i-1}}$ is the $i^{th}$ increment of $X$, often assumed to be observed on a regular sampling grid, so that $t_{i}-t_{i-1}=:h_{n}$ for all $i$. Statistical properties of TRV have been extensively studied when $\varepsilon=\varepsilon(h_{n})$ is a deterministic function of the time step $h_{n}$ such that $\varepsilon(h_{n})\to 0$ at specified rates as $h_{n}\to 0$. In mancini:2009, when either the jump component of the process $X$ is of finite activity or is a pure-jump Lévy process with infinite jump activity, TRV was shown to be consistent whenever

\displaystyle\lim_{h_{n}\to 0}\varepsilon(h_{n})=0,\quad\mbox{and}\quad\lim_{h_{n}\to 0}\frac{\varepsilon(h_{n})}{\sqrt{h_{n}\log\frac{1}{h_{n}}}}=\infty.

(1)
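To make the definition of TRV and condition (1) concrete, the following is a minimal Python sketch, not the procedure proposed in this paper. All function names are hypothetical. The threshold $\varepsilon(h)=\sqrt{h}\,\log(1/h)$ is only one illustrative deterministic choice satisfying (1), since $\varepsilon(h)/\sqrt{h\log(1/h)}=\sqrt{\log(1/h)}\to\infty$ while $\varepsilon(h)\to 0$. The simulated model (constant volatility plus compound Poisson jumps with Gaussian sizes) is likewise an assumption made purely for illustration.

```python
import numpy as np


def truncated_realized_variation(x, eps):
    """TRV_n(eps): sum of squared increments whose absolute value is <= eps."""
    dx = np.diff(np.asarray(x, dtype=float))
    return float(np.sum(dx[np.abs(dx) <= eps] ** 2))


def illustrative_threshold(h):
    """Hypothetical threshold eps(h) = sqrt(h) * log(1/h), for 0 < h < 1.

    It satisfies condition (1); it is not a recommendation from the paper.
    """
    return np.sqrt(h) * np.log(1.0 / h)


def simulate_jump_diffusion(n, T=1.0, sigma=0.2, jump_rate=5.0,
                            jump_scale=0.5, seed=0):
    """Assumed toy model: sigma * Brownian motion + compound Poisson jumps.

    Returns the path (X_{t_0}, ..., X_{t_n}) on a regular grid with step T / n.
    """
    rng = np.random.default_rng(seed)
    h = T / n
    diffusion = sigma * np.sqrt(h) * rng.standard_normal(n)
    # Number of jumps per interval; a sum of k centered Gaussian jump sizes
    # with scale jump_scale is Gaussian with scale jump_scale * sqrt(k).
    n_jumps = rng.poisson(jump_rate * h, size=n)
    jumps = jump_scale * np.sqrt(n_jumps) * rng.standard_normal(n)
    return np.concatenate(([0.0], np.cumsum(diffusion + jumps)))


if __name__ == "__main__":
    n, T, sigma = 10_000, 1.0, 0.2
    x = simulate_jump_diffusion(n, T=T, sigma=sigma)
    eps = illustrative_threshold(T / n)
    rv = float(np.sum(np.diff(x) ** 2))          # untruncated realized variance
    trv = truncated_realized_variation(x, eps)   # jump-robust counterpart
    print(f"integrated volatility (truth): {sigma**2 * T:.4f}")
    print(f"realized variance:             {rv:.4f}")
    print(f"TRV with eps = {eps:.4f}:       {trv:.4f}")
```

In this sketch the threshold depends on the data only through the step size $h_{n}$. The constant multiplying $\sqrt{h}\,\log(1/h)$ is left at one, and it is exactly this kind of unspecified constant that condition (1) does not pin down.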

A consistency statement for TRV encompassing a broader class of semimartingales was given in jacod:2008, but for the more restrictive case of power thresholds, namely, thresholds of the form