This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Computer-Assisted Verification of Four Interval Arithmetic Operators

Daisuke Ishii [email protected] Tomohito Yabu Department of Information Science, University of Fukui. 3-9-1 Bunkyo, Fukui-shi, Fukui, 910-8507, Japan
Abstract

Interval arithmetic libraries provide the four elementary arithmetic operators for operand intervals bounded by floating-point numbers. Actual implementations need to make a large case analysis that considers, e.g., magnitude relations between all pairs of argument bounds, positional relations between the arguments and zero, and handling of the special values, ±\pm\infty and NaN\mathrm{NaN}. Their correctness is not obvious as they are implemented by human hands, which comes to be critical for the reliability. This work provides a mechanically-verified interval arithmetic library. For this purpose, we utilize the Why3 platform equipped with a specification language for annotated programs and back-end theorem provers. We conduct several proof tasks for each of three properties of the target code: validity, soundness, and tightness; zero division exception handling is also verified for the division code. To accomplish the proof, we propose several techniques for specification/verification. First, we specify additional lemmas that support deductions made by back-end SMT solvers, which enable to discharge proof obligations in floating-point arithmetic containing nonlinear terms. Second, we examine the annotation of tightness, which requires to assume that a computation may result in NaN; we propose specific extremum operators for this purpose. In the experiments, applying the techniques in conjunction with the Alt-Ergo SMT solver and the Coq proof assistant proved the entire code.

keywords:
Interval Arithmetic , Floating-Point Numbers , Program Verification , SMT Solvers , Proof Assistants
journal: Journal of Computational and Applied Mathematics

1 Introduction

Interval arithmetic [1, 2] is a reliable method for numerical computation that deals with reals. It handles (machine-representable) intervals instead of numerical values, typically floating-point (FP) numbers, and evaluates arithmetic expressions via computing their interval enclosures with careful control of rounding modes and interval widening. The four elementary arithmetic operators on intervals can be implemented in an efficient way that only considers the bounds of the given argument intervals. However, actual implementations need to make a large case analysis that considers, e.g., magnitude relations between all pairs of argument bounds, positional relations between the arguments and zero, and handling of the special values, ±\pm\infty and NaN\mathrm{NaN}. For instance, the implementation of the multiplication in the kv library [3] comprises 13 branches. As existing code bases are implemented by human hands, their correctness is not obvious, which comes to be critical for the reliability.

In this work, we use Why3 to prove the correctness of the code. Why3 [4, 5] has been developed as a program verification platform, which integrates automated theorem provers (e.g. SMT solvers) and interactive proof assistants (e.g. Coq). Why3 provides a specification language WhyML to describe a target program and to annotate the program with meta-level properties. Given a specification, Why3 generates verification conditions (VCs) that ensure the correctness when they are all validated. Users can then apply the back-end provers in succession to discharge all of the VCs. Targets of Why3 include numerical programs and several applications have been reported (e.g. [6]), as support for the numerical domain has been developed both in the platform and its back-end provers.

This study aims for the verification of the four interval arithmetic operations, i.e., addition, subtraction, multiplication and division. For this purpose, we translate an implementation into WhyML and properly annotate the preconditions and postconditions. For the generated VCs, we experiment with discharging them via utilizing back-end provers, Alt-Ergo [7] and Coq. Although the control structure of the implementation is simple, which only consists of branching statements, proof obligation becomes complicated due to combinations of FP number expressions and the axioms that realize the system of FP numbers/intervals in Why3. Several VCs thus remain unproved after a basic usage of Why3. Therefore, we examine the target annotated code and propose techniques to complete the verification. Finally, we show that the entire annotated code is verified and present experimental measures such as timings.

1.1 Contribution

This contribution is a mechanical proof of an implementation of the four operators, which verifies mainly three properties: validity (“the results conform to the definition of intervals”), soundness (“the results enclose every possible real values”) and tightness (“the resulting enclosures are optimal/tightest”). Our verification result indicates that the operators, which compute with the bounds of operand intervals and return the hull of computed results, satisfy the three properties.

To complete the proof, we propose several techniques to specify/verify the target interval operations and their properties. First, through an examination of each unproved VC, we find that specifying an additional lemma will help the Alt-Ergo SMT solver to discharge the VC. For instance, for a VC on FP numbers, a lemma on reals involving the same expression as the VC may help to discharge it. We identify 19 lemmas needed for the complete verification. Second, we propose a specification of the tightness property of an interval computation such that the code annotated with the property is verified by SMT solvers. In this specification, it is essential to reformulate the interval hull operation with a consideration of NaN.

We note that the target of above techniques is not limited to interval arithmetic operations but applicable to various proof tasks in the numerical domain, e.g., verification of other interval functions and application programs.

1.2 Outline

Section 2 introduces basic notions and concepts of interval arithmetic. Section 3 explains the main tools for verification: Why3 and Alt-Ergo. The following sections describe the main verification process for an interval arithmetic code. First, Section 4 gives an overview of the process. Second, Section 5 explains the specification of the target code. Next, Section 6 and Section 7 describe the processes of annotating the pre- and postconditions to the code and discharging the generated proof obligations. Section 8 reports the complete results of the verification experiment.

2 Interval Arithmetic

Interval arithmetic [1, 2, 8] is a method for reliable numerical computations, which replaces the FP numbers that are usually used in numerical computation with intervals that enclose values in \mathbb{R}; the result should contain all the solutions calculated for every real in the given input interval.

Before introducing intervals, we consider sets of machine-representable numbers that conform to IEEE-754 standard [9], i.e., floating-point (FP) numbers. In the following, 𝔽\mathbb{F} denotes a set of finite FP numbers including the signed zeros (0-0 and +0+0; we also denote +0+0 by 0). Additionally, special FP data, i.e., -\infty, ++\infty, NaN\mathrm{NaN}, are considered according to the context. In the following, we assume IEEE-754’s comparison operators {<,>,,,=}\{<,>,\leq,\geq,=\} for FP numbers in 𝔽{±,NaN}\mathbb{F}\cup\{\pm\infty,\mathrm{NaN}\}; for example, +0=0+0=-0 holds and NaNx\mathrm{NaN}\leq x does not hold for x𝔽{±,NaN}x\in\mathbb{F}\cup\{\pm\infty,\mathrm{NaN}\}. Given a real x~\tilde{x}\in\mathbb{R}, (x~)\bigtriangledown(\tilde{x}) and (x~)\bigtriangleup(\tilde{x}) denote the downward and upward rounded values in 𝔽{,+}\mathbb{F}\cup\{-\infty,+\infty\} and are defined by (x~):=max{x𝔽{}:xx~}\bigtriangledown(\tilde{x}):=\max\{x\in\mathbb{F}\cup\{-\infty\}:x\leq\tilde{x}\} and (x~):=min{x𝔽{+}:xx~}\bigtriangleup(\tilde{x}):=\min\{x\in\mathbb{F}\cup\{+\infty\}:x\geq\tilde{x}\}. Additionally, we assume the four operators for FP numbers {+,,×,÷}\odot\in\{+,-,\times,\div\}. For x,y𝔽{±}x,y\in\mathbb{F}\cup\{\pm\infty\}, (xy)\bigtriangledown(x\odot y) and (xy)\bigtriangleup(x\odot y) denote the downward and upward rounded values of the operation \odot. Special cases are handled according to IEEE-754 and the semantics in [10]: Infinity arithmetic, e.g., 0+=0-+\infty=-\infty; invalid operations, e.g., ++=NaN+\infty-+\infty=\mathrm{NaN}; and overflows, e.g., positive overflow of (xy)\bigtriangledown(x\odot y) evaluates to max𝔽;\max\mathbb{F}; we also assume that underflows result in correctly rounded values.

An interval 𝒙=[x¯,x¯]{\bm{x}}=[\underline{x},\overline{x}] is defined as a closed connected set {x~:x¯x~x¯}\{\tilde{x}\in\mathbb{R}:\underline{x}\!\leq\!\tilde{x}\!\leq\!\overline{x}\}, where x¯x¯\underline{x}\leq\overline{x}. In an actual implementation, the lower and upper bounds x¯\underline{x} and x¯\overline{x} are restricted to FP numbers in 𝔽\mathbb{F}. We also consider the entire interval [,+]:=[-\infty,+\infty]:=\mathbb{R} and half-bounded intervals [x¯,+]:={x~:x¯x~}[\underline{x},+\infty]:=\{\tilde{x}\in\mathbb{R}:\underline{x}\!\leq\!\tilde{x}\} and [,x¯]:={x~:x~x¯}[-\infty,\overline{x}]:=\{\tilde{x}\in\mathbb{R}:\tilde{x}\!\leq\!\overline{x}\}, where x¯,x¯𝔽\underline{x},\overline{x}\in\mathbb{F}. 𝕀𝔽\mathbb{IF} denotes the set of intervals, which contains the entire interval and the closed/half-bounded intervals bounded by FP numbers in 𝔽{±}\mathbb{F}\cup\{\pm\infty\}. Intervals are considered as overapproximations of (non-empty) sets SS\subseteq\mathbb{R} of interest; when the extrema of a set are not representable in 𝕀𝔽\mathbb{IF}, we consider an interval whose lower/upper bounds are rounded downwards/upwards. Given a set SS\subseteq\mathbb{R}, An optimal (tightest) overapproximation is obtained by the hull operation hull(S):=[(infS),(supS)]\mathrm{hull}(S):=\allowbreak[\bigtriangledown(\mathrm{inf}\ S),\bigtriangleup(\mathrm{sup}\ S)].

Recently, interval arithmetic has been standardized [8]. The standard is layered into four levels as in IEEE-754 and it is parameterized with the notions such as flavors and decorations to be adapted to various models and implementations. The interval arithmetic in this paper deals with a variant of the set-based flavor. It is different in that we consider neither empty intervals nor division by an interval containing zero. The set-based flavor represents the empty set as empty intervals, e.g. [x¯,x¯][\underline{x},\overline{x}] where x¯>x¯\underline{x}>\overline{x} and [,][-\infty,-\infty], but our 𝕀𝔽\mathbb{IF} does not contain them. In the set-based flavor, 𝒙÷[0,0]{\bm{x}}\div[0,0] is computed as an empty interval, and other divisions by intervals containing zero can be done with the two-output division scheme, whereas our division operator throws an exception. Otherwise, our interval arithmetic is “classical” [8], which is based on the “simple model” [11]. In this paper, we intend to make the interval arithmetic code simple without missing the essential verification process. We consider the process will be applied to other extensions in a future work.

Now, we introduce a definition of the four arithmetic operations on intervals.

Definition 1 (Four interval arithmetic operators)

For 𝐱,𝐲𝕀𝔽{\bm{x}},{\bm{y}}\in\mathbb{IF} and {+,,×,÷}\odot\in\{+,-,\times,\div\},

𝒙𝒚:=hull({xy:x𝒙,y𝒚})(We assume 0𝒚 when =÷).{\bm{x}}\odot{\bm{y}}~{}:=~{}\mathrm{hull}(\{x\odot y\,:\,x\in{\bm{x}},y\in{\bm{y}}\})\qquad\text{(We assume $0\not\in{\bm{y}}$ when $\odot=\div$)}.

The following theorem gives the basic principle for the tight implementation of the interval operators, namely to take the hull of the values computed with the bounds of operand intervals.

Theorem 1

For 𝐱,𝐲𝕀𝔽{\bm{x}},{\bm{y}}\in\mathbb{IF} and {+,,×,÷}\odot\in\{+,-,\times,\div\} (we assume 0𝐲0\not\in{\bm{y}} when =÷\odot=\div),

𝒙𝒚[min{(x¯y¯),(x¯y¯),(x¯y¯),(x¯y¯)},max{(x¯y¯),(x¯y¯),(x¯y¯),(x¯y¯)}],{\bm{x}}\odot{\bm{y}}~{}\supseteq~{}\bigl{[}\mathsfit{min}\bigl{\{}\bigtriangledown(\underline{x}\odot\underline{y}),\bigtriangledown(\underline{x}\odot\overline{y}),\bigtriangledown(\overline{x}\odot\underline{y}),\bigtriangledown(\overline{x}\odot\overline{y})\bigr{\}},~{}\mathsfit{max}\bigl{\{}\bigtriangleup(\underline{x}\odot\underline{y}),\bigtriangleup(\underline{x}\odot\overline{y}),\bigtriangleup(\overline{x}\odot\underline{y}),\bigtriangleup(\overline{x}\odot\overline{y})\bigr{\}}\bigr{]}, (1)

where for S𝔽{±,NaN}S\subseteq\mathbb{F}\cup\{\pm\infty,\mathrm{NaN}\},

minS\displaystyle\mathsfit{min}\ S :={min(S{NaN})if S{NaN},0if S={NaN},\displaystyle~{}:=~{}\begin{cases}\min(S-\{\mathrm{NaN}\})&\text{if $S\neq\{\mathrm{NaN}\}$},\\ 0&\text{if $S=\{\mathrm{NaN}\}$},\end{cases} maxS\displaystyle\mathsfit{max}\ S :={max(S{NaN})if S{NaN},0if S={NaN}.\displaystyle~{}:=~{}\begin{cases}\max(S-\{\mathrm{NaN}\})&\text{if $S\neq\{\mathrm{NaN}\}$},\\ 0&\text{if $S=\{\mathrm{NaN}\}$}.\end{cases} (2)

See A for the proof. Because the above theorem serves as a basis for the tightness specification in the following sections, we only consider the one-way inclusion \supseteq in (1). In the theorem, we use modified extremum operators min\mathsfit{min} and max\mathsfit{max} with an unusual handling of NaN\mathrm{NaN}, which occurs in some FP computations, e.g., +++\infty-+\infty and 0×+0\times+\infty. They evaluate to zero when all the arguments are NaN\mathrm{NaN} (with the second cases), whereas the operators of IEEE-754 evaluate to NaN\mathrm{NaN}. The min\min and max\max operators in the right-hand sides of (2) select the extrema based on the ordinary comparison between FP numbers, e.g., min{0,+}=0\min\{0,+\infty\}=0. For example, a subtraction and a multiplication are performed as:

[0,+][0,+]\displaystyle[0,+\infty]-[0,+\infty] [min{0,,+,NaN},max{0,,+,NaN}]\displaystyle~{}\supseteq~{}\bigl{[}\mathsfit{min}\bigl{\{}0,-\infty,+\infty,\mathrm{NaN}\bigr{\}},~{}\mathsfit{max}\bigl{\{}0,-\infty,+\infty,\mathrm{NaN}\bigr{\}}\bigr{]}
=[min{0,,+},max{0,,+}]=[,+],\displaystyle~{}=~{}\bigl{[}\min\{0,-\infty,+\infty\},~{}\max\{0,-\infty,+\infty\}\bigr{]}~{}=~{}[-\infty,+\infty],
[0,0]×[,+]\displaystyle[0,0]\times[-\infty,+\infty] [min{NaN},max{NaN}]=[0,0].\displaystyle~{}\supseteq~{}\bigl{[}\mathsfit{min}\bigl{\{}\mathrm{NaN}\},~{}\mathsfit{max}\{\mathrm{NaN}\}\bigr{]}~{}=~{}[0,0].

In our context, the second cases of (2) only happen in the multiplications of [0,0][0,0] and [,+][-\infty,+\infty].

Several libraries that implement the interval four operators have been developed, e.g., PROFIL/BIAS,111http://www.ti3.tuhh.de/keil/profil/index_e.html filib++,222http://www2.math.uni-wuppertal.de/wrswt/software/filib.html Boost Interval Arithmetic Library,333https://www.boost.org/doc/libs/1_72_0/libs/numeric/interval/doc/interval.htm gaol,444http://frederic.goualard.net/#research-software libieeep1788,555https://github.com/nadezhin/libieeep1788 INTLAB,666http://www.ti3.tu-harburg.de/rump/intlab/ GNU Octave interval package,777https://octave.sourceforge.io/interval/index.html and kv [3]. In the actual code base, the multiplication and division operators are implemented as nested branching statements based on a careful case analysis. The number of branches increases further due to other factors such as optimizations to reduce the number of rounding mode controls [12], data-level parallelization [13] and multi-precision FP computation [3].

3 Why3 Platform and Alt-Ergo SMT Solver

The main tools used in this work are Why3 and Alt-Ergo.

3.1 Overview of the Tools

Why3 (version 1.1.0) [4, 5] is a platform for deductive program verification, which provides: (i) a specification language WhyML to describe programs and specifications; (ii) a verification condition (VC) generator that applies weakest precondition calculus to the annotated programs; and (iii) a proof intermediate interface that facilitates discharging the VCs using the back-end automated/interactive theorem provers. Why3’s back-end theorem provers include SMT solvers, e.g., Alt-Ergo [7], CVC4 and Z3,888Alt-Ergo: https://alt-ergo.ocamlpro.com/; CVC4: https://cvc4.github.io/; Z3: https://github.com/Z3Prover/z3 and theorem proof assistants, e.g., Coq.999https://coq.inria.fr/ The possible results of applying an automated prover to discharge a VC are valid, invalid, timeout (within a configured time limit), or unknown. Why3 provides a GUI to allow users to browse a list of generated VCs. Users assign a back-end prover to each VC to validate it. When provers cannot validate a VC, users are able to modify the VC (e.g., split one VC into several VCs) and investigate a prover’s input/output though the GUI.

WhyML is an OCaml-like language to describe functional programs and to annotate programs with first-order predicate logic statements. The language has a type system, which provides the type real for real numbers, types for FP numbers, user-defined record types, etc. WhyML also has a module system that encapsulates specification as a module; basic modules are supplied in the standard library. There are two implementations of 64-bit FP numbers: the type t provided by the ieee_float.Float64 module and the type double of the floating_point.DoubleFull module. Running examples in this paper uses the former, which has been introduced in a recent version of Why3; it is compatible with the FloatingPoint theory of SMT-LIB and processed by dedicated solvers when the VCs are discharged with SMT solvers. The latter realizes a theory of FP numbers based on the theory of reals and the generated VCs are encoded with the predicates on reals. The two modules provide arithmetic operators (with rounding mode control) and comparison operators that conform to IEEE-754. FP numbers are related to real numbers using the rounding functions and translation functions provided by either module.

Alt-Ergo (version 2.2.0) [7] is an SMT solver, which shares various features with Why3. For example, Alt-Ergo has a trigger mechanism that allows users to guide the solver towards the proof, using a user-defined lemma. In WhyML, auxiliary lemmas can be specified with triggers and they are utilized by Alt-Ergo. In our experiments, with the help of several user-defined lemmas, Alt-Ergo discharged the most VCs.

3.2 Program Verification Process Using Why3

A user of Why3 verifies a program through performing the following tasks.

Specification of the target code, properties and lemmas. The user implements the target code in the WhyML language and annotates the properties to the code as preconditions and postconditions. Here, requirements for the code should be better represented as simple and generic properties; however, such properties tend to require more efforts to verify. In our work, we investigate an intuitive yet provable representation of the tightness property (cf. Theorem 1). Also, the user is able to annotate properties about intermediate states by inserting assert formulas in the middle of the code, and specify lemmas related to the properties besides the code. These auxiliary annotations will help automated deductions performed by provers like Alt-Ergo.

Generation of VCs and proving with back-end provers. Once an annotated WhyML program is fed to the Why3 tool, a set of VCs are generated; discharging them all entails the correctness of the annotated program. The user is able to modify the VCs by applying logical transformations; for example, s/he can split a VC into several VCs which are easier to prove. Then, s/he applies a prover to each VC to check its validity. In this work, we first apply SMT solvers, i.e., Alt-Ergo and Z3, and then invoke the Coq tool for manual proving.

Analysis on unprovable VCs and modification of the specification. When VCs are not provable, the user returns to the specification task to modify either the target code or the annotations. Since the automated proving processes of VCs are incomplete in general, they may fail to prove VCs even for a correct specification. In such cases, the user should make an effort in finding a specification that results in provable VCs. Here, an interactive proof task with Coq may help to understand the proof context and to identify an additional lemma that is useful to discharge the proof goal. In this work, we actually make such analyses; as a result, we insert assertions within the code and specify lemmas to help Alt-Ergo’s proof process.

4 Towards a Verified Interval Arithmetic Library

The goal of this study is to verify the correctness of an implementation of the four interval arithmetic operators. The implementation code is first cloned from the kv library [3] and then modified to conform to Theorem 1 (see Section 5.3 for details). To achieve this goal, we apply the verification process described in Section 3.2 to the operator code. This section describes the properties we aim at verifying.

When interval operators are implemented in a programming language (e.g. WhyML), without loss of generality, we can assume that intervals are represented as pairs of FP numbers such as 𝒙=(x¯,x¯){\bm{x}}=(\underline{x},\overline{x}) where x¯\underline{x} and x¯\overline{x} can be arbitrary FP numbers in 𝔽{±,NaN}\mathbb{F}\cup\{\pm\infty,\mathrm{NaN}\}; we denote the set of such pairs by 𝕀𝔽\mathbb{IF}^{*} in this section. In the following, we denote that a pair 𝒙𝕀𝔽{\bm{x}}\in\mathbb{IF}^{*} is a valid interval by 𝒙𝕀𝔽{\bm{x}}\in\mathbb{IF}. Accordingly, interval operations {+,,×,÷}\odot\in\{+,-,\times,\div\} will be implemented as procedures 𝒇:𝕀𝔽×𝕀𝔽𝕀𝔽{\bm{f}}_{\odot}:\mathbb{IF}^{*}\times\mathbb{IF}^{*}\to\mathbb{IF}^{*}.

The properties are specified as one precondition and four postconditions, i.e., first-order conditions in \mathbb{R}, 𝔽\mathbb{F}, 𝕀𝔽\mathbb{IF} and 𝕀𝔽\mathbb{IF}^{*}. Here, we present a high-level specification of the pre- and postconditions.

Definition 2 (Pre- and postconditions)

Let 𝐱,𝐲,𝐫𝕀𝔽{\bm{x}},{\bm{y}},{\bm{r}}\in\mathbb{IF}^{*} be pairs of FP numbers, {+,,×,÷}\odot\in\{+,-,\times,\div\} be an operation, and 𝐟{\bm{f}}_{\odot} be an implementation of the interval operation, which is interpreted as a function 𝕀𝔽×𝕀𝔽𝕀𝔽\mathbb{IF}^{*}\times\mathbb{IF}^{*}\to\mathbb{IF}^{*}. Then, the precondition PVP_{V} and the postconditions QVQ_{V}, QSQ_{S}, QTQ_{T} and QZQ_{Z} of computation 𝐫:=𝐟(𝐱,𝐲){\bm{r}}:={\bm{f}}_{\odot}({\bm{x}},{\bm{y}}) are specified as follows:

(Validity precondition) PV\displaystyle P_{V} :𝒙𝕀𝔽𝒚𝕀𝔽,\displaystyle~{}~{}:\equiv~{}~{}{\bm{x}}\in\mathbb{IF}\,\land\,{\bm{y}}\in\mathbb{IF},
(Validity postcondition) QV\displaystyle Q_{V} :𝒓𝕀𝔽,\displaystyle~{}~{}:\equiv~{}~{}{\bm{r}}\in\mathbb{IF},
(Soundness postcondition) QS\displaystyle Q_{S} :x~𝒙,y~𝒚,x~y~𝒓,\displaystyle~{}~{}:\equiv~{}~{}\forall\tilde{x}\!\in\!{\bm{x}},\ \forall\tilde{y}\!\in\!{\bm{y}},~{}\tilde{x}\odot\tilde{y}\in{\bm{r}},
(Tightness postcondition) QT\displaystyle Q_{T} :𝒓=[min{(x¯y¯),(x¯y¯),(x¯y¯),(x¯y¯)},\displaystyle~{}~{}:\equiv~{}~{}{\bm{r}}~{}=~{}\bigl{[}\mathsfit{min}\bigl{\{}\bigtriangledown(\underline{x}\odot\underline{y}),\bigtriangledown(\underline{x}\odot\overline{y}),\bigtriangledown(\overline{x}\odot\underline{y}),\bigtriangledown(\overline{x}\odot\overline{y})\bigr{\}},
max{(x¯y¯),(x¯y¯),(x¯y¯),(x¯y¯)}],\displaystyle\hskip 50.00008pt\mathsfit{max}\bigl{\{}\bigtriangleup(\underline{x}\odot\underline{y}),\bigtriangleup(\underline{x}\odot\overline{y}),\bigtriangleup(\overline{x}\odot\underline{y}),\bigtriangleup(\overline{x}\odot\overline{y})\bigr{\}}\bigr{]},
(Exceptional postcondition) QZ\displaystyle Q_{Z} :0𝒚.\displaystyle~{}~{}:\equiv~{}~{}0\in{\bm{y}}.

Note that, in the right-hand sides, each appearance of the operator \odot acts as that for reals or FP numbers. In the definition, first, the precondition PVP_{V} states that the operand pairs of FP numbers are valid intervals, i.e., their lower (resp. upper) bounds are in 𝔽{}\mathbb{F}\cup\{-\infty\} (resp. 𝔽{+}\mathbb{F}\cup\{+\infty\}) and the magnitude relation holds between the lower and upper bounds. Second, similar to PVP_{V}, the postcondition QVQ_{V} states the validity of the computation result. Third, the postcondition QSQ_{S} states the soundness: “The result contains every possible real operation results.” Fourth, the postcondition QTQ_{T} states the tightness: “The result is an optimal/tightest overapproximation of the four computations with the argument FP numbers.” Finally, QZQ_{Z} is the postcondition for the case where the computation of 𝒇÷{\bm{f}}_{\div} throws the zero division exception.

If the target code is executed, it will perform an operation and either terminate normally or throw an exception that signals zero division when =÷\odot=\div. For the case where the code terminates normally, we first verify that a result 𝒓{\bm{r}} is an interval that overapproximates the true results in {x~y~:x~𝒙,y~𝒚}\{\tilde{x}\odot\tilde{y}~{}:~{}\tilde{x}\in{\bm{x}},\tilde{y}\in{\bm{y}}\} (QVQ_{V} and QSQ_{S}). Secondly, we verify that 𝒓{\bm{r}} is an optimal overapproximation as described in Theorem 1 (QTQ_{T}). Here, we take into account the overflow cases as they are specified in the Why3’s module. Finally, for the exceptional case, we verify that the code correctly detects the zero division (QZQ_{Z}). Our result contains a proof of a variant of Theorem 1: Soundness of the right-hand side of (1).

The representation of the tightness in QTQ_{T} is rather restrictive as it assumes the monotonicity of the operations. More generic representation (e.g. r~𝒓,x~𝒙,y~𝒚,r~(hull(x~)hull(y~))\forall\tilde{r}\!\in\!{\bm{r}},\exists\tilde{x}\!\in\!{\bm{x}},\exists\tilde{y}\!\in\!{\bm{y}},\tilde{r}\in(\mathrm{hull}(\tilde{x})\odot\mathrm{hull}(\tilde{y}))) is preferable but it will certainly make the verification process less automated and more difficult, which involves many interactive proof tasks. In this work, we aim at making the proof process automated and simple with a restrictive annotation.

5 Target Code: Interval Multiplication Procedure

In the following sections, we consider interval multiplication as a running example. This section explains how we specify intervals (Section 5.1) and an interval procedure (Section 5.2) with the WhyML language. Also, Section 5.3 explains a difference between our code and the original code. Additional notes on specifying other operators are described in B.

5.1 Specification of Intervals

1module Interval
2 use real.Real
3 use ieee_float.Float64
4 use Float64Ex
5
6 type interval = { inf: Float64.t; sup: Float64.t }
7
8 predicate real_in (a: real) (x: interval) =
9 ((tisFinite x.inf /\ treal x.inf <= a) \/ is_minus_infinity x.inf) /\
10 ((tisFinite x.sup /\ treal x.sup >= a) \/ is_plus_infinity x.sup)
11
12 val constant zeroI : interval
13 ensures{ is_zero result.inf /\ is_zero result.sup }
14
15 lemma zeroR_in_zeroI: real_in 0. zeroI
16
17 predicate valid (x: interval) =
18 (tisFinite x.inf \/ is_minus_infinity x.inf) /\
19 (tisFinite x.sup \/ is_plus_infinity x.sup) /\ x.inf .<= x.sup
20
21 lemma valid_not_nan: forall x. valid x -> is_not_nan x.inf /\ is_not_nan x.sup
22 lemma valid_zeroI: valid zeroI
23end
Figure 1: Specification of the interval module.
Table 1: Some vocabularies for FP numbers (x,y𝔽{±,NaN}x,y\in\mathbb{F}\cup\{\pm\infty,\mathrm{NaN}\}).
Expression Interpretation Expression Interpretation
is_zerox\texttt{is\char 95\relax zero}\ x x=0 or x=+0\Leftrightarrow~{}\text{$x=-0$ or $x=+0$} t’isFinitex\texttt{t'isFinite}\ x x𝔽\Leftrightarrow~{}x\in\mathbb{F}
zeroF =+0=~{}{+0} is_plus_infinityx\texttt{is\char 95\relax plus\char 95\relax infinity}\ x x=+\Leftrightarrow~{}x=+\infty
x.<=yx\ \texttt{.<=}\ y xy\Leftrightarrow~{}x\leq y (cf. IEEE-754) is_infinitex\texttt{is\char 95\relax infinite}\ x x= or x=+\Leftrightarrow~{}\text{$x=-\infty$ or $x=+\infty$}
mul_downxy\texttt{mul\char 95\relax down}\ x\ y =(x×y)=~{}\bigtriangledown(x\times y) is_not_nanx\texttt{is\char 95\relax not\char 95\relax nan}\ x x is not NaN\Leftrightarrow~{}\text{$x$ is not NaN}

We consider interval operations as procedures that handle data of the interval type, which is specified in a dedicated Why3 module (Figure 1) named Interval; related constants, predicates and lemmas are also specified in this module. In Figure 1, first, the modules for real numbers and FP numbers are imported at Lines 2–4; these modules provide type real, type Float64.t (double-precision FP number), and related types, constants, operators and predicates (some of them are summarized in Table 1). Most of them are specified by a set of WhyML axioms, e.g., the operator .<= is specified as follows.

forall x y. x .<= y -> ( (is_finite x /\ is_finite y) \/
(is_minus_infinity x /\ is_not_nan y) \/ (is_not_nan x /\ is_plus_infinity y) )

The Float64Ex module imported at Line 4 is our own module that provides auxiliary vocabularies for FP numbers (some are shown in Table 1). Second, the type interval is defined at Line 6 as a record combining the lower (inf) and upper (sup) bounds of type Float64.t. Third, the predicate real_in is defined at Lines 8–10 to state a𝒙a\in{\bm{x}} where aa\in\mathbb{R} and 𝒙𝕀𝔽{\bm{x}}\in\mathbb{IF}. Fourth, the module specifies a constant interval zeroI, i.e., a point-wise interval containing 0. It is specified using the is_zero predicate instead of specifying a concrete interval to represent either of intervals whose bounds are 0-0 or +0+0. At Line 15, there is a lemma characterizing zeroI, which may be used in an automated proof process. In the rest of the specification (Lines 17–22), the predicate valid x to state the validity of the value x of type interval is specified. For instance, it is used in the precondition PVP_{V} to state that 𝒙𝕀𝔽{\bm{x}}\in\mathbb{IF}. Then, two lemmas using valid are specified; valid_not_nan states that the bounds of valid intervals are not NaN\mathrm{NaN}; valid_zeroI states the validity of the constant specified above. Lemmas in a WhyML module are proved by Why3’s back-end provers. The three lemmas specified so far are proved using Alt-Ergo, where each proof takes less than 0.2s.

5.2 Specification of Multiplication Procedure

1module IntervalMul
2 use real.Real
3 use ieee_float.Float64
4 use Float64Ex
5 use Interval
6
7 let multiply (x y: interval) : interval
8 (* pre- and postconditions are specified here *)
9 = if x.inf .>= zeroF then
10 if x.sup .= zeroF then
11 zeroI
12 else
13 if y.inf .>= zeroF then
14 if y.sup .= zeroF then
15 zeroI
16 else
17 { inf = mul_down x.inf y.inf; sup = mul_up x.sup y.sup }
18 else if y.sup .<= zeroF then
19 { inf = mul_down x.sup y.inf; sup = mul_up x.inf y.sup }
20 else
21 { inf = mul_down x.sup y.inf; sup = mul_up x.sup y.sup }
22 (* other 8 branches are omitted *)
23end
Figure 2: Code snippet for multiplication of intervals: Case x¯0\underline{x}\geq 0.

Interval multiplication code written as a WhyML function multiply is shown in Figure 2; only five branches are shown due to space limitations, but the full specification consists of 13 branches. At Lines 2–5, necessary modules are imported. Multiplication is defined by the WhyML function multiply at Lines 7–22. The procedure is specified as a nested if-then statement and the expressions inside are specified with predefined vocabularies for FP numbers and intervals.

5.3 Modification of the Target Code

In this work, one major modification was made against the codebase of the kv library (version 0.4.38). In the modification, Lines 11 and 15 of the code in Figure 2 were simplified; Line 11 of the original code was implemented as follows.

if y.inf .= minus_inf || y.sup .= plus_inf then entireI else zeroI

Line 11 corresponds to the case where 𝒙=[0,0]{\bm{x}}=[0,0] and the modified code results in [0,0][0,0], whereas the original code checks whether a 𝒚{\bm{y}}’s bound is ±\pm\infty and returns either [,+][-\infty,+\infty] or [0,0][0,0]. Line 15 is likewise for switched 𝒙{\bm{x}} and 𝒚{\bm{y}}. Thus, the code comprises 17 branches. Our verification result suggests that the original code of kv unnecessarily enlarges some results from [0,0][0,0] to [,+][-\infty,+\infty], which is not correct regarding our tightness property. The original code is correct in terms of the cset model [11] but it is not clear whether this design choice was made or not (cf. the code does not support divisions by an interval containing zero). Based on our interval arithmetic flavor (Section 2), our code is simplified and results in tighter intervals yet it is verified to be correct.

6 Validity and Soundness of Multiplication

This section describes the verification process for the validity QVQ_{V} and soundness QSQ_{S} of the interval multiplication, in which we annotate the target code with the properties (Section 6.1) and discharge the resulting VCs using the Why3’s back-end provers (Section 6.2).

6.1 Specification of the Pre- and Postconditions

We annotate the multiply function in Figure 2 with the properties in Section 4. The following pre- and postconditions, which encode PVP_{V}, QVQ_{V} and QSQ_{S} of Definition 2, respectively, are inserted at Line 8 of the code. 101010The variable result is a built-in variable that represents the resulting value 𝒓{\bm{r}} (Definition 2) of the function under verification.

requires { valid x /\ valid y } (* P_V *)
ensures { valid result } (* Q_V *)
ensures { forall u v. real_in u x /\ real_in v y ->
real_in (u * v) result } (* Q_S *)

6.2 Proof

Next, we generate a proof obligation (i.e., a set of VCs) for the function multiply using the Why3 tool and try to discharge (i.e., prove the validity of) the VCs using one of the back-end SMT solvers. This process can be easily performed through the Why3 GUI. Here, two VCs for the postconditions are generated. However, for these VCs, neither proof process of Alt-Ergo and Z3 does terminate after 10min. For that case, we can split the VC into 26 partial VCs (again through Why3’s GUI), each of which corresponds to one of the 13 branches and one of the two postconditions; discharging them all entails the validity of the original VC. We apply Alt-Ergo to prove the split VCs; it discharges 13 of them and leaves 13 VCs unproved.

To prove the remaining VCs, we try to prove them interactively using the Coq proof assistant. An interactive proof is pessimistic due to complication of the VCs involving intervals and FP numbers, which requires a number of rewritings to accomplish the proof.111111In a Coq proof context, FP numbers are formalized by real numbers with additional data; vocabularies for intervals and FP numbers are defined and characterized based on reals and integers. For instance, rewriting a context by applying a lemma on real numbers requires to unfold the definitions of the vocabularies used in the context, which additionally requires a number of rewritings. The proof of such VCs are performed by SMT solvers more efficiently by automating the rewritings when their search strategies are sufficient. However, this interaction is useful to identify an auxiliary lemma that is helpful for SMT solvers. Indeed, for the VCs under consideration, we find that the following lemma is useful for the proof with Alt-Ergo.

lemma Rmult_le_compat: forall r1 r2 r3 r4 : real.
0. <= r1 <= r2 /\ 0. <= r3 <= r4 -> r1 * r3 <= r2 * r4

By adding this lemma within the same module, two of the unproved VCs121212Each unproved VC corresponds to the third branch shown in Figure 2 and either QVQ_{V} or QSQ_{S}, respectively. are discharged using Alt-Ergo. The above lemma assumes the branching condition in Figure 2 (in reals) and describes a magnitude relation between the products of the pairs of reals. The same lemma is considered in the proof in [14]. The lemma will be regarded as a relation between the branching condition and the resulting state, when the variables are converted to the Float64.t type. The lemma is difficult to prove with SMT solvers because of the nonlinear terms but it is proved with Coq.131313It is simply proved by the tactic “auto with real. In this way, a proof task involving nonlinear term is done manually. Likewise, all of the unproved VCs are discharged with four additional lemmas and finally the soundness of multiply is proved. Overall, Alt-Ergo takes 243s to discharge the 26 VCs. Z3 is able to discharge four of them.

7 Tightness of Multiplication

As a second round of the verification, we verify the tightness QTQ_{T} of the multiply function. The postcondition states that results are the hull of the candidate bounds, i.e., rounded values of x¯×y¯,x¯×y¯,x¯×y¯\underline{x}\times\underline{y},\underline{x}\times\overline{y},\overline{x}\times\underline{y} and x¯×y¯\overline{x}\times\overline{y}. However, annotation of QTQ_{T} is not straightforward due to the complicated comparison of FP numbers that can be the special value NaN. This section describes comparison functions for our purpose (Section 7.1), how we annotate QTQ_{T} into the code in Figure 2 (Section 7.2) and a proof process (Section 7.3).

7.1 Comparison of Four FP Numbers

To specify the postcondition QTQ_{T}, we introduce the functions min4 and max4, which correspond to (the first cases of) the min\mathsfit{min} and max\mathsfit{max} operations in Theorem 1; these functions take the four candidate bounds and evaluate to the minimum/maximum of them while ignoring NaN\mathrm{NaN}; they evaluate to NaN\mathrm{NaN} when all the arguments are NaN\mathrm{NaN}. Specification of min4 is shown in Figure 3, which is actually described in the Float64Ex module; max4 is specified in the same way. First, the min2 function that compares two FP numbers is specified (Lines 1–2). min2 ignores NaN\mathrm{NaN} given as an argument so that the expression min2 x y evaluates to x when y evaluates to NaN\mathrm{NaN}. Second, the min4 function is simply specified with min2 (Line 4). To support deductions when min4 is used, we also specify six lemmas including min4_feq (Lines 6–8), min4_fle (Lines 10–12) and min4_feq_w (Lines 14–16) in Figure 3. The first two lemmas state the relations between the value of min4 and the arguments that are not NaN\mathrm{NaN}; they are respectively proved by Alt-Ergo in 7s and 0.5s. The last four lemmas state possible resulting values of min4 expressions, which are equivalent to one of the arguments w, x, y and z; these four lemmas are respectively proved by Alt-Ergo in 0.4s. The descriptions “[min4 w x y z]” specify the trigger of a lemma, which enforces to assume the lemma when a min4 predicate becomes true.

1 function min2 (x y: Float64.t) : Float64.t
2 = if x .< y then x else if is_not_nan y then y else x
3
4 function min4 (w x y z: Float64.t) : Float64.t = min2 w (min2 x (min2 y z))
5
6 lemma min4_feq: forall w x y z [min4 w x y z].
7 is_not_nan w \/ is_not_nan x \/ is_not_nan y \/ is_not_nan z ->
8 min4 x y z w .= w \/ min4 x y z w .= x \/ min4 x y z w .= y \/ min4 x y z w .= z
9
10 lemma min4_fle: forall w x y z [min4 w x y z].
11 (is_not_nan w -> min4 w x y z .<= w) /\ (is_not_nan x -> min4 w x y z .<= x) /\
12 (is_not_nan y -> min4 w x y z .<= y) /\ (is_not_nan z -> min4 w x y z .<= z)
13
14 lemma min4_feq_w: forall w x y z [min4 w x y z].
15 is_not_nan w /\ (w .<= x \/ is_nan x) /\ (w .<= y \/ is_nan y) /\ (w .<= z \/ is_nan z) ->
16 min4 w x y z .= w
17
18 (* lemmas min4_feq_(x|y|z) are omitted *)

Figure 3: Specification of the min2 and min4 functions.

7.2 Specification of Tightness

Before annotating QTQ_{T}, we specify the conditions for the second cases of Equation (2) in Theorem 1, which hold when the four candidate bounds become NaN; these cases only happen in multiplication. The condition for min\mathsfit{min} is specified as a predicate in the IntervalMul module as follows (the condition for max\mathsfit{max} is specified likewise as mul_nan_case_sup).

predicate mul_nan_case_inf (x y: interval) =
is_nan (mul_down x.inf y.inf) /\ is_nan (mul_down x.inf y.sup) /\
is_nan (mul_down x.sup y.inf) /\ is_nan (mul_down x.sup y.sup)

Then, the pre- and postconditions PVP_{V} and QTQ_{T} are specified as follows.

requires { valid x /\ valid y } (* P_V *)
ensures { not mul_nan_case_inf x y ->
result.inf .= min4
(mul_down x.inf y.inf) (mul_down x.inf y.sup)
(mul_down x.sup y.inf) (mul_down x.sup y.sup)} (* Q_T lb *)
ensures { not mul_nan_case_sup x y ->
result.sup .= max4
(mul_up x.inf y.inf) (mul_up x.inf y.sup)
(mul_up x.sup y.inf) (mul_up x.sup y.sup) } (* Q_T ub *)
ensures { mul_nan_case_inf x y /\ mul_nan_case_sup x y ->
is_zero result.inf /\ is_zero result.sup } (* Q_T NaN case *)

Here, QTQ_{T} is specified in three parts. The first and second parts describe the first cases of Equation (2) using the min4 and max4 functions, each of which specifies the lower or upper bound portion of QTQ_{T}. The third part describes the second cases of Equation (2) in conjunction because they only happen at the same time.

7.3 Proof

The first and second postconditions are transformed into 26 VCs (with three split operations) and Alt-Ergo discharges 22 of them; 4 VCs are left unproved after 10min. The third postcondition is transformed into a VC and discharged by Alt-Ergo in 0.5s.

After investigation of the remaining VCs, we have found that the following lemma helps Alt-Ergo to prove the VCs generated from the first postcondition.

lemma mul_down_positive_finite: forall x y [mul_down x y].
is_positive (mul_down x y) /\ tisFinite (mul_down x y) ->
round RTN (treal zeroF) <= round RTN (treal x * treal y)

The lemma details the relation between the lower bound of the result and zero. The premise extracts the unproved two branches, each of which corresponds to a remaining VC; for example, one of the VCs corresponds to the third branch in Figure 2, where x.inf0\texttt{x.inf}\geq 0 and y.inf0\texttt{y.inf}\geq 0 hold and the lower bound of the result is computed as mul_down x.inf y.inf. We presume that applying round explicitly on both sides of the consequent inequality facilitates unifying it with expressions specified in the ieee_float module and thus further deductions are invoked. Another lemma mul_up_negative_finite is specified likewise for the second postcondition. In our experiment with Alt-Ergo, the two lemmas are proved in 0.2s and the 26 VCs generated from the target code are proved in 308s.

8 Implementation and Experiments

This section summarizes how the four operators have been specified (Section 8.1) and reports the verification result (Section 8.2). Section 8.3 describes how we extracted an executable code from the specification. The file containing the specifications and the extracted code is available at https://dsksh.github.io/vint-jcam2020.zip.

8.1 Specification

Each operator was implemented as a WhyML function (see B for some details). The number of branches in each function is shown in the second column of Table 2. Each function was annotated with the conditions PVP_{V}, QVQ_{V}, QSQ_{S} and QTQ_{T} (the code for division was also annotated with QZQ_{Z}). Since the “NaN case” described in Section 7.2 only happened in multiplication, QTQ_{T} was translated as two postconditions (“Q_T lb” and “Q_T ub” without preconditioning) for the other three operators. In the experiment, we identified that additional lemmas were necessary for the verification (the numbers of such lemmas are shown in the third column of Table 2). First, two variations of the lemma in Section 7.3 were needed to verify QTQ_{T} for each operator code. Second, the subtraction code was annotated with an assert formula to detail a property of the resulting value. Third, the multiplication and division codes were also annotated with four and seven lemmas, respectively, which were the variations of the lemma in Section 6.2; here, the seven lemmas for division include the four lemmas for multiplication. There were 19 lemmas specified in total. Other than the operator code, two WhyML modules, Float64Ex and Interval were prepared, which contained 30 and three lemmas about FP numbers and intervals, respectively.

8.2 Statistical Data

We performed an experiment to verify the implementation of four operators and all of them were successfully verified. The experiments (including the running examples) were operated using a 2.2GHz Intel Xeon E5-2650v4 processor with 128GB of RAM. The following tools were used: Why3 1.1.0, Alt-Ergo 2.2.0, Coq 8.8.0 and Z3 4.7.1. The memory used was limited to 4GB and the time to 10min for each VC proof.

Why3’s split strategy allows splitting VCs in several degrees. Therefore, the verification was performed with three settings: Without using the split strategy (“No split”); with splitting VCs as many times as possible (“Split full”); with an optimal number of splittings that requires minimum CPU time (“Split optimal”). Note that splitting a VC does not always result in VCs that are discharged by SMT solvers more efficiently than the original.

Table 2: Experimental results.
No split Split full Split optimal
Op. # Br’s # Lm’s # VCs Time # VCs Time # VCs Time
++ 1 2 3 36.4s 8 25.0s 6 24.7s
- 1 2 3 63.0s 10 47.2s 8 47.0s
×\times 13 2+4 3 140s 100 508s 3 140s
÷\div 7 6+7 7 TO (1) 33 TO (4) 15 578s

Herein, we present the experimental results in Table 2. The first three columns represent the target operator (“Op.”), the number of branches (“# Br’s”) and the number of additional lemmas (“# Lm’s”). The data “mm+nn” indicates that mm lemmas were discharged with SMT solvers and nn lemmas were proved with Coq (only the VCs generated from the former are taken into account in the following columns). The following columns are separated for each of the settings described above; for each section, the number of generated VCs (“# VCs”) and the total execution time by Alt-Ergo (“Time”) are shown. “TO (nn)” (for “Time Out”) means that the proofs for nn VCs did not finish within 10min. Note that the verification of the division code succeeded only with the “Split optimal” setting.

Additionally, 33 lemmas in the Float64Ex and Interval modules were discharged by Alt-Ergo in 32.6s in total. We also tried to discharge the “Split full” VCs with Z3, which was less efficient in our experiment; for each operator, 1/8, 1/10, 48/100 and 2/33 VCs were proved, respectively, and the rest resulted in timeout.

8.3 Extraction of Executable Code

Why3 provides an extract function that generates an executable code from a WhyML code. Using this function, we extracted an OCaml code that implements the interval type and the four operators. The extraction of numerical programs with Why3 required additional implementations. First, we prepared an OCaml module for FP arithmetic operations with rounding mode control. We implemented the rounding operators in C, which were then interfaced with OCaml functions of the dedicated Round module. Second, we extended Why3’s extract function to support the Float64.t type and related constants/functions. For this purpose, we prepared a driver file that defines various translation rules.

9 Related Work

Ayad and Marché [14] reported a case study on verifying an interval multiplication code. They used Frama-C, which is a verification platform like Why3 equipped with SMT solvers, and verified the soundness QSQ_{S} of a variant of our code. In this paper, we go further by considering the tightness QTQ_{T}.

For generic FP computation, a number of verified programs have been reported. A notable implementation is CRlibm [15], which aims at replacing the mathematical library of C. Elementary functions of CRlibm were formally verified yet preserve the efficiency of the computation. The correctness was proved by analyzing the bounds of numerical errors by hand or using the Gappa tool [16]. CRlibm differs from our verified code in two aspects: First, the properties and the target programs are different; in CRlibm, it is verified that computation results are rounded values in the prescribed direction for the elementary functions such as logarithm functions. Second, CRlibm is implemented in C and executable but our code is not; instead, we extract an executable OCaml code (Section 8.3).

Verification results for other numerical programs/algorithms have been reported [17, 6, 18], which used tools including Frama-C, SMT solvers, Gappa and Coq (with Flocq [19]). They targeted computations of certain expressions in reals, which differed from our subject: operations on FP intervals. In the verification, they aimed at certifying rounding error bounds of numerical computations, as in the verification of CRlibm. On the other hand, our target code outputs intervals that should enclose possible errors; then, we verify the correctness of the code. The verification tool Frama-C is able to target C++ programs (with the C++ plugin). Verification of our target program with Frama-C will be a future work as different tools might require different efforts to accomplish a verification.

Gappa [16] is a theorem prover for (specific forms of) VCs about real numbers which may involve rounding operators. It has been shown practical for proving rounding error bounds as mentioned above [15, 17, 6]. Gappa can be applied to our VCs but how to encode them in a way solvable with Gappa is not straightforward; for example, it is unclear how to handle user-defined types (e.g. interval) and the special FP values ±\pm\infty and NaN\mathrm{NaN} since Gappa does not support them natively.

10 Conclusion and Future Work

A code for four interval operations, verified with Why3, has been presented. We implemented the target code with WhyML and annotated with properties validity, soundness and tightness; then, we proved the correctness by discharging all of the generated VCs by Alt-Ergo and Coq. The tightness was specified in an indirect way with the min4/max4 functions; we verified that each bound of a result coincides either of the extremum computed with the bounds of the inputs, or zero. We believe interval arithmetic code is an appealing application for program verification as its correctness matters and as the provers of this domain are developed actively.

In the verification, several strategies were utilized, e.g., specification of auxiliary lemmas that help Alt-Ergo, and delegation of reasoning on nonlinear terms to the interactive proof with Coq. Some lemmas are not specific to interval arithmetic; so, they can be reused by automated provers in other applications. Many of the VCs were proved in a combined automated and interactive manner. This approach is powerful as VCs tend to become complicated, while often involving nonlinear terms. In principle, it can handle arbitrary VCs on background theories of the tools.

As a future work, we plan to verify other interval computations, such as other arithmetic implementations, elementary functions, and application programs. We aim to schematize useful strategies for the verification of this domain. Another line of work is to design and integrate a set of axioms for interval analysis into a prover like Alt-Ergo [20] for reasoning about statements on interval arithmetic.

Acknowledgements

The authors thank Masahide Kashiwagi who developed the kv library and the developers of the verification tools we utilized. We also thank Guillaume Melquiond for suggesting us the use of lemmas on reals in the early stage. We are grateful for the many valuable comments of the peer reviewers. This work was partially funded by JSPS (KAKENHI 18K11240 and 18H03223).

References

Appendix A Proof of Theorem 1

We check the tightness of the right-hand side (rhs) for the following two cases: (i) Except when all of the computations with the interval bounds become NaN\mathrm{NaN}, the tightness is obvious since the bounds of the rhs are the rounded values of the operation results, i.e., the hull of some theoretical values. (ii) For the case when all boundary computations result in NaN\mathrm{NaN}, which is evaluated to 0 with the second case of (2), we confirm that it only happens in the multiplication and it results in a correct interval:

  • ++)

    The extremum bounds are always computed as [(x¯+y¯),(x¯+y¯)][\bigtriangledown(\underline{x}+\underline{y}),\bigtriangleup(\overline{x}+\overline{y})]. Here, additions ±+\pm\infty+\mp\infty that result in NaN\mathrm{NaN} never happen; so, the second cases of (2)\eqref{eq:minmax4} do not apply.

  • -)

    The rhs is always computed as [(x¯y¯),(x¯y¯)][\bigtriangledown(\underline{x}-\overline{y}),\bigtriangleup(\overline{x}-\underline{y})]. Neither subtractions ±±\pm\infty-\pm\infty that result in NaN\mathrm{NaN} nor the second cases of (2)\eqref{eq:minmax4} happen.

  • ×\times)

    The computation results of the rhs are classified as follows:

    y¯<0,y¯0\underline{y}<0,\overline{y}\leq 0 y¯<0,y¯>0\underline{y}<0,\overline{y}>0 y¯0,y¯>0\underline{y}\geq 0,\overline{y}>0 y¯=y¯=0\underline{y}=\overline{y}=0
    x¯<0,x¯0\underline{x}<0,\overline{x}\leq 0 [(x¯×y¯),(x¯×y¯)][\bigtriangledown(\overline{x}\times\overline{y}),\bigtriangleup(\underline{x}\times\underline{y})] [(x¯×y¯),(x¯×y¯)][\bigtriangledown(\underline{x}\times\overline{y}),\bigtriangleup(\underline{x}\times\underline{y})] [(x¯×y¯),(x¯×y¯)][\bigtriangledown(\underline{x}\times\overline{y}),\bigtriangleup(\overline{x}\times\underline{y})] [0,0][0,0] (**)
    x¯<0,x¯>0\underline{x}<0,\overline{x}>0 [(x¯×y¯),(x¯×y¯)][\bigtriangledown(\overline{x}\times\underline{y}),\bigtriangleup(\underline{x}\times\underline{y})] (*) [(x¯×y¯),(x¯×y¯)][\bigtriangledown(\underline{x}\times\overline{y}),\bigtriangleup(\overline{x}\times\overline{y})] [0,0][0,0] (**)
    x¯0,x¯>0\underline{x}\geq 0,\overline{x}>0 [(x¯×y¯),(x¯×y¯)][\bigtriangledown(\overline{x}\times\underline{y}),\bigtriangleup(\underline{x}\times\overline{y})] [(x¯×y¯),(x¯×y¯)][\bigtriangledown(\overline{x}\times\underline{y}),\bigtriangleup(\overline{x}\times\overline{y})] [(x¯×y¯),(x¯×y¯)][\bigtriangledown(\underline{x}\times\underline{y}),\bigtriangleup(\overline{x}\times\overline{y})] [0,0][0,0] (**)
    x¯=x¯=0\underline{x}=\overline{x}=0 [0,0][0,0] (**) [0,0][0,0] (**) [0,0][0,0] (**) [0,0][0,0]

    The cell (*) is computed as [min{(x¯×y¯),(x¯×y¯)},max{(x¯×y¯),(x¯×y¯)}][\min\{\bigtriangledown(\underline{x}\times\overline{y}),\bigtriangledown(\overline{x}\times\underline{y})\},\max\{\bigtriangleup(\underline{x}\times\underline{y}),\bigtriangleup(\overline{x}\times\overline{y})\}]. Note that the multiplications of 0 and ±\pm\infty result in NaN\mathrm{NaN}. These combinations never happen in the cells not marked with (**). For the cells marked with (**), we compute [0,0][0,0] as r,0×r=0\forall r\in\mathbb{R},0\times r=0. When multiplying [0,0][0,0] and half-bounded intervals, [0,0][0,0] is computed with the first cases of (2). When multiplying [0,0][0,0] and [,+][-\infty,+\infty], [0,0][0,0] is computed with the second cases.

  • ÷\div)

    The computation results of the rhs are classified as follows:

    y¯<0\overline{y}<0 y¯>0\underline{y}>0
    x¯<0\overline{x}<0 [(x¯÷y¯),(x¯÷y¯)][\bigtriangledown(\overline{x}\div\underline{y}),\bigtriangleup(\underline{x}\div\overline{y})] [(x¯÷y¯),(x¯÷y¯)][\bigtriangledown(\underline{x}\div\underline{y}),\bigtriangleup(\overline{x}\div\overline{y})]
    x¯0,x¯0\underline{x}\leq 0,\overline{x}\geq 0 [(x¯÷y¯),(x¯÷y¯)][\bigtriangledown(\overline{x}\div\overline{y}),\bigtriangleup(\underline{x}\div\overline{y})] [(x¯÷y¯),(x¯÷y¯)][\bigtriangledown(\underline{x}\div\underline{y}),\bigtriangleup(\overline{x}\div\underline{y})]
    x¯>0\underline{x}>0 [(x¯÷y¯),(x¯÷y¯)][\bigtriangledown(\overline{x}\div\overline{y}),\bigtriangleup(\underline{x}\div\underline{y})] [(x¯÷y¯),(x¯÷y¯)][\bigtriangledown(\underline{x}\div\overline{y}),\bigtriangleup(\overline{x}\div\underline{y})]

    The divisions ±÷±\pm\infty\div\pm\infty that result in NaN\mathrm{NaN} never happen in all the cells; so, the second cases do not apply.

Almost identical analysis to the above is described in [21], in which the cells (**) of the multiplication table are analyzed differently.

Appendix B Specification of Subtraction and Division

The verification processes for other operations are similar to that for the multiplication. Some notable annotations made to the subtraction and division operators are described below.

Specification of the subtraction operator is shown in Figure 5. In the specification of QTQ_{T}, the precondition mul_nan_case is omitted since it is not necessary; if the precondition was violated, the result would be [NaN,NaN][\mathrm{NaN},\mathrm{NaN}], which is an invalid interval; it is verified with the postcondition QVQ_{V} that it will not happen. At Lines 12–13, two properties on the lower bound of the result and the argument intervals are annotated; the properties are introduced as premises in the VCs for the postconditions; they accelerated the proof process of Alt-Ergo in our experiment. As a side note, similar assertions were not annotated to the addition operator, since it was verified automatically and efficiently without them.

Specification of the division operator is shown in Figure 5. Line 9 raises the Division_by_zero exception and it is verified with the postcondition QZQ_{Z} at Line 3 that it is raised properly.

1 let subtraction (x y: interval) : interval
2 requires { valid x /\ valid y }
3 ensures { valid result }
4 ensures { forall u v. real_in u x /\ real_in v y -> real_in (u - v) result }
5 ensures { result.inf .= min4
6 (sub_down x.inf y.inf) (sub_down x.inf y.sup)
7 (sub_down x.sup y.inf) (sub_down x.sup y.sup) }
8 ensures { result.sup .= max4
9 (sub_up x.inf y.inf) (sub_up x.inf y.sup)
10 (sub_up x.sup y.inf) (sub_up x.sup y.sup) }
11 = let r = { inf = sub_down x.inf y.sup; sup = sub_up x.sup y.inf } in
12 assert { r.inf .<= sub_down x.inf y.inf \/ is_nan (sub_down x.inf y.inf) };
13 assert { r.inf .<= sub_down x.sup y.sup \/ is_nan (sub_down x.sup y.sup) };
14 r
Figure 4: Code for subtraction of intervals.
1 let division (x y: interval) : interval
2 (* pre- and postconditions are omitted *)
3 raises { Division_by_zero -> real_in 0. y }
4 = if y.inf .> zeroF then
5 (* three branches are omitted *)
6 else if y.sup .< zeroF then
7 (* three branches are omitted *)
8 else
9 raise Division_by_zero
Figure 5: Code snippet for division of intervals: Exceptional case.