This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

\givenname

Samuel Allen \surnameAlexander \givennameBryan \surnameDawson \subjectprimarymsc202026E35, 26A24 \subjectsecondarymsc202054D80, 30D20 \volumenumber \issuenumber \publicationyear \papernumber \startpage \endpage \MR \Zbl \published \publishedonline \proposed \seconded \corresponding \editor \version

Hyperreal differentiation with an idempotent ultrafilter

Samuel Allen Alexander The U.S. Securities and Exchange Commission [email protected]    Bryan Dawson Union University [email protected]
Abstract

In the hyperreals constructed using a free ultrafilter on \mathbb{R}, where [f][f] is the hyperreal represented by f:f:\mathbb{R}\to\mathbb{R}, it is tempting to define a derivative operator by [f]=[f][f]^{\prime}=[f^{\prime}], but unfortunately this is not generally well-defined. We show that if the ultrafilter in question is idempotent and contains (0,ϵ)(0,\epsilon) for arbitrarily small real ϵ\epsilon then the desired derivative operator is well-defined for all ff such that [f][f^{\prime}] exists. We also introduce a hyperreal variation of the derivative from finite calculus, and show that it has surprising relationships to the standard derivative. We give an alternate proof, and strengthened version of, Hindman’s theorem.

keywords:
hyperreals, idempotent ultrafilters, derivatives, finite calculus, Hindman’s theorem
doi:
{asciiabstract}

In the hyperreals constructed using a free ultrafilter on R, where [f] is the hyperreal represented by f:R-¿R, it is tempting to define a derivative operator by [f]’=[f’], but unfortunately this is not generally well-defined. We show that if the ultrafilter in question is idempotent and contains (0,epsilon) for arbitrarily small real epsilon then the desired derivative operator is well-defined for all f such that [f’] exists. We also introduce a hyperreal variation of the derivative from finite calculus, and show that it has surprising relationships to the standard derivative. We give a new proof of Hindman’s theorem, and we prove a stronger theorem.

1 Introduction

There is a long tradition [13, 2, 16, 4, 14, 11, 5, 12, 6] of attempting to differentiate numbers in various ways. Much attention was focused on derivatives of numbers when Jeffries’ paper on the subject appeared in the Notices of the AMS late last year [10]; almost simultaneously (and apparently independently), Tossavainen et al’s survey on the subject appeared in the College Math Journal [15].

Why should the reader care about differentiating numbers? In general, any time a new theory is introduced, it is natural to seek numerical structures satisfying that theory: thus, when the theory of groups is introduced, it is natural to introduce examples like (,+)(\mathbb{Q},+) and (+,)(\mathbb{R}^{+},\cdot). Theories about elementary calculus functions, in languages including the unary function symbol \bullet^{\prime}, were originally modeled by structures whose universes consisted of elementary calculus functions, not numbers. We can at least try to find numerical models for these theories. The act of interpreting \bullet^{\prime} in a structure whose universe is a number system is the act of “differentiating numbers”. We are hopeful that generalizing models of elementary calculus function theories could eventually be fruitful just like generalizing models of permutation sets led to the abstract theory of groups.

When it comes to numerically modeling theories, different theories might require different number systems: both (,+,)(\mathbb{Z},+,\cdot) and (,+,)(\mathbb{Q},+,\cdot) are rings, but only the latter is a field. By gaining knowledge about which number systems are needed for which theories, we gain insight into those theories. We hope that a greater knowledge about which number systems are needed to model various subtheories of elementary calculus functions, will eventually give us insight into those subtheories.

If the only axiom we care about is the Leibniz rule (and the nontriviality axiom xs.t.x0\exists x\,\mbox{s.t.}\,x^{\prime}\not=0), we can interpret \bullet^{\prime} on \mathbb{N} so as to satisfy that. That is the approach of [2, 16]. But their (,)(\mathbb{N},\bullet^{\prime}) does not even satisfy the linearity axiom. Our interpretation (in Section 3) of \bullet^{\prime} on a subset of the hyperreals will satisfy far more axioms of the theory of elementary calculus functions. And in Section 3.2, we will introduce a stricter subset of the hyperreals where not only \bullet^{\prime} can be elegantly interpreted, but \circ as well (in other words, there is an elegant way to define the “composition” of two numbers there), in such a way as to numerically satisfy even the chain rule.

The key idea behind our numerical interpretation of \bullet^{\prime} is to commute the derivative operation with the operation of taking a function’s equivalence class in the hyperreals, in other words, define [f]=[f][f]^{\prime}=[f^{\prime}] (we will spell out the details below). Unfortunately, this is not well-defined in general. However, the idea can be salvaged in several different ways, by making use of certain idempotent ultrafilters. In Section 4 we will use the same approach to well-define a hyperreal variation of the derivative from finite calculus, and as an application of that, we will give a new proof of Hindman’s theorem and also strengthen said theorem.

2 Preliminaries

Throughout the paper, we write β\beta\mathbb{R} for the set of ultrafilters on \mathbb{R}.

Definition 1.

(Hyperreals) For each free pβp\in\beta\mathbb{R}, let p{}^{*}\mathbb{R}_{p} be the hyperreals constructed using pp. For every f:f:\mathbb{R}\to\mathbb{R}, let [f]p[f]_{p} be the hyperreal represented by ff. If pp is clear from context, we will write {}^{*}\mathbb{R} and [f][f] for p{}^{*}\mathbb{R}_{p} and [f]p[f]_{p}, respectively.

Convention 2.

If pβp\in\beta\mathbb{R} is free and ff is a function with codomain \mathbb{R} and with domain dom(f)p\mathrm{dom}(f)\in p, we will write [f]p[f]_{p} for [f^]p[\hat{f}]_{p} where f^:\hat{f}:\mathbb{R}\to\mathbb{R} is the extension of ff defined by f^(x)=0\hat{f}(x)=0 for all x\dom(f)x\in\mathbb{R}\backslash\mathrm{dom}(f). If dom(f)p\mathrm{dom}(f)\not\in p, we say that [f]p[f]_{p} does not exist. If pp is clear from context, we will write [f][f] for [f]p[f]_{p}.

Definition 3.

For each f:f:\mathbb{R}\to\mathbb{R}, let f:{}^{*}f:{}^{*}\mathbb{R}\to{}^{*}\mathbb{R} be the nonstandard extension of ff. Let Ω=[xx]\Omega=[x\mapsto x] be the hyperreal represented by the identity function.

For every f:f:\mathbb{R}\to\mathbb{R}, there are two ways of viewing ff in nonstandard analysis. It can be viewed as the number [f][f] or as the function f:{}^{*}f:{}^{*}\mathbb{R}\to{}^{*}\mathbb{R}. The two are related via Ω\Omega. Namely: [f]=f(Ω)[f]={}^{*}f(\Omega).

Unfortunately, the following proposition shows that the idea of defining [f]=[f][f]^{\prime}=[f^{\prime}] does not work in general.

Proposition 4.

(Ill-definedness)

  1. 1.

    There exists a free pβp\in\beta\mathbb{R} and everywhere-differentiable f,g:f,g:\mathbb{R}\to\mathbb{R} such that [f]p=[g]p[f]_{p}=[g]_{p} but [f]p[g]p[f^{\prime}]_{p}\not=[g^{\prime}]_{p}.

  2. 2.

    For every free pβp\in\beta\mathbb{R}, for all f:f:\mathbb{R}\to\mathbb{R} such that [f]p[f^{\prime}]_{p} exists, there exists g:g:\mathbb{R}\to\mathbb{R} such that [f]p=[g]p[f]_{p}=[g]_{p} but [g]p[g^{\prime}]_{p} does not exist.

Proof.

(1) Let pβp\in\beta\mathbb{R} such that p\mathbb{N}\in p. The claim is witnessed by f(x)=0f(x)=0 and g(x)=sinπxg(x)=\sin\pi x.

(2) Let DD\subseteq\mathbb{R} be dense and co-dense. Assume DpD\in p (if not, then DcpD^{c}\in p and a similar argument applies). The claim is witnessed by g(x)=f(x)+χDc(x)g(x)=f(x)+\chi_{D^{c}}(x). ∎

In light of Proposition 4, we cannot expect the definition [f]p=[f]p[f]^{\prime}_{p}=[f^{\prime}]_{p} to work for every free pβp\in\beta\mathbb{R} even if we restrict our attention to everywhere-differentiable ff; and if we do not so restrict our attention, then we can expect the definition [f]p=[f]p[f]^{\prime}_{p}=[f^{\prime}]_{p} to fail for every pp. We will show that if we restrict attention to those ff such that [f][f^{\prime}] exists, then the definition [f]p=[f]p[f]^{\prime}_{p}=[f^{\prime}]_{p} does work provided pp is idempotent and contains (0,ϵ)(0,\epsilon) for every ϵ>0\epsilon>0.

3 Differentiating hyperreals [f][f] such that [f][f^{\prime}] exists

Definition 5.

(Idempotent ultrafilters on \mathbb{R})

  1. 1.

    For each SS\subseteq\mathbb{R} and any yy\in\mathbb{R}, SyS-y is defined to be {xy:xS}\{x-y\,:\,x\in S\}.

  2. 2.

    An ultrafilter pβp\in\beta\mathbb{R} is idempotent if p={S:{y:Syp}p}p=\{S\subseteq\mathbb{R}\,:\,\{y\in\mathbb{R}\,:\,S-y\in p\}\in p\}.

Definition 6.

By 0+0^{+} we mean the set of ultrafilters pβp\in\beta\mathbb{R} such that pp satisfies the following requirement. For every real ϵ>0\epsilon>0, the open interval (0,ϵ)p(0,\epsilon)\in p.

Lemma 7.

0+0^{+} contains an idempotent ultrafilter.

Proof.

By Lemma 13.29(a) and Theorem 13.31 of [9]. ∎

Clearly an ultrafilter in 0+0^{+} is free. The following lemma illustrates the power of ultrafilters in 0+0^{+}.

Lemma 8.

Let p0+p\in 0^{+}. If f:f:\mathbb{R}\to\mathbb{R} is continuous at 0 then st([f]p)=f(0)\mathrm{st}([f]_{p})=f(0).

Proof.

Let ϵ>0\epsilon>0 be real. By continuity of ff at 0, δ>0\exists\delta>0 such that |f(0)f(x)|<ϵ|f(0)-f(x)|<\epsilon whenever x(0,δ)x\in(0,\delta). Since p0+p\in 0^{+}, (0,δ)p(0,\delta)\in p. Thus, ff is within ϵ\epsilon of f(0)f(0) ultrafilter often, so [f]p[f]_{p} is within ϵ\epsilon of f(0)f(0). ∎

For the rest of this section, we fix an idempotent p0+p\in 0^{+}. The following theorem shows that this suffices to make the definition [f]=[f][f]^{\prime}=[f^{\prime}] well-defined if we restrict it to functions such that [f][f^{\prime}] exists. Note that since p0+p\in 0^{+}, the existence of [f][f^{\prime}] is equivalent to the statement that for all real ϵ>0\epsilon>0, there exists real δ(0,ϵ)\delta\in(0,\epsilon) such that f(δ)f^{\prime}(\delta) exists.

Theorem 9.

For all f,g:f,g:\mathbb{R}\to\mathbb{R} such that [f][f^{\prime}] and [g][g^{\prime}] exist, if [f]=[g][f]=[g] then [f]=[g][f^{\prime}]=[g^{\prime}].

Proof.

Since [f]=[g][f]=[g], there is some S0pS_{0}\in p such that f=gf=g on S0S_{0}. Let S=S0dom(f)dom(g)S=S_{0}\cap\mathrm{dom}(f^{\prime})\cap\mathrm{dom}(g^{\prime}). Existence of [f][f^{\prime}] and [g][g^{\prime}] means dom(f)p\mathrm{dom}(f^{\prime})\in p and dom(g)p\mathrm{dom}(g^{\prime})\in p, thus SpS\in p. Since pp is idempotent, {x:Sxp}p\{x\in\mathbb{R}\,:\,S-x\in p\}\in p. Thus S{x:Sxp}pS\cap\{x\in\mathbb{R}\,:\,S-x\in p\}\in p. To show [f]=[g][f^{\prime}]=[g^{\prime}], we will show that f=gf^{\prime}=g^{\prime} on S{x:Sxp}S\cap\{x\in\mathbb{R}\,:\,S-x\in p\}. Let xS{x:Sxp}x\in S\cap\{x\in\mathbb{R}\,:\,S-x\in p\}. In particular, SxpS-x\in p. We must show f(x)=g(x)f^{\prime}(x)=g^{\prime}(x).

Claim: For all hSxh\in S-x, (f(x+h)f(x))/h=(g(x+h)g(x))/h(f(x+h)-f(x))/h=(g(x+h)-g(x))/h. Indeed, let hSxh\in S-x. This means h=yxh=y-x for some ySy\in S. Compute:

(f(x+h)f(x))/h\displaystyle(f(x+h)-f(x))/h =(f(x+yx)f(x))/h\displaystyle=(f(x+y-x)-f(x))/h (h=yxh=y-x)
=(f(y)f(x))/h\displaystyle=(f(y)-f(x))/h (Algebra)
=(g(y)g(x))/h\displaystyle=(g(y)-g(x))/h (f=gf=g on SS)
=(g(x+yx)g(x))/h\displaystyle=(g(x+y-x)-g(x))/h (Algebra)
=(g(x+h)g(x))/h,\displaystyle=(g(x+h)-g(x))/h, (h=yxh=y-x)

proving the claim.

Since p0+p\in 0^{+} and SxpS-x\in p, it follows that for all real ϵ>0\epsilon>0, (Sx)(0,ϵ)p(S-x)\cap(0,\epsilon)\in p, thus is nonempty. So SxS-x contains hh arbitrarily near 0. Since xdom(f)dom(g)x\in\mathrm{dom}(f^{\prime})\cap\mathrm{dom}(g^{\prime}), limh0(f(x+h)f(x))/h\lim_{h\to 0}(f(x+h)-f(x))/h and limh0(g(x+h)g(x))/h\lim_{h\to 0}(g(x+h)-g(x))/h exist. Since both limits exist and since there are hh arbitrarily near 0 such that (f(x+h)f(x))/h=(g(x+h)g(x))/h(f(x+h)-f(x))/h=(g(x+h)-g(x))/h, the limits must be equal, that is, f(x)=g(x)f^{\prime}(x)=g^{\prime}(x). ∎

Corollary 10.

Let 𝒟={[f]:f: and [f] exists}.\mathcal{D}=\{[f]\,:\,\mbox{$f:\mathbb{R}\to\mathbb{R}$ and $[f^{\prime}]$ exists}\}. The derivative operation :𝒟\bullet^{\prime}:\mathcal{D}\to{}^{*}\mathbb{R} defined by [f]=[f][f]^{\prime}=[f^{\prime}] (for all ff such that [f][f^{\prime}] exists) is well-defined.

We do not yet know whether 𝒟\mathcal{D} (from Corollary 10) is a proper subset of {}^{*}\mathbb{R}. In other words: can every hyperreal be written in the form [f][f] where [f][f^{\prime}] exists? Does this depend on pp?

If 𝒟\mathcal{D} is as in Corollary 10 then it follows that the structure (𝒟,1,Ω,+,,)(\mathcal{D},1,\Omega,+,\cdot,\bullet^{\prime}) satisfies every positive formula in the theory of elementary calculus functions in the language (1,id,+,,)(1,\mathrm{id},+,\cdot,\bullet^{\prime}), where, by positive formula, we mean a formula that can be built up without using ¬\neg or \not=. In particular this includes the Leibniz rule axiom, linearity, and the power rule schema. The structure also satisfies the nontriviality axiom xs.t.x0\exists x\,\mbox{s.t.}\,x^{\prime}\not=0. All this remains true if constant symbols for other individual functions (such as sin\sin and cos\cos), besides just the identity function, are added to the language, interpreted in 𝒟\mathcal{D} by the hyperreals represented thereby (such as [sin][\sin] and [cos][\cos]), provided those hyperreals’ derivatives exist.

In terms of nonstandard extensions, Theorem 9 says that there is a well-defined map which sends every f(Ω){}^{*}f(\Omega) to f(Ω){}^{*}f^{\prime}(\Omega). For example, this map sends eΩ+Ω3+cos2Ωe^{\Omega}+\Omega^{3}+\cos 2\Omega to eΩ+3Ω22sin2Ωe^{\Omega}+3\Omega^{2}-2\sin 2\Omega.

One might intuitively wonder whether [f]=0[f]^{\prime}=0 implies ff is constant (at least ultrafilter often). The following proposition provides a counterexample.

Proposition 11.

There exists f:f:\mathbb{R}\to\mathbb{R} such that [f]=0[f]^{\prime}=0 but for every rr\in\mathbb{R}, {x:f(x)=r}p\{x\in\mathbb{R}\,:\,f(x)=r\}\not\in p.

Proof.

By Theorem 1.14 of [1], there exist disjoint Cantor sets on \mathbb{R}. An ultrafilter cannot contain two disjoint sets, so there is some Cantor set CC on \mathbb{R} with CpC\not\in p. Let ff be the devil’s staircase based on CC. Then f(x)=0f^{\prime}(x)=0 for all xCx\not\in C so [f]=0[f]^{\prime}=0, but ff is increasing, and is not flat on (0,ϵ)(0,\epsilon) for any ϵ>0\epsilon>0, implying (since p0+p\in 0^{+}) that {x:f(x)=r}p\{x\in\mathbb{R}\,:\,f(x)=r\}\not\in p for all rr. ∎

We have proven Corollary 10 under the assumption that pp is idempotent and in 0+0^{+}. Similar reasoning would hold if pp were idempotent and in 00^{-} (i.e., if pp were required to contain (ϵ,0)(-\epsilon,0) for every positive real ϵ\epsilon). We currently do not know whether Corollary 10 holds for any other type of ultrafilter.

3.1 Differential equations and the secant method

Since \bullet^{\prime} takes (a subset of) {}^{*}\mathbb{R} to {}^{*}\mathbb{R}, one can attempt to solve (or approximately solve) differential equations by using the secant method from numerical analysis, which is traditionally only used to solve non-differential equations. This is interesting because as far as we know, the secant method has not previously been applicable to differential equations. We illustrate this with an example in which the method finds a correct solution in one step.

Example 12.

Solve the differential equation y2x=0y^{\prime}-2x=0 using the secant method, with initial guesses y0=x2+x3y_{0}=x^{2}+x^{3} and y1=x2x3y_{1}=x^{2}-x^{3}.

Solution. Define α:\alpha:{\subseteq}{}^{*}\mathbb{R}\to{}^{*}\mathbb{R} by α([f])=[f]2Ω\alpha([f])=[f]^{\prime}-2\Omega whenever [f][f]^{\prime} is defined. We desire a solution [f][f] of the equation α([f])=0\alpha([f])=0. Since Ω=[xx]\Omega=[x\mapsto x], it follows that [f]2Ω=[xf(x)2x][f]^{\prime}-2\Omega=[x\mapsto f^{\prime}(x)-2x], so any such solution [f][f] will yield a solution y=f(x)y=f(x) to the differential equation y2x=0y^{\prime}-2x=0 (at least pp-a.e.). Compute:

[f0]\displaystyle[f_{0}] =Ω2+Ω3\displaystyle=\Omega^{2}+\Omega^{3} (initial guess y0=x2+x3y_{0}=x^{2}+x^{3})
[f1]\displaystyle[f_{1}] =Ω2Ω3\displaystyle=\Omega^{2}-\Omega^{3} (initial guess y1=x2x3y_{1}=x^{2}-x^{3})
[f2]\displaystyle[f_{2}] =[f1]α([f1])[f1][f0]α([f1])α([f0]);\displaystyle=[f_{1}]-\alpha([f_{1}])\frac{[f_{1}]-[f_{0}]}{\alpha([f_{1}])-\alpha([f_{0}])}; (Secant method)
α([f1])\displaystyle\alpha([f_{1}]) =(Ω2Ω3)2Ω\displaystyle=(\Omega^{2}-\Omega^{3})^{\prime}-2\Omega
=(2Ω3Ω2)2Ω;\displaystyle=(2\Omega-3\Omega^{2})-2\Omega;
α([f0])\displaystyle\alpha([f_{0}]) =(Ω2+Ω3)2Ω\displaystyle=(\Omega^{2}+\Omega^{3})^{\prime}-2\Omega
=(2Ω+3Ω2)2Ω;\displaystyle=(2\Omega+3\Omega^{2})-2\Omega;

it follows that [f2]=Ω2[f_{2}]=\Omega^{2}. This yields a solution y=x2y=x^{2} to the original differential equation. ∎

We have not yet found any examples where this approach is more practical than other approximate methods in differential equations. We hope that either such examples can be found later, or, if not, that the lack of such examples might provide insight into limitations of the secant method itself. The point of this subsection is not so much to focus on the secant method, but rather to illustrate the kind of things we hope might be possible by numerically interpreting the theory of elementary calculus functions.

3.2 The well-definability of composition on a subset of the hyperreals

The work we have presented above is relevant to numerically modeling subtheories of the theory of elementary calculus functions in a language containing a unary function symbol \bullet^{\prime} for differentiation. But one key axiom is missing from that theory, namely the chain rule, since the chain rule also involves a binary function symbol \circ for composition. In this section, we introduce a subset of the hyperreals suitable for numerically modeling subtheories of the theory of elementary calculus functions in a language containing \bullet^{\prime} and \circ. The following definition is motivated by the theory of complex analysis.

Definition 13.

(Entire numbers)

  1. 1.

    A function f:f:\mathbb{R}\to\mathbb{R} is entire if ff is infinitely differentiable at 0 and x\forall x\in\mathbb{R}, f(x)=k=0f(k)(0)xk/k!f(x)=\sum_{k=0}^{\infty}f^{(k)}(0)x^{k}/k!.

  2. 2.

    A hyperreal number is entire if it can be written as [f][f] for some entire f:f:\mathbb{R}\to\mathbb{R}.

Proposition 14.

For all entire f,g:f,g:\mathbb{R}\to\mathbb{R}, if [f]=[g][f]=[g] then f=gf=g.

Proof.

By well-known results from analysis, ff and gg are infinitely differentiable everywhere, so [f(k)][f^{(k)}] and [g(k)][g^{(k)}] exist for all kk. By repeated applications of Theorem 9, each [f(k)]=[g(k)][f^{(k)}]=[g^{(k)}]. By Lemma 8, it follows that each f(k)(0)=g(k)(0)f^{(k)}(0)=g^{(k)}(0). Since ff and gg are entire, this implies f=gf=g. ∎

In the same way that we can think of the differential equation 2x+2yy=02x+2yy^{\prime}=0 as describing the family of all circles centered at the origin, we can think of the equation y=k=0y(k)(0)xk/k!y=\sum_{k=0}^{\infty}y^{(k)}(0)x^{k}/k! as describing the family of all entire functions. The latter is much worse behaved than the former: no two circles centered at the origin ever intersect each other, but for all (x0,y0)2(x_{0},y_{0})\in\mathbb{R}^{2} there exist distinct entire functions whose graphs intersect at (x0,y0)(x_{0},y_{0}). In this sense, we can say that the real plane contains no “critical points” of 2x+2yy=02x+2yy^{\prime}=0, but that every point of the real plane is a “critical point” of y=k=0y(k)(0)xk/k!y=\sum_{k=0}^{\infty}y^{(k)}(0)x^{k}/k!. We can interpret Proposition 14 as saying that in the hyperreal plane, every point on the vertical line x=Ωx=\Omega is a “non-critical point” of the family of all entire functions, for the proposition says that no two distinct entire function graphs intersect anywhere on this vertical line.

Corollary 15.

The operation \circ defined on the entire numbers by [f][g]=[fg][f]\circ[g]=[f\circ g] (whenever ff and gg are entire) is well-defined.

Clearly with \circ defined as in Corollary 15 and with \bullet^{\prime} defined as in Corollary 10, the entire numbers satisfy the chain rule axiom, xy(xy)=(xy)y\forall x\forall y\,(x\circ y)^{\prime}=(x^{\prime}\circ y)y^{\prime}.

3.3 The approximately space-filling nature of differentiation

Clearly \bullet^{\prime} is linear over \mathbb{R}, in the sense that for all [f],[g][f],[g]\in{}^{*}\mathbb{R} and λ,μ\lambda,\mu\in\mathbb{R}, (λ[f]+μ[g])=λ[f]+μ[g](\lambda[f]+\mu[g])^{\prime}=\lambda[f]^{\prime}+\mu[g]^{\prime} if the derivatives in question exist. So how badly behaved could the graph y=xy=x^{\prime} be? We will show that it is approximately space-filling, in the following sense.

Definition 16.

A subset C()2C\subseteq({}^{*}\mathbb{R})^{2} is approximately space-filling if for all (x0,y0)2(x_{0},y_{0})\in\mathbb{R}^{2}, there exists some (x,y)C(x,y)\in C such that st(x)=x0\mathrm{st}(x)=x_{0} and st(y)=y0\mathrm{st}(y)=y_{0}.

It is not hard to find functions {}^{*}\mathbb{R}\to{}^{*}\mathbb{R} which are linear over \mathbb{R} and whose graphs are approximately space-filling. For example, if we consider {}^{*}\mathbb{R} as a vector space over \mathbb{R} and let \mathcal{B} be a basis for it with an infinitesimal basis element vv, then the projection π(+λv+)=λ\pi(\cdots+\lambda v+\cdots)=\lambda is one such function. Nevertheless, we find it interesting (even if the proof is quite simple) that our derivative operator also has these properties.

Proposition 17.

The hyperreal graph y=xy=x^{\prime}, i.e., the set of all ([f],[g])()2([f],[g])\in({}^{*}\mathbb{R})^{2} such that [g]=[f][g]=[f]^{\prime}, is approximately space-filling.

Proof.

For any (x0,y0)2(x_{0},y_{0})\in\mathbb{R}^{2}, let f:f:\mathbb{R}\to\mathbb{R} be continuously differentiable everywhere with f(0)=x0f(0)=x_{0} and f(0)=y0f^{\prime}(0)=y_{0}. By Lemma 8, st([f])=f(0)=x0\mathrm{st}([f])=f(0)=x_{0} and st([f])=st([f])=f(0)=y0\mathrm{st}([f]^{\prime})=\mathrm{st}([f^{\prime}])=f^{\prime}(0)=y_{0}. ∎

Having established that the hyperreal graph y=xy=x^{\prime} is approximately space-filling, we will proceed to state two additional results about approximately space-filling sets in general, which therefore apply in particular to said graph.

Proposition 18.

If C()2C\subseteq({}^{*}\mathbb{R})^{2} is approximately space-filling, then for any X2X\subseteq\mathbb{R}^{2}, there exists some SS\subseteq{}^{*}\mathbb{R} such that

X={(st(x),st(y)):(x,y)C and xS}.X=\{(\mathrm{st}(x),\mathrm{st}(y))\,:\,(x,y)\in C\mbox{ and }x\in S\}.

In particular, for any X2X\subseteq\mathbb{R}^{2}, there exists some SS\subseteq{}^{*}\mathbb{R} such that

X={(st(x),st(x)):xS}.X=\{(\mathrm{st}(x),\mathrm{st}(x^{\prime}))\,:\,x\in S\}.
Proof.

Straightforward. ∎

Proposition 19.

Suppose C()2C\subseteq({}^{*}\mathbb{R})^{2} is approximately space-filling where CC is the graph (over {}^{*}\mathbb{R}) of y=F(x)y=F(x) for some F:F:{\subseteq}{}^{*}\mathbb{R}\to{}^{*}\mathbb{R}. Let X2X\subseteq\mathbb{R}^{2} be the graph of the equation y=f(x)y=f(x) for some everywhere-differentiable f:f:\mathbb{R}\to\mathbb{R}. If SS is as in Proposition 18, then F|SF|_{S} has the same slope as y=f(x)y=f(x) in the following sense: for every xSx\in S, for every real ϵ>0\epsilon>0, there exists some real δ>0\delta>0 such that for all hh\in{}^{*}\mathbb{R}, if 0<|h|<δ0<|h|<\delta and x+hSx+h\in S then |f(st(x))(F(x+h)F(x))/h|<ϵ|f^{\prime}(\mathrm{st}(x))-(F(x+h)-F(x))/h|<\epsilon. In particular, this is true when F=F=\bullet^{\prime}.

Proof.

Straightforward. ∎

4 A variant of finite calculus using an idempotent ultrafilter on \mathbb{N}

In so-called finite calculus, one considers the “derivative” f(x+1)f(x)f(x+1)-f(x) of ff, see Section 2.6 of [8]. In this section, we will investigate a variant of this derivative, namely Δf(x)=f(x+Ω)f(x)\Delta f(x)={}^{*}f(x+\Omega)-f(x) where Ω=[nn]\Omega=[n\mapsto n] is the canonical hyperreal in the hyperreals constructed using an idempotent ultrafilter on ={1,2,}\mathbb{N}=\{1,2,\ldots\} (note that we omit 0 from \mathbb{N}). We will show that this finite derivative Δ\Delta has unexpected connections to the standard derivative, and that the equivalence class (in an iterated ultrapower construction) of (Δf)|(\Delta f)|_{\mathbb{N}} is well-defined as a function of [f|][f|_{\mathbb{N}}] (we elaborate what we mean by “in an iterated ultrapower construction” in Remark 28 below).

The following Definitions 20 and 22, and Lemma 21 are \mathbb{N}-focused analogies of the \mathbb{R}-focused Definitions 5 and 1, and Lemma 7 above, respectively. We prefer this slight redundancy (instead of defining everything in higher generality) for the sake of concreteness.

Definition 20.

(Idempotent ultrafilters on \mathbb{N})

  1. 1.

    If SS\subseteq\mathbb{N} and nn\in\mathbb{N}, let Sn={xn:xS and xn}S-n=\{x-n\,:\,\mbox{$x\in S$ and $x-n\in\mathbb{N}$}\}.

  2. 2.

    An ultrafilter qq on \mathbb{N} is idempotent if q={S:{n:Snq}q}q=\{S\subseteq\mathbb{N}\,:\,\{n\in\mathbb{N}\,:\,S-n\in q\}\in q\}.

Lemma 21.

Idempotent ultrafilters on \mathbb{N} exist and are free.

Proof.

See [9]. ∎

Definition 22.

If qq is a free ultrafilter on \mathbb{N}, we write q{}^{*}\mathbb{R}_{q} for the hyperreal numbers constructed using qq in the usual way, and for each f:f:\mathbb{N}\to\mathbb{R} we write [f]q[f]_{q} for the hyperreal represented by ff. If qq is clear from context, we will write {}^{*}\mathbb{R} for q{}^{*}\mathbb{R}_{q}, [f][f] for [f]q[f]_{q}, and f{}^{*}f for the nonstandard extension f:{}^{*}f:{}^{*}\mathbb{R}\to{}^{*}\mathbb{R} of f:f:\mathbb{R}\to\mathbb{R}.

For the remainder of the paper, we fix an idempotent ultrafilter qq on \mathbb{N}. Lemma 24 below will replace 0+0^{+}.

Definition 23.

For each γ+\\gamma\in\mathbb{R}^{+}\backslash\mathbb{Q}, we define rmγ:(γ/2,γ/2)\mathrm{rm}_{\gamma}:\mathbb{N}\to(-\gamma/2,\gamma/2) as follows (we pronounce rm\mathrm{rm} as “remainder”). For each nn\in\mathbb{N}, we define rmγ(n)=nkγ\mathrm{rm}_{\gamma}(n)=n-k\gamma where kγk\gamma is the closest integer multiple of γ\gamma to nn (there is a unique such closest integer multiple of γ\gamma because γ\gamma is irrational and nn\in\mathbb{N}).

Note that the following lemma depends on qq being idempotent.

Lemma 24.

Let γ+\\gamma\in\mathbb{R}^{+}\backslash\mathbb{Q}. For every real ϵ>0\epsilon>0, {n:|rmγ(n)|<ϵ}q\{n\in\mathbb{N}\,:\,|\mathrm{rm}_{\gamma}(n)|<\epsilon\}\in q.

Proof.

Follows from Theorem 7.2 of [3]. ∎

For the rest of the section, let Ω=[nn]\Omega=[n\in\mathbb{N}\mapsto n] (the canonical element of \{}^{*}\mathbb{R}\backslash\mathbb{R}).

Definition 25.

(Finite derivative) For each f:f:\mathbb{R}\to\mathbb{R}, we define the finite derivative Δf:\Delta f:\mathbb{R}\to{}^{*}\mathbb{R} by Δf(x)=f(x+Ω)f(x)\Delta f(x)={}^{*}f(x+\Omega)-f(x).

The following theorem shows that, when restricted to γ\gamma-periodic functions for a fixed irrational real number γ>0\gamma>0, Δ\Delta is a constant multiple of \bullet^{\prime} (up to an infinitesimal error), at least where ff^{\prime} exists. And even when ff^{\prime} does not exist, Δ\Delta (times the same constant multiple) sometimes still provides information about the slope in question.

Theorem 26.

Let γ+\\gamma\in\mathbb{R}^{+}\backslash\mathbb{Q}, and let f:f:\mathbb{R}\to\mathbb{R} be γ\gamma-periodic. Let xx\in\mathbb{R}.

  1. 1.

    If f(x)f^{\prime}(x) exists, then st(Δf(x)/[rmγ])=f(x)\mathrm{st}(\Delta f(x)/[\mathrm{rm}_{\gamma}])=f^{\prime}(x).

  2. 2.

    If f(x)f^{\prime}(x) fails to exist because limh0(f(x+h)f(x))/h\lim_{h\to 0}(f(x+h)-f(x))/h diverges to \infty (resp. -\infty) then Δf(x)/[rmγ]\Delta f(x)/[\mathrm{rm}_{\gamma}] is infinite (resp. negative infinite).

  3. 3.

    Let g:g:\mathbb{R}\to\mathbb{R} be γ\gamma-periodic. If both f(x)f^{\prime}(x) and g(x)g^{\prime}(x) fail to exist because limh0(f(x+h)f(x))/h\lim_{h\to 0}(f(x+h)-f(x))/h and limh0(g(x+h)g(x))/h\lim_{h\to 0}(g(x+h)-g(x))/h both diverge to \infty but limh0(f(x+h)f(x))/h\lim_{h\to 0}(f(x+h)-f(x))/h diverges to \infty faster (i.e., there exists δ>0\delta>0 such that (f(x+h)f(x))/h>(g(x+h)g(x))/h(f(x+h)-f(x))/h>(g(x+h)-g(x))/h whenever 0<|h|<δ0<|h|<\delta) then Δf(x)/[rmγ]>Δg(x)/[rmγ]\Delta f(x)/[\mathrm{rm}_{\gamma}]>\Delta g(x)/[\mathrm{rm}_{\gamma}]. Similarly for -\infty.

Proof.

(1) Fix xx\in\mathbb{R} such that f(x)f^{\prime}(x) exists. Define g:g:\mathbb{N}\to\mathbb{R} by g(n)=(f(x+n)f(x))/rmγ(n)g(n)=(f(x+n)-f(x))/\mathrm{rm}_{\gamma}(n), then [g]=Δf(x)/[rmγ][g]=\Delta f(x)/[\mathrm{rm}_{\gamma}]. To show st([g])=f(x)\mathrm{st}([g])=f^{\prime}(x), it suffices to show that for every real ϵ>0\epsilon>0, {n:|f(x)g(n)|<ϵ}q\{n\in\mathbb{N}\,:\,|f^{\prime}(x)-g(n)|<\epsilon\}\in q. Fix ϵ\epsilon. By definition of ff^{\prime}, there is some δ>0\delta>0 such that |f(x)(f(x+h)f(x))/h|<ϵ\left|f^{\prime}(x)-(f(x+h)-f(x))/h\right|<\epsilon whenever 0<|h|<δ0<|h|<\delta. Let S={n:|rmγ(n)|<δ}S=\{n\in\mathbb{N}\,:\,|\mathrm{rm}_{\gamma}(n)|<\delta\}. By Lemma 24, SqS\in q. We claim that |f(x)g(n)|<ϵ|f^{\prime}(x)-g(n)|<\epsilon for all nSn\in S. Let nSn\in S. Then

|f(x)g(n)|\displaystyle|f^{\prime}(x)-g(n)| =|f(x)(f(x+n)f(x))/rmγ(n)|\displaystyle=|f^{\prime}(x)-(f(x+n)-f(x))/\mathrm{rm}_{\gamma}(n)| (Def. of gg)
=|f(x)(f(x+rmγ(n))f(x))/rmγ(n)|\displaystyle=|f^{\prime}(x)-(f(x+\mathrm{rm}_{\gamma}(n))-f(x))/\mathrm{rm}_{\gamma}(n)| (ff is γ\gamma-periodic)
<ϵ.\displaystyle<\epsilon. (0<|rmγ(n)|<δ0<|\mathrm{rm}_{\gamma}(n)|<\delta)

The proofs of (2) and (3) are similar to (and easier than) the proof of (1). ∎

In particular, Δ(fg)(x)/[rmγ](Δf(g(x))/[rmγ])(Δg(x)/[rmγ])\Delta(f\circ g)(x)/[\mathrm{rm}_{\gamma}]\approx(\Delta f(g(x))/[\mathrm{rm}_{\gamma}])(\Delta g(x)/[\mathrm{rm}_{\gamma}]) (with infinitesimal error) provided f(g(x))f^{\prime}(g(x)) and g(x)g^{\prime}(x) exist and gg is γ\gamma-periodic; thus Δ/[rmγ]\Delta/[\mathrm{rm}_{\gamma}] satisfies a chain rule. Contrast this with Graham, Knuth and Patashnik’s claim that “there’s no corresponding chain rule of finite calculus” [8].

Without the idempotency requirement, it would be possible to find free qβq\in\beta\mathbb{N} so as to falsify Theorem 26. For example, one could choose qq such that {n:rmγ(n)(γ3,2γ3)}q\{n\in\mathbb{N}\,:\,\mathrm{rm}_{\gamma}(n)\in(\frac{\gamma}{3},\frac{2\gamma}{3})\}\in q. Then for any γ\gamma-periodic f:f:\mathbb{R}\to\mathbb{R} such that f(0)=f(0)=0f(0)=f^{\prime}(0)=0 and f(x)=1f(x)=1 for all x(γ3,2γ3)x\in(\frac{\gamma}{3},\frac{2\gamma}{3}), it would follow that Δf(0)=1\Delta f(0)=1, implying 32γ<Δf(0)/[rmγ]<3γ\frac{3}{2\gamma}<\Delta f(0)/[\mathrm{rm}_{\gamma}]<\frac{3}{\gamma}, making the conclusion of Theorem 26 (part 1) impossible at x=0x=0.

As in Section 3, we would like to transform the derivative operation Δ:()\Delta:\mathbb{R}^{\mathbb{R}}\to({}^{*}\mathbb{R})^{\mathbb{R}} into a well-defined derivative on hyperreals. But since Δf(x)\Delta f(x) is itself hyperreal, we will need to be careful.

Definition 27.

For every f:f:\mathbb{N}\to{}^{*}\mathbb{R}, we write f\llbracket f\rrbracket for the equivalence class of ff modulo the equivalence relation defined by declaring that h,g:h,g:\mathbb{N}\to{}^{*}\mathbb{R} are equivalent iff {n:h(n)=g(n)}q\{n\in\mathbb{N}\,:\,h(n)=g(n)\}\in q. We identify each rr\in\mathbb{R} with nr\llbracket n\mapsto r\rrbracket.

Remark 28.

Suppose f:f:\mathbb{N}\to{}^{*}\mathbb{R}. Using Proposition 6.5.2 of Chang and Keisler [7], one can in fact think of f\llbracket f\rrbracket as being a hyperreal number, but in a different copy of the hyperreals. To be more precise, one can think of f\llbracket f\rrbracket as a hyperreal in q×q{}^{*}\mathbb{R}_{q\times q}, where ×\times is the filter product defined shortly before Proposition 6.2.1 of [7].

The following theorem will allow us to well-define Δ[f]=Δf\Delta[f]=\llbracket\Delta f\rrbracket.

Theorem 29.

Let f,g:f,g:\mathbb{R}\to\mathbb{R}. If [f|]=[g|][f|_{\mathbb{N}}]=[g|_{\mathbb{N}}] then Δf|=Δg|\llbracket\Delta f|_{\mathbb{N}}\rrbracket=\llbracket\Delta g|_{\mathbb{N}}\rrbracket.

Proof.

Assume [f|]=[g|][f|_{\mathbb{N}}]=[g|_{\mathbb{N}}]. We must show {n:Δf(n)=Δg(n)}q\{n\in\mathbb{N}\,:\,\Delta f(n)=\Delta g(n)\}\in q. Since [f|]=[g|][f|_{\mathbb{N}}]=[g|_{\mathbb{N}}], there is some SqS\in q such that f(n)=g(n)f(n)=g(n) for all nSn\in S. Since qq is idempotent, {n:Snq}q\{n\in\mathbb{N}\,:\,S-n\in q\}\in q. Thus S{n:Snq}qS\cap\{n\in\mathbb{N}\,:\,S-n\in q\}\in q. We claim Δf(n)=Δg(n)\Delta f(n)=\Delta g(n) for all nS{n:Snq}n\in S\cap\{n\in\mathbb{N}\,:\,S-n\in q\}. Fix any such nn. By construction, SnqS-n\in q. By a similar argument as in the proof of Theorem 9, for all hSnh\in S-n, f(n+h)f(n)=g(n+h)g(n)f(n+h)-f(n)=g(n+h)-g(n). Thus [hf(n+h)f(n)]=[hg(n+h)g(n)][h\mapsto f(n+h)-f(n)]=[h\mapsto g(n+h)-g(n)], in other words f(n+Ω)f(n)=g(n+Ω)g(n){}^{*}f(n+\Omega)-f(n)={}^{*}g(n+\Omega)-g(n), in other words Δf(n)=Δg(n)\Delta f(n)=\Delta g(n). ∎

Corollary 30.

The finite derivative Δ:q×q\Delta:{}^{*}\mathbb{R}\to{}^{*}\mathbb{R}_{q\times q} defined by Δ[f]=Δf\Delta[f]=\llbracket\Delta f\rrbracket is well-defined.

One might intuitively expect that Δ[f]=0\Delta[f]=0 should imply that ff is constant (at least ultrafilter often); we present a counterexample disproving this intuition and replacing it with a characterization related to periodicity.

Definition 31.

Let f:f:\mathbb{N}\to\mathbb{R}. We say ff is qq-a.e. constant if r\exists r\in\mathbb{R} such that {n:f(n)=r}q\{n\in\mathbb{N}\,:\,f(n)=r\}\in q. We say ff is qq-a.e. Ω\Omega-periodic if {n:f(n+Ω)=f(n)}q\{n\in\mathbb{N}\,:\,{}^{*}f(n+\Omega)=f(n)\}\in q.

Theorem 32.
  1. 1.

    For all [f][f]\in{}^{*}\mathbb{R}, Δ[f]=0\Delta[f]=0 iff ff is qq-a.e. Ω\Omega-periodic.

  2. 2.

    For all [f][f]\in{}^{*}\mathbb{R}, if ff is qq-a.e. constant then Δ[f]=0\Delta[f]=0.

  3. 3.

    There exists f:f:\mathbb{N}\to\mathbb{R} such that Δ[f]=0\Delta[f]=0 but ff is not qq-a.e. constant.

Proof.

(1) and (2) are straightforward. For (3), let γ+\\gamma\in\mathbb{R}^{+}\backslash\mathbb{Q} and define g:(γ/2,γ/2)g:(-\gamma/2,\gamma/2)\to\mathbb{R} as follows. For x0x\not=0, let g(x)=1/|x|g(x)=\lfloor 1/|x|\rfloor where \lfloor\bullet\rfloor is the greatest integer function. Let g(0)=0g(0)=0. We claim f=grmγf=g\circ\mathrm{rm}_{\gamma} witnesses (3). By Lemma 24, {n:|rmγ(n)|<ϵ}q\{n\in\mathbb{N}\,:\,|\mathrm{rm}_{\gamma}(n)|<\epsilon\}\in q for all real ϵ>0\epsilon>0. Since limx0g(x)=\lim_{x\to 0}g(x)=\infty, it follows that [f][f] is infinite, thus ff is not qq-a.e. constant.

By Lemma 24, S={n:|rmγ(n)|<γ/4}qS=\{n\in\mathbb{N}\,:\,|\mathrm{rm}_{\gamma}(n)|<\gamma/4\}\in q. We claim Δf(n)=0\Delta f(n)=0 for all nSn\in S, whence Δ[f]=Δf=0\Delta[f]=\llbracket\Delta f\rrbracket=0. Let nSn\in S. Since γ\gamma is irrational, it follows that rmγ(n)\mathrm{rm}_{\gamma}(n) is not one of the jump discontinuity points of gg, thus δ>0\exists\delta>0 such that g(x)=g(rmγ(n))g(x)=g(\mathrm{rm}_{\gamma}(n)) whenever |xrmγ(n)|<δ|x-\mathrm{rm}_{\gamma}(n)|<\delta. By Lemma 24, T={m:|rmγ(m)|<min(δ,γ/4)}qT=\{m\in\mathbb{N}\,:\,|\mathrm{rm}_{\gamma}(m)|<\min(\delta,\gamma/4)\}\in q. We claim f(n+m)f(n)=0f(n+m)-f(n)=0 for all mTm\in T, which establishes Δf(n)=0\Delta f(n)=0. Let mTm\in T. Since |rmγ(n)|<γ/4|\mathrm{rm}_{\gamma}(n)|<\gamma/4 and |rmγ(m)|<γ/4|\mathrm{rm}_{\gamma}(m)|<\gamma/4, it follows that rmγ(n+m)=rmγ(n)+rmγ(m)\mathrm{rm}_{\gamma}(n+m)=\mathrm{rm}_{\gamma}(n)+\mathrm{rm}_{\gamma}(m). Thus |rmγ(n+m)rmγ(n)|=|rmγ(m)|<δ|\mathrm{rm}_{\gamma}(n+m)-\mathrm{rm}_{\gamma}(n)|=|\mathrm{rm}_{\gamma}(m)|<\delta, so f(n+m)=g(rmγ(n+m))=g(rmγ(n))=f(n)f(n+m)=g(\mathrm{rm}_{\gamma}(n+m))=g(\mathrm{rm}_{\gamma}(n))=f(n) by choice of δ\delta. ∎

4.1 An alternate proof and strengthening of Hindman’s theorem

The well-definedness in Corollary 30 can be used to prove Hindman’s theorem (Theorem 34 below). Formally, our proof of Hindman’s theorem is basically identical to the usual proof, but informally it appears different because all references to idempotent ultrafilters are hidden underneath the innocent-looking fact that the derivative of a constant function is zero. But the idempotency of the underlying ultrafilter was used to prove that the derivative in question is well-defined.

Lemma 33.

Let cc\in\mathbb{R}. If f1,,fk:f_{1},\ldots,f_{k}:\mathbb{N}\to\mathbb{R} are such that each [fi]=c[f_{i}]=c, then there exist arbitrarily large nn\in\mathbb{N} such that the following requirement holds. For each i{1,,k}i\in\{1,\ldots,k\}, fi(n+Ω)=fi(n)=c{}^{*}f_{i}(n+\Omega)=f_{i}(n)=c.

Proof.

By Corollary 30 each Δ[fi]=Δc=0\Delta[f_{i}]=\Delta c=0, thus nfi(n+Ω)fi(n)=0\llbracket n\mapsto{}^{*}f_{i}(n+\Omega)-f_{i}(n)\rrbracket=0, i.e. Si={n:fi(n+Ω)=fi(n)}qS_{i}=\{n\in\mathbb{N}\,:\,{}^{*}f_{i}(n+\Omega)=f_{i}(n)\}\in q. Since each [fi]=c[f_{i}]=c, each Ti={n:fi(n)=c}qT_{i}=\{n\in\mathbb{N}\,:\,f_{i}(n)=c\}\in q. Thus i(SiTi)q\bigcap_{i}(S_{i}\cap T_{i})\in q. The elements thereof witness the lemma. ∎

Theorem 34.

(Hindman’s Theorem) If f:f:\mathbb{N}\to\mathbb{R} has finite range, then there exists some cc\in\mathbb{R} and some infinite SS\subseteq\mathbb{N} such that for all finite nonempty XSX\subseteq S, f(X)=cf(\sum X)=c.

Proof.

By the maximality of ultrafilters, [f]=c[f]=c for some cc\in\mathbb{R}. For every finite XX\subseteq\mathbb{N}, define fX:f_{X}:\mathbb{N}\to\mathbb{R} by fX(n)=f((X)+n)f_{X}(n)=f((\sum X)+n) (note that f=ff_{\emptyset}=f).

Inductively, suppose we have defined n1<<nkn_{1}<\cdots<n_{k} (an empty list if k=0k=0) such that:

  1. 1.

    For each nonempty X{n1,,nk}X\subseteq\{n_{1},\ldots,n_{k}\}, f(X)=cf(\sum X)=c.

  2. 2.

    For each X{n1,,nk}X\subseteq\{n_{1},\ldots,n_{k}\}, [fX]=c[f_{X}]=c.

By Lemma 33, pick nk+1>max{n1,,nk}n_{k+1}>\max\{n_{1},\ldots,n_{k}\} such that

(*) for each X{n1,,nk}X\subseteq\{n_{1},\ldots,n_{k}\}, fX(nk+1+Ω)=fX(nk+1)=c{}^{*}f_{X}(n_{k+1}+\Omega)=f_{X}(n_{k+1})=c.

For each X{n1,,nk+1}X\subseteq\{n_{1},\ldots,n_{k+1}\} with nk+1Xn_{k+1}\in X, we have (by *) f(X)=fX\{nk+1}(nk+1)=cf(\sum X)=f_{X\backslash\{n_{k+1}\}}(n_{k+1})=c because X\{nk+1}{n1,,nk}X\backslash\{n_{k+1}\}\subseteq\{n_{1},\ldots,n_{k}\}. And for each X{n1,,nk}X\subseteq\{n_{1},\ldots,n_{k}\}, since (by *) fX(nk+1+Ω)=c{}^{*}f_{X}(n_{k+1}+\Omega)=c, it follows that fX{nk+1}(Ω)=c{}^{*}f_{X\cup\{n_{k+1}\}}(\Omega)=c, i.e., [fX{nk+1}]=c[f_{X\cup\{n_{k+1}\}}]=c. So n1<<nk+1{n_{1}<\cdots<n_{k+1}} also satisfy 1–2. By induction, we obtain n1<n2<n_{1}<n_{2}<\cdots with the above properties, which clearly proves the theorem. ∎

In the above proof, we proved more than was required. This leads to the following strengthening of Hindman’s theorem.

Theorem 35.

(Ω\Omega as universal Hindman number) If f:f:\mathbb{N}\to\mathbb{R} has finite range, then there exists some cc\in\mathbb{R} and some infinite SS\subseteq\mathbb{N} such that for all finite nonempty XS{Ω}X\subseteq S\cup\{\Omega\}, f(X)=c{}^{*}f(\sum X)=c.

Proof.

In the proof of Theorem 34, we proved the existence of cc\in\mathbb{R} and n1,n2,n_{1},n_{2},\ldots\in\mathbb{N} such that (1) for each finite nonempty X{n1,n2,}X\subseteq\{n_{1},n_{2},\ldots\}, f(X)=cf(\sum X)=c, and (2) for each finite X{n1,n2,}X\subseteq\{n_{1},n_{2},\ldots\}, [fX]=c[f_{X}]=c, where fX:f_{X}:\mathbb{N}\to\mathbb{R} is defined by fX(n)=f((X)+n)f_{X}(n)=f((\sum X)+n). For XX\subseteq\mathbb{N}, the statement [fX]=c[f_{X}]=c is equivalent to fX(Ω)=c{}^{*}f_{X}(\Omega)=c, which is equivalent to f((X{Ω}))=c{}^{*}f(\sum(X\cup\{\Omega\}))=c. ∎

In a heuristical sense, our proof of Theorem 34 seems to suggest that the idempotency requirement might be indispensable for Corollary 30: if the corollary could be proven using weaker assumptions about qq, then we would have a non-idempotent ultrafilter proof of Hindman’s theorem, which seems like it would be surprising. But of course, this is not a rigorous proof, and we do not actually know whether there exists any non-idempotent ultrafilter for which Corollary 30 holds.

4.2 Differentiating by [rmγ][\mathrm{rm}_{\gamma}]

In Theorem 26 we established a deep connection between our finite derivative and the usual derivative from elementary calculus. But the theorem was limited to γ\gamma-periodic functions for some irrational γ\gamma. At first glance, this seems very limiting. But since Δf\llbracket\Delta f\rrbracket only depends on f|f|_{\mathbb{N}}, the following lemma shows that the γ\gamma-periodic hypothesis in Theorem 26 does not limit the \bullet^{\prime}-like nature of the derivative from Corollary 30 at all.

Lemma 36.

Let γ+\\gamma\in\mathbb{R}^{+}\backslash\mathbb{Q}. For every f:f:\mathbb{R}\to\mathbb{R}, there exists a γ\gamma-periodic function f^:\hat{f}:\mathbb{R}\to\mathbb{R} such that f|=f^|f|_{\mathbb{N}}=\hat{f}|_{\mathbb{N}}.

Proof.

Define f^:\hat{f}:\mathbb{R}\to\mathbb{R} by

f^(x)={f(n)if x=mγ+n for some m,n,0in any other case.\hat{f}(x)=\begin{cases}f(n)&\mbox{if $x=m\gamma+n$ for some $m\in\mathbb{Z},n\in\mathbb{N}$,}\\ 0&\mbox{in any other case.}\end{cases}

Since γ\gamma is irrational, there do not exist distinct ways to write x=mγ+nx=m\gamma+n, thus f^(x)\hat{f}(x) is well-defined. Clearly f^\hat{f} has the desired properties. ∎

Thus if f:f:\mathbb{R}\to\mathbb{R} then Δ[f|]=Δ[f^|]=nΔf^(n)\Delta[f|_{\mathbb{N}}]=\Delta[\hat{f}|_{\mathbb{N}}]=\llbracket n\mapsto\Delta\hat{f}(n)\rrbracket encodes (by Theorem 26) information not about the derivative of ff but rather about the derivative of f^\hat{f}. Since [f|]=[f^|][f|_{\mathbb{N}}]=[\hat{f}|_{\mathbb{N}}] this means that Δ[f|]\Delta[f|_{\mathbb{N}}] does indeed encode information about the hyperreal [f|][f|_{\mathbb{N}}], just not necessarily about ff. For example, if f(x)=xf(x)=x for all xx\in\mathbb{R}, then f^\hat{f} is exotic and Δ[f|]/[rmγ]\Delta[f|_{\mathbb{N}}]/[\mathrm{rm}_{\gamma}] is far from the derivative x=1x^{\prime}=1 we might expect. Nonetheless, we can use Theorem 26 to obtain some other derivatives for which the familiar rules of elementary calculus apply.

Definition 37.

For every γ+\\gamma\in\mathbb{R}^{+}\backslash\mathbb{Q}, for every f:f:\mathbb{N}\to\mathbb{R}, define Dγf:D_{\gamma}f:{\subseteq}\mathbb{N}\to\mathbb{R} by Dγf(n)=st(Δf(n)/[rmγ])D_{\gamma}f(n)=\mathrm{st}(\Delta f(n)/[\mathrm{rm}_{\gamma}]) for all nn such that st(Δf(n)/[rmγ])\mathrm{st}(\Delta f(n)/[\mathrm{rm}_{\gamma}]) is defined.

Definition 38.

For every γ+\\gamma\in\mathbb{R}^{+}\backslash\mathbb{Q}, define Dγ:D_{\gamma}:{\subseteq}{}^{*}\mathbb{R}\to{}^{*}\mathbb{R} as follows. For every [f][f]\in{}^{*}\mathbb{R} (so f:f:\mathbb{N}\to\mathbb{R}), define Dγ[f]=[nDγf(n)]D_{\gamma}[f]=[n\in\mathbb{N}\mapsto D_{\gamma}f(n)] provided [nDγf(n)][n\in\mathbb{N}\mapsto D_{\gamma}f(n)] is defined.

Proposition 39.

Let γ+\\gamma\in\mathbb{R}^{+}\backslash\mathbb{Q}. The operator Dγ:D_{\gamma}:{\subseteq}{}^{*}\mathbb{R}\to{}^{*}\mathbb{R} of Definition 38 is well-defined.

Proof.

Suppose f,g:f,g:\mathbb{N}\to\mathbb{R} are such that [f]=[g][f]=[g]. By Theorem 29, Δf=Δg\llbracket\Delta f\rrbracket=\llbracket\Delta g\rrbracket. Thus, Sq\exists S\in q such that Δf(n)=Δg(n)\Delta f(n)=\Delta g(n) for all nSn\in S. Thus for all nSn\in S, Δf(n)/[rmγ]=Δg(n)/[rmγ]\Delta f(n)/[\mathrm{rm}_{\gamma}]=\Delta g(n)/[\mathrm{rm}_{\gamma}], and st(Δf(n)/[rmγ])\mathrm{st}(\Delta f(n)/[\mathrm{rm}_{\gamma}]) is defined iff st(Δg(n)/[rmγ])\mathrm{st}(\Delta g(n)/[\mathrm{rm}_{\gamma}]) is defined. Thus Dγ[f]=Dγ[g]D_{\gamma}[f]=D_{\gamma}[g] (and either both sides are defined, or both sides are undefined). ∎

Theorem 40.

Let γ+\\gamma\in\mathbb{R}^{+}\backslash\mathbb{Q}. If f:f:\mathbb{R}\to\mathbb{R} is differentiable on (γ/2,γ/2)(-\gamma/2,\gamma/2), then Dγ[frmγ]=[frmγ]D_{\gamma}[f\circ\mathrm{rm}_{\gamma}]=[f^{\prime}\circ\mathrm{rm}_{\gamma}].

Proof.

Let S=\{(k+12)γ:k}S=\mathbb{R}\backslash\{(k+\frac{1}{2})\gamma\,:\,k\in\mathbb{Z}\}. Define rm¯γ\overline{\mathrm{rm}}_{\gamma} the same way rmγ\mathrm{rm}_{\gamma} was defined (Definition 23) except define it for all xSx\in S instead of only xx\in\mathbb{N}. Clearly rm¯γ|=rmγ|\overline{\mathrm{rm}}_{\gamma}|_{\mathbb{N}}=\mathrm{rm}_{\gamma}|_{\mathbb{N}} and rm¯γ\overline{\mathrm{rm}}_{\gamma} is differentiable with rm¯γ(x)=1\overline{\mathrm{rm}}_{\gamma}^{\prime}(x)=1 on SS. Since ff is differentiable on (γ/2,γ/2)=rm¯γ(S)(-\gamma/2,\gamma/2)=\overline{\mathrm{rm}}_{\gamma}(S), we can apply the chain rule and see f(rm¯γ(x))=f(rm¯γ(x))rm¯γ(x)=f(rm¯γ(x))f(\overline{\mathrm{rm}}_{\gamma}(x))^{\prime}=f^{\prime}(\overline{\mathrm{rm}}_{\gamma}(x))\overline{\mathrm{rm}}_{\gamma}^{\prime}(x)=f^{\prime}(\overline{\mathrm{rm}}_{\gamma}(x)) on SS (*). For every nn\in\mathbb{N},

Dγ(frmγ)(n)\displaystyle D_{\gamma}(f\circ\mathrm{rm}_{\gamma})(n) =st(Δ(frmγ)(n)/[rmγ])\displaystyle=\mathrm{st}(\Delta(f\circ\mathrm{rm}_{\gamma})(n)/[\mathrm{rm}_{\gamma}]) (Definition 37)
=st(Δ(frm¯γ)(n)/[rmγ])\displaystyle=\mathrm{st}(\Delta(f\circ\overline{\mathrm{rm}}_{\gamma})(n)/[\mathrm{rm}_{\gamma}]) (rmγ|=rm¯γ|\mathrm{rm}_{\gamma}|_{\mathbb{N}}=\overline{\mathrm{rm}}_{\gamma}|_{\mathbb{N}})
=(frm¯γ)(n)\displaystyle=(f\circ\overline{\mathrm{rm}}_{\gamma})^{\prime}(n) (Theorem 26)
=f(rm¯γ(n))\displaystyle=f^{\prime}(\overline{\mathrm{rm}}_{\gamma}(n)) (By *)
=f(rmγ(n)),\displaystyle=f^{\prime}(\mathrm{rm}_{\gamma}(n)), (rmγ|=rm¯γ|\mathrm{rm}_{\gamma}|_{\mathbb{N}}=\overline{\mathrm{rm}}_{\gamma}|_{\mathbb{N}})

so [nDγ(frmγ)(n)]=[nf(rmγ(n))][n\mapsto D_{\gamma}(f\circ\mathrm{rm}_{\gamma})(n)]=[n\mapsto f^{\prime}(\mathrm{rm}_{\gamma}(n))], i.e., Dγ[frmγ]=[frmγ]D_{\gamma}[f\circ\mathrm{rm}_{\gamma}]=[f^{\prime}\circ\mathrm{rm}_{\gamma}]. ∎

For example,

Dγ(e[rmγ]+[rmγ]3+cos2[rmγ])=e[rmγ]+3[rmγ]22sin2[rmγ].D_{\gamma}(e^{[\mathrm{rm}_{\gamma}]}+[\mathrm{rm}_{\gamma}]^{3}+\cos 2[\mathrm{rm}_{\gamma}])=e^{[\mathrm{rm}_{\gamma}]}+3[\mathrm{rm}_{\gamma}]^{2}-2\sin 2[\mathrm{rm}_{\gamma}].

Thus, DγD_{\gamma} follows the familiar derivative rules from elementary calculus as long as we consider functions not of the continuous variable xx\in\mathbb{R}, but rather of the discrete variable rmγ(n)rmγ()\mathrm{rm}_{\gamma}(n)\in\mathrm{rm}_{\gamma}(\mathbb{N}). In a sense, one can “differentiate by [rmγ][\mathrm{rm}_{\gamma}]”; it might even be tempting to write DγD_{\gamma} as d/d[rmγ]d/d[\mathrm{rm}_{\gamma}]. This is spiritually similar to how the prime numbers play the role of dimensions, and how one differentiates by prime numbers, in Jeffries’ paper [10] in the Notices.

Acknowledgments

We gratefully acknowledge Arthur Paul Pedersen and the reviewers and the editor for comments and feedback.

References