Samuel Allen Alexander, Bryan Dawson
MSC 2020: primary 26E35, 26A24; secondary 54D80, 30D20
Hyperreal differentiation with an idempotent ultrafilter
Abstract
In the hyperreals constructed using a free ultrafilter on $\mathbb{R}$, where $[f]$ is the hyperreal represented by $f:\mathbb{R}\to\mathbb{R}$, it is tempting to define a derivative operator by $[f]'=[f']$, but unfortunately this is not generally well-defined. We show that if the ultrafilter in question is idempotent and contains $(0,\epsilon)$ for arbitrarily small real $\epsilon$, then the desired derivative operator is well-defined for all $f$ such that $[f']$ exists. We also introduce a hyperreal variation of the derivative from finite calculus, and show that it has surprising relationships to the standard derivative. We give a new proof of Hindman’s theorem, and we prove a stronger theorem.
Keywords: hyperreals, idempotent ultrafilters, derivatives, finite calculus, Hindman’s theorem
1 Introduction
There is a long tradition [13, 2, 16, 4, 14, 11, 5, 12, 6] of attempting to differentiate numbers in various ways. Derivatives of numbers drew considerable attention when Jeffries’ paper on the subject appeared in the Notices of the AMS late last year [10]; almost simultaneously (and apparently independently), the survey by Tossavainen et al. appeared in the College Mathematics Journal [15].
Why should the reader care about differentiating numbers? In general, any time a new theory is introduced, it is natural to seek numerical structures satisfying that theory: thus, when the theory of groups is introduced, it is natural to introduce examples like and . Theories about elementary calculus functions, in languages including the unary function symbol , were originally modeled by structures whose universes consisted of elementary calculus functions, not numbers. We can at least try to find numerical models for these theories. The act of interpreting in a structure whose universe is a number system is the act of “differentiating numbers”. We are hopeful that generalizing models of elementary calculus function theories could eventually be fruitful just like generalizing models of permutation sets led to the abstract theory of groups.
When it comes to numerically modeling theories, different theories might require different number systems: both and are rings, but only the latter is a field. By gaining knowledge about which number systems are needed for which theories, we gain insight into those theories. We hope that a greater knowledge about which number systems are needed to model various subtheories of elementary calculus functions, will eventually give us insight into those subtheories.
If the only axiom we care about is the Leibniz rule (and the nontriviality axiom ), we can interpret on so as to satisfy that. That is the approach of [2, 16]. But their does not even satisfy the linearity axiom. Our interpretation (in Section 3) of on a subset of the hyperreals will satisfy far more axioms of the theory of elementary calculus functions. And in Section 3.2, we will introduce a stricter subset of the hyperreals where not only can be elegantly interpreted, but as well (in other words, there is an elegant way to define the “composition” of two numbers there), in such a way as to numerically satisfy even the chain rule.
The key idea behind our numerical interpretation of $'$ is to commute the derivative operation with the operation of taking a function’s equivalence class in the hyperreals; in other words, to define $[f]'=[f']$ (we will spell out the details below). Unfortunately, this is not well-defined in general. However, the idea can be salvaged in several different ways, by making use of certain idempotent ultrafilters. In Section 4 we will use the same approach to well-define a hyperreal variation of the derivative from finite calculus, and as an application of that, we will give a new proof of Hindman’s theorem and also strengthen said theorem.
2 Preliminaries
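Throughout the paper, we write $\beta S$ for the set of ultrafilters on $S$.
Definition 1.
(Hyperreals) For each free $p\in\beta\mathbb{R}$, let ${}^*\mathbb{R}_p$ be the hyperreals constructed using $p$. For every $f:\mathbb{R}\to\mathbb{R}$, let $[f]_p$ be the hyperreal represented by $f$. If $p$ is clear from context, we will write ${}^*\mathbb{R}$ and $[f]$ for ${}^*\mathbb{R}_p$ and $[f]_p$, respectively.
Convention 2.
If $p\in\beta\mathbb{R}$ is free and $f$ is a function with codomain $\mathbb{R}$ and with domain $\mathrm{dom}(f)\subseteq\mathbb{R}$, we will write $[f]_p$ for $[\hat f]_p$, where $\hat f:\mathbb{R}\to\mathbb{R}$ is the extension of $f$ defined by $\hat f(x)=0$ for all $x\notin\mathrm{dom}(f)$. If $\mathrm{dom}(f)\notin p$, we say that $[f]_p$ does not exist. If $p$ is clear from context, we will write $[f]$ for $[f]_p$.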
Definition 3.
For each , let be the nonstandard extension of . Let be the hyperreal represented by the identity function.
For every , there are two ways of viewing in nonstandard analysis. It can be viewed as the number or as the function . The two are related via . Namely: .
Unfortunately, the following proposition shows that the idea of defining $[f]'=[f']$ does not work in general.
Proposition 4.
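(Ill-definedness)
1. There exist a free $p\in\beta\mathbb{R}$ and everywhere-differentiable $f,g:\mathbb{R}\to\mathbb{R}$ such that $[f]=[g]$ but $[f']\neq[g']$.
2. For every free $p\in\beta\mathbb{R}$, for all $f$ such that $[f']$ exists, there exists $g$ such that $[f]=[g]$ but $[g']$ does not exist.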
Proof.
(1) Let such that . The claim is witnessed by and .
(2) Let be dense and co-dense. Assume (if not, then and a similar argument applies). The claim is witnessed by . ∎
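To make part (1) concrete, here is one possible pair of witnesses. It is worked out under the assumption (ours, for illustration, and not necessarily the authors’ choice) that the free ultrafilter $p$ on $\mathbb{R}$ contains $\mathbb{Z}$; such ultrafilters exist because the cofinite subsets of $\mathbb{Z}$ generate a filter on $\mathbb{R}$.

```latex
\begin{aligned}
f(x) &= \sin(2\pi x), & g(x) &= 0,\\
f(n) &= 0 = g(n) \ \text{for all } n\in\mathbb{Z}, & \text{so } [f] &= [g],\\
f'(x) &= 2\pi\cos(2\pi x), & g'(x) &= 0,\\
f'(n) &= 2\pi \neq 0 = g'(n) \ \text{for all } n\in\mathbb{Z}, & \text{so } [f'] &= 2\pi \neq 0 = [g'].
\end{aligned}
```

Both functions are everywhere differentiable, yet the candidate definition would assign two different derivatives to the same hyperreal.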
In light of Proposition 4, we cannot expect the definition $[f]'=[f']$ to work for every free $p$ even if we restrict our attention to everywhere-differentiable $f$; and if we do not so restrict our attention, then we can expect the definition to fail for every $p$. We will show that if we restrict attention to those $f$ such that $[f']$ exists, then the definition does work provided $p$ is idempotent and contains $(0,\epsilon)$ for every real $\epsilon>0$.
3 Differentiating hyperreals $[f]$ such that $[f']$ exists
Definition 5.
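(Idempotent ultrafilters on $\mathbb{R}$)
1. For each $p\in\beta\mathbb{R}$ and any $S\subseteq\mathbb{R}$, $S-p$ is defined to be $\{x\in\mathbb{R} : S-x\in p\}$, where $S-x=\{y-x : y\in S\}$.
2. An ultrafilter $p\in\beta\mathbb{R}$ is idempotent if $p=\{S\subseteq\mathbb{R} : S-p\in p\}$.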
Definition 6.
By $0^+$ we mean the set of ultrafilters $p\in\beta\mathbb{R}$ such that $p$ satisfies the following requirement. For every real $\epsilon>0$, the open interval $(0,\epsilon)\in p$.
Lemma 7.
$0^+$ contains an idempotent ultrafilter.
Proof.
By Lemma 13.29(a) and Theorem 13.31 of [9]. ∎
Clearly an ultrafilter in $0^+$ is free. The following lemma illustrates the power of ultrafilters in $0^+$.
Lemma 8.
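Let $p\in 0^+$. If $f:\mathbb{R}\to\mathbb{R}$ is continuous at $0$ then $[f]\approx f(0)$.
Proof.
Let $\epsilon>0$ be real. By continuity of $f$ at $0$, $\exists\delta>0$ such that $|f(x)-f(0)|<\epsilon$ whenever $|x|<\delta$. Since $p\in 0^+$, $(0,\delta)\in p$. Thus, $f$ is within $\epsilon$ of $f(0)$ ultrafilter often, so $[f]$ is within $\epsilon$ of $f(0)$. ∎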
For the rest of this section, we fix an idempotent $p\in 0^+$. The following theorem shows that this suffices to make the definition $[f]'=[f']$ well-defined if we restrict it to functions $f$ such that $[f']$ exists. Note that since $p\in 0^+$, the existence of $[f']$ is equivalent to the statement that for all real , there exists real such that exists.
Theorem 9.
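For all $f,g$ such that $[f']$ and $[g']$ exist, if $[f]=[g]$ then $[f']=[g']$.
Proof.
Since $[f]=[g]$, there is some $S_0\in p$ such that $f=g$ on $S_0$. Let $S=S_0\cap\mathrm{dom}(f')\cap\mathrm{dom}(g')$. Existence of $[f']$ and $[g']$ means $\mathrm{dom}(f')\in p$ and $\mathrm{dom}(g')\in p$, thus $S\in p$. Since $p$ is idempotent, $T=\{x\in\mathbb{R} : S-x\in p\}\in p$, where $S-x=\{y-x : y\in S\}$. Thus $S\cap T\in p$. To show $[f']=[g']$, we will show that $f'=g'$ on $S\cap T$. Let $x\in S\cap T$. In particular, $S-x\in p$. We must show $f'(x)=g'(x)$.
Claim: For all nonzero $h\in S-x$, $\frac{f(x+h)-f(x)}{h}=\frac{g(x+h)-g(x)}{h}$. Indeed, let $h\in S-x$ be nonzero. This means $h=y-x$ for some $y\in S$. Compute:
\begin{align*}
\frac{f(x+h)-f(x)}{h} &= \frac{f(x+(y-x))-f(x)}{h} && (h=y-x)\\
&= \frac{f(y)-f(x)}{h} && \text{(Algebra)}\\
&= \frac{g(y)-g(x)}{h} && (f=g \text{ on } S)\\
&= \frac{g(x+(y-x))-g(x)}{h} && \text{(Algebra)}\\
&= \frac{g(x+h)-g(x)}{h} && (h=y-x)
\end{align*}
proving the claim.
Since $S-x\in p$ and $p\in 0^+$, it follows that for all real $\epsilon>0$, $(S-x)\cap(0,\epsilon)\in p$, thus $(S-x)\cap(0,\epsilon)$ is nonempty. So $S-x$ contains $h$ arbitrarily near $0$. Since $x\in S$, $f'(x)$ and $g'(x)$ exist. Since both limits exist and since there are $h$ arbitrarily near $0$ such that $\frac{f(x+h)-f(x)}{h}=\frac{g(x+h)-g(x)}{h}$, the limits must be equal, that is, $f'(x)=g'(x)$. ∎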
Corollary 10.
Let $D=\{[f] : [f'] \text{ exists}\}$. The derivative operation $':D\to{}^*\mathbb{R}$ defined by $[f]'=[f']$ (for all $f$ such that $[f']$ exists) is well-defined.
We do not yet know whether $D$ (from Corollary 10) is a proper subset of ${}^*\mathbb{R}$. In other words: can every hyperreal be written in the form $[f]$ where $[f']$ exists? Does this depend on $p$?
If is as in Corollary 10 then it follows that the structure satisfies every positive formula in the theory of elementary calculus functions in the language , where, by positive formula, we mean a formula that can be built up without using or . In particular this includes the Leibniz rule axiom, linearity, and the power rule schema. The structure also satisfies the nontriviality axiom . All this remains true if constant symbols for other individual functions (such as and ), besides just the identity function, are added to the language, interpreted in by the hyperreals represented thereby (such as and ), provided those hyperreals’ derivatives exist.
In terms of nonstandard extensions, Theorem 9 says that there is a well-defined map which sends every to . For example, this map sends to .
One might intuitively wonder whether $[f]'=0$ implies $f$ is constant (at least ultrafilter often). The following proposition provides a counterexample.
Proposition 11.
There exists $f:\mathbb{R}\to\mathbb{R}$ such that $[f]'=0$ but for every $r\in\mathbb{R}$, $[f]\neq r$.
Proof.
By Theorem 1.14 of [1], there exist disjoint Cantor sets on . An ultrafilter cannot contain two disjoint sets, so there is some Cantor set on with . Let be the devil’s staircase based on . Then for all so , but is increasing, and is not flat on for any , implying (since ) that for all . ∎
We have proven Corollary 10 under the assumption that $p$ is idempotent and in $0^+$. Similar reasoning would hold if $p$ were idempotent and in $0^-$ (i.e., if $p$ were required to contain $(-\epsilon,0)$ for every positive real $\epsilon$). We currently do not know whether Corollary 10 holds for any other type of ultrafilter.
3.1 Differential equations and the secant method
Since takes (a subset of) to , one can attempt to solve (or approximately solve) differential equations by using the secant method from numerical analysis, which is traditionally only used to solve non-differential equations. This is interesting because as far as we know, the secant method has not previously been applicable to differential equations. We illustrate this with an example in which the method finds a correct solution in one step.
Example 12.
Solve the differential equation using the secant method, with initial guesses and .
Solution. Define by whenever is defined. We desire a solution of the equation . Since , it follows that , so any such solution will yield a solution to the differential equation (at least -a.e.). Compute:
(initial guess ) | ||||
(initial guess ) | ||||
(Secant method) | ||||
it follows that . This yields a solution to the original differential equation. ∎
We have not yet found any examples where this approach is more practical than other approximate methods in differential equations. We hope that either such examples can be found later, or, if not, that the lack of such examples might provide insight into limitations of the secant method itself. The point of this subsection is not so much to focus on the secant method, but rather to illustrate the kind of things we hope might be possible by numerically interpreting the theory of elementary calculus functions.
3.2 The well-definability of composition on a subset of the hyperreals
The work we have presented above is relevant to numerically modeling subtheories of the theory of elementary calculus functions in a language containing a unary function symbol for differentiation. But one key axiom is missing from that theory, namely the chain rule, since the chain rule also involves a binary function symbol for composition. In this section, we introduce a subset of the hyperreals suitable for numerically modeling subtheories of the theory of elementary calculus functions in a language containing and . The following definition is motivated by the theory of complex analysis.
Definition 13.
(Entire numbers)
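1. A function $f:\mathbb{R}\to\mathbb{R}$ is entire if $f$ is infinitely differentiable at $0$ and $\forall x\in\mathbb{R}$, $f(x)=\sum_{n=0}^{\infty}\frac{f^{(n)}(0)}{n!}x^n$.
2. A hyperreal number is entire if it can be written as $[f]$ for some entire $f$.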
Proposition 14.
For all entire $f,g$, if $[f]=[g]$ then $f=g$.
Proof.
In the same way that we can think of the differential equation as describing the family of all circles centered at the origin, we can think of the equation as describing the family of all entire functions. The latter is much worse behaved than the former: no two circles centered at the origin ever intersect each other, but for all there exist distinct entire functions whose graphs intersect at . In this sense, we can say that the real plane contains no “critical points” of , but that every point of the real plane is a “critical point” of . We can interpret Proposition 14 as saying that in the hyperreal plane, every point on the vertical line is a “non-critical point” of the family of all entire functions, for the proposition says that no two distinct entire function graphs intersect anywhere on this vertical line.
Corollary 15.
The operation $\circ$ defined on the entire numbers by $[f]\circ[g]=[f\circ g]$ (whenever $f$ and $g$ are entire) is well-defined.
3.3 The approximately space-filling nature of differentiation
Clearly $'$ is linear over $\mathbb{R}$, in the sense that $(ax+by)'=ax'+by'$ for all $a,b\in\mathbb{R}$ and $x,y\in{}^*\mathbb{R}$, if the derivatives in question exist. So how badly behaved could the graph of $'$ be? We will show that it is approximately space-filling, in the following sense.
Definition 16.
A subset $G\subseteq({}^*\mathbb{R})^2$ is approximately space-filling if for all $(a,b)\in\mathbb{R}^2$, there exists some $(x,y)\in G$ such that $x\approx a$ and $y\approx b$.
It is not hard to find functions which are linear over and whose graphs are approximately space-filling. For example, if we consider as a vector space over and let be a basis for it with an infinitesimal basis element , then the projection is one such function. Nevertheless, we find it interesting (even if the proof is quite simple) that our derivative operator also has these properties.
Proposition 17.
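The hyperreal graph of $'$, i.e., the set of all $(x,y)\in({}^*\mathbb{R})^2$ such that $x'=y$, is approximately space-filling.
Proof.
For any $(a,b)\in\mathbb{R}^2$, let $f:\mathbb{R}\to\mathbb{R}$ be continuously differentiable everywhere with $f(0)=a$ and $f'(0)=b$. By Lemma 8, $[f]\approx a$ and $[f']\approx b$. ∎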
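As a concrete instance of this proof, one can take the affine function through the target point. This is our illustration; it assumes, as in Section 3, that the ultrafilter contains $(0,\epsilon)$ for every real $\epsilon>0$, and it writes $[\mathrm{id}]$ for the hyperreal represented by the identity function, a notation chosen here.

```latex
\begin{aligned}
f(x) &= a + bx, & f'(x) &= b,\\
[f] &= a + b\,[\mathrm{id}], & [f]' &= [f'] = b.
\end{aligned}
```

Since $0<[\mathrm{id}]<\epsilon$ for every real $\epsilon>0$, the hyperreal $[\mathrm{id}]$ is a positive infinitesimal, so $[f]\approx a$ while $[f]'$ equals $b$ exactly.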
Having established that the hyperreal graph is approximately space-filling, we will proceed to state two additional results about approximately space-filling sets in general, which therefore apply in particular to said graph.
Proposition 18.
If is approximately space-filling, then for any , there exists some such that
In particular, for any , there exists some such that
Proof.
Straightforward. ∎
Proposition 19.
Suppose is approximately space-filling where is the graph (over ) of for some . Let be the graph of the equation for some everywhere-differentiable . If is as in Proposition 18, then has the same slope as in the following sense: for every , for every real , there exists some real such that for all , if and then . In particular, this is true when .
Proof.
Straightforward. ∎
4 A variant of finite calculus using an idempotent ultrafilter on $\mathbb{N}$
In so-called finite calculus, one considers the “derivative” of , see Section 2.6 of [8]. In this section, we will investigate a variant of this derivative, namely where is the canonical hyperreal in the hyperreals constructed using an idempotent ultrafilter on (note that we omit from ). We will show that this finite derivative has unexpected connections to the standard derivative, and that the equivalence class (in an iterated ultrapower construction) of is well-defined as a function of (we elaborate what we mean by “in an iterated ultrapower construction” in Remark 28 below).
The following Definitions 20 and 22, and Lemma 21, are $\mathbb{N}$-focused analogues of the $\mathbb{R}$-focused Definitions 5 and 1, and Lemma 7 above, respectively. We prefer this slight redundancy (instead of defining everything in higher generality) for the sake of concreteness.
Definition 20.
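(Idempotent ultrafilters on $\mathbb{N}$)
1. If $p\in\beta\mathbb{N}$ and $S\subseteq\mathbb{N}$, let $S-p=\{n\in\mathbb{N} : S-n\in p\}$, where $S-n=\{m-n : m\in S,\ m>n\}$.
2. An ultrafilter $p$ on $\mathbb{N}$ is idempotent if $p=\{S\subseteq\mathbb{N} : S-p\in p\}$.
Lemma 21.
Idempotent ultrafilters on $\mathbb{N}$ exist and are free.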
Proof.
See [9]. ∎
Definition 22.
If $p$ is a free ultrafilter on $\mathbb{N}$, we write ${}^*\mathbb{R}_p$ for the hyperreal numbers constructed using $p$ in the usual way, and for each $f:\mathbb{N}\to\mathbb{R}$ we write $[f]_p$ for the hyperreal represented by $f$. If $p$ is clear from context, we will write ${}^*\mathbb{R}$ for ${}^*\mathbb{R}_p$, $[f]$ for $[f]_p$, and ${}^*f$ for the nonstandard extension of $f$.
For the remainder of the paper, we fix an idempotent ultrafilter $p$ on $\mathbb{N}$. Lemma 24 below will replace $0^+$.
Definition 23.
For each , we define as follows (we pronounce as “remainder”). For each , we define where is the closest integer multiple of to (there is a unique such closest integer multiple of because is irrational and ).
Note that the following lemma depends on being idempotent.
Lemma 24.
Let . For every real , .
Proof.
Follows from Theorem 7.2 of [3]. ∎
For the rest of the section, let (the canonical element of ).
Definition 25.
(Finite derivative) For each , we define the finite derivative by .
The following theorem shows that, when restricted to -periodic functions for a fixed irrational real number , is a constant multiple of (up to an infinitesimal error), at least where exists. And even when does not exist, (times the same constant multiple) sometimes still provides information about the slope in question.
Theorem 26.
Let , and let be -periodic. Let .
1. If exists, then .
2. If fails to exist because diverges to (resp. ) then is infinite (resp. negative infinite).
3. Let be -periodic. If both and fail to exist because and both diverge to but diverges to faster (i.e., there exists such that whenever ) then . Similarly for .
Proof.
(1) Fix such that exists. Define by , then . To show , it suffices to show that for every real , . Fix . By definition of , there is some such that whenever . Let . By Lemma 24, . We claim that for all . Let . Then
(Def. of ) | ||||
( is -periodic) | ||||
() |
The proofs of (2) and (3) are similar to (and easier than) the proof of (1). ∎
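The mechanism behind part (1) can be checked numerically. The sketch below is only an illustration under stated assumptions: the period `TAU` (taken to be the square root of 2), the sample function `f`, and all helper names are hypothetical choices of ours, and a finite search for integers with small remainder stands in for "ultrafilter often" (no actual ultrafilter is computed). For a periodic `f`, the difference `f(x + n) - f(x)` divided by the remainder of `n` (that is, `n` minus the closest integer multiple of the period, as in Definition 23) should approximate `f'(x)` whenever that remainder is small.

```python
import math

TAU = math.sqrt(2.0)  # assumed irrational period (illustrative choice)


def rem(n, tau=TAU):
    """Return n minus the closest integer multiple of tau."""
    return n - round(n / tau) * tau


def f(x, tau=TAU):
    """A sample tau-periodic function (illustrative choice)."""
    return math.sin(2.0 * math.pi * x / tau)


def f_prime(x, tau=TAU):
    """Exact derivative of the sample function f."""
    return (2.0 * math.pi / tau) * math.cos(2.0 * math.pi * x / tau)


def difference_quotient(x, n):
    """(f(x + n) - f(x)) / rem(n); by periodicity f(x + n) = f(x + rem(n))."""
    return (f(x + n) - f(x)) / rem(n)


def small_remainder_indices(eps, limit):
    """Positive integers n <= limit whose remainder is smaller than eps."""
    return [n for n in range(1, limit + 1) if abs(rem(n)) < eps]


if __name__ == "__main__":
    x = 0.3
    for n in small_remainder_indices(1e-3, 2000):
        print(n, rem(n), difference_quotient(x, n), f_prime(x))
```

The integers found this way are far from 0, yet the quotient tracks the ordinary derivative, because periodicity turns the large shift by `n` into a small shift by `rem(n)`.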
In particular, (with infinitesimal error) provided and exist and is -periodic; thus satisfies a chain rule. Contrast this with Graham, Knuth and Patashnik’s claim that “there’s no corresponding chain rule of finite calculus” [8].
Without the idempotency requirement, it would be possible to find free so as to falsify Theorem 26. For example, one could choose such that . Then for any -periodic such that and for all , it would follow that , implying , making the conclusion of Theorem 26 (part 1) impossible at .
As in Section 3, we would like to transform the derivative operation into a well-defined derivative on hyperreals. But since is itself hyperreal, we will need to be careful.
Definition 27.
For every , we write for the equivalence class of modulo the equivalence relation defined by declaring that are equivalent iff . We identify each with .
Remark 28.
The following theorem will allow us to well-define .
Theorem 29.
Let . If then .
Proof.
Assume . We must show . Since , there is some such that for all . Since is idempotent, . Thus . We claim for all . Fix any such . By construction, . By a similar argument as in the proof of Theorem 9, for all , . Thus , in other words , in other words . ∎
Corollary 30.
The finite derivative defined by is well-defined.
One might intuitively expect that should imply that is constant (at least ultrafilter often); we present a counterexample disproving this intuition and replacing it with a characterization related to periodicity.
Definition 31.
Let . We say is -a.e. constant if such that . We say is -a.e. -periodic if .
Theorem 32.
1. For all , iff is -a.e. -periodic.
2. For all , if is -a.e. constant then .
3. There exists such that but is not -a.e. constant.
Proof.
(1) and (2) are straightforward. For (3), let and define as follows. For , let where is the greatest integer function. Let . We claim witnesses (3). By Lemma 24, for all real . Since , it follows that is infinite, thus is not -a.e. constant.
4.1 An alternate proof and strengthening of Hindman’s theorem
The well-definedness in Corollary 30 can be used to prove Hindman’s theorem (Theorem 34 below). Formally, our proof of Hindman’s theorem is basically identical to the usual proof, but informally it appears different because all references to idempotent ultrafilters are hidden underneath the innocent-looking fact that the derivative of a constant function is zero. But the idempotency of the underlying ultrafilter was used to prove that the derivative in question is well-defined.
Lemma 33.
Let . If are such that each , then there exist arbitrarily large such that the following requirement holds. For each , .
Proof.
By Corollary 30 each , thus , i.e. . Since each , each . Thus . The elements thereof witness the lemma. ∎
Theorem 34.
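(Hindman’s Theorem) If $c:\mathbb{N}\to\mathbb{N}$ has finite range, then there exists some $i\in\mathbb{N}$ and some infinite $S\subseteq\mathbb{N}$ such that for all finite nonempty $X\subseteq S$, $c\left(\sum X\right)=i$.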
Proof.
By the maximality of ultrafilters, for some . For every finite , define by (note that ).
Inductively, suppose we have defined (an empty list if ) such that:
1. For each nonempty , .
2. For each , .
By Lemma 33, pick such that
() for each , . |
For each with , we have (by ) because . And for each , since (by ) , it follows that , i.e., . So also satisfy 1–2. By induction, we obtain with the above properties, which clearly proves the theorem. ∎
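The conclusion of Hindman’s theorem can be sanity-checked on a finite truncation. The sketch below is an illustration only: the coloring (parity of the number of trailing zero bits), the candidate set (powers of 4), and the function names are hypothetical choices of ours, and checking finitely many sums does not prove anything about an infinite set. It simply computes every sum of a nonempty subset and collects the colors that occur.

```python
from itertools import combinations


def coloring(n):
    """Hypothetical 2-coloring: parity of the number of trailing zero bits of n."""
    trailing_zeros = (n & -n).bit_length() - 1
    return trailing_zeros % 2


def finite_sums(elements):
    """Return the set of sums of all nonempty subsets of a finite collection."""
    sums = set()
    for size in range(1, len(elements) + 1):
        for subset in combinations(elements, size):
            sums.add(sum(subset))
    return sums


def colors_of_finite_sums(elements, c=coloring):
    """Return the set of colors taken by the finite sums of the elements."""
    return {c(s) for s in finite_sums(elements)}


# Finite truncation of the candidate set {4**k : k >= 0}.
truncated_s = [4 ** k for k in range(8)]

if __name__ == "__main__":
    print(colors_of_finite_sums(truncated_s))
    print(colors_of_finite_sums([1, 2, 3]))
```

A sum of distinct powers of 4 has its lowest set bit at an even position, so every finite sum receives color 0; by contrast, the set {1, 2, 3} already produces both colors.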
In the above proof, we proved more than was required. This leads to the following strengthening of Hindman’s theorem.
Theorem 35.
( as universal Hindman number) If has finite range, then there exists some and some infinite such that for all finite nonempty , .
Proof.
In the proof of Theorem 34, we proved the existence of and such that (1) for each finite nonempty , , and (2) for each finite , , where is defined by . For , the statement is equivalent to , which is equivalent to . ∎
Heuristically, our proof of Theorem 34 suggests that the idempotency requirement might be indispensable for Corollary 30: if the corollary could be proven using weaker assumptions about $p$, then we would have a proof of Hindman’s theorem that uses a non-idempotent ultrafilter, which would be surprising. Of course, this is not a rigorous argument, and we do not know whether there exists any non-idempotent ultrafilter for which Corollary 30 holds.
4.2 Differentiating by
In Theorem 26 we established a deep connection between our finite derivative and the usual derivative from elementary calculus. But the theorem was limited to -periodic functions for some irrational . At first glance, this seems very limiting. But since only depends on , the following lemma shows that the -periodic hypothesis in Theorem 26 does not limit the -like nature of the derivative from Corollary 30 at all.
Lemma 36.
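Let $\tau\in\mathbb{R}\setminus\mathbb{Q}$. For every $f:\mathbb{N}\to\mathbb{R}$, there exists a $\tau$-periodic function $g:\mathbb{R}\to\mathbb{R}$ such that $g|_{\mathbb{N}}=f$.
Proof.
Define $g:\mathbb{R}\to\mathbb{R}$ by
\[
g(x)=\begin{cases} f(n) & \text{if } x=n+k\tau \text{ for some } n\in\mathbb{N},\ k\in\mathbb{Z},\\ 0 & \text{otherwise.}\end{cases}
\]
Since $\tau$ is irrational, there do not exist distinct ways to write $x=n+k\tau$, thus $g$ is well-defined. Clearly $g$ has the desired properties. ∎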
Thus if then encodes (by Theorem 26) information not about the derivative of but rather about the derivative of . Since this means that does indeed encode information about the hyperreal , just not necessarily about . For example, if for all , then is exotic and is far from the derivative we might expect. Nonetheless, we can use Theorem 26 to obtain some other derivatives for which the familiar rules of elementary calculus apply.
Definition 37.
For every , for every , define by for all such that is defined.
Definition 38.
For every , define as follows. For every (so ), define provided is defined.
Proposition 39.
Let . The operator of Definition 38 is well-defined.
Proof.
Suppose are such that . By Theorem 29, . Thus, such that for all . Thus for all , , and is defined iff is defined. Thus (and either both sides are defined, or both sides are undefined). ∎
Theorem 40.
Let . If is differentiable on , then .
Proof.
For example,
Thus, follows the familiar derivative rules from elementary calculus as long as we consider functions not of the continuous variable , but rather of the discrete variable . In a sense, one can “differentiate by ”; it might even be tempting to write as . This is spiritually similar to how the prime numbers play the role of dimensions, and how one differentiates by prime numbers, in Jeffries’ paper [10] in the Notices.
Acknowledgments
We gratefully acknowledge Arthur Paul Pedersen and the reviewers and the editor for comments and feedback.
References
- [1] P Bankston, R J McGovern, Topological partitions, General Topology and its Applications 10 (1979) 215–229
- [2] E Barbeau, Remarks on an arithmetic derivative, Canadian Mathematical Bulletin 4 (1961) 117–122
- [3] V Bergelson, Ultrafilters, IP sets, dynamics, and combinatorial number theory, from: “Ultrafilters across Mathematics”, American Mathematical Society (2010) 23–47
- [4] A Buium, Arithmetic differential equations, American Mathematical Society (2005)
- [5] A Buium, Differential calculus with integers, from: “Arithmetic and Geometry, London Mathematical Society Lecture Note Series”, Cambridge University Press (2015) 139–187
- [6] A Buium, Foundations of arithmetic differential geometry, American Mathematical Society (2023)
- [7] C Chang, H J Keisler, Model Theory, 3rd edition, New York, Elsevier (1990)
- [8] R L Graham, D E Knuth, O Patashnik, Concrete Mathematics: A Foundation for Computer Science, 2nd edition, Addison-Wesley (1994)
- [9] N Hindman, D Strauss, Algebra in the Stone-Čech compactification: theory and applications, Walter de Gruyter (2011)
- [10] J Jeffries, Differentiating by prime numbers, Notices of the AMS 70 (2023)
- [11] J Kovic, The arithmetic derivative and antiderivative, Journal of Integer Sequences 15 (2012)
- [12] H Pasten, Arithmetic derivatives through geometry of numbers, Canadian Mathematical Bulletin 65 (2022) 906–923
- [13] J M Shelly, Una cuestión de la teoría de los números, Asociación española, Granada (1911) 1–12
- [14] M Stay, Generalized number derivatives, arXiv preprint math/0508364 (2005)
- [15] T Tossavainen, P Haukkanen, J K Merikoski, M Mattila, We Can Differentiate Numbers, Too, The College Mathematics Journal 55 (2024) 100–108
- [16] V Ufnarovski, B Ahlander, How to differentiate a number, Journal of Integer Sequences 6 (2003) 03–3