Information content in formal languages
Abstract.
Motivated by the search for physical theories, we consider formal languages with variables and define a kind of distance between elements x and y of a language by the formula d(x, y) = ℓ(x + y) − min(ℓ(x), ℓ(y)), where ℓ is a length function and x + y means the united theory of x and y. Actually we mainly consider abstract abelian idempotent monoids (M, +) provided with length functions ℓ. The set of length functions can be projected to another set of length functions for which d is actually a pseudometric and satisfies the σ-inequality d(x + a, y + b) ≤ d(x, y) + d(a, b). We also propose a “signed measure” on the set of Boolean expressions of elements in M, and a Banach-Mazur-like distance between abelian, idempotent monoids with length functions, or formal languages.
Key words and phrases:
formal language, variables, metric, distance, equivalence relation, information, information content, abelian, commutative, idempotent, monoid, semigroup, length, length function, measure, Boolean algebra, computing, physics, quantum gravity, physical theories

1991 Mathematics Subject Classification:
94A17, 06E75, 06F05, 68Q45, 94-04

1. Introduction
Variables are the center and core of mathematics, physics, theoretical engineering and theoretical economics. In this paper we consider formal languages with variables (see definition 2.1), provided with a length function for sentences (for example a simple character count), and define something which looks like a pseudometric between the sentences within one language (see definition 2.12).
The languages are assumed to be equipped with an equivalence relation (def. 2.6), and the more two sentences have in common, the closer their distance becomes.
Actually, this project was motivated by physics, in the quest for a solution to quantum gravity; see for instance Weinberg [21] and Hawking and Israel [11]. It is observed that physical theories are in general surprisingly short when formulated as equations, and that progress towards the correct equations in physics is often achieved by small modifications of formulas. For example, the formulas of Hamiltonian mechanics and of quantum mechanics are very close in distance, also in our approach here, for all given Hamiltonians, even if quantum mechanics and mechanics appear very different when considered without formulas (cf. example 11.1).
We guess that a solution that unites gravity and quantum field theory, when formulated as mathematical equations, is actually close, as formulas, to the existing separate theories of gravity and quantum field theory, for example in the sense proposed in this paper. This might point in the direction of a candidate for a united theory, and on the other hand, it might be used as a tool to evaluate a given candidate theory by computing its distance to the existing theories.
In short terms: the simpler a theory is, the better. And the closer it is to existing effective theories, the better, as it is then more likely to be the correct generalization.
In this sense, even quantum mechanics might, purely theoretically, have been predicted without any inspection of black-body radiation or the double-slit experiment, because it is perhaps the closest theory to mechanics which avoids the infinities - simply by looking at formulas.
In this sense also, a solution to quantum gravity is perhaps less a physical than a purely mathematical problem. It may be formulated as follows: indicate the theory with the shortest distance - in the sense of this paper, say - to the existing theory, such that no infinities occur (cf. problem 11.3). This holds even more as ultrahigh-energy experiments currently appear to be impossible for several reasons, for example the no-return-from-black-holes obstacle.
After these motivating and ambitious remarks, which may also appear somewhat self-evident, let us come to the hard facts of this paper:
Our definitions may be applied much more widely, for example in computing, as the approach is very flexible and does not dwell on details but on the core principle. Real-world applications are viewed as formulas. Actually, the distance between two formulas x and y is defined as

d(x, y) = ℓ(x + y) − min(ℓ(x), ℓ(y)),

which means in words: unite both theories x and y (called x + y), optimize their length ℓ and compute it (the more overlap they have, the shorter it will become), and then subtract the length of the shorter theory of x and y.
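This recipe can be sketched on a toy model: a “theory” is a set of formula strings, the length function is the total character count, and uniting theories is set union. This ignores the equivalence relation and length optimization entirely; all names and modeling choices below are illustrative assumptions, not the paper's formal definitions.

```python
# A minimal toy sketch of d(x, y) = ell(x + y) - min(ell(x), ell(y)),
# with theories modeled as sets of formula strings and "+" as set union.

def ell(theory: set) -> int:
    """Length function: total character count over all formulas."""
    return sum(len(formula) for formula in theory)

def d(x: set, y: set) -> int:
    """Unite, measure, subtract the length of the shorter theory."""
    return ell(x | y) - min(ell(x), ell(y))

mechanics = {"H = p^2/2m + V(q)", "dq/dt = dH/dp", "dp/dt = -dH/dq"}
quantum   = {"H = p^2/2m + V(q)", "ih d(psi)/dt = H psi"}

# The shared Hamiltonian shortens the union, hence the distance:
print(d(mechanics, quantum))
```

The more formulas two theories share, the shorter the united theory, and hence the smaller the distance.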
It looks natural and simple, but it is also intriguing because of the minus sign. Even if it appears natural that the triangle inequality - also called Δ-inequality in this paper - holds for the distance (cf. lemma 3.6), it is difficult and enormously involved to try to verify this. A key point of this paper is to repair the definition of the distance so as to make the Δ-inequality valid, see definition 7.1. See also theorem 6.10 for other conditions which imply the Δ-inequality.
As another example, it is difficult to get any easy criteria for the uniform continuity of a homomorphism between two languages, for example computer languages, but see lemma 12.1.
To avoid getting lost in details or becoming too specific, in this paper we mostly forget about languages altogether and instead work with an abelian, idempotent monoid provided with a length function (see definition 2.15).
This setting can be loosely compared with a measured space with the operation of taking unions of sets and a measure on it. And indeed, we will draw parallels, and show how one can compute the “signed measure” (i.e. length) of a formal Boolean expression in the monoid, see definition 6.15. The above formula for the distance is easily seen to be indeed a metric for measured spaces, see lemma 3.3.
In a main result, see theorem 7.21, we show how one can, starting with a length function ℓ, construct a length function ℓ′ such that the above formula for the distance, with ℓ replaced by ℓ′, is indeed a pseudometric, and the measure of the Boolean expression of the intersection of two sentences is monotonically increasing in both arguments, as one may naturally expect. We have, however, no idea to what extent ℓ′ degenerates; it could even be identically zero (but we do not believe that).
A key role in this paper is also played by the σ-inequality

d(x + a, y + b) ≤ d(x, y) + d(a, b),

which, even if it holds heuristically, see lemma 4.2, we finally also force by definition, see definition 7.4.
For homomorphisms between languages we use the same concepts as for the languages themselves to define a metric, enriched by submultiplicativity (cf. lemma 8.14).
As a distance between languages, or more generally, abelian, idempotent monoids, one may use the proposed Banach-Mazur-like distance of definition 8.27.
In the final sections we give various examples which should illustrate and motivate the theory of this paper, presented in an extremely superficial and sloppy way.
Actually, defining pseudometrics on languages, and on families of languages, goes back a long time, with a wide literature, and we cite only samples, far from complete in any way: Han and Ko [10], Djordjević, Ikodinović and Stojanović [6], Fulop and Kephart [8], Ko, Han and Salomaa [15], Imaoka [13], Cseresnyes and Seiwert [2], Ibarra, McQuillan and Ravikumar [12], Bertoni, Goldwurm and Sabadini [1], Stratford [20], Jalobeanu [14], Parker, K.B. Yancey and M.P. Yancey [16], Cui, Dang, Fischer and Ibarra [3], Pighizzini [17], and in the context of physical theories, Giangiacomo [9]. We will not compare these approaches with the one in this paper, but we believe the latter has a different emphasis.
Let us remark that we stress less the exact computability of the proposed metrics than the idealized relations they should satisfy. Also, our definitions are designed in such a way that we find them particularly useful for physical theories.
2. Metric
Under a language we mean a set A called the alphabet, and a set L of finite text-strings with letters in A, called sentences or formulas or theories. Here, ∅ means the empty sentence (i.e. the empty set). In this section we are going to give several definitions successively, each of which becomes part of the considered setting after its definition.
Definition 2.1 (Language with variables).
A language with variables means a language , where and contains at least a sign and a distinguished infinite subset of so-called variables , divided further into two disjoint infinite subsets () called main variables set and auxiliary variable set , such that for all elements and variable permutations (to be explained below),
Here, means the concatenated string of the string with the letter with the string , in the order indicated. The letter serves purely as a symbol for the concatenation of strings.
Definition 2.2 (Variables permutation).
Let be a permutation (i.e. bijective function) on the variable set such that (), and let denote the canonically bijective map induced by . That is, set for , where if .
Write for the set of all such functions , called variable permutation or transformation.
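A variable permutation and its induced map on sentences can be sketched as follows, under the simplifying assumption that a sentence is a whitespace-separated token string; the function name and the tokenization are illustrative only, not the paper's formal construction.

```python
# Sketch of a variable permutation acting on a sentence: the permutation
# pi renames variables, and the induced map rewrites each occurrence.

def apply_permutation(sentence: str, pi: dict) -> str:
    """Rewrite every variable occurrence according to pi; tokens not in
    pi (operators, constants) are left untouched."""
    return " ".join(pi.get(token, token) for token in sentence.split())

pi = {"x1": "x2", "x2": "x1"}     # a permutation swapping two variables
s = "x1 + x2 = x2 + x1"
print(apply_permutation(s, pi))   # prints "x2 + x1 = x1 + x2"
```

Since the renaming is a bijection, applying the inverse permutation recovers the original sentence, so equivalent sentences stay equivalent under renaming.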
Definition 2.3 (Length function).
Under a length or cost function on a language with variables we mean a positive function satisfying
for all and .
In other words, the empty word has no length, the length of two concatenated strings is the sum of the lengths of the individual strings, and the length or cost of a string does not change if we just choose other variable names.
Lemma 2.4.
The map
is an associative semigroup operation on .
Two sentences are completely unrelated if they have no variables in common because they do not speak about the same variables:
Definition 2.5 (Independence).
Two elements are called independent, notated by , if and only if and have no variables in common.
(In other words, a variable showing up in the string must not show up in the string .)
In languages like mathematical or physical theories, one can typically express formulas in formally different ways which are nevertheless equivalent in content (for example, formally different equations may have the same solutions). That is why we need an equivalence relation on the set of sentences, and the following axioms are the minimal natural ones we postulate:
Definition 2.6 (Equivalence relation).
Let be given an equivalence relation on such that the following relations hold for all and :
Neutral element:
(1) |
Commutativity:
(2) |
preserves equivalent elements in case of independence:
(3) |
A superfluous copy can be removed:
(4) |
Variable transformation does not change:
(5) |
If contains no main variables then require:
(6) |
Let us revisit some aspects of the last definition. Relation (2) means that it is unimportant in which order we list our sentences separated by the concatenation symbol, just as it is unimportant in which order we list the single formulas in a longer theory, or the collection of subroutines in a longer computer program. The connection between the single sentences is regulated solely by the variables they have in common. That is why relation (3) says that the semigroup operation of concatenation does not preserve the equivalence relation in general, but only if the concatenated sentences are unrelated. Thus we also needed to incorporate the variable transformations in commutativity (2). Relation (5) tells us that it is unimportant which variables we choose (but only for the sentence as a whole). Relation (4) says that if a theory is restated with other variables and fused with the original theory, then essentially nothing new is said and the newly formed theory collapses to the original theory. Relation (6) is theoretically unimportant in the sense that it does not carry over to the later abstract approach stated in definition 2.15 below and may thus be omitted, but it is essential in the examples that we shall consider, and it is the whole point of the distinction between main and auxiliary variables. (See also example 10.1.)
We use class brackets as in [x] for classes in quotients, where x is a representative. Limits of sequences and comparisons of real-valued functions are always understood pointwise (i.e. f ≤ g iff f(x) ≤ g(x) for all x).
The whole theory of this paper is mainly built on the quotient set of sentences alone:
Definition 2.7 (Quotient).
Set to be the set-theoretical quotient.
On the quotient of sentences we naturally define an operation of concatenation by concatenating sentences after choosing copies of them with disjoint variables:
Definition 2.8 (Operation).
Define by
for all , where (dependent on ) is chosen such that .
We will then mainly work with the abstract abelian monoid that has emerged so far:
Lemma 2.9.
is an abelian, idempotent monoid (i.e. x + x = x for all elements x).
Proof.
On the quotient set of sentences we essentially define a length function by choosing the shortest possible length among the equivalent representatives, but to ensure that the length of a united theory is always greater than or equal to the length of each part in the quotient, we correct this definition by the formula of definition 7.9. Thus the following definition is just a somewhat obscured reformulation of defining the length at first as in definition 9.1 and then correcting it by definition 7.9:
Definition 2.10 (Length function on quotient).
Set ()
We shall conveniently denote . We get very natural properties for the length function, which we will later postulate axiomatically for abelian monoids:
Lemma 2.11.
is positive and for all we have:
(7) |
Subadditivity:
(8) |
Monotone increasingness:
(9) |
Proof.
Let . Given , by definition 2.10 choose such that where and ().
Subadditivity: Choose any such that . Then
where the last inequality follows from monotone increasingness, which we prove next:
where, given , is chosen such that and the second-to-last inequality holds. ∎
We often abbreviate the notion “monotonically increasing” for (9) simply by “monotone” or “increasing”. Let ∧ and ∨ denote the minimum and maximum, respectively, of two real numbers.
The length function induces now two very natural notions of distance functions between sentences:
Definition 2.12 (“Metric”).
We define two functions by
We sometimes also call these functions, more precisely, and . They have the character of a distance between elements of :
As an illustrative example, let us recall that in his famous work [4], Dirac took the then known quantum-number-valued Klein-Gordon function (operators on Hilbert space) defined on spacetime and the Klein-Gordon differential equation ( constant), took the formal square root of the Klein-Gordon differential operator, and formed the Dirac differential operator with constant coefficients in a matrix algebra , that is, , and proposed the physical equation , and along with it correctly predicted the existence of particles and anti-particles [5]. This is a prototype example where an existing physical theory is slightly modified to obtain a new theory. That is, the Klein-Gordon equation is close to the Dirac equation, also in our “metric”. Let us compute this:
Example 2.13.
The prototype example is physical theories. Sentences are equations, formulas, function definitions and so on. Two sentences are equivalent when they are equivalent in a mathematical sense. Concatenation means “and”, in the sense that this and the other formula/equation hold. The equivalence relation has to be enriched by the axioms of definition 2.6. Main variables are those that cannot be deleted, like the electromagnetic field in Maxwell's equations, because they are what the theory is about. The length function could be a simple character count. See also the beginning of section 10, including example 10.1, for further explanations.
For example, consider the free Klein-Gordon equation and the Dirac equation , where , and where in both cases is a main variable. Then
whereas and .
Here, and are auxiliary variables. If we had a longer formula of (i.e. the Klein-Gordon operator) then the above distance would not change, but and would increase.
Note that we did not explicitly transform the variables in as required by the -operation of definition 2.8, because they are bound variables which we may label as we like without change, and so we have transformed them back.
Example 2.14.
Another standard example is computer languages, for example C++. Sentences are meaningful programs or code snippets. Two programs are equivalent if they do the same thing, for example independently of time performance. Typically, the entry point of a program is a main variable. The length of a program could be its character count. Concatenation is the concatenation of programs.
Variables are the typical variables in higher computer languages. The variable transformation in the -operation of definition 2.8 ensures that the variables of two different programs do not overlap.
The more two programs have in common, the smaller the distance becomes. For example, if two programs have a common routine, then it is enough to keep one copy of it in their concatenation.
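This deduplication effect can be sketched as follows, modeling programs as frozensets of routine source strings with union as concatenation; this ignores the variable renaming of definition 2.8 and is purely illustrative.

```python
# Sketch: if two programs share a routine, the united program keeps only
# one copy, so the distance only charges the non-shared part.

def ell(program: frozenset) -> int:
    """Length function: total character count of the routines."""
    return sum(len(routine) for routine in program)

def d(p: frozenset, q: frozenset) -> int:
    return ell(p | q) - min(ell(p), ell(q))

shared = "def solve(a, b): return a + b"
p = frozenset({shared, "def main_p(): print(solve(1, 2))"})
q = frozenset({shared, "def main_q(): print(solve(3, 4))"})

# Only q's non-shared routine contributes to the distance:
print(d(p, q) == len("def main_q(): print(solve(3, 4))"))  # True
```

Had the two programs shared nothing, the distance would be the full length of the larger program.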
Throughout, if nothing else is said, we now switch to the following general setting. Only when we discuss something involving the language itself do we implicitly assume, without saying so, that we use the above specific definitions.
Definition 2.15 (More abstract definition).
Let be an abelian, idempotent monoid.
We call a length function if the assertions of lemma 2.11 hold for it.
is also called the information content of .
Positivity of ℓ is automatic by idempotence and subadditivity, as ℓ(x) = ℓ(x + x) ≤ 2ℓ(x) implies ℓ(x) ≥ 0. Instead of the specific metric candidates from above, we shall sometimes use the following generalization:
Definition 2.16 (Metric candidate).
We call d a positive, symmetric, nilpotent function if

d(x, y) ≥ 0, d(x, y) = d(y, x) and d(x, x) = 0

for all x, y.
d is called a pseudometric if it is positive, symmetric, nilpotent and satisfies the Δ-inequality (see definition 3.1).
3. Δ-inequality
The following section is a first discussion of when the proposed distance functions of definition 2.12 could indeed satisfy the well-known triangle inequality (here also called Δ-inequality). This is actually a main theme of this paper, and first satisfying answers are given in theorems 6.10 and 7.21.
Let be an abelian, idempotent monoid.
Definition 3.1.
A function d is said to satisfy the Δ-inequality if

d(x, y) ≤ d(x, z) + d(z, y)

for all x, y, z.
Lemma 3.2.
If a symmetric function d satisfies the Δ-inequality, then it also satisfies the second Δ-inequality (i.e. second triangle inequality)

|d(x, z) − d(z, y)| ≤ d(x, y).
To see heuristically that of definition 2.12 could satisfy the Δ-inequality in the context of formal languages, we make the following considerations. Assume a sentence is written in its shortest form as for , that is, such that . We do the same for a given sentence . Many of the could be auxiliary function definitions, some of which may be identical to some . As a first approximation we may estimate
where in the last line we have dropped identical auxiliary function definitions which already appear among the s, so that instead of calling we may call , and that is why we had to introduce another transformation and dropped the superfluous copy by (6). In a certain sense, we have formed the set-theoretical union of and .
If we sloppily replace the above by and view as a measure, then the Δ-inequality holds:
Lemma 3.3.
Let be a measured space. Set to be the collection of measurable sets of finite measure. Then
where , define metrics on .
Proof.
Generally we have
(10) |
We only prove the -case. (i) Assume . Then we have equivalences
(11) |
But this is true because in general we have
(ii) Assume . Doing the same as above leads to (11) but with replaced with and this is then also true by the above argument.
(iii) Assuming leads to (11) but with replaced with and this is then also true. ∎
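The lemma can also be checked by brute force for the counting measure on a small universe. The labels d_max and d_sum below are ad hoc names for the two distance candidates, not the paper's notation.

```python
# Brute-force check of lemma 3.3 for the counting measure: both
#   d_max(A, B) = |A u B| - min(|A|, |B|)   (= max(|A \ B|, |B \ A|))
#   d_sum(A, B) = 2|A u B| - |A| - |B|      (= |A symmetric-diff B|)
# satisfy the triangle inequality on all subsets of a 4-element universe.

from itertools import combinations

def powerset(universe):
    return [frozenset(c) for r in range(len(universe) + 1)
            for c in combinations(universe, r)]

def d_max(a, b):
    return len(a | b) - min(len(a), len(b))

def d_sum(a, b):
    return 2 * len(a | b) - len(a) - len(b)

sets = powerset(range(4))
for d in (d_max, d_sum):
    assert all(d(a, b) <= d(a, c) + d(c, b)
               for a in sets for b in sets for c in sets)
print("triangle inequality holds for both candidates")
```

The check exhausts all 16^3 triples of subsets, so it is a complete verification for this small universe, not merely a sample.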
Actually is like the maximum norm or -norm, and like the -norm in or the space of real-valued sequences:
Lemma 3.4.
Continuing the setting of the last lemma, we have and
This is obvious by (10). Actually, the one, as just formulated, is well known to be a metric, while the other might be less used. We continue by imposing definitions 2.15 and 2.12.
Lemma 3.5.
For all we have
(12) |
Proof.
This criterion for the triangle inequality will be extensively used in section 6:
Lemma 3.6.
satisfies the -inequality if and only if
for all .
Moreover, this equivalence is true for any given function .
Proof.
Divide the -inequality spelled-out by to see the equivalence:
∎
Interestingly, if satisfies the triangle inequality then must be a length function:
Lemma 3.7.
Given any function , if satisfies the -inequality, then is monotonically increasing.
Moreover it is positive and subadditive if , and even a length function (def. 2.15) if .
Proof.
Set in lemma 3.6 to get monotone increasingness, to get subadditivity, and positivity by . ∎
4. σ-inequality
An important ingredient of our theory is the σ-inequality, defined and first discussed in this section. It is fundamental in the proof of lemma 7.14, and thus for the whole of section 7, theorem 7.21 and corollary 7.24.
Let be an abelian, idempotent monoid.
Definition 4.1.
A function d is said to satisfy the σ-inequality if

d(x + a, y + b) ≤ d(x, y) + d(a, b)

for all x, y, a, b.
This may be heuristically compared with normed vector spaces by setting d(x, y) = ‖x − y‖ and having ‖(x + a) − (y + b)‖ ≤ ‖x − y‖ + ‖a − b‖.
Going back to the remark before lemma 3.3 and the setting of this lemma, we heuristically argue that the σ-inequality may hold:
Lemma 4.2.
Let be as in lemma 3.3. Then
Proof.
Write . We have to show that
(13) |
By symmetry (just exchange and ), it is enough to consider the case .
Case : In this case, . Hence the last inequality verifies (13). The third case is analogous. ∎
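Lemma 4.2 can likewise be checked exhaustively for the counting measure on a small universe, with set union playing the role of the monoid operation; the function names below are illustrative, not the paper's notation.

```python
# Brute-force check of the sigma-inequality
#   d(a + b, c + e) <= d(a, c) + d(b, e)
# with "+" as set union, for both distance candidates of lemma 3.3,
# over all quadruples of subsets of a 3-element universe.

from itertools import combinations

def powerset(universe):
    return [frozenset(c) for r in range(len(universe) + 1)
            for c in combinations(universe, r)]

def d_max(a, b):
    return len(a | b) - min(len(a), len(b))

def d_sum(a, b):
    return 2 * len(a | b) - len(a) - len(b)

sets = powerset(range(3))
for d in (d_max, d_sum):
    assert all(d(a | b, c | e) <= d(a, c) + d(b, e)
               for a in sets for b in sets for c in sets for e in sets)
print("sigma-inequality holds for both candidates")
```

The underlying set-theoretic reason is that (A ∪ B) \ (C ∪ E) is contained in (A \ C) ∪ (B \ E), mirroring the measure-theoretic argument of the proof.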
Lemma 4.3.
If a nilpotent function satisfies the -inequality then ()
(14) |
We say that a function satisfies the weak -inequality if it satisfies the inequality (14), and only the very weak one, if this holds only for (see definition 5.1).
We finally remark that given a (possibly infinite) sequence of pseudometrics in our setting, we may sum them up to get possibly a more faithful pseudometric:
Lemma 4.4.
The set of length functions on is a positive cone, i.e. if ℓ₁, ℓ₂ are length functions, then so is λ₁ℓ₁ + λ₂ℓ₂ for constants λ₁, λ₂ ≥ 0. Similarly, the functions satisfying the Δ-inequality (or the σ-inequality, or those of the form ) form a positive cone, and one has .
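The cone property of lemma 4.4 can be illustrated in miniature, under the assumption that the monoid is finite sets under union, with two toy length functions and one conic combination of them; all names here are illustrative.

```python
# Lemma 4.4 in miniature: a conic combination of length functions is
# again a length function (positive, subadditive, monotone increasing).

sets = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]

def ell1(a):
    return len(a)                       # counting measure as a length

def ell2(a):
    return sum(2 * x for x in a)        # a weighted "character count"

def ell(a):
    return 3 * ell1(a) + 0.5 * ell2(a)  # conic combination

# subadditivity and monotone increasingness survive the combination:
assert all(ell(a | b) <= ell(a) + ell(b) for a in sets for b in sets)
assert all(ell(a) <= ell(a | b) for a in sets for b in sets)
print("conic combination is again a length function")
```

Summing a (possibly infinite, suitably normalized) family of such length functions in the same way is what allows combining several pseudometrics into a more faithful one.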
5. Order relation
In this section we consider a natural order relation on an abelian idempotent monoid induced by the monoid operation. This is important not only on various occasions in the theory, like the monotone increasingness of the length function and various aspects of the next section, but it is also a fundamental relation for information itself. It says when a theory is information-theoretically contained in another theory, and because we work with equivalence relations on theories, this also works for formally very different-looking representatives of theories. This effect will become even more pronounced when we divide out the equivalences induced by a pseudometric in definition 5.7.
In this section, if nothing else is said, we continue with the setting and notions of definitions 2.15 and 2.12.
Definition 5.1.
Define an order relation on by
The following three lemmas are straightforward to check, and the reader should recall the monotone increasingness of , see (9), which is often used, as it is in lemmas 5.5 and 5.6.
Lemma 5.2.
(i) This is an order relation on .
(ii) For all we have
(iii) is a directed set as
The length function preserves the order, and and have particularly simple formulas for ordered pairs:
Lemma 5.3.
For all we have
Here are some further relations ( is defined below) for and under ordered pairs:
Lemma 5.4.
For all we have
(15) |
In other words, if we know the values of or for all , then we know them for all , and for both and .
The following lemma reverses lemma 5.3. It also states in words that a theory is contained in another theory (i.e. ) if and only if the information content of the united theory is the information content of the bigger theory (i.e. ).
Lemma 5.5.
For all , the following assertions are equivalent:
If or is faithful (i.e. for all one has ) then these relations are also equivalent to
Proof.
(i) if then and so and thus and so
These implications can be reversed.
The same equivalence with replaced by is proved analogously.
(ii) If is faithful this is equivalent to saying that ∎
Lemma 5.6.
For all we have the equivalences
In particular, is faithful iff is faithful.
Proof.
If then, because , we have and so . The first equivalence is by (12). ∎
Typically, later in this paper we will construct pseudometrics satisfying the Δ-inequality, and in the next definition and proposition we state how the given structures on the abelian idempotent monoid naturally carry over to the quotient when turning a pseudometric into a metric.
Definition 5.7 (From pseudometric to metric).
Assume that a pseudometric on satisfies the -inequality.
Define an equivalence relation on by
Set to be the set-theoretical quotient. Define operations, relations and functions on by (for all )
Then the so defined language with equivalence relation satisfies the concepts of section 2 again:
Proposition 5.8.
If a pseudo-metric on satisfies the -inequality then:
(i) The operation passes to , that is, the above is an abelian, idempotent monoid with order relation as in definition 5.1, and is provided with a metric satisfying the -inequality.
(ii) In case as of definition 2.7, then we may canonically identify for an equivalence relation on satisfying the same rules of definition 2.6. (So is again of the form .)
(iii) In case (or ), the above defined is a length function on and
Proof.
(i) The operation is well-defined on as if and then
(ii) Let be the canonical map and define the associated equivalence relation by if and only if for all . As all rules of definition 2.6 hold for , they hold also for , except that (3) needs explanation. But by lemma 4.3 we have
for
(iii) By definition
so is monotonically increasing, and its subadditivity follows from the -inequality of . ∎
6. Measure of Boolean expressions
We have seen in the considerations before lemma 3.3, that somehow, is the measure of the union of and . In this section we want to extend the notion of a measure also to other Boolean expressions, like the intersection of sentences and .
There is not necessarily a sentence that represents the intersection or the set-theoretical difference of two sentences , see example 10.11. But we can still compute the “measure” of these non-existing, imagined elements.
Let us write formally for the measure of a Boolean expression , for example . We will give precise definitions in this section.
In this section, we assume the setting and notations of definitions 2.15 and 2.12, but suppose that is possibly non-increasing, that is, we cease requiring (9).
Definition 6.1 (Measure of intersection).
For all we set
An illustrative example for the measure of intersection would be the amount two computer programs have in common, for example through identical or mathematically similar routines. The following function slightly extracts the core of the last definition:
Definition 6.2 (-function).
We also set, for all ,
We say the measure of intersection is increasing if
for all , and is increasing if
The following couple of lemmas will culminate in theorem 6.10 below.
Lemma 6.3.
We have the following equivalences:
The measure of intersection is increasing.
is increasing in for all fixed .
is increasing.
Proof.
If is increasing, then is increasing in for fixed , and so also in for fixed by symmetry of . Combining both we get that the measure of intersection is increasing. ∎
Lemma 6.4.
If is increasing and monotone, then satisfies the -inequality.
Proof.
is increasing means that for all we have
(16) |
Monotonicity of implies
Thus we get the criterion of lemma 3.6, that is,
(17) |
∎
Lemma 6.5.
If satisfies the -inequality then is increasing.
Lemma 6.6.
If is increasing then satisfies the -inequality.
Proof.
By lemma 6.3, is increasing in both variables simultaneously. Then is increasing implies that is decreasing, and applying this twice yields
A simple rearrangement of the summands in this inequality yields exactly
∎
Lemma 6.7.
If or satisfies the very weak -inequality and is monotone, then is increasing.
Proof.
The very weak -inequality of implies
which by the monotonicity of means
But this means is increasing. The same is true for by (15). ∎
Lemma 6.8.
If is monotone, then satisfies the very weak -inequality if and only if satisfies it.
Proof.
Because of (15). ∎
Lemma 6.9.
If satisfies the -inequality then satisfies the -inequality.
Proof.
Let . In all cases like etc. we see that
Thus, by lemma 3.6 we get
But this is the -inequality for . ∎
It is astonishing how strongly the validity of the -inequality and the -inequality of and , and the monotone increasingness of the measure of intersection and the -function are connected:
Theorem 6.10.
Assume the setting and notations of definitions 2.15 and 2.12. In particular, suppose that is monotonically increasing. Then the following are equivalent:
(a) satisfies the very weak -inequality
(b) satisfies the very weak -inequality
(c) is increasing
(d) the measure of intersection is increasing
(e) satisfies the -inequality
(f) satisfies the -inequality
Each of these assertions implies that satisfies the Δ-inequality.
Conversely, the Δ-inequality for implies the above assertions.
Proof.
Having said in definition 6.1 what the measure of an imagined intersection of two sentences is, we continue to define it for the last Boolean operations of complement and union.
Definition 6.11 (Measure of complement).
For all put
As , is increasing if and only if the measure of complement is decreasing in . This is reminiscent of lemma 3.4:
Lemma 6.12.
If is monotone then
Definition 6.13 (Measure of union).
For set
The final goal of this section is to extend the measure of imagined Boolean expressions of sentences to all possible Boolean expressions.
Definition 6.14.
Let be the free Boolean algebra (with zero element and one-element ) generated by the set .
That is, consists of expressions like (complement of ) etc. (also notated as ) for and .
Two elements in are regarded as the same if and only if they are the same by the formal rules of Boolean algebra, for example, .
For and define the expression .
The above definitions for will now be generalized as follows:
Definition 6.15.
Define a function
as follows. Put
For all with (and ) set
Here it is understood that mutually . Then define
(18) |
(19) |
because all expressions without cancel. Finally set
Lemma 6.16.
is well defined.
Proof.
Because we can choose common refinements, every expression in can be brought to the form with for , with and obviously defined, and so the above is already fully defined.
This expression in is uniquely defined if the set of involved variables is determined. Thus we only need to check that the definition of is independent of the set of involved variables:
Observe here the occurrence of :
because all occurrences with cancel each other. This result is consistent with .
By induction we can add as many variables as we want by the formula and expansion, without changing . ∎
Lemma 6.17.
is a “signed measure” on the Boolean algebra B generated by in the sense that
for all . In particular we have
Proof.
This is clear, as when we form common refinements for and in as indicated in the last proof, sums only over the products which appear both in , and over those which appear in or .
The last two identities follow from this and . ∎
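For a genuine measure, the signed measure defined through lengths of unions reduces to classical inclusion-exclusion and recovers the actual measure of intersections. The following check uses the counting measure on finite sets; sigma_intersection is an illustrative implementation of the alternating-sum formula, not code from the paper.

```python
# For the counting measure, the "measure of intersection" built from
# lengths of unions (definition 6.1 and its extension) agrees with the
# actual size of the intersection, by dual inclusion-exclusion:
#   |A1 n ... n An| = sum over nonempty subfamilies S of
#                     (-1)^(|S|+1) * |union of S|.

from itertools import combinations

def ell(*sets):  # length of the united theory = size of the union
    return len(frozenset().union(*sets))

def sigma_intersection(sets):
    """Alternating sum of ell over all nonempty subfamilies."""
    total = 0
    for r in range(1, len(sets) + 1):
        for family in combinations(sets, r):
            total += (-1) ** (r + 1) * ell(*family)
    return total

x = frozenset({1, 2, 3, 4})
y = frozenset({3, 4, 5})
z = frozenset({4, 5, 6})

assert sigma_intersection([x, y]) == len(x & y)      # ell(x)+ell(y)-ell(x+y)
assert sigma_intersection([x, y, z]) == len(x & y & z)
print("inclusion-exclusion recovers the measure of intersections")
```

For two sets the formula is exactly ℓ(x) + ℓ(y) − ℓ(x + y) from definition 6.1; in the abstract monoid the intersection need not exist as an element, but the alternating sum still makes sense.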
We should, however, keep in mind that we are not working with a real positive measure on measurable sets. We only observe:
Lemma 6.18.
For all we have:
(i) ,
(ii) If is monotonically increasing then .
7. New metric
Up to now we could not clarify whether the distance functions of definition 2.12 satisfy the triangle or Δ-inequality in our natural prototype languages. We have theorem 6.10, and it tells us that the σ-inequality implies all the desired natural properties. In this section we show how we may modify a given length function so that we get the Δ-inequality for its associated distance function, and thus all the desired properties. The approach is as follows. At first we modify the distance function for a given length function as in definitions 7.1 and 7.4, such that the distance so obtained satisfies the Δ- and σ-inequality. Then we extract from it a new length function by setting (see definition 7.11). But this new length function may not be monotonically increasing as required in (9). That is why we make it monotonically increasing in definition 7.9, to obtain a new length. But the validity of the desired Δ- and σ-inequality of the distances associated to it might fail, cf. also lemma 7.13. Hence we restart the above procedure, again and again, infinitely often, and in the limit the obtained length function satisfies all desired properties.
A key lemma for this procedure is lemma 7.14, whose proof shows that we need both the - and -inequality simultaneously. Hence it is essential to run the procedure with the -distance, because the lemma might not hold for the -distance. Running the procedure for the -distance instead still leads at least to the estimates of lemma 7.27 in the limit.
Throughout this section we start with , and unless stated otherwise, always denotes a symmetric, positive, nilpotent function on an abelian, idempotent monoid .
Example 10.10 suggests that of definition 2.12 does not satisfy the -inequality in general. A way out is to repair as follows:
Definition 7.1 (Forced -inequality).
Define a function by
Lemma 7.2.
satisfies the -inequality, and if and only if satisfies the -inequality.
Proof.
This is obvious as the right hand side of the -inequality is one possible value over which the infimum is taken in . ∎
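On a finite set, forcing the triangle inequality by an infimum over chains (our reading of the elided definition, in line with the proof above: the new distance is the infimum of sums of d over chains from x to y) is exactly an all-pairs shortest path computation. A hedged sketch on a hypothetical three-point example:

```python
# Sketch (our reading of the construction, illustrative names): force the
# triangle inequality via
#   d_new(x, y) = inf { d(z0, z1) + ... + d(z_{n-1}, z_n) : z0 = x, z_n = y },
# which on a finite set is the Floyd-Warshall all-pairs shortest path scheme.

def force_triangle(points, d):
    dn = {(x, y): d(x, y) for x in points for y in points}
    for k in points:                       # Floyd-Warshall relaxation
        for x in points:
            for y in points:
                dn[x, y] = min(dn[x, y], dn[x, k] + dn[k, y])
    return dn

pts = ["a", "b", "c"]
d0 = {("a", "b"): 1, ("b", "a"): 1, ("b", "c"): 1, ("c", "b"): 1,
      ("a", "c"): 5, ("c", "a"): 5}
d = lambda x, y: 0 if x == y else d0[x, y]

dn = force_triangle(pts, d)
print(dn["a", "c"])   # 2: the chain a-b-c undercuts the direct distance 5
```

As in lemma 7.2, the original d(x, y) is one chain over which the infimum runs, so dn never exceeds d, with equality exactly when d already satisfies the triangle inequality.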
The following lemma may be useful for the pseudometric on defined by to get a lower bound for , cf. lemma 3.5.
Lemma 7.3.
If a function satisfies the -inequality then we have the following lower bound estimate:
Next we repair to satisfy the -inequality:
Definition 7.4 (Forced -inequality).
Define a function by
Lemma 7.5.
satisfies the -inequality, and if and only if satisfies the -inequality.
Proof.
This is obvious as the right hand side of the -inequality is one possible value over which the infimum is taken in . ∎
Fortunately, the transition from to does not destroy the validity of the -inequality:
Lemma 7.6.
If satisfies the -inequality, then satisfies the -inequality.
Proof.
By application of the -inequality for we get
where the first inequality is because the infimum goes over a smaller selection. ∎
Lemma 7.7.
satisfies the - and -inequality, and if and only if satisfies the - and -inequality.
Proof.
Hence we get a projection from metric candidates to those ones which satisfy the - and -inequality:
Corollary 7.8 (Projection onto pseudometrics satisfying -inequality).
There is a projection from the set of positive, symmetric, nilpotent functions on onto the set of pseudometrics on satisfying the -inequality.
The following bar operator turns possibly non-increasing length functions to increasing ones:
Definition 7.9 (Bar operator).
Given a positive function set as
Lemma 7.10.
If is a possibly non-increasing length function then is an increasing length function, i.e. (for all )
If is already monotonically increasing, then .
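On the toy monoid of subsets of a finite set under union (with the order a ≤ b iff a + b = b, i.e. subset inclusion), the bar operator can be sketched as the monotone envelope. The formula bar_l(x) = sup over y ≤ x of l(y) is our assumption of what the elided definition reads:

```python
# Sketch of the bar operator on the hypothetical toy monoid of subsets of
# {1, 2, 3} under union; order: a <= b iff a | b == b (subset inclusion).
# Assumed reading of the definition: bar_l(x) = max over y <= x of l(y).
from itertools import chain, combinations

GROUND = frozenset({1, 2, 3})

def subsets(s):
    s = list(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

def bar(l):
    """Monotone envelope of a positive function l."""
    return lambda x: max(l(y) for y in subsets(x))

# A deliberately non-monotone positive function:
l = lambda x: 1.0 if x == frozenset({1}) else 0.25 * len(x)

bl = bar(l)
print(bl(frozenset({1, 2})))  # 1.0: inherited from the smaller element {1}
# bar(l) is monotonically increasing:
print(all(bl(a) <= bl(a | b) for a in subsets(GROUND) for b in subsets(GROUND)))
```

If l is already increasing, the maximum is attained at x itself, matching the statement that then bar(l) = l.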
From the various distance functions we naturally extract length functions:
Definition 7.11.
Define positive functions (depending on ) by
(20) |
Lemma 7.12.
If satisfies the -inequality, then of (20) is a possibly non-increasing length function, and a length function.
The next lemma shows that switching from to may really change their associated length functions and , respectively:
Lemma 7.13.
If for as defined before lemma 7.3 then , and if is subadditive then . But we have in certain cases.
Proof.
By lemma 7.3 we have
The following is an important key lemma, showing how our above procedure decreases the metric candidates:
Lemma 7.14.
Proof.
(i) Without loss of generality we have by the second -inequality and the -inequality that
(ii) Similarly we get
Given any we may choose a such that this is
∎
Corollary 7.15.
Proof.
Apply lemma 7.14 to . ∎
Let us now restart the above procedure with a new instead of and then again and again. More precisely put
Definition 7.16.
Set a start point , and then for all define successively functions and by
(21) |
We are going to show that the procedure of the last definition, which takes a length function in and gives a length function out, converges under successive application to a desired fixed point.
Lemma 7.17.
All are length functions on , and we have descending sequences
Proof.
We note that all are pseudometrics which satisfy the -inequality by lemma 7.7, but it is unclear whether they are of the form . (Otherwise we would be done.)
Since we have monotonically decreasing real-valued sequences that are bounded below by , we may define
Definition 7.18.
Define the pointwise function limits
(22) |
Lemma 7.19.
The function
is a pseudometric on satisfying the -inequality.
Proof.
The second equality is clear by the formula for of def. 2.12, and the first one is then by definition. The third equality holds because of the converging sequence of lemma 7.17. Because satisfies the - and -inequality by lemma 7.7, so does their pointwise limit, and so we get the last equalities and thus the final assertion. ∎
Lemma 7.20.
is a length function on and
Proof.
As by lemma 7.19, we get the second identity.
The last equality holds because, as is positive,
that is, already satisfies monotone increasingness. ∎
We now summarize: the described procedure, applied successively, which takes a length function and forms a new one, converges to a desired fixed point:
Theorem 7.21 (Ideal length function).
Given a positive symmetric idempotent function on an abelian, idempotent monoid , define as described in definitions 7.16 and 7.18.
Then is a length function on , and and (def. 2.12) are pseudometrics on satisfying the -inequality.
If the we started with at the beginning of this section happens to be of the form for a length function on , then, astonishingly, already satisfies the - and -inequality provided only that :
Proposition 7.22 (Ideal length function attained).
If is of the form for some length function then satisfies the - and -inequality
Proof.
If satisfies the -inequality then certainly and similarly the -inequality implies . So by lemma 7.6 and thus .
If then
and so . Also then, , and thus , and the above procedure restarts with the same starting point , and so on, so that we get a constant sequence and finally .
If then by corollary 7.20 . ∎
We remark that one may naturally and effortlessly turn the desired pseudometrics into metrics:
Corollary 7.23.
By definition 5.7, we may then divide out elements in to obtain a metric space , which is again an abelian idempotent monoid, with metrics and satisfying the -inequality.
Proof.
The above procedure yields the desired projection from arbitrary length functions onto the set of desired ones:
Corollary 7.24 (Projection onto ideal length functions).
There is a projection from the set of length functions on onto the set of those ones such that and are pseudometrics which satisfy the -inequality.
Proof.
If the length function on is a simple character count, as in our natural prototype examples, then the fixed point for the length functions will be attained after finitely many steps:
Lemma 7.25.
If the -function lies in a grid, i.e. for some constant , and , then the limits of definition 7.18 will all be attained at some (with depending on the points of , however).
We now make a bigger cut and switch from to . Let us run the above procedure starting with a length function on , but instead of always forming we now choose . In other words, set
Definition 7.26.
Lemma 7.27.
Assume definition 7.26, but to avoid confusion with what we have defined before, rewrite .
Then is a decreasing sequence of length functions on , with limits and estimates
(23) |
(24) |
Proof.
As converges pointwise to , also converges pointwise to . (23) follows from lemma 3.5 and corollary 7.15 applied to , as
By definitions 7.1 and 7.4, and because converges pointwise to , for all there are and such that for all we have that
where the last inequality is because satisfies the - and -inequality and so we can estimate back what we have expanded in the first two lines above.
8. Homomorphisms
In this section we shift our attention from the abelian idempotent monoids themselves to the homomorphism sets between them. Thereby we consider only homomorphisms which are “uniformly continuous”, similarly to the bounded linear operators between Banach spaces, see definition 8.8. We provide each homomorphism set with a length (analogy: the norm of operators between Banach spaces) and apply the procedure of section 7 to obtain new length functions with better theoretical properties. Thereby we enrich this concept with the desired property of product inequalities, that is, we want to achieve the validity of inequalities like and for homomorphisms and . Similarly to the last section, we end up with a fixed point theorem yielding desired length functions, and apply them to get a Banach-Mazur-like distance between languages.
At first we define homomorphisms between languages. In this section let and be languages with equivalence relations as in definitions 2.1 and 2.6.
Definition 8.1 (Homomorphism of languages).
A function is called a homomorphism between these languages if for all we have
In this section we almost exclusively return to the abstract setting of abelian idempotent monoids. In what follows let and denote any abelian idempotent monoids. A natural notion of homomorphism between such monoids is certainly that of a semigroup or monoid homomorphism:
Definition 8.2.
A function is called a homomorphism if for all
A homomorphism between languages becomes a homomorphism between their quotients:
Lemma 8.3.
If is a homomorphism then it descends to a homomorphism defined by .
The monoid operation between abelian idempotent monoids naturally carries over to the set of homomorphisms between them:
Definition 8.4 (Operation on homomorphism set).
For homomorphisms define the homomorphism
That is why we get the usual order on the homomorphism set:
Definition 8.5.
We equip with the usual order as of definition 5.1, that is, for all homomorphisms set
Lemma 8.6.
For any , defined by is a (non-unital) homomorphism .
We write the composition of two homomorphisms also as a product . The following lemma states the distributive law between the (“additive”) monoid operations and the (“multiplicative”) composition operations on the homomorphism sets.
Lemma 8.7 (Distribution laws).
For all homomorphisms and and we have
Write for the operator . A continuous linear operator between Banach spaces is actually uniformly continuous, that is, is bounded. We consider analogously only uniformly continuous, or in other words, bounded homomorphisms between abelian idempotent monoids equipped with metrics in order to get a natural length function on each homomorphism set in analogy to the norm of linear operators between Banach spaces:
Definition 8.8 (Natural length function on homomorphism set).
Let and be equipped with positive, symmetric, nilpotent functions and , respectively, satisfying the -inequality. Write for the set of all “uniformly continuous” homomorphisms , i.e. for which defined next is finite.
We define functions , where is a length function, by
(25) |
In (25) we used the bar operator of definition 7.9. Actually we should index in the data etc., but to avoid clutter we use a sloppy, lighter notation. If we choose for the trivial null function then we get all homomorphisms in . For simplicity, we call elements of just homomorphisms and drop “uniformly continuous”.
Lemma 8.9.
For all (meaningful composable) homomorphisms we have
(26) |
Proof.
The first inequality is elementary as for the norm of linear operators, and the second one follows from the -inequality of . ∎
We refer to the inequality (26) (for any given ) as the product-inequality. It is analogous to the product inequality for bounded linear operators between Banach spaces. We tidy up the situation so far and obtain the usual abstract setting of abelian idempotent monoids also for the homomorphism sets:
Lemma 8.10 (Abstraction).
The homomorphism set is an abelian idempotent monoid and of definition 8.8 a length function on it.
Proof.
For the further discussion we now need to consider a whole family of abelian idempotent monoids:
Definition 8.11 (Category of languages).
Let be a small category consisting of abelian, idempotent monoids equipped with positive, symmetric, nilpotent functions satisfying the -inequality as objects, and all possible “uniformly continuous” homomorphisms between these objects as in definition 8.2 with ordinary composition as morphisms.
It is now important to stress that in the sequel we will not use the distance functions on the objects at all. We only mentioned them to obtain a natural length function on the homomorphism sets by definition 8.8. We now require all homomorphism sets to be equipped with length functions and metric candidates:
Definition 8.12 (Category of languages with general length functions on homomorphism sets).
Our next aim is to modify the length functions on the homomorphism sets in such a way that we (just) get the natural inequality for all composable homomorphisms, in analogy to the inequality for linear operators between Banach spaces. Simultaneously we alter so as to achieve that satisfies the - and -inequality. To this end we apply the fixed point concept of section 7 to the homomorphism sets, enriched by the now also desired product inequality.
For the rest of this section we will define various new s and s simultaneously for all morphism sets in , without indicating this in the notation. We note again that we may choose all functions to be the trivial null functions, and any and s in definition 8.12 independently of the suggested s of definition 8.8.
We now repair all given symmetric, nilpotent, positive functions on the homomorphism sets in such a way that they satisfy the product inequalities with respect to the length functions:
Definition 8.13 (New submultiplicative distance function).
For all objects and in define as follows. For all morphisms in with same domain and range set
where are morphisms between any objects of for which their products as indicated are valid (composable).
The so repaired functions now fulfill the desired product inequalities:
Lemma 8.14 (Product inequality).
For composable morphisms in we have the submultiplicativity relations (often called product-inequality)
(27) |
(28) |
Proof.
because we take the infimum over a smaller set in the second line; that is, we first chose a resolution and then set . ∎
The validity of relations (27) and (28) for all morphism sets of is summarized as the product inequality. It is important that descends:
Lemma 8.15.
is a positive, symmetric nilpotent function with
Lemma 8.16.
satisfies the product-inequality, and if and only if satisfies the product-inequality.
Lemma 8.17.
If satisfies the product inequality then also the one of (26) in case that .
We now proceed analogously to section 7, and thus faster, using various definitions from there, but applied to the homomorphism sets rather than to the objects of . In what follows we use definitions 7.1, 7.4, 7.9 and 7.11 applied to the homomorphism sets. In analogy to lemma 7.7 we can now repair in such a way that we get all the desired properties:
Lemma 8.18.
For being any positive, symmetric, nilpotent functions satisfying the product-inequality on each morphism set of we have:
satisfies the - and product-inequality.
satisfies the - and product-inequality.
satisfies the -, - and product-inequality.
Proof.
Lemma 8.19.
satisfies the product-inequality if and only if satisfies it.
Proof.
By lemma 5.4 we may express by , and vice versa, and from this it is easily deduced. ∎
We introduce now the following functions:
Definition 8.20.
Define positive functions depending on (recall that we do this for each morphism set of separately) by
To finally get convergence, it is fundamental that the new metric candidates descend:
Lemma 8.21.
Proof.
Apply corollary 7.15 to and . ∎
We now perform a procedure analogous to that of definition 7.16, but for the homomorphism sets instead of the object sets:
Definition 8.22.
For all object sets and of set the start points and (def. 8.12), and then successively define and () by
where is defined as in definition 8.13 with respect to and (instead of and ).
The procedure of the last definition, successively applied to a length function, converges again, analogously to theorem 7.21, to a desired fixed point:
Theorem 8.23 (Ideal length functions on homomorphism sets).
Assume the definitions 8.11, 8.12, 8.22 and form the limits and with respect to all morphism sets of .
Then for all morphism sets of , and are length functions on fulfilling the product-inequality (26), and are pseudometrics on fulfilling the - and product-inequality; moreover, more precisely,
for all composable morphisms and of .
Proof.
This is the analogue of proposition 7.22. It shows, for example, when the procedure of definition 8.22 stops:
Proposition 8.24 (Ideal length functions on homomorphism sets attained).
If for all morphism sets of , then all satisfy the -, - and product-inequality
where each item is understood to hold for all morphism sets of .
Proof.
We may finally turn the obtained pseudometrics into metrics and note:
Corollary 8.25 (Quotient category).
By def. 5.7, for all objects in , we may then divide out elements in to obtain a metric space , which is again an abelian idempotent monoid, with metric satisfying the - and product-inequality with respect to .
These quotients are compatible with composition, i.e. if with (class brackets) and with then
In this way we obtain a category out of by replacing the morphism sets by .
Analogous distribution formulas as in lemma 8.7 hold in .
The next lemma indicates how we could show that a homomorphism is uniformly continuous with respect to various derived metric candidates.
Lemma 8.26.
Let be a homomorphism. If there is an such that
for all , then this inequality holds for all . Then also
Analogous assertions hold for instead of .
Proof.
We may apply the last theorem to get a Banach-Mazur-like distance between languages:
Definition 8.27 (Banach-Mazur-like distance between languages).
Let be equipped with a length function . Then a Banach-Mazur-like distance between and (e.g. languages) may be defined as follows:
Lemma 8.28.
This distance is a pseudometric on any category of isomorphic abelian, idempotent monoids such that the hom-sets are provided with a length function satisfying the product-inequality (26).
The length functions of theorem 8.23 may be good candidates for the Banach-Mazur-like distance, as they satisfy the product inequalities (26) (indeed, apply the inequalities of theorem 8.23 and note that ). Alternatively one may start with definitions 8.11 and 8.12 with , and then use of lemma 8.17.
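For orientation, recall the classical Banach-Mazur distance between isomorphic Banach spaces, on which the definition above is modeled; the second formula is only our guess at the elided analogue, with the length playing the role of the operator norm:

```latex
% Classical Banach-Mazur distance between isomorphic Banach spaces X, Y:
\[
  d_{\mathrm{BM}}(X,Y)
  \;=\; \log \inf \bigl\{ \, \|T\| \cdot \|T^{-1}\|
        \;:\; T\colon X \to Y \text{ an isomorphism} \, \bigr\}.
\]
% Conjectural analogue in the present setting (our assumption), with
% \(\bar\lambda\) a length function on the homomorphism sets:
\[
  d(M,N)
  \;=\; \log \inf \bigl\{ \, \bar\lambda(\varphi) \cdot \bar\lambda(\varphi^{-1})
        \;:\; \varphi\colon M \to N \text{ an isomorphism} \, \bigr\}.
\]
```

Under this reading, the product-inequality (26) yields the triangle inequality for the distance exactly as submultiplicativity of the operator norm does in the classical case.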
9. Non-increasing length functions
In this section we work with length functions which may not be monotonically increasing, that is, without requiring inequality (9). The length function of definition 2.10 may be replaced by the following possibly non-increasing bigger one:
Definition 9.1 (Length function on quotient).
Given a language with variables and a length function on it, set
Now assume that we are given an abelian, idempotent monoid with a possibly non-increasing length function on it (i.e. without postulating (9)).
In this section we take the same limit as in definitions 7.16 and 7.18, but instead of implementing the bar operator of lemma 7.10 we choose other functions and as follows, which do not require the monotone increasingness of and are still positive:
Definition 9.2.
We define for all and ()
We set for short. We use the same notations and as in definition 2.12 since, as one checks, there is no difference between definitions 9.2 and 2.12 for monotonically increasing length functions . We still have the lower bound for , but not for .
Let us be given a positive, symmetric, nilpotent function on . We are going to use definitions 7.1, 7.4 and 7.9. Define as in definition 7.11.
Lemma 9.3.
Proof.
Yielded by a similar proof as for lemma 7.14. ∎
We now perform a procedure as in definition 7.16, but with the bar operator omitted:
Definition 9.4.
Set , and then for put
Corollary 9.5.
Proof.
By lemma 9.3, applied to , we get
Proposition 9.6.
If is of the form for a non-increasing length function then satisfies the - and -inequality
Proof.
This is analogously proved as proposition 7.22. ∎
However, it cannot be claimed that is increasing.
Also, contrary to theorem 7.21, of definition 6.2 with respect to might not be increasing and might not satisfy the - and -inequality, as all the given proofs rely on the monotone increasingness of .
We could now form , define as above and restart the above procedure, or the one of section 7. There seem to be endlessly many variants. A principal problem, however, is that might not imply , so one might not be able to order the limits by .
10. General examples
In this and the remaining sections we assume that is the mathematical language of technics and physics, but also of typical mathematics. We may use quantifiers and set definitions, but sentences need not be completely defined in a rigorous mathematical way. This may not even be useful: in finding a new theory, it may be better to leave the precise meaning open. For example, a quantum field theory may be singular and yield infinities, but after renormalization the situation improves; see the various textbooks on quantum field theory, for example [22].
We distinguish variables in sentences which are unbound, like in , and bound ones like in . Here, is a function which is being defined, and in the function is being called. We sometimes underline variables (like so: ) to indicate that these are main variables.
For the length function on we choose a simple character count (i.e. for ), without counting non-notated brackets, even if they are theoretically there (as in ), and set , . For example, . The sign is always understood to bind most weakly (for example, means ). The sign means approximately equal between two real numbers, and its more precise meaning depends on the context (for example, for integers). We note that for most of the discussion this choice of is essentially irrelevant, and one may as well choose any other length function , or the length function of theorem 7.21, but to get specific numbers in certain examples we choose the character count for simplicity.
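A concrete rendering of such a character count on formula strings might look as follows; the example formula and the decision to skip blanks are our simplifications, not prescriptions from the text:

```python
# Hypothetical implementation of the "simple character count" length on
# concrete formula strings. Skipping blanks is our simplification; brackets
# that are not written out are automatically not counted.

def character_count(sentence):
    """Length function: count all written characters except blanks."""
    return len(sentence.replace(" ", ""))

print(character_count("f(x) = x^2 + 1"))   # 10
```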
From now on, everything is to be understood to be very sloppy, inexact, vague and superficial!
Example 10.1 (Main and auxiliary variables).
Example 10.2 (Length function is not directly subadditive on the level of ).
Given , in general we have
Indeed, if is longer and contains only auxiliary variables, and is short with main variables and uses the variables of (typically by function calls, that is, by outsourcing the bulk of the formulas to as auxiliary functions), then by (6) one has , and thus we would get a contradiction
The next two examples show that even if the class elements , their lengths are approximately equal.
Example 10.3 (Copies are not superfluous in general).
Even if for , then, contrary to (4), which says that because is a superfluous copy of , we still may have
(29) |
Indeed, set and , then but it cannot equivalently be deleted in
because is a different theory than because is further determined, whereas is not.
Even though (29) is an inequality in general, it becomes approximately an identity when put into , because is just a copy of :
Example 10.4 (Copies are cheap).
If and then we have
In other words, if we take a theory and add to it a copy (with other variable names) of the subtheory , then the length (= cost) of the resulting theory does not change much. This is clear for computer programs, because in we just need to call all the routines of and need not write them out a second time in . Indeed, going back to the concrete example 10.3 we may write
That is, instead of we wrote , and instead of . So we have one long definition of , approximately as long as itself, and two short calls and . So we may end up with the claim:
In this short example it may not pay off, because is so short, but if is long with few unbound variables, it certainly does.
By definition 2.8, the -operation requires that if and have no variables in common. The next lemma shows when this identity approximately holds in the quotient , see definition 5.7, even if there are common variables.
Lemma 10.5 (-operation as an idealization of -operation).
Here, means approximately less than or equal. For all we have
The next two important lemmas about auxiliary and main variables are proved exactly in the general framework of section 2.
Lemma 10.6 (The more main variables the longer the sentence).
If and is a copy of it in which some auxiliary variables are declared to be main variables, then
Proof.
In our natural definition of there is no difference between main and auxiliary variables besides the rule (6), which can only help to reduce the -length of a sentence, whence the inequality. ∎
The next lemma, an important assertion that actually holds in the general framework of section 2 as well, says that in an optimal sentence we can change all auxiliary variables to main variables without changing the element in the quotient :
Lemma 10.7 (Optimal sentences and main variables).
If is optimal in the sense that , and is a copy of the sentence but where all unbound auxiliary variables are replaced by main variables (in a bijective way) then
Proof.
Choose a variable transformation such that the copy of is disjoint to and , i.e. and .
Let be a verbatim copy of but where all unbound auxiliary variables are declared to be main variables. Then by lemma 10.6, . By law (4), is just another variable-transformed copy of and thus . Hence, by assumption we get
Thus by definition 2.12. ∎
The following example illustrates the last lemma:
Example 10.8 (Changing auxiliary to main variables).
(i) Consider the sentence defined by
Let be a verbatim copy of but where each is replaced by , so a main variable. As it appears, the representations of and are already shortest possible and thus
(30) |
because is a superfluous copy of by (4), and where we have dropped the definition of in by (6) and taken from instead. Thus
(ii) But if is a short polynomial, say , then can be simplified by saying (so is not optimal, i.e. ), but cannot, as is a main variable and so cannot be deleted, and so
This example demonstrates that the assumption of lemma 10.7 is necessary.
Example 10.9 (Approximate automaticity of (4)).
Relation (4), that is, for , is an idealization that holds approximately automatically in our prototype language, that is, . Indeed, as an illustration, considering first the sentence of example 10.11.(i), we may write (without using (4))
because we can shorten the long expression by equivalence in mathematics by just saying that we have a main variable which is just a function like and so we have
More generally, one drops the copy but rescues the main variables from by adding a list where are the main variables appearing in and those corresponding in .
Even more, one could encode the main variables into one main variable as a function on and use instead. It would be then enough to say .
The fact that in optimal sentences (i.e. those for which ) the auxiliary variables can be turned into main variables without changing optimality (see lemma 10.7) can be exploited to give an instance where the triangle inequality fails:
Example 10.10 (Violation of -inequality).
Assume that has two approximately optimal representations in the sense that in and . Of course,
Since and are approximately optimally short in length, we may replace all unbound auxiliary variables of and , respectively, by main variables to obtain new sentences and , respectively, and still
by lemma 10.7. But because and may be very different in , owing to all the new main variables now appearing in and , both contain much necessary information which cannot be deleted (so (6) does not apply), and it may be difficult to simplify , so we may get
But this contradicts the -inequality, as, assuming it, we would get
It would be helpful if one had a difference element in the sense that if for then there is a ‘complement’ such that ( united with is the whole ) and (the intersection of and is zero), that is,
by definition 6.1, or in other words, . But it does not exist in general:
Example 10.11 (Difference Element does not exist).
Consider two formulas as follows:
We assume that and , as it appears, are already in their shortest form; so, observing also that are the same polynomials, and applying to this (4) and then (3), we get
The last line shows that a difference element in does not exist, as it should have only length , which appears impossible in our language .
11. Examples from Physics
In this section we collect various examples from physics, and indicate that the distance between an existing and a new improved theory is often very low, as are the lengths of the theories themselves. A very good instance is the transition from the Klein-Gordon equation to the Dirac equation discussed in example 2.13 and the paragraph before it. Another example would be Maxwell’s equations as an extension of Ampère’s and Faraday’s laws.
Example 11.1 (Quantum mechanics).
Classical mechanics can be described as follows. In these equations, means the derivative of with respect to time, and is a free independent variable
If the Lagrangian is (kinetic energy minus potential energy) then is (energy). The Schrödinger equations [18, 19] read
(31) |
(32) |
We abbreviate quantum mechanics by , Schrödinger equations (31)-(32) by , Hamilton function by , Hamilton equations by and classical mechanics by . The distance between and is
for a suitable by def. 2.8. Since and are copies of each other, we may take the Hamilton function instead of its copy in , choose a variable transformation accordingly, and then drop the superfluous term, which consists only of auxiliary variables and is no longer needed, by (6). So we get
(33) |
a low number, for all Hamilton functions .
In (33) we have assumed that there is not much optimization possible between the involved parts, in particular in relation to a longer Hamilton function .
Example 11.2 (General relativity).
In generalizing the field equation of Newtonian mechanics to relativity, Einstein noted that, because of the equation , a perpetuum mobile could be built by sending a light beam upwards from a large mass like the earth, thereby lifting mass upwards in a gravitational field without expending energy.
To avoid a violation of the law of conservation of energy, he suggested that spacetime is deformed under the influence of mass, so that the upwards-traveling light beam shifts to red and so loses energy. To this end, mass is replaced by energy, which is turned into a tensor, the energy-momentum tensor , which interacts with the metric of the 4-dimensional spacetime by the formula [7]
Once again, the distance between general relativity GR and Newtonian field equations NF appears very low, i.e.
perhaps the shortest possible for a solution extending SR and NF and solving the perpetuum mobile problem, simply because the theory of GR is short, i.e. is low (cf. lemma 3.5).
Since Maxwell’s equations are Lorentz invariant, and is deduced from Maxwell’s equations, GR might already follow from the latter by optimization, because if nature were not Lorentz invariant while Maxwell’s equations still held, the whole alternative theory might be longer.
Problem 11.3.
Given any theory with singularities, indicate a theory without singularities (i.e. without mathematically pathological inconsistencies and divergences) such that works approximately like in the range where works, and
Even the low distance alone might increase the likelihood that generalizes properly.
12. Examples of Computing
For computer languages we use the definitions of example 2.14.
Lemma 12.1.
Assume that is an isomorphism between two computer languages (exceptionally with only finitely many main variables available).
Then there is a constant such that for all
Hence
Proof.
Indeed, write a compiler or interpreter such that
does the same on the computer as does, by compiling or interpreting the code on the computer . We assume that, since is an isomorphism, this is possible, and that in this proof is given on the formal language level as in definition 8.1. Here means an optimally short program (i.e. ) as a pure text string, which we assign to the variable above. Further, means some minor code containing all main variables of and calling to interpret or compile . Moreover, is just an abbreviation.
We also assume that
the length function gives only the length of the text string, and similarly for the second language. Also, as we have only finitely many main variables (say, one entry point of a program is the main variable), we assume that there is a fixed constant bounding their length contribution
for all sentences. We have
for any sentence. Here, the second equality holds because we do not need a second copy of the compiler, and the third because, instead of interpreting the two parts separately, we may interpret them at once to get the same code.
Hence, setting we get
Analogously we get
so that we end up with
Thus
∎
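The constant-overhead translation used in the proof can be sketched concretely. In the following toy model (all names hypothetical), a program of the first language is translated by prepending one fixed interpreter text, so the length overhead is the same constant for every program; this is the one-sided bound of lemma 12.1.

```python
# Toy model of the proof of lemma 12.1 (all names hypothetical): translating
# a program by prepending one fixed interpreter text adds the same constant
# to the length of every program.

INTERPRETER = "# fixed interpreter of language L1 on computer C2; runs the source below\n"

def translate(program_l1: str) -> str:
    """Translate an L1 program into an L2 program: fixed interpreter + source."""
    return INTERPRETER + program_l1

def length(text: str) -> int:
    """The length function: a plain character count of the text string."""
    return len(text)

# The overhead length(translate(p)) - length(p) is one fixed constant,
# independent of the program p.
overheads = {length(translate(p)) - length(p) for p in ("x=1", "x=1\ny=x+x", "print(42)")}
```

Applying the same argument to the inverse isomorphism gives the reverse inequality, so the two length functions differ by at most an additive constant, as in the lemma.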
We are going to define a reduced instruction set computer (RISC) CPU in our ordinary mathematical language of physics and engineering, by describing the behavior of its memory, accumulator and program pointer in spacetime:
Example 12.2.
A simple computer can be realized by formulas as follows:
where an assertion in brackets evaluates to 1 if it is true, and to 0 otherwise.
Here, one function stands for the memory, one for an accumulator, and one for the program pointer, and everything is indexed by time. The memory value is thus a function of place and time, while the accumulator and the program pointer are functions of time alone. We have opcodes (some constant real values) which stand for store accumulator, load accumulator, add accumulator, multiply accumulator, and branch. One command has the format: opcode, and value in the next memory cell. For example, the commands
mean: load the accumulator with the value at memory address 1000, or branch to address 2000 depending on a condition on the accumulator.
If a sentence is longer, it may pay off to implement the above computer and realize the formula via a program. In other words, we add the above computer as auxiliary variables and simplify by writing a program on the computer which substitutes (part of) the sentence. In this way, long sentences are in competition with computing, which emerges out of nowhere simply through the pressure of optimization.
The above formulas are roughly similar in shape to differential equations. One could analogously allow functions rather than real values, and add a differentiation command for the accumulator.
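The time-indexed machine of example 12.2 can be sketched as an explicit step function. The opcode values (1 to 5) and the branch condition (jump if the accumulator is nonzero) below are assumptions made only for illustration:

```python
# A minimal sketch of the time-indexed machine of example 12.2. The opcode
# values (1..5) and the branch condition (jump if the accumulator is nonzero)
# are assumptions for illustration only.

STA, LDA, ADD, MUL, BRA = 1, 2, 3, 4, 5   # store, load, add, multiply, branch

def step(mem, acc, pp):
    """One time step t -> t+1: execute the command at the program pointer."""
    op, arg = mem[pp], mem[pp + 1]         # one command = opcode plus value in the next cell
    if op == STA:                          # store accumulator at address arg
        mem = {**mem, arg: acc}
    elif op == LDA:                        # load accumulator from address arg
        acc = mem[arg]
    elif op == ADD:                        # add memory value to accumulator
        acc = acc + mem[arg]
    elif op == MUL:                        # multiply accumulator by memory value
        acc = acc * mem[arg]
    elif op == BRA:                        # conditional jump
        return mem, acc, (arg if acc != 0 else pp + 2)
    return mem, acc, pp + 2

def run(mem, steps):
    """Iterate the step function; accumulator and program pointer start at 0."""
    acc, pp = 0, 0
    for _ in range(steps):
        mem, acc, pp = step(mem, acc, pp)
    return mem, acc, pp

# Program: load from address 100, add the value at 101, store the sum at 102.
mem = {0: LDA, 1: 100, 2: ADD, 3: 101, 4: STA, 5: 102, 100: 2, 101: 3, 102: 0}
mem, acc, pp = run(mem, 3)
```

After three steps the accumulator and the memory cell at address 102 both hold the sum 2 + 3 = 5.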
We may enrich any language without variables with variables to get a macro language:
Example 12.3 (Macro language).
Given a language, with or without variables, we may form a macro language by adding countably infinitely many auxiliary and main variables, function definitions (macros) which accept text strings as input and return a text string as output, function calls, concatenation of text (written .), and a sign. Sentences of the macro language are sentences of the base language enriched with macros.
For example, the following two sentences are the same:
If two sentences of the base language are similar through much text overlap, their distance becomes low in the macro language by using macros.
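As a toy illustration of this saving (names and macro syntax hypothetical), the shared text of two sentences can be factored into one macro definition:

```python
# A toy illustration of example 12.3 (names and macro syntax hypothetical):
# two sentences sharing a long common part become close in the macro
# language, because the shared part is factored into one macro definition.

def macro_length(shared: str, *tails: str) -> int:
    """Total length of one macro definition for the shared text plus one
    short call per sentence (concatenation is written '.')."""
    definition = "def M(): return " + repr(shared)   # the macro body, stated once
    calls = ["M() . " + repr(t) for t in tails]      # each sentence = macro call . tail
    return len(definition) + sum(len(c) for c in calls)

shared = "the energy of the system is conserved and equals "
x, y = shared + "H", shared + "E"     # two sentences with large overlap
plain = len(x) + len(y)               # total length without macros
with_macro = macro_length(shared, "H", "E")
```

Since the shared text is stated only once, the macro version is shorter than the plain one, and the gap grows with the length of the overlap.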
Example 12.4 (Further examples).
(i) In the usual mathematical framework we could also allow Boolean algebra expressions and compare them. More generally, we may compare any two functions which admit definitions by symbolic expressions by comparing these symbolic expressions.
(ii) Further, we could work with text strings and concatenation in our general mathematical framework, and in this way obtain a stronger macro language than in example 12.3, as we would then have natural numbers, functions, or pointers to functions as arguments in function calls, etc.
(iii) We may consider set definitions and in this way compute the distance between sets, or algebraically structured sets like rings, etc.
(iv) Mathematical proofs may be compared by writing them down, regarding them as text strings (or better, as text strings within the framework of the mathematical logic of deduction), and comparing these text strings within the macro language as in example 12.3 or in item (ii).
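Item (i) can be illustrated by a hedged toy computation. Here the united definition of two symbolic expressions is modelled, purely for illustration, as the set of distinct lines needed to state both, and the distance shape 2·ℓ(union) − ℓ(x) − ℓ(y) is an assumption about the intended pseudometric:

```python
# A hedged toy for example 12.4(i): comparing two functions through the text
# of their symbolic definitions. The "united" definition is modelled, purely
# for illustration, as the set of distinct lines needed to state both, and
# the distance shape 2*l(union) - l(x) - l(y) is an assumption.

def l(lines):
    """Length function: total character count of a list of definition lines."""
    return sum(len(s) for s in lines)

def union(x, y):
    """United definition: all lines of both, shared lines stated once."""
    return list(dict.fromkeys(x + y))

def d(x, y):
    """Toy distance: twice the united length minus the individual lengths."""
    return 2 * l(union(x, y)) - l(x) - l(y)

# Hamiltonian mechanics vs. a Schroedinger-type evolution, sharing one line.
f = ["H = p**2/(2*m) + V(q)", "dq/dt = dH/dp", "dp/dt = -dH/dq"]
g = ["H = p**2/(2*m) + V(q)", "i*hbar*dpsi/dt = H(psi)"]
```

The shared Hamiltonian line is counted once in the union, so d(f, g) is strictly smaller than l(f) + l(g), the distance of two definitions with nothing in common; also d(x, x) = 0 and d is symmetric.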
References
- [1] A. Bertoni, M. Goldwurm, and N. Sabadini. Computing the counting function of context-free languages. In STACS 87: Theoretical aspects of computer science, Proc. 4th annual symposium, Passau, 1987, Lecture Notes in Computer Science 247, pages 169–179, 1987.
- [2] Ehud Cseresnyes and Hannes Seiwert. Regular expression length via arithmetic formula complexity. J. Comput. Syst. Sci., 125:1–24, 2022.
- [3] Cewei Cui, Zhe Dang, Thomas R. Fischer, and Oscar H. Ibarra. Similarity in languages and programs. Theor. Comput. Sci., 498:58–75, 2013.
- [4] P. A. M. Dirac. The quantum theory of the electron. I. Proc. R. Soc. Lond., Ser. A, 117:610–624, 1928.
- [5] P. A. M. Dirac. Quantised singularities in the electromagnetic field. Proc. R. Soc. Lond., Ser. A, 133:60–72, 1931.
- [6] Radosav Djordjević, Nebojša Ikodinović, and Nenad Stojanović. A propositional metric logic with fixed finite ranges. Fundam. Inform., 174(2):185–199, 2020.
- [7] A. Einstein. Zur allgemeinen Relativitätstheorie. Berl. Ber., 1915:778–786, 1915.
- [8] Sean A. Fulop and David Kephart. Topology of language classes. In The 14th meeting on the mathematics of language. Proceedings of the meeting, MoL 14, Chicago, IL, USA, July 25–26, 2015, pages 26–38. Stroudsburg, PA: Association for Computational Linguistics, 2015.
- [9] Giangiacomo Gerla. Distances, diameters and verisimilitude of theories. Arch. Math. Logic, 31(6):407–414, 1992.
- [10] Yo-Sub Han and Sang-Ki Ko. Alignment distance of regular tree languages. In Implementation and application of automata. 22nd international conference, CIAA 2017, Marne-la-Vallée, France, June 27–30, 2017. Proceedings, pages 126–137. Cham: Springer, 2017.
- [11] Stephen W. Hawking and W. Israel, editors. General relativity. An Einstein centenary survey. Cambridge: Cambridge University Press, 1979.
- [12] Oscar H. Ibarra, Ian McQuillan, and Bala Ravikumar. On counting functions and slenderness of languages. Theor. Comput. Sci., 777:356–378, 2019.
- [13] Teruo Imaoka, editor. Algebraic semigroups, formal languages and computation. Proceedings of a symposium held at the Research Institute for Mathematical Sciences, Kyoto University, Kyoto, Japan, February 19–21, 2001. RIMS Kokyuroku 1222, 2001.
- [14] Ciresica Jalobeanu. Approximation in languages. In Current issues in mathematical linguistics. 1st International Conference, ICML ’93, Virgili University, Tarragona, Catalonia, Spain, March 30–31, 1993. Selected papers, pages 161–169. Amsterdam: North-Holland, 1994.
- [15] Sang-Ki Ko, Yo-Sub Han, and Kai Salomaa. Top-down tree edit-distance of regular tree languages. In Language and automata theory and applications. 8th international conference, LATA 2014, Madrid, Spain, March 10–14, 2014. Proceedings, pages 466–477. Berlin: Springer, 2014.
- [16] Austin J. Parker, Kelly B. Yancey, and Matthew P. Yancey. Regular language distance and entropy. In 42nd international symposium on mathematical foundations of computer science, MFCS 2017, August 21–25, 2017, Aalborg, Denmark. Wadern: Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2017. Id/No 3, 14 pages.
- [17] Giovanni Pighizzini. How hard is computing the edit distance? Inf. Comput., 165(1):1–13, 2001.
- [18] E. Schrödinger. Quantisierung als Eigenwertproblem. I. Ann. der Phys. (4), 79:361–374, 1926.
- [19] E. Schrödinger. Quantisierung als Eigenwertproblem. II. Ann. der Phys. (4), 79:489–527, 1926.
- [20] Barney Stratford. The topology of language. J. Math. Psychol., 53(6):502–509, 2009.
- [21] Steven Weinberg. Ultraviolet divergences in quantum theories of gravitation. In S. W. Hawking and W. Israel, editors, General relativity. An Einstein centenary survey, pages 790–831. Cambridge: Cambridge University Press, 1979.
- [22] Steven Weinberg. The quantum theory of fields. Vol. 1: Foundations. Cambridge: Cambridge University Press, corr. repr. of the 1995 orig. edition, 1996.