Classical perspectives on the Newton–Wigner position observable

Philip K. Schwartz
Institute for Theoretical Physics, Leibniz University Hannover, Appelstraße 2, 30167 Hannover, Germany

Domenico Giulini
Institute for Theoretical Physics, Leibniz University Hannover, Appelstraße 2, 30167 Hannover, Germany
Center of Applied Space Technology and Microgravity, University of Bremen, Am Fallturm 1, 28359 Bremen, Germany

Abstract

This paper deals with the Newton–Wigner position observable for Poincaré-invariant classical systems. We prove an existence and uniqueness theorem for elementary systems that parallels the well-known Newton–Wigner theorem in the quantum context. We also discuss and justify the geometric interpretation of the Newton–Wigner position as ‘centre of spin’, which was already proposed by Fleming in 1965, again in the quantum context.

1 Introduction

Even though we shall in this paper exclusively deal with classical (i.e. non-quantum) aspects of the Newton–Wigner position observable, we wish to start with a brief discussion of its historic origin, which is based in the early history of relativistic quantum field theory (RQFT). After that we will remark on its classical importance and give an outline of this paper.

The conceptual problem of how to properly ‘localise’ a physical system ‘in space’ has a very long history, the roots of which extend to pre-Newtonian times. Newtonian concepts of space, time, and point particles allowed for sufficiently useful localisation schemes, either in terms of the position of the particle itself if an elementary (i.e. indecomposable) system is considered, or in terms of weighted convex sums of instantaneous particle positions for systems composed of many particles, such as the centre of mass. These concepts satisfy the expected covariance properties under spatial translations and rotations and readily translate to ordinary, Galilei-invariant Quantum Mechanics, where concepts like ‘position operators’ and the associated projection operators for positions within any measurable subset of space can be defined, again fulfilling the expected transformation rules under spatial motions.

However, serious difficulties with naive localisation concepts arose in attempts to combine Quantum Mechanics with Special Relativity. For example, as already observed in 1928 by Breit [1] and again in 1930–31 by Schrödinger [2, 3], a naive concept of ‘position’ for the Dirac equation leads to unexpected and apparently paradoxical results, like the infamous ‘Zitterbewegung’. It soon became clear that naive translations of concepts familiar from non-relativistic Quantum Mechanics did not lead to satisfactory results. This had to do with the fact that spatially localised wave functions (e.g. those of compact spatial support) necessarily contained negative-energy modes in their Fourier decomposition and that negative-energy modes would necessarily be introduced if a ‘naive position operator’ (like multiplying the wave function by the position coordinate) were applied to a positive-energy state. Physically this could be seen as an inevitable result of pair production that sets in once the bounds on localisation come close to the Compton wavelength. Would that argument put an end to any further attempt to define localised states in a relativistic context?

This question was analysed and answered in the negative in 1949 by Newton and Wigner [4]. Their method was to write down axioms for what it meant that a system is ‘localised in space at a given time’ and then investigate existence as well as uniqueness for corresponding position operators. It turned out that existence and uniqueness are indeed given for elementary systems (fields being elements of irreducible representations of the Poincaré group), except for massless fields of higher helicity. A more rigorous derivation was later given by Wightman [5], who also pointed out the connection with the representation-theoretic notion of ‘imprimitivity systems’ (a good textbook reference explaining the notion of imprimitivity systems is [6]).

It should be emphasised that the Newton–Wigner notion of localisation still suffers from the acausal spreading of localisation domains that is typical of fields satisfying special-relativistic wave equations, an observation made many times in the literature in one form or another; see, e.g., [7, 8, 9]. This means that if a system is Newton–Wigner localised at a point in space at a time $t$, it is no longer strictly localised in any bounded region of space at any time later than $t$ [4, 10]. In other words, the spatial bounds of localisation do not develop in time within the causal future of the original domain. Issues of that sort, and related ones concerning, in particular, the relation between Newton–Wigner localisation and the Reeh–Schlieder theorem in RQFT, have been discussed many times in the literature even up to the more recent past, with sometimes conflicting statements as to their apparent paradoxical interpretations; see, e.g., [11] and [12, 13]. These issues are not the focus of the present paper.

Clearly, due to its historical development, most discussions of Newton–Wigner localisation put their emphasis on its relevance for RQFT. This sometimes seems to mask the fact that the problem of localisation is likewise present for classical systems, in particular if they are ‘relativistic’ in the sense of Special Relativity. In fact, special-relativistic systems react with characteristic ambiguities if one tries to introduce the familiar notions of ‘centre of mass’ that one uses successfully in Newtonian physics. A first comprehensive discussion on relativistic notions of ‘position’ of various ‘centres’ was given by Pryce in his 1948 paper [14]. He starts with a list of no less than six different definitions, which Pryce labelled alphabetically from (a) through (f) and which for systems of point masses may briefly be characterised as follows: (a) and (c) correspond to taking the convex affine combination of spatial positions in each Lorentz frame with weights being equal to the rest masses and dynamical masses respectively, whereas (b) and (d) correspond to restricting this procedure to the zero-momentum frame and then transforming this position to other frames by Lorentz boosts. Possibility (e) is a combination of (c) and (d), determined by the condition that the spatial components of the ensuing position observable shall (Poisson) commute. This combination is, in fact, the Newton–Wigner position in its classical guise. Finally, possibility (f) is a variant of (b) in which the distinguished frame is not that of zero-momentum but that in which the ‘mass centre’ as defined by (a) is at rest.

In 1965, Fleming gave a more geometric discussion in [15] that highlighted the group-theoretic properties (regarding the group of spacetime automorphisms) underlying the constructions and thereby clarified many of the sometimes controversial issues regarding ‘covariance’. Fleming focussed on three position observables which he called ‘centre of inertia’, ‘centre of mass’, and the Newton–Wigner position observable, for which he, at the very end of his paper and almost in passing, suggested the name ‘centre of spin’. In our paper we shall give a more detailed geometric justification for that name.

Pryce, Fleming, and other contemporary commentators mainly had RQFT in mind as the main target for their considerations, presumably because the study of deeply relativistic classical systems was simply not considered relevant at that time. But that has clearly changed with the advent of modern relativistic astrophysics. For example, modern analytical studies of close compact binary-star systems also make use of various definitions of ‘centre of mass’ in an attempt to separate the ‘overall’ from the ‘internal’ motion as far as possible. Note that, as is well known, special-relativistic many-particle systems will generally show dynamical couplings of internal and external degrees of freedom which cannot be eliminated altogether by more clever choices of external coordinates. But, in that respect, it turns out that modern treatments of gravitationally interacting two-body systems within the theoretical framework of Hamiltonian General Relativity show a clear preference for the Newton–Wigner position [16, 17], emphasising once more its distinguished role, now in a purely classical context. In passing we mention the importance and long history connected with the ‘problem of motion’ in General Relativity, i.e. the problem of how to associate a timelike worldline with the field-theoretic evolution of an extended and structured body, a glimpse of which may be obtained by the recent collection [18]. A concise account of the various definitions of ‘centres’ that have been used in the context of General Relativity is given in [19], which also contains most of the original references in its bibliography. In our opinion, all this provides sufficient motivation for further attempts to work out the characteristic properties of Newton–Wigner localisation in the classical realm.

The plan of our investigation is as follows. After setting up our notation and conventions in section 2, where we also introduce some mathematical background, we prove a few results in section 3 which are intended to explain in what sense the Newton–Wigner position is indeed a ‘centre of spin’ and in what sense it is uniquely so (theorem 3.12). We continue in section 4 with the statement and proof of a classical analogue of the Newton–Wigner theorem, according to which the Newton–Wigner position is the unique observable satisfying a set of axioms. The result is presented in theorem 4.6 and in a slightly different formulation in theorem 4.7. They say that for a classical elementary Poincaré-invariant system with timelike four-momentum (as classified by Arens [20, 21]), there is a unique observable transforming ‘as a position should’ under translations, rotations, and time reversal, having Poisson-commuting components, and satisfying a regularity condition (being $C^{1}$ on all of phase space). This observable is the Newton–Wigner position.

2 Notation and conventions

This section lists our notation and conventions and provides some background material on the geometric and group-theoretic setting on which the following two sections are based.

2.1 Minkowski spacetime and the Poincaré group

We use the ‘mostly plus’ $(-{++}+)$ signature convention for the spacetime metric and stick, as indicated, to four dimensions. This is not to say that our analysis cannot be generalised to other dimensions. In fact, as will become clear as we proceed, many of our statements have an obvious generalisation to other, in particular higher dimensions. On the other hand, as will also become clear, there are a few constructions which would definitely look different in other dimensions, like, e.g., the use of the Pauli–Lubański ‘vector’ in section 2.5, which becomes an $(n-3)$ -form in $n$ dimensions, or the classification of elementary systems.

The velocity of light will be denoted by $c$ , and not set equal to $1$ . Affine Minkowski spacetime will be denoted by $M$ , and the corresponding vector space of ‘difference vectors’ will be denoted by $V$ . The Minkowski metric will be denoted by $\eta\colon V\times V\to\mathbb{R}$ . The isomorphism of $V$ with its dual space $V^{*}$ induced by $\eta$ (‘index lowering’) will be denoted by a superscript ‘flat’ symbol $\flat$ , i.e. for a vector $v\in V$ the corresponding one-form is $v^{\flat}:=\eta(v,\cdot)\in V^{*}$ . The inverse isomorphism (‘index raising’) will be denoted by a superscript sharp symbol $\sharp$ . Note that under a Lorentz transformation $\Lambda$ , $v\in V$ transforms under the defining representation, $(\Lambda,v)\mapsto\Lambda v$ , whereas its image $v^{\flat}\in V^{*}$ under the $\eta$ -induced isomorphism transforms under the inverse transposed, $(\Lambda,v^{\flat})\mapsto(\Lambda^{-1})^{\top}v^{\flat}=v^{\flat}\circ\Lambda^{-1}$ .
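The statement about index lowering and the inverse transposed can be checked numerically. The following is a minimal sketch, assuming an orthonormal basis and using NumPy; the helper names `boost_x` and `flat` are hypothetical and introduced only for illustration.

```python
import numpy as np

# Minkowski metric in the 'mostly plus' convention (-+++), orthonormal basis.
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def boost_x(rapidity):
    """Lorentz boost along e_1 (hypothetical helper for this sketch)."""
    ch, sh = np.cosh(rapidity), np.sinh(rapidity)
    Lam = np.eye(4)
    Lam[0, 0] = Lam[1, 1] = ch
    Lam[0, 1] = Lam[1, 0] = sh
    return Lam

def flat(v):
    """Index lowering: components v_mu = eta_{mu nu} v^nu."""
    return eta @ v

Lam = boost_x(0.7)
v = np.array([1.0, 2.0, -0.5, 3.0])

# Lam is a linear isometry of (V, eta): Lam^T eta Lam = eta.
is_isometry = np.allclose(Lam.T @ eta @ Lam, eta)

# Lowering the index of (Lam v) agrees with acting on v^flat
# by the inverse transposed matrix.
lhs = flat(Lam @ v)
rhs = np.linalg.inv(Lam).T @ flat(v)
transforms_contragrediently = np.allclose(lhs, rhs)
```

The second check is just the isometry condition $\Lambda^{\top}\eta\Lambda=\eta$ rewritten as $\eta\Lambda=(\Lambda^{-1})^{\top}\eta$.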

We fix an orientation and a time orientation on $M$. The (homogeneous) Lorentz group, i.e. the group of linear isometries of $(V,\eta)$, will be denoted by $\mathcal{L}:=\mathsf{O}(V,\eta)$. The Poincaré group, i.e. the group of affine isometries of $(M,\eta)$, will be denoted by $\mathcal{P}$. The proper orthochronous Lorentz and Poincaré groups (i.e. the connected components of the identity) will be denoted by $\mathcal{L}_{+}^{\uparrow}$ and $\mathcal{P}_{+}^{\uparrow}$, respectively. (Note that speaking of just orthochronous or proper Lorentz / Poincaré transformations does not make invariant sense without specifying a time direction.)

We employ standard index notation for Minkowski spacetime, using lowercase Greek letters for spacetime indices. When working with respect to bases, we will, unless otherwise stated, assume them to be positively oriented and orthonormal, and we will use 0 for the timelike and lowercase Latin letters for spatial indices. We will adhere to standard practice in physics where lowering and raising of indices are done while keeping the same kernel symbol; i.e. for a vector $v\in V$ with components $v^{\mu}$ , the components of the corresponding one-form $v^{\flat}\in V^{*}$ will be denoted simply by $v_{\mu}$ . For the sake of notational clarity, we will sometimes denote the Minkowski inner product of two vectors $u,v\in V$ simply by

u\cdot v:=\eta(u,v)=u_{\mu}v^{\mu}.

(2.1)

We fix, once and for all, a reference point / origin $o\in M$ in (affine) Minkowski spacetime, allowing us to identify $M$ with its corresponding vector space $V$ (identifying the reference point $o\in M$ with the zero vector $0\in V$ , i.e. via $M\ni x\mapsto(x-o)\in V$ ), which we will do most of the time. Using the reference point $o\in M$ , the Poincaré group splits as a semidirect product

\mathcal{P}=\mathcal{L}\ltimes V

(2.2)

where the Lorentz group factor in this decomposition arises as the stabiliser of the reference point – i.e. a Poincaré transformation is considered a homogeneous Lorentz transformation if and only if it leaves $o$ invariant. Thus, a homogeneous Lorentz transformation $\Lambda\in\mathcal{L}$ acts on a point $x\in M\equiv V$ as $(\Lambda x)^{\mu}=\Lambda^{\mu}_{\hphantom{\mu}\nu}x^{\nu}$ , and a Poincaré transformation $(\Lambda,a)\in\mathcal{P}$ acts as $((\Lambda,a)\cdot x)^{\mu}=\Lambda^{\mu}_{\hphantom{\mu}\nu}x^{\nu}+a^{\mu}$ .
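The semidirect product structure and the action on points can be made concrete in a small numerical sketch. It assumes an orthonormal basis and represents a Poincaré transformation as a pair of a $4\times 4$ matrix and a 4-vector; the function names `act` and `compose` and the two helper transformations are hypothetical, chosen only for this illustration.

```python
import numpy as np

# Minkowski metric, 'mostly plus' signature, in an orthonormal basis.
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def boost_x(rapidity):
    """Boost along e_1 (hypothetical helper for this sketch)."""
    ch, sh = np.cosh(rapidity), np.sinh(rapidity)
    Lam = np.eye(4)
    Lam[0, 0] = Lam[1, 1] = ch
    Lam[0, 1] = Lam[1, 0] = sh
    return Lam

def rotation_z(angle):
    """Rotation in the e_1-e_2 plane (hypothetical helper for this sketch)."""
    c, s = np.cos(angle), np.sin(angle)
    Lam = np.eye(4)
    Lam[1, 1], Lam[1, 2] = c, -s
    Lam[2, 1], Lam[2, 2] = s, c
    return Lam

def act(g, x):
    """(Lambda, a) . x = Lambda x + a."""
    Lam, a = g
    return Lam @ x + a

def compose(g1, g2):
    """Semidirect product law: (L1, a1)(L2, a2) = (L1 L2, a1 + L1 a2)."""
    (Lam1, a1), (Lam2, a2) = g1, g2
    return Lam1 @ Lam2, a1 + Lam1 @ a2

g1 = (boost_x(0.5), np.array([1.0, 0.0, -2.0, 0.5]))
g2 = (rotation_z(1.1), np.array([0.3, 2.0, 0.0, -1.0]))
x = np.array([0.2, 1.0, -0.7, 3.0])
y = np.array([-1.0, 0.4, 0.9, 0.0])

# Acting successively agrees with acting by the product.
composition_ok = np.allclose(act(g1, act(g2, x)), act(compose(g1, g2), x))

# Affine isometry: the Minkowski square of difference vectors is preserved.
d, d_moved = x - y, act(g1, x) - act(g1, y)
isometry_ok = np.isclose(d @ eta @ d, d_moved @ eta @ d_moved)

# Elements with a = 0 form the stabiliser of the origin o (the zero vector).
origin_fixed = np.allclose(act((boost_x(0.5), np.zeros(4)), np.zeros(4)), np.zeros(4))
```

The composition rule implemented in `compose` is the one that makes $x\mapsto\Lambda x+a$ a left action.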

We will sometimes make use of the set of spacelike hyperplanes in (affine) Minkowski spacetime $M$ , which we will denote by

\mathsf{SpHP}:=\{\Sigma\subset M:\Sigma\;\text{spacelike hyperplane}\}.

(2.3)

Since the image of a spacelike hyperplane under a Poincaré transformation is again a spacelike hyperplane, there is a natural action of the Poincaré group on $\mathsf{SpHP}$ , which we will denote by $((\Lambda,a),\Sigma)\mapsto(\Lambda,a)\cdot\Sigma$ and spell out in more detail in equation (3.4) below.

2.2 The Poincaré algebra

When considering the Lie algebra $\mathfrak{p}$ of the Poincaré group (or symplectic representations thereof), we will denote the generators of translations by $P_{\mu}$ such that $a^{\mu}P_{\mu}$ is the ‘infinitesimal transformation’ corresponding to the translation by $a\in V$ , and the generators of homogeneous Lorentz transformations (with respect to the chosen origin $o$ ) by $J_{\mu\nu}$ , such that $-\frac{1}{2}\omega^{\mu\nu}J_{\mu\nu}$ is the ‘infinitesimal transformation’ corresponding to the Lorentz transformation $\exp(\omega)\in\mathcal{L}_{+}^{\uparrow}\subset\mathsf{GL}(V)$ for $\omega\in\mathfrak{l}=\mathrm{Lie}(\mathcal{L})\subset\mathrm{End}(V)$ .

Since we are using the $(-{++}+)$ signature convention, the minus sign in the expression $-\frac{1}{2}\omega^{\mu\nu}J_{\mu\nu}$ is necessary in order that $J_{ab}$ generate rotations in the $e_{a}$ – $e_{b}$ plane from $e_{a}$ towards $e_{b}$ , which is the convention we want to adopt. A detailed discussion of these issues regarding sign conventions for the generators of special orthogonal groups can be found in appendix A. Moreover, if $u\in V$ is a future-directed unit timelike vector, then $cP_{\mu}u^{\mu}$ (i.e. $cP_{0}$ in the Lorentz frame defined by $u=e_{0}$ ), which is minus the energy in the frame defined by $u$ , is the generator of active time translations in the direction of $u$ . Therefore, with our conventions, for the case of causal four-momentum $P\in V$ the energy (with respect to future-directed time directions) is positive if and only if $P$ is future-directed.

With our conventions, the commutation relations for the Poincaré generators are as follows:


[P_{\mu},P_{\nu}]=0

(2.4a)

[J_{\mu\nu},P_{\rho}]=\eta_{\mu\rho}P_{\nu}-\eta_{\nu\rho}P_{\mu}

(2.4b)

[J_{\mu\nu},J_{\rho\sigma}]=\eta_{\mu\rho}J_{\nu\sigma}+\text{(antisymm.)}=\Big(\eta_{\mu\rho}J_{\nu\sigma}-(\mu\leftrightarrow\nu)\Big)-\Big(\rho\leftrightarrow\sigma\Big)

(2.4c)

As indicated, the abbreviation ‘antisymm.’, which we shall also use in the sequel of this paper, stands for the additional three terms that one obtains by first antisymmetrising (without a factor of $1/2$ ) in the first pair of indices on the left hand side, here $(\mu\nu)$ , and then the ensuing combination once more in the second set of indices, here $(\rho\sigma)$ , again without a factor $1/2$ .

For later reference we already point out here that this Lie algebra has several convenient features, one of which is that it is perfect. This means that it equals its own derived algebra or, in other words, that each of its elements is expressible as a linear combination of Lie brackets. This is easy to see directly from (2.4). Indeed, contraction of (2.4b) and (2.4c) with $\eta^{\mu\rho}$ gives $(\dim V-1)P_{\nu}$ in the first and $(\dim V-2)J_{\nu\sigma}$ in the second case, showing that each basis element $P_{\nu}$ and $J_{\nu\sigma}$ is a linear combination of Lie brackets if $\dim V>2$. Being perfect implies that its first cohomology is trivial. Moreover, the second cohomology is also trivial. Being perfect and of trivial second cohomology will later allow us to conclude that symplectic actions are necessarily Poisson actions.
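The contraction argument can be verified mechanically from the structure constants alone. The sketch below is an illustration under stated assumptions, not part of the original argument: it encodes a basis of $\mathfrak{p}$ as vectors in $\mathbb{R}^{10}$, implements the brackets (2.4b) and (2.4c) with the four terms of ‘antisymm.’ written out, and checks both contractions as well as the dimension of the derived algebra for $\dim V=4$. All function names are hypothetical.

```python
import itertools
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
eta_inv = np.linalg.inv(eta)
pairs = list(itertools.combinations(range(4), 2))  # (mu, nu) with mu < nu
DIM = 4 + len(pairs)                               # 4 translations + 6 Lorentz generators

def P(mu):
    """Basis vector of R^10 standing for the generator P_mu."""
    e = np.zeros(DIM)
    e[mu] = 1.0
    return e

def J(mu, nu):
    """Vector standing for J_{mu nu}, using J_{nu mu} = -J_{mu nu}."""
    e = np.zeros(DIM)
    if mu < nu:
        e[4 + pairs.index((mu, nu))] = 1.0
    elif mu > nu:
        e[4 + pairs.index((nu, mu))] = -1.0
    return e

def bracket_JP(mu, nu, rho):
    """[J_{mu nu}, P_rho] as in (2.4b)."""
    return eta[mu, rho] * P(nu) - eta[nu, rho] * P(mu)

def bracket_JJ(mu, nu, rho, sigma):
    """[J_{mu nu}, J_{rho sigma}] as in (2.4c), all four terms spelled out."""
    return (eta[mu, rho] * J(nu, sigma) - eta[nu, rho] * J(mu, sigma)
            - eta[mu, sigma] * J(nu, rho) + eta[nu, sigma] * J(mu, rho))

def contracted_JP(nu):
    """eta^{mu rho} [J_{mu nu}, P_rho]."""
    return sum(eta_inv[m, r] * bracket_JP(m, nu, r)
               for m in range(4) for r in range(4))

def contracted_JJ(nu, sigma):
    """eta^{mu rho} [J_{mu nu}, J_{rho sigma}]."""
    return sum(eta_inv[m, r] * bracket_JJ(m, nu, r, sigma)
               for m in range(4) for r in range(4))

# Span of all brackets of basis elements ([P, P] = 0 contributes nothing).
brackets = [bracket_JP(m, n, r) for (m, n) in pairs for r in range(4)]
brackets += [bracket_JJ(m, n, r, s) for (m, n) in pairs for (r, s) in pairs]
derived_rank = np.linalg.matrix_rank(np.array(brackets))
```

A rank of 10 for the span of all brackets means that the derived algebra is all of $\mathfrak{p}$, which is the perfectness property used later.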

2.3 Symplectic geometry

We employ the following sign conventions for symplectic geometry (as used by Abraham and Marsden in [22], but different to those of Arnold in [23]). Let $(\Gamma,\omega)$ be a symplectic manifold. For a smooth function $f\in C^{\infty}(\Gamma)$ , we define the Hamiltonian vector field $X_{f}\in ST(\Gamma)$ ( $ST$ denoting sections in the tangent bundle) corresponding to $f$ by

\iota_{X_{f}}\omega:=\omega(X_{f},\cdot)=\mathrm{d}f,

(2.5)

where $\iota$ denotes the interior product between vector fields and differential forms. The Poisson bracket of two smooth functions $f,g\in C^{\infty}(\Gamma)$ is then defined as

\{f,g\}:=\omega(X_{f},X_{g})=\mathrm{d}f(X_{g})=\iota_{X_{g}}\mathrm{d}f.

(2.6)

These conventions give the usual coordinate forms of the Hamiltonian flow equations and the Poisson bracket if the symplectic form $\omega$ takes the coordinate form (sign-opposite to that in [23])

\omega=\mathrm{d}q^{a}\wedge\mathrm{d}p_{a}\,.

(2.7)
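As a short worked check of these conventions (a sketch in local canonical coordinates, with the component functions $A^{a}$, $B_{a}$ introduced here only for the calculation), one may solve (2.5) for $X_{f}$ and insert the result into (2.6):

```latex
% Ansatz: X_f = A^a \partial/\partial q^a + B_a \partial/\partial p_a,
% with \omega = \mathrm{d}q^a \wedge \mathrm{d}p_a as in (2.7).
\begin{aligned}
\iota_{X_f}\omega
  &= A^{a}\,\mathrm{d}p_{a} - B_{a}\,\mathrm{d}q^{a}
   \overset{(2.5)}{=} \frac{\partial f}{\partial q^{a}}\,\mathrm{d}q^{a}
     + \frac{\partial f}{\partial p_{a}}\,\mathrm{d}p_{a}
  \\
\Longrightarrow\quad
X_f &= \frac{\partial f}{\partial p_{a}}\frac{\partial}{\partial q^{a}}
      - \frac{\partial f}{\partial q^{a}}\frac{\partial}{\partial p_{a}},
  \\
\{f,g\} &= \mathrm{d}f(X_g)
  = \frac{\partial f}{\partial q^{a}}\frac{\partial g}{\partial p_{a}}
  - \frac{\partial f}{\partial p_{a}}\frac{\partial g}{\partial q^{a}},
  \qquad \{q^{a},p_{b}\}=\delta^{a}_{b}.
\end{aligned}
```

In particular, the integral curves of $X_{f}$ satisfy Hamilton's equations $\dot{q}^{a}=\partial f/\partial p_{a}$, $\dot{p}_{a}=-\partial f/\partial q^{a}$ in their usual form.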

It is important to note that $C^{\infty}(\Gamma)$ as well as $ST(\Gamma)$ are (infinite dimensional) Lie algebras with respect to the Poisson bracket and the commutator respectively, and that, with respect to these Lie structures, the map $C^{\infty}(\Gamma)\to ST(\Gamma),f\mapsto X_{f}$ is a Lie anti-homomorphism, that is,

X_{\{f,g\}}=-\,[X_{f},X_{g}].

(2.8)

The proof is simple once one recalls from (2.5) that the Lie derivative of $\omega$ with respect to any Hamiltonian vector field vanishes: $L_{X_{f}}\omega=\mathrm{d}(\iota_{X_{f}}\omega)+\iota_{X_{f}}\mathrm{d}\omega=\mathrm{d}^{2}f=0$ . Therefore, $\mathrm{d}\{f,g\}=\mathrm{d}(\iota_{X_{g}}\mathrm{d}f)=L_{X_{g}}\mathrm{d}f=L_{X_{g}}(\iota_{X_{f}}\omega)=-\iota_{[X_{f},X_{g}]}\omega$ .

By saying that a one-parameter group $\phi_{s}\colon\Gamma\to\Gamma$ of symplectomorphisms is generated by a function $g\in C^{\infty}(\Gamma)$ , we mean that $\phi_{s}$ is the flow of the Hamiltonian vector field to $g$ , i.e. that

\frac{\mathrm{d}}{\mathrm{d}s}\phi_{s}(\gamma)=X_{g}(\phi_{s}(\gamma))

(2.9)

for $\gamma\in\Gamma$ , or equivalently

\frac{\mathrm{d}}{\mathrm{d}s}(f\circ\phi_{s})=\Big(\mathrm{d}f(X_{g})\Big)\circ\phi_{s}=\{f,g\}\circ\phi_{s}

(2.10)

for $f\in C^{\infty}(\Gamma)$. Here both sides of (2.10) are to be understood as evaluated pointwise.
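For a concrete instance of a flow generated by a function, consider the following minimal sketch. It assumes the toy phase space $\mathbb{R}^{2}$ with $\omega=\mathrm{d}q\wedge\mathrm{d}p$ and the generator $g=(q^{2}+p^{2})/2$, neither of which appears in the text; for this choice the flow is known in closed form, and the derivative of $f\circ\phi_{s}$ is compared with $\{f,g\}\circ\phi_{s}$ by a central finite difference.

```python
import math

def flow(s, q, p):
    """Flow of X_g for g = (q^2 + p^2)/2, i.e. dq/ds = p, dp/ds = -q."""
    return (q * math.cos(s) + p * math.sin(s),
            -q * math.sin(s) + p * math.cos(s))

def f(q, p):
    """Test observable f = q p (an arbitrary illustrative choice)."""
    return q * p

def bracket_f_g(q, p):
    """{f, g} = f_q g_p - f_p g_q = p^2 - q^2 for the f and g above."""
    return p * p - q * q

def ds_of_f_along_flow(s, q, p, h=1e-6):
    """Central-difference approximation of d/ds (f o phi_s) at the point (q, p)."""
    return (f(*flow(s + h, q, p)) - f(*flow(s - h, q, p))) / (2 * h)

q0, p0, s0 = 0.3, -1.2, 0.8
lhs = ds_of_f_along_flow(s0, q0, p0)   # d/ds (f o phi_s), evaluated at (q0, p0)
rhs = bracket_f_g(*flow(s0, q0, p0))   # ({f, g} o phi_s), evaluated at (q0, p0)
```

Both numbers agree up to the discretisation error of the finite difference, and the one-parameter group property $\phi_{s}\circ\phi_{t}=\phi_{s+t}$ can be checked in the same way.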

2.4 Poincaré-invariant Hamiltonian systems and their momentum maps

A classical Poincaré-invariant system will be described by a phase space $(\Gamma,\omega)$ – i.e. a symplectic manifold – with a symplectic action

\Phi\colon\mathcal{P}\times\Gamma\to\Gamma,\;((\Lambda,a),\gamma)\mapsto\Phi_{(\Lambda,a)}(\gamma)

(2.11)

of the Poincaré group (in fact, for most of our purposes an action of $\mathcal{P}_{+}^{\uparrow}$ is enough). We will take $\Phi$ to be a ‘left’ action (we refer to [24] for a detailed discussion of left versus right actions and the corresponding sign conventions, which will also play an important role in the sequel of this paper), i.e. to satisfy

\Phi_{(\Lambda_{1},a_{1})}\circ\Phi_{(\Lambda_{2},a_{2})}=\Phi_{(\Lambda_{1}\Lambda_{2},a_{1}+\Lambda_{1}a_{2})}\;.

(2.12)

We will denote such systems as $(\Gamma,\omega,\Phi)$ .

The left action $\Phi$ of $\mathcal{P}$ on $\Gamma$ induces vector fields $V_{\xi}$ on $\Gamma$ (the so-called ‘fundamental vector fields’), one for each $\xi$ in the Lie algebra $\mathfrak{p}$ of $\mathcal{P}$ . They are given by

V_{\xi}(\gamma):=\left.\frac{\mathrm{d}}{\mathrm{d}s}\Phi_{\exp(s\xi)}(\gamma)\right|_{s=0}\;,

(2.13)

so that the map $\mathfrak{p}\to ST(\Gamma),\xi\mapsto V_{\xi}$, given by the differential of $\Phi$ with respect to its first argument and evaluated at the group identity, is clearly linear. In fact, it is straightforward to show that it is an anti-homomorphism from the Lie algebra $\mathfrak{p}$ into the Lie algebra $ST(\Gamma)$ (had we chosen $\Phi$ to be a right action, we would have obtained a proper Lie homomorphism; compare [24, appendix B]), i.e.

\bigl{[}V_{\xi_{1}},V_{\xi_{2}}\bigr{]}=-V_{[\xi_{1},\xi_{2}]}.

(2.14)
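The sign in (2.14) can be illustrated with a toy left action. The sketch below is a stand-in and not the Poincaré action on phase space: it assumes the linear action $x\mapsto gx$ of a matrix group on $\mathbb{R}^{3}$, for which (2.13) gives $V_{\xi}(x)=\xi x$, and evaluates the commutator of vector fields by central differences. The names `fundamental_field` and `lie_bracket` are hypothetical.

```python
import numpy as np

def fundamental_field(xi):
    """V_xi(x) = d/ds exp(s xi) x at s = 0, i.e. xi x, for the linear left action."""
    return lambda x: xi @ x

def lie_bracket(X, Y, x, h=1e-5):
    """Commutator of vector fields, [X, Y](x) = DY(x) X(x) - DX(x) Y(x),
    with the directional derivatives taken by central differences."""
    dY_along_X = (Y(x + h * X(x)) - Y(x - h * X(x))) / (2 * h)
    dX_along_Y = (X(x + h * Y(x)) - X(x - h * Y(x))) / (2 * h)
    return dY_along_X - dX_along_Y

# Two elements of so(3), used as the Lie algebra of the toy group.
xi1 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
xi2 = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
x = np.array([0.4, -1.1, 2.0])

V1, V2 = fundamental_field(xi1), fundamental_field(xi2)
lhs = lie_bracket(V1, V2, x)                          # [V_xi1, V_xi2](x)
rhs = -fundamental_field(xi1 @ xi2 - xi2 @ xi1)(x)    # -V_{[xi1, xi2]}(x)
```

For linear fields the bracket reduces to $(\xi_{2}\xi_{1}-\xi_{1}\xi_{2})x$, which makes the minus sign for left actions explicit.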

Moreover, a similar calculation shows [24, appendix B]

(\mathrm{D}\Phi_{(\Lambda,a)})\circ V_{\xi}=V_{\mathrm{Ad}_{(\Lambda,a)}(\xi)}\circ\Phi_{(\Lambda,a)}\;,

(2.15)

where $\mathrm{D}\Phi_{(\Lambda,a)}\colon T\Gamma\to T\Gamma$ denotes the differential of $\Phi_{(\Lambda,a)}\colon\Gamma\to\Gamma$ .

As $\mathcal{P}$ acts by symplectomorphisms, we clearly have

L_{V_{\xi}}\omega=0\quad\text{for all}\;\xi\in\mathfrak{p}.

(2.16)

As $\omega$ is closed, the latter equation implies that $\iota_{V_{\xi}}\omega$ is likewise closed. Hence, by Poincaré’s lemma, locally (i.e. in a neighbourhood of each point) there exists a local function $f_{\xi}$ such that $\mathrm{d}f_{\xi}=\iota_{V_{\xi}}\omega$ . This function is unique up to the addition of a $\xi$ -dependent constant. Again by Poincaré’s lemma we could argue that $f_{\xi}$ existed globally if $\Gamma$ were simply connected. But, fortunately, we do not need that extra assumption.

In fact, since we are dealing with a special group, the function $f_{\xi}$ always exists globally, irrespective of the topology of $\Gamma$, so that each $V_{\xi}$ is a globally defined Hamiltonian vector field (i.e. each one-parameter group $\Phi_{\exp(s\xi)}\colon\Gamma\to\Gamma$ of symplectomorphisms is generated, in the sense of (2.10), by the corresponding function $f_{\xi}$). Moreover, the constants up to which the collection of $f_{\xi}$ is defined can be chosen in such a way that the map $\xi\mapsto f_{\xi}$ from the Lie algebra $\mathfrak{p}$ to the Lie algebra $C^{\infty}(\Gamma)$ (the Lie product of the latter being the Poisson bracket) is a Lie homomorphism, i.e.

\left\{f_{\xi_{1}},f_{\xi_{2}}\right\}=f_{[\xi_{1},\xi_{2}]}.

(2.17)

This clearly fixes the constants uniquely. Note that, according to (2.14) and (2.8), both maps, $\xi\mapsto V_{\xi}$ and $V_{\xi}\mapsto f_{\xi}$ , are Lie anti-homomorphisms. Hence their combination $\xi\mapsto f_{\xi}$ is a proper Lie homomorphism (no minus sign on the right-hand side of (2.17)).

A symplectic action of a group whose generating vector fields are globally Hamiltonian and satisfy (2.17) is called a Poisson action. The statement made here is that if $\dim V>2$, any symplectic action of the Poincaré group is always a Poisson action. This is a non-trivial statement depending crucially on properties of the group’s Lie algebra. For example, it would fail to hold for the Galilei group (homogeneous as well as inhomogeneous) which, despite being just a contraction of the Poincaré group, behaves quite differently in that matter and, consequently, also as regards the problem of localisation [25, 5].

The underlying reason for why $f_{\xi}$ exists globally is that $\mathfrak{p}$ is perfect, as already shown above. Indeed, the proof is quite simple: Since $\xi=[\xi_{1},\xi_{2}]$ (or sums of such commutators) we have $V_{\xi}=-[V_{\xi_{1}},V_{\xi_{2}}]$ and hence $\mathrm{d}f_{\xi}=-\iota_{[V_{\xi_{1}},V_{\xi_{2}}]}\omega=-L_{V_{\xi_{1}}}(\iota_{V_{\xi_{2}}}\omega)=\mathrm{d}(\omega(V_{\xi_{1}},V_{\xi_{2}}))$ , so that $f_{\xi}=\omega(V_{\xi_{1}},V_{\xi_{2}})+\text{const.}$ which is globally defined. The other statement concerning the choice of constants that guarantee (2.17) is an immediate consequence of the triviality of the second cohomology of $\mathfrak{p}$ , the proof of which may, e.g., be looked up in [26, § 3.3].

Having established global existence and uniqueness of the generators $f_{\xi}$ satisfying $\omega(V_{\xi},\cdot)=\mathrm{d}f_{\xi}$ , we can now deduce the transformation property of $f_{\xi}$ under the action of $\mathcal{P}$ . Taking the pullback of the equation $\omega(V_{\xi},\cdot)=\mathrm{d}f_{\xi}$ with $\Phi_{(\Lambda,a)^{-1}}$ and using the invariance of $\omega$ as well as (2.15), we immediately deduce

\Phi_{(\Lambda,a)^{-1}}^{*}f_{\xi}:=f_{\xi}\circ\Phi_{(\Lambda,a)^{-1}}=f_{\mathrm{Ad}_{(\Lambda,a)}(\xi)}\;,

(2.18)

which may also be read as the invariance of the real-valued function $f\colon\mathfrak{p}\times\Gamma\to\mathbb{R}$, $(\xi,\gamma)\mapsto f_{\xi}(\gamma)$, under the combined left action of $\mathcal{P}$ on $\mathfrak{p}\times\Gamma$ given by $\mathrm{Ad}\times\Phi$. Alternatively, since $\xi\mapsto f_{\xi}$ is linear, we may regard $f$ as a $\mathfrak{p}^{*}$-valued function on $\Gamma$, where $\mathfrak{p}^{*}$ denotes the vector space dual to $\mathfrak{p}$. This map is called the momentum map for the given system $(\Gamma,\omega,\Phi)$; see [22, chap. 4.2] for a general discussion on the notion of ‘momentum map’ and also [24] for an account of its use and properties restricted to the case of Poincaré-invariant systems. According to (2.18), the momentum map is $\mathrm{Ad}^{*}$-equivariant:

f\circ\Phi_{(\Lambda,a)}=\mathrm{Ad}^{*}_{(\Lambda,a)}\circ f\iff\mathrm{Ad}^{*}_{(\Lambda,a)}\circ f\circ\Phi_{(\Lambda,a)^{-1}}=f

(2.19)

The second expression is again meant to stress that the condition of equivariance is equivalent to the invariance of the function $f$ under the combined left actions in its domain and target spaces (invariance of the graph). Note that $\mathrm{Ad}^{*}$ denotes the co-adjoint representation of $\mathcal{P}$ on $\mathfrak{p}^{*}$ , given by $\mathrm{Ad}^{*}_{(\Lambda,a)}:=(\mathrm{Ad}_{(\Lambda,a)^{-1}})^{\top}$ with superscript $\top$ denoting the transposed map.

Points in $\Gamma$ faithfully represent the state of the physical system whereas observables correspond to functions on $\Gamma$ . In order to implement time evolution we shall employ a ‘classical Heisenberg picture’, in which the phase space point remains the same at all times, whereas the evolution will correspond to the changes of observables according to their association to different spacelike hyperplanes in spacetime. Although this is different from the (‘Schrödinger picture’) approach usually taken in classical mechanics (where the state of the system is given by a phase space point changing in ‘time’, which is an external parameter), this point of view is clearly better adapted to the Poincaré-relativistic framework, in which there simply is no absolute notion of time.

Choosing a set of ten basis vectors $(P_{\mu},J_{\mu\nu})$ for $\mathfrak{p}$ obeying (2.4) (compare appendix A), we can contract the $\mathfrak{p}^{*}$ -valued momentum map with each of these basis vectors in order to obtain the corresponding ten real-valued component functions of the momentum map. By some abuse of notation we shall call these component functions by the same letters $(P_{\mu},J_{\mu\nu})$ as the Lie algebra elements themselves. Equation (2.17) now says that the map that sends the Lie algebra elements $P_{\mu}$ and $J_{\mu\nu}$ in $\mathfrak{p}$ to the corresponding component functions of the momentum map is a Lie homomorphism from $\mathfrak{p}$ to the Lie algebra $C^{\infty}(\Gamma,\mathbb{R})$ (the latter with Poisson bracket as Lie multiplication):


\{P_{\mu},P_{\nu}\}=0

(2.20a)

\{J_{\mu\nu},P_{\rho}\}=\eta_{\mu\rho}P_{\nu}-\eta_{\nu\rho}P_{\mu}

(2.20b)

\{J_{\mu\nu},J_{\rho\sigma}\}=\eta_{\mu\rho}J_{\nu\sigma}+\text{(antisymm.)}

(2.20c)

The $\mathrm{Ad}^{*}$ -equivariance of the momentum map can now be written down in component form if we first set $\xi=P_{\mu}$ and then $\xi=J_{\mu\nu}$ . Indeed, considering (2.18) and recalling our abuse of notation in denoting the real-valued phase space functions $f_{P_{\mu}}$ and $f_{J_{\mu\nu}}$ again with the letters $P_{\mu}$ and $J_{\mu\nu}$ , we can immediately read from equation (B.8) of appendix B, in which we need to replace $e_{a}$ with $P_{\mu}$ and $B_{ab}$ with $-J_{\mu\nu}$ according to (A.15) of appendix A, that


P_{\mu}\circ\Phi_{(\Lambda,a)}=(\Lambda^{-1})^{\nu}_{\phantom{\nu}\mu}\,P_{\nu}\;,

(2.21a)

J_{\mu\nu}\circ\Phi_{(\Lambda,a)}=(\Lambda^{-1})^{\rho}_{\phantom{\rho}\mu}(\Lambda^{-1})^{\sigma}_{\phantom{\sigma}\nu}\,J_{\rho\sigma}+a_{\mu}(\Lambda^{-1})^{\rho}_{\phantom{\rho}\nu}\,P_{\rho}-a_{\nu}(\Lambda^{-1})^{\rho}_{\phantom{\rho}\mu}\,P_{\rho}\;.

(2.21b)

Note that the left-hand sides of (2.21) are precisely what we need; that is, we need the composition with $\Phi_{(\Lambda,a)}$ rather than $\Phi_{(\Lambda,a)^{-1}}$ to evaluate the momenta $P_{\mu}$ and $J_{\mu\nu}$ on the actively Poincaré-displaced phase space points. Note also that if we had put the indices upstairs and had used, e.g., $P^{\mu}=\eta^{\mu\nu}P_{\nu}$ rather than $P_{\mu}$, then the right-hand side of (2.21a) would read $\Lambda^{\mu}_{\phantom{\mu}\nu}\,P^{\nu}$, and correspondingly in (2.21b). Finally recall that the last term on the right-hand side of (2.21b) just reflects the familiar transformation of angular momentum (the momentum associated to spatial rotations) under spatial translations, which is typical for the co-adjoint representation, which here gets extended to the momentum associated to boost transformations. (One easily checks that the signs are right: translating a system whose momentum points in $y$-direction by a positive amount into the $x$-direction should enhance the angular momentum in $z$-direction. This is just what (2.21b) implies.)
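The component formulas (2.21) can be tested on a simple example. The sketch below assumes a toy model that is not taken from the text: a free spinless particle described by a point $x$ on its worldline and its four-momentum $p$, with $P_{\mu}=p_{\mu}$ and $J_{\mu\nu}=x_{\mu}p_{\nu}-x_{\nu}p_{\mu}$, on which a Poincaré transformation acts by $(x,p)\mapsto(\Lambda x+a,\Lambda p)$. The function `momentum_map` and the two helper transformations are hypothetical names.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def boost_x(rapidity):
    """Boost along e_1 (hypothetical helper for this sketch)."""
    ch, sh = np.cosh(rapidity), np.sinh(rapidity)
    Lam = np.eye(4)
    Lam[0, 0] = Lam[1, 1] = ch
    Lam[0, 1] = Lam[1, 0] = sh
    return Lam

def rotation_z(angle):
    """Rotation in the e_1-e_2 plane (hypothetical helper for this sketch)."""
    c, s = np.cos(angle), np.sin(angle)
    Lam = np.eye(4)
    Lam[1, 1], Lam[1, 2] = c, -s
    Lam[2, 1], Lam[2, 2] = s, c
    return Lam

def momentum_map(x, p):
    """Assumed toy model: P_mu = p_mu, J_{mu nu} = x_mu p_nu - x_nu p_mu."""
    x_low, p_low = eta @ x, eta @ p
    return p_low, np.outer(x_low, p_low) - np.outer(p_low, x_low)

Lam = rotation_z(0.4) @ boost_x(0.9)
a = np.array([0.5, -1.0, 2.0, 0.3])
x = np.array([0.0, 1.0, -2.0, 0.5])
p = np.array([2.0, 0.3, 1.0, -0.4])

P, J = momentum_map(x, p)
P_moved, J_moved = momentum_map(Lam @ x + a, Lam @ p)   # momenta at the displaced point

Lam_inv = np.linalg.inv(Lam)
a_low = eta @ a
P_pred = Lam_inv.T @ P                                   # right-hand side of (2.21a)
J_pred = (Lam_inv.T @ J @ Lam_inv
          + np.outer(a_low, P_pred) - np.outer(P_pred, a_low))  # right-hand side of (2.21b)
```

Setting $\Lambda=\mathrm{id}$, $a=e_{1}$ and a momentum along $e_{2}$ in this model reproduces the sign check made in the text: $J_{12}$ increases by $a_{1}P_{2}$.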

2.5 The Pauli–Lubański vector

Given a classical Poincaré-invariant system, the Pauli–Lubański vector $W$ is the $V$ -valued phase space function defined in components by

W_{\mu}=-\frac{1}{2}\varepsilon_{\mu\nu\rho\sigma}P^{\nu}J^{\rho\sigma}

(2.22)

where $\varepsilon$ denotes the volume form of Minkowski space (whose components in a positively oriented orthonormal basis are just given by the usual totally antisymmetric symbol, with $\varepsilon_{0123}=+1$ ). The sign convention in this definition can be understood as follows. We imagine a situation in which $P$ is timelike and future-directed (positive energy, see above), and consider the spatial components of $W$ with respect to an orthonormal basis $\{e_{0},\dots,e_{3}\}$ of $V$ with $(e_{0})^{\mu}=P^{\mu}/\sqrt{-P_{\nu}P^{\nu}}$ (‘momentum rest frame’). For those, we obtain

\frac{W_{a}}{\sqrt{-P_{\mu}P^{\mu}}}=-\frac{1}{2}\varepsilon_{a0\rho\sigma}J^{\rho\sigma}=\frac{1}{2}{{}^{(3)}\varepsilon}_{abc}J^{bc}

(2.23)

where ${{}^{(3)}\varepsilon}_{abc}$ denotes the three-dimensional antisymmetric symbol, i.e. the components of the spatial volume form. Since $J^{bc}=J_{bc}$ generates rotations from $e_{b}$ towards $e_{c}$ , we see that $W_{a}/\sqrt{-P_{\mu}P^{\mu}}$ generates rotations ‘along the $e_{a}$ axis’ in the usual, three-dimensional sense. Thus, $W/\sqrt{-P_{\mu}P^{\mu}}$ can be interpreted as the ‘spatial spin vector’ in the momentum rest frame, which is the usual interpretation of the Pauli–Lubański vector.

Rewriting the definition of $W$ as

W_{\mu}=-\frac{1}{2}\varepsilon_{\mu\nu\rho\sigma}P^{\nu}J^{\rho\sigma}=\frac{1}{2}\varepsilon_{\nu\rho\sigma\mu}P^{\nu}J^{\rho\sigma}=\frac{1}{3!}\varepsilon_{\nu\rho\sigma\mu}(P^{\flat}\wedge J)^{\nu\rho\sigma},

(2.24)

we see that in the language of exterior algebra

W=(*(P^{\flat}\wedge J))^{\sharp}

(2.25)

where $*$ is the Hodge star operator. Here we use the standard sign conventions for the Hodge operator, i.e. the definition $\alpha\wedge*\beta=\eta(\alpha,\beta)\,\varepsilon$ ; see for example [27] or [24, appendix A].
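As a quick numerical sanity check of the sign conventions in (2.22) and (2.23), the following minimal Python sketch evaluates the Pauli–Lubański vector for arbitrarily chosen sample data. The helper names (`eps`, `antisym`, `pauli_lubanski`) and all numerical values are illustrative assumptions introduced only for this sketch; in the rest-frame example we assume units with $\sqrt{-P^{2}}=2$.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric, signature (-,+,+,+)

def eps(i, j, k, l):
    """Totally antisymmetric symbol with eps(0, 1, 2, 3) = +1."""
    idx = (i, j, k, l)
    if len(set(idx)) < 4:
        return 0
    inversions = sum(1 for a in range(4) for b in range(a + 1, 4) if idx[a] > idx[b])
    return -1 if inversions % 2 else 1

def antisym(j01, j02, j03, j12, j13, j23):
    """Antisymmetric matrix of components J_{mu nu} built from its upper triangle."""
    return [[0.0, j01, j02, j03],
            [-j01, 0.0, j12, j13],
            [-j02, -j12, 0.0, j23],
            [-j03, -j13, -j23, 0.0]]

def pauli_lubanski(P_up, J_lo):
    """W_mu = -1/2 eps_{mu nu rho sigma} P^nu J^{rho sigma}, cf. (2.22)."""
    J_up = [[ETA[r] * ETA[s] * J_lo[r][s] for s in range(4)] for r in range(4)]
    return [-0.5 * sum(eps(m, n, r, s) * P_up[n] * J_up[r][s]
                       for n in range(4) for r in range(4) for s in range(4))
            for m in range(4)]

J = antisym(0.4, -1.1, 0.2, 0.7, -0.3, 0.9)  # arbitrary sample values

# Momentum rest frame: P^mu = (mc, 0, 0, 0) with the assumed value mc = 2.
mc = 2.0
W_rest = pauli_lubanski([mc, 0.0, 0.0, 0.0], J)
# Right-hand side of (2.23): (1/2) eps3_{abc} J^{bc} = (J_23, J_31, J_12)
spin_rest = [J[2][3], J[3][1], J[1][2]]

# Moving system: W is Minkowski-orthogonal to P by antisymmetry of eps.
p = (0.3, -0.2, 0.5)
P_moving = [math.sqrt(mc**2 + sum(c * c for c in p)), *p]
W_moving = pauli_lubanski(P_moving, J)
W_dot_P = sum(W_moving[m] * P_moving[m] for m in range(4))  # W_mu P^mu
```

In this sketch the spatial components of `W_rest` divided by `mc` should agree with `spin_rest`, and `W_dot_P` should vanish up to rounding.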

3 The Newton–Wigner position as a ‘centre of spin’

In this section we will explain our understanding and present our geometric clarification of Fleming’s statement in [15] that the Newton–Wigner position may be understood as a ‘centre of spin’. To this end, we introduce Fleming’s geometric framework for special-relativistic position observables, and then discuss the definition of position observables by spin supplementary conditions (SSCs). Finally, we introduce the notion of a position observable being a ‘centre of spin’, and prove that the Newton–Wigner position is the only continuous position observable defined by an SSC that represents a centre of spin in that sense.

3.1 Position observables on spacelike hyperplanes

We start by describing the general framework developed by Fleming in [15] and also [28] for the description of special-relativistic position observables, translated to our case of classical systems from Fleming’s quantum language. Consider a classical Poincaré-invariant system $(\Gamma,\omega,\Phi)$ . By a position observable $\chi$ for this system we understand a ‘procedure’ which, given any spacelike hyperplane $\Sigma\in\mathsf{SpHP}$ in (affine) Minkowski spacetime, allows us to ‘localise’ the system on $\Sigma$ . More precisely, this means that for any $\Sigma\in\mathsf{SpHP}$ , we have an $M$ -valued phase space function

\chi(\Sigma)\colon\Gamma\to M

(3.1)

with image contained in $\Sigma$ , whose value $\chi(\Sigma)(\gamma)$ for $\gamma\in\Gamma$ is to be interpreted as the ‘ $\chi$ -position’ of our system in state $\gamma$ on the hyperplane $\Sigma$ .

Any spacelike hyperplane $\Sigma\in\mathsf{SpHP}$ is uniquely characterised by its (timelike) future-directed unit normal $u\in V$ and its distance $\tau\in\mathbb{R}$ to the origin $o\in M$ , measured along the straight line through $o$ in direction $u$ . In terms of these, it has the form

\Sigma=\{x\in M:u_{\mu}x^{\mu}=-\tau\},

(3.2)

where we identified $M$ with $V$ . From now on, whenever convenient, we will identify $\Sigma$ with the tuple $(u,\tau)$ . The condition that the image of $\chi(\Sigma)$ be contained in $\Sigma$ then takes the form

u_{\mu}\chi^{\mu}(u,\tau)(\gamma)=-\tau.

(3.3)

We can now also spell out explicitly the left action of $\mathcal{P}$ on $\mathsf{SpHP}$ that is induced from the left action of $\mathcal{P}$ on $M$ (as already mentioned below equation (2.3)):

(\Lambda,a)\cdot(u,\tau)=(\Lambda u,\tau-\Lambda u\cdot a)

(3.4)

One easily checks that this indeed defines a left action, i.e. $(\Lambda_{1},a_{1})\cdot[(\Lambda_{2},a_{2})\cdot(u,\tau)]=(\Lambda_{1}\Lambda_{2},a_{1}+\Lambda_{1}a_{2})\cdot(u,\tau)$ .
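The left-action property can also be checked numerically. The following Python sketch, with hypothetical helper names and arbitrarily chosen boosts and translations, implements (3.4) and compares both sides of the composition law; it also verifies that a point of $\Sigma$ is mapped into the transformed hyperplane.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric, signature (-,+,+,+)

def mdot(a, b):
    """Minkowski product of two vectors given by contravariant components."""
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def matvec(L, v):
    return [sum(L[m][n] * v[n] for n in range(4)) for m in range(4)]

def matmul(A, B):
    return [[sum(A[m][k] * B[k][n] for k in range(4)) for n in range(4)]
            for m in range(4)]

def boost(axis, rapidity):
    """Lorentz boost along the spatial axis 1, 2 or 3 (matrix Lambda^mu_nu)."""
    L = [[1.0 if m == n else 0.0 for n in range(4)] for m in range(4)]
    L[0][0] = L[axis][axis] = math.cosh(rapidity)
    L[0][axis] = L[axis][0] = math.sinh(rapidity)
    return L

def act_on_point(g, x):
    """(Lambda, a) . x = Lambda x + a."""
    L, a = g
    Lx = matvec(L, x)
    return [Lx[m] + a[m] for m in range(4)]

def act_on_hyperplane(g, sigma):
    """(Lambda, a) . (u, tau) = (Lambda u, tau - (Lambda u) . a), cf. (3.4)."""
    (L, a), (u, tau) = g, sigma
    Lu = matvec(L, u)
    return (Lu, tau - mdot(Lu, a))

def compose(g1, g2):
    """Group product (Lambda1 Lambda2, a1 + Lambda1 a2)."""
    (L1, a1), (L2, a2) = g1, g2
    L1a2 = matvec(L1, a2)
    return (matmul(L1, L2), [a1[m] + L1a2[m] for m in range(4)])

# Arbitrary sample transformations and hyperplane (u = e_0, tau = 1.5).
g1 = (boost(1, 0.3), [0.5, -1.0, 2.0, 0.1])
g2 = (boost(2, -0.7), [1.0, 0.2, 0.0, -0.4])
sigma = ([1.0, 0.0, 0.0, 0.0], 1.5)

lhs = act_on_hyperplane(g1, act_on_hyperplane(g2, sigma))
rhs = act_on_hyperplane(compose(g1, g2), sigma)

# A point of sigma (u . x = -tau) must land on the transformed hyperplane.
x = [1.5, 1.0, -2.0, 0.3]
u_new, tau_new = act_on_hyperplane(g1, sigma)
residual = mdot(u_new, act_on_point(g1, x)) + tau_new
```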

Fixing $u$ and varying $\tau$ in (3.2), we obtain the spacelike hyperplanes corresponding to different ‘instants of time’ $\tau$ in the Lorentz frame corresponding to $u$ . Thus, for a fixed state $\gamma\in\Gamma$ and fixed frame $u$ , the set

\{\chi(u,\tau)(\gamma):\tau\in\mathbb{R}\}\subset M

(3.5)

gives the ‘worldline’ of the $\chi$ -position of the system. Following Fleming [15], who says that this is a requirement ‘easily agreed upon’, we require that this worldline should be parallel to the four-momentum,⁷ i.e. $\frac{\partial\chi(u,\tau)}{\partial\tau}\propto P$ . Together with (3.3), this implies condition (3.8) in the definition below, which is meant to sum up all the preceding considerations.

⁷ This assumption is natural for closed systems as we consider here. For non-closed systems, i.e. systems without local energy–momentum conservation, the four-velocity is in general not parallel to the four-momentum; see, e.g., the discussion at the beginning of section 2.6 in [29].

Definition 3.1.

A position observable for a classical Poincaré-invariant system $(\Gamma,\omega,\Phi)$ with causal four-momentum is a map

\chi\colon\mathsf{SpHP}\times\Gamma\to M,\;(\Sigma,\gamma)\mapsto\chi(\Sigma)(\gamma)

(3.6)

satisfying

\chi(\Sigma)(\gamma)\in\Sigma

(3.7)

for all $\Sigma\in\mathsf{SpHP}$ and all $\gamma\in\Gamma$ (or, equivalently, (3.3)), as well as

\frac{\partial\chi_{\mu}(u,\tau)}{\partial\tau}=\frac{1}{(-u\cdot P)}P_{\mu}\;.

(3.8)

For fixed $\Sigma\in\mathsf{SpHP}$ , we will often view $\chi(\Sigma)\colon\Gamma\to M$ as a phase space function in its own right.

Note that (3.8) and (3.3) imply that the four-momentum must be causal for such a position observable to exist.
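For the reader's convenience, the step from the proportionality requirement and (3.3) to the normalisation in (3.8) can be spelled out as follows; the proportionality factor $\alpha$ is an auxiliary symbol introduced only for this sketch.

```latex
\frac{\partial\chi(u,\tau)}{\partial\tau}=\alpha\,P
\quad\Longrightarrow\quad
-1=\frac{\partial}{\partial\tau}\bigl(u_{\mu}\chi^{\mu}(u,\tau)\bigr)=\alpha\,(u\cdot P)
\quad\Longrightarrow\quad
\alpha=\frac{1}{(-u\cdot P)}\;.
```

The last step needs $u\cdot P\neq 0$ for every future-directed unit timelike $u$ . This holds for non-vanishing causal $P$ , but fails for spacelike $P$ , since a spacelike vector is orthogonal to some timelike $u$ .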

In addition to requiring that the positions $\chi(\Sigma)$ be located on $\Sigma$ and that the ‘worldlines’ point in the direction of the four-momentum, Fleming also introduces the following covariance requirement (which we, unlike Fleming, do not include in the definition of a position observable):

Definition 3.2.

A position observable for a classical Poincaré-invariant system $(\Gamma,\omega,\Phi)$ is said to be covariant if and only if

\chi\Big{(}(\Lambda,a)\cdot\Sigma\Big{)}\Big{(}\Phi_{(\Lambda,a)}(\gamma)\Big{)}=(\Lambda,a)\cdot\Big{(}\chi(\Sigma)(\gamma)\Big{)}

(3.9)

for all $\Sigma\in\mathsf{SpHP}$ , $\gamma\in\Gamma$ and $(\Lambda,a)\in\mathcal{P}$ . This can be read concisely as saying that the map (3.6) is invariant under the natural left action induced from those on the domain and target spaces (invariance of $\chi$ ’s graph):

\chi=(\Lambda,a)\circ\chi\circ\left((\Lambda,a)^{-1}\times\Phi_{(\Lambda,a)^{-1}}\right).

(3.10)

This is indeed a sensible notion of covariance: it demands that, for any Poincaré transformation $(\Lambda,a)$ , the $\chi$ -position of the transformed system $\Phi_{(\Lambda,a)}(\gamma)$ on the transformed hyperplane $(\Lambda,a)\cdot\Sigma$ be the transform of the ‘original position’ $\chi(\Sigma)(\gamma)$ . In terms of components, (3.9) assumes the form

\chi^{\mu}(\Lambda u,\tau-\Lambda u\cdot a)\circ\Phi_{(\Lambda,a)}=\Lambda^{\mu}_{\hphantom{\mu}\nu}\chi^{\nu}(u,\tau)+a^{\mu}\,,

(3.11)

taking into account (3.4).

3.2 Spin supplementary conditions

The most important and widely used procedure to define special-relativistic position observables is by so-called spin supplementary conditions. Suppose we are given a causal, future-directed vector $P\in V$ and an antisymmetric 2-tensor $J\in\bigwedge^{2}V^{*}$ , describing the four-momentum and the angular momentum (with respect to the origin $o\in M$ ) of some physical system. For any future-directed timelike vector $f\in V$ , we then consider the equation

0=S_{\mu\nu}f^{\nu}

(3.12)

with $S_{\mu\nu}:=J_{\mu\nu}-x_{\mu}P_{\nu}+x_{\nu}P_{\mu}$ , which we view as an equation for $x\in M$ . Since $S$ is the angular momentum tensor with respect to the reference point $x$ (instead of the origin $o$ as for $J$ ), or the spin tensor with respect to $x$ , (3.12) is called the spin supplementary condition (SSC) with respect to $f$ . As is well-known (and easily verified), the set of its solutions $x$ is a line in $M$ with tangent $P$ , namely

\{x\in M:0=S_{\mu\nu}f^{\nu}\}=\left\{x\in M:x_{\mu}=\frac{J_{\mu\rho}f^{\rho}}{f\cdot P}+\lambda P_{\mu}\;\text{with}\;\lambda\in\mathbb{R}\right\}.

(3.13)

This line can be given the interpretation of the ‘centre of energy’ worldline of our system with respect to the Lorentz frame defined by $f$ . See [19] and references therein for further discussion on the interpretation and impact of various SSCs as regards equations of motion in General Relativity.

The idea is now to explicitly combine the SSC-based approach with Fleming’s geometric ideas, thereby introducing the two independent parameters $f$ from (3.13) and $u$ from (3.2). We define a position observable in the sense of definition 3.1 in the following way: given a classical Poincaré-invariant system $(\Gamma,\omega,\Phi)$ with causal four-momentum and a state $\gamma\in\Gamma$ , we consider the SSC worldline defined by (3.12) where we now take $P_{\mu}(\gamma)$ for the four-momentum and $J_{\mu\nu}(\gamma)$ for the angular momentum tensor. We then simply define $\chi(\Sigma)(\gamma)$ to be the intersection of this worldline with the hyperplane $\Sigma=(u,\tau)$ . This means that we take the $x(\lambda)$ from (3.13) and determine the parameter $\lambda$ from (3.3), i.e. from $x(\lambda)\cdot u+\tau=0$ . Inserting the $\lambda=\lambda(u,\tau)$ so determined leads to

Definition 3.3.

The SSC position observable with respect to $f$ is given by

\chi_{\mu}(u,\tau)=\frac{J_{\mu\rho}f^{\rho}}{f\cdot P}+\frac{\tau P_{\mu}}{(-u\cdot P)}-\frac{J_{\lambda\rho}u^{\lambda}f^{\rho}}{(-f\cdot P)}\,\frac{P_{\mu}}{(-u\cdot P)}\;.

(3.14)

Let us again stress the interpretation of this expression: it is the SSC position with respect to $f$ (i.e. a point on the ‘centre of energy’ worldline with respect to $f$ ) as localised on the hyperplane characterised by unit normal $u$ and distance $\tau$ to the origin, i.e. as seen in the Lorentz frame with respect to $u$ at ‘time’ $\tau$ .
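The two defining properties of (3.14), namely that $\chi(u,\tau)$ lies on the hyperplane $(u,\tau)$ and on the SSC worldline with respect to $f$ , can be confirmed numerically. The following Python sketch is only an illustration: the function name `ssc_position` and all sample values (in assumed units with $mc=1$) are our own choices.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric, signature (-,+,+,+)

def mdot(a, b):
    """Minkowski product of two vectors given by contravariant components."""
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def lower(a):
    """Lower (or raise) an index with the diagonal metric."""
    return [ETA[m] * a[m] for m in range(4)]

def ssc_position(J, P, f, u, tau):
    """Covariant components chi_mu(u, tau) of (3.14).

    J holds the components J_{mu nu}; P, f and u are contravariant.
    """
    P_lo = lower(P)
    fP, uP = mdot(f, P), mdot(u, P)
    Jf = [sum(J[m][r] * f[r] for r in range(4)) for m in range(4)]  # J_{mu rho} f^rho
    Juf = sum(u[l] * Jf[l] for l in range(4))                       # J_{lam rho} u^lam f^rho
    return [Jf[m] / fP + tau * P_lo[m] / (-uP) - Juf / (-fP) * P_lo[m] / (-uP)
            for m in range(4)]

# Arbitrary sample data.
p = (0.3, -0.2, 0.5)
P = [math.sqrt(1.0 + sum(c * c for c in p)), *p]      # future-directed, P^2 = -1
J = [[0.0, 0.4, -1.1, 0.2],
     [-0.4, 0.0, 0.7, -0.3],
     [1.1, -0.7, 0.0, 0.9],
     [-0.2, 0.3, -0.9, 0.0]]
u = [math.cosh(0.4), math.sinh(0.4), 0.0, 0.0]        # unit timelike normal
f = [2.0, 0.5, 0.3, -0.1]                             # future-directed timelike
tau = 0.8

chi = ssc_position(J, P, f, u, tau)

# Hyperplane condition (3.3): u^mu chi_mu + tau = 0.
on_hyperplane = sum(u[m] * chi[m] for m in range(4)) + tau

# SSC (3.12): S_{mu nu} f^nu = 0 with S = J - chi ^ P.
P_lo = lower(P)
ssc_residual = [sum((J[m][n] - chi[m] * P_lo[n] + chi[n] * P_lo[m]) * f[n]
                    for n in range(4)) for m in range(4)]
```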

Note that for this definition to make sense, $f$ does not have to be a fixed timelike future-directed vector: it can depend on the normal $u$ (and could even depend on $\tau$ ), and it can also depend on phase space.⁸ Of course, if $f$ depends on $u$ in this way, we will in general be considering different worldlines for different choices of $u$ .

⁸ Various choices for $f$ were given distinguished names in the literature. The main ones, different from the Newton–Wigner condition to be discussed here, are as follows. If $f$ is meant to just characterise a fixed ‘laboratory frame’, which may be preferred for any reason, like rotational symmetries in that frame, the SSC is named after Corinaldesi & Papapetrou [30]. If $f$ is proportional to the total linear momentum of the system, the SSC is named after Tulczyjew [31] and Dixon [32]. If $f$ is chosen in a somewhat self-referential way to be the four-velocity of the worldline that is to be determined by the very SSC containing that $f$ , the condition is named after Frenkel [33], Mathisson [34, 35], and Pirani [36, 37].

Example 3.4.

(i)

Choosing $f=u$ , we are considering, for each $u$ , the SSC worldline with respect to $u$ , i.e. the centre of energy worldline⁹ with respect to $u$ . Using (3.14), the centre of energy position observable has the form

$\chi^{\mathrm{CE}}_{\mu}(u,\tau)=\frac{J_{\mu\rho}u^{\rho}}{u\cdot P}+\frac{\tau P_{\mu}}{(-u\cdot P)}\;.$ (3.15)

⁹ Note that it was called ‘centre of mass’ by Fleming [15].

(ii)

In the case of timelike four-momentum, we can choose $f=P$ the four-momentum (the Tulczyjew–Dixon SSC), such that the corresponding SSC worldline is the centre of energy worldline in the momentum rest frame of the system. This worldline, which is obviously independent of $u$ , was called the centre of inertia worldline by Fleming [15]. The centre of inertia has the form

\chi^{\mathrm{CI}}_{\mu}(u,\tau)=-\frac{J_{\mu\rho}P^{\rho}}{m^{2}c^{2}}+\frac{\tau P_{\mu}}{(-u\cdot P)}-\frac{J_{\lambda\rho}u^{\lambda}P^{\rho}}{m^{2}c^{2}}\;\frac{P_{\mu}}{(-u\cdot P)}\;,

(3.16)

where $m=\sqrt{-P^{2}}/c$ is the mass of the system.

(iii)

Choosing $f=u+\frac{P}{mc}$ where $m=\sqrt{-P^{2}}/c$ is the mass of the system (again only possible in the case of timelike four-momentum), we obtain the Newton–Wigner position observable. Evaluating (3.14), it has the form

\chi_{\mu}^{\mathrm{NW}}(u,\tau)=-\frac{J_{\mu\rho}\left(u^{\rho}+\frac{P^{\rho}}{mc}\right)}{mc-u\cdot P}+\frac{\tau P_{\mu}}{(-u\cdot P)}-\frac{J_{\lambda\rho}u^{\lambda}P^{\rho}}{mc(mc-u\cdot P)}\,\frac{P_{\mu}}{(-u\cdot P)}\;.

(3.17)

Of course, the SSC position observable (3.14) will generally not be covariant in the sense of definition 3.2 unless $f$ is also assumed to transform appropriately. If $f$ depends on $\Sigma\in\mathsf{SpHP}$ and $\gamma\in\Gamma$ and takes values in $V$ it seems obvious that for the resulting position to be covariant $f$ itself must be a covariant function under the combined actions on its domain and target spaces. Indeed, we have

Proposition 3.5.

If the vector $f$ defining the SSC position observable $\chi$ is a function

f\colon\mathsf{SpHP}\times\Gamma\to V,\quad(\Sigma,\gamma)\mapsto f(\Sigma)(\gamma),

(3.18)

such that

f\Big{(}(\Lambda,a)\cdot\Sigma\Big{)}\Big{(}\Phi_{(\Lambda,a)}(\gamma)\Big{)}=\Lambda\cdot\Big{(}f(\Sigma)(\gamma)\Big{)}

(3.19)

for all $\Sigma\in\mathsf{SpHP}$ , $\gamma\in\Gamma$ , and $(\Lambda,a)\in\mathcal{P}$ , then $\chi$ is a covariant position observable. Again we note that, just like in the transition from (3.9) to (3.10), we may rewrite (3.19) equivalently as expressing the invariance of $f$ (i.e. its graph) under simultaneous actions on its domain and target spaces (using that translations act trivially on the target space $V$ ):

f=\Lambda\circ f\circ\left((\Lambda,a)^{-1}\times\Phi_{(\Lambda,a)^{-1}}\right)

(3.20)

Proof.

First, suppose we are given a future-directed timelike four-momentum $P\in V$ and an angular momentum tensor $J\in\bigwedge^{2}V^{*}$ , as well as a future-directed timelike vector $f$ for the definition of an SSC. In addition, fix a Poincaré transformation $(\Lambda,a)\in\mathcal{P}$ . If we now consider (a) the SSC worldline for $P$ and $J$ with respect to $f$ , and (b) the SSC worldline for the transformed four-momentum $P^{\prime}=\Lambda P$ and angular momentum $J^{\prime}=((\Lambda^{-1})^{\top}\otimes(\Lambda^{-1})^{\top})J+a^{\flat}\wedge(\Lambda^{-1})^{\top}P^{\flat}$ (compare (2.21b)) with respect to the transformed vector $\Lambda f$ , it is easy to check that the second worldline is the Poincaré transform by $(\Lambda,a)$ of the first. That is, by Poincaré transforming the four-momentum and angular momentum of the system as well as the ‘direction vector’ for the SSC, we Poincaré transform the SSC worldline.

Now, the SSC position $\chi(\Sigma)(\gamma)$ is defined to be the intersection of the hyperplane $\Sigma$ with the SSC worldline of $\gamma$ with respect to $f(\Sigma)(\gamma)$ . Thus, the ‘new position’

\chi\Big{(}(\Lambda,a)\cdot\Sigma\Big{)}\Big{(}\Phi_{(\Lambda,a)}(\gamma)\Big{)}

(3.21)

is the intersection of the transformed hyperplane $(\Lambda,a)\cdot\Sigma$ with the SSC worldline of the transformed system $\Phi_{(\Lambda,a)}(\gamma)$ with respect to the transformed vector $\Lambda\cdot\Big{(}f(\Sigma)(\gamma)\Big{)}$ , where we used the covariance requirement (3.19). But according to our earlier considerations, this means that the ‘new position’ is the intersection of the transformed hyperplane with the transform of the original SSC worldline – i.e. the transform of the original position $\chi(\Sigma)(\gamma)$ . This means that the position observable is covariant. ∎
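The covariance statement of proposition 3.5 can be illustrated numerically for the Newton–Wigner choice $f=u+P/(mc)$ . The following Python sketch transforms $P$ and $J$ as in the proof above and compares both sides of (3.11); the helper names, the sample Poincaré transformation and the units ($mc=1$) are assumptions made for this illustration only.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric, signature (-,+,+,+)

def mdot(a, b):
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def lower(a):
    """Lower (or raise) an index with the diagonal metric."""
    return [ETA[m] * a[m] for m in range(4)]

def matvec(L, v):
    return [sum(L[m][n] * v[n] for n in range(4)) for m in range(4)]

def matmul(A, B):
    return [[sum(A[m][k] * B[k][n] for k in range(4)) for n in range(4)]
            for m in range(4)]

def boost(axis, rapidity):
    L = [[1.0 if m == n else 0.0 for n in range(4)] for m in range(4)]
    L[0][0] = L[axis][axis] = math.cosh(rapidity)
    L[0][axis] = L[axis][0] = math.sinh(rapidity)
    return L

def ssc_position(J, P, f, u, tau):
    """Covariant components chi_mu(u, tau) of (3.14)."""
    P_lo = lower(P)
    fP, uP = mdot(f, P), mdot(u, P)
    Jf = [sum(J[m][r] * f[r] for r in range(4)) for m in range(4)]
    Juf = sum(u[l] * Jf[l] for l in range(4))
    return [Jf[m] / fP + tau * P_lo[m] / (-uP) - Juf / (-fP) * P_lo[m] / (-uP)
            for m in range(4)]

def transform_momenta(L, a, P, J):
    """Actively transformed P' = Lambda P and J' as in the proof, cf. (2.21b)."""
    P_new = matvec(L, P)
    P_new_lo, a_lo = lower(P_new), lower(a)
    # (Lambda^{-1})^m_n = eta_mm Lambda^n_m eta_nn for a Lorentz matrix
    Linv = [[ETA[m] * L[n][m] * ETA[n] for n in range(4)] for m in range(4)]
    J_new = [[sum(Linv[r][m] * Linv[s][n] * J[r][s]
                  for r in range(4) for s in range(4))
              + a_lo[m] * P_new_lo[n] - a_lo[n] * P_new_lo[m]
              for n in range(4)] for m in range(4)]
    return P_new, J_new

# Arbitrary sample state, hyperplane and Poincare transformation (mc = 1).
mc = 1.0
p = (0.3, -0.2, 0.5)
P = [math.sqrt(mc**2 + sum(c * c for c in p)), *p]
J = [[0.0, 0.4, -1.1, 0.2], [-0.4, 0.0, 0.7, -0.3],
     [1.1, -0.7, 0.0, 0.9], [-0.2, 0.3, -0.9, 0.0]]
u, tau = [math.cosh(0.4), math.sinh(0.4), 0.0, 0.0], 0.8
L, a = matmul(boost(1, 0.3), boost(2, -0.7)), [0.5, -1.0, 2.0, 0.1]

# Right-hand side of (3.11): Lambda chi + a, in contravariant components.
f = [u[m] + P[m] / mc for m in range(4)]
chi_up = lower(ssc_position(J, P, f, u, tau))
expected = [c + a[m] for m, c in enumerate(matvec(L, chi_up))]

# Left-hand side of (3.11): position of the transformed state on the transformed hyperplane.
P2, J2 = transform_momenta(L, a, P, J)
u2 = matvec(L, u)
tau2 = tau - mdot(u2, a)
f2 = [u2[m] + P2[m] / mc for m in range(4)]
actual = lower(ssc_position(J2, P2, f2, u2, tau2))

deviation = max(abs(actual[m] - expected[m]) for m in range(4))
```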

Since the vectors defining the centre of energy, the centre of inertia and the Newton–Wigner position satisfy (3.19), all of these are covariant position observables. We stress once more that for this to be true we need to take into account the action of the Poincaré group on $\mathsf{SpHP}$ . This remark is particularly relevant in the Newton–Wigner case, in which $f$ is the sum of two vectors, $u$ and $P/(mc)$ , the first being associated to an element of $\mathsf{SpHP}$ and the second to an element of $\Gamma$ . Covariance cannot be expected to hold for non-trivial actions on $\Gamma$ alone. In the next section we will offer an insight as to why this somewhat ‘hybrid’ combination for $f$ in terms of an ‘external’ vector $u$ and an ‘internal’ vector $P/(mc)$ appears. The latter is internal, or dynamical, in the sense that it is defined entirely by the physical state of the system, i.e. a point in $\Gamma$ , while the former is external, or kinematical, in the sense that it refers to the choice of $\Sigma\in\mathsf{SpHP}$ , which is entirely independent of the physical system and its state.

Finally, we will need the following well-known result for SSCs with respect to different vectors $f$ , which was first shown by Møller in 1949 in [38]; see also [24, theorem 17] for a recent and more geometric discussion:

Theorem 3.6 (Møller disc and radius).

Suppose we are given the future-directed timelike four-momentum vector $P\in V$ and the angular momentum tensor $J\in\bigwedge^{2}V^{*}$ of some physical system. Consider the bundle of all possible SSC worldlines (3.13) for this system, defined by considering all future-directed timelike vectors $f$ . The intersection of this bundle with any hyperplane $\Sigma\in\mathsf{SpHP}$ orthogonal to $P$ is a two-dimensional disc (the so-called Møller disc) in the plane orthogonal to the Pauli–Lubański vector $W=(*(P^{\flat}\wedge J))^{\sharp}$ , whose centre is the centre of inertia on $\Sigma$ and whose radius is the Møller radius

R_{M}=\frac{S}{mc}\;,

(3.22)

where $S=\sqrt{W^{2}}/(mc)$ is the spin of the system and $m=\sqrt{-P^{2}}/c$ its mass.
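The content of theorem 3.6 can be illustrated by sampling a few SSC vectors $f$ . The Python sketch below (hypothetical helper names, arbitrary sample data, assumed units with $mc=1$) computes, on a hyperplane orthogonal to $P$ , the displacement of each SSC position from the centre of inertia and compares its length with the Møller radius (3.22). It is a spot check for the chosen values, not a proof.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric, signature (-,+,+,+)

def mdot(a, b):
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def lower(a):
    """Lower (or raise) an index with the diagonal metric."""
    return [ETA[m] * a[m] for m in range(4)]

def eps(i, j, k, l):
    """Totally antisymmetric symbol with eps(0, 1, 2, 3) = +1."""
    idx = (i, j, k, l)
    if len(set(idx)) < 4:
        return 0
    inversions = sum(1 for a in range(4) for b in range(a + 1, 4) if idx[a] > idx[b])
    return -1 if inversions % 2 else 1

def pauli_lubanski(P_up, J_lo):
    """W_mu = -1/2 eps_{mu nu rho sigma} P^nu J^{rho sigma}, cf. (2.22)."""
    J_up = [[ETA[r] * ETA[s] * J_lo[r][s] for s in range(4)] for r in range(4)]
    return [-0.5 * sum(eps(m, n, r, s) * P_up[n] * J_up[r][s]
                       for n in range(4) for r in range(4) for s in range(4))
            for m in range(4)]

def ssc_position(J, P, f, u, tau):
    """Covariant components chi_mu(u, tau) of (3.14)."""
    P_lo = lower(P)
    fP, uP = mdot(f, P), mdot(u, P)
    Jf = [sum(J[m][r] * f[r] for r in range(4)) for m in range(4)]
    Juf = sum(u[l] * Jf[l] for l in range(4))
    return [Jf[m] / fP + tau * P_lo[m] / (-uP) - Juf / (-fP) * P_lo[m] / (-uP)
            for m in range(4)]

# Arbitrary sample data (mc = 1).
mc = 1.0
p = (0.3, -0.2, 0.5)
P = [math.sqrt(mc**2 + sum(c * c for c in p)), *p]
J = [[0.0, 0.4, -1.1, 0.2], [-0.4, 0.0, 0.7, -0.3],
     [1.1, -0.7, 0.0, 0.9], [-0.2, 0.3, -0.9, 0.0]]
u, tau = [c / mc for c in P], 0.0          # hyperplane orthogonal to P

W_up = lower(pauli_lubanski(P, J))
S = math.sqrt(mdot(W_up, W_up)) / mc       # spin, as defined below (3.22)
R_M = S / mc                               # Moller radius (3.22)

chi_CI = ssc_position(J, P, P, u, tau)     # centre of inertia, f = P
report = []
for f in ([1.0, 0.0, 0.0, 0.0], [2.0, 0.5, 0.3, -0.1],
          [1.5, -1.0, 0.9, 0.2], [3.0, 0.0, 0.0, 2.9]):
    chi = ssc_position(J, P, f, u, tau)
    D = lower([chi[m] - chi_CI[m] for m in range(4)])   # displacement, contravariant
    length = math.sqrt(max(mdot(D, D), 0.0))
    report.append((length, mdot(D, W_up), mdot(D, P)))
```

Each entry of `report` holds the distance from the centre of inertia and the products of the displacement with $W$ and $P$ ; for the sampled $f$ the former should stay below `R_M` and the latter two should vanish up to rounding.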

3.3 The centre of spin condition

For a system with timelike four-momentum, the Pauli–Lubański vector $W$ has the interpretation of being ( $mc$ times) the spin vector in the momentum rest frame. We now define the spin vector in an arbitrary Lorentz frame by boosting $W/(mc)$ to the new frame:

Definition 3.7.

Given the timelike four-momentum $P\in V$ and the Pauli–Lubański vector $W\in P^{\perp}$ of a physical system, its spin vector in the Lorentz frame given by the future-directed unit timelike vector $u$ is

s(u):=B(u)\cdot\frac{W}{mc}\;,

(3.23)

where $B(u)\in\mathcal{L}_{+}^{\uparrow}$ is the unique Lorentz boost with respect to $\frac{P}{mc}$ (i.e. containing $\frac{P}{mc}$ in its timelike 2-plane of action) that maps $\frac{P}{mc}$ to $u$ , with $m=\sqrt{-P^{2}}/c$ being the mass. In terms of components, this boost is given by¹⁰

B^{\mu}_{\hphantom{\mu}\nu}(u)=\delta^{\mu}_{\nu}+\frac{\left(\frac{P^{\mu}}{mc}+u^{\mu}\right)\left(\frac{P_{\nu}}{mc}+u_{\nu}\right)}{1-u\cdot\frac{P}{mc}}-2\frac{u^{\mu}P_{\nu}}{mc}\;.

(3.24)

¹⁰ Generally, given two unit timelike future-pointing vectors $n_{1}$ and $n_{2}$ , the boost that maps $n_{1}$ onto $n_{2}$ and fixes the spacelike plane orthogonal to $\mathrm{span}\{n_{1},n_{2}\}$ is given by the combination $\rho_{n_{1}+n_{2}}\circ\rho_{n_{1}}$ of two hyperplane-reflections, where $\rho_{n}:=\mathrm{id}_{V}-2\frac{n\otimes n^{\flat}}{n^{2}}$ is the reflection at the hyperplane orthogonal to $n$ . Setting $n_{1}=P/(mc)$ and $n_{2}=u$ gives (3.24).
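The properties claimed for $B(u)$ are easy to test numerically. The following Python sketch (the function name `centre_boost` and all sample values are assumptions for illustration) builds the matrix (3.24) and checks that it maps $P/(mc)$ to $u$ , preserves the Minkowski metric, and fixes a vector orthogonal to both $P$ and $u$ .

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric, signature (-,+,+,+)

def mdot(a, b):
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def lower(a):
    return [ETA[m] * a[m] for m in range(4)]

def matvec(L, v):
    return [sum(L[m][n] * v[n] for n in range(4)) for m in range(4)]

def centre_boost(n, u):
    """Matrix B^mu_nu of (3.24), with n = P/(mc) and u both unit timelike."""
    n_lo, u_lo = lower(n), lower(u)
    c = mdot(u, n)
    return [[(1.0 if m == k else 0.0)
             + (n[m] + u[m]) * (n_lo[k] + u_lo[k]) / (1.0 - c)
             - 2.0 * u[m] * n_lo[k]
             for k in range(4)] for m in range(4)]

# Arbitrary sample data with the assumed value mc = 1.5.
mc = 1.5
p = (0.3, -0.2, 0.5)
P = [math.sqrt(mc**2 + sum(c * c for c in p)), *p]
n = [c / mc for c in P]
u = [math.cosh(0.4), math.sinh(0.4), 0.0, 0.0]

B = centre_boost(n, u)

# (a) B maps P/(mc) to u.
image_error = max(abs(x - y) for x, y in zip(matvec(B, n), u))

# (b) B is a Lorentz transformation: B^T eta B = eta.
metric_error = max(abs(sum(ETA[m] * B[m][a] * B[m][b] for m in range(4))
                       - (ETA[a] if a == b else 0.0))
                   for a in range(4) for b in range(4))

# (c) A vector orthogonal to both n and u is left fixed.
v = [0.0, 0.0, n[3], -n[2]]          # v.u = 0 and v.n = 0 for this choice of u
fixed_error = max(abs(x - y) for x, y in zip(matvec(B, v), v))
```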

Definition 3.8.

A centre of spin position observable for a classical Poincaré-invariant system $(\Gamma,\omega,\Phi)$ with timelike four-momentum is a position observable $\chi$ satisfying

s_{\mu}(u)=-\frac{1}{2}\varepsilon_{\mu\nu\rho\sigma}u^{\nu}S^{\rho\sigma}(u),

(3.25)

where $S_{\mu\nu}(u):=J_{\mu\nu}-\chi_{\mu}(u,\tau)P_{\nu}+\chi_{\nu}(u,\tau)P_{\mu}$ is the spin tensor¹¹ with respect to $\chi$ . Expressed in terms of the Hodge operator, this condition reads

s(u)=(*(u^{\flat}\wedge S(u)))^{\sharp}.

(3.26)

¹¹ Since $\frac{\partial\chi(u,\tau)}{\partial\tau}$ is proportional to $P$ , the spin tensor is independent of $\tau$ .

With respect to an orthonormal basis $\{u=e_{0},\dots,e_{3}\}$ adapted to $u$ , the centre of spin condition takes the form

s_{0}(u)=0,\quad s_{a}(u)=-\frac{1}{2}\varepsilon_{a0\rho\sigma}S^{\rho\sigma}(u)=\frac{1}{2}{{}^{(3)}\varepsilon}_{abc}S^{bc}(u),

(3.27)

through which it acquires an immediate interpretation: a position observable is a centre of spin if and only if, for any Lorentz frame $u$ , the spin vector defined by boosting the Pauli–Lubański vector to $u$ really generates spatial rotations around the point given by the position observable.

We will now rewrite the centre of spin condition. Since $S(u)=J-(\chi(u,\tau))^{\flat}\wedge P^{\flat}$ and $P^{\flat}\wedge P^{\flat}=0$ , we can rewrite the rescaled Pauli–Lubański vector as $\frac{W}{mc}=\left[*\left(\frac{P^{\flat}}{mc}\wedge J\right)\right]^{\sharp}=\left[*\left(\frac{P^{\flat}}{mc}\wedge S(u)\right)\right]^{\sharp}$ . Thus, the centre of spin condition takes the form

(B(u)^{-1})^{\top}\left[*\left(\frac{P^{\flat}}{mc}\wedge S(u)\right)\right]=*(u^{\flat}\wedge S(u)).

(3.28)

Since $B(u)$ is a Lorentz transformation, i.e. an isometry of $(V,\eta)$ , and it maps $P/(mc)$ to $u$ , this is equivalent to

u^{\flat}\wedge\left((B(u)^{-1})^{\top}\otimes(B(u)^{-1})^{\top}\right)(S(u))=u^{\flat}\wedge S(u).

(3.29)

Using the explicit form (3.24) of $B(u)$ , we see that

\left((B(u)^{-1})^{\top}\otimes(B(u)^{-1})^{\top}\right)(S(u))=S(u)+\frac{\frac{P^{\flat}}{mc}\wedge\left(\iota_{u+\frac{P}{mc}}S(u)\right)}{1-u\cdot\frac{P}{mc}}+u^{\flat}\wedge(\ldots).

(3.30)

Thus, we have the following:

Lemma 3.9.

The centre of spin condition is equivalent to

u^{\flat}\wedge P^{\flat}\wedge\left(\iota_{u+\frac{P}{mc}}S(u)\right)=0.

(3.31)

∎

Since the Newton–Wigner position observable is defined by the SSC $\iota_{u+\frac{P}{mc}}S(u)=0$ , the preceding result immediately implies

Theorem 3.10.

The Newton–Wigner position observable $\chi^{\mathrm{NW}}$ is a centre of spin. ∎
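Theorem 3.10 can be cross-checked numerically by evaluating both sides of the centre of spin condition (3.25) for the Newton–Wigner position (3.17). The Python sketch below does so for one arbitrarily chosen state and frame; all helper names and numerical values, including $mc=1.5$ , are assumptions made for this illustration.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric, signature (-,+,+,+)

def mdot(a, b):
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def lower(a):
    """Lower (or raise) an index with the diagonal metric."""
    return [ETA[m] * a[m] for m in range(4)]

def matvec(L, v):
    return [sum(L[m][n] * v[n] for n in range(4)) for m in range(4)]

def eps(i, j, k, l):
    """Totally antisymmetric symbol with eps(0, 1, 2, 3) = +1."""
    idx = (i, j, k, l)
    if len(set(idx)) < 4:
        return 0
    inversions = sum(1 for a in range(4) for b in range(a + 1, 4) if idx[a] > idx[b])
    return -1 if inversions % 2 else 1

def contract_eps(v_up, T_lo):
    """-1/2 eps_{mu nu rho sigma} v^nu T^{rho sigma}, as in (2.22) and (3.25)."""
    T_up = [[ETA[r] * ETA[s] * T_lo[r][s] for s in range(4)] for r in range(4)]
    return [-0.5 * sum(eps(m, n, r, s) * v_up[n] * T_up[r][s]
                       for n in range(4) for r in range(4) for s in range(4))
            for m in range(4)]

def ssc_position(J, P, f, u, tau):
    """Covariant components chi_mu(u, tau) of (3.14)."""
    P_lo = lower(P)
    fP, uP = mdot(f, P), mdot(u, P)
    Jf = [sum(J[m][r] * f[r] for r in range(4)) for m in range(4)]
    Juf = sum(u[l] * Jf[l] for l in range(4))
    return [Jf[m] / fP + tau * P_lo[m] / (-uP) - Juf / (-fP) * P_lo[m] / (-uP)
            for m in range(4)]

def centre_boost(n, u):
    """Matrix B^mu_nu of (3.24), with n = P/(mc)."""
    n_lo, u_lo = lower(n), lower(u)
    c = mdot(u, n)
    return [[(1.0 if m == k else 0.0)
             + (n[m] + u[m]) * (n_lo[k] + u_lo[k]) / (1.0 - c)
             - 2.0 * u[m] * n_lo[k] for k in range(4)] for m in range(4)]

# Arbitrary sample state and frame.
mc = 1.5
p = (0.3, -0.2, 0.5)
P = [math.sqrt(mc**2 + sum(c * c for c in p)), *p]
J = [[0.0, 0.4, -1.1, 0.2], [-0.4, 0.0, 0.7, -0.3],
     [1.1, -0.7, 0.0, 0.9], [-0.2, 0.3, -0.9, 0.0]]
u, tau = [math.cosh(0.4), math.sinh(0.4), 0.0, 0.0], 0.8
n = [c / mc for c in P]

# Newton-Wigner position: SSC with f = u + P/(mc), and its spin tensor.
chi = ssc_position(J, P, [u[m] + n[m] for m in range(4)], u, tau)
P_lo = lower(P)
S = [[J[m][k] - chi[m] * P_lo[k] + chi[k] * P_lo[m] for k in range(4)]
     for m in range(4)]

# Left-hand side of (3.25): s(u) = B(u) W/(mc), with lowered index.
W_up = lower(contract_eps(P, J))
s_lo = lower([c / mc for c in matvec(centre_boost(n, u), W_up)])

# Right-hand side of (3.25).
rhs = contract_eps(u, S)
mismatch = max(abs(s_lo[m] - rhs[m]) for m in range(4))
```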

Further rewriting the centre of spin condition, we see that (3.31) is equivalent to

\iota_{u+\frac{P}{mc}}S(u)\in\mathrm{span}\{u^{\flat},P^{\flat}\}.

(3.32)

Due to the antisymmetry of $S(u)$ , this is equivalent to

\iota_{u+\frac{P}{mc}}S(u)\in\mathrm{span}\left\{u^{\flat}-\frac{P^{\flat}}{mc}\right\}.

(3.33)

Using this, we can show:

Lemma 3.11.

$\chi$ is a centre of spin $\iff$ $\chi(u,\tau)-\chi^{\mathrm{NW}}(u,\tau)\in\mathrm{span}\{u,P\}$ .

Proof.

Writing $D:=\chi(u,\tau)-\chi^{\mathrm{NW}}(u,\tau)$ , the spin tensor of $\chi$ may be expressed as $S(u)=S^{\mathrm{NW}}(u)-D^{\flat}\wedge P^{\flat}$ . Thus, (3.33) is equivalent to

\iota_{u+\frac{P}{mc}}(D^{\flat}\wedge P^{\flat})\in\mathrm{span}\left\{u^{\flat}-\frac{P^{\flat}}{mc}\right\}.

(3.34)

We have $\iota_{u+\frac{P}{mc}}(D^{\flat}\wedge P^{\flat})=(D\cdot u+\frac{D\cdot P}{mc})P^{\flat}-(P\cdot u-mc)D^{\flat}$ , and thus (3.34) implies that for all $v\in u^{\perp}\cap P^{\perp}$ , we have

v\cdot D=0.

(3.35)

But this means $D\in(u^{\perp}\cap P^{\perp})^{\perp}=\mathrm{span}\{u,P\}$ .

Conversely, if $D\in\mathrm{span}\{u,P\}$ , we have $\iota_{u+\frac{P}{mc}}(D^{\flat}\wedge P^{\flat})\in\mathrm{span}\left\{\iota_{u+\frac{P}{mc}}(u^{\flat}\wedge P^{\flat})\right\}$ . But now

\iota_{u+\frac{P}{mc}}(u^{\flat}\wedge P^{\flat})=\left(-1+u\cdot\frac{P}{mc}\right)P^{\flat}-(u\cdot P-mc)u^{\flat}=(mc-u\cdot P)\left(u^{\flat}-\frac{P^{\flat}}{mc}\right),

(3.36)

and thus we have (3.34), i.e. $\chi$ is a centre of spin. ∎

We can now prove the main result of this section.

Theorem 3.12.

The Newton–Wigner position observable $\chi^{\mathrm{NW}}$ is the only centre of spin position observable that is continuous and defined by an SSC.

Proof.

Let $\chi$ be an SSC position observable. Writing $D(u,\tau):=\chi(u,\tau)-\chi^{\mathrm{NW}}(u,\tau)$ , we know by the Møller disc theorem (theorem 3.6) that the projection of $D(u,\tau)$ orthogonal to $P$ is orthogonal to the Pauli–Lubański vector $W$ . Thus, since $P$ itself is orthogonal to $W$ , we have

D(u,\tau)\perp W

(3.37)

for any $(u,\tau)\in\mathsf{SpHP}$ . In addition, we know that $D(u,\tau)\perp u$ ; in particular, $D(u,\tau)$ is spacelike for any $(u,\tau)\in\mathsf{SpHP}$ .

Now suppose that $\chi$ is a centre of spin. By lemma 3.11 this means that

D(u,\tau)\in\mathrm{span}\{u,P\}

(3.38)

for all $(u,\tau)\in\mathsf{SpHP}$ . Using (3.37) and $P\perp W$ , we conclude that

\text{for all}\;u\;\text{with}\;u\cdot W\neq 0:D(u,\tau)\in\mathrm{span}\{P\}.

(3.39)

Since $D(u,\tau)$ has to be spacelike, we thus have shown

D(u,\tau)=0\;\text{for all}\;u\;\text{with}\;u\cdot W\neq 0.

(3.40)

If $W\neq 0$ , the set of future-directed unit timelike $u$ satisfying $u\cdot W\neq 0$ is dense in the hyperboloid of all possible $u$ , and thus assuming continuity of $\chi$ , we conclude that $D(u,\tau)=0$ for all $u$ , finishing the proof.

If $W=0$ , then by the Møller disc theorem all SSC worldlines coincide, and thus we also have $\chi=\chi^{\mathrm{NW}}$ . ∎

Looking back at the various steps of the proofs, it is interesting to note how the ‘extrinsic–intrinsic’ combination $u+P/(mc)$ for $f$ came about. It entered through the unique boost transformation (3.24) that was needed in order to transform an intrinsic quantity to an externally specified rest frame. The intrinsic quantity is the spin vector in the momentum rest frame, i.e. the Pauli–Lubański vector, which is a function on $\Gamma$ only, and the externally specified frame is defined by $u$ , which is independent of $\Gamma$ and determined through the choice of $\Sigma\in\mathsf{SpHP}$ .

4 A Newton–Wigner theorem for classical elementary systems

For elementary Poincaré-invariant quantum systems – i.e. quantum systems with an irreducible unitary action of the Poincaré group – the Newton–Wigner position operator is uniquely characterised by transforming ‘as a position should’ under translations, rotations and time reversal, having commuting components and satisfying a regularity condition. This has been well-known since the original publication by Newton and Wigner [4]. As advertised in the introduction, we shall now prove an analogous statement for classical systems.

For the whole of this section, we fix a future-directed unit timelike vector $u$ defining a Lorentz frame, and an adapted positively oriented orthonormal basis $\{u=e_{0},\dots,e_{3}\}$ . Unless otherwise stated, phrases such as ‘temporal’, ‘spatial’ and the like refer to the preferred time direction given by $u$ . We will raise and lower spatial indices by the Euclidean metric $\delta$ induced by the Minkowski metric $\eta$ on the orthogonal complement of $u$ ; the components of $\delta$ in the adapted basis are simply given by the usual Kronecker delta. We denote the spatial volume form by ${}^{(3)}\varepsilon=\iota_{u}\varepsilon$ .

We will employ a ‘three-vector’ notation for spatial vectors, for example writing $\vec{A}=(A^{a})$ . We then use the usual three-vector notations for the Euclidean scalar product $\vec{A}\cdot\vec{B}=A_{a}B^{a}$ , the Euclidean norm $|\vec{A}|:=\sqrt{\vec{A}^{2}}$ and the vector product $(\vec{A}\times\vec{B})_{a}={{}^{(3)}\varepsilon_{abc}}A^{b}B^{c}$ .

4.1 Classical elementary systems

In the quantum case, an elementary system is given by a Hilbert space with an irreducible unitary action of the Poincaré group – i.e. each state of the system is connected to any other by a Poincaré transformation. In direct analogy, we define the notion of a classical elementary system:

Definition 4.1.

A classical elementary system is a classical Poincaré-invariant system $(\Gamma,\omega,\Phi)$ , where $\Phi$ is a transitive action of the proper orthochronous Poincaré group $\mathcal{P}_{+}^{\uparrow}$ .

Note that we only assumed an action of the identity connected component of the Poincaré group, whereas Arens in [21] considered the whole Poincaré group. In the classical context, transitivity takes over the role played by irreducibility in the quantum case.

Arens classified the classical elementary systems¹² in [21]; the classification proceeds in terms of the system’s four-momentum and Pauli–Lubański vector (similar to the Wigner classification in the quantum case [39]). We are only interested in the case of timelike four-momentum. For this case, the phase space can be explicitly constructed as follows:

¹² In fact, Arens classified what he called one-particle elementary systems (systems that admit a map from $\Gamma$ to the set of lines in Minkowski space which is equivariant with respect to a certain subgroup of $\mathcal{P}_{+}^{\uparrow}$ ). However, he also proved that this ‘one-particle’ condition is fulfilled for an elementary system if and only if the four-momentum is not zero.

Theorem 4.2 (Phase space of a classical elementary system).

Any classical elementary system with timelike four-momentum is equivalent (in the sense of a symplectic isomorphism respecting the action of $\mathcal{P}_{+}^{\uparrow}$ ) to precisely one of the following two cases:

(i)

(Spin zero, one parameter $m\in\mathbb{R}_{+}$ )

•

Phase space $\Gamma=T^{*}\mathbb{R}^{3}$ with coordinates $(\vec{x},\vec{p})$ , symplectic form $\omega=\mathrm{d}x^{a}\wedge\mathrm{d}p_{a}$

•

Poincaré generators (i.e. component functions of the momentum map):


$\text{spatial translations}\quad P_{a}=p_{a}$ (4.1a)
$\text{time translation}\quad P_{0}=-\sqrt{m^{2}c^{2}+\vec{p}^{2}}$ (4.1b)
$\text{rotations}\quad J_{ab}=x_{a}p_{b}-x_{b}p_{a}$ (4.1c)
$\text{boosts}\quad J_{a0}=P_{0}x_{a}$ (4.1d)

(ii)

(Spin non-zero, two parameters $m,S\in\mathbb{R}_{+}$ )

•

Phase space $\Gamma=T^{*}\mathbb{R}^{3}\times\mathsf{S}^{2}$ with coordinates $(\vec{x},\vec{p})$ for $T^{*}\mathbb{R}^{3}$ , symplectic form $\omega=\mathrm{d}x^{a}\wedge\mathrm{d}p_{a}+S\cdot\mathrm{d}\Omega^{2}$ where $\mathrm{d}\Omega^{2}$ is the standard volume form on $\mathsf{S}^{2}$ . We denote the phase space function projecting onto the second factor $\mathsf{S}^{2}$ by $\hat{s}\colon\Gamma\to\mathsf{S}^{2}\subset\mathbb{R}^{3}$ . The spin vector observable is the $\mathsf{S}^{2}_{S}$ -valued phase space function $\vec{s}:=S\cdot\hat{s}$ ; its components satisfy the Poisson bracket relations

$\{s_{a},s_{b}\}={{}^{(3)}\varepsilon_{abc}}s^{c}.$ (4.2)

Here $\mathsf{S}^{2}_{S}\subset\mathbb{R}^{3}$ denotes the 2-sphere of radius $S$ in $\mathbb{R}^{3}$ .

•

Poincaré generators (i.e. component functions of the momentum map):


$\text{spatial translations}\quad P_{a}=p_{a}$ (4.3a)
$\text{time translation}\quad P_{0}=-\sqrt{m^{2}c^{2}+\vec{p}^{2}}$ (4.3b)
$\text{rotations}\quad J_{ab}=x_{a}p_{b}-x_{b}p_{a}+{{}^{(3)}\varepsilon_{abc}}s^{c}$ (4.3c)
$\text{boosts}\quad J_{a0}=P_{0}x_{a}-\frac{(\vec{p}\times\vec{s})_{a}}{mc-P_{0}}$ (4.3d)

Note that the explicit construction of the systems in [21] as co-adjoint orbits of $\mathcal{P}_{+}^{\uparrow}$ is in fact quite different in appearance from the forms given above. However, one can show that the above systems are indeed elementary systems (i.e. that the action of $\mathcal{P}_{+}^{\uparrow}$ is transitive), and thus due to Arens’ uniqueness result they are possible representatives of their respective classes. We will use the forms given above, which were anticipated by Bacry in [40], since they are easier to work with explicitly. To unify notation, we let $S=0,\vec{s}:=0$ in the case of zero-spin systems. Furthermore, we introduce the open subset of phase space $\Gamma^{*}:=\Gamma\setminus\{|\vec{P}|=0\}$ and the $\mathsf{S}^{2}$ -valued function $\hat{P}:=\frac{\vec{P}}{|\vec{P}|}$ on $\Gamma^{*}$ .
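To connect the explicit generators (4.3) with the constructions of section 3, one may evaluate the Newton–Wigner position (3.17) and the Pauli–Lubański vector (2.22) on them. The following Python sketch does this for one arbitrarily chosen state in the frame $u=e_{0}$ at $\tau=0$ ; the helper names and the sample values of $mc$ , $\vec{x}$ , $\vec{p}$ and $\vec{s}$ are assumptions made for illustration. For these values the spatial part of the Newton–Wigner position reproduces $\vec{x}$ , and $W^{2}=(mc)^{2}\vec{s}^{2}$ , consistent with $S$ being the spin in the sense of theorem 3.6.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric, signature (-,+,+,+)

def mdot(a, b):
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def lower(a):
    return [ETA[m] * a[m] for m in range(4)]

def eps3(a, b, c):
    """Three-dimensional antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def eps(i, j, k, l):
    """Four-dimensional antisymmetric symbol with eps(0, 1, 2, 3) = +1."""
    idx = (i, j, k, l)
    if len(set(idx)) < 4:
        return 0
    inversions = sum(1 for a in range(4) for b in range(a + 1, 4) if idx[a] > idx[b])
    return -1 if inversions % 2 else 1

def generators(mc, x, p, s):
    """Contravariant P^mu and covariant J_{mu nu} of the spinning system, cf. (4.3)."""
    P0 = -math.sqrt(mc**2 + sum(c * c for c in p))           # P_0 < 0, (4.3b)
    pxs = [sum(eps3(a, b, c) * p[b] * s[c] for b in range(3) for c in range(3))
           for a in range(3)]                                  # p x s
    J = [[0.0] * 4 for _ in range(4)]
    for a in range(3):
        J[a + 1][0] = P0 * x[a] - pxs[a] / (mc - P0)          # boosts, (4.3d)
        J[0][a + 1] = -J[a + 1][0]
        for b in range(3):
            J[a + 1][b + 1] = (x[a] * p[b] - x[b] * p[a]
                               + sum(eps3(a, b, c) * s[c] for c in range(3)))  # (4.3c)
    return [-P0, *p], J

def ssc_position(J, P, f, u, tau):
    """Covariant components chi_mu(u, tau) of (3.14)."""
    P_lo = lower(P)
    fP, uP = mdot(f, P), mdot(u, P)
    Jf = [sum(J[m][r] * f[r] for r in range(4)) for m in range(4)]
    Juf = sum(u[l] * Jf[l] for l in range(4))
    return [Jf[m] / fP + tau * P_lo[m] / (-uP) - Juf / (-fP) * P_lo[m] / (-uP)
            for m in range(4)]

def pauli_lubanski(P_up, J_lo):
    """W_mu = -1/2 eps_{mu nu rho sigma} P^nu J^{rho sigma}, cf. (2.22)."""
    J_up = [[ETA[r] * ETA[s] * J_lo[r][s] for s in range(4)] for r in range(4)]
    return [-0.5 * sum(eps(m, n, r, s) * P_up[n] * J_up[r][s]
                       for n in range(4) for r in range(4) for s in range(4))
            for m in range(4)]

# Arbitrary sample state.
mc = 1.5
x, p, s = (0.4, -1.2, 0.7), (0.3, -0.2, 0.5), (0.1, 0.6, -0.3)
P, J = generators(mc, x, p, s)

# Newton-Wigner position in the frame u = e_0 at tau = 0: f = u + P/(mc).
u = [1.0, 0.0, 0.0, 0.0]
chi_NW = ssc_position(J, P, [u[m] + P[m] / mc for m in range(4)], u, 0.0)

# Invariant W^2 compared with (mc)^2 s^2.
W_up = lower(pauli_lubanski(P, J))
W2 = mdot(W_up, W_up)
s2 = sum(c * c for c in s)
```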

Using the explicit form of the systems given in theorem 4.2, one directly checks:

Lemma 4.3.

For a classical elementary system with timelike four-momentum, the functions $P_{a},\hat{P}\cdot\vec{s}$ (or just the $P_{a}$ in the case of zero spin) form a complete involutive set on $\Gamma^{*}$ (or the whole of $\Gamma$ in the case of zero spin). ∎

The behaviour of the momentum and spin vectors under translations and rotations is also easily obtained:

Lemma 4.4.

For a classical elementary system with timelike four-momentum, $\vec{P}$ and $\vec{s}$ are invariant under translations and ‘transform as vectors’ under spatial rotations, i.e. we have

\{P_{a},V_{b}\}=0,\quad\{J_{ab},V_{c}\}=\delta_{ac}V_{b}-\delta_{bc}V_{a}\quad\text{for}\quad\vec{V}=\vec{P},\vec{s}.

(4.4)

Proof.

For $\vec{P}$ , these are part of the Poincaré algebra relations and thus true by definition. For $\vec{s}$ , they are easily confirmed using the explicit form of the Poincaré generators. ∎

For our considerations, we will need to know how the time reversal operation with respect to the hyperplane in $M$ through the origin $o\in M$ and orthogonal to $u=e_{0}$ is implemented on phase space. In order to get this right, we recall that the incorporation of time reversal in the context of Special Relativity corresponds, by its very definition, to a particular upward $\mathbb{Z}_{2}$ extension¹³ of $\mathcal{P}_{+}^{\uparrow}$ , i.e. the formation of a new group called $\mathcal{P}_{+}^{\uparrow}\cup\mathcal{P}_{-}^{\downarrow}$ of which $\mathcal{P}_{+}^{\uparrow}$ is a normal subgroup with $(\mathcal{P}_{+}^{\uparrow}\cup\mathcal{P}_{-}^{\downarrow})/\mathcal{P}_{+}^{\uparrow}\cong\mathbb{Z}_{2}$ . It is the particular nature of this extension that eventually defines what is meant by time reversal: it consists in the requirement that the outer automorphism induced by the only non-trivial element of $\mathbb{Z}_{2}$ on the Lie algebra $\mathfrak{p}$ of $\mathcal{P}_{+}^{\uparrow}$ shall be the one which reverses the sign of spatial translations and rotations and leaves invariant boosts and time translations; see, e.g., [42]. Implementing time reversal on phase space then means to extend the action of $\mathcal{P}_{+}^{\uparrow}$ to an action of $\mathcal{P}_{+}^{\uparrow}\cup\mathcal{P}_{-}^{\downarrow}$ .

¹³ Here we are using the terminology of [41, p. xx], according to which a group $G$ with normal subgroup $A$ and quotient $G/A\cong B$ is either called an upward extension of $A$ by $B$ or a downward extension of $B$ by $A$ .

Now, according to this scheme, we can immediately write down how our particular time reversal transformation on phase space, $T_{u}\colon\Gamma\to\Gamma$ , acts on the Poincaré generators, i.e. the component functions of the momentum map:

P_{a}\circ T_{u}=-P_{a}\;,\quad J_{ab}\circ T_{u}=-J_{ab}\;,\quad J_{a0}\circ T_{u}=J_{a0}\;,\quad P_{0}\circ T_{u}=P_{0}

(4.5)

From this the well-known result follows that time reversal (as defined above) necessarily corresponds to an anti-symplectomorphism. Hence, in the process of extending our symplectic action of $\mathcal{P}_{+}^{\uparrow}$ on $\Gamma$ to an action of $\mathcal{P}_{+}^{\uparrow}\cup\mathcal{P}_{-}^{\downarrow}$ satisfying the time reversal criterion above, we had to generalise to possibly anti-symplectomorphic actions. This is akin to the situation in Quantum Mechanics, where, as is well-known, time reversal necessarily corresponds to an anti-unitary transformation.

It is now clear how time reversal is implemented in the case at hand:

Lemma 4.5.

For an elementary system as in theorem 4.2, time reversal with respect to the hyperplane through the origin and orthogonal to $u=e_{0}$ is given by

T_{u}\colon(\vec{x},\vec{p},\hat{s})\mapsto(\vec{x},-\vec{p},-\hat{s}).

(4.6) ∎

Unless otherwise stated, in the following we will always mean time reversal with respect to the hyperplane through the origin and orthogonal to $u=e_{0}$ when saying ‘time reversal’.
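As a numerical sanity check of lemma 4.5, one can evaluate the generators (4.3) before and after applying the map (4.6) and compare with (4.5). The following is only a minimal sketch in plain Python; the function names are ours and purely illustrative, the values $m=c=1$ are arbitrary defaults, and the rotation generators are stored as the dual vector $\frac{1}{2}{{}^{(3)}\varepsilon^{cab}}J_{ab}$ rather than as an antisymmetric matrix.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def poincare_generators(x, p, s, m=1.0, c=1.0):
    """Generators (4.3) at the phase space point (x, p, s), where s = S * s_hat."""
    P0 = -math.sqrt(m*m*c*c + sum(pi*pi for pi in p))          # (4.3b)
    L = cross(x, p)
    # Rotations (4.3c), stored as the dual vector (x × p + s)^c
    J = tuple(L[i] + s[i] for i in range(3))
    pxs = cross(p, s)
    K = tuple(P0*x[i] - pxs[i]/(m*c - P0) for i in range(3))   # boosts J_a0, (4.3d)
    return {"P": tuple(p), "P0": P0, "J": J, "K": K}

def time_reversal(x, p, s):
    """The map T_u of (4.6): x is unchanged, p and s are reversed."""
    return tuple(x), tuple(-pi for pi in p), tuple(-si for si in s)

def check_time_reversal(x, p, s, tol=1e-12):
    """Return True if the generators transform under T_u as stated in (4.5)."""
    before = poincare_generators(x, p, s)
    after = poincare_generators(*time_reversal(x, p, s))
    close = lambda u, v: all(abs(ui - vi) < tol for ui, vi in zip(u, v))
    neg = lambda u: tuple(-ui for ui in u)
    return (close(after["P"], neg(before["P"]))
            and close(after["J"], neg(before["J"]))
            and close(after["K"], before["K"])
            and abs(after["P0"] - before["P0"]) < tol)

print(check_time_reversal((0.3, -1.2, 0.7), (0.5, 0.4, -0.9), (0.1, 0.2, -0.2)))
```

Such a check at sample points does not replace the algebraic verification, but it guards against sign slips in (4.3) and (4.5).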

4.2 Statement and interpretation of the Newton–Wigner theorem

The classical Newton–Wigner theorem we are going to prove can be formulated very similarly to the quantum case:

Theorem 4.6 (Classical Newton–Wigner theorem).

For a classical elementary system with timelike four-momentum, there is a unique $\mathbb{R}^{3}$ -valued phase space function $\vec{X}$ that

(i) is $C^{1}$ ,

(ii) has Poisson-commuting components,

(iii) satisfies the canonical Poisson relations $\{X^{a},P_{b}\}=\delta^{a}_{b}$ with the generators of spatial translations with respect to $u=e_{0}$ ,

(iv) transforms ‘as a (position) vector’ under spatial rotations with respect to $u=e_{0}$ , i.e. satisfies $\{J_{ab},X^{c}\}=\delta_{a}^{c}X_{b}-\delta_{b}^{c}X_{a}$ , and

(v) is invariant under time reversal with respect to the hyperplane through the origin and orthogonal to $u=e_{0}$ , i.e. satisfies $\vec{X}\circ T_{u}=\vec{X}$ .

In terms of the Poincaré generators, it is given by

X_{a}=-\frac{J_{a0}}{mc}-\frac{J_{ab}P^{b}}{mc(mc-P_{0})}-\frac{J_{b0}P^{b}}{P_{0}mc(mc-P_{0})}P_{a}\;,

(4.7)

where $m=\sqrt{P_{0}^{2}-\vec{P}^{2}}/c$ is the mass of the system.
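That the right-hand side of (4.7) is compatible with the explicit generators (4.3) can be checked numerically: inserting (4.3) into (4.7) should return the base-point coordinate $\vec{x}$. The sketch below is an illustration only; the function names are ours, spatial indices are raised and lowered trivially (assuming the spatial part of the metric is positive, as suggested by the sign in (4.3b)), and the numerical values of $m$ and $c$ are arbitrary.

```python
import math

def eps(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def newton_wigner_position(x, p, s, m, c):
    """Right-hand side of (4.7), built from the generators (4.3)."""
    R = range(3)
    P0 = -math.sqrt(m*m*c*c + sum(p[a]**2 for a in R))            # (4.3b)
    J = [[x[a]*p[b] - x[b]*p[a] + sum(eps(a, b, k)*s[k] for k in R)
          for b in R] for a in R]                                  # (4.3c)
    p_cross_s = [sum(eps(a, b, k)*p[b]*s[k] for b in R for k in R) for a in R]
    K = [P0*x[a] - p_cross_s[a]/(m*c - P0) for a in R]             # J_a0, (4.3d)
    KP = sum(K[b]*p[b] for b in R)                                 # J_b0 P^b
    return [-K[a]/(m*c)
            - sum(J[a][b]*p[b] for b in R)/(m*c*(m*c - P0))
            - KP*p[a]/(P0*m*c*(m*c - P0))
            for a in R]

x, p, s = (0.3, -1.2, 0.7), (0.5, 0.4, -0.9), (0.1, 0.2, -0.2)
X = newton_wigner_position(x, p, s, m=2.0, c=3.0)
print(max(abs(X[a] - x[a]) for a in range(3)))  # deviation from x, up to rounding
```

The spin-dependent terms of the first two summands cancel, and the remaining terms combine to $\vec{x}$ by means of $\vec{P}^{2}=P_{0}^{2}-m^{2}c^{2}$.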

Before proving the theorem in the next section, we will now discuss the interpretation of the ‘position’ $\vec{X}$ it characterises. We want to interpret the value of $\vec{X}$ (in some state $\gamma\in\Gamma$ ) as the spatial components of a point in Minkowski spacetime $M$ . Since $\vec{X}$ is invariant under time reversal with respect to the hyperplane through the origin and orthogonal to $u=e_{0}$ , it can be interpreted as defining a point on this hyperplane. Thus, if we want to use the phase space function from the Newton–Wigner theorem to define a position observable $\chi$ in the sense of section 3.1, we should set (in our basis adapted to $u$ )

\chi^{a}(u,\tau=0):=X^{a}\;,\quad\chi^{0}(u,\tau=0):=0.

(4.8)

The transformation behaviour of $\vec{X}$ under spatial translations and rotations (i.e. assumptions (iii) and (iv) of theorem 4.6) will then ensure that the position observable $\chi$ is covariant (in the sense of definition 3.2) with respect to these transformations.

In fact, comparing (4.7) to the expression (3.17) for the Newton–Wigner position observable $\chi^{\mathrm{NW}}$ , we see that we have (in our adapted basis)

\chi^{\mathrm{NW},a}(u,\tau=0)=X^{a}\;,\quad\chi^{\mathrm{NW},0}(u,\tau=0)=0\colon

(4.9)

the position $\vec{X}$ characterised by theorem 4.6 is the one given by the Newton–Wigner position observable $\chi^{\mathrm{NW}}$ on the hyperplane $(u,0)\in\mathsf{SpHP}$ (which is a covariant position observable due to proposition 3.5). Let us also remark that since any position observable’s dependence on $\tau$ is fixed by (3.8), a position observable satisfying (4.8) is equal to the Newton–Wigner observable $\chi^{\mathrm{NW}}$ on the whole family of hyperplanes $\Sigma\in\mathsf{SpHP}$ with normal vector $u$ .

Combining this identification with the observation that we can freely choose the origin $o\in M$ , we can restate the Newton–Wigner theorem in the following form:

Theorem 4.7 (Classical Newton–Wigner theorem, version 2).

For a classical elementary system with timelike four-momentum, given any hyperplane $\Sigma=(u,\tau)\in\mathsf{SpHP}$ , there is a unique $\Sigma$ -valued phase space function $\chi^{\mathrm{NW}}(\Sigma)$ that

(i) is $C^{1}$ ,

(ii) has Poisson-commuting components, i.e.

$\left\{\chi^{\mathrm{NW},\mu}(\Sigma),\chi^{\mathrm{NW},\nu}(\Sigma)\right\}=0,$ (4.10a)

(iii) satisfies the canonical Poisson relations with the generators of spatial translations with respect to $u$ , i.e.

$v_{\mu}w^{\nu}\left\{\chi^{\mathrm{NW},\mu}(\Sigma),P_{\nu}\right\}=v\cdot w\;\text{for}\;v,w\in u^{\perp},$ (4.10b)

(iv) transforms ‘as a position’ under spatial rotations with respect to $u$ , i.e. satisfies

$v^{\mu}\tilde{v}^{\nu}w_{\rho}\left\{J_{\mu\nu},\chi^{\mathrm{NW},\rho}(\Sigma)\right\}=v^{\mu}\tilde{v}^{\nu}w_{\rho}\left[\delta_{\mu}^{\rho}\chi^{\mathrm{NW}}_{\nu}(\Sigma)-\delta_{\nu}^{\rho}\chi^{\mathrm{NW}}_{\mu}(\Sigma)\right]\;\text{for}\;v,\tilde{v},w\in u^{\perp},$ (4.10c)

and

(v) is invariant under time reversal with respect to $\Sigma$ .

These $\chi^{\mathrm{NW}}(\Sigma)$ together form the Newton–Wigner observable as given by (3.17). ∎

4.3 Proof of the Newton–Wigner theorem

Proof of theorem 4.6.

For the whole of the proof, we will work with the explicit form of the phase space of our elementary system given in theorem 4.2. It is easily verified that in this explicit form, $\vec{x}$ (i.e. the coordinate of the base point in $T^{*}\mathbb{R}^{3}$ ) is a phase space function with the properties demanded for $\vec{X}$ . Thus we need to prove uniqueness. Our proof will follow the proof of the quantum-mechanical Newton–Wigner theorem given by Jordan in [43], some parts of which can be applied literally to the classical case.

We will need the following lemma several times.

Lemma 4.8.

Consider a classical elementary system with timelike four-momentum, with phase space $\Gamma$ , and some open subset $\tilde{\Gamma}$ of $\Gamma^{*}=\Gamma\setminus\{|\vec{P}|=0\}$ . Let $f$ be an $\mathbb{R}$ -valued $C^{1}$ function defined on $\tilde{\Gamma}$ that is invariant under spatial translations and rotations, i.e. $\{P_{a},f\}=0=\{J_{ab},f\}$ . Then $f$ is a function of $|\vec{P}|,\hat{P}\cdot\vec{s}$ .¹⁴

¹⁴ By ‘ $f$ is a function of $|\vec{P}|,\hat{P}\cdot\vec{s}$ ’ we mean that $f$ depends on phase space only via $|\vec{P}|,\hat{P}\cdot\vec{s}$ , i.e. that there is a $C^{1}$ function $F\colon U\to\mathbb{R}$ , $U=\left\{(|\vec{P}|(\gamma),(\hat{P}\cdot\vec{s})(\gamma)):\gamma\in\tilde{\Gamma}\right\}\subset\mathbb{R}_{+}\times[-S,S]$ , satisfying $f(\gamma)=F(|\vec{P}|(\gamma),(\hat{P}\cdot\vec{s})(\gamma))\;\text{for all}\;\gamma\in\tilde{\Gamma}.$

Proof.

$f$ Poisson-commutes with $\vec{P}$ and $J_{ab}$ . Therefore it also Poisson-commutes with $\vec{P}$ and $\frac{1}{2}{{}^{(3)}\varepsilon^{abc}}\hat{P}_{a}J_{bc}=\hat{P}\cdot\vec{s}$ . Now $\vec{P},\hat{P}\cdot\vec{s}$ form a complete involutive set on $\Gamma^{*}$ (lemma 4.3), so since $f$ Poisson-commutes with them, it must be a function of $\vec{P},\hat{P}\cdot\vec{s}$ . Since $f$ and $\hat{P}\cdot\vec{s}$ are rotation invariant (by lemma 4.4), $f$ must be a function of $|\vec{P}|,\hat{P}\cdot\vec{s}$ . ∎
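The identity $\frac{1}{2}{{}^{(3)}\varepsilon^{abc}}\hat{P}_{a}J_{bc}=\hat{P}\cdot\vec{s}$ used in this proof can be read off from the explicit form (4.3c) of the rotation generators. A short sketch of the computation, using $P_{a}=p_{a}$ and the contraction ${{}^{(3)}\varepsilon^{abc}}\,{{}^{(3)}\varepsilon_{bcd}}=2\delta^{a}_{d}$:

```latex
\begin{align*}
\tfrac{1}{2}\,{}^{(3)}\varepsilon^{abc}\hat{P}_{a}J_{bc}
  &= \tfrac{1}{2}\,{}^{(3)}\varepsilon^{abc}\hat{P}_{a}\left(x_{b}p_{c}-x_{c}p_{b}\right)
   + \tfrac{1}{2}\,{}^{(3)}\varepsilon^{abc}\,{}^{(3)}\varepsilon_{bcd}\,\hat{P}_{a}s^{d} \\
  &= \hat{P}\cdot(\vec{x}\times\vec{p}) + \delta^{a}_{d}\,\hat{P}_{a}s^{d} \\
  &= \hat{P}\cdot\vec{s}\,.
\end{align*}
```

The orbital term drops out because $\hat{P}$ is parallel to $\vec{p}$, so the triple product $\hat{P}\cdot(\vec{x}\times\vec{p})$ vanishes.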

Let now $\vec{X}$ be an observable as in the statement of theorem 4.6, and consider the difference $\vec{d}:=\vec{X}-\vec{x}$ . Due to the assumptions of theorem 4.6, $\vec{d}$ is $C^{1}$ , is invariant under translations (i.e. $\{d^{a},P_{b}\}=0$ ), transforms as a vector under spatial rotations (i.e. $\{J_{ab},d^{c}\}=\delta_{a}^{c}d_{b}-\delta_{b}^{c}d_{a}$ ) and is invariant under time reversal with respect to the hyperplane through the origin and orthogonal to $u$ (i.e. $\vec{d}\circ T_{u}=\vec{d}$ ).

Lemma 4.9.

Let $\vec{A}$ be an $\mathbb{R}^{3}$ -valued $C^{1}$ phase space function on a classical elementary system with timelike four-momentum that is invariant under translations, transforms as a vector under spatial rotations and is invariant under time reversal. Then $\vec{A}\cdot\vec{P}=0$ .

Proof.

Since $\vec{P}$ is invariant under translations and a vector under rotations, $\vec{A}\cdot\vec{P}$ is invariant under translations and rotations. By lemma 4.8, $\left.\vec{A}\cdot\vec{P}\right|_{\Gamma^{*}}$ is a function of $|\vec{P}|,\hat{P}\cdot\vec{s}$ . This means we have

\left.\vec{A}\cdot\vec{P}\right|_{\Gamma^{*}}=F(|\vec{P}|,\hat{P}\cdot\vec{s})

(4.11)

for some function $F\colon\mathbb{R}_{+}\times[-S,S]\to\mathbb{R}$ .

Now considering time reversal $T_{u}$ , on the one hand we have (using lemma 4.5)

$|\vec{P}|\circ T_{u}=|\vec{P}\circ T_{u}|=|-\vec{P}|=|\vec{P}|$ (4.12a)

and

$\begin{aligned}(\hat{P}\cdot\vec{s})\circ T_{u}&=\left(\frac{1}{2}{{}^{(3)}\varepsilon^{abc}}\hat{P}_{a}J_{bc}\right)\circ T_{u}\\ &=\frac{1}{2}{{}^{(3)}\varepsilon^{abc}}(\hat{P}_{a}\circ T_{u})(J_{bc}\circ T_{u})\\ &=\frac{1}{2}{{}^{(3)}\varepsilon^{abc}}(-\hat{P}_{a})(-J_{bc})\\ &=\frac{1}{2}{{}^{(3)}\varepsilon^{abc}}\hat{P}_{a}J_{bc}\\ &=\hat{P}\cdot\vec{s},\end{aligned}$ (4.12b)

implying

F(|\vec{P}|,\hat{P}\cdot\vec{s})\circ T_{u}=F(|\vec{P}|\circ T_{u},(\hat{P}\cdot\vec{s})\circ T_{u})=F(|\vec{P}|,\hat{P}\cdot\vec{s}).

(4.13)

On the other hand, $\vec{A}$ is invariant under time reversal while $\vec{P}$ changes its sign, implying that $(\vec{A}\cdot\vec{P})\circ T_{u}=-\vec{A}\cdot\vec{P}$ . Combining this with (4.11) and (4.13), we obtain $\left.\vec{A}\cdot\vec{P}\right|_{\Gamma^{*}}=0$ , and continuity implies $\vec{A}\cdot\vec{P}=0$ . ∎

For zero spin, we can easily complete the proof of the Newton–Wigner theorem. Since the difference vector $\vec{d}$ is translation invariant and the $P_{a}$ form a complete involutive set on $\Gamma$ , $\vec{d}$ must be a function of $\vec{P}$ . Then since it is a vector under rotations, it must be of the form

\vec{d}(\vec{P})=F(|\vec{P}|)\vec{P}

(4.14)

for some function $F$ of $|\vec{P}|$ . Then, since according to lemma 4.9 $\vec{d}\cdot\vec{P}$ is zero, $\vec{d}$ is zero. Thus, for the spin-zero case, we have proved the Newton–Wigner theorem without any use of the condition of Poisson-commuting components of the position observable.

For the non-zero spin case, we continue as follows.

Lemma 4.10.

Let $\vec{A}$ be an $\mathbb{R}^{3}$ -valued $C^{1}$ phase space function on a classical elementary system with timelike four-momentum and non-zero spin that is invariant under translations, transforms as a vector under spatial rotations and satisfies $\vec{A}\cdot\vec{P}=0$ . Then it is of the form

\vec{A}=B\hat{P}\times\vec{s}+C\hat{P}\times(\hat{P}\times\vec{s})

(4.15)

on $\Gamma^{*}\setminus\{\vec{s}\parallel\hat{P}\}$ , where $B$ and $C$ are $C^{1}$ functions of $|\vec{P}|$ and $\hat{P}\cdot\vec{s}$ , i.e. $C^{1}$ functions

B,C\colon\mathbb{R}_{+}\times(-S,S)\to\mathbb{R}.

Proof.

For the whole of this proof, we will work on $\tilde{\Gamma}:=\Gamma^{*}\setminus\{\vec{s}\parallel\hat{P}\}$ . Since evaluated at each point of $\tilde{\Gamma}$ , the $\mathbb{R}^{3}$ -valued functions $\hat{P},\hat{P}\times\vec{s},\hat{P}\times(\hat{P}\times\vec{s})$ form an orthogonal basis of $\mathbb{R}^{3}$ , and since we have $\vec{A}\cdot\vec{P}=0$ , we can write $\vec{A}$ in the form (4.15) with coefficients $B,C$ given by

$\displaystyle B=\frac{\vec{A}\cdot(\hat{P}\times\vec{s})}{\|\hat{P}\times\vec{s}\|^{2}}\;,$ (4.16)

$\displaystyle C=\frac{\vec{A}\cdot(\hat{P}\times(\hat{P}\times\vec{s}))}{\|\hat{P}\times(\hat{P}\times\vec{s})\|^{2}}\;.$ (4.17)

Since $\vec{A}$ , $\vec{P}$ and $\vec{s}$ are invariant under translations and vectors under rotations, these equations imply that $B,C$ are invariant under translations and rotations. The result follows with lemma 4.8. ∎

Now we consider again the difference vector $\vec{d}=\vec{X}-\vec{x}$ . It satisfies $\vec{d}\cdot\vec{P}=0$ by lemma 4.9, and thus we have

\vec{X}\cdot\vec{P}=\vec{x}\cdot\vec{P}.

(4.18)

Since we assume that the components of $\vec{X}$ Poisson-commute with each other and that $\{X^{a},P_{b}\}=\delta^{a}_{b}$ , this implies

\{X^{a},\vec{x}\cdot\vec{P}\}=\{X^{a},\vec{X}\cdot\vec{P}\}=X^{a}.

(4.19)

Combining this with $\{x^{a},\vec{x}\cdot\vec{P}\}=x^{a}$ , we obtain

\{d^{a},\vec{x}\cdot\vec{P}\}=d^{a}.

(4.20)

On the other hand, for any function $F$ of $\vec{P}$ and $\vec{s}$ , we have

\{F(\vec{P},\vec{s}),\vec{x}\cdot\vec{P}\}=\{F(\vec{P},\vec{s}),x^{a}\}P_{a}=-\frac{\partial F(\vec{P},\vec{s})}{\partial P_{a}}P_{a}=-|\vec{P}|\left.\frac{\partial F}{\partial|\vec{P}|}\right|_{\hat{P}=\mathrm{const.},\vec{s}=\mathrm{const.}}.

(4.21)

This implies

\vec{d}=-|\vec{P}|\left.\frac{\partial\vec{d}}{\partial|\vec{P}|}\right|_{\hat{P}=\mathrm{const.},\vec{s}=\mathrm{const.}}.

(4.22)

Combining lemmas 4.9 and 4.10, we know that $\vec{d}$ has the form (4.15) on $\Gamma^{*}\setminus\{\vec{s}\parallel\hat{P}\}$ for two functions $B,C\colon\mathbb{R}_{+}\times(-S,S)\to\mathbb{R}$ . Thus (4.22) implies the two equations

B(|\vec{P}|,\hat{P}\cdot\vec{s})=-|\vec{P}|\frac{\partial B(|\vec{P}|,\hat{P}\cdot\vec{s})}{\partial|\vec{P}|},\;C(|\vec{P}|,\hat{P}\cdot\vec{s})=-|\vec{P}|\frac{\partial C(|\vec{P}|,\hat{P}\cdot\vec{s})}{\partial|\vec{P}|}

(4.23)

on $\mathbb{R}_{+}\times(-S,S)$ . These equations determine the $|\vec{P}|$ dependence of $B$ and $C$ ; they must be proportional to $|\vec{P}|^{-1}$ . However, for $\vec{d}$ to be $C^{1}$ on the whole of $\Gamma$ , in fact for (4.15) not to diverge as $|\vec{P}|\to 0$ even when coming from a single direction $\hat{P}$ , we then need $B$ and $C$ to vanish. Continuity implies $\vec{d}=0$ on all of $\Gamma$ . This finishes the proof of the Newton–Wigner theorem. ∎
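The last step of the proof, from (4.23) to the vanishing of $B$ and $C$, can be spelled out as follows. This is only a sketch: the abbreviation $\sigma:=\hat{P}\cdot\vec{s}$ and the names $b,c$ for the integration ‘constants’ are introduced here for illustration, and we use that $\hat{P}\times\vec{s}$ and $\hat{P}\times(\hat{P}\times\vec{s})$ are orthogonal with common squared norm $S^{2}-\sigma^{2}$.

```latex
\begin{align*}
B = -|\vec{P}|\,\frac{\partial B}{\partial|\vec{P}|}
  &\iff \frac{\partial}{\partial|\vec{P}|}\bigl(|\vec{P}|\,B\bigr) = 0
   \iff B(|\vec{P}|,\sigma) = \frac{b(\sigma)}{|\vec{P}|}\,,
  \qquad\text{and likewise}\quad C(|\vec{P}|,\sigma) = \frac{c(\sigma)}{|\vec{P}|}\,, \\
\|\vec{d}\|^{2}
  &= \bigl(B^{2}+C^{2}\bigr)\,\|\hat{P}\times\vec{s}\|^{2}
   = \frac{b(\sigma)^{2}+c(\sigma)^{2}}{|\vec{P}|^{2}}\,\bigl(S^{2}-\sigma^{2}\bigr)\,.
\end{align*}
```

For fixed $\hat{P}$ and $\vec{s}$ with $|\sigma|<S$, the right-hand side diverges as $|\vec{P}|\to 0$ unless $b(\sigma)=c(\sigma)=0$, which is the statement used above.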

5 Conclusion

In this paper we have studied the localisation problem for classical systems whose phase space is a symplectic manifold. We focussed on the Newton–Wigner position observable and asked for precise characterisations of it in order to gain additional understanding, over and above that already known from its practical use for the solution of concrete problems of motion, e.g., in astrophysics [16, 17]. We proved two theorems that we believe advance our understanding in the desired direction: first we showed how Fleming’s geometric scheme [15] in combination with the characterisation of worldlines through SSCs (Spin Supplementary Conditions) allows one to give a precise meaning to, and proof of, the fact that the Newton–Wigner position is the unique centre of spin. Given that interpretation, it also offers an insight as to why the Newton–Wigner SSC uses a somewhat unnatural looking ‘hybrid’ combination $f=u+\frac{P}{mc}$ , where $u$ is ‘external’ or ‘kinematical’, and $P$ is ‘internal’ or ‘dynamical’. Then, restricting to elementary systems, i.e. systems whose phase space admits a transitive action of the proper orthochronous Poincaré group, we proved again a uniqueness result to the effect that the Newton–Wigner observable is the unique phase space function whose components satisfy the ‘familiar’ Poisson relations, provided it is continuously differentiable, time-reversal invariant, and transforms as a vector under spatial rotations. These properties seem to be the underlying reason for the distinguished rôle it plays in solution strategies like those of [16, 17]¹⁵, which in recent years have led to astounding progress in the Hamiltonian analytical understanding of the dynamics of binary systems of spinning compact objects: the calculations have been pushed to ever higher post-Newtonian orders, starting from the next-to-leading order for spin–orbit and spin–spin effects, i.e. order $c^{-4}$ , in [44, 45, 46], and most recently reaching a complete description at ‘4.5th’ post-Newtonian order, i.e. order $c^{-9}$ , in [47, 48]. We believe that our results add a conceptually clear and mathematically precise Hamiltonian underpinning of what the choice of the Newton–Wigner observable entails, at least in a special-relativistic context or, more generally, in general-relativistic perturbation theory around Minkowski space.

¹⁵ Despite the fact that on a more general level of theorisation other choices (characterised by other SSCs) are often considered more appropriate; see, e.g., [18].

Acknowledgements

This work was supported by the Deutsche Forschungsgemeinschaft through the Collaborative Research Centre 1227 (DQ-mat), projects B08/A05.

References

[1] Gregory Breit, “An interpretation of Dirac’s theory of the electron,” Proceedings of the National Academy of Sciences of the United States of America 14, 553–559 (1928).
[2] Erwin Schrödinger, “Über die kräftefreie Bewegung in der relativistischen Quantenmechanik,” in Erwin Schrödinger Collected Papers Volume 3: Contributions to Quantum Theory (Verlag der Österreichischen Akademie der Wissenschaften (Wien) and Friedrich Vieweg & Sohn (Braunschweig/Wiesbaden), 1984) pp. 357–368, originally published by: Preussische Akademie der Wissenschaften, physikalisch-mathematische Klasse, year 1930, volume XXIV, pages 417–428.
[3] Erwin Schrödinger, “Zur Quantendynamik des Elektrons,” in Erwin Schrödinger Collected Papers Volume 3: Contributions to Quantum Theory (Verlag der Österreichischen Akademie der Wissenschaften (Wien) and Friedrich Vieweg & Sohn (Braunschweig/Wiesbaden), 1984) pp. 369–379, originally published by: Preussische Akademie der Wissenschaften, physikalisch-mathematische Klasse, year 1931, volume III, pages 62–72.
[4] Theodore Duddell Newton and Eugene Paul Wigner, “Localized states for elementary systems,” Reviews of Modern Physics 21, 400–406 (1949).
[5] Arthur Strong Wightman, “On the localizability of quantum mechanical systems,” Reviews of Modern Physics 34, 845–872 (1962).
[6] Veeravalli Seshadri Varadarajan, Geometry of Quantum Theory (Springer, New York, 1985).
[7] Irving Ezra Segal and Roe William Goodman, “Anti-locality of certain Lorentz-invariant operators,” Journal of Mathematics and Mechanics 14, 629–638 (1965).
[8] Gerhard C. Hegerfeldt, “Remark on causality and particle localization,” Physical Review D 10, 3320–3321 (1974).
[9] Simon N. M. Ruijsenaars, “On Newton-Wigner localization and superluminal propagation speeds,” Annals of Physics 137, 33–43 (1981).
[10] Arthur Strong Wightman and Silvan S. Schweber, “Configuration space methods in relativistic quantum field theory. I,” Physical Review 98, 812–837 (1955).
[11] Gordon N. Fleming and Jeremy Butterfield, “Strange positions,” in From Physics to Philosophy, edited by Jeremy Butterfield and Constantine Pagonis (Cambridge University Press, Cambridge, 1999) pp. 108–165.
[12] Gordon N. Fleming, “Reeh-Schlieder meets Newton-Wigner,” Philosophy of Science 67, S495–S515 (2000), Proceedings of the 1998 Biennial Meetings of the Philosophy of Science Association. Part II: Symposia Papers.
[13] Hans Halvorson, “Reeh-Schlieder defeats Newton-Wigner: On alternative localization schemes in relativistic quantum field theory,” Philosophy of Science 68, 111–133 (2001).
[14] Maurice Henry Lecorney Pryce, “The mass-centre in the restricted theory of relativity and its connexion with the quantum theory of elementary particles,” Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 195, 62–81 (1948).
[15] Gordon N. Fleming, “Covariant position operators, spin, and locality,” Physical Review 137, B188–B197 (1965).
[16] Jan Steinhoff, “Canonical formulation of spin in general relativity,” Annalen der Physik 523, 296–353 (2011).
[17] Gerhard Schäfer and Piotr Jaranowski, “Hamiltonian formulation of general relativity and post-Newtonian dynamics of compact binaries,” Living Reviews in Relativity 21 (2018), 10.1007/s41114-018-0016-5.
[18] Dirk Puetzfeld, Claus Lämmerzahl, and Bernard Schutz, eds., Equations of Motion in Relativistic Gravity, Fundamental Theories of Physics, Vol. 179 (Springer, Cham, 2015).
[19] L. Filipe O. Costa, Georgios Lukes-Gerakopoulos, and Oldřich Semerák, “Spinning particles in general relativity: Momentum-velocity relation for the Mathisson-Pirani spin condition,” Physical Review D 97, 084023 (2018).
[20] Richard Arens, “Classical relativistic particles,” Communications in Mathematical Physics 21, 139–149 (1971a).
[21] Richard Arens, “Classical Lorentz invariant particles,” Journal of Mathematical Physics 12, 2415–2422 (1971b).
[22] Ralph Abraham and Jerrold E. Marsden, Foundations of Mechanics, 2nd ed. (AMS Chelsea Publishing, 1978).
[23] V. I. Arnold, Mathematical Methods of Classical Mechanics, 2nd ed. (Springer, New York, 1989).
[24] Domenico Giulini, “Energy-momentum tensors and motion in special relativity,” in Equations of Motion in Relativistic Gravity, Fundamental Theories of Physics, Vol. 179, edited by Dirk Puetzfeld, Claus Lämmerzahl, and Bernard Schutz (Springer, Cham, 2015) Chap. 3, pp. 121–163.
[25] Erdal Inönü and Eugene Paul Wigner, “Representations of the Galilei group,” Il Nuovo Cimento 9, 705–718 (1952).
[26] Nicholas Woodhouse, Geometric Quantization (Clarendon Press, Oxford, 1980).
[27] Norbert Straumann, General Relativity (Springer, Dordrecht, 2013).
[28] Gordon N. Fleming, “A manifestly covariant description of arbitrary dynamical variables in relativistic quantum mechanics,” Journal of Mathematical Physics 7, 1959–1981 (1966).
[29] Domenico Giulini, “Laue’s theorem revisited: Energy–momentum tensors, symmetries, and the habitat of globally conserved quantities,” International Journal of Geometric Methods in Modern Physics 15, 1850182 (2018).
[30] Ernesto Corinaldesi and Achille Papapetrou, “Spinning test-particles in general relativity. II,” Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 209, 259–268 (1951).
[31] Wlodzimierz M. Tulczyjew, “Motion of multipole particles in general relativity theory,” Acta Physica Polonica 18, 393 (1959).
[32] Graham Dixon, “Dynamics of extended bodies in general relativity. I. momentum and angular momentum,” Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 314, 499–527 (1970).
[33] Jacov Frenkel, “Die Elektrodynamik des rotierenden Elektrons,” Zeitschrift für Physik 37, 243–262 (1926).
[34] Myron Mathisson, “Neue Mechanik materieller Systeme,” Acta Physica Polonica 6, 163–200 (1937).
[35] Myron Mathisson, “New mechanics of material systems,” General Relativity and Gravitation 42, 1011–1048 (2010), republication of original paper [34] as ‘Golden Oldie’.
[36] Felix A. E. Pirani, “On the physical significance of the Riemann tensor,” Acta Physica Polonica 15, 389–405 (1956).
[37] Felix A. E. Pirani, “On the physical significance of the Riemann tensor,” General Relativity and Gravitation 41, 1215–1232 (2009), republication of original article [36] as ‘Golden Oldie’.
[38] Christian Møller, “On the definition of the centre of gravity of an arbitrary closed system in the theory of relativity,” Communications of the Dublin Institute for Advanced Studies A, 5, 1–42 (1949).
[39] Eugene Paul Wigner, “On unitary representations of the inhomogeneous Lorentz group,” Annals of Mathematics 40, 149–204 (1939).
[40] Henri Bacry, “Space-time and degrees of freedom of the elementary particle,” Communications in Mathematical Physics 5, 97–105 (1967).
[41] John Horton Conway et al., ATLAS of Finite Groups (Clarendon Press, Oxford, 1985).
[42] Henri Bacry and Jean-Marc Lévy-Leblond, “Possible kinematics,” Journal of Mathematical Physics 9, 97–105 (1968).
[43] Thomas F. Jordan, “Simple derivation of the Newton–Wigner position operator,” Journal of Mathematical Physics 21, 2028–2032 (1980).
[44] Jan Steinhoff, Steven Hergt, and Gerhard Schäfer, “Next-to-leading order gravitational spin(1)-spin(2) dynamics in hamiltonian form,” Physical Review D 77, 081501(R) (2008a).
[45] Jan Steinhoff, Gerhard Schäfer, and Steven Hergt, “ADM canonical formalism for gravitating spinning objects,” Physical Review D 77, 104018 (2008b).
[46] Jan Steinhoff, Steven Hergt, and Gerhard Schäfer, “Spin-squared hamiltonian of next-to-leading order gravitational interaction,” Physical Review D 78, 101503(R) (2008c).
[47] Michele Levi, Stavros Mougiakakos, and Mariana Vieira, “Gravitational cubic-in-spin interaction at the next-to-leading post-Newtonian order,” (2019), arXiv:1912.06276 [hep-th].
[48] Andrea Antonelli, Chris Kavanagh, Mohammed Khalil, Jan Steinhoff, and Justin Vines, “Gravitational spin-orbit coupling through third-subleading post-Newtonian order: from first-order self-force to arbitrary mass ratios,” (2020), arXiv:2003.11391 [gr-qc].

Appendix A Sign conventions for generators of special orthogonal groups

Let $V$ be a finite-dimensional real vector space with a non-degenerate, symmetric bilinear form $g\colon V\times V\to\mathbb{R}$ . Note that we do not assume anything about the signature of $g$ . We introduce the ‘musical isomorphism’

V\to V^{*},v\mapsto v^{\flat}:=g(v,\cdot)

(A.1)

induced by $g$ .

We fix a basis $\{e_{a}\}_{a}$ of $V$ . As bases for its dual vector space $V^{*}$ we distinguish its natural dual basis $\{\theta^{a}\}_{a}$ , where $\theta^{a}(e_{b})=\delta^{a}_{b}$ , and the ( $g$ -dependent) image of $\{e_{a}\}_{a}$ under (A.1), which is just $\{e^{\flat}_{a}\}_{a}$ , where $e^{\flat}_{a}=g_{ab}\theta^{b}$ , so that $e^{\flat}_{a}(e_{b})=g_{ab}$ . The reason for this will become clear now.

For each $a,b\in\{1,\dots,\dim V\}$ we introduce the endomorphism

B_{ab}:=e_{a}\otimes e_{b}^{\flat}-e_{b}\otimes e_{a}^{\flat}\in\mathrm{End}(V),

(A.2)

which satisfies

g(v,B_{ab}(w))=g(v,e_{a})g(e_{b},w)-g(v,e_{b})g(e_{a},w)=-g(B_{ab}(v),w).

(A.3)

This means that $B_{ab}$ is anti-self-adjoint with respect to $g$ and hence that it is an element of the Lie algebra $\mathfrak{so}(V,g)$ of the Lie group $\mathsf{SO}(V,g)$ of special orthogonal transformations of $(V,g)$ :

B_{ab}\in\mathfrak{so}(V,g).

(A.4)

As $B_{ab}=-B_{ba}$ , we restrict to $a<b$ : the set $\{B_{ab}:1\leq a<b\leq\dim V\}$ is linearly independent and has as many elements as the dimension of $\mathfrak{so}(V,g)$ . Hence this set forms a basis of $\mathfrak{so}(V,g)$ so that any $\omega\in\mathfrak{so}(V,g)$ can be uniquely written in the form

\omega=\sum_{1\leq a<b\leq\dim V}\omega^{ab}B_{ab}=\frac{1}{2}\omega^{ab}B_{ab}\;,

(A.5)

where

\omega^{ab}=-\omega^{ba}\,.

(A.6)

This representation can easily be compared to the usual one in terms of the metric-independent basis $\{e_{a}\otimes\theta^{b}:1\leq a,b\leq\dim V\}$ of $\mathrm{End}(V)$ in the following way: for $\omega=\omega^{a}_{\hphantom{a}c}\,e_{a}\otimes\theta^{c}$ , we have $\omega\in\mathfrak{so}(V,g)$ if and only if

\omega^{a}_{\hphantom{a}c}\,g^{cb}=-\omega^{b}_{\hphantom{a}c}\,g^{ca}\;.

(A.7)

It is the obvious simplicity of (A.6) as opposed to (A.7) as conditions for $\omega\in\mathrm{End}(V)$ being contained in $\mathfrak{so}(V,g)\subset\mathrm{End}(V)$ that makes it easier to work with the basis $e_{a}\otimes e^{\flat}_{b}$ of $\mathrm{End}(V)$ rather than $e_{a}\otimes\theta^{b}$ . Note that the components of $\omega$ with respect to the two bases considered above are connected by the equation

\omega^{ab}=\omega^{a}_{\hphantom{a}c}\,g^{cb}\;.

(A.8)

The basis elements $B_{ab}$ satisfy the commutation relations

	$\displaystyle[B_{ab},B_{cd}]$	$\displaystyle=g_{bc}B_{ad}+g_{ad}B_{bc}-g_{ac}B_{bd}-g_{bd}B_{ac}$
		$\displaystyle=g_{bc}B_{ad}+\text{(antisymm.)},$		(A.9)

where ‘antisymm.’ is as explained below equation (2.4).
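The relations (A.3) and (A.9) are easy to check by machine. The sketch below (plain Python; all names are ours and purely illustrative) builds the matrices $(B_{ab})^{i}{}_{j}=\delta^{i}_{a}g_{bj}-\delta^{i}_{b}g_{aj}$ for a given symmetric matrix $g$ and compares both sides of (A.9) for all index combinations. Integer entries are used so that the comparison is exact.

```python
from itertools import product

def so_generators(g):
    """Matrices of B_ab = e_a ⊗ e_b^♭ − e_b ⊗ e_a^♭, with (B_ab)^i_j = δ^i_a g_bj − δ^i_b g_aj."""
    n = len(g)
    return {(a, b): [[(i == a) * g[b][j] - (i == b) * g[a][j] for j in range(n)]
                     for i in range(n)]
            for a, b in product(range(n), repeat=2)}

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

def satisfies_A3(g):
    """Check anti-self-adjointness: g(v, B w) = -g(B v, w), i.e. g B + B^T g = 0."""
    n = len(g)
    for B in so_generators(g).values():
        gB = matmul(g, B)
        if any(gB[i][j] + gB[j][i] != 0 for i in range(n) for j in range(n)):
            return False
    return True

def satisfies_A9(g):
    """Check [B_ab, B_cd] = g_bc B_ad + g_ad B_bc − g_ac B_bd − g_bd B_ac for all indices."""
    n = len(g)
    B = so_generators(g)
    for a, b, c, d in product(range(n), repeat=4):
        lhs = commutator(B[a, b], B[c, d])
        rhs = [[g[b][c] * B[a, d][i][j] + g[a][d] * B[b, c][i][j]
                - g[a][c] * B[b, d][i][j] - g[b][d] * B[a, c][i][j]
                for j in range(n)] for i in range(n)]
        if lhs != rhs:
            return False
    return True

minkowski = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(satisfies_A3(minkowski), satisfies_A9(minkowski))
```

Since nothing in (A.2), (A.3) and (A.9) depends on the signature or on the basis being orthonormal, the same functions can be applied to any non-degenerate symmetric integer matrix.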

From now on, we will assume the basis $\{e_{a}\}_{a}$ to be orthonormal. For notational convenience, for $a,b\in\{1,\dots,\dim V\}$ we define

\varepsilon_{ab}:=g_{aa}g_{bb}=\pm 1

(A.10)

which has the value $+1$ if $g_{aa}=g(e_{a},e_{a})$ and $g_{bb}=g(e_{b},e_{b})$ have the same sign, and $-1$ if they have opposite signs.¹⁶

¹⁶ Note that repeated indices on the same level, i.e. both up or both down, are not to be summed over.

We now want to compute the exponential $\exp(\alpha B_{ab})\in\mathsf{SO}(V,g)$ . At first, we note that

	$\displaystyle(B_{ab})^{2}$	$\displaystyle=-g_{bb}e_{a}\otimes e_{a}^{\flat}-g_{aa}e_{b}\otimes e_{b}^{\flat}$
		$\displaystyle=-\varepsilon_{ab}\;\mathrm{Pr}_{ab}\,,$		(A.11)

where $\mathrm{Pr}_{ab}:=\mathrm{Pr}_{\mathrm{span}\{e_{a},e_{b}\}}$ denotes the $g$ -orthogonal projector onto the plane $\mathrm{span}\{e_{a},e_{b}\}$ in $V$ .¹⁷

¹⁷ In the general case of two linearly independent vectors $v,w\in V$ , not necessarily orthonormal, the orthogonal projector is given by

$\mathrm{Pr}_{\mathrm{span}(v,w)}=\frac{1}{g(v,v)g(w,w)-(g(v,w))^{2}}\left[g(w,w)\;v\otimes v^{\flat}+g(v,v)\;w\otimes w^{\flat}-g(v,w)\;(v\otimes w^{\flat}+w\otimes v^{\flat})\right],$ (A.12)

implying

$\begin{aligned}(v\otimes w^{\flat}-w\otimes v^{\flat})^{2}&=-g(w,w)\;v\otimes v^{\flat}-g(v,v)\;w\otimes w^{\flat}+g(v,w)\;(v\otimes w^{\flat}+w\otimes v^{\flat})\\ &=-\left[g(v,v)g(w,w)-(g(v,w))^{2}\right]\mathrm{Pr}_{\mathrm{span}(v,w)}.\end{aligned}$ (A.13)

Using this and $B_{ab}\circ\mathrm{Pr}_{ab}=B_{ab}$ , the exponential series evaluates to

\begin{aligned}\exp(\alpha B_{ab})&=(\mathrm{id}_{V}-\mathrm{Pr}_{ab})+\sum_{k=0}^{\infty}\frac{1}{(2k)!}\,\alpha^{2k}(-\varepsilon_{ab})^{k}\,\mathrm{Pr}_{ab}\\ &\quad+\sum_{k=0}^{\infty}\frac{1}{(2k+1)!}\,\alpha^{2k+1}(-\varepsilon_{ab})^{k}\,B_{ab}\circ\mathrm{Pr}_{ab}\\ &=(\mathrm{id}_{V}-\mathrm{Pr}_{ab})+\left\{\!\!\begin{aligned} &\cos(\alpha)\,\mathrm{id}_{V}+\sin(\alpha)\,B_{ab}\,,&\varepsilon_{ab}=+1\\ &\cosh(\alpha)\,\mathrm{id}_{V}+\sinh(\alpha)\,B_{ab}\,,&\varepsilon_{ab}=-1\end{aligned}\right\}\circ\mathrm{Pr}_{ab}\;.\end{aligned}

(A.14)

Geometrically, this transformation is either a rotation by the angle $\alpha$ (for $\varepsilon_{ab}=+1$) or a boost by rapidity $\alpha$ (for $\varepsilon_{ab}=-1$) in the plane $\mathrm{span}\{e_{a},e_{b}\}$. The direction of the transformation depends on the signs of $g_{aa}$ and $g_{bb}$:

• $\varepsilon_{ab}=+1$:

  (i) $g_{aa}=g_{bb}=+1$: We have $B_{ab}(e_{a})=-e_{b}$, $B_{ab}(e_{b})=e_{a}$. Thus, $\exp(\alpha B_{ab})$ is a rotation by $\alpha$ from $e_{b}$ towards $e_{a}$.

  (ii) $g_{aa}=g_{bb}=-1$: We have $B_{ab}(e_{a})=e_{b}$, $B_{ab}(e_{b})=-e_{a}$. Thus, $\exp(\alpha B_{ab})$ is a rotation by $\alpha$ from $e_{a}$ towards $e_{b}$.

• $\varepsilon_{ab}=-1$:

  (i) $g_{aa}=+1$, $g_{bb}=-1$: We have $B_{ab}(e_{a})=-e_{b}$, $B_{ab}(e_{b})=-e_{a}$. Thus, $\exp(\alpha B_{ab})$ is a boost by $\alpha$ ‘away’ from $e_{a}+e_{b}$.

  (ii) $g_{aa}=-1$, $g_{bb}=+1$: We have $B_{ab}(e_{a})=e_{b}$, $B_{ab}(e_{b})=e_{a}$. Thus, $\exp(\alpha B_{ab})$ is a boost by $\alpha$ ‘towards’ $e_{a}+e_{b}$.
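The closed form (A.14) and the case distinction above can be compared with a truncated exponential series. The following is a minimal sketch, assuming an orthonormal basis with $g=\mathrm{diag}(\texttt{signs})$, $a\neq b$, and $B_{ab}$ as in (A.2); the function names and the truncation order are illustrative choices.

```python
import math

# Compare the exponential series of alpha * B_ab with the closed form (A.14).
# Assumptions: orthonormal basis with g = diag(signs), a != b, and
# B_ab = e_a (x) e_b^flat - e_b (x) e_a^flat as in (A.2).

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[float(i == j) for j in range(n)] for i in range(n)]

def B(a, b, signs):
    """Matrix of B_ab: (B_ab)^i_j = delta^i_a g_bj - delta^i_b g_aj."""
    n = len(signs)
    return [[float((i == a) * (j == b) * signs[b] - (i == b) * (j == a) * signs[a])
             for j in range(n)] for i in range(n)]

def exp_series(X, n_terms=40):
    """Truncated power series sum_k X^k / k!."""
    n = len(X)
    result, term = identity(n), identity(n)
    for k in range(1, n_terms):
        term = [[entry / k for entry in row] for row in matmul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def closed_form(alpha, a, b, signs):
    """Right-hand side of (A.14); Pr_ab is diagonal with ones at a and b."""
    eps = signs[a] * signs[b]
    c, s = (math.cos, math.sin) if eps == 1 else (math.cosh, math.sinh)
    n = len(signs)
    M = identity(n)
    M[a][a] = M[b][b] = c(alpha)
    Bab = B(a, b, signs)
    return [[M[i][j] + s(alpha) * Bab[i][j] for j in range(n)] for i in range(n)]

def max_deviation(alpha, a, b, signs):
    """Largest entrywise difference between the series and the closed form."""
    n = len(signs)
    series = exp_series([[alpha * x for x in row] for row in B(a, b, signs)])
    closed = closed_form(alpha, a, b, signs)
    return max(abs(series[i][j] - closed[i][j]) for i in range(n) for j in range(n))

signs = (1, -1, -1, -1)                      # 'mostly minus' Minkowski metric
print(max_deviation(0.7, 1, 2, signs))       # rotation in the e_1-e_2 plane
print(max_deviation(0.7, 1, 0, signs))       # boost in the e_1-e_0 plane
```

Both printed deviations should be at the level of floating-point rounding. The sign of the off-diagonal entries of `closed_form` reproduces the directions listed in the four cases.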

Now we will apply the preceding considerations to the case of (the ‘difference’ vector space of) Minkowski spacetime, where for now we leave open the signature convention for the metric (either $(+{--}-)$ or $(-{++}+)$). We work with respect to a positively oriented orthonormal basis $\{e_{\mu}\}_{\mu=0,\dots,3}$ where $e_{0}$ is timelike. Latin indices will denote spacelike directions.

In the case of ‘mostly minus’ signature $(+{--}-)$, $B_{ab}$ generates rotations from $e_{a}$ towards $e_{b}$ and $B_{a0}$ generates boosts (with respect to $e_{0}$) in the direction of $e_{a}$. In the case of ‘mostly plus’ signature $(-{++}+)$, $B_{ba}=-B_{ab}$ generates rotations from $e_{a}$ towards $e_{b}$ and $B_{0a}=-B_{a0}$ generates boosts (with respect to $e_{0}$) in the direction of $e_{a}$.

Thus, since we want to use the notation $J_{ab}$ for the spacelike rotational generator generating rotations from $e_{a}$ towards $e_{b}$, we have to set

J_{\mu\nu}=\begin{cases}B_{\mu\nu}&\text{for $(+{--}-)$ signature},\\ -B_{\mu\nu}&\text{for $(-{++}+)$ signature}\end{cases}

(A.15)

for the Lorentz generators. Adopting this convention, boosts in the direction of $e_{a}$ are then generated by $J_{a0}$. The commutation relations for the $J_{\mu\nu}$ are

[J_{\mu\nu},J_{\rho\sigma}]=\begin{cases}\eta_{\mu\sigma}J_{\nu\rho}+\text{(antisymm.)}&\text{for $(+{--}-)$ signature},\\ \eta_{\mu\rho}J_{\nu\sigma}+\text{(antisymm.)}&\text{for $(-{++}+)$ signature},\end{cases}

(A.16)

and general Lorentz algebra elements $\omega\in\mathrm{Lie}(\mathcal{L})$ can be written as

\omega=\pm\frac{1}{2}\omega^{\mu\nu}J_{\mu\nu}\;\text{with}\;\omega^{\mu\nu}=\omega^{\mu}_{\hphantom{\mu}\rho}\;\eta^{\rho\nu}

(A.17)

in terms of their components $\omega^{\mu}_{\hphantom{\mu}\rho}$ as endomorphisms, where the upper/lower sign holds for $(+{--}-)$ / $(-{++}+)$ signature.
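The sign conventions in (A.15) to (A.17) are easy to get wrong, so a mechanical check may be useful. The following is a minimal sketch, assuming $B_{\mu\nu}$ as in (A.2) and reading ‘+ (antisymm.)’ as antisymmetrisation in the index pairs $(\mu,\nu)$ and $(\rho,\sigma)$, which is how the two lines of (A.9) are related. The names `ETA`, `SIGN`, `leading_term` and the test components `OMEGA` are hypothetical.

```python
# Exact integer check of (A.16) and (A.17) for both signature conventions.
# Assumptions of this sketch: B_{mu nu} as in (A.2), J_{mu nu} as in (A.15), and
# '+ (antisymm.)' read as antisymmetrisation in the index pairs (mu, nu) and
# (rho, sigma), which is how the two lines of (A.9) are related.

N = 4
ETA = {
    "+---": [[(i == j) * (1 if i == 0 else -1) for j in range(N)] for i in range(N)],
    "-+++": [[(i == j) * (-1 if i == 0 else 1) for j in range(N)] for i in range(N)],
}
SIGN = {"+---": 1, "-+++": -1}  # J = SIGN * B, cf. (A.15)

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def combine(terms):
    """Linear combination sum_k coeff_k * M_k of N x N matrices."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(N)] for i in range(N)]

def J(mu, nu, sig):
    g, s = ETA[sig], SIGN[sig]
    return [[s * ((i == mu) * g[nu][j] - (i == nu) * g[mu][j]) for j in range(N)]
            for i in range(N)]

def leading_term(mu, nu, rho, sigma, sig):
    """The term written out explicitly in (A.16)."""
    g = ETA[sig]
    if sig == "+---":
        return combine([(g[mu][sigma], J(nu, rho, sig))])
    return combine([(g[mu][rho], J(nu, sigma, sig))])

def rhs_A16(mu, nu, rho, sigma, sig):
    return combine([(+1, leading_term(mu, nu, rho, sigma, sig)),
                    (-1, leading_term(nu, mu, rho, sigma, sig)),
                    (-1, leading_term(mu, nu, sigma, rho, sig)),
                    (+1, leading_term(nu, mu, sigma, rho, sig))])

def check_A16(sig):
    idx = range(N)
    return all(
        combine([(1, matmul(J(m, n, sig), J(r, s, sig))),
                 (-1, matmul(J(r, s, sig), J(m, n, sig)))]) == rhs_A16(m, n, r, s, sig)
        for m in idx for n in idx for r in idx for s in idx)

def check_A17(omega_upper, sig):
    """Compare 2 * (+-1/2) omega^{mu nu} J_{mu nu} with 2 * omega^mu_rho."""
    g, s = ETA[sig], SIGN[sig]
    twice_omega = combine([(s * omega_upper[m][n], J(m, n, sig))
                           for m in range(N) for n in range(N)])
    endo = [[sum(omega_upper[i][n] * g[n][j] for n in range(N)) for j in range(N)]
            for i in range(N)]
    return twice_omega == [[2 * x for x in row] for row in endo]

# Arbitrary antisymmetric test components omega^{mu nu} (hypothetical values).
OMEGA = [[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]]
print(all(check_A16(sig) and check_A17(OMEGA, sig) for sig in ETA))
```

In both conventions the entry `J(1, 2, sig)[2][1]` equals $+1$, i.e. $J_{12}$ maps $e_{1}$ to $+e_{2}$, which is the infinitesimal form of a rotation from $e_{1}$ towards $e_{2}$.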

Appendix B Notes on the adjoint representation

Here we make a few remarks and collect some formulae concerning the adjoint and co-adjoint representations, which will be used in the main text.

In the defining representation on $V$, an element $\Lambda\in\mathsf{GL}(V)$ is given in terms of the basis $\{e_{a}\}_{a}$ by the coefficients $\Lambda^{a}_{\phantom{a}b}$, where

\Lambda e_{a}=\Lambda^{b}_{\phantom{b}a}\,e_{b}\;.

(B.1)

This defines a left action of $\mathsf{GL}(V)$ on $V$. The corresponding left action of $\mathsf{GL}(V)$ on the dual space $V^{*}$ is given by the inverse transpose, i.e. $\mathsf{GL}(V)\times V^{*}\to V^{*}$, $(\Lambda,\alpha)\mapsto(\Lambda^{-1})^{\top}\alpha:=\alpha\circ\Lambda^{-1}$. For the basis $\{\theta^{a}\}_{a}$ of $V^{*}$ dual to the basis $\{e_{a}\}_{a}$ this means

\theta^{a}\circ\Lambda^{-1}=(\Lambda^{-1})^{a}_{\phantom{a}b}\,\theta^{b}\;.

(B.2)

In contrast, for the basis $\{e_{a}^{\flat}\}_{a}$ of $V^{*}$ , this reads in general

e_{b}^{\flat}\circ\Lambda^{-1}=g^{ac}g_{bd}(\Lambda^{-1})^{d}_{\phantom{d}c}\,e_{a}^{\flat}\;,

(B.3)

which for isometries $\Lambda\in\mathsf{O}(V,g)$ simply becomes

e^{\flat}_{b}\circ\Lambda^{-1}=\Lambda^{a}_{\phantom{a}b}\,e^{\flat}_{a}\;.

(B.4)

The adjoint representation of $\mathsf{GL}(V)$ on $\mathrm{End}(V)\cong V\otimes V^{*}$, or on any Lie subalgebra of $\mathrm{End}(V)$, acts by conjugation, which for our basis (A.2) implies, using (B.1) and (B.4),

\mathrm{Ad}_{\Lambda}B_{ab}=\Lambda\circ B_{ab}\circ\Lambda^{-1}=\Lambda^{c}_{\phantom{c}a}\Lambda^{d}_{\phantom{d}b}\,B_{cd}\quad\text{for}\;\Lambda\in\mathsf{O}(V,g).

(B.5)
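As an illustration, (B.5) can be verified exactly for a single Lorentz transformation with rational entries. The following is a minimal sketch, assuming the ‘mostly minus’ metric and $B_{ab}$ as in (A.2); the example `LAMBDA` (a boost with $\cosh=5/4$, $\sinh=3/4$ followed by a rotation with $\cos=3/5$, $\sin=4/5$) is a hypothetical choice made only to keep the arithmetic exact.

```python
from fractions import Fraction as F

# Exact check of (B.5) for one example Lorentz transformation.
# Assumptions of this sketch: 'mostly minus' metric, B_ab as in (A.2), and an
# example LAMBDA (boost along e_1 followed by a rotation in the e_1-e_2 plane)
# chosen with rational entries; it is not taken from the text.

N = 4
G = [[(i == j) * (1 if i == 0 else -1) for j in range(N)] for i in range(N)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def transpose(X):
    return [[X[j][i] for j in range(N)] for i in range(N)]

BOOST = [[F(5, 4), F(3, 4), 0, 0], [F(3, 4), F(5, 4), 0, 0],
         [0, 0, 1, 0], [0, 0, 0, 1]]                 # cosh = 5/4, sinh = 3/4
ROTATION = [[1, 0, 0, 0], [0, F(3, 5), F(-4, 5), 0],
            [0, F(4, 5), F(3, 5), 0], [0, 0, 0, 1]]  # cos = 3/5, sin = 4/5
LAMBDA = matmul(ROTATION, BOOST)
# For an isometry, Lambda^{-1} = g^{-1} Lambda^T g, and g^{-1} = g here.
LAMBDA_INV = matmul(matmul(G, transpose(LAMBDA)), G)

def B(a, b):
    """Matrix of B_ab: (B_ab)^i_j = delta^i_a g_bj - delta^i_b g_aj."""
    return [[(i == a) * G[b][j] - (i == b) * G[a][j] for j in range(N)]
            for i in range(N)]

def lhs_B5(a, b):
    """Ad_Lambda B_ab = Lambda B_ab Lambda^{-1}."""
    return matmul(matmul(LAMBDA, B(a, b)), LAMBDA_INV)

def rhs_B5(a, b):
    """Lambda^c_a Lambda^d_b B_cd, where LAMBDA[c][a] holds Lambda^c_a."""
    return [[sum(LAMBDA[c][a] * LAMBDA[d][b] * B(c, d)[i][j]
                 for c in range(N) for d in range(N)) for j in range(N)]
            for i in range(N)]

identity = [[int(i == j) for j in range(N)] for i in range(N)]
print(matmul(LAMBDA, LAMBDA_INV) == identity)
print(all(lhs_B5(a, b) == rhs_B5(a, b) for a in range(N) for b in range(N)))
```

Using `Fraction` instead of floating-point numbers avoids any tolerance in the comparison; the first printed line confirms that the example is indeed an isometry.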

The adjoint representation of the inhomogeneous group $\mathsf{GL}(V)\ltimes V$ on its Lie algebra $\mathrm{End}(V)\oplus V$ is given by, for any $X\in\mathrm{End}(V)$ and $y\in V$ ,

\mathrm{Ad}_{(\Lambda,a)}(X,y)=\left(\Lambda\circ X\circ\Lambda^{-1},\Lambda y-(\Lambda\circ X\circ\Lambda^{-1})a\right).

(B.6)

In the main text we will use this formula with $(\Lambda,a)$ replaced by its inverse $(\Lambda,a)^{-1}=(\Lambda^{-1},-\Lambda^{-1}a)$:

\mathrm{Ad}_{(\Lambda,a)^{-1}}(X,y)=\left(\Lambda^{-1}\circ X\circ\Lambda,\Lambda^{-1}y+(\Lambda^{-1}\circ X)a\right)

(B.7)
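Formulae (B.6) and (B.7) can be cross-checked against matrix conjugation. The following is a minimal sketch, assuming the usual block-matrix realisation of the affine group, $(\Lambda,a)\mapsto\left(\begin{smallmatrix}\Lambda&a\\0&1\end{smallmatrix}\right)$ and $(X,y)\mapsto\left(\begin{smallmatrix}X&y\\0&0\end{smallmatrix}\right)$, which is not spelled out in the text; all function names and test data are hypothetical.

```python
# Exact integer check of (B.6) and (B.7).
# Assumption of this sketch (not spelled out in the text): the affine group is
# realised by block matrices, (Lambda, a) -> [[Lambda, a], [0, 1]] and
# (X, y) -> [[X, y], [0, 0]], so that Ad is matrix conjugation.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def embed_group(L, a):
    n = len(L)
    return [list(L[i]) + [a[i]] for i in range(n)] + [[0] * n + [1]]

def embed_algebra(X, y):
    n = len(X)
    return [list(X[i]) + [y[i]] for i in range(n)] + [[0] * (n + 1)]

def ad_by_conjugation(L, L_inv, a, X, y):
    """Ad_{(L,a)}(X, y) computed as g M g^{-1} with g^{-1} = (L^{-1}, -L^{-1} a)."""
    g = embed_group(L, a)
    g_inv = embed_group(L_inv, [-c for c in matvec(L_inv, a)])
    return matmul(matmul(g, embed_algebra(X, y)), g_inv)

def ad_B6(L, L_inv, a, X, y):
    """Right-hand side of (B.6)."""
    conj = matmul(matmul(L, X), L_inv)
    Ly, shift = matvec(L, y), matvec(conj, a)
    return conj, [p - q for p, q in zip(Ly, shift)]

def ad_B7(L, L_inv, a, X, y):
    """Right-hand side of (B.7)."""
    L_inv_X = matmul(L_inv, X)
    v, w = matvec(L_inv, y), matvec(L_inv_X, a)
    return matmul(L_inv_X, L), [p + q for p, q in zip(v, w)]

# Hypothetical test data on V = R^2; L is a generic element of GL(V).
L, L_INV = [[2, 1], [1, 1]], [[1, -1], [-1, 2]]
A, X, Y = [3, -2], [[1, 2], [3, 4]], [5, 7]

ok_B6 = embed_algebra(*ad_B6(L, L_INV, A, X, Y)) == ad_by_conjugation(L, L_INV, A, X, Y)
# (B.7): conjugate with the inverse element (L^{-1}, -L^{-1} a) instead.
A_INV = [-c for c in matvec(L_INV, A)]
ok_B7 = embed_algebra(*ad_B7(L, L_INV, A, X, Y)) == ad_by_conjugation(L_INV, L, A_INV, X, Y)
print(ok_B6, ok_B7)
```

Since (B.6) and (B.7) hold for all of $\mathsf{GL}(V)\ltimes V$, the test element `L` is deliberately not an isometry.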

Applied to the basis vectors separately, i.e. to $(X,y)=(0,e_{b})$ and $(X,y)=(B_{bc},0)$, for $\Lambda\in\mathsf{O}(V,g)$ we get


\mathrm{Ad}_{(\Lambda,a)^{-1}}(0,e_{b})=\left(0,(\Lambda^{-1})^{c}_{\phantom{c}b}\,e_{c}\right)

(B.8a)

\mathrm{Ad}_{(\Lambda,a)^{-1}}(B_{bc},0)=\left((\Lambda^{-1})^{d}_{\phantom{d}b}(\Lambda^{-1})^{e}_{\phantom{e}c}\,B_{de},\;-a_{b}(\Lambda^{-1})^{d}_{\phantom{d}c}\,e_{d}+a_{c}(\Lambda^{-1})^{d}_{\phantom{d}b}\,e_{d}\right)

(B.8b)

where $a_{b}:=e_{b}^{\flat}(a)=g_{bc}a^{c}$ in the second equation. From these equations we immediately deduce (2.21) in the case of four spacetime dimensions (Greek indices) and ‘mostly plus’ signature, in which case $J_{\mu\nu}=-B_{\mu\nu}$ according to (A.15).