In physics, relativistic quantum mechanics (RQM) is any Poincaré-covariant formulation of quantum mechanics (QM). The theory applies to massive particles propagating at all velocities up to those comparable to the speed of light c, and can accommodate massless particles. It has applications in high-energy physics,^{[1]} particle physics and accelerator physics,^{[2]} as well as atomic physics, chemistry^{[3]} and condensed matter physics.^{[4]}^{[5]} Nonrelativistic quantum mechanics refers to the mathematical formulation of quantum mechanics applied in the context of Galilean relativity, more specifically quantizing the equations of classical mechanics by replacing dynamical variables with operators. Relativistic quantum mechanics is quantum mechanics applied together with special relativity. Although the earlier formulations, like the Schrödinger picture and Heisenberg picture, were originally formulated in a nonrelativistic background, a few of them (e.g. the Dirac or path-integral formalism) also work with special relativity.
Key features common to all RQMs include: the prediction of antimatter, spin magnetic moments of elementary spin 1⁄2 fermions, fine structure, and quantum dynamics of charged particles in electromagnetic fields.^{[6]} The key result is the Dirac equation, from which these predictions emerge automatically. By contrast, in nonrelativistic quantum mechanics, terms have to be introduced artificially into the Hamiltonian operator to achieve agreement with experimental observations.
The most successful (and most widely used) RQM is relativistic quantum field theory (QFT), in which elementary particles are interpreted as field quanta. A unique consequence of QFT that has been tested against other RQMs is the failure of conservation of particle number, for example in matter creation and annihilation.^{[7]}
In this article, the equations are written in familiar 3D vector calculus notation and use hats for operators (not necessarily used in the literature). Where space and time components can be collected, tensor index notation is also shown (frequently used in the literature), together with the Einstein summation convention. SI units are used here; Gaussian units and natural units are common alternatives. All equations are in the position representation; for the momentum representation the equations have to be Fourier transformed (see position and momentum space).
One approach is to modify the Schrödinger picture to be consistent with special relativity.^{[2]}
A postulate of quantum mechanics is that the time evolution of any quantum system is given by the Schrödinger equation:
iħ ∂ψ/∂t = Ĥψ
using a suitable Hamiltonian operator Ĥ corresponding to the system. The solution is a complex-valued wavefunction ψ(r, t), a function of the 3D position vector r of the particle at time t, describing the behavior of the system.
Every particle has a nonnegative spin quantum number s. The number 2s is an integer, odd for fermions and even for bosons. Each s has 2s + 1 z-projection quantum numbers: σ = s, s − 1, ... , −s + 1, −s.^{[a]} This is an additional discrete variable the wavefunction requires: ψ(r, t, σ).
Historically, in the early 1920s Pauli, Kronig, Uhlenbeck and Goudsmit were the first to propose the concept of spin. The inclusion of spin in the wavefunction incorporates the Pauli exclusion principle (1925) and the more general spin–statistics theorem (1939) due to Fierz, rederived by Pauli a year later. This is the explanation for a diverse range of subatomic particle behavior and phenomena: from the electronic configurations of atoms, nuclei (and therefore all elements on the periodic table and their chemistry), to the quark configurations and colour charge (hence the properties of baryons and mesons).
A fundamental prediction of special relativity is the relativistic energy–momentum relation; for a particle of rest mass m, and in a particular frame of reference with energy E and 3-momentum p with magnitude p = √(p · p) in terms of the dot product, it is:^{[8]}
E^{2} = c^{2} p · p + (mc^{2})^{2}
These equations are used together with the energy and momentum operators, which are respectively:
Ê = iħ ∂/∂t,  p̂ = −iħ∇
to construct a relativistic wave equation (RWE): a partial differential equation that is consistent with the energy–momentum relation and is solved for ψ to predict the quantum dynamics of the particle. For space and time to be placed on equal footing, as in relativity, the orders of the space and time partial derivatives should be equal, and ideally as low as possible, so that no initial values of the derivatives need to be specified. This is important for probability interpretations, exemplified below. The lowest possible order of any differential equation is the first (zeroth-order derivatives would not form a differential equation).
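As a quick consistency check (a sketch, not part of the original derivation), the energy and momentum operators can be applied to a free plane wave. The constant amplitude A and the plane-wave form of ψ are assumptions made for illustration; the operators are taken in their standard position-representation form Ê = iħ ∂/∂t and p̂ = −iħ∇.

```latex
\begin{align*}
\psi(\mathbf{r},t) &= A\,e^{i(\mathbf{p}\cdot\mathbf{r} - Et)/\hbar}, \\
\hat{E}\,\psi &= i\hbar\,\frac{\partial\psi}{\partial t} = E\,\psi, \\
\hat{\mathbf{p}}\,\psi &= -i\hbar\,\nabla\psi = \mathbf{p}\,\psi .
\end{align*}
```

The plane wave is a simultaneous eigenfunction of both operators, so any wave equation built from Ê and p̂ reproduces the corresponding algebraic relation between E and p for such a state.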
The Heisenberg picture is another formulation of QM, in which the wavefunction ψ is time-independent and the operators A(t) contain the time dependence, governed by the equation of motion:
dA/dt = (1/iħ)[A, Ĥ] + ∂A/∂t
This equation is also true in RQM, provided the Heisenberg operators are modified to be consistent with special relativity (SR).^{[9]}^{[10]}
Historically, around 1926, Schrödinger and Heisenberg showed that wave mechanics and matrix mechanics are equivalent, a result later furthered by Dirac using transformation theory.
A more modern approach to RWEs, first introduced during the time RWEs were developing for particles of any spin, is to apply representations of the Lorentz group.
In classical mechanics and nonrelativistic QM, time is an absolute quantity all observers and particles can always agree on, "ticking away" in the background independent of space. Thus in nonrelativistic QM one has for a many-particle system ψ(r_{1}, r_{2}, r_{3}, ..., t, σ_{1}, σ_{2}, σ_{3}...).
In relativistic mechanics, the spatial coordinates and coordinate time are not absolute; any two observers moving relative to each other can measure different locations and times of events. The position and time coordinates combine naturally into a four-dimensional spacetime position X = (ct, r) corresponding to events, and the energy and 3-momentum combine naturally into the four-momentum P = (E/c, p) of a dynamic particle. Both are measured in some reference frame and change according to a Lorentz transformation when measured in a different frame boosted and/or rotated relative to the original frame. The derivative operators, and hence the energy and 3-momentum operators, are also non-invariant and change under Lorentz transformations.
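To illustrate how the components change while the invariant mass does not, here is a short worked check for a boost along the x axis. The boost speed v, the Lorentz factor γ and the metric signature (+, −, −, −) are assumptions chosen for this sketch.

```latex
\begin{align*}
E' &= \gamma\,(E - v\,p_x), \qquad
p_x' = \gamma\left(p_x - \frac{v E}{c^2}\right), \qquad
p_y' = p_y, \quad p_z' = p_z, \qquad
\gamma = \frac{1}{\sqrt{1 - v^2/c^2}}, \\
\left(\frac{E'}{c}\right)^2 - p_x'^2
 &= \gamma^2\left[\frac{(E - v p_x)^2}{c^2} - \left(p_x - \frac{vE}{c^2}\right)^2\right]
 = \gamma^2\left(1 - \frac{v^2}{c^2}\right)\left[\frac{E^2}{c^2} - p_x^2\right]
 = \left(\frac{E}{c}\right)^2 - p_x^2 .
\end{align*}
```

Hence (E/c)^{2} − p · p = (mc)^{2} takes the same value in both frames, although E and p individually do not.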
Under a proper orthochronous Lorentz transformation (r, t) → Λ(r, t) in Minkowski space, all one-particle quantum states ψ_{σ} locally transform under some representation D of the Lorentz group:^{[11]}^{[12]}
ψ(r, t) → D(Λ) ψ(Λ^{−1}(r, t))
where D(Λ) is a finite-dimensional representation, in other words a (2s + 1)×(2s + 1) square matrix. Again, ψ is thought of as a column vector containing components with the (2s + 1) allowed values of σ. The quantum numbers s and σ as well as other labels, continuous or discrete, representing other quantum numbers are suppressed. One value of σ may occur more than once depending on the representation.
The classical Hamiltonian for a particle in a potential is the kinetic energy p·p/2m plus the potential energy V(r, t), with the corresponding quantum operator in the Schrödinger picture:
Ĥ = p̂ · p̂/2m + V(r, t)
and substituting this into the above Schrödinger equation gives a nonrelativistic QM equation for the wavefunction: the procedure is a straightforward substitution of a simple expression. By contrast, this is not as easy in RQM; the energy–momentum relation is quadratic in energy and momentum, leading to difficulties. Naively setting:
Ĥ = Ê = √(c^{2} p̂ · p̂ + (mc^{2})^{2})  ⇒  iħ ∂ψ/∂t = √(c^{2} p̂ · p̂ + (mc^{2})^{2}) ψ
is not helpful for several reasons. The square root of the operators cannot be used as it stands; it would have to be expanded in a power series before the momentum operator, raised to a power in each term, could act on ψ. As a result of the power series, the space and time derivatives are completely asymmetric: infinite-order in space derivatives but only first order in the time derivative, which is inelegant and unwieldy. Again, there is the problem of the non-invariance of the energy operator, equated to the square root which is also not invariant. Another problem, less obvious and more severe, is that it can be shown to be nonlocal and can even violate causality: if the particle is initially localized at a point r_{0} so that ψ(r_{0}, t = 0) is finite and zero elsewhere, then at any later time the equation predicts delocalization ψ(r, t) ≠ 0 everywhere, even for r > ct, which means the particle could arrive at a point before a pulse of light could. This would have to be remedied by the additional constraint ψ(r > ct, t) = 0.^{[13]}
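The asymmetry between space and time derivatives can be made explicit by expanding the square-root operator for momenta small compared with mc. The expansion below is a standard binomial series, shown here as an illustrative sketch; it assumes the series is applied term by term to a sufficiently smooth ψ.

```latex
\begin{align*}
\sqrt{c^2\,\hat{\mathbf{p}}\cdot\hat{\mathbf{p}} + (mc^2)^2}
 &= mc^2\sqrt{1 + \frac{\hat{\mathbf{p}}\cdot\hat{\mathbf{p}}}{m^2c^2}}
 = mc^2 + \frac{\hat{p}^2}{2m} - \frac{\hat{p}^4}{8m^3c^2} + \frac{\hat{p}^6}{16 m^5 c^4} - \cdots, \\
\hat{p}^{2n} &= \left(-\hbar^2\nabla^2\right)^n .
\end{align*}
```

Every power of ∇^{2} appears on the right, while the time derivative enters the wave equation only to first order.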
There is also the problem of incorporating spin in the Hamiltonian, which is not a prediction of the nonrelativistic Schrödinger theory. Particles with spin have a corresponding spin magnetic moment quantized in units of μ_{B}, the Bohr magneton:^{[14]}^{[15]}
μ̂_{S} = −(gμ_{B}/ħ) Ŝ
where g is the (spin) g-factor for the particle, and S the spin operator, so they interact with electromagnetic fields. For a particle in an externally applied magnetic field B, the interaction term^{[16]}
Ĥ_{B} = −B · μ̂_{S}
has to be added to the above nonrelativistic Hamiltonian. By contrast, a relativistic Hamiltonian introduces spin automatically as a requirement of enforcing the relativistic energy–momentum relation.^{[17]}
Relativistic Hamiltonians are analogous to those of nonrelativistic QM in the following respect: there are terms including rest mass and interaction terms with externally applied fields, similar to the classical potential energy term, as well as momentum terms like the classical kinetic energy term. A key difference is that relativistic Hamiltonians contain spin operators in the form of matrices, in which the matrix multiplication runs over the spin index σ, so in general a relativistic Hamiltonian:
Ĥ = Ĥ(r, t, p̂, Ŝ)
is a function of space, time, and the momentum and spin operators.
Substituting the energy and momentum operators directly into the energy–momentum relation may at first sight seem appealing, to obtain the Klein–Gordon equation:^{[18]}
Ê^{2}ψ = c^{2} p̂ · p̂ ψ + (mc^{2})^{2}ψ
This equation was discovered by many people because of the straightforward way of obtaining it, notably by Schrödinger in 1925 before he found the nonrelativistic equation named after him, and by Klein and Gordon in 1927, who included electromagnetic interactions in the equation. It is relativistically invariant, yet it alone is not a sufficient foundation for RQM for at least two reasons: one is that negative-energy states are solutions,^{[2]}^{[19]} another is the density (given below). In addition, this equation as it stands is only applicable to spinless particles. This equation can be factored into the form:^{[20]}^{[21]}
(Ê − cα · p̂ − βmc^{2})(Ê + cα · p̂ + βmc^{2})ψ = 0
where α = (α_{1}, α_{2}, α_{3}) and β are not simply numbers or vectors, but 4 × 4 Hermitian matrices that are required to anticommute for i ≠ j:
α_{i}β = −βα_{i},  α_{i}α_{j} = −α_{j}α_{i}
and square to the identity matrix:
α_{i}^{2} = β^{2} = I
so that terms with mixed second-order derivatives cancel while the second-order derivatives purely in space and time remain. The first factor:
(Ê − cα · p̂ − βmc^{2})ψ = 0  ⇔  Ĥ = cα · p̂ + βmc^{2}
is the Dirac equation. The other factor is also the Dirac equation, but for a particle of negative mass.^{[20]} Each factor is relativistically invariant. The reasoning can be done the other way round: propose the Hamiltonian in the above form, as Dirac did in 1928, then premultiply the equation by the other factor of operators E + cα · p + βmc^{2}, and comparison with the KG equation determines the constraints on α and β. The positive mass equation can continue to be used without loss of continuity. The matrices multiplying ψ suggest it is not a scalar wavefunction as permitted in the KG equation, but must instead be a four-component entity. The Dirac equation still predicts negative-energy solutions,^{[6]}^{[22]} so Dirac postulated that negative-energy states are always occupied, because according to the Pauli principle, electronic transitions from positive to negative energy levels in atoms would be forbidden. See Dirac sea for details.
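The constraints on α and β can be read off by multiplying out the two factors. The expansion below is a sketch for a free particle, assuming Ê commutes with p̂ and with the constant matrices; curly brackets denote the anticommutator.

```latex
\begin{align*}
(\hat{E} - c\,\boldsymbol{\alpha}\cdot\hat{\mathbf{p}} - \beta mc^2)
(\hat{E} + c\,\boldsymbol{\alpha}\cdot\hat{\mathbf{p}} + \beta mc^2)
 &= \hat{E}^2 - \left(c\,\boldsymbol{\alpha}\cdot\hat{\mathbf{p}} + \beta mc^2\right)^2 \\
 &= \hat{E}^2
   - \frac{c^2}{2}\sum_{i,j}\{\alpha_i,\alpha_j\}\,\hat{p}_i\hat{p}_j
   - mc^3\sum_i \{\alpha_i,\beta\}\,\hat{p}_i
   - \beta^2 (mc^2)^2 .
\end{align*}
```

This equals Ê^{2} − c^{2} p̂ · p̂ − (mc^{2})^{2}, the Klein–Gordon operator, precisely when {α_{i}, α_{j}} = 2δ_{ij} I, {α_{i}, β} = 0 and β^{2} = I.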
In nonrelativistic quantum mechanics, the square modulus of the wavefunction ψ gives the probability density function ρ = |ψ|^{2}. This is the Copenhagen interpretation, circa 1927. In RQM, while ψ(r, t) is a wavefunction, the probability interpretation is not the same as in nonrelativistic QM. Some RWEs do not predict a probability density ρ or probability current j (really meaning probability current density) because the resulting expressions are not positive-definite functions of space and time. The Dirac equation does:^{[23]}
ρ = ψ^{†}ψ,  j = cψ^{†}γ^{0}γψ  ⇌  J^{μ} = cψ^{†}γ^{0}γ^{μ}ψ
where the dagger denotes the Hermitian adjoint (authors usually write ψ̄ = ψ^{†}γ^{0} for the Dirac adjoint) and J^{μ} is the probability four-current, while the Klein–Gordon equation does not:^{[24]}
ρ = (iħ/2mc^{2})(ψ^{*} ∂ψ/∂t − ψ ∂ψ^{*}/∂t),  j = −(iħ/2m)(ψ^{*}∇ψ − ψ∇ψ^{*})  ⇌  J^{μ} = (iħ/2m)(ψ^{*}∂^{μ}ψ − ψ∂^{μ}ψ^{*})
where ∂^{μ} is the four-gradient. Since the initial values of both ψ and ∂ψ/∂t may be freely chosen, the density can be negative.
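A plane wave makes the sign problem concrete. The calculation below is an illustrative sketch that assumes the standard Klein–Gordon density ρ = (iħ/2mc^{2})(ψ^{*} ∂ψ/∂t − ψ ∂ψ^{*}/∂t) and a plane-wave solution with constant amplitude A; both signs of the energy are allowed by the energy–momentum relation.

```latex
\begin{align*}
\psi &= A\,e^{i(\mathbf{p}\cdot\mathbf{r} - Et)/\hbar},
\qquad E = \pm\sqrt{c^2\,\mathbf{p}\cdot\mathbf{p} + (mc^2)^2}, \\
\frac{\partial\psi}{\partial t} &= -\frac{iE}{\hbar}\,\psi, \qquad
\frac{\partial\psi^{*}}{\partial t} = +\frac{iE}{\hbar}\,\psi^{*}, \\
\rho &= \frac{i\hbar}{2mc^2}\left(\psi^{*}\frac{\partial\psi}{\partial t}
      - \psi\,\frac{\partial\psi^{*}}{\partial t}\right)
    = \frac{i\hbar}{2mc^2}\left(-\frac{2iE}{\hbar}\right)|A|^2
    = \frac{E}{mc^2}\,|A|^2 .
\end{align*}
```

The density has the sign of E, so a negative-energy solution gives ρ < 0, which rules out a direct probability interpretation.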
Instead, what appears at first sight to be a "probability density" and "probability current" has to be reinterpreted as charge density and current density when multiplied by electric charge. Then ψ is not a wavefunction at all, but is reinterpreted as a field.^{[13]} The density and current of electric charge always satisfy a continuity equation:
∂ρ/∂t + ∇ · J = 0  ⇌  ∂_{μ}J^{μ} = 0
as charge is a conserved quantity. Probability density and current also satisfy a continuity equation because probability is conserved; however, this is only possible in the absence of interactions.
Including interactions in RWEs is generally difficult. Minimal coupling is a simple way to include the electromagnetic interaction. For one charged particle of electric charge q in an electromagnetic field, given by the magnetic vector potential A(r, t) defined by the magnetic field B = ∇ × A, and electric scalar potential ϕ(r, t), this is:^{[25]}
Ê → Ê − qϕ,  p̂ → p̂ − qA  ⇌  P̂_{μ} → P̂_{μ} − qA_{μ}
where P_{μ} is the four-momentum that has a corresponding 4-momentum operator, and A_{μ} the four-potential. In the following, the nonrelativistic limit refers to the limiting cases:
E − qϕ ≈ mc^{2},  p ≈ mv
that is, the total energy of the particle is approximately the rest energy for small electric potentials, and the momentum is approximately the classical momentum.
In RQM, the KG equation admits the minimal coupling prescription:
(Ê − qϕ)^{2}ψ = c^{2}(p̂ − qA)^{2}ψ + (mc^{2})^{2}ψ  ⇌  [(P̂_{μ} − qA_{μ})(P̂^{μ} − qA^{μ}) − (mc)^{2}]ψ = 0
In the case where the charge is zero, the equation reduces trivially to the free KG equation, so nonzero charge is assumed below. This is a scalar equation that is invariant under the irreducible one-dimensional scalar (0,0) representation of the Lorentz group. This means that all of its solutions will belong to a direct sum of (0,0) representations. Solutions that do not belong to the irreducible (0,0) representation will have two or more independent components. Such solutions cannot in general describe particles with nonzero spin since spin components are not independent. Other constraints will have to be imposed for that, e.g. the Dirac equation for spin 1/2, see below. Thus if a system satisfies the KG equation only, it can only be interpreted as a system with zero spin.
The electromagnetic field is treated classically according to Maxwell's equations and the particle is described by a wavefunction, the solution to the KG equation. The equation is, as it stands, not always very useful, because massive spinless particles, such as the π mesons, experience the much stronger strong interaction in addition to the electromagnetic interaction. It does, however, correctly describe charged spinless bosons in the absence of other interactions.
The KG equation is applicable to spinless charged bosons in an external electromagnetic potential.^{[2]} As such, the equation cannot be applied to the description of atoms, since the electron is a spin 1/2 particle. In the nonrelativistic limit the equation reduces to the Schrödinger equation for a spinless charged particle in an electromagnetic field:^{[16]}
(iħ ∂/∂t − qϕ)ψ = (1/2m)(p̂ − qA)^{2}ψ  ⇔  Ĥ = (1/2m)(p̂ − qA)^{2} + qϕ
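The reduction can be sketched for the free case (q = 0) by factoring out the rest-energy oscillation. The substitution ψ = φ e^{−imc²t/ħ} and the slowly varying envelope φ are assumptions of this illustrative derivation, which starts from the free Klein–Gordon equation Ê^{2}ψ = c^{2} p̂ · p̂ ψ + (mc^{2})^{2}ψ.

```latex
\begin{align*}
\psi(\mathbf{r},t) &= \varphi(\mathbf{r},t)\,e^{-imc^2 t/\hbar}, \\
\hat{E}^2\psi = -\hbar^2\frac{\partial^2\psi}{\partial t^2}
 &= \left[-\hbar^2\frac{\partial^2\varphi}{\partial t^2}
   + 2i\hbar\,mc^2\,\frac{\partial\varphi}{\partial t}
   + (mc^2)^2\varphi\right]e^{-imc^2 t/\hbar}, \\
-\hbar^2\frac{\partial^2\varphi}{\partial t^2}
   + 2i\hbar\,mc^2\,\frac{\partial\varphi}{\partial t}
 &= -c^2\hbar^2\nabla^2\varphi, \\
i\hbar\,\frac{\partial\varphi}{\partial t}
 &\approx -\frac{\hbar^2}{2m}\nabla^2\varphi
 \quad\text{when}\quad
 \left|\hbar\,\frac{\partial^2\varphi}{\partial t^2}\right| \ll
 \left|mc^2\,\frac{\partial\varphi}{\partial t}\right| .
\end{align*}
```

The last line is the free Schrödinger equation for the envelope φ; the neglected second time derivative is the leading relativistic correction.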
Nonrelativistically, spin was phenomenologically introduced in the Pauli equation by Pauli in 1927 for particles in an electromagnetic field:
(iħ ∂/∂t − qϕ)ψ = (1/2m)(σ · (p̂ − qA))^{2}ψ
by means of the 2 × 2 Pauli matrices, and ψ is not just a scalar wavefunction as in the nonrelativistic Schrödinger equation, but a two-component spinor field:
ψ = (ψ_{↑}, ψ_{↓})^{T}
where the subscripts ↑ and ↓ refer to the "spin up" (σ = +1/2) and "spin down" (σ = −1/2) states.^{[b]}
In RQM, the Dirac equation can also incorporate minimal coupling, rewritten from above:
(iħ ∂/∂t − qϕ)ψ = γ^{0}[cγ · (p̂ − qA) + mc^{2}]ψ  ⇌  [γ^{μ}(P̂_{μ} − qA_{μ}) − mc]ψ = 0
It was the first equation to accurately predict spin, a consequence of the 4 × 4 gamma matrices γ^{0} = β, γ = (γ_{1}, γ_{2}, γ_{3}) = βα = (βα_{1}, βα_{2}, βα_{3}). There is a 4 × 4 identity matrix premultiplying the energy operator (including the potential energy term), conventionally not written for simplicity and clarity (i.e. treated like the number 1). Here ψ is a four-component spinor field, which is conventionally split into two two-component spinors in the form:^{[c]}
ψ = (ψ_{+}, ψ_{−})^{T} = (ψ_{+↑}, ψ_{+↓}, ψ_{−↑}, ψ_{−↓})^{T}
The 2-spinor ψ_{+} corresponds to a particle with 4-momentum (E, p) and charge q and two spin states (σ = ±1/2, as before). The other 2-spinor ψ_{−} corresponds to a similar particle with the same mass and spin states, but negative 4-momentum −(E, p) and negative charge −q, that is, negative-energy states, time-reversed momentum, and negated charge. This was the first interpretation and prediction of a particle and corresponding antiparticle. See Dirac spinor and bispinor for further description of these spinors. In the nonrelativistic limit the Dirac equation reduces to the Pauli equation (see Dirac equation for how). When applied to a one-electron atom or ion, setting A = 0 and ϕ to the appropriate electrostatic potential, additional relativistic terms include the spin–orbit interaction, electron gyromagnetic ratio, and Darwin term. In ordinary QM these terms have to be put in by hand and treated using perturbation theory. The positive energies do account accurately for the fine structure.
Within RQM, for massless particles the Dirac equation reduces to:
the first of which is the Weyl equation, a considerable simplification applicable for massless neutrinos.^{[26]} This time there is a 2 × 2 identity matrix premultiplying the energy operator conventionally not written. In RQM it is useful to take this as the zeroth Pauli matrix σ_{0} which couples to the energy operator (time derivative), just as the other three matrices couple to the momentum operator (spatial derivatives).
The Pauli and gamma matrices were introduced here, in theoretical physics, rather than in pure mathematics itself. They have applications to quaternions and to the SU(2) and SO(3) Lie groups, because the Pauli matrices satisfy the important commutator [ , ] and anticommutator [ , ]_{+} relations respectively:
[σ_{a}, σ_{b}] = 2iε_{abc}σ_{c},  [σ_{a}, σ_{b}]_{+} = 2δ_{ab}σ_{0}
where ε_{abc} is the three-dimensional Levi-Civita symbol. The gamma matrices form bases in Clifford algebra, and have a connection to the components of the flat spacetime Minkowski metric η^{αβ} in the anticommutation relation:
[γ^{α}, γ^{β}]_{+} = γ^{α}γ^{β} + γ^{β}γ^{α} = 2η^{αβ}
(This can be extended to curved spacetime by introducing vierbeins, but that is not the subject of special relativity.)
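For concreteness, the Pauli-matrix relations can be checked in the standard representation. The explicit matrices below are the conventional choice and are given here as an illustrative example; σ_{0} denotes the 2 × 2 identity.

```latex
\begin{align*}
\sigma_1 &= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \\
\sigma_1\sigma_2 &= \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} = i\sigma_3, \qquad
\sigma_2\sigma_1 = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} = -i\sigma_3, \\
[\sigma_1,\sigma_2] &= 2i\sigma_3, \qquad
[\sigma_1,\sigma_2]_{+} = 0, \qquad
\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_0 .
\end{align*}
```

The remaining commutators follow by cyclic permutation of the indices 1, 2, 3.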
In 1929, the Breit equation was found to describe two or more electromagnetically interacting massive spin 1/2 fermions to first-order relativistic corrections; it was one of the first attempts to describe such a relativistic quantum many-particle system. This is, however, still only an approximation, and the Hamiltonian includes numerous long and complicated sums.
The helicity operator is defined by:
ĥ = Ŝ · p̂/|p| = Ŝ · cp̂/√(E^{2} − (m_{0}c^{2})^{2})
where p is the momentum operator, S the spin operator for a particle of spin s, E is the total energy of the particle, and m_{0} its rest mass. Helicity indicates the orientations of the spin and translational momentum vectors.^{[27]} Helicity is frame-dependent because of the 3-momentum in the definition, and is quantized due to spin quantization, which has discrete positive values for parallel alignment, and negative values for antiparallel alignment.
An automatic occurrence in the Dirac equation (and the Weyl equation) is the projection of the spin 1/2 operator on the 3-momentum (times c), σ · c p, which is proportional to the helicity (for the spin 1/2 case).
For massless particles the helicity simplifies to:
ĥ = Ŝ · cp̂/E
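As a small worked example, the helicity eigenvalues can be computed for spin 1/2. This is a sketch that assumes the common definition of helicity as the spin component along the momentum direction, S = ħσ/2, and a momentum chosen along the z axis.

```latex
\begin{align*}
\hat{h} &= \frac{\hat{\mathbf{S}}\cdot\mathbf{p}}{|\mathbf{p}|},
\qquad \hat{\mathbf{S}} = \frac{\hbar}{2}\,\boldsymbol{\sigma},
\qquad \mathbf{p} = (0, 0, p),\; p > 0, \\
\hat{h} &= \frac{\hbar}{2}\,\sigma_3
 = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\qquad
\hat{h}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = +\frac{\hbar}{2}\begin{pmatrix} 1 \\ 0 \end{pmatrix},
\quad
\hat{h}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = -\frac{\hbar}{2}\begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\end{align*}
```

The two eigenvalues correspond to spin parallel and antiparallel to the momentum.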
The Dirac equation can only describe particles of spin 1/2. Beyond the Dirac equation, RWEs have been applied to free particles of various spins. In 1936, Dirac extended his equation to all fermions; three years later Fierz and Pauli rederived the same equation.^{[28]} The Bargmann–Wigner equations were found in 1948 using Lorentz group theory, applicable to all free particles with any spin.^{[29]}^{[30]} Considering the factorization of the KG equation above, and more rigorously Lorentz group theory, it becomes natural to introduce spin in the form of matrices.
The wavefunctions are multicomponent spinor fields, which can be represented as column vectors of functions of space and time:
ψ(r, t) = [ψ_{σ=s}(r, t), ψ_{σ=s−1}(r, t), …, ψ_{σ=−s}(r, t)]^{T}  ⇌  ψ(r, t)^{†} = [ψ_{σ=s}(r, t)^{*}, ψ_{σ=s−1}(r, t)^{*}, …, ψ_{σ=−s}(r, t)^{*}]
where the expression on the right is the Hermitian conjugate. For a massive particle of spin s, there are 2s + 1 components for the particle, and another 2s + 1 for the corresponding antiparticle (there are 2s + 1 possible σ values in each case), altogether forming a 2(2s + 1)-component spinor field:
ψ(r, t) = [ψ_{+, σ=s}(r, t), …, ψ_{+, σ=−s}(r, t), ψ_{−, σ=s}(r, t), …, ψ_{−, σ=−s}(r, t)]^{T}
with the + subscript indicating the particle and − subscript for the antiparticle. However, for massless particles of spin s, there are only ever two-component spinor fields; one is for the particle in one helicity state corresponding to +s and the other for the antiparticle in the opposite helicity state corresponding to −s:
ψ(r, t) = (ψ_{+}(r, t), ψ_{−}(r, t))^{T}
According to the relativistic energy–momentum relation, all massless particles travel at the speed of light, so particles traveling at the speed of light are also described by two-component spinors. Historically, Élie Cartan found the most general form of spinors in 1913, prior to the spinors revealed in the RWEs following the year 1927.
For equations describing higher-spin particles, the inclusion of interactions is nowhere near as simple as minimal coupling; naive approaches lead to incorrect predictions and self-inconsistencies.^{[31]} For spin greater than ħ/2, the RWE is not fixed by the particle's mass, spin, and electric charge; the electromagnetic moments (electric dipole moments and magnetic dipole moments) allowed by the spin quantum number are arbitrary. (Theoretically, magnetic charge would contribute also.) For example, the spin 1/2 case only allows a magnetic dipole, but for spin 1 particles magnetic quadrupoles and electric dipoles are also possible.^{[26]} For more on this topic, see multipole expansion and (for example) Cédric Lorcé (2009).^{[32]}^{[33]}
The Schrödinger/Pauli velocity operator can be defined for a massive particle using the classical definition p = m v, and substituting quantum operators in the usual way:^{[34]}
v̂ = p̂/m
which has eigenvalues that take any value. In RQM, the Dirac theory, it is:
v̂ = (i/ħ)[Ĥ, r̂] = cα
whose components have eigenvalues ±c. See Foldy–Wouthuysen transformation for more theoretical background.
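The Dirac velocity operator follows from the Heisenberg equation of motion applied to the position operator. The sketch below assumes the free Dirac Hamiltonian Ĥ = cα · p̂ + βmc^{2} and the canonical commutator [r̂_{j}, p̂_{k}] = iħδ_{jk}.

```latex
\begin{align*}
[\hat{H}, \hat{r}_k]
 &= c\sum_j \alpha_j\,[\hat{p}_j, \hat{r}_k]
 = c\sum_j \alpha_j\,(-i\hbar\,\delta_{jk})
 = -i\hbar c\,\alpha_k, \\
\hat{v}_k &= \frac{i}{\hbar}\,[\hat{H}, \hat{r}_k] = c\,\alpha_k, \\
\alpha_k^2 &= I \;\Rightarrow\; \text{eigenvalues of } \alpha_k \text{ are } \pm 1
 \;\Rightarrow\; \text{eigenvalues of } \hat{v}_k \text{ are } \pm c .
\end{align*}
```

Each component of the Dirac velocity operator therefore takes only the values ±c, in contrast to the unbounded spectrum of p̂/m.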
The Hamiltonian operators in the Schrödinger picture are one approach to forming the differential equations for ψ. An equivalent alternative is to determine a Lagrangian (really meaning Lagrangian density), then generate the differential equation by the field-theoretic Euler–Lagrange equation:
∂_{μ}(∂ℒ/∂(∂_{μ}ψ)) − ∂ℒ/∂ψ = 0
For some RWEs, a Lagrangian can be found by inspection. For example, the Dirac Lagrangian is:^{[35]}
ℒ = ψ̄(γ^{μ}P̂_{μ} − mc)ψ
and the Klein–Gordon Lagrangian is:
This is not possible for all RWEs, which is one reason the Lorentz group theoretic approach is important and appealing: fundamental invariance and symmetries in space and time can be used to derive RWEs using appropriate group representations. The Lagrangian approach with field interpretation of ψ is the subject of QFT rather than RQM: Feynman's path integral formulation uses invariant Lagrangians rather than Hamiltonian operators, since the latter can become extremely complicated; see (for example) Weinberg (1995).^{[36]}
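As an illustration of the Euler–Lagrange procedure, consider a Lagrangian for the free Klein–Gordon field. The form below, including its overall normalization and the metric signature (+, −, −, −), is an assumption made for this sketch; other conventions differ by signs and constant factors. The fields ψ and ψ^{*} are varied independently.

```latex
\begin{align*}
\mathcal{L} &= \hbar^2\,\eta^{\mu\nu}\,\partial_\mu\psi^{*}\,\partial_\nu\psi - (mc)^2\,\psi^{*}\psi, \\
\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi^{*})} &= \hbar^2\,\partial^\mu\psi,
\qquad
\frac{\partial\mathcal{L}}{\partial\psi^{*}} = -(mc)^2\,\psi, \\
\partial_\mu\!\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi^{*})}\right)
 - \frac{\partial\mathcal{L}}{\partial\psi^{*}}
 &= \hbar^2\,\partial_\mu\partial^\mu\psi + (mc)^2\,\psi = 0 .
\end{align*}
```

With ∂_{μ}∂^{μ} = (1/c^{2}) ∂^{2}/∂t^{2} − ∇^{2}, Ê = iħ ∂/∂t and p̂ = −iħ∇, the last line is the free Klein–Gordon equation Ê^{2}ψ = c^{2} p̂ · p̂ ψ + (mc^{2})^{2}ψ.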
In nonrelativistic QM, the angular momentum operator is formed from the classical pseudovector definition L = r × p. In RQM, the position and momentum operators are inserted directly where they appear in the orbital relativistic angular momentum tensor defined from the four-dimensional position and momentum of the particle, equivalently a bivector in the exterior algebra formalism:^{[37]}^{[d]}
M^{αβ} = X^{α}P^{β} − X^{β}P^{α} = 2X^{[α}P^{β]}  ⇌  M = X ∧ P
which has six components altogether: three are the nonrelativistic 3-orbital angular momenta, M^{12} = L^{3}, M^{23} = L^{1}, M^{31} = L^{2}, and the other three, M^{01}, M^{02}, M^{03}, are boosts of the centre of mass of the rotating object. An additional relativistic-quantum term has to be added for particles with spin. For a particle of rest mass m, the total angular momentum tensor is:
where the star denotes the Hodge dual, and
is the Pauli–Lubanski pseudovector.^{[38]} For more on relativistic spin, see (for example) Troshin & Tyurin (1994).^{[39]}
In 1926, the Thomas precession was discovered: relativistic corrections to the spin of elementary particles with application in the spin–orbit interaction of atoms and rotation of macroscopic objects.^{[40]}^{[41]} In 1939 Wigner derived the Thomas precession.
In classical electromagnetism and special relativity, an electron moving with a velocity v through an electric field E but not a magnetic field B, will in its own frame of reference experience a Lorentz-transformed magnetic field B′:
B′ = (E × v)/(c^{2}√(1 − (v/c)^{2}))
In the nonrelativistic limit v ≪ c:
B′ = (E × v)/c^{2}
so the nonrelativistic spin interaction Hamiltonian becomes:^{[42]}
Ĥ = −B′ · μ̂_{S} = −(B + (E × v)/c^{2}) · μ̂_{S}
where the first term is already the nonrelativistic magnetic moment interaction, and the second term the relativistic correction of order (v/c)², but this disagrees with experimental atomic spectra by a factor of 1⁄2. It was pointed out by L. Thomas that there is a second relativistic effect: an electric field component perpendicular to the electron velocity causes an additional acceleration of the electron perpendicular to its instantaneous velocity, so the electron moves in a curved path. The electron moves in a rotating frame of reference, and this additional precession of the electron is called the Thomas precession. It can be shown^{[43]} that the net result of this effect is that the spin–orbit interaction is reduced by half, as if the magnetic field experienced by the electron has only one-half the value, and the relativistic correction in the Hamiltonian is:
Ĥ = −B′ · μ̂_{S} = −(B + (E × v)/(2c^{2})) · μ̂_{S}
In the case of RQM, the factor of 1⁄2 is predicted by the Dirac equation.^{[42]}
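For a central electrostatic potential the correction takes the familiar spin–orbit form. The derivation below is an illustrative sketch that assumes an electron of charge −e and mass m with g = 2 (so μ_{S} = −(e/m) S), potential energy V(r) = −eϕ(r), velocity v = p/m, and the Thomas-corrected interaction term −(E × v) · μ_{S}/(2c^{2}).

```latex
\begin{align*}
\mathbf{E} &= -\nabla\phi = -\frac{\mathbf{r}}{r}\,\frac{d\phi}{dr}
            = \frac{1}{e}\,\frac{\mathbf{r}}{r}\,\frac{dV}{dr}, \\
\mathbf{E}\times\mathbf{v} &= \frac{1}{e\,m\,r}\,\frac{dV}{dr}\;\mathbf{r}\times\mathbf{p}
            = \frac{1}{e\,m\,r}\,\frac{dV}{dr}\;\mathbf{L}, \\
\hat{H}_{\mathrm{SO}} &= -\frac{(\mathbf{E}\times\mathbf{v})\cdot\boldsymbol{\mu}_S}{2c^2}
 = -\frac{1}{2c^2}\,\frac{1}{e\,m\,r}\,\frac{dV}{dr}\;\mathbf{L}\cdot\left(-\frac{e}{m}\,\mathbf{S}\right)
 = \frac{1}{2m^2c^2}\,\frac{1}{r}\,\frac{dV}{dr}\;\mathbf{L}\cdot\mathbf{S} .
\end{align*}
```

Without the Thomas factor of 1⁄2 the coefficient of L · S would be twice as large.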
The events which led to and established RQM, and the continuation beyond into quantum electrodynamics (QED), are summarized below [see, for example, R. Resnick and R. Eisberg (1985),^{[44]} and P. W. Atkins (1974)^{[45]}]. More than half a century of experimental and theoretical research from the 1890s through to the 1950s in the then new and mysterious quantum theory revealed that a number of phenomena cannot be explained by QM alone. SR, found at the turn of the 20th century, was found to be a necessary component, leading to unification: RQM. Theoretical predictions and experiments mainly focused on the newly found atomic physics, nuclear physics, and particle physics, by considering spectroscopy, diffraction and scattering of particles, and the electrons and nuclei within atoms and molecules. Numerous results are attributed to the effects of spin.
In 1905, Albert Einstein explained the photoelectric effect with a particle description of light as photons. In 1916, Sommerfeld explained fine structure: the splitting of the spectral lines of atoms due to first-order relativistic corrections. The Compton effect of 1923 provided more evidence that special relativity does apply, in this case to a particle description of photon–electron scattering. De Broglie extended wave–particle duality to matter: the de Broglie relations, which are consistent with special relativity and quantum mechanics. By 1927, Davisson and Germer and, separately, G. Thomson had successfully diffracted electrons, providing experimental evidence of wave–particle duality.
In 1935, Einstein, Rosen and Podolsky published a paper^{[48]} concerning quantum entanglement of particles, questioning quantum nonlocality and the apparent violation of causality upheld in SR: particles can appear to interact instantaneously at arbitrary distances. This was a misconception, since information is not and cannot be transferred in the entangled states; rather, the information transmission is in the process of measurement by two observers (one observer has to send a signal to the other, which cannot exceed c). QM does not violate SR.^{[49]}^{[50]} In 1959, Bohm and Aharonov published a paper^{[51]} on the Aharonov–Bohm effect, questioning the status of electromagnetic potentials in QM. The EM field tensor and EM 4-potential formulations are both applicable in SR, but in QM the potentials enter the Hamiltonian (see above) and influence the motion of charged particles even in regions where the fields are zero. In 1964, Bell's theorem was published in a paper on the EPR paradox,^{[52]} showing that QM cannot be derived from local hidden-variable theories if locality is to be maintained.
In 1947, the Lamb shift was discovered: a small difference in the ^{2}S_{1⁄2} and ^{2}P_{1⁄2} levels of hydrogen, due to the interaction between the electron and vacuum. Lamb and Retherford experimentally measured stimulated radio-frequency transitions between the ^{2}S_{1⁄2} and ^{2}P_{1⁄2} hydrogen levels by microwave radiation.^{[53]} An explanation of the Lamb shift was presented by Bethe. Papers on the effect were published in the early 1950s.^{[54]}