Posts Tagged ‘Special Theory of Relativity’

I have taught special relativity for many years, but every time I teach it I present the result that mass changes as a function of velocity as a consequence of the modified version of Newton’s 2nd law.

As almost everyone knows, Newton’s 2nd law says that

F = ma

where F is the force applied, m is the mass, and a is the acceleration felt by the body. In Newtonian mechanics, mass is invariant, but a consequence of special relativity is that nothing can travel faster than the speed of light c. This raises a conundrum: why can’t we keep applying a force to a body of mass m, causing it to continue accelerating until its velocity exceeds the speed of light?

The answer is that Newton’s 2nd law is incomplete. Einstein showed that mass is also a function of velocity, and so we should write

m = \gamma m_{0} \text{ (1) }

where \gamma = \frac{ 1 }{ \sqrt{ (1 - V^{2}/c^{2}) } } is the so-called Lorentz factor and m_{0} is the rest mass (also known as the invariant mass), the mass an object has when it is at rest relative to the observer. Hence we can argue that, as we approach the speed of light, the applied force goes into changing the mass of the body, rather than accelerating it, leading to a modified version of Newton’s 2nd law

F = \gamma m_{0} a

where velocity and/or mass change as a force is applied. But, because \gamma \approx 1 until V \approx c/2 (see Figure 1), very little increase in mass occurs until V has reached appreciable values.

Figure 1: The variation of \gamma (the Lorentz factor) as a function of the speed V. Until V \approx c/2, \gamma is very close to unity.
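To put some numbers on this curve, here is a minimal Python sketch that evaluates \gamma at a few speeds. The function name `lorentz_factor` and the sample speeds are my own choices for illustration.

```python
import math

def lorentz_factor(beta):
    """Return gamma = 1/sqrt(1 - beta^2), where beta = V/c and |beta| < 1."""
    if abs(beta) >= 1:
        raise ValueError("beta = V/c must be strictly less than 1 in magnitude")
    return 1.0 / math.sqrt(1.0 - beta * beta)

# Print gamma for a few illustrative speeds, expressed as fractions of c
for beta in (0.1, 0.5, 0.9, 0.99):
    print(f"V = {beta:.2f}c  ->  gamma = {lorentz_factor(beta):.4f}")
```

At V = c/2 this gives \gamma \approx 1.155, so the departure from unity is about 15 per cent; by V = 0.9c it has grown to \gamma \approx 2.29.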

However, I have always found this an inadequate explanation of the relativistic mass, as it does not derive it but rather argues for its necessity. So, as I’m teaching special relativity again this year, I decided a few weeks ago to see if I could find a way of deriving it from a simple argument. After several weeks of hunting around I think I have found a derivation which is robust and easy to understand. But, in my searching I came across several “derivations” which were nothing more than circular arguments, and also some derivations which were simply incorrect.

Two balls colliding

The best explanation that I have found to derive the relativistic mass is to use the scenario of two balls colliding. Although it would be possible, in theory, to have the balls moving in any direction, we are going to make things a lot easier by having the balls moving in the y-direction, but with the two reference frames S \text{ and } S^{\prime} moving relative to each other with a velocity V in the x-direction. Also, the balls are going to have the same rest mass, m_{0}, as measured in their respective frames S and S^{\prime} (each observer can measure the rest mass of their ball while it is at rest in their own frame).

The blue ball B moves solely in the y-direction in reference frame S, and the red ball R moves solely in the y^{\prime}-direction in reference frame S^{\prime}. Ball B starts by moving in the positive y-direction in reference frame S with a velocity u_{0}, and ball R starts moving in the negative y^{\prime}-direction in reference frame S^{\prime} with a velocity -u_{0} in frame S^{\prime}.

Reference frame S^{\prime} is moving relative to frame S at a velocity V in the positive x-direction. So, as seen in S, the motion of ball R appears as shown in the left of Figure 2. That is, it appears in S to move both in the negative y-direction and the positive x-direction, and so follows the path shown by the red arrow pointing downwards and to the right.

At some moment the two balls collide. After the collision, as seen in S, ball B will move vertically downwards in the negative y-direction, with a velocity -u_{0}. Ball R moves upwards (positive y-direction) and to the right (positive x-direction), as shown by the red arrow in the diagram on the left of Figure 2.

In reference frame S^{\prime} the motions of balls B and R look like the diagram on the right of Figure 2. In S^{\prime}, it is ball R which moves vertically, and ball B which moves in both the x^{\prime} and y^{\prime} directions.


Figure 2: Two balls colliding. Ball B (in blue) moves solely in the y-direction as seen in frame S; ball R (in red) moves solely in the y^{\prime}-direction as seen in frame S^{\prime}.


The velocity of ball R in S

To calculate the velocity of ball R as seen in S, we have to use the Lorentz transformations for velocity. As we showed in this blog here, if we have an object moving with a velocity u^{\prime} in S^{\prime} which is moving relative to S with a velocity V, then the velocity u in frame S is given by

u = \frac{ u^{\prime} + V }{ \left( 1 + \frac{ u^{\prime}V }{ c^{2} } \right) } \text{ (2) }

This equation is true when the velocity is in the x^{\prime}-direction, and the frames are moving relative to each other in the x-direction. So we are going to re-write Equ. (2) as

u_{x} = \frac{ u^{\prime}_{x} + V }{ \left( 1 + \frac{ u^{\prime}_{x}V }{ c^{2} } \right) } \text{ (3) }

However, if the velocity of an object is in the y^{\prime}-direction, rather than the x^{\prime}-direction, then we need a different expression. We can derive it by going back to our equations for the Lorentz transformations


The Lorentz transformations

This time we write

dy = dy^{\prime}

and

dt = \gamma \left( dt^{\prime} + \frac{ dx^{\prime}V }{ c^{2} } \right)

so that

\frac{ dy }{ dt } = \frac{ dy^{\prime} }{ \gamma \left( dt^{\prime} + \frac{ dx^{\prime}V }{ c^{2} } \right) }

Dividing each term in the right-hand side by dt^{\prime}, we get

\frac{ dy }{ dt } = \frac{ dy^{\prime}/dt^{\prime} }{ \gamma \left( dt^{\prime}/dt^{\prime} + \frac{ dx^{\prime}V }{ dt^{\prime}c^{2} } \right) }

u_{y} = \frac{ u^{\prime}_{y} }{ \gamma \left( 1 + \frac{ u^{\prime}_{x}V }{ c^{2} } \right) } \text{ (4) }

Equations (3) and (4) allow us to work out the components of ball R’s velocities u_{x} in the x-direction and u_{y} in the y-direction in frame S.

u(R)_{x} = \frac{ 0 + V }{ \left( 1 + \frac{ 0 \cdot V }{ c^{2} } \right) } = V \text{ (5) }

u(R)_{y} = \frac{ -u_{0} }{ \gamma \left( 1 + \frac{ 0 \cdot V }{ c^{2} } \right) } = \frac{ -u_{0} }{ \gamma } \text{ (6) }

After the collision, the velocity of ball B becomes u(B) = -u_{0}. What about ball R?

We can see that u(R)_{x} will not change, and u(R)_{y} after the collision will be + \frac{ u_{0} }{ \gamma }.
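As a quick numerical illustration of Equ. (3) and (4), the sketch below transforms ball R's velocity from S^{\prime} to S. The function name `transform_velocity`, the choice of units with c = 1, and the sample values V = 0.6c and u_{0} = 0.3c are assumptions made purely for the example.

```python
import math

def transform_velocity(ux_p, uy_p, V, c=1.0):
    """Transform velocity components (ux', uy') in S' into (ux, uy) in S.

    Implements Equ. (3) for the x-component and Equ. (4) for the y-component.
    """
    gamma = 1.0 / math.sqrt(1.0 - V**2 / c**2)
    denom = 1.0 + ux_p * V / c**2
    ux = (ux_p + V) / denom
    uy = uy_p / (gamma * denom)
    return ux, uy

# Ball R before the collision: it moves at -u0 along y' and has no x' motion
V, u0 = 0.6, 0.3          # assumed sample values, in units of c
ux, uy = transform_velocity(0.0, -u0, V)
print(ux, uy)             # x-component equals V, y-component equals -u0/gamma
```

With these values \gamma = 1.25, so the y-component seen in S is -0.3/1.25 = -0.24 (in units of c), smaller in magnitude than u_{0}, exactly as Equ. (6) says.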

The momentum before and after the collision

We are now going to look at the momentum of balls B and R before and after the collision, as seen in frame S. We will start off by assuming that the mass is constant for both balls, that is that m=m_{0} for both balls, despite the two reference frames moving relative to each other.

If we do this, we can write that the momentum in the x-direction before the collision is given by

(p(B)_{x} + p(R)_{x})_{i} = 0 + m_{0}V = m_{0}V

The momentum after the collision in the x-direction is given by

(p(B)_{x} + p(R)_{x})_{f} = 0 + m_{0}V = m_{0}V

So, momentum is conserved in the x-direction. But, what about in the y-direction? Before the collision, the momentum is given by

(p(B)_{y} + p(R)_{y})_{i} = + m_{0}u_{0} + m_{0} \left( \frac{ -u_{0} }{ \gamma } \right) =m_{0}u_{0} - \frac{ m_{0}u_{0} }{ \gamma }

After the collision, the momentum in the y-direction is given by

(p(B)_{y} + p(R)_{y})_{f} = m_{0}(-u_{0}) + m_{0} \left( \frac{ +u_{0} }{ \gamma } \right) = -m_{0}u_{0} + \frac{ m_{0}u_{0} }{ \gamma }.

If we assume that momentum is conserved, we can write

m_{0}u_{0} - \frac{ m_{0}u_{0} }{ \gamma } = -m_{0}u_{0} + \frac{ m_{0}u_{0} }{ \gamma } \rightarrow 2m_{0}u_{0} = \frac{ 2m_{0}u_{0} }{ \gamma } \rightarrow \gamma = 1

So, if we assume that the mass of both ball B and ball R in frame S is m_{0}, the momentum in the y-direction is only conserved if \gamma =1. But, \gamma is only equal to unity when the relative velocity V between the two frames is zero; in other words when the two frames are not moving relative to each other! If V \neq 0 and mass is constant, momentum will not be conserved.

In physics, the conservation of momentum is considered a law; it is believed to always hold. For momentum to be conserved, we can see qualitatively that the mass of ball R must be greater than the mass of ball B as seen in frame S, because the speed of ball R in the y-direction in frame S is |u(R)_{y}| = u_{0} / \gamma < u_{0}.
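The failure of momentum conservation under the constant-mass assumption is easy to check with numbers. The sketch below is only illustrative: the function name, the units (c = 1) and the sample values u_{0} = 0.3c and V = 0.6c are my assumptions.

```python
import math

def y_momentum_constant_mass(m0, u0, V, c=1.0):
    """Total y-momentum in S before and after the collision, assuming m = m0 for both balls."""
    gamma = 1.0 / math.sqrt(1.0 - V**2 / c**2)
    before = m0 * u0 + m0 * (-u0 / gamma)   # ball B moving up, ball R moving down
    after = m0 * (-u0) + m0 * (u0 / gamma)  # both y-velocities reversed by the collision
    return before, after

before, after = y_momentum_constant_mass(m0=1.0, u0=0.3, V=0.6)
print(before, after)   # the two totals differ unless V = 0
```

For these values the total is about +0.06 before the collision and about -0.06 after it, so the y-momentum is not conserved. Setting V = 0 makes both totals zero, which is the trivial case \gamma = 1.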

Allowing the mass to change

We have just shown above that, if we assume both masses are invariant, momentum will only be conserved in the y-direction in the trivial case where the two frames are stationary relative to each other. So, let us now assume that, if V \neq 0, we have to allow the masses to change.

We will assume that mass is a function of speed. For ball B, the momentum in the x-direction is still zero, both before and after the collision. For ball R, we will now write the momentum in the x-direction, both before and after the collision, as

p(R)_{x} = m(R) u(R)_{x} = m(R) V

What about in the y-direction? For ball B, before the collision we can write

p(B)_{y} = m(B) u(B)_{y} = m(B) u_{0}

where m(B) is the mass of ball B in frame S, which depends on its speed in frame S, namely u_{0}.

For ball R as seen in frame S we can write that the momentum in the y-direction before the collision is given by

p(R)_{y} = m(R) u(R)_{y} = m(R) \cdot \left( \frac{ - u_{0} }{ \gamma } \right) = \frac{ -m(R)u_{0} }{ \gamma }

where \gamma = 1/\sqrt{ (1-V^{2}/c^{2}) } is the Lorentz factor due to the relative velocity V between S and S^{\prime}.

After the collision, we can write the momentum for ball B in the y-direction as being

p(B)_{y} = m(B) u(B)_{y} = -m(B) u_{0}

And, for ball R we can write

p(R)_{y} = m(R) u(R)_{y} = \frac{ +m(R)u_{0} }{ \gamma }

Equating the momentum in the y-direction before and after the collision, we have

m(B) u_{0} - \left( \frac{ m(R) u_{0} }{ \gamma } \right) = -m(B) u_{0} + \left( \frac{ m(R) u_{0} }{ \gamma } \right)

\rightarrow 2m(B) u_{0} = 2 \left( \frac{ m(R) u_{0} }{ \gamma } \right) \rightarrow m(B) = \frac{ m(R) }{ \gamma }

For ball B, we will write

m(B) = \gamma_{B} m_{0}

where

\gamma_{B} = \frac{ 1 }{ \sqrt{ ( 1 - u^{2}(B)/c^{2} ) } } = \frac{ 1 }{ \sqrt{ ( 1 - u_{0}^{2}/c^{2} ) } }

(that is, \gamma_{B} depends on the speed of ball B in frame S, and that speed is u_{0}).

So, the momentum of ball B in the y-direction is given by

p(B)_{y} = m(B)u(B)_{y} \rightarrow \boxed {p(B)_{y} = \frac{ m_{0} u_{0} }{ \sqrt{ ( 1 - u_{0}^{2}/c^{2} ) } } \text{ (7) } }

For ball R, we will write

m(R) = \gamma_{R} m_{0}

where \gamma_{R} = 1/\sqrt{ (1 - u^{2}(R)/c^{2} ) } depends on the speed u(R) of ball R as seen in frame S. (Note: the mass does not depend on just the y-component of ball R’s velocity, as is often incorrectly stated; it depends on its total speed.)

To calculate the value of u(R) we note that it is made up of the x-component u(R)_{x} and the y-component u(R)_{y}. But, u(R)_{x} = V, and we showed above that u(R)_{y} = -u_{0}/ \gamma, where this \gamma = 1/\sqrt{ (1 - V^{2}/c^{2}) }.

Using Pythagoras to calculate u(R), we have

u(R)^{2} = V^{2} + u_{0}^{2}/\gamma^{2} = V^{2} + u_{0}^{2}(1 -V^{2}/c^{2})

which can be rearranged as

u(R)^{2} = u_{0}^{2} + V^{2}( 1 - u_{0}^{2}/c^{2} )

Using this value of u(R) we can write

\gamma_{R} = \frac{ 1 }{ \sqrt{ ( 1 - u(R)^{2}/c^{2} )} } = \frac{ 1 }{ \sqrt{ ( 1 - u_{0}^{2}/c^{2} - V^{2}/c^{2} + V^{2}u_{0}^{2}/c^{4} ) } }

But, the terms ( 1 - u_{0}^{2}/c^{2} - V^{2}/c^{2} + V^{2}u_{0}^{2}/c^{4} ) can be factorised as

( 1 - u_{0}^{2}/c^{2} - V^{2}/c^{2} + V^{2}u_{0}^{2}/c^{4} ) = (1 - u_{0}^{2}/c^{2})(1 - V^{2}/c^{2})
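For anyone who wants the intermediate step, the factorisation can be checked by multiplying out the right-hand side. The shorthand a and b below is introduced only for this check and is not used anywhere else in the derivation.

```latex
\begin{align*}
(1 - a)(1 - b) &= 1 - a - b + ab,
  \qquad a = \frac{u_{0}^{2}}{c^{2}}, \quad b = \frac{V^{2}}{c^{2}} \\
\left( 1 - \frac{u_{0}^{2}}{c^{2}} \right)\left( 1 - \frac{V^{2}}{c^{2}} \right)
  &= 1 - \frac{u_{0}^{2}}{c^{2}} - \frac{V^{2}}{c^{2}} + \frac{V^{2}u_{0}^{2}}{c^{4}}
\end{align*}
```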

And so we can write

\gamma_{R} = \frac{ 1 }{ \sqrt{ ( 1 - u_{0}^{2}/c^{2} ) } \sqrt{ (1 - V^{2}/c^{2}) } }

But, 1/\sqrt{ (1 - V^{2}/c^{2}) } = \gamma, so we can write

\gamma_{R} = \gamma \cdot \frac{ 1 }{ \sqrt{ ( 1 - u_{0}^{2}/c^{2} ) } }

This means that we can write the magnitude of the momentum of ball R in the y-direction as

|p(R)_{y}| = m(R) |u(R)_{y}| = \gamma_{R} m_{0} |u(R)_{y}|

|p(R)_{y}| = \gamma \cdot \frac{ 1 }{ \sqrt{ ( 1 - u_{0}^{2}/c^{2} ) } } \cdot m_{0} \cdot \frac{ u_{0} }{ \gamma }

\boxed{ |p(R)_{y}| = \frac{ m_{0}u_{0} }{ \sqrt{ ( 1 - u_{0}^{2}/c^{2} ) } } \text{ (8) } }

Comparing this to Equ. (7), the equation for p(B)_{y}, we can see that the two momenta are equal in magnitude and opposite in sign, both before and after the collision, as required for the momentum in the y-direction to be conserved.

So, we have proved that, to conserve momentum, we need mass to be a function of speed, and specifically that

\boxed{ m = \frac{ m_{0} }{ \sqrt{ (1 - u^{2}/c^{2}) } } }

where u is the total speed of the ball as measured in frame S.
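The result can be sanity-checked numerically by repeating the momentum calculation with each ball's mass set to \gamma m_{0}, using its total speed in frame S. This is a sketch under assumed values (c = 1, u_{0} = 0.3c, V = 0.6c); the function names are hypothetical.

```python
import math

def gamma_of(speed, c=1.0):
    """Lorentz factor for a given total speed."""
    return 1.0 / math.sqrt(1.0 - speed**2 / c**2)

def y_momentum_relativistic(m0, u0, V, c=1.0):
    """Total y-momentum in S before and after the collision, with m = gamma * m0."""
    gamma = gamma_of(V, c)
    m_B = gamma_of(u0, c) * m0                      # ball B has total speed u0 in S
    u_R = math.sqrt(V**2 + (u0 / gamma) ** 2)       # ball R: components (V, -u0/gamma)
    m_R = gamma_of(u_R, c) * m0
    before = m_B * u0 + m_R * (-u0 / gamma)
    after = m_B * (-u0) + m_R * (u0 / gamma)
    return before, after

before, after = y_momentum_relativistic(m0=1.0, u0=0.3, V=0.6)
print(before, after)   # both totals vanish to within rounding error
```

Both totals come out as zero (to floating-point precision), so the y-momentum is conserved once the mass is allowed to depend on the total speed. Using only the y-component of ball R's velocity to compute its mass would not give this cancellation.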


Riding on a beam of light

In this previous blog, I discussed how an experiment involving electrodynamics was not invariant under a Galilean transformation. Or, to put it another way, the laws of electrodynamics as stated would allow someone to determine whether they were at rest or moving, something which deeply troubled a young Albert Einstein. It is said that one of Einstein’s first “thought experiments” was to imagine himself travelling along on a beam of light. Light is the ultimate “free lunch”: the changing magnetic field produces a changing electric field, which in turn produces a changing magnetic field. It self-propagates at a speed of about 3 \times 10^{8} metres per second in a vacuum.

Einstein realised that if he were travelling with the beam of light then, relative to him, the light would disappear, as the electric and magnetic fields would be stationary relative to him. This worried him, as it suggested that one would be able to tell whether one was travelling or at rest, just by measuring the properties of light. Einstein realised, in an insight which possibly no one else was capable of, that the speed of light was fundamental to physics, and needed to always be constant. This led him to develop what we now call the special theory of relativity, most of which is expressed in a paper he published in 1905 called “On the Electrodynamics of Moving Bodies”.

Einstein’s special theory of relativity

Einstein’s Special Theory of Relativity is based on two very simple but far-reaching principles:

  1. No experiment, mechanical or electrodynamical, can distinguish between being at rest and moving at a constant velocity.
  2. The speed of light in a vacuum, c, is the same for every observer, no matter how quickly the observer is moving.

From the second of these principles, with a simple thought experiment, we can derive the Lorentz transformations from first principles. These are the equations which allow us to translate from one frame of reference to another so that all the laws of Physics are invariant.

An expanding sphere of light

The thought experiment we will use to derive the Lorentz transformations from first principles is one of a flash of light originating at the origin of two frames of reference S and S’ which are moving relative to each other with a velocity v. We set up our experiment so that at time t=0 the origins of the two frames of reference are in the same place.



Two frames of reference S and S’ moving relative to each other with a velocity v have a flash of light originate at their respective origins at time t=0


The flash of light will expand as a sphere, moving with a velocity c in both frames of reference, in accordance with Einstein’s 2nd principle of relativity. For reference frame S we can write that the square of the radius r^{2} of the sphere is x^{2} + y^{2} + z^{2} = c^{2}t^{2} so

\boxed{ x^{2} + y^{2} + z^{2} - c^{2}t^{2}=0 } \qquad(1)

For the reference frame S’ we can write that

\boxed{ x^{\prime 2} + y^{\prime 2} + z^{\prime 2} - c^{2}t^{\prime 2} = 0 } \qquad(2)

The left-hand sides of these two equations must be equal, as it is the same sphere of light in both reference frames. Let us see if we can transform from one to the other using the Galilean transforms, which are

\boxed {\begin{array}{lcl} x^{\prime} & = & x - vt \\ y^{\prime} & = & y \\ z^{\prime} & = & z \\ t^{\prime} & = & t \end{array} }

Substituting these into the left-hand side of equation (2) gives

x^{\prime 2} + y^{\prime 2} + z^{\prime 2} -c^{2}t^{2} = (x-vt)^{2} + y^{2} + z^{2} - c^{2}t^{2}

Expanding the brackets of the right hand side gives

x^{2} - 2vtx + v^{2}t^{2} + y^{2} + z^{2} - c^{2}t^{2} \neq x^{2} + y^{2} + z^{2} - c^{2}t^{2}



The left side of the equation should be equal to the right side, but the terms highlighted do not exist on the right hand side of the equation.


As we can see, the two expressions are not equal, as the left hand side has the extra terms -2vtx + v^{2}t^{2}. This means that the Galilean transformations do not work. The extra terms involve a combination of x and t, which suggests that both the equations linking x and x^{\prime} and t and t^{\prime} need to be modified, not just the equation for x as is the case in the Galilean transformations.
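To see the mismatch with actual numbers, the sketch below takes a point that lies on the light sphere in S, applies the Galilean transformation, and evaluates the left-hand side of equation (2). The function names, the units (c = 1) and the sample point and speed are assumptions chosen for illustration.

```python
def light_sphere_residual(x, y, z, t, c=1.0):
    """Return x^2 + y^2 + z^2 - c^2 t^2, which is zero for a point on the light sphere."""
    return x**2 + y**2 + z**2 - (c * t) ** 2

def galilean(x, y, z, t, v):
    """Galilean transformation from S to S' (relative speed v along the x-axis)."""
    return x - v * t, y, z, t

# A point on the light sphere in S at t = 1, in units where c = 1
x, y, z, t = 0.6, 0.8, 0.0, 1.0
v = 0.5

residual_S = light_sphere_residual(x, y, z, t)
residual_S_prime = light_sphere_residual(*galilean(x, y, z, t, v))
extra_terms = -2 * v * t * x + v**2 * t**2

print(residual_S, residual_S_prime, extra_terms)
```

The residual is zero in S but about -0.35 after the Galilean transformation, which is exactly the value of the extra terms -2vtx + v^{2}t^{2} for this point.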

Modifying the Galilean transformations

Let us assume that the transformations can be written as

\boxed {\begin{array}{lcl} x^{\prime} & = & a_{1}x + a_{2}t \qquad(3) \\ y^{\prime} & = & y \\ z^{\prime} & = & z \\ t^{\prime} & = & b_{1}x + b_{2}t \qquad(4) \end{array} }

We need to find the values of a_{1}, a_{2}, b_{1} and b_{2} which correctly transform the equations for the expanding sphere of light. We do this by substituting equations (3) and (4) into equation (2). Before we do this, we note that the origin of the primed frame x^{\prime}=0 is a point that moves with speed v as seen in the unprimed frame S. Therefore its location in the unprimed frame S at time t is just x=vt. So we can write equation (3) as

x^{\prime} = 0 = a_{1}x + a_{2}t \rightarrow x = -\frac{a_{2}}{a_{1}} t = vt

\therefore \frac{ a_{2} }{ a_{1} } = -v

Re-writing equation (3)

x^{\prime} = a_{1}x + a_{2}t = a_{1}(x+\frac{ a_{2} }{ a_{1} } t) = a_{1}(x-vt)

Now we substitute this expression and equation (4) into equation (2)

a_{1}^{2}(x-vt)^{2} + y^{\prime 2} + z^{\prime 2} -c^{2}(b_{1}x+b_{2}t)^{2} = x^{2} + y^{2} + z^{2} -c^{2}t^{2}

a_{1}^{2} x^{2} -2a_{1}^{2} xvt + a_{1}^{2} v^{2} t^{2} - c^{2} b_{1}^{2} x^{2} - 2c^{2} b_{1} b_{2} xt -c^{2} b_{2}^{2} t^{2} = x^{2} - c^{2} t^{2}

Equating coefficients:

( a_{1}^{2} - c^{2}b_{1}^{2} ) x^{2} = x^{2} \rm{\;\; or \;\;} a_{1}^{2} - c^{2}b_{1}^{2} = 1 \qquad(5)

( a_{1}^{2} v^{2} - c^{2} b_{2}^{2} ) t^{2} = -c^{2} t^{2} \rm{\;\; or \;\;} c^{2} b_{2}^{2} -a_{1}^{2} v^{2} = c^{2} \qquad(6)

(2a_{1}^{2} v + 2b_{1} b_{2} c^{2} ) xt = 0 \rm{\;\; or \;\;} b_{1} b_{2} c^{2} = -a_{1}^{2}v \qquad(7)

From equations (5) and (6) we can write

b_{1}^{2} c^{2} = a_{1}^{2} - 1 \qquad(8)


b_{2}^{2} c^{2} = c^{2} + a_{1}^{2} v^{2} \qquad (9)

Multiplying equations (8) and (9) and squaring equation (7) we get

b_{1}^{2} b_{2}^{2} c^{4} = ( a_{1}^{2} - 1 )( c^{2} + a_{1}^{2} v^{2} ) = a_{1}^{4} v^{2}


a_{1}^{2} c^{2} - c^{2} + a_{1}^{4} v^{2} - a_{1}^{2} v^{2} = a_{1}^{4} v^{2}

a_{1}^{2} c^{2} - a_{1}^{2} v^{2} = c^{2}

a_{1}^{2} ( c^{2} - v^{2} ) = c^{2}

a_{1}^{2} = \frac{ c^{2} }{ c^{2} - v^{2} } = \frac{ 1 }{ 1 - v^{2}/c^{2} }


\boxed{ a_{1} = \frac{ 1 }{ \sqrt{ (1 - v^{2}/c^{2} ) } } }

Since a_{2}/a_{1} = -v, we can write

\boxed{ a_{2} = -v \cdot \frac{ 1 }{ \sqrt{ ( 1 - v^{2}/c^{2} ) } } }

Using equation (8) we can write

b_{1}^{2} c^{2} = \frac{ 1 }{ (1 - v^{2}/c^{2} ) } - 1

b_{1}^{2} c^{2} = \frac{ 1 - ( 1 - v^{2}/c^{2} ) }{ (1 - v^{2}/c^{2} ) } = \frac{ v^{2}/c^{2} }{ (1 - v^{2}/c^{2} ) } = \frac{ v^{2} }{ c^{2} } \cdot \frac { 1 }{ (1 - v^{2}/c^{2} ) }

so b_{1}^{2} = \frac{ v^{2} }{ c^{4} } \cdot \frac{ 1 }{ ( 1 - v^{2}/c^{2} ) }

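Taking the negative square root (equation (7) requires the product b_{1}b_{2} to be negative for positive v, and b_{2} must be positive so that t^{\prime} \rightarrow t as v \rightarrow 0), we can write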

\boxed{ b_{1} = - \frac{ v }{ c^{2} } \cdot \frac{ 1 }{\sqrt{ (1 - v^{2}/c^{2} ) }} }

From equation (9) we can write

b_{2}^{2} c^{2} = c^{2} + v^{2} \cdot \frac{ 1 }{ ( 1 - v^{2}/c^{2} ) } = \frac{ c^{2}( 1 - v^{2}/c^{2} ) + v^{2} }{ ( 1 - v^{2}/c^{2} ) } = \frac{ c^{2} - v^{2} + v^{2} }{ (1 - v^{2}/c^{2} ) } = \frac{ c^{2} }{ ( 1 - v^{2}/c^{2} ) }

which leads to

b_{2}^{2} = \frac{ 1 }{ ( 1 - v^{2}/c^{2} ) }

and so

\boxed{ b_{2} = \frac{ 1 }{ \sqrt{ ( 1 - v^{2}/c^{2} ) } } }

which is the same as a_{1}.

If we define

\gamma = \frac{ 1 }{ \sqrt{ ( 1 - v^{2}/c^{2} ) } }

we can write

a_{1} = \gamma, \;\;\; a_{2} = -\gamma v, \;\;\; b_{1} = -\frac{ v }{ c^{2} } \cdot \gamma \rm{\;\;\ and \;\;\;} b_{2} = \gamma
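As a check on the algebra, the four coefficients can be substituted back into equations (5), (6) and (7) numerically. This is a minimal sketch; the function name `lorentz_coefficients`, the units (c = 1) and the sample speed v = 0.6c are assumptions for the example.

```python
import math

def lorentz_coefficients(v, c=1.0):
    """Return (a1, a2, b1, b2) as derived above for relative speed v."""
    gamma = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    a1 = gamma
    a2 = -gamma * v
    b1 = -gamma * v / c**2
    b2 = gamma
    return a1, a2, b1, b2

v, c = 0.6, 1.0
a1, a2, b1, b2 = lorentz_coefficients(v, c)

eq5 = a1**2 - c**2 * b1**2           # equation (5): should equal 1
eq6 = c**2 * b2**2 - a1**2 * v**2    # equation (6): should equal c^2
eq7 = b1 * b2 * c**2 + a1**2 * v     # equation (7): should equal 0

print(eq5, eq6, eq7)
```

Flipping the sign of b_{1} leaves equations (5) and (6) satisfied but breaks equation (7), which is why the negative square root is the one to take.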

Thus we can finally write our transformations as

\boxed {\begin{array}{lcl} x^{\prime} & = & \gamma (x - vt) \\ y^{\prime} & = & y \\ z^{\prime} & = & z \\ t^{\prime} & = & \gamma ( t - \frac{ v }{ c^{2} }x ) \end{array} }

These are known as the Lorentz transformations.
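A short numerical check shows that these transformations do what we asked of them: a point on the light sphere in S is also on the light sphere in S^{\prime}. The function names, the units (c = 1) and the sample point and speed below are assumptions made for illustration.

```python
import math

def light_sphere_residual(x, y, z, t, c=1.0):
    """Return x^2 + y^2 + z^2 - c^2 t^2, which is zero for a point on the light sphere."""
    return x**2 + y**2 + z**2 - (c * t) ** 2

def lorentz(x, y, z, t, v, c=1.0):
    """Lorentz transformation from S to S' (relative speed v along the x-axis)."""
    gamma = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    x_p = gamma * (x - v * t)
    t_p = gamma * (t - v * x / c**2)
    return x_p, y, z, t_p

# A point on the light sphere in S at t = 1, in units where c = 1
x, y, z, t = 0.6, 0.8, 0.0, 1.0
v = 0.5

residual_S = light_sphere_residual(x, y, z, t)
residual_S_prime = light_sphere_residual(*lorentz(x, y, z, t, v))

print(residual_S, residual_S_prime)   # both vanish to within rounding error
```

Note that t^{\prime} is not equal to t for this point, so the two observers disagree about when the light reaches it, even though both agree that it lies on the sphere.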

The Lorentz factor

The term \gamma is known as the Lorentz factor.



The Lorentz factor \gamma plotted against speed as a fraction of the speed of light.


As this plot shows, the Lorentz factor stays close to unity until the ratio v/c (the ratio of the speed to the speed of light) reaches about one half, that is, until v is about 1.5 \times 10^{8} m/s. Given that even our fastest spaceships only travel at a tiny fraction of the speed of light, it is not surprising that we have no direct experience of the weird effects that a Lorentz factor deviating significantly from one produces. Of course we see these effects in particle accelerators and cosmic ray showers, but human beings are a long way from attaining speeds where the Lorentz factor will deviate noticeably from unity.
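To get a feel for how fast one has to go before the Lorentz factor departs from unity by a given amount, the definition of \gamma can be inverted to give v/c = \sqrt{ 1 - 1/\gamma^{2} }. The sketch below does this; the function name `beta_for_gamma` and the sample values of \gamma are my own choices.

```python
import math

def beta_for_gamma(gamma):
    """Return v/c for a given Lorentz factor, by inverting gamma = 1/sqrt(1 - v^2/c^2)."""
    if gamma < 1:
        raise ValueError("the Lorentz factor is never less than 1")
    return math.sqrt(1.0 - 1.0 / gamma**2)

# Speeds (as fractions of c) needed for a 1% departure from unity, and for gamma = 2
for gamma in (1.01, 2.0):
    print(f"gamma = {gamma}  ->  v/c = {beta_for_gamma(gamma):.4f}")
```

A departure of just 1 per cent from unity already needs a speed of about 0.14c, or roughly 4 \times 10^{7} m/s, and \gamma = 2 needs about 0.87c.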

In a future blog I will discuss some of these weird effects. They include time passing more slowly and distances shrinking. They are very, very weird but very, very real: they are shown to happen every day in our particle accelerators.
