## Derivation of Planck’s radiation law – part 4 (final part)

In part 3 of this blog series I explained how Max Planck found a mathematical formula to fit the observed Blackbody spectrum, but that when he presented it to the German Physics Society on the 19th of October 1900 he had no physical explanation for his formula. Remember, the formula he found was

$E_{\lambda} \; d \lambda = \frac{ A }{ \lambda^{5} } \frac{ 1 }{ (e^{a/\lambda T} -1) } \; d\lambda$

if we express it in terms of wavelength intervals. If we express it in terms of frequency intervals it is

$E_{\nu} \; d \nu = A^{\prime} \nu^{3} \frac{ 1 }{ (e^{ a^{\prime} \nu / T } - 1) } \; d\nu$

Planck would spend six weeks trying to find a physical explanation for this equation. He struggled with the problem, and in the process was forced to abandon many aspects of 19th Century physics in both the fields of thermodynamics and electromagnetism which he had long cherished. I will recount his derivation – it is not the only one and maybe in coming blog posts I can show how his formula can be derived from other arguments, but this is the method Planck himself used.

As we saw in the derivation of the Rayleigh-Jeans law (see part 3 here, and links in that to parts 1 and 2), blackbody radiation can be modelled as an idealised cavity which radiates through a small hole. Importantly, the system is given enough time for the radiation and the material from which the cavity is made to come into thermal equilibrium with each other. This means that the walls of the cavity are giving energy to the radiation at the same rate that the radiation is giving energy to the walls.

Using classical physics, as we did in the derivation of the Rayleigh-Jeans law, we saw that the energy density (the energy per unit volume) is

$\frac{du}{d\nu} = \left( \frac{ 8 \pi kT }{ c^{3} } \right) \nu^{2}$

After trying to derive his equation based on standard thermodynamic arguments, which failed, Planck developed a model which, he found, was able to produce his equation. How did he do this?

### Harmonic Oscillators

First, he knew from classical electromagnetic theory that an oscillating electron radiates (as it is accelerating), and he reasoned that when the cavity was in thermal equilibrium with the radiation in the cavity, the electrons in the walls of the cavity would oscillate and it was they that produced the radiation.

After much trial and error, he decided upon a model where the electrons were attached to massless springs. He could model the radiation of the electrons by modelling them as a whole series of harmonic oscillators, but with different spring stiffnesses to produce the different frequencies observed in the spectrum.

As we have seen (I derived it here), in classical physics the energy of a harmonic oscillator depends on the square of its amplitude of oscillation ($E \propto A^{2}$) and on the square of its frequency of oscillation ($E \propto \nu^{2}$). In Planck’s model, the act of heating the cavity to a particular temperature is what set the electrons oscillating; but whether an oscillator of a particular frequency was set in motion or not would depend on the temperature.

If it were oscillating, it would emit radiation into the cavity and absorb radiation from it. He knew from the shape of the blackbody curve (and, by now, from his equation which fitted it) that the energy density $E_{\nu} \, d\nu$ started off at zero at low frequencies (in the infrared), rose to a peak, and then dropped back towards zero at high frequencies (in the UV).

So, Planck imagined that the number of oscillators with a particular resonant frequency would determine how much energy came out in that frequency interval. He imagined that there were more oscillators with a frequency which corresponded to the maximum in the blackbody curve, and fewer oscillators at higher and lower frequencies. He then had to figure out how the total energy being radiated by the blackbody would be shared amongst all these oscillators, with different numbers oscillating at different frequencies.

He found that he could not derive his formula using the physics that he had long accepted as correct. If he assumed that the energy of each oscillator went as the square of the amplitude, as it does in classical physics, his formula was not reproduced. Instead, he could derive his formula for the blackbody radiation spectrum only if the oscillators absorbed and emitted packets of energy which were proportional to their frequency of oscillation, not to the square of the frequency as classical physics argued. In addition, he found that the energy could only come in certain sized chunks, so for an oscillator at frequency $\nu, \; E = nh\nu$, where $n$ is an integer, and $h$ is now known as Planck’s constant.

What does this mean? Well, in classical physics, an oscillator can have any energy, which for a particular oscillator vibrating at a particular frequency can be altered by changing the amplitude. Suppose we have an oscillator vibrating with an amplitude of 1 (in arbitrary units); then, because the energy goes as the square of the amplitude, its energy is $E=1^{2} =1$. If we increase the amplitude to 2, the energy will now be $E=2^{2} = 4$. But, if we wanted an energy of 2, we would need an amplitude of $\sqrt{2} = 1.414$, and if we wanted an energy of $3$ we would need an amplitude of $\sqrt{3} = 1.73$.

In classical physics, there is nothing to stop us having an amplitude of 1.74, which would give us an energy of 3.0276 (not 3), or an amplitude of 1.72, which would give us an energy of 2.9584 (not 3). But what Planck found is that this was not allowed for his oscillators: they did not seem to obey the classical laws of physics. The energy could only be an integer multiple of $h\nu$, so $E=0h\nu, 1h\nu, 2h\nu, 3h\nu, 4h\nu$ etc.

Then, as I said above, he further assumed that the total energy at a particular frequency was given by the energy of each oscillator at that frequency multiplied by the number of oscillators at that frequency. The frequency of a particular oscillator was, he imagined, determined by its stiffness (Hooke’s constant). The energy of a particular oscillator at a particular frequency could be varied by the amplitude of its oscillations.

Let us assume, just to illustrate the idea, that the value of h is 2. If the total energy in the blackbody at a particular frequency of, say, 10 (in arbitrary units) were 800 (also in arbitrary units), this would mean that the energy of each chunk ($E=h \nu$) was $E = 2 \times 10 = 20$. So, the number of chunks at that frequency would then be $800/20 = 40$. 40 oscillators, each with an energy of 20, would be oscillating to give us our total energy of 800 at that frequency.
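Just to check this arithmetic, here is a quick Python sketch of the same illustrative numbers (all quantities in arbitrary units, as in the text; the real Planck constant is of course not 2!):

```python
# Illustrative arithmetic only: h = 2 and all quantities in arbitrary
# units, exactly as in the worked example in the text.
h = 2            # pretend value of Planck's constant
nu = 10          # frequency, arbitrary units
E_total = 800    # total energy at this frequency, arbitrary units

chunk = h * nu                 # energy of one chunk, E = h * nu
n_chunks = E_total // chunk    # how many chunks (oscillators) share the energy

print(chunk)      # 20
print(n_chunks)   # 40
```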

Because of this quantised energy, we can write that $E_{n} = nh \nu$, where $n=0,1,2,3, \cdots$.

### The number of oscillators at each frequency

The next thing Planck needed to do was derive an expression for the number of oscillators at each frequency. Again, after much trial and error he found that he had to borrow an idea first proposed by Austrian physicist Ludwig Boltzmann to describe the most likely distribution of energies of atoms or molecules in a gas in thermal equilibrium. Boltzmann found that the number of atoms or molecules with a particular energy $E$ was given by

$N_{E} \propto e^{-E/kT}$

where $E$ is the energy of that state, $T$ is the temperature of the gas and $k$ is now known as Boltzmann’s constant. The equation is known as the Boltzmann distribution, and Planck used it to give the number of oscillators at each frequency. So, for example, if $N_{0}$ is the number of oscillators with zero energy (in the so-called ground-state), then the numbers in the 1st, 2nd, 3rd etc. levels ($N_{1}, N_{2}, N_{3},\cdots$) are given by

$N_{1} = N_{0} e^{ -E_{1}/kT }, \; N_{2} = N_{0} e^{ -E_{2}/kT }, \; N_{3} = N_{0} e^{ -E_{3}/kT }, \cdots$

But, as $E_{n} = nh \nu$, we can write

$N_{1} = N_{0} e^{ -h \nu /kT }, \; N_{2} = N_{0} e^{ -2h \nu /kT }, \; N_{3} = N_{0} e^{ -3h \nu /kT }, \cdots$

Planck modelled blackbody radiation as a series of harmonic oscillators with equally spaced energy levels

To make it easier to write, we are going to substitute $x = e^{ -h \nu / kT }$, so we have

$N_{1} = N_{0}x, \; N_{2} = N_{0} x^{2}, \; N_{3} = N_{0} x^{3}, \cdots$

The total number of oscillators $N_{tot}$ is given by

$N_{tot} = N_{0} + N_{1} + N_{2} + N_{3} + \cdots = N_{0} ( 1 + x + x^{2} + x^{3} + \cdots)$

Remember, this is the number of oscillators at each frequency, so the energy at each frequency is given by the number at each frequency multiplied by the energy of each oscillator at that frequency. So

$E_{1}=N_{1} h \nu , \; E_{2} = N_{2} 2h \nu , \; E_{3} = N_{3} 3h \nu, \cdots$

which we can now write as

$E_{1} = h \nu N_{0}x, \; E_{2} = 2h \nu N_{0}x^{2}, \; E_{3} = 3h \nu N_{0}x^{3}, \cdots$

The total energy $E_{tot}$ is given by

$E_{tot} = E_{0} + E_{1} + E_{2} + E_{3} + \cdots = N_{0} h \nu (0 + x + 2x^{2} + 3x^{3} + \cdots)$

The average energy $\langle E \rangle$ is given by

$\langle E \rangle = \frac{ E_{tot} }{ N_{tot} } = \frac{ N_{0} h \nu (0 + x + 2x^{2} + 3x^{3} + \cdots) }{ N_{0} ( 1 + x + x^{2} + x^{3} + \cdots ) }$

The two series inside the brackets can be summed. The sum of the series in the numerator, which we will call $S_{1}$ is given by

$S_{1} = \frac{ x - (n+1)x^{n+1} + nx^{n+2} }{ (1-x)^{2} }$

(for the proof of this, see for example here)

The series in the denominator, which we will call $S_{2}$, is just a geometric progression. The sum  of such a series is simply

$S_{2} = \frac{ 1 - x^{n} }{ (1-x) }$

Both series are in $x$, where $x = e^{-h \nu / kT} < 1$. The sums run over the level number $n$, which goes from $0$ to $\infty$; because $x < 1$, the terms $x^{n} \rightarrow 0$ as $n \rightarrow \infty$, which means the sums converge and can be simplified.

$S_{1} \rightarrow \frac{x}{ (1-x)^{2} } \text{ and } S_{2} \rightarrow \frac{ 1 }{(1-x)}$

which means that $\langle E \rangle = (h \nu S_{1})/S_{2}$ is given by

$\langle E \rangle = \frac{ h \nu x }{ (1-x)^{2} } \times \frac{ (1-x) }{1} = \frac{h \nu x}{ (1-x) }$

and so we can write that the average energy is

$\boxed{ \langle E \rangle = \frac{h \nu}{( 1/x - 1) } = \frac{h \nu}{ (e^{h \nu/kT} - 1) } }$
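As a sanity check on this result, here is a short Python sketch which sums the Boltzmann-weighted series directly and compares it with the closed form; the energies are measured in units of $h\nu$, and the truncation of the sums at a finite number of levels is a numerical choice of mine, not part of Planck's derivation:

```python
import math

def average_energy_series(y, n_levels=2000):
    """<E> in units of h*nu, summing the Boltzmann-weighted levels
    E_n = n h nu directly.  y = h*nu/kT; truncating the sums at
    n_levels terms is a numerical choice (the tail is negligible)."""
    x = math.exp(-y)
    total_energy = sum(n * x**n for n in range(n_levels))   # Sigma n x^n
    total_number = sum(x**n for n in range(n_levels))       # Sigma x^n
    return total_energy / total_number

def average_energy_closed(y):
    """Planck's closed form <E>/(h nu) = 1/(e^y - 1), with y = h nu/kT."""
    return 1.0 / (math.exp(y) - 1.0)

# The direct sum and the closed form agree for any y = h*nu/kT > 0
for y in (0.5, 1.0, 3.0):
    assert abs(average_energy_series(y) - average_energy_closed(y)) < 1e-10
```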

## The radiance per frequency interval

In our derivation of the Rayleigh-Jeans law (in this blog here), we showed that, using classical physics, the energy density $du$ per frequency interval was given by

$du = \frac{ 8 \pi }{ c^{3} } kT \nu^{2} \, d \nu$

where $kT$ was the energy of each mode of the electromagnetic radiation. We need to replace the $kT$ in this equation with the average energy for the harmonic oscillators that we have just derived above. So, we re-write the energy density as

$du = \frac{ 8 \pi }{ c^{3} } \frac{ h \nu }{ (e^{h\nu/kT} - 1) } \nu^{2} \; d\nu = \frac{ 8 \pi h \nu^{3} }{ c^{3} } \frac{ 1 }{ (e^{h\nu/kT} - 1) } \; d\nu$

$du$ is the energy density per frequency interval (usually measured in joules per cubic metre per hertz). By replacing $kT$ with the average energy that we derived above, the radiation curve no longer goes as $\nu^{2}$ as in the Rayleigh-Jeans law, but instead reaches a maximum and turns over, avoiding the ultraviolet catastrophe.

It is more common to express the Planck radiation law in terms of the radiance per unit frequency, or the radiance per unit wavelength, which are written $B_{\nu}$ and $B_{\lambda}$ respectively. Radiance is the power per unit solid angle per unit area. So, as a first step to go from energy density to radiance we will divide by $4 \pi$, the total solid angle. This gives

$\frac{du}{4 \pi} = \frac{ 2 h \nu^{3} }{ c^{3} } \frac{ 1 }{ (e^{h\nu/kT} - 1) } \; d\nu$

We want the power per unit area, not the energy per unit volume. To do this we first note that power is energy per unit time, and second that to go from unit volume to unit area we need to multiply by a length. But, in a time $t$, EM radiation travels a length $ct$. So, we divide by $t$ and multiply by $ct$, which amounts to multiplying by $c$, giving us that the radiance per frequency interval is

$\boxed{ B_{\nu} = \frac{ 2h \nu^{3} }{ c^{2} } \frac{ 1 }{ (e^{h\nu/kT} - 1) } }$

which is the way the Planck radiation law per frequency interval is usually written.
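To see this behaviour numerically, here is a Python sketch (my own construction; the function names and sample temperature are illustrative choices) which evaluates this form of the Planck law and compares it with the Rayleigh-Jeans expression at low and high frequency:

```python
import math

# Physical constants in SI units
h = 6.62607015e-34    # Planck's constant, J s
c = 2.99792458e8      # speed of light, m/s
k = 1.380649e-23      # Boltzmann's constant, J/K

def planck_B_nu(nu, T):
    """Planck radiance per frequency interval: 2 h nu^3 / c^2 / (e^{h nu/kT} - 1)."""
    return (2.0 * h * nu**3 / c**2) / math.expm1(h * nu / (k * T))

def rayleigh_jeans_B_nu(nu, T):
    """Classical (Rayleigh-Jeans) radiance: 2 nu^2 k T / c^2."""
    return 2.0 * nu**2 * k * T / c**2

T = 5800.0   # an illustrative temperature, roughly the Sun's surface

# Low frequency (h nu << kT): the two formulas agree
ratio_low = planck_B_nu(1e9, T) / rayleigh_jeans_B_nu(1e9, T)
# High frequency (h nu >> kT): Planck's curve turns over, falling far
# below the classical prediction and avoiding the ultraviolet catastrophe
ratio_high = planck_B_nu(1e16, T) / rayleigh_jeans_B_nu(1e16, T)

assert 0.999 < ratio_low < 1.0
assert ratio_high < 1e-10
```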

## Radiance per unit wavelength interval

If you would prefer the radiance per wavelength interval, we note that $\nu = c/\lambda$ and so $d\nu = -c/\lambda^{2} \; d\lambda$. Ignoring the minus sign (which is just telling us that as the frequency increases the wavelength decreases), and substituting for $\nu$ and $d\nu$ in terms of $\lambda$ and $d\lambda$, we can write

$B_{\lambda} = \frac{ 2h }{ c^{2} } \frac{ c^{3} }{ \lambda^{3} } \frac{ 1 }{ ( e^{hc/\lambda kT} - 1 ) } \frac{ c }{ \lambda^{2} }$

Tidying up, this gives

$\boxed{ B_{\lambda} = \frac{ 2hc^{2} }{ \lambda^{5} } \frac{ 1 }{ ( e^{hc/\lambda kT} - 1 ) } }$

which is the way the Planck radiation law per wavelength interval is usually written.
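The per-frequency and per-wavelength forms are related by the factor $|d\nu/d\lambda| = c/\lambda^{2}$ we substituted, and we can verify the algebra numerically; this Python sketch (with my own illustrative temperature and wavelength) checks that the two forms agree:

```python
import math

h, c, k = 6.62607015e-34, 2.99792458e8, 1.380649e-23   # SI constants

def B_nu(nu, T):
    """Planck law per frequency interval."""
    return (2.0 * h * nu**3 / c**2) / math.expm1(h * nu / (k * T))

def B_lam(lam, T):
    """Planck law per wavelength interval: 2 h c^2 / lambda^5 / (e^{hc/lambda kT} - 1)."""
    return (2.0 * h * c**2 / lam**5) / math.expm1(h * c / (lam * k * T))

T, lam = 5800.0, 500e-9     # illustrative temperature (K) and wavelength (m)
nu = c / lam

# B_lambda should equal B_nu multiplied by the Jacobian |d nu/d lambda| = c/lambda^2
assert abs(B_lam(lam, T) - B_nu(nu, T) * c / lam**2) / B_lam(lam, T) < 1e-12
```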

## Summary

To summarise, in order to reproduce the formula which he had empirically derived and presented in October 1900, Planck found that he could only do so if he assumed that the radiation was produced by oscillating electrons, which he modelled as oscillating on massless springs (so-called “harmonic oscillators”). The total energy at any given frequency would be given by the energy of a single oscillator at that frequency multiplied by the number of oscillators oscillating at that frequency.

However, he had to assume that

1. The energy which each oscillator could absorb or emit was not proportional to the square of its frequency of oscillation, as the classical result ($E \propto A^{2}\nu^{2}$) would suggest, but directly proportional to the frequency, $E \propto \nu$.
2. The energy of each oscillator could only be a multiple of some fundamental “chunk” of radiation, $h \nu$, so $E_{n} = nh\nu$ where $n=0,1,2,3,4$ etc.
3. The number of oscillators with each energy $E_{n}$ was given by the Boltzmann distribution, so $N_{n} = N_{0} e^{-nh\nu/kT}$ where $N_{0}$ is the number of oscillators in the lowest energy state.

In a way, we can imagine that the oscillators at higher frequencies (to the high-frequency side of the peak of the blackbody curve) are “frozen out”. At these frequencies the quantum of energy $h\nu$ is just too large compared to the thermal energy available, so these oscillators are hardly ever excited. This avoids the ultraviolet catastrophe which had stumped physicists up until this point.

By combining these assumptions, Planck was able in November 1900 to reproduce the exact equation which he had derived empirically in October 1900. In doing so he provided, for the first time, a physical explanation for the observed blackbody curve.

• Part 1 of this blog series is here.
• Part 2 is here.
• Part 3 is here.

## Does centrifugal force exist?

For several weeks now I have been planning to write a blog about centrifugal force, mainly prompted by seeing a post by John Gribbin on Facebook of the xkcd cartoon about it. In the cartoon James Bond is threatened with torture on a centrifuge. Here is a link to the original cartoon.

The xkcd cartoon about centrifugal force involves James Bond being tortured on a centrifuge

I have taught mechanics many times to physics undergraduates, and they are often confused about centripetal force and centrifugal force, and what the difference is between them. Some have heard that centrifugal force doesn’t really exist, just as Bond states in this cartoon. What is the real story?

## Rotating frames of reference

Everyone reading this (apart from a few “flat-Earth adherents” maybe) knows that we live on the surface of a planet which is rotating on its axis once a day. This means that we do not live in an inertial frame of reference (an inertial frame is one which is not accelerating): being on the surface of a spinning planet means that we are accelerating all the time, as we are not travelling in a straight line. That acceleration is provided by the force of gravity, which stops us from going off in a straight line into space!

Because we are living in a non-inertial frame of reference, we need to modify Newton’s laws of motion to describe it properly. I am going to call this non-inertial frame a “rotating frame” from now on; a rotating frame is just one example of a non-inertial frame, but it is the one relevant to us on the surface of a rotating Earth.

Let us consider our usual Cartesian coordinate system. The unit vector in the x-direction is usually written as $\hat{\imath}$, the one in the y-direction as $\hat{\jmath}$, and the one in the z-direction as $\hat{k}$. We are going to consider an object rotating about the $\hat{k}$ (z-axis) direction.

We will consider two reference frames, one which stays fixed (the inertial reference frame), denoted by $(\hat{\imath},\hat{\jmath},\hat{k})$, and a second reference frame which rotates with the rotation, denoted by $(\hat{\imath}_{r} ,\hat{\jmath}_{r} ,\hat{k}_{r})$, where the subscript $r$ reminds us that this is the rotating frame of reference.

For the derivation below I am going to assume that we are considering motion with a constant radius $r$. I want to illustrate how centrifugal force arises in a rotating frame such as the surface of our Earth. Our Earth is not perfectly spherical, but at any given point the radius does not change, so this is a reasonable simplification.

As I showed in this blog on angular velocity, we can write the linear velocity $\vec{v}$ of an object moving in a circle as

$\vec{v} = \frac{ d \vec{r} }{ dt } = \vec{\omega} \times \vec{r}$

where $\vec{r}$ is the radius vector and $\vec{\omega}$ is the angular velocity.

The same relation holds for any vector of constant length which rotates with angular velocity $\vec{\omega}$. In particular, applying it to each of the unit vectors of a frame which rotates with the motion, we have

$\frac{ d \hat{\imath} }{dt} = \vec{\omega} \times \hat{\imath}, \; \; \frac{ d \hat{\jmath} }{dt} = \vec{\omega} \times \hat{\jmath} , \; \; \frac{ d \hat{k} }{dt} = \vec{\omega} \times \hat{k}$

Let us consider the specific case of a small rotation $\delta \theta$ about the $\hat{k}$ axis, as shown in the figure below. As the figure shows, in our inertial (fixed) frame of reference, the new direction of the x-axis is now $\hat{\imath} + \delta \hat{\imath}$, and the new direction of the y-axis is $\hat{\jmath} + \delta \hat{\jmath}$. The direction of the $\hat{k}$ axis is unchanged.

We are going to rotate about the z-axis ($\hat{k}$ direction) by an angle $\delta \theta$

Because we are rotating about the $\hat{k}$ axis, the angular velocity is in this direction, and so we can write (using the right-hand rule for vector products as I blogged about here)

$\vec{\omega} \times \hat{\imath} = \omega \hat{\jmath}, \; \; \vec{\omega} \times \hat{\jmath} = -\omega \hat{\imath}, \; \; \vec{\omega} \times \hat{k} =0$

Let us now consider some vector $\vec{a}$, which we will write in the rotating frame of reference as

$\vec{a} = a_{x} \hat{\imath}_{r} + a_{y} \hat{\jmath}_{r} + a_{z} \hat{k}_{r}$

If we now look at the rate of change of this vector in the rotating frame we have

$\left( \frac{d \vec{a} }{dt} \right)_{r} = \frac{d}{dt}(a_{x}\hat{\imath}_{r}) + \frac{d}{dt}(a_{y}\hat{\jmath}_{r}) + \frac{d}{dt}(a_{z}\hat{k}_{r})$

In the rotating frame of reference, $\hat{\imath}_{r}, \hat{\jmath}_{r}$ and $\hat{k}_{r}$ do not change with time, so we can write

$\left( \frac{d \vec{a} }{dt} \right)_{r} = \frac{ da_{x} }{dt} \hat{\imath}_{r} + \frac{ da_{y} }{dt} \hat{\jmath}_{r} + \frac{ da_{z} }{dt} \hat{k}_{r}$

In the inertial frame of reference $\hat{\imath}_{r}, \hat{\jmath}_{r}$ and $\hat{k}_{r}$ move, so

$\left( \frac{d \vec{a} }{dt} \right)_{i} = \frac{d}{dt} (a_{x} \hat{\imath}_{r}) + \frac{d}{dt} (a_{y} \hat{\jmath}_{r}) + \frac{d}{dt} (a_{z} \hat{k}_{r})$

$\left( \frac{d \vec{a} }{dt} \right)_{i} = \frac{ da_{x} }{dt}\hat{\imath}_{r} + \frac{ da_{y} }{dt}\hat{\jmath}_{r} + \frac{ da_{z} }{dt}\hat{k}_{r} + a_{x} \frac{d \hat{\imath}_{r} }{dt} + a_{y} \frac{d \hat{\jmath}_{r} }{dt} + a_{z} \frac{d \hat{k}_{r} }{dt}$

But, we can write (see above) that

$\frac{d\hat{\imath}_{r} }{dt} = \vec{\omega} \times \hat{\imath}_{r}, \; \; \frac{d\hat{\jmath}_{r} }{dt} = \vec{\omega} \times \hat{\jmath}_{r}, \; \; \frac{d\hat{k}_{r} }{dt} = \vec{\omega} \times \hat{k}_{r}$

and so

$\left( \frac{d \vec{a} }{dt} \right)_{i} = \frac{ da_{x} }{dt}\hat{\imath}_{r} + \frac{ da_{y} }{dt}\hat{\jmath}_{r} + \frac{ da_{z} }{dt}\hat{k}_{r} + a_{x} \vec{\omega} \times \hat{\imath}_{r} + a_{y} \vec{\omega} \times \hat{\jmath}_{r} + a_{z} \vec{\omega} \times \hat{k}_{r}$

$\left( \frac{d \vec{a} }{dt} \right)_{i} = \frac{ da_{x} }{dt}\hat{\imath}_{r} + \frac{ da_{y} }{dt}\hat{\jmath}_{r} + \frac{ da_{z} }{dt}\hat{k}_{r} + \vec{\omega} \times \vec{a}$

$\boxed{ \left( \frac{d \vec{a} }{dt} \right)_{i} = \left( \frac{d \vec{a} }{dt} \right)_{r} + (\vec{\omega} \times \vec{a}) }$
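This boxed relation can be checked numerically. The sketch below is my own construction, not part of the derivation: it rotates a vector whose components are fixed in the rotating frame, differentiates it numerically in the inertial frame, and compares the result with $\vec{\omega} \times \vec{a}$:

```python
import math

def cross(u, v):
    """Vector product of two 3-vectors (as tuples)."""
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def rotate_z(v, angle):
    """Rotate a vector about the k-hat (z) axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return (c*v[0] - s*v[1], s*v[0] + c*v[1], v[2])

omega = (0.0, 0.0, 2.0)      # angular velocity along k-hat, rad/s
a_body = (1.0, 0.5, 0.3)     # components of a, fixed in the rotating frame

# Inertial-frame time derivative of a(t) by central differences...
t, dt = 0.7, 1e-6
a_plus = rotate_z(a_body, omega[2] * (t + dt))
a_minus = rotate_z(a_body, omega[2] * (t - dt))
numeric = tuple((p - m) / (2.0 * dt) for p, m in zip(a_plus, a_minus))

# ...compared with the boxed relation: (da/dt)_r = 0 here, so the
# inertial-frame derivative should equal omega x a
analytic = cross(omega, rotate_z(a_body, omega[2] * t))
assert all(abs(n - q) < 1e-6 for n, q in zip(numeric, analytic))
```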

## A fixed point on the Earth’s surface

Let us now consider the point $\vec{a} = \vec{r}$, where $\vec{r}$ is a fixed point on the Earth’s surface. We can write

$\left( \frac{d \vec{r} }{dt} \right)_{i} = \left( \frac{d \vec{r} }{dt} \right)_{r} + (\vec{\omega} \times \vec{r})$

But, in the rotating frame of reference this point does not change with time, so

$\left( \frac{d \vec{r} }{dt} \right)_{r} = 0$

and so

$\left( \frac{d \vec{r} }{dt} \right)_{i} = (\vec{\omega} \times \vec{r}), \; \text{ which has magnitude } \; \omega r \sin(\theta)$

where $\theta$ is the angle between the Earth’s rotation axis and the position vector of the point, i.e. the co-latitude (so $\theta = 90^{\circ} - \text{latitude}$).

Let us now calculate the acceleration in an inertial frame in terms of acceleration in a rotating frame. Writing $\vec{a}$ as $\vec{r}$ as above, we now have

$\left( \frac{d \vec{r} }{dt} \right)_{i} = \left( \frac{d \vec{r} }{dt} \right)_{r} + (\vec{\omega} \times \vec{r})$

To make things easier to write, we will re-write

$\left( \frac{d \vec{r} }{dt} \right)_{i} = \frac{d \vec{r}_{i} }{dt} \text{ and } \left( \frac{d \vec{r} }{dt} \right)_{r} = \frac{d \vec{r}_{r} }{dt}$

so

$\frac{d \vec{r}_{i} }{dt} = \frac{d \vec{r}_{r} }{dt} + (\vec{\omega} \times \vec{r})$

$\vec{v}_{i} = \vec{v}_{r} + (\vec{\omega} \times \vec{r})$

If we now differentiate $\vec{v}_{i}$ with respect to time, we will have the acceleration in the inertial frame

$\left( \frac{d \vec{v}_{i} }{dt} \right)_{i} = \left( \frac{d \vec{v}_{i} }{dt} \right)_{r} + (\vec{\omega} \times \vec{v}_{i})$

But, $\vec{v}_{i} = \vec{v}_{r} + (\vec{\omega} \times \vec{r})$

so

$\left( \frac{d \vec{v}_{i} }{dt} \right)_{i} = \frac{d}{dt}(\vec{v}_{r} + \vec{\omega} \times \vec{r})_{r} + \vec{\omega} \times (\vec{v}_{r} + \vec{\omega} \times \vec{r})$

Expanding this out, and taking the angular velocity $\vec{\omega}$ to be constant, we get

$\left( \frac{d \vec{v}_{i} }{dt} \right)_{i} = \left( \frac{d \vec{v}_{r} }{dt} \right)_{r} + \frac{d}{dt}(\vec{\omega} \times \vec{r}_{r}) + \vec{\omega} \times \vec{v}_{r} + \vec{\omega} \times (\vec{\omega} \times \vec{r}_{r})$

Because $\vec{\omega}$ is constant, the second term is $\vec{\omega} \times (d\vec{r}_{r}/dt)_{r} = \vec{\omega} \times \vec{v}_{r}$, the same as the third term, so

$\vec{a}_{i} = \vec{a}_{r} + 2\vec{\omega} \times \vec{v}_{r} + \vec{\omega} \times (\vec{\omega} \times \vec{r}_{r})$

Multiplying the acceleration by the mass $m$ to get a force gives

$m\vec{a}_{i} = m\vec{a}_{r} + 2m\vec{\omega} \times \vec{v}_{r} + m\vec{\omega} \times (\vec{\omega} \times \vec{r}_{r})$

So, writing the force in the rotating frame in terms of the force in the inertial frame, we have

$\boxed{ m\vec{a}_{r} = m\vec{a}_{i} - 2m\vec{\omega} \times \vec{v}_{r} - m\vec{\omega} \times (\vec{\omega} \times \vec{r}_{r}) }$

So,

$\boxed{\vec{F}_{r} = \vec{F}_{i} - 2m\vec{\omega} \times \vec{v}_{r} - m\vec{\omega} \times (\vec{\omega} \times \vec{r}_{r}) }$

Notice that there are two extra terms (Term A and Term B) on the right-hand side of this equation; I have highlighted them below.

If we compare the force in a rotating frame to an inertial frame, two extra terms (Term A and Term B) arise. Term A is the Coriolis force, Term B is the centrifugal force

Term A is what we call the Coriolis force, which depends on the velocity in the rotating frame $\vec{v}_{r}$. It is popularly said to make water going down a plughole rotate anti-clockwise in the northern hemisphere and clockwise in the southern hemisphere, although in practice the effect on something as small as a plughole is negligible. It is, however, the force which determines the direction of rotation of low-pressure systems in the atmosphere. I will discuss the Coriolis force more in a future blog.

Term B is the centrifugal force, the force we were aiming to derive in this blogpost. The strength of the centrifugal force depends on the position of the object in the rotating frame – $r_{r}$.
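As an illustration, we can evaluate both extra terms for a point on the Earth's surface; the latitude, eastward speed and helper function below are my own illustrative choices, not values from the text:

```python
import math

def cross(u, v):
    """Vector (cross) product of two 3-vectors."""
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

omega_mag = 2 * math.pi / 86400          # Earth's angular speed, rad/s
omega = (0.0, 0.0, omega_mag)            # along the rotation (z) axis
R, lat = 6.3781e6, math.radians(52)      # radius (m) and an example latitude of 52 N
r = (R * math.cos(lat), 0.0, R * math.sin(lat))   # position on the surface
v_r = (0.0, 10.0, 0.0)                   # 10 m/s eastwards in the rotating frame

coriolis = tuple(-2 * x for x in cross(omega, v_r))              # Term A: -2 omega x v_r
centrifugal = tuple(-x for x in cross(omega, cross(omega, r)))   # Term B: -omega x (omega x r)

# The centrifugal term is perpendicular to the axis and points away from it,
# with magnitude omega^2 * (R cos lat), the distance from the axis
assert abs(centrifugal[0] - omega_mag**2 * R * math.cos(lat)) < 1e-10
assert centrifugal[1] == 0.0 and centrifugal[2] == 0.0
```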

## What is the direction of the centrifugal force?

The direction of $(\vec{\omega} \times \vec{r}_{r})$ can be found using the right-hand rule for the vector product, which I blogged about here. Remembering that the direction of $\vec{r}_{r}$ is radially outwards from the centre of the Earth, and the direction of $\vec{\omega}$ is along the Earth’s axis (pointing north), the direction of $\vec{\omega} \times \vec{r}_{r}$ is towards the east (to the right if looking at the Earth with the North pole up).

We now need to take the vector product of $\vec{\omega}$ with this eastwards vector. Again using the right-hand rule, the direction of $\vec{\omega} \times (\vec{\omega} \times \vec{r}_{r})$ is inwards, towards the Earth’s axis (not towards the centre of the Earth, but at right angles to the axis). But, notice that the centrifugal term has a minus sign in front of it, so the direction of the centrifugal force is outwards, away from and at right angles to the Earth’s axis.

The direction of the centrifugal force is away from the axis of rotation, as shown in this diagram

This means that it acts to reduce the force of gravity which keeps us on the Earth’s surface. Its strength depends on the angle between where you are and the Earth’s axis, so it is greatest at the equator and goes to zero at the poles. It means that you weigh slightly less than you would if the Earth were not rotating, but the effect is quite small, and you would not notice the difference in going from the pole to the equator.

## What is the strength of the centrifugal acceleration due to Earth’s rotation?

Let us calculate the centrifugal force at the Earth’s equator, where it is at its greatest.

At the equator, we can write that the centrifugal acceleration has a value of

$\omega^{2} r, \; \text{ since } \theta = 90^{\circ}$

We can calculate $\omega$ for the Earth by remembering that it takes 24 hours to rotate once, and $\omega$ is related to the period $T$ of rotation via

$\omega = \frac{2 \pi }{ T}$

We need to convert the period $T$ to seconds, so $T = 24 \times 60 \times 60 = 86400 \; s$. This gives that

$\omega = 7.272 \times 10^{-5} \text{ rad/s }$

If we take the Earth’s radius to be 6,378.1 km (this is the radius at the equator), then we have that

$\omega^{2} r = 0.0337 \text{ m/s/s}$

Compare this to the acceleration due to gravity which pulls us towards the Earth’s surface, 9.81 m/s/s: the centrifugal acceleration at its greatest is only $0.34 \%$ of the acceleration due to gravity. Tiny.
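The arithmetic above is easy to reproduce; here is a short Python sketch of the same numbers:

```python
import math

T = 24 * 60 * 60                 # rotation period in seconds (86400 s)
omega = 2 * math.pi / T          # angular velocity, rad/s
r = 6.3781e6                     # equatorial radius of the Earth, m
g = 9.81                         # acceleration due to gravity, m/s^2

a_centrifugal = omega**2 * r     # centrifugal acceleration at the equator

print(round(omega, 9))                    # 7.2722e-05
print(round(a_centrifugal, 4))            # 0.0337
print(round(100 * a_centrifugal / g, 2))  # 0.34
```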

It is, however, noticeable when you are on a roundabout, and is used on fairground rides where you spin inside a drum and the floor moves away leaving you pinned to the wall of the drum. The force you feel pushing against this wall is the centrifugal force, and it is very real for you in that rotating frame!

So, there we have it, centrifugal force does exist in a rotating frame of reference, but does not exist from the perspective of someone in an inertial frame of reference.

## Einstein’s general relativity centenary

There has been quite a bit of mention in the media this last week or so that it is 100 years since Albert Einstein published his ground-breaking theory of gravity – the general theory of relativity. Yet, there seems to be some confusion as to when this theory was first published, in some places you will see 1915, in others 1916. So, I thought I would try and clear up this confusion by explaining why both dates appear.

Albert Einstein in Berlin circa 1915/16 when his General Theory of Relativity was first published

## From equivalence to the field equations

Everyone knew that Einstein was working on a new theory of gravity. As I blogged about here, he had his insight into the equivalence between acceleration and gravity in 1907, and ever since then he had been developing his ideas to create a new theory of gravity.

He had come up with his principle of equivalence when he was asked in the autumn of 1907 to write a review article of his special theory of relativity (his 1905 theory) for the Jahrbuch der Radioaktivität und Elektronik (the Yearbook of Radioactivity and Electronics). That paper appeared in 1908 as Über das Relativitätsprinzip und die aus demselben gezogenen Folgerungen (On the Relativity Principle and the Conclusions Drawn from It) (Jahrbuch der Radioaktivität und Elektronik, 4, 411–462).

In 1908 he got his first academic appointment, and did not return to thinking about a generalisation of special relativity until 1911. In 1911 he published a paper, Über den Einfluss der Schwerkraft auf die Ausbreitung des Lichtes (On the Influence of Gravitation on the Propagation of Light) (Annalen der Physik (ser. 4), 35, 898–908), in which he calculated for the first time the deflection of light produced by massive bodies. But he also realised that, to properly develop his ideas into a new theory of gravity, he would need to learn some mathematics which was new to him. In 1912, he moved to Zurich to work at the ETH, his alma mater. He asked his friend Marcel Grossmann to help him learn this new mathematics, saying “You’ve got to help me or I’ll go crazy.”

Grossmann gave Einstein a book on non-Euclidean geometry. Euclidean geometry, the geometry of flat surfaces, is the geometry we learn in school. The geometry of curved surfaces had first been developed in the 1820s by the German mathematician Carl Friedrich Gauss. In the 1850s another German mathematician, Bernhard Riemann, developed this geometry of curved surfaces even further, which is why it is now known as Riemannian geometry; it was a textbook on this geometry which Grossmann gave to Einstein in 1912. Mastering this new mathematics proved very difficult for Einstein, but he knew that he needed it to be able to develop the equations of general relativity.

These equations were not ready until late 1915. Everyone knew Einstein was working on them, and in fact he was offered and accepted a job in Berlin in 1914 as Berlin wanted him on their staff when the new theory was published. The equations of general relativity were first presented on the 25th of November 1915, to the Prussian Academy of Sciences. The lecture Feldgleichungen der Gravitation (The Field Equations of Gravitation) was the fourth and last lecture that Einstein gave to the Prussian Academy on his new theory (Preussische Akademie der Wissenschaften, Sitzungsberichte, 1915 (part 2), 844–847), the previous three lectures, given on the 4th, 11th and 18th of November, had been leading up to this. But, in fact, Einstein did not have the field equations ready until the last few days before the fourth lecture!

The peer-reviewed paper of the theory (which also contains the field equations) did not appear until 1916, in volume 49 of Annalen der Physik: Die Grundlage der allgemeinen Relativitätstheorie (The Foundation of the General Theory of Relativity), Annalen der Physik (ser. 4), 49, 769–822. The paper was submitted by Einstein on the 20th of March 1916.

The beginning of Einstein’s first peer-reviewed paper on general relativity, which was received by Annalen der Physik on the 20th of March 1916

In a future blog, I will discuss Einstein’s field equations, but hopefully I have cleared up the confusion as to why some people refer to 1915 as the year of publication of the General Theory of Relativity, and some people choose 1916. Both are correct, which allows us to celebrate the centenary twice!

You can read more about Einstein’s development of the general theory of relativity in our book 10 Physicists Who Transformed Our Understanding of Reality. Order your copy here

## Derivation of the moment of inertia of an annulus

Following on from my derivation of the moment of inertia of a disk, in this blog I will derive the moment of inertia of an annulus. By an annulus, I mean a disk which has the inner part missing, as shown below.

An annulus is a disk of small thickness $t$ with the inner part missing. The annulus goes from some inner radius $r_{1}$ to an outer radius $r$.

To derive its moment of inertia, we return to our definition of the moment of inertia, which for a volume element $dV$ is given by

$dI = r_{\perp}^{2} dm$

where $dm$ is the mass of the volume element $dV$. We are going to initially consider the moment of inertia about the z-axis, and so for this annulus it will be

$I_{zz} = \int _{r_{1}} ^{r} r_{\perp}^{2} dm$

where $r_{1}$ and $r$ are the inner radius and outer radius of the annulus respectively. As with the disk, the mass $dm$ of the volume element $dV$ is related to its volume and density via

$dm = \rho dV$

(assuming that the annulus has a uniform density). The volume element $dV$ can be found, as before, by considering a ring at a radius of $r$ with a width $dr$ and a thickness $t$. The volume of this will be

$dV = (2 \pi r dr) t$

and so we can write the mass $dm$ as

$dm = (2 \pi \rho t)rdr$

Thus we can write the moment of inertia $I_{zz}$ as

$I_{zz} = \int _{r_{1}} ^{r} r_{\perp}^{2} dm = 2 \pi \rho t \int _{r_{1}} ^{r} r_{\perp}^{3} dr$

Integrating this between $r_{1}$ and $r$ we get

$I_{zz} = 2 \pi \rho t [ \frac{ r^{4} - r_{1}^{4} }{4} ] = \frac{1}{2} \pi \rho t (r^4 - r_{1}^{4}) \text{ (Equ. 1)}$

But, we can re-write $(r^{4} - r_{1}^{4})$ as $(r^{2} + r_{1}^{2})(r^{2} - r_{1}^{2})$ (remember that $x^{2} - y^{2}$ can be written as $(x+y)(x-y)$). So, we can write Equ. (1) as

$I_{zz} = \frac{1}{2} \pi \rho t (r^{2} + r_{1}^{2})(r^{2} - r_{1}^{2}) \text{ (Equ. 2)}$

The total mass $M_{a}$ of the annulus can be found by considering the total mass of a disk of radius $r$ (which we will call $M_{2}$) and then subtracting the mass of the inner part, a disk of radius $r_{1}$ (which we will call $M_{1}$). The mass of a disk is just its density multiplied by its area multiplied by its thickness.

$M_{2} = \pi \rho t r^{2} \text{ and } M_{1} = \pi \rho t r_{1}^{2}$

so the mass $M_{a}$ of the annulus is

$M_{a} = M_{2} - M_{1} = \pi \rho t r^{2}- \pi \rho t r_{1}^{2} = \pi \rho t (r^{2} - r_{1}^{2})$

Substituting this expression for $M_{a}$ into equation (2) above, we can write that the moment of inertia for an annulus, which goes from an inner radius of $r_{1}$ to an outer radius of $r$, about the z-axis is

$\boxed{ I_{zz} = \frac{1}{2} M_{a} (r^{2} + r_{1}^{2}) }$
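If you would like to check this boxed result for yourself, here is a short Python sketch which sums the contributions of thin rings numerically and compares the answer with the formula. The values chosen for $\rho$, $t$, $r_{1}$ and $r$ are purely illustrative (they do not come from this post):

```python
import math

# Illustrative values only: density (kg/m^3), thickness (m), inner/outer radii (m)
rho, t = 7800.0, 0.01
r1, r = 0.05, 0.20

# Sum dI = r'^2 dm over thin rings, with dm = rho * (2*pi*r' dr') * t
n = 100_000
dr = (r - r1) / n
I_num = sum((r1 + (i + 0.5) * dr) ** 3 * 2 * math.pi * rho * t * dr
            for i in range(n))

# The boxed result: I_zz = (1/2) * M_a * (r^2 + r1^2)
M_a = math.pi * rho * t * (r ** 2 - r1 ** 2)
I_formula = 0.5 * M_a * (r ** 2 + r1 ** 2)

print(I_num, I_formula)  # the two values agree to many significant figures
```

The midpoint-rule sum converges quickly here because the integrand $r^{3}$ is smooth; with 100,000 steps the numerical and analytic answers agree to better than one part in a million.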

## Comparison to the moment of inertia of a disk

As we saw in this blog, the moment of inertia of a disk is $I_{zz} = \frac{1}{2} Mr^{2}$. It may therefore seem, at first sight, that the moment of inertia of an annulus is always more than that of a disk. This is true if they have the same mass, but if they have the same thickness and density the mass of an annulus will be less, and so its moment of inertia can be smaller.

Let us compare the moment of inertia of a disk and an annulus for the following four cases.

1. The same density and thickness, $r_{1} = 0.5 r$
2. The same density and thickness, $r_{1} = 0.9 r$
3. The same mass, $r_{1} = 0.5 r$
4. The same mass, $r_{1} = 0.9 r$

## The same density and thickness, $r_{1}=0.5r$

We are first going to compare the moment of inertia of a disk of mass $M$ with that of an annulus which goes from half the radius of the disk to the radius of the disk (i.e. the inner radius of the annulus is $r_{1} = 0.5 r$).

For the disk, its mass will be

$M = \rho t (\pi r^{2}) = \pi \rho t r^{2}$

The mass of the annulus, $M_{a}$, will be this mass less the mass of the missing part $M_{1}$, so

$M_{a} = M - M_{1} = M - \pi \rho t (r_{1})^{2} = \pi \rho t (r^{2} - (0.5r)^{2})= \pi \rho t (1-0.25)r^{2}$

$M_{a} = \pi \rho t (0.75)r^{2} = 0.75 M$

The moment of inertia of the disk will be

$I_{d} = \frac{1}{2} M r^{2}$
The moment of inertia of the annulus will be

$I_{a} = \frac{1}{2} M_{a} (r^{2} + r_{1}^{2}) = \frac{1}{2} (0.75M)(r^{2} + (0.5r)^{2}) = \frac{1}{2} (0.75M)(1.25r^{2}) = \frac{1}{2} (0.9375) M r^{2}$

So, for this case, $I_{a} = 0.9375 I_{d}$, i.e. slightly less than the disk.

## The same density and thickness, $r_{1}=0.9r$

Let us now consider the second case, with an annulus of the same density and thickness as the disk, and its inner radius being 90% of the outer radius, $r_{1} = 0.9r$. Now, the mass of the missing part of the disk, $M_{1}$ will be

$M_{1} = \rho t (\pi r_{1}^{2}) = \rho t \pi (0.9r)^{2} = 0.81 \rho t \pi r^{2} = 0.81M$

which means that the mass of the annulus, $M_{a}$ is

$M_{a} = M - M_{1} = M-0.81M=0.19M$

The moment of inertia of the annulus will then be

$I_{a} = \frac{1}{2}M_{a}(r^{2}+r_{1}^{2}) = \frac{1}{2}(0.19M)(r^{2}+(0.9r)^{2})=\frac{1}{2}(0.19M)(1.81r^{2}) = \frac{1}{2}(0.3439)Mr^{2}$

and so in this case

$I_{a} = 0.3439 I_{d}$

which is much less than the moment of inertia of the disk.

## The same mass, $r_{1}=0.5r$

In this third case, the mass of the annulus is the same as the mass of the disk, and its inner radius is 50% of the radius of the disk. This would, of course, require the annulus to either have a greater density than the disk, or to be thicker (or both). So, $M_{a} = M$. The moment of inertia of the annulus will be

$I_{a} = \frac{1}{2} M(r^{2} + r_{1}^{2}) = \frac{1}{2} M(r^{2} + (0.5r)^{2}) = \frac{1}{2} M(r^{2} + 0.25r^{2}) = \frac{1}{2} M(1.25)r^{2}$

$I_{a}= 1.25 I_{d}$

## The same mass, $r_{1}=0.9r$

The last case we will consider is an annulus with its inner radius being 90% of the outer radius, but its mass the same. So, $M_{a} = M$. The moment of inertia of the annulus will be

$I_{a} = \frac{1}{2} M(r^{2} + r_{1}^{2}) = \frac{1}{2} M(r^{2} + (0.9r)^{2}) = \frac{1}{2} M(r^{2} + 0.81r^{2}) = \frac{1}{2} M(1.81)r^{2}$

$I_{a}= 1.81 I_{d}$

## Summary

To summarise, we have

1. The same density and thickness, $r_{1} = 0.5 r: \; \; I_{a}=0.9375 I_{d}$
2. The same density and thickness, $r_{1} = 0.9 r: \; \; I_{a}=0.3439 I_{d}$
3. The same mass, $r_{1} = 0.5 r: \; \; I_{a}=1.25 I_{d}$
4. The same mass, $r_{1} = 0.9 r: \; \; I_{a}=1.81 I_{d}$
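These four ratios are easy to verify with a few lines of Python. Here $f = r_{1}/r$ is just a convenient shorthand I am introducing for this sketch; note that $0.19 \times 1.81 = 0.3439$ exactly:

```python
def ratio_same_density(f):
    # Annulus with the same density and thickness as the disk (f = r1/r):
    # M_a = (1 - f^2) M, so I_a / I_d = (1 - f^2)(1 + f^2)
    return (1 - f ** 2) * (1 + f ** 2)

def ratio_same_mass(f):
    # Annulus with the same mass M as the disk: I_a / I_d = 1 + f^2
    return 1 + f ** 2

print(ratio_same_density(0.5))  # 0.9375
print(ratio_same_density(0.9))  # 0.3439 (to floating-point accuracy)
print(ratio_same_mass(0.5))     # 1.25
print(ratio_same_mass(0.9))     # 1.81 (to floating-point accuracy)
```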

So, as these calculations show, if keeping the mass of a flywheel down is important, then a larger moment of inertia will be achieved by concentrating most of that mass in the outer parts of the flywheel, as this photograph below shows.

If keeping mass down is important, a flywheel’s moment of inertia can be increased by concentrating most of the mass in its outer parts

In the next blogpost in this series I will calculate the moment of inertia of a solid sphere.

## Derivation of the moment of inertia of a disk

In physics, the rotational equivalent of mass is something called the moment of inertia. The definition of the moment of inertia of a volume element $dV$ which has a mass $dm$ is given by

$dI = r_{\perp}^{2} dm$

where $r_{\perp}$ is the perpendicular distance from the axis of rotation to the volume element. To find the total moment of inertia of an object, we need to sum the moment of inertia of all the volume elements in the object over all values of distance from the axis of rotation. Normally we consider the moment of inertia about the vertical (z-axis), and we tend to denote this by $I_{zz}$. We can write

$I_{zz} = \int _{r_{1}} ^{r_{2}} r_{\perp}^{2} dm$

The moments of inertia about the other two cardinal axes are denoted by $I_{xx}$ and $I_{yy}$, but we can consider the moment of inertia about any convenient axis.

## Derivation of the moment of inertia of a disk

In this blog, I will derive the moment of inertia of a disk. In upcoming blogs I will derive other moments of inertia, e.g. for an annulus, a solid sphere, a spherical shell and a hollow sphere with a very thin shell.

For our purposes, a disk is a solid circle with a small thickness $t$ ($t \ll r$, small in comparison to the radius of the disk). If it has a thickness which is comparable to its radius, it becomes a cylinder, which we will discuss in a future blog. So, our disk looks something like this.

A disk of small thickness $t$, with a radius of $r$

To calculate the moment of inertia of this disk about the z-axis, we sum the moment of inertia of a volume element $dV$ from the centre (where $r=0$) to the outer radius $r$.

$I_{zz} = \int_{r=0} ^{r=r} r_{\perp} ^{2} dm \text{ (Equ. 1)}$

The mass element $dm$ is related to the volume element $dV$ via the equation
$dm = \rho dV$, where $\rho$ is the density of the volume element. We will assume in this example that the density of the disk is uniform; in principle, if we knew its dependence on $r$, say $\rho(r) = f(r)$, this would not be a problem.

The volume element $dV$ can be calculated by considering a ring at a radius $r$ with a width $dr$ and a thickness $t$. The volume of this ring is just the ring’s circumference multiplied by its width multiplied by its thickness.

$dV = (2 \pi r dr) t$

so we can write

$dm = \rho (2 \pi r dr) t$

and hence we can write equation (1) as

$I_{zz} = \int_{r=0} ^{r=r} r_{\perp} ^{2} \rho (2 \pi r dr) t = 2 \pi \rho t \int_{r=0} ^{r=r} r_{\perp} ^{3} dr$

Integrating between a radius of $r=0$ and $r$, we get

$I_{zz} = 2 \pi \rho t [ \frac{ r^{4} }{ 4 } -0 ] = \frac{1}{2} \pi \rho t r^{4} \text{ (Equ. 2)}$

We now define the total mass of the disk as $M$, where

$M = \rho V$

and $V$ is the total volume of the disk. The total volume of the disk is just its area multiplied by its thickness,

$V = \pi r^{2} t$

and so the total mass is

$M = \rho \pi r^{2} t$

Using this, we can re-write equation (2) as

$\boxed{ I_{zz} = \frac{1}{2} \pi \rho t r^{4} = \frac{1}{2} Mr^{2} }$
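As a quick sanity check on this boxed result, here is a short Python sketch which sums the ring contributions numerically; the values chosen for $\rho$, $t$ and $r$ are purely illustrative:

```python
import math

rho, t, r = 2700.0, 0.005, 0.1  # illustrative density (kg/m^3), thickness (m), radius (m)

# Sum dI = r'^2 dm over thin rings from the centre out to r,
# with dm = rho * (2*pi*r' dr') * t
n = 100_000
dr = r / n
I_num = sum(((i + 0.5) * dr) ** 3 * 2 * math.pi * rho * t * dr for i in range(n))

M = rho * math.pi * r ** 2 * t  # total mass of the disk
print(I_num, 0.5 * M * r ** 2)  # the two values agree to many significant figures
```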

## What are the moments of inertia about the x and y-axes?

To find the moment of inertia about the x or the y-axis we use the perpendicular axis theorem. This states that, for an object which lies within a plane, the moment of inertia about the axis perpendicular to that plane is equal to the sum of the moments of inertia about two mutually perpendicular axes lying in the plane, i.e.

$I_{zz} = I_{xx} + I_{yy}$

where $I_{xx}$ and $I_{yy}$ are the two moments of inertia in the plane and perpendicular to each other.

We can see from the symmetry of the disk that the moment of inertia about the x and y-axes will be the same, so $I_{zz} = 2I_{xx}$. Therefore we can write

$\boxed{ I_{xx} = I_{yy} = \frac{1}{2}I_{zz} = \frac{1}{4} Mr^{2} }$

## Flywheels

Flywheels are used to store rotational energy. This is useful when the supply of energy is intermittent, as a flywheel can smooth it out and deliver energy continuously. They are used in many types of motors, including those in modern cars.

It is because of a disk’s moment of inertia that it can store rotational energy in this way. Just as with mass in the linear case, it requires a torque to change the rotational speed (angular velocity) of an object; the larger the moment of inertia, the larger the torque required to change its angular velocity. As we can see from the equation for the moment of inertia of a disk above, of two flywheels with the same mass spinning at the same rate, a thinner, larger one will store more energy than a thicker, smaller one, because the moment of inertia increases as the square of the radius of the disk.
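To put a number on this, here is a small Python sketch comparing two uniform-disk flywheels of the same mass spinning at the same angular velocity (all the values are made up for illustration), using the standard result that the rotational kinetic energy is $E = \frac{1}{2} I \omega^{2}$:

```python
M = 10.0        # the same mass for both flywheels (kg) -- illustrative
omega = 100.0   # the same angular velocity for both (rad/s) -- illustrative

def disk_energy(M, r, omega):
    I = 0.5 * M * r ** 2          # moment of inertia of a uniform disk
    return 0.5 * I * omega ** 2   # rotational kinetic energy

E_small = disk_energy(M, 0.1, omega)  # thicker, smaller-radius flywheel
E_large = disk_energy(M, 0.2, omega)  # thinner, larger-radius flywheel

print(E_large / E_small)  # 4.0 -- doubling the radius quadruples the stored energy
```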

Sometimes mass is a critical factor, and next time I will consider the case of an annulus, where the inner part of the disk is removed.

## Harmonic Oscillators

Today I was planning to post the fourth and final part of my series of blogs about the derivation of Planck’s radiation law. But, I realised on Sunday that it would not be ready, so I’m postponing it until next Thursday (the 17th). (Parts 1, 2 and 3 are here, here and here respectively.)

One of the reasons for this is that my time is being consumed by writing articles for 30-second Einstein which I talked about on Tuesday. Another reason is that I am scrambling to finish a slew of things by the 15th (next Tuesday!!), as I have been asked to go on another cruise to give astronomy lectures. More about that next week 🙂

There is a third reason, I have realised that I have not yet done a blog about harmonic oscillators, which is a necessary part of understanding Planck’s derivation. So, that is the subject of today’s blogpost.

‘Harmonic oscillator’ is another term for something which is exhibiting simple harmonic motion (SHM), and I did blog here about how a pendulum exhibits SHM, and how this relates to circular motion. Another example of SHM is a spring oscillating back and forth. Whether the spring is vertical or horizontal, if it is displaced from its equilibrium position it will exhibit SHM. So, a spring is a harmonic oscillator.

## The frequency of a harmonic oscillator

The restoring force on a spring when it is displaced from its equilibrium position is given by Hooke’s law, which states

$\vec{F} = - k \vec{x}$

where $\vec{F} \text{ and } \vec{x}$ are the force and displacement respectively (both vector quantities), and the minus sign tells us that the force acts in the opposite direction to the displacement; that is, it is a restoring force directed back towards the equilibrium position. The constant $k$ is known as the spring constant (or Hooke’s constant), and is a measure of the stiffness of the spring.

Because we can also write the force in terms of mass and acceleration (Newton’s 2nd law of motion), and acceleration is the second derivative of displacement with respect to time, we can write

$m \vec{a} = m \frac{ d^{2}\vec{x} }{ dt^{2} }= - k \vec{x}$

If we divide by $m$ we get an expression for the acceleration, which is

$\boxed{ \vec{a} = - \frac{k}{m} \vec{x} }$

which, if you compare it to the equation of SHM for a pendulum, has the same form. The usual way to write the equation of SHM is

$\vec{a} = - \omega^{2} \vec{x}$

where $\omega$ is the angular frequency and, as I mentioned in the blog I did on the pendulum, $\omega$ is related to the period of the SHM via $T = 2 \pi / \omega$.

For our derivation of Planck’s radiation law, the part we need is the frequency of the harmonic oscillator. Comparing $\vec{a} = - (k/m) \vec{x}$ with $\vec{a} = - \omega^{2} \vec{x}$ gives $\omega^{2} = k/m$, so $\omega = \sqrt{k/m}$. The frequency $\nu$ is given by $\nu = \omega / 2 \pi$, and so $\boxed{ \nu = \frac{1}{2 \pi} \sqrt{ \frac{k}{m} } }$. The frequency of the spring’s oscillations therefore depends only on $k/m$: a stiffer spring oscillates with a higher frequency, and more mass (in either the spring or what is attached to it) will reduce the frequency of the oscillations.

## The energy of a harmonic oscillator

The other thing we need to know to understand Planck’s derivation of his blackbody radiation law is the energy of the harmonic oscillator. The total energy is constant, but is divided between kinetic energy and potential energy. The kinetic energy is at a maximum when the spring is at its equilibrium position; at that moment the potential energy is zero.

The velocity of a harmonic oscillator $v$ can be found by differentiating the displacement $x$ with respect to time. The expression for the displacement (see my blog here on SHM in a pendulum) is

$x(t) = A \sin ( \omega t )$

where $A$ is the maximum displacement (amplitude) of the oscillations. So

$v = \frac{dx}{dt} = A \omega \cos ( \omega t)$

This will be a maximum when $\cos( \omega t ) = 1$, and so

$v_{max} = A \omega$

which means that the maximum kinetic energy, and hence the total energy of the harmonic oscillator is given by

$\text{Total energy} = E = \frac{1}{2}mv_{max}^{2} = \frac{1}{2} m A^{2} \omega^{2}$

As the frequency $\nu$ is just $\omega / 2 \pi$, this tells us that, for a harmonic oscillator of a given mass, the energy depends on both the square of the frequency $\nu$ and the square of the amplitude of the oscillations (larger oscillations mean more energy: double the amplitude and the energy goes up by a factor of four). Mathematically, we can write $\boxed{ E \propto A^{2} }$ and $\boxed{ E \propto \nu^{2} }$.
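As a quick check of the total-energy expression (and of the energy remaining constant through the oscillation), here is a short Python sketch. The values of $m$, $k$ and $A$ are arbitrary illustrative choices, and I use the standard result that the potential energy stored in a stretched spring is $\frac{1}{2}kx^{2}$, which I have not derived in this post:

```python
import math

m, k, A = 0.5, 200.0, 0.03   # mass (kg), spring constant (N/m), amplitude (m)

omega = math.sqrt(k / m)     # angular frequency, from omega^2 = k/m
nu = omega / (2 * math.pi)   # frequency in Hz

# Total mechanical energy (kinetic + potential) at several instants:
energies = []
for i in range(8):
    t = 0.01 * i
    x = A * math.sin(omega * t)            # displacement
    v = A * omega * math.cos(omega * t)    # velocity
    energies.append(0.5 * m * v ** 2 + 0.5 * k * x ** 2)

print(nu)                             # about 3.18 Hz for these values
print(min(energies), max(energies))   # constant throughout the oscillation
print(0.5 * m * A ** 2 * omega ** 2)  # the total energy, (1/2) m A^2 omega^2
```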

As we shall see, the theoretical explanation which Planck concocted to explain his blackbody curve involved assuming that the walls of the cavity producing the radiation contained oscillators in resonance with the radiation; this is why I needed to derive these results in today’s blogpost.

## Derivation of Planck’s radiation law – part 3

As I have outlined in parts 1 and 2 of this series (see here and here), in the 1890s, mainly through the work of the Physikalisch-Technische Reichsanstalt (PTR) in Germany, the exact shape of the blackbody spectrum began to be well determined. By mid-1900, with the last remaining observations in the infrared being completed, its shape from the UV through the visible and into the infrared was well determined for blackbodies with a wide range of temperatures.

I also described in part 2 that in 1896 Wilhelm Wien came up with a law, based on a thermodynamical argument, which almost explained the blackbody spectrum. The form of his equation (which we now know as Wien’s distribution law) is
$\boxed{ E_{ \lambda } d \lambda = \frac{ A }{ \lambda ^{5} } e^{ -a / \lambda T } d \lambda }$

Notice I said almost. Below I show two plots which I have done showing the Wien distribution law curve and the actual blackbody curve for a blackbody at a temperature of $T=4000 \; \text{Kelvin}$. As you can see, they are not an exact match: the Wien distribution law fails on the long-wavelength side of the peak of the blackbody curve.

Comparison of the Wien distribution law and the actual blackbody curve for a blackbody at a temperature of $T=4000 \; \text{Kelvin}$. Although they agree very well on the short wavelength side of the peak, the Wien law drops away too quickly on the long-wavelength side compared to the observed blackbody spectrum.

A zoomed-in view to highlight the difference between the Wien distribution law and the actual blackbody curve for a blackbody at a temperature of $T=4000 \; \text{Kelvin}$. Although they agree very well on the short wavelength side of the peak, the Wien law drops away too quickly on the long-wavelength side compared to the observed blackbody spectrum.

## Planck’s “act of desperation”

By October 1900 Max Planck had heard of the latest experimental results from the PTR which showed, beyond any doubt, that Wien’s distribution law did not fit the blackbody spectrum at longer wavelengths. Planck, along with Wien, was hoping that the results from earlier in the year were in error, but when new measurements by a different team at the PTR showed that Wien’s distribution law failed to match the observed curve in the infrared, Planck decided he would try and find a curve that would fit the data, irrespective of what physical explanation may lie behind the mathematics of the curve. In essence, he was prepared to try anything to get a fit.

Planck would later say of this work

Briefly summarised, what I did can be described as simply an act of desperation

What was this “act of desperation”, and why did Planck resort to it? Planck was 42 when he unwittingly started what would become the quantum revolution, and his act of desperation to fit the blackbody curve came after all other options seemed to be exhausted. Before I show the equation that he found to be a perfect fit to the data, let me say a little bit about Planck’s background.

## Who was Max Planck?

Max Karl Ernst Ludwig Planck was born in Kiel in 1858. At the time, Kiel was part of Danish Holstein. He was born into a religious family; both his paternal great-grandfather and grandfather had been distinguished theologians, and his father became professor of constitutional law at Munich University. So he came from a long line of men who venerated the laws of God and Man, and Planck himself very much followed in this tradition.

He attended the most renowned secondary school in Munich, the Maximilian Gymnasium, always finishing near the top of his class (but not quite top). He excelled through hard work and self discipline, although he may not have had quite the inherent natural ability of the few who finished above him. At 16 it was not the famous taverns of Munich which attracted him, but rather the opera houses and concert halls; he was always a serious person, even in his youth.

In 1874, aged 16, he enrolled at Munich University and decided to study physics. He spent three years studying at Munich, where he was told by one of his professors ‘it is hardly worth entering physics anymore’; at the time it was felt by many that there was nothing major left to discover in the subject.

In 1877 Planck moved from Munich to the top university in the German-speaking world – Berlin. The university had enticed Germany’s best-known physicist, Hermann von Helmholtz, from his position at Heidelberg to lead the creation of what would become the best physics department in the world. As part of creating this new utopia, Helmholtz demanded the building of a magnificent physics institute, and when Planck arrived in 1877 it was still being built. Gustav Kirchhoff, the first person to systematically study the nature of blackbody radiation in the 1850s, was also enticed from Heidelberg and made professor of theoretical physics.

Planck found both Helmholtz and Kirchhoff to be uninspiring lecturers, and was on the verge of losing interest in physics when he came across the work of Rudolf Clausius, a professor of physics at Bonn University. Clausius’ main research was in thermodynamics, and it was he who first formulated the concept of entropy, the idea that things naturally go from order to disorder and which, possibly more than any other idea in physics, gives an arrow to the direction of time.

Planck spent only one year in Berlin before he returned to Munich to work on his doctoral thesis, choosing to explore the concept of irreversibility, which was at the heart of Clausius’ idea of entropy. Planck found very little interest in his chosen topic from his professors in Berlin, and not even Clausius answered his letters. Planck would later say, ‘The effect of my dissertation on the physicists of those days was nil.’

Undeterred, as he began his academic career, thermodynamics and, in particular, the second law (the law of entropy) became the focus of his research. In 1880 Planck became a Privatdozent, an unpaid lecturer, at Munich University. He spent five years as a Privatdozent, and it looked like he was never going to get a paid academic position. But in 1885 Göttingen University announced that the subject of its prestigious essay competition was ‘The Nature of Energy’, right up Planck’s alley. As he was working on his essay for this competition, he was offered an Extraordinary (assistant) professorship at the University of Kiel.

Göttingen took two years to come to a decision about their 1885 essay competition, even though they had received only three entries. They decided that no-one should receive first prize, but Planck was awarded second prize. It later transpired that he was denied first prize because he had supported Helmholtz in a scientific dispute with a member of the Göttingen faculty. This brought him to the attention of Helmholtz, and in November 1888 Planck was asked by Helmholtz to succeed Kirchhoff as professor of theoretical physics in Berlin (he was chosen after Ludwig Boltzmann turned the position down).

And so Planck returned to Berlin in the spring of 1889, eleven years after he had spent a year there, but this time not as a graduate student but as an Extraordinary Professor. In 1892 Planck was promoted to Ordinary (full) Professor. In 1894 both Helmholtz and August Kundt, the head of the department, died within months of each other, leaving Planck, at just 36, as the most senior physicist in Germany’s foremost physics department.

Max Planck who, in 1900 at the age of 42, found a mathematical equation which fitted the entire blackbody spectrum correctly.

As part of his new position as the most senior physicist in the Berlin department, he took over the duties of being adviser for the foremost physics journal of the day – Annalen der Physik (the journal in which Einstein would publish in 1905). It was in this role of adviser that he became aware of the work being done at PTR on determining the true spectrum of a blackbody.

Planck regarded the search for a theoretical explanation of the blackbody spectrum as nothing less than the search for the absolute, and as he later stated

Since I had always regarded the search for the absolute as the loftiest goal of all scientific activity, I eagerly set to work

When Wien published his distribution law in 1896, Planck tried to put the law on a solid theoretical foundation by deriving it from first principles. By 1899 he thought he had succeeded, basing his argument on the second law of thermodynamics.

## Planck finds a curve which fits

But all of this fell apart when it was shown conclusively on the 2nd of February 1900, by Lummer and Pringsheim of the PTR, that Wien’s distribution law was wrong. Wien’s law failed at high temperatures and long wavelengths (the infrared); a replacement which would fit the experimental curve needed to be found. So, on Sunday the 7th of October, Planck set about trying to find a formula which would reproduce the observed blackbody curve.

He was not quite shooting in the dark: he had three pieces of information to help him. Firstly, Wien’s law worked for the intensity of radiation at short wavelengths. Secondly, it was in the infrared that Wien’s law broke down; at these longer wavelengths it was found that the intensity was directly proportional to the temperature. Thirdly, Wien’s displacement law, which gives the relationship between the wavelength of the peak of the curve and the blackbody’s temperature, worked for all observed blackbodies.

After working all night of the 7th of October 1900, Planck found an equation which fitted the observed data. He presented this work to the German Physical Society a few weeks later on Friday the 19th of October, and this was the first time others saw the equation which has now become known as Planck’s law.

The equation he found for the energy in the wavelength interval $d \lambda$ had the form
$\boxed{ E_{\lambda} \; d \lambda = \frac{ A }{ \lambda^{5} } \frac{ 1 }{ (e^{a/\lambda T} - 1) } \; d\lambda }$

(compare this to the Wien distribution law above).
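To see the difference between the two laws numerically, here is a short Python sketch comparing their shapes. I have used the modern values of the constants (in modern notation $a = hc/k_{B}$); the overall constant $A$ cancels in the ratio, so only the shapes matter:

```python
import math

h, c, kB = 6.626e-34, 2.998e8, 1.381e-23
a = h * c / kB   # in modern notation a = h c / k_B, about 0.0144 m K
T = 4000.0       # the temperature used in the plots in this post

def planck_shape(lam):
    return lam ** -5 / (math.exp(a / (lam * T)) - 1)

def wien_shape(lam):
    return lam ** -5 * math.exp(-a / (lam * T))

# Short-wavelength side of the peak (400 nm): the two laws nearly coincide
print(planck_shape(4e-7) / wien_shape(4e-7))   # just over 1

# Long-wavelength side (10 microns): the Wien law falls away far too quickly
print(planck_shape(1e-5) / wien_shape(1e-5))   # roughly 3.3
```

In other words, at 10 microns the Wien distribution law underestimates the intensity by a factor of about three at this temperature, which is exactly the long-wavelength failure seen in the plots above.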

After presenting his equation he sat down; he had no explanation for why this equation worked, no physical understanding of what was going on. That understanding would dawn on him over the next few weeks, as he worked tirelessly to explain the equation on a physical basis. It took him six weeks, and in the process he had to abandon some of the ideas in physics which he held most dear. He found that he had to abandon accepted ideas in both thermodynamics and electromagnetism, two of the cornerstones of 19th Century physics. Next week, in the fourth and final part of this blog-series, I will explain what physical theory Planck used to explain his equation; the theory which would usher in the quantum age.