Archive for the ‘mathematics’ Category

The other week I was asked to explain how a cylinder (or ball) rolling down a slope differs from e.g. a ball being dropped vertically. It is an interesting question, because it illustrates some things which are not immediately obvious. We all know that, if you drop two balls, say a tennis ball and a cannon ball, they will hit the ground at the same time. This is despite their having very different masses (weights). Galileo supposedly showed this idea by dropping objects of different weights from the tower of Pisa (although he probably never did this, see our book Ten Physicists Who Transformed Our Understanding of Reality).

With a tennis ball and a cannon ball, they clearly have very different masses (weights), but will fall to the ground at the same rate. This fact, contrary to the teachings of Aristotle, was one of the key breakthroughs which Galileo made in our understanding of motion. But, what about if we roll the two balls down a slope? If we build a track to keep them going straight, will a tennis ball roll down a slope at the same rate as a cannon ball? The answer is no, and I will explain why.

Rolling rather than dropping

When a ball rolls down a slope, it starts off at the top of the slope with gravitational potential energy. When it starts rolling down the slope, this gravitational potential energy gets converted to kinetic energy. This is the same as when the ball drops vertically. But, in the case of the ball dropping vertically, the kinetic energy is all in the form of linear kinetic energy, given by

\text{ linear kinetic energy } = \frac{ 1 }{ 2 } mv^{2}

where m is the mass of the ball and v is its velocity (which is increasing all the time as it falls and speeds up). The gravitational potential energy is converted to linear kinetic energy as the ball drops; by the time the ball hits the bottom of its fall all of the PE has been converted to KE.

If, instead, we roll a ball down a slope, the kinetic energy is in two forms, linear kinetic energy but also rotational kinetic energy, which is given by

\text{ rotational kinetic energy } = \frac{ 1 }{ 2 } I \omega^{2}

where I is the ball’s moment of inertia, and \omega is the ball’s angular velocity, usually measured in radians per second. The key point is that the moment of inertia has a different value for the two balls in this example (a tennis ball and a cannon ball), because the distribution of the mass in the two balls is different. For the tennis ball it is all concentrated in the layer of rubber near the ball’s surface, with a hollow interior. For the cannon ball, the mass is distributed throughout the ball.

Two cylinders rolling down a slope

Let us, instead, consider the case of two cylinders rolling down a slope. One is a solid cylinder, the other is a hollow one with all of its mass concentrated near the surface. We will make the two cylinders have the same mass; this can be done by making the material from which the hollow cylinder is made denser than the material for the solid cylinder. So, even though the material of the hollow cylinder is all concentrated near the surface of the cylinder, and there is a lot less of it, if it is denser it can have equal mass.


A solid cylinder on an inclined plane. We will make the mass of this solid cylinder the same as that of the hollow cylinder, by making it of less dense material. Although it will have the same mass m and the same radius R, it will not have the same moment of inertia I.

We will start both cylinders from rest near the top of the slope, and let them roll down. We will observe what happens.


A hollow cylinder rolling down an inclined plane. We will make the hollow cylinder denser than the solid one, so that they both have the same mass m and the same outer radius R. But, they will not have the same moment of inertia I.

When things are dropped, the rate at which they fall is independent of the mass, but when they roll the rate at which they roll is not independent of the moment of inertia. In particular, it is not independent of the distribution of mass in the rolling object. As this video shows, the solid cylinder rolls down the slope faster than the hollow one!

But, why??

Why does the solid cylinder roll down quicker?

The reason that the solid cylinder rolls down faster than the hollow cylinder has to do with the way that the potential energy (PE) is converted to kinetic energy. Because the cylinder is rolling, some of the PE is converted to rotational kinetic energy (RKE), not just to linear kinetic energy (LKE). The only way that a cylinder can roll down a slope is if there is friction between the cylinder and the slope; if the slope were perfectly smooth the cylinder would slide and not roll.

The torque (rotational force) \tau is related to the angular acceleration \alpha in the same way that the linear force F is related to the linear acceleration a. From Newton’s second law we know that F = ma where m is the mass of the object. The rotational equivalent of this law is

\tau = I \alpha

where I is the moment of inertia. The moment of inertia I is different for a hollow cylinder and a solid cylinder. For the solid cylinder it is given by

I_{sc} = \frac{ 1 }{ 2 } mR^{2} = 0.5mR^{2}

where m is the mass of the cylinder and R is the radius of the cylinder. For the hollow cylinder, the moment of inertia is given by

I_{hc} = \frac{ 1 }{ 2 }m(R_{2}^{2} + R_{1}^{2})

where R_{2} \text{ and } R_{1} are the outer and inner radii of the annulus of the cylinder. We are going to make the hollow cylinder such that the inner 80% is hollow, so that R_{1} = 0.8R_{2} = 0.8R. We will make the mass m of the two cylinders the same.

Thus, for the hollow cylinder, we can now write

I_{hc} = \frac{ 1 }{ 2 }m(R^{2} + (0.8R)^{2}) = \frac{ 1 }{ 2 }mR^{2}(1+0.64) = \frac{ 1 }{ 2 }mR^{2}(1.64) = 0.82 mR^{2}

The cylinder accelerates down the slope due to the component of its weight which acts down the slope. This component is mg sin(\theta) where g is the acceleration due to gravity and \theta is the angle of the slope from the horizontal. To make the maths easier, we are going to set \theta = 30^{\circ}, as sin(30) =0.5.

Friction always acts in the opposite direction to the direction of motion, and in this case the friction F_{f} is related to the torque \tau via the equation

\tau = F_{f}R \text{ (1)}

so we can write

F_{f}R = \tau = I \alpha \text{ (2)}

where \alpha is the rotational acceleration. Re-arranging this to give F_{f}, we have

F_{f} = \frac{I \alpha}{ R }

The force down the slope, F (=ma) is just the component of the weight down the slope minus the frictional force F_{f} acting up the slope.

ma = mg\sin(30) - F_{f} = 0.5mg - \frac{ I \alpha }{ R } \text{ (3)}

The angular acceleration \alpha is given by \alpha = a/R where a is the linear acceleration. So, we can re-write Eq. (3) as

ma = 0.5mg - \frac{ Ia }{ R^{2} } \text{ (4)}

Now we will put in the moments of inertia for the solid cylinder and the hollow cylinder. For the solid cylinder, we can write

ma = 0.5mg - \frac{ 0.5mR^{2}a }{ R^{2} } = 0.5mg - 0.5ma

The mass m can be cancelled out, and assuming g=9.8 \text{ m/s/s}, we have

a = 0.5g - 0.5a \rightarrow 1.5a = 4.9 \rightarrow \boxed{ a = 3.27 \text{ m/s/s  (5)} }

Notice that Equation (5) does not have the mass m in it, as this cancels out. It also does not have the radius R of the cylinder in it; the acceleration of the cylinder as it rolls down the slope is independent of both the mass and the radius of the cylinder.

For the hollow cylinder, again using Eq. (4), we have

ma = 0.5mg - \frac{ 0.82maR^{2} }{ R^{2} } = 0.5mg - 0.82ma

This simplifies to

a = 4.9 - 0.82a \rightarrow 1.82a = 4.9 \rightarrow \boxed{ a = 2.69 \text{ m/s/s} (6)}

As with Equation (5), Equation (6) is independent of both mass and radius.

So, as we can see, the linear acceleration a for the hollow cylinder is 2.69 m/s/s, less than the linear acceleration for the solid cylinder, which was 3.27 m/s/s. This is why the solid cylinder rolls down the slope quicker than the hollow cylinder! And, the result is independent of both the mass and the radius of either cylinder. Therefore, a less massive solid cylinder will roll down a slope faster than a more massive hollow one, which may seem contradictory.
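The two boxed results can be checked with a few lines of Python. This is only a sketch: it uses the rearrangement of Eq. (4) into a = g sin(\theta) / (1 + I/(mR^{2})), and the function name and the `inertia_factor` parameter (which stands for I/(mR^{2})) are my own labels, not anything standard.

```python
import math

def rolling_acceleration(inertia_factor, slope_deg=30.0, g=9.8):
    """Linear acceleration (m/s/s) of a body rolling without slipping.

    inertia_factor is I / (m R^2): 0.5 for the solid cylinder and
    0.82 for the hollow cylinder with inner radius 0.8 R.
    Rearranging Eq. (4) gives a = g sin(theta) / (1 + I / (m R^2)).
    """
    return g * math.sin(math.radians(slope_deg)) / (1.0 + inertia_factor)

solid = rolling_acceleration(0.5)    # solid cylinder
hollow = rolling_acceleration(0.82)  # hollow cylinder, inner radius 0.8 R

print(f"solid:  {solid:.2f} m/s/s")
print(f"hollow: {hollow:.2f} m/s/s")
```

Neither the mass nor the radius appears as an argument, which is the point made above: only the ratio I/(mR^{2}) matters.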


All objects falling vertically fall at the same rate, but this is not true for objects which roll down a slope. We have shown above that a solid cylinder will roll down a slope quicker than a hollow one. This is because their moments of inertia are different: it requires a greater torque to get the hollow cylinder turning than it does the solid cylinder. Remember, the meaning of the word ‘inertia’ is a reluctance to change velocity, so in this case a reluctance to start rolling from being stationary. A larger moment of inertia means a greater reluctance to start rolling.

The solid cylinder will start turning more quickly from being stationary than the hollow cylinder, and this means that it will roll down the slope quicker. This result is independent of the masses (and radii) of the two cylinders; even a less massive solid cylinder will roll down a slope quicker than a more massive hollow one, which may be counter-intuitive.


Last week, in this blogpost here, I tried to explain why the scale height of the atmosphere H is defined as the altitude one has to go up for the pressure to drop to 1/e \;(\approx 37\%) of its value at sea level. After posting that blog I decided I would write a blog to say a little bit more about the mathematical constant e, a very important number in mathematics.

Probably the best known mathematical constant is \pi, which is defined as the ratio of the circumference of a circle to its diameter. Pretty much every school child comes across \pi; but it is only people who study maths at a more advanced level who come across e, so let me try in this blogpost to explain what e is and why it is so important.

Euler’s number

e is also known as Euler’s number, named after the Swiss mathematician Leonhard Euler who lived from 1707 to 1783. Euler was one of the great mathematicians, but it was not he who discovered e. The constant was discovered by Jacob Bernoulli (from the same family as the name attached to “the Bernoulli effect” which causes lift in the wings of an aeroplane), but it was Euler who started to use the letter ‘e’ to represent the constant, in 1727 or 1728.

The number itself is defined as the value of the following sum

\displaystyle e \equiv \sum_{n=0}^{\infty} \frac{1}{n!} = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \frac{1}{5!} + .... = 2.718281828...

where 2! \text{ (called "two factorial") } = 2 \times 1, 3! = 3 \times 2 \times 1, \; 4! = 4 \times 3 \times 2 \times 1 etc. 0! \equiv 1 by definition. This is an example of a converging series. It is summed to infinity, so it never ends, but from the third term onwards each term is smaller than the one before. So, as you go to more and more terms in the series you only affect digits which are maybe 100 or even 1,000 places to the right of the decimal point. Calculators usually display numbers to 7 or 8 decimal places, so you would not need to go very far in this series to get the number displayed by your calculator (try it to find out how many!)
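If you would rather let a computer do the adding up, here is a minimal Python sketch of the partial sums. The function name `e_partial_sum` is just my own label; each new term is built from the previous one, so no large factorials are ever computed.

```python
import math

def e_partial_sum(terms):
    """Add up 1/n! for n = 0, 1, ..., terms - 1."""
    total = 0.0
    term = 1.0  # this is 1/0!
    for n in range(terms):
        total += term
        term /= (n + 1)  # turns 1/n! into 1/(n+1)!
    return total

# Show how quickly the partial sums close in on the library value of e.
for terms in (2, 5, 10, 15):
    approx = e_partial_sum(terms)
    print(terms, approx, abs(approx - math.e))
```

The last column is the difference from Python's built-in value `math.e`, and it shrinks very rapidly as more terms are included.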


The mathematical constant e is the sum of 1/n! where n goes from zero to infinity (\displaystyle \sum_{n=0}^{\infty} \frac{1}{n!}). It is equal to 2.718281828…….

Just like with \pi, e is a transcendental number (I will explain what that is in more detail in a future blogpost). Briefly, this means that it carries on forever and does not repeat, but it is slightly more complicated than that. Unlike with \pi, where people have competitions to remember it to thousands of decimal places (the current world record is 70,000 decimal places achieved by Rajveer Meena on 21 March 2015!!), no one seems that concerned in remembering e.

Why is e important in mathematics?

Mathematicians often love numbers and formulae for their own sake, sometimes just for their beauty. So, for example, the solution to something like

\displaystyle a = \sum_{n=0}^{\infty} \frac{1}{1 + n^{3}}

(which I just made up) may not have any importance mathematically, but still mathematicians may enjoy playing and exploring such a series. But, in the case of

\displaystyle e \equiv \sum_{n=0}^{\infty} \frac{1}{n!}

the number which comes from this, 2.718281828……., is important mathematically (and physically), and here I will try to explain why.

Compound interest

I mentioned above that e was discovered by Jacob Bernoulli. His discovery was made in 1683 when he was investigating a question concerning compound interest. Remember, compound interest is when you get a certain percentage interest on the amount you have in e.g. a bank, but the interest is calculated not on the initial amount but on the amount after the previous period’s interest has been added.

Suppose we invest £10 in a bank account which pays an interest of 5% per year, and we leave it there for 3 years. Assuming the interest is added just once a year, after the first year our £10 will have earned £0.50 interest, so we will have £10.50. At the end of the second year the interest earned will be 5% of £10.50 which is £0.53 (rounding up to the nearest penny), so we will now have £11.03 at the start of year 3. The interest at the end of year 3 is going to be 5% of £11.03 which is £0.55, so the total at the end of the 3 years is £11.58.

Doing this for just 3 years manually is quite easy, but if we wanted to do it for e.g. 25 years, there would be a lot of tedious calculation. Also, often the interest is added more than once a year. So, for example, you may have an annual interest rate of 5% but it is added each quarter. You can quickly see that even doing this manually over a 3-year period would be a lot of calculating.

Thankfully, there is a simple formula for calculating the total accumulated value, which is

P(1 + i/n)^{nt}

where P is the initial amount invested (the ‘principal’), i is the rate of interest, n is how often each year the interest is added (called the ‘compounding frequency’) and t is the time for which we are making the calculation, expressed in years.

If we go back to our example above, and stick to n=1, we have P=10, \; i=0.05, \; t=3 and so

P(1 + i/n)^{nt} = 10(1+0.05)^{3} = 10(1.05)^{3} = 10(1.157625) \approx 11.58

exactly as we calculated manually.

But, you may be wondering, what is the similarity between the formula

P(1 + i/n)^{nt}

and the formula for e

\displaystyle e \equiv \sum_{n=0}^{\infty} \frac{1}{n!}?

What Bernoulli noticed was that, if you make n larger and larger (do the compounding daily instead of quarterly or once a year), the sequence approaches a limit. So, for example, in the above example, with n=1 we found the amount at the end was £11.58. If we made n=4 (compounding quarterly) we would get £11.61 and if we compounded the interest every day (n=365) we would get £11.62. If we compounded every hour (!!!) (n=8760) we would get £11.62, the same answer as if we compound every day, so we have reached the limit.
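The numbers quoted above are easy to reproduce. The following Python sketch simply evaluates P(1 + i/n)^{nt} for the four compounding frequencies mentioned; the function and argument names are my own choice for illustration.

```python
def accumulated_value(principal, rate, times_per_year, years):
    """Total value P(1 + i/n)^(nt) after compounding n times a year."""
    return principal * (1.0 + rate / times_per_year) ** (times_per_year * years)

# £10 at 5% for 3 years, compounded yearly, quarterly, daily and hourly.
for n in (1, 4, 365, 8760):
    total = accumulated_value(10.0, 0.05, n, 3)
    print(f"n = {n:5d}: £{total:.2f}")
```

To the nearest penny the daily and hourly totals are the same, which is the flattening out that Bernoulli noticed.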

What Bernoulli did was consider the formula with P=1, \; i=100\% and t=1. If we do this for different values of n we get the following curves. The first one is just showing n from 0 to 20, and it is clear that the values are flattening out. The second plot goes from n=0 to 500, and I have just shown on the y-axis values from 2.5 to 2.72. This shows even more clearly that, as n gets larger, the value of 1(1+1/n)^{n} tends towards a particular value, and that value turns out to be 2.71828… (which is e).



y = 1(1+1/n)^{n} for n from 0 to 20


y=1(1+1/n)^{n} for n from 0 to 500, but note the y-axis is only displayed between 2.5 and 2.72. The curve is clearly tending towards a value, and that value is e=2.71828...
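The same flattening out can be seen numerically rather than graphically. This is a small Python sketch (the function name `bernoulli_value` is mine) which evaluates (1+1/n)^{n} for increasing n and compares it with the library value of e.

```python
import math

def bernoulli_value(n):
    """Value of 1 * (1 + 1/n)^n, i.e. P = 1, i = 100%, t = 1."""
    return (1.0 + 1.0 / n) ** n

# The gap to e keeps shrinking as n grows, but only slowly.
for n in (1, 10, 100, 1000, 100000):
    value = bernoulli_value(n)
    print(n, value, math.e - value)
```

The approach to e is much slower here than with the factorial series: the gap shrinks roughly in proportion to 1/n.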

e in calculus

Even if you don’t know how to do calculus, you have probably heard of it. It was co-invented by Isaac Newton and Gottfried von Leibniz (see my blogpost “Who Invented Calculus” for the fascinating story of the 30-year feud between Newton and Leibniz). Without going into too much detail about all the various things one can do with calculus, one thing that it gives us is the gradient (slope) of a curve at a given point (something which is sometimes called the derivative).

There are an infinite number of different mathematical functions, for example y=x^{2}, or y=x^{3} - 2x + 3, and we can use differentiation to determine the gradient of these functions for any particular value of x. But, the function y=e^{x} is special; it is the only mathematical function (apart from constant multiples of itself, such as 2e^{x}) whose derivative is the same as the function. To put this another way, the slope of the curve y=e^{x} is e^{x} for any value of x, and no function outside the family f(x)=Ce^{x}, where C is a constant, has this property.
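One way to convince yourself of this without doing any calculus is to estimate the slope numerically. The Python sketch below uses a central difference, which is just one common way of approximating a gradient; the helper name and the step size h are my own assumptions.

```python
import math

def central_difference(f, x, h=1e-5):
    """Approximate the gradient of f at x from two nearby points."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Compare the value of e^x with the estimated slope of e^x.
for x in (-2.0, 0.0, 1.0, 3.0):
    slope = central_difference(math.exp, x)
    print(x, math.exp(x), slope)
```

At each value of x the estimated slope agrees with e^{x} itself to many decimal places.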


A plot of e^{x} as a function of x. At x=0, \; e^{x}=1. As x \rightarrow -\infty, \; e^{x} \rightarrow 0.

In addition, when we integrate something like dx/x, we get a logarithm; but the base of that logarithm is e, not base 10 (our usual base of counting). In fact, we call logarithms in base e natural logarithms. Because the derivation of the variation of pressure with altitude involves integrating dP/P, we find that the vertical distribution of pressure is logarithmic, but in base e, P=P_{0}e^{-z/H}, where H is the scale height and P_{0} is the pressure at sea level. It is because of the pressure’s exponential dependence on altitude that H is usually expressed as the value for the pressure to drop by a factor of 1/e.

The normal (or gaussian) distribution

As one last example of where e pops up in mathematics, it arises in the equation which describes the normal or gaussian distribution. I blogged about that distribution in this blogpost here “What does a 1-sigma, 3-sigma or 5-sigma detection mean?”. The function which describes the normal distribution has the form

f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^{2}/2}

where e is our friend, Euler’s number.


The normal, or Gaussian, distribution y = \frac{1}{\sqrt{2\pi}}e^{-x^{2}/2}


Yesterday (Monday 27 June) I wrote a blogpost entitled “What is the scale height of water vapour in the Earth’s atmosphere?“. In that blogpost I said that the scale height of the gas (nitrogen, oxygen) in the Earth’s atmosphere was 7.64km, and that this means that every 7.64km it reduces by a factor of 1/e, which is a factor of 0.368.

This means that, at 7.64km, the atmosphere has a thickness (pressure) of 0.368 (= 36.8\%) of its value at sea level. If we go to twice this height, 15.28km, the pressure of the atmosphere is 0.135 \; (= 0.368 \times 0.368 = 13.5\%) of its value at sea level. At 3 \times 7.64 = 22.92 \text{ km} it is 0.368^{3} = 0.05 = 5\% of its value at sea level, etc.

The question was asked, why is it a factor of 1/e that we quote for the scale height, and not 1/2, or 1/\text{something else}? The answer is the way that the formula for the scale height H is derived. It comes about from integrating an infinitesimal change in pressure dP divided by the pressure P, that is the integral of (dP/P), and this is what leads to the exponential.

Deriving the equation for the scale height of the atmosphere

We are going to determine the pressure of the atmosphere as a function of altitude. Pressure is defined as the force per unit area, and for the atmosphere the pressure is due to the weight (not mass) of the overlying atmosphere. This is why pressure goes down with altitude. Let us assume that at a particular height z the atmosphere has a density (mass per unit volume) of \rho and a pressure P.

If we have a small slab of volume dV = Adz, the mass of this slab will be dm=\rho dV = A \rho dz. The weight of any object is given by dW=gdm, so the weight of this slab is Ag\rho dz. But, pressure is force (weight) per unit area, so

dP = \frac{dW}{A} = \frac{Ag \rho dz}{A} = g \rho dz \text{ (1)}


Consider a slab of thickness dz and area A, so the volume dV=Adz.


Re-arranging Equation (1) we get

\frac{dP}{dz} = - g \rho \text{ (2)}

where g is the acceleration due to gravity at that particular altitude. In theory g is a function of z, but because the change in g is so small for the changes in z that we will consider, we are going to assume it is constant. The minus sign in the above expression is because P decreases as we increase z.

For an ideal gas, we can write

PdV = NkT

where N is the number of molecules, k is Boltzmann’s constant, and T is the temperature in Kelvin. If the mass of each molecule is M, then the total mass of the slab of gas is NM, and so we can say that the density is

\rho = \frac{NM}{dV} = \frac{NM}{1} \cdot \frac{P}{NkT} = \frac{MP}{kT} \text{ (3)}

If we combine equations (2) and (3) we get

\frac{dP}{dz} = -g \frac{MP}{kT} \text{ (4)}

Re-arranging Equation (4) we get

\frac{dP}{P} = -\frac{Mg}{kT}dz \text{ (5)}

Equation (5) is a differential equation, so to solve it we integrate

\int{ \frac{dP}{P} }= - \frac{Mg}{kT} \int{ dz }

which becomes

\ln{P} = -\frac{Mg}{kT} z +C  \text{ (6)}

where C is a constant which we determine by the boundary conditions. \ln{P} is the natural logarithm of the pressure P, that is the logarithm to the base e. The boundary conditions are that at z=0, \; P(z)=P_{0}, the pressure at sea level, so we can write

\ln{ P_{0}} = 0 + C \rightarrow C = \ln{ P_{0} }

Putting this back in Equation (6) we have

\ln{P} = -\frac{Mgz}{kT} + \ln{ P_{0} } \rightarrow \ln{P} - \ln{ P_{0} } = -\frac{Mgz}{kT} \rightarrow \ln{ \left( \frac{ P }{ P_{0} } \right) } = -\frac{Mgz}{kT} \text{ (7)}

We get rid of the logarithm in Equation (7) by taking the exponent, so it becomes

\frac{ P }{ P_{0} } = e^{ -\frac{Mgz}{kT} } \text{ (8)}

Finally, we define the scale height H as H = \frac{ kT }{Mg} so we have

\boxed{ \frac{P}{P_{0}} = e^{-\frac{z}{H}} \text{ or } P=P_{0}e^{-\frac{z}{H}} \text{ (9)} }

As we can see, the pressure varies with altitude in the sense that the ratio of pressure at any altitude P to its value at sea level P_{0} is given by an exponential; the negative sign in the exponent tells us that pressure will decrease with increasing altitude.
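Equation (9) is simple enough to evaluate directly. Here is a minimal Python sketch which gives the fraction P/P_{0} at one, two and three scale heights, taking H = 7.64 km; the function name `pressure_fraction` is my own.

```python
import math

def pressure_fraction(altitude_km, scale_height_km=7.64):
    """P / P0 from Equation (9), with altitude and H both in km."""
    return math.exp(-altitude_km / scale_height_km)

# Pressure relative to sea level at 1, 2 and 3 scale heights.
for multiple in (1, 2, 3):
    z = multiple * 7.64
    print(f"{z:5.2f} km: {pressure_fraction(z):.3f}")
```

Each step up of one scale height multiplies the pressure by the same factor of 1/e.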

The variation of pressure with altitude

If we plot Equation (9) we get the following (with a value of H=7.64 \text{ km})


The variation of pressure with altitude assuming a scale height of 7.64km

This shows the exponential drop off of atmospheric pressure with altitude, as given in Equation (9) above. We can, however, plot the pressure (y-axis) on a logarithmic scale. We take Equation (9) and write

\ln{ \frac{P}{P_{0} } } = - \frac{z}{H} \rightarrow \ln{P} - \ln{P_{0}} = -\frac{z}{H}

which we can re-arrange to give

\boxed{ \ln{P} = -\frac{1}{H}z + \ln{P_{0}} \text{ (10)} }

This is the equation of a straight line (c.f. y=mx+c), so the intercept of our straight line is \ln{P_{0}} and our gradient is -(1/H). It is because the integration of our expression dP/P (Equation (5) above) produces an exponential that the scale height H is expressed as the altitude one needs to ascend for the pressure to drop by a factor of 1/e and not, e.g. 1/2.



If we plot the pressure as a function of altitude with the pressure (on the y-axis) plotted on a logarithmic scale, we get a straight line. The equation of this line is \ln{P} = - \frac{1}{H}z + \ln{P_{0}}. The gradient of the line is -1/H, where H is the scale height of the atmosphere. So, on this linear-log plot, if we increase the altitude by H, the natural log of the pressure will drop by 1.
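To see the straight line in numbers rather than in a plot, the Python sketch below fits a line to \ln{P} against z and reads H off the gradient. Please note that the "data" are not measurements: they are generated from Equation (9) itself with H = 7.64 km and P_{0} = 1, purely to illustrate the idea, and the variable names are my own.

```python
import math

H_KM = 7.64   # scale height used to generate the illustrative points
P0 = 1.0      # pressure in units of the sea-level pressure

# Synthetic points from Equation (9); these are NOT real measurements.
altitudes = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
log_pressures = [math.log(P0 * math.exp(-z / H_KM)) for z in altitudes]

# Ordinary least-squares straight line through (z, ln P).
n = len(altitudes)
mean_z = sum(altitudes) / n
mean_y = sum(log_pressures) / n
slope = (sum((z - mean_z) * (y - mean_y) for z, y in zip(altitudes, log_pressures))
         / sum((z - mean_z) ** 2 for z in altitudes))
intercept = mean_y - slope * mean_z

recovered_H = -1.0 / slope  # the gradient is -1/H, as in Equation (10)
print(f"gradient = {slope:.5f} per km, H = {recovered_H:.2f} km")
print(f"intercept = {intercept:.5f} (ln P0)")
```

With real pressure readings the points would scatter about the line, but the recipe for estimating H from the gradient would be the same.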


One of the physicists in our book Ten Physicists Who Transformed Our Understanding of Reality (follow this link for more information on the book) is, not surprisingly, Isaac Newton. In fact, he is number 1 in the list. One could argue that he practically invented the subject of physics. We decided to call him the ‘father of physics’, with Galileo (whose life preceded Newton’s) being given the title of ‘grandfather’.

Newton was, clearly, a man of genius. But he was also a nasty, vindictive bastard (not to mince my words!). He didn’t really have any close friends in his life; there were plenty of people who admired him and respected him, and of course he had colleagues. But, apart from a niece whom he seemed to dote on in later life, and two men with whom he probably had love affairs, he was not a man who sought company. He was probably autistic, but lived at a time before such conditions were diagnosed or talked about.

Isaac Newton (1643-1727), the ‘father of physics’. He relished feuding with other scientists

One sort of interaction that he did seem to enjoy with other people though was feuds. In fact, he seemed to thrive on feuding with other scientists. He loved to argue with others, which is not uncommon amongst academics. He had strong opinions which he liked to defend; this is normal. But, Newton took these disputes to an extreme; if he fell out with someone he would do everything he could to destroy that person.

Although I am sure that he had many ‘minor’ arguments, he had three main feuds with fellow scientists. These three men were

  • Robert Hooke – curator of experiments at the Royal Society
  • Gottfried (von) Leibniz – the German mathematician
  • John Flamsteed – the first Astronomer Royal

In each case, he did his level best to destroy the other man. Each of these feuds is discussed in more detail in our book, but in this blogpost I will give a brief summary of his feud with Leibniz.

The feud came about because Newton refused to believe that Leibniz had independently come up with the mathematical idea of calculus. It was a recurring theme throughout Newton’s life that he sincerely believed that he was special. He had deep religious views (some would say extreme religious views). As part of these views, he believed that he had been specially chosen by God to understand things that others would never be able to understand.

Thus, when he heard that Leibniz had developed a mathematics similar to his own ‘theory of fluxions’ (as Newton called it), he naturally assumed that the German had stolen it from him. There then ensued a 30-year dispute between the two men, with Newton very much the aggressor.

Gottfried (von) Leibniz (1646-1716), German mathematician and co-inventor of calculus

It escalated from a dispute to a feud, and culminated in the Royal Society commissioning an ‘official investigation’ to establish priority for the invention of calculus. When the report came out in 1713 it came out in Newton’s favour. But, by this time Newton was not only President of the Royal Society, but he had secretly authored the entire report. It was anything but impartial. Leibniz died three years later, in 1716, a broken man from Newton’s relentless attacks.

One should, of course, be able to admire a person for their work but not admire them in the least for the person that they were. Newton, in my mind, falls very firmly into this category. His contribution to physics is unparalleled, but I don’t think he was the kind of person one would want to know or even come across if one could help it!

Ten Physicists Who Transformed Our Understanding of Reality is available now. Follow this link to order


What is your favourite story about Newton?



As I mentioned in this blog here, a few months ago I contributed some articles to a book called 30-Second Einstein, which will be published by Ivy Press in the not too distant future. One of the articles I wrote for the book was on Indian mathematical physicist Satyendra Bose. It is after Bose that ‘bosons’ are named (as in ‘the Higgs boson’), and also terms like ‘Bose-Einstein statistics’ and ‘Bose-Einstein condensate’. So, who was Satyendra Bose, and why is his name attached to these things?



Satyendra Bose was an Indian mathematical physicist after whom the ‘boson’ and Bose-Einstein statistics are named

Satyendra Bose was born in Calcutta, India, in 1894. He studied applied mathematics at Presidency College, Calcutta, obtaining a BSc in 1913 and an MSc in 1915. On both occasions, he graduated top of his class. In 1919, he made the first English translation of Einstein’s general theory of relativity, and by 1921 he had moved to Dhaka (in present-day Bangladesh) to become Reader (one step below full professor) in the department of Physics.

It was whilst in Dhaka, in 1924, that he came up with the theory of how to count indistinguishable particles, such as photons (light particles). He showed that such particles follow statistics which are different from particles which can be distinguished. All his attempts to get his paper published failed, so in an act of some desperation he sent it to Einstein. The great man recognised the importance of Bose’s work immediately, translated it into German and got it published in Zeitschrift für Physik, one of the premier physics journals of the day.

Because of Einstein’s part in getting the theory published, we now know of this way of counting indistinguishable particles as Bose-Einstein statistics. We also name particles which obey this kind of statistics bosons; examples are the photon, the W and Z-particles (which mediate the weak nuclear force), and the most famous boson, the Higgs boson (responsible for mediating the property of mass via the Higgs field).

With the imminent partition of India when it was gaining independence from Britain, Bose returned to his native Calcutta where he spent the rest of his career. He died in 1974 at the age of 80.

You can read more about Satyendra Bose, Bose-Einstein statistics and Bose-Einstein condensates in 30-second Einstein, out soon from Ivy Press. 


As anyone who hasn’t been living under a rock knows, the International Space Station (ISS) orbits the Earth with (typically? always?) six astronauts on board. It has been doing this for something like the last fifteen years. One of the astronauts currently on board is the Disunited Kingdom’s first Government-funded astronaut, Tim Peake.

The first British person to go into space was Helen Sharman, but she went into space in a privately funded arrangement with the Russian Space Programme in 1991. Other British-born astronauts have gone into space through having become naturalised Americans, and going into space with NASA. But, Tim Peake has gone to the ISS as part of ESA’s space programme, and his place is due to Britain’s contribution to ESA’s astronaut programme. So, he is the first UK Government-funded astronaut, which is why there has been so much fuss about it in these lands.


Official NASA portrait of British astronaut Timothy Peake. Photo Date: August 28, 2013.

Anyway, I digress. This blog is not about Tim Peake per se, or about the ISS really. I wanted this blog to be about whether Tim Peake is getting older or younger whilst in orbit. Of course everyone is getting older, including Tim Peake, as ‘time waits for no man’ as the saying goes. What I really mean is whether time is passing more or less quickly for Tim Peake (and the other astronauts) in the ISS compared to those of us on Earth.

As some of you might know, when an astronaut is in orbit he is in a weaker gravitational field, as the Earth’s gravitational field drops off with distance (actually as the inverse square of the distance) from the centre of the Earth. Time will therefore pass more quickly for Tim Peake than for someone on the Earth’s surface due to this effect. This is time dilation due to gravity, a general relativity (GR) effect.

But, there is also another time dilation, the time dilation due to one’s motion relative to another observer, the time dilation in special relativity (SR). Because Tim Peake is in orbit, and hence moving relative to someone on the surface of the Earth, this means that time will appear to move more slowly for him as observed by someone on Earth. Interestingly (at least for me!), the SR effect works in the opposite sense to the GR effect.

Which effect is greater? And, how big is the effect?

Time dilation due to SR – slowing it down for Tim Peake

As I showed in this blog, the time dilation due to SR can be calculated using the equation

t^{\prime} = \gamma t \text{ where } \gamma = \frac{ 1 }{ \sqrt{ ( 1 -v^{2}/c^{2} )} }

If he is in orbit at an altitude of 500km (I guessed at this amount, according to Wikipedia it is 400km, but it does not alter the argument which ensues) then his distance from the centre of the Earth (assuming a spherical Earth of radius 6.371 \times 10^{6} metres) is 6.371 \times 10^{6} + 500 \times 10^{3} = 6.871 \times 10^{6} metres. The centripetal force keeping him in orbit is provided by the force of gravity, and in this blog I showed that the centripetal force F_{c} is given by

F_{c} = \frac{mv^{2} }{r}

where m is the mass of the object in orbit, v is its velocity and r is the radius of its orbit. This centripetal force is being provided by gravity, which we know is

F_{g} = \frac{ GMm }{ r^{2} }

where G is the universal gravitational constant, and M is the mass of the Earth. Putting these two equal to each other

\frac{ mv^{2} }{ r } = \frac{ GMm }{ r^{2} } \rightarrow v^{2} = \frac{GM}{r}

Putting in the values we have for the ISS, where r=6.871 \times 10^{6}, G=6.67 \times 10^{-11} and M= 5.97237 \times 10^{24}, we find that

v^{2} = 5.7977 \times 10^{7} \rightarrow v = 7.614 \times 10^{3} \text { m/s} = \boxed{ 7.614 \text{ km/s} }

But, this is the motion relative to the centre of the Earth. People on the surface of the Earth are also moving about the centre, as the Earth is spinning on its axis. But, we cannot calculate this speed as we have done above; people on the surface are not in orbit, but on the Earth’s surface. For something to stay e.g. 1 metre above the Earth’s surface in orbit it would have to move considerably quicker than the rotation rate of the Earth.

The Earth turns once every 24 hours, so for someone on the equator they are moving at

v_{se} = \frac{ 2 \pi \times 6.371 \times 10^{6} }{ 24 \times 3600 } = 463.3 \text { m/s}

where v_{se} refers to the speed of someone on the surface of the Earth. Someone at other latitudes is moving less quickly, at the poles they are not moving at all relative to the centre of the Earth. The speed of someone on the surface will go as v_{se} \cos (\theta) where \theta is the latitude. This is why we launch satellites as close to the Earth’s equator as is feasible; we maximise v_{se} and thus get the benefit of the speed of rotation of the Earth at the launch site to boost the rocket’s speed in an easterly direction.

The difference in speeds between the ISS and someone at the equator on the surface of Earth is therefore

7.614 \times 10^{3} - 463.3 = \boxed { 7.151 \times 10^{3} \text { m/s} }

Referring back to my blog on time dilation in special relativity that I mentioned at the start of this section, this means that the time dilation factor \gamma, using this value of v, is

\gamma = \frac{ 1 }{ \sqrt{(1 - (v/c)^{2})} } = \frac{ 1 }{ 0.9999999997 }
(where c is, of course, the speed of light).
This value of \gamma is equal to unity to about 3 parts in 10^{10}, so it would require Tim Peake to orbit for about 3.5 \times 10^{9} seconds for the time dilation to amount to 1 second. 3.5 \times 10^{9} seconds is a little over 110 years, let us say roughly 100 years.
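For anyone who wants to play with the numbers, the calculation in this section can be sketched in a few lines of Python. This is only a rough check, not a careful relativistic treatment: the function names are made up for illustration, the value of c and the length of a year are assumed values which are not quoted above, and the orbital radius is taken to be the Earth’s radius of 6.371 \times 10^{6} metres plus the 500 km altitude.

```python
import math

# G, M_EARTH, R_EARTH and ALTITUDE are the values used in the text.
# C and SECONDS_PER_YEAR are assumed values, not quoted in the text.
G = 6.67e-11                # universal gravitational constant, N m^2 kg^-2
M_EARTH = 5.97237e24        # mass of the Earth, kg
R_EARTH = 6.371e6           # radius of a spherical Earth, m
ALTITUDE = 500e3            # assumed altitude of the orbit, m
C = 2.998e8                 # speed of light, m/s
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def orbital_speed(r):
    """Speed in a circular orbit of radius r, from v^2 = GM/r."""
    return math.sqrt(G * M_EARTH / r)


def surface_speed(latitude_deg=0.0, day=24 * 3600):
    """Speed of a point on the Earth's surface due to the Earth's spin."""
    return 2 * math.pi * R_EARTH * math.cos(math.radians(latitude_deg)) / day


def gamma(v):
    """Time dilation factor for a relative speed v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


v_orbit = orbital_speed(R_EARTH + ALTITUDE)
v_rel = v_orbit - surface_speed(0.0)      # simple difference, as in the text
sr_fraction = gamma(v_rel) - 1.0          # fractional slowing of the clock
years_for_one_second = 1.0 / sr_fraction / SECONDS_PER_YEAR

print(f"orbital speed         : {v_orbit:.0f} m/s")
print(f"relative speed        : {v_rel:.0f} m/s")
print(f"gamma - 1             : {sr_fraction:.2e}")
print(f"years to lose 1 second: {years_for_one_second:.0f}")
```

Changing ALTITUDE to 400e3, the figure quoted by Wikipedia, alters the numbers only slightly, which is why the guess of 500 km does not affect the argument.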

The time dilation due to GR – speeding it up for Tim Peake

For GR, the time dilation works in the other sense, it will run more slowly for those of us on the Earth’s surface; we experience gravitational time dilation which is greater than that experienced by Tim Peake. In this blog here, I derived from the principle of equivalence the time dilation due to GR, and found

\Delta T_{B} = \Delta T_{A} \left( 1 - \frac{ gh }{ c^{2} } \right)

where, in this case, \Delta T_{B} would be the rate of time passing on the Earth’s surface, \Delta T_{A} the rate of time passing on the ISS, g = 9.81 \text{ m/s}^{2} (the acceleration due to gravity at the Earth’s surface) and h is the height of the orbit, which we have assumed (see above) to be 500 km = 500 \times 10^{3} metres.

Plugging in these values we get that

\frac{ \Delta T_{B} }{ \Delta T_{A} } = 1 - 5.45 \times 10^{-11}
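The size of the term gh/c^{2} is easy to check numerically. The short Python sketch below uses the values of g and h quoted above; the value of c, the length of a year and the function name are assumptions made for illustration. It also keeps the approximation of the formula above, which treats g as constant all the way up to the height of the orbit.

```python
G_SURFACE = 9.81            # acceleration due to gravity at the surface, m/s^2
HEIGHT = 500e3              # assumed height of the orbit, m
C = 2.998e8                 # speed of light, m/s (assumed value)
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def gr_fraction(g, h, c=C):
    """Fractional difference in clock rates, gh/c^2, for a height h."""
    return g * h / c ** 2


fraction = gr_fraction(G_SURFACE, HEIGHT)
years_for_one_second = 1.0 / fraction / SECONDS_PER_YEAR
gain_in_six_months = fraction * 0.5 * SECONDS_PER_YEAR

print(f"gh/c^2                 : {fraction:.2e}")
print(f"years to gain 1 second : {years_for_one_second:.0f}")
print(f"gain in six months     : {gain_in_six_months:.1e} s")
```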

So, the GR effect is about 5 parts in 10^{11} (100 billion). In six months, the number of seconds that Tim Peake will be in orbit is about 1.6 \times 10^{7} seconds, so a factor of about 1,000 less than is needed for the GR effect to amount to 1 second. Tim Peake would need to be in orbit for nearly 600 years for the GR effect to amount to 1 second of difference!


In conclusion, the SR effect on how quickly time is passing for Tim Peake is about 3 parts in 10 billion, in the sense that it passes more slowly for Tim Peake. The GR effect is smaller, about 5 parts in 100 billion, but in the sense that time is passing more quickly for him. The SR ‘time slowing down’ effect is greater than the GR ‘time passing more quickly’ effect, by roughly a factor of 5.

Tim Peake is therefore actually ageing more slowly by being in orbit than if he were on Earth. But, he would need to orbit for more than 100 years for this difference to amount to just 1 second! And, none of this of course takes into account the detrimental biological effects of being in orbit, which are probably not good for anyone’s longevity!







Yesterday (Thursday 21 January) I was on BBC Radio talking about the possibility of there being a 9th planet in the Solar System (remember, in 2006 Pluto was demoted to being a dwarf planet, leaving us with 8). If this suggestion is true, this would lead to our once again having to revise the list of planets that many of us know by heart. It would not be the first time we have had to revise it, nor I suspect will it be the last.

The argument, made by a team at Caltech, is based on anomalies in the orbits of Kuiper belt objects. The Kuiper belt is a region beyond the orbit of Neptune which is the reservoir of short-period comets. I have blogged about the Kuiper belt before here. The authors of this new paper argue that some Kuiper belt objects are having their orbits disturbed by an unseen object, and they suggest that it is an object with about ten times the mass of the Earth.


The Caltech team claim that anomalies in the orbits of Kuiper belt objects suggest that there is a large planet disturbing them

It may come as a surprise to some of you that this is precisely the way that Neptune was discovered. After Uranus’ discovery by William Herschel in 1781, astronomers noticed that it was not orbiting exactly as it should. The simplest explanation was that its orbit was being affected by an unseen planet. Two mathematicians (Frenchman Urbain le Verrier and Englishman John Couch Adams) separately worked out where the disturbing object should be.  There was a race on for astronomers to find the object, and the race was won by astronomer Johann Galle in 1846 working at the Berlin Observatory.

The existence of this new 9th planet is a long way from being proven. Working out what is causing the anomalies in the orbits of the Kuiper belt objects is an example of something called a ‘many-body problem’. The gravitational influences of many objects, including the Sun, Jupiter, the other gas giants, as well as other Kuiper belt objects, all have to be calculated to see if there are any unaccounted for effects. This is a horrendously complicated problem, and I am sure this prediction by this team from Caltech will be challenged by others working in this area of research.



In part 3 of this blog series I explained how Max Planck found a mathematical formula to fit the observed Blackbody spectrum, but that when he presented it to the German Physics Society on the 19th of October 1900 he had no physical explanation for his formula. Remember, the formula he found was

E_{\lambda} \; d \lambda = \frac{ A }{ \lambda^{5} } \frac{ 1 }{ (e^{a/\lambda T} -1) } \; d\lambda

if we express it in terms of wavelength intervals. If we express it in terms of frequency intervals it is

E_{\nu} \; d \nu = A^{\prime} \nu^{3} \frac{ 1 }{ (e^{ a^{\prime} \nu / T } - 1) } \; d\nu

Planck would spend six weeks trying to find a physical explanation for this equation. He struggled with the problem, and in the process was forced to abandon many aspects of 19th Century physics in both the fields of thermodynamics and electromagnetism which he had long cherished. I will recount his derivation – it is not the only one and maybe in coming blog posts I can show how his formula can be derived from other arguments, but this is the method Planck himself used.

Radiation in a cavity

As we saw in the derivation of the Rayleigh-Jeans law (see part 3 here, and links in that to parts 1 and 2), blackbody radiation can be modelled as an idealised cavity which radiates through a small hole. Importantly, the system is given enough time for the radiation and the material from which the cavity is made to come into thermal equilibrium with each other. This means that the walls of the cavity are giving energy to the radiation at the same rate that the radiation is giving energy to the walls.

Using classical physics, as we did in the derivation of the Rayleigh-Jeans law, we saw that the energy density (the energy per unit volume) is

\frac{du}{d\nu} = \left( \frac{ 8 \pi kT }{ c^{3} } \right) \nu^{2}


After trying to derive his equation based on standard thermodynamic arguments, which failed, Planck developed a model which, he found, was able to produce his equation. How did he do this?

Harmonic Oscillators

First, he knew from classical electromagnetic theory that an oscillating electron radiates (as it is accelerating), and he reasoned that when the cavity was in thermal equilibrium with the radiation in the cavity, the electrons in the walls of the cavity would oscillate and it was they that produced the radiation.

After much trial and error, he decided upon a model where the electrons were attached to massless springs. He could model the radiation of the electrons by modelling them as a whole series of harmonic oscillators, but with different spring stiffnesses to produce the different frequencies observed in the spectrum.

As we have seen (I derived it here), in classical physics the energy of a harmonic oscillator depends on the square of its amplitude of oscillation (E \propto A^{2}) and also on the square of its frequency of oscillation (E \propto \nu^{2}). The act of heating the cavity to a particular temperature is what, in Planck’s model, set the electrons oscillating; but whether a particular frequency oscillator was set in motion or not would depend on the temperature.

If it were oscillating, it would emit radiation into the cavity and absorb it from the cavity. He knew from the shape of the blackbody curve (and, by now, his equation which fitted it), that the energy density E d\nu at any particular frequency started off at zero for high frequencies (UV), then rose to a peak, and then dropped off again at low frequencies (in the infrared).

So, Planck imagined that the number of oscillators with a particular resonant frequency would determine how much energy came out in that frequency interval. He imagined that there were more oscillators with a frequency which corresponded to the maximum in the blackbody curve, and fewer oscillators at higher and lower frequencies. He then had to figure out how the total energy being radiated by the blackbody would be shared amongst all these oscillators, with different numbers oscillating at different frequencies.

He found that he could not derive his formula using the physics that he had long accepted as correct. If he assumed that the energy of each oscillator went as the square of the amplitude, as it does in classical physics, his formula was not reproduced. Instead, he could derive his formula for the blackbody radiation spectrum only if the oscillators absorbed and emitted packets of energy which were proportional to their frequency of oscillation, not to the square of the frequency as classical physics argued. In addition, he found that the energy could only come in certain sized chunks, so for an oscillator at frequency \nu, \; E = nh\nu, where n is an integer, and h is now known as Planck’s constant.

What does this mean? Well, in classical physics, an oscillator can have any energy, which for a particular oscillator vibrating at a particular frequency can be altered by changing the amplitude. Suppose we have an oscillator vibrating with an amplitude of 1 (in arbitrary units), then because the energy goes as the square of the amplitude its energy is E=1^{2} =1. If we increase the amplitude to 2, the energy will now be E=2^{2} = 4. But, if we wanted an energy of 2, we would need an amplitude of \sqrt{2} = 1.414, and if we wanted an energy of 3 we would need an amplitude of \sqrt{3} = 1.73.

In classical physics, there is nothing to stop us having an amplitude of 1.74, which would give us an energy of 3.0276 (not 3), or an amplitude of 1.72 which would give us an energy of 2.9584 (not 3). But, what Planck found is that this was not allowed for his oscillators, they did not seem to obey the classical laws of physics. The energy could only be integer multiples of h\nu, so E=0h\nu, 1h\nu, 2h\nu, 3h\nu, 4h\nu etc.

Then, as I said above, he further assumed that the total energy at a particular frequency was given by the energy of each oscillator at that frequency multiplied by the number of oscillators at that frequency. The frequency of a particular oscillator was, he imagined, determined by its stiffness (Hooke’s constant). The energy of a particular oscillator at a particular frequency could be varied by the amplitude of its oscillations.

Let us assume, just to illustrate the idea, that the value of h is 2. If the total energy in the blackbody at a particular frequency of, say, 10 (in arbitrary units) were 800 (also in arbitrary units), this would mean that the energy of each chunk (E=h \nu) was E = 2 \times 10 = 20. So, the number of chunks at that frequency would then be 800/20 = 40. 40 oscillators, each with an energy of 20, would be oscillating to give us our total energy of 800 at that frequency.

Because of this quantised energy, we can write that E_{n} = nh \nu, where n=0,1,2,3, \cdots.

The number of oscillators at each frequency

The next thing Planck needed to do was derive an expression for the number of oscillators at each frequency. Again, after much trial and error he found that he had to borrow an idea first proposed by Austrian physicist Ludwig Boltzmann to describe the most likely distribution of energies of atoms or molecules in a gas in thermal equilibrium. Boltzmann found that the number of atoms or molecules with a particular energy E was given by

N_{E} \propto e^{-E/kT}

where E is the energy of that state, T is the temperature of the gas and k is now known as Boltzmann’s constant. The equation is known as the Boltzmann distribution, and Planck used it to give the number of oscillators at each frequency. So, for example, if N_{0} is the number of oscillators with zero energy (in the so-called ground-state), then the numbers in the 1st, 2nd, 3rd etc. levels (N_{1}, N_{2}, N_{3},\cdots) are given by

N_{1} = N_{0} e^{ -E_{1}/kT }, \; N_{2} = N_{0} e^{ -E_{2}/kT }, \; N_{3} = N_{0} e^{ -E_{3}/kT }, \cdots

But, as E_{n} = nh \nu, we can write

N_{1} = N_{0} e^{ -h \nu /kT }, \; N_{2} = N_{0} e^{ -2h \nu /kT }, \; N_{3} = N_{0} e^{ -3h \nu /kT }, \cdots
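To get a feel for how quickly the populations of the higher levels fall away, the ratios N_{n}/N_{0} can be evaluated directly. The Python sketch below is only an illustration: the values of h and k are assumed (they are not quoted above), and the frequency and temperature are hypothetical choices, not numbers taken from Planck’s work.

```python
import math

H = 6.626e-34     # Planck's constant, J s (assumed value)
K_B = 1.381e-23   # Boltzmann's constant, J/K (assumed value)


def relative_populations(nu, temperature, n_levels=5):
    """Return N_n / N_0 = exp(-n h nu / kT) for n = 0 .. n_levels - 1."""
    x = math.exp(-H * nu / (K_B * temperature))
    return [x ** n for n in range(n_levels)]


# Hypothetical example: oscillators at 5e14 Hz in a cavity at 5000 K.
populations = relative_populations(5e14, 5000.0)
for n, ratio in enumerate(populations):
    print(f"N_{n}/N_0 = {ratio:.3e}")
```

Each level is less populated than the one below it by the same factor x = e^{-h\nu/kT}, which is what makes the sums that follow geometric in nature.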


Planck modelled blackbody radiation as a series of harmonic oscillators with equally spaced energy levels


To make it easier to write, we are going to substitute x = e^{ -h \nu / kT }, so we have

N_{1} = N_{0}x, \; N_{2} = N_{0} x^{2}, \; N_{3} = N_{0} x^{3}, \cdots

The total number of oscillators N_{tot} is given by

N_{tot} = N_{0} + N_{1} + N_{2} + N_{3} + \cdots = N_{0} ( 1 + x + x^{2} + x^{3} + \cdots)

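Remember, N_{n} is the number of oscillators in the nth energy level at a given frequency, so the energy held in each level is given by the number of oscillators in that level multiplied by the energy of each oscillator in that level (from here on E_{n} denotes this total for the level, not the energy nh\nu of a single oscillator). So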

E_{1}=N_{1} h \nu , \; E_{2} = N_{2} 2h \nu , \; E_{3} = N_{3} 3h \nu, \cdots

which we can now write as

E_{1} = h \nu N_{0}x, \; E_{2} = 2h \nu N_{0}x^{2}, \; E_{3} = 3h \nu N_{0}x^{3}, \cdots

The total energy E_{tot} is given by

E_{tot} = E_{0} + E_{1} + E_{2} + E_{3} + \cdots = N_{0} h \nu (0 + x + 2x^{2} + 3x^{3} + \cdots)

The average energy \langle E \rangle is given by

\langle E \rangle = \frac{ E_{tot} }{ N_{tot} } = \frac{ N_{0} h \nu (0 + x + 2x^{2} + 3x^{3} + \cdots) }{ N_{0} ( 1 + x + x^{2} + x^{3} + \cdots ) }

The two series inside the brackets can be summed. The sum of the series in the numerator, which we will call S_{1} is given by

S_{1} = \frac{ x - (n+1)x^{n+1} + nx^{n+2} }{ (1-x)^{2} }

(for the proof of this, see for example here)

The series in the denominator, which we will call S_{2}, is just a geometric progression. The sum  of such a series is simply

S_{2} = \frac{ 1 - x^{n} }{ (1-x) }

Both series are in x, but remember x = e^{-h \nu / kT}. Also, both series run from n = 0 \text{ to } \infty, and e^{-h \nu /kT} < 1, which means that x^{n} \rightarrow 0 as n \rightarrow \infty, so the sums converge and can be simplified.

S_{1} \rightarrow \frac{x}{ (1-x)^{2} } \text{ and } S_{2} \rightarrow \frac{ 1 }{(1-x)}

which means that \langle E \rangle = (h \nu S_{1})/S_{2} is given by

\langle E \rangle = \frac{ h \nu x }{ (1-x)^{2} } \times \frac{ (1-x) }{1} = \frac{h \nu x}{ (1-x) }

and so we can write that the average energy is

\boxed{ \langle E \rangle = \frac{h \nu}{( 1/x - 1) } = \frac{h \nu}{ (e^{h \nu/kT} - 1) } }
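As a sanity check on the algebra, the two series can be summed numerically and compared with the boxed result. The Python sketch below works in units where kT = 1, so that h\nu is measured in units of kT; the function names and the cut-off n_max are arbitrary choices made for illustration.

```python
import math


def average_energy_series(h_nu, kT, n_max=2000):
    """<E> from the truncated series h nu (0 + x + 2x^2 + ...) / (1 + x + x^2 + ...)."""
    x = math.exp(-h_nu / kT)
    numerator = sum(n * x ** n for n in range(n_max + 1))
    denominator = sum(x ** n for n in range(n_max + 1))
    return h_nu * numerator / denominator


def average_energy_closed(h_nu, kT):
    """<E> from the closed form h nu / (exp(h nu / kT) - 1)."""
    return h_nu / math.expm1(h_nu / kT)


for h_nu in (0.01, 1.0, 10.0):
    series = average_energy_series(h_nu, 1.0)
    closed = average_energy_closed(h_nu, 1.0)
    print(f"h nu = {h_nu:5.2f} kT: series = {series:.6e}, closed form = {closed:.6e}")
```

When h\nu is much smaller than kT the average energy is close to the classical value of kT, and when h\nu is much larger than kT the average energy becomes very small.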

The radiance per frequency interval

In our derivation of the Rayleigh-Jeans law (in this blog here), we showed that, using classical physics, the energy density du per frequency interval was given by

du = \frac{ 8 \pi }{ c^{3} } kT \nu^{2} \, d \nu

where kT was the energy of each mode of the electromagnetic radiation. We need to replace the kT in this equation with the average energy for the harmonic oscillators that we have just derived above. So, we re-write the energy density as

du = \frac{ 8 \pi }{ c^{3} } \frac{ h \nu }{ (e^{h\nu/kT} - 1) } \nu^{2} \; d\nu = \frac{ 8 \pi h \nu^{3} }{ c^{3} } \frac{ 1 }{ (e^{h\nu/kT} - 1) } \; d\nu

du is the energy density per frequency interval (usually measured in Joules per metre cubed per Hertz), and by replacing kT with the average energy that we derived above the radiation curve does not go as \nu^{2} as in the Rayleigh-Jeans law, but rather reaches a maximum and turns over, avoiding the ultraviolet catastrophe.

It is more common to express the Planck radiation law in terms of the radiance per unit frequency, or the radiance per unit wavelength, which are written B_{\nu} and B_{\lambda} respectively. Radiance is the power per unit solid angle per unit area. So, as a first step to go from energy density to radiance we will divide by 4 \pi, the total solid angle. This gives

\frac{ 2 h \nu^{3} }{ c^{3} } \frac{ 1 }{ (e^{h\nu/kT} - 1) } \; d\nu

We want the power per unit area, not the energy per unit volume. To do this we first note that power is energy per unit time, and second that to go from unit volume to unit area we need to multiply by length. But, for EM radiation, length is just ct. So, we need to divide by t and multiply by ct, giving us that the radiance per frequency interval is

\boxed{ B_{\nu} = \frac{ 2h \nu^{3} }{ c^{2} } \frac{ 1 }{ (e^{h\nu/kT} - 1) } \; d\nu }

which is the way the Planck radiation law per frequency interval is usually written.
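To see how this differs from the classical result, the Planck radiance can be compared with the Rayleigh-Jeans radiance, which follows from the classical energy density in the same way (divide by 4 \pi and multiply by c). The Python sketch below is an illustration only; the values of h, k and c are assumed, and the temperature and frequencies are hypothetical examples.

```python
import math

H = 6.626e-34     # Planck's constant, J s (assumed value)
K_B = 1.381e-23   # Boltzmann's constant, J/K (assumed value)
C = 2.998e8       # speed of light, m/s (assumed value)


def planck_b_nu(nu, temperature):
    """Planck radiance per unit frequency, 2 h nu^3 / c^2 / (exp(h nu / kT) - 1)."""
    return 2 * H * nu ** 3 / C ** 2 / math.expm1(H * nu / (K_B * temperature))


def rayleigh_jeans_b_nu(nu, temperature):
    """Classical radiance per unit frequency, 2 nu^2 kT / c^2."""
    return 2 * nu ** 2 * K_B * temperature / C ** 2


T = 5000.0  # hypothetical temperature in kelvin
for nu in (1e11, 1e13, 1e14, 1e15):
    ratio = planck_b_nu(nu, T) / rayleigh_jeans_b_nu(nu, T)
    print(f"nu = {nu:.0e} Hz: Planck / Rayleigh-Jeans = {ratio:.3e}")
```

At low frequencies the two laws agree, but at high frequencies the Planck radiance falls far below the classical prediction, which is how the ultraviolet catastrophe is avoided.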

Radiance per unit wavelength interval

If you would prefer the radiance per wavelength interval, we note that \nu = c/\lambda and so d\nu = -c/\lambda^{2} \; d\lambda. Ignoring the minus sign (which is just telling us that as the frequency increases the wavelength decreases), and substituting for \nu and d\nu in terms of \lambda and d\lambda, we can write

B_{\lambda} = \frac{ 2h }{ c^{2} } \frac{ c^{3} }{ \lambda^{3} } \frac{ 1 }{ ( e^{hc/\lambda kT} - 1 ) } \frac{ c }{ \lambda^{2} } \; d\lambda

Tidying up, this gives

\boxed{ B_{\lambda} = \frac{ 2hc^{2} }{ \lambda^{5} } \frac{ 1 }{ ( e^{hc/\lambda kT} - 1 ) } \; d\lambda }

which is the way the Planck radiation law per wavelength interval is usually written.
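One way to check the wavelength form is to find where it peaks and compare the answer with Wien’s displacement law, which is not derived in this blog but says that the peak wavelength multiplied by the temperature is about 2.898 \times 10^{-3} metre kelvin. The Python sketch below does this with a simple grid search. The constants are assumed values, the temperature is a hypothetical example, and the grid limits and the cut-off on the exponent are choices made purely to keep the arithmetic safe.

```python
import math

H = 6.626e-34     # Planck's constant, J s (assumed value)
K_B = 1.381e-23   # Boltzmann's constant, J/K (assumed value)
C = 2.998e8       # speed of light, m/s (assumed value)


def planck_b_lambda(lam, temperature):
    """Planck radiance per unit wavelength, 2 h c^2 / lam^5 / (exp(hc / lam kT) - 1)."""
    exponent = H * C / (lam * K_B * temperature)
    if exponent > 700.0:      # exp() would overflow; the radiance is negligible here
        return 0.0
    return 2 * H * C ** 2 / lam ** 5 / math.expm1(exponent)


def peak_wavelength(temperature, lam_min=1e-8, lam_max=1e-4, steps=20000):
    """Wavelength of maximum radiance, found on a logarithmically spaced grid."""
    best_lam, best_b = lam_min, 0.0
    for i in range(steps + 1):
        lam = lam_min * (lam_max / lam_min) ** (i / steps)
        b = planck_b_lambda(lam, temperature)
        if b > best_b:
            best_lam, best_b = lam, b
    return best_lam


T = 5772.0  # hypothetical temperature in kelvin, roughly that of the Sun's surface
lam_peak = peak_wavelength(T)
print(f"peak wavelength: {lam_peak * 1e9:.0f} nm")
print(f"peak wavelength x T: {lam_peak * T:.3e} m K")
```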


To summarise, in order to reproduce the formula which he had empirically derived and presented in October 1900, Planck found that he could only do so if he assumed that the radiation was produced by oscillating electrons, which he modelled as oscillating on a massless spring (so-called “harmonic oscillators”). The total energy at any given frequency would be given by the energy of a single oscillator at that frequency multiplied by the number of oscillators oscillating at that frequency.

However, he had to assume that

  1. The energy of each oscillator was not related to either the square of the amplitude of oscillation or the square of the frequency of oscillation (as it would be in classical physics), but rather was proportional to the frequency, E \propto \nu.
  2. The energy of each oscillator could only be a multiple of some fundamental “chunk” of radiation, h \nu, so E_{n} = nh\nu where n=0,1,2,3,4 etc.
  3. The number of oscillators with each energy E_{n} was given by the Boltzmann distribution, so N_{n} = N_{0} e^{-nh\nu/kT} where N_{0} is the number of oscillators in the lowest energy state.

In a way, we can imagine that the oscillators at higher frequencies (to the high-frequency side of the peak of the blackbody) are “frozen out”. The quantum of energy for a particular oscillator, h\nu, is just too large compared with the thermal energy available (of order kT) for many oscillators to be excited at the higher frequencies. This avoids the ultraviolet catastrophe which had stumped physicists up until this point.

By combining these assumptions, Planck was able in November 1900 to reproduce the exact equation which he had derived empirically in October 1900. In doing so he provided, for the first time, a physical explanation for the observed blackbody curve.

  • Part 1 of this blog series is here.
  • Part 2 is here.
  • Part 3 is here.




For several weeks now I have been planning to write a blog about centrifugal force, mainly prompted by seeing a post by John Gribbin on Facebook of the xkcd cartoon about it. In the cartoon James Bond is threatened with torture on a centrifuge. Here is a link to the original cartoon.


The xkcd cartoon about centrifugal force involves James Bond being tortured on a centrifuge

I have taught mechanics many times to physics undergraduates, and they are often confused about centripetal force and centrifugal force, and what the difference is between them. Some have heard that centrifugal force doesn’t really exist, just as Bond states in this cartoon. What is the real story?

Rotating frames of reference

Everyone reading this (apart from a few “flat-Earth adherents” maybe) knows that we live on the surface of a planet which is rotating on its axis once a day. This means that we do not live in an inertial frame of reference (an inertial frame is one which is not accelerating), as clearly being on the surface of a spinning planet means that we are experiencing acceleration all the time, because we are not travelling in a straight line. That acceleration is provided by the force of gravity, and it stops us from going off in a straight line into space!

Because we are living in a non-inertial frame of reference we need to modify Newton’s laws of motion to properly describe such a non-inertial frame (which I am going to call a “rotating frame” from now on; a rotating frame is just one example of a non-inertial frame, but it is the one relevant to us on the surface of a rotating Earth).

Let us consider our usual Cartesian coordinate system. The unit vector in the x-direction is usually written as \hat{\imath}, the one in the y-direction as \hat{\jmath}, and the one in the z-direction as \hat{k}. We are going to consider an object rotating about the \hat{k} (z-axis) direction.

We will consider two reference frames, one which stays fixed (the inertial reference frame), denoted by (\hat{\imath},\hat{\jmath},\hat{k}), and a second reference frame which rotates with the rotation, denoted by (\hat{\imath}_{r} ,\hat{\jmath}_{r} ,\hat{k}_{r}), where the subscript r reminds us that this is the rotating frame of reference.

For the derivation below I am going to assume that we are considering motion with a constant radius r. I want to illustrate how centrifugal force arises in a rotating frame such as being on the surface of our Earth. Our Earth is not spherical, but at any given point the size of the radius does not change, so this is a reasonable simplification.

As I showed in this blog on angular velocity, we can write the linear velocity \vec{v} of an object moving in a circle as

\vec{v} = \frac{ d \vec{r} }{ dt } = \vec{\omega} \times \vec{r}

where \vec{r} is the radius vector and \vec{\omega} is the angular velocity.

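Writing \vec{r} in terms of its x, y and z-components in our inertial (non-rotating) frame, \vec{r} = x\hat{\imath} + y\hat{\jmath} + z\hat{k}. The same relation holds for any vector of fixed length which is carried around by the rotation, including each of the unit vectors, so in general we then have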

\vec{v} = \frac{ d \vec{r} }{ dt } \rightarrow \frac{ d \hat{\imath} }{dt} = \vec{\omega} \times \hat{\imath}, \; \; \frac{ d \hat{\jmath} }{dt} = \vec{\omega} \times \hat{\jmath} , \; \; \frac{ d \hat{k} }{dt} = \vec{\omega} \times \hat{k}

Let us consider the specific case of a small rotation \delta \theta about the \hat{k} axis, as shown in the figure below. As the figure shows, in our inertial (fixed) frame of reference, the new direction of the x-axis is now \hat{\imath} + \delta \hat{\imath}, and the new direction of the y-axis is \hat{\jmath} + \delta \hat{\jmath}. The direction of the \hat{k} axis is unchanged.


We are going to rotate about the z-axis (\hat{k} direction) by an angle \delta \theta

Because we are rotating about the \hat{k} axis, the angular velocity is in this direction, and so we can write (using the right-hand rule for vector products as I blogged about here)

\vec{\omega} \times \hat{\imath} = \omega \hat{\jmath}, \; \; \vec{\omega} \times \hat{\jmath} = -\omega \hat{\imath}, \; \; \vec{\omega} \times \hat{k} =0
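These three vector products are easy to verify by writing out the components. The Python sketch below does this with a small hand-written cross product; the function name and the value chosen for \omega are arbitrary and only for illustration.

```python
def cross(a, b):
    """Vector (cross) product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


omega = 2.0                       # arbitrary angular speed, rad/s
omega_vec = (0.0, 0.0, omega)     # rotation about the k (z) axis

i_hat = (1.0, 0.0, 0.0)
j_hat = (0.0, 1.0, 0.0)
k_hat = (0.0, 0.0, 1.0)

print(cross(omega_vec, i_hat))    # omega in the j direction
print(cross(omega_vec, j_hat))    # omega in the minus i direction
print(cross(omega_vec, k_hat))    # the zero vector
```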

Let us now consider some vector \vec{a}, which we will write in the rotating frame of reference as

\vec{a} = a_{x} \hat{\imath}_{r} + a_{y} \hat{\jmath}_{r} + a_{z} \hat{k}_{r}

If we now look at the rate of change of this vector in the rotating frame we have

\left( \frac{d \vec{a} }{dt} \right)_{r} = \frac{d}{dt}(a_{x}\hat{\imath}_{r}) + \frac{d}{dt}(a_{y}\hat{\jmath}_{r}) + \frac{d}{dt}(a_{z}\hat{k}_{r})

In the rotating frame of reference, \hat{\imath}_{r}, \hat{\jmath}_{r} and \hat{k}_{r} do not change with time, so we can write

\left( \frac{d \vec{a} }{dt} \right)_{r} = \frac{ da_{x} }{dt} \hat{\imath}_{r} + \frac{ da_{y} }{dt} \hat{\jmath}_{r} + \frac{ da_{z} }{dt} \hat{k}_{r}

In the inertial frame of reference \hat{\imath}_{r}, \hat{\jmath}_{r} and \hat{k}_{r} move, so

\left( \frac{d \vec{a} }{dt} \right)_{i} = \frac{d}{dt} (a_{x} \hat{\imath}_{r}) + \frac{d}{dt} (a_{y} \hat{\jmath}_{r}) + \frac{d}{dt} (a_{z} \hat{k}_{r})

\left( \frac{d \vec{a} }{dt} \right)_{i} = \frac{ da_{x} }{dt}\hat{\imath}_{r} + \frac{ da_{y} }{dt}\hat{\jmath}_{r} + \frac{ da_{z} }{dt}\hat{k}_{r} + a_{x} \frac{d \hat{\imath}_{r} }{dt} + a_{y} \frac{d \hat{\jmath}_{r} }{dt} + a_{z} \frac{d \hat{k}_{r} }{dt}

But, we can write (see above) that

\frac{d\hat{\imath}_{r} }{dt} = \vec{\omega} \times \hat{\imath}_{r}, \; \; \frac{d\hat{\jmath}_{r} }{dt} = \vec{\omega} \times \hat{\jmath}_{r}, \; \; \frac{d\hat{k}_{r} }{dt} = \vec{\omega} \times \hat{k}_{r}

and so

\left( \frac{d \vec{a} }{dt} \right)_{i} = \frac{ da_{x} }{dt}\hat{\imath}_{r} + \frac{ da_{y} }{dt}\hat{\jmath}_{r} + \frac{ da_{z} }{dt}\hat{k}_{r} + a_{x} \vec{\omega} \times \hat{\imath}_{r} + a_{y} \vec{\omega} \times \hat{\jmath}_{r} + a_{z} \vec{\omega} \times \hat{k}_{r}

\left( \frac{d \vec{a} }{dt} \right)_{i} = \frac{ da_{x} }{dt}\hat{\imath}_{r} + \frac{ da_{y} }{dt}\hat{\jmath}_{r} + \frac{ da_{z} }{dt}\hat{k}_{r} + \vec{\omega} \times \vec{a}

\boxed{ \left( \frac{d \vec{a} }{dt} \right)_{i} = \left( \frac{d \vec{a} }{dt} \right)_{r} + (\vec{\omega} \times \vec{a}) }
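The boxed relation can be checked numerically for a simple case. In the Python sketch below, a vector is given components in the rotating frame (one of which changes with time), it is converted to the inertial frame, and its rate of change there is estimated with a finite difference and compared with the right-hand side of the boxed equation. The angular speed, the components and all of the function names are arbitrary choices made for illustration.

```python
import math

OMEGA = 0.7  # arbitrary angular speed about the k axis, rad/s


def cross(a, b):
    """Vector (cross) product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotating_basis(t):
    """The rotating unit vectors i_r, j_r, k_r written in inertial coordinates."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0)


def components(t):
    """Components (a_x, a_y, a_z) of the vector in the rotating frame."""
    return (1.0 + 0.5 * t, 2.0, 3.0)


def component_rates(t):
    """Time derivatives of the rotating-frame components."""
    return (0.5, 0.0, 0.0)


def combine(coeffs, basis):
    """Sum coeffs[n] * basis[n], giving a vector in inertial coordinates."""
    return tuple(sum(c * e[k] for c, e in zip(coeffs, basis)) for k in range(3))


def a_inertial(t):
    return combine(components(t), rotating_basis(t))


def lhs(t, dt=1e-6):
    """(da/dt)_i estimated with a central finite difference."""
    ahead, behind = a_inertial(t + dt), a_inertial(t - dt)
    return tuple((p - m) / (2 * dt) for p, m in zip(ahead, behind))


def rhs(t):
    """(da/dt)_r + omega x a, written in inertial coordinates."""
    rate_r = combine(component_rates(t), rotating_basis(t))
    spin = cross((0.0, 0.0, OMEGA), a_inertial(t))
    return tuple(r + s for r, s in zip(rate_r, spin))


for t in (0.0, 1.3, 4.0):
    print(t, lhs(t), rhs(t))
```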

A fixed point on the Earth’s surface

Let us now consider the point \vec{a} = \vec{r}, where \vec{r} is a fixed point on the Earth’s surface. We can write

\left( \frac{d \vec{r} }{dt} \right)_{i} = \left( \frac{d \vec{r} }{dt} \right)_{r} + (\vec{\omega} \times \vec{r})

But, in the rotating frame of reference this point does not change with time, so

\left( \frac{d \vec{r} }{dt} \right)_{r} = 0

and so

\left( \frac{d \vec{r} }{dt} \right)_{i} = \vec{\omega} \times \vec{r}, \; \; \left| \vec{\omega} \times \vec{r} \right| = \omega r \sin(\theta)

where \theta is the angle between the Earth’s rotation axis and the radius vector \vec{r} to the point (so \theta = 90^{\circ} - \text{ latitude}).

Let us now calculate the acceleration in an inertial frame in terms of acceleration in a rotating frame. Writing \vec{a} as \vec{r} as above, we now have

\left( \frac{d \vec{r} }{dt} \right)_{i} = \left( \frac{d \vec{r} }{dt} \right)_{r} + (\vec{\omega} \times \vec{r})

To make things easier to write, we will re-write

\left( \frac{d \vec{r} }{dt} \right)_{i} = \frac{d \vec{r}_{i} }{dt} \text{ and } \left( \frac{d \vec{r} }{dt} \right)_{r} = \frac{d \vec{r}_{r} }{dt}


\frac{d \vec{r}_{i} }{dt} = \frac{d \vec{r}_{r} }{dt} + (\vec{\omega} \times \vec{r})

\vec{v}_{i} = \vec{v}_{r} + (\vec{\omega} \times \vec{r})

If we now differentiate \vec{v}_{i} with respect to time, we will have the acceleration in the inertial frame

\left( \frac{d \vec{v}_{i} }{dt} \right)_{i} = \left( \frac{d \vec{v}_{i} }{dt} \right)_{r} + (\vec{\omega} \times \vec{v}_{i})

But, \vec{v}_{i} = \vec{v}_{r} + (\vec{\omega} \times \vec{r})


\left( \frac{d \vec{v}_{i} }{dt} \right)_{i} = \frac{d}{dt}(\vec{v}_{r} + \vec{\omega} \times \vec{r})_{r} + \vec{\omega} \times (\vec{v}_{r} + \vec{\omega} \times \vec{r})

Expanding this out we get

\left( \frac{d \vec{v}_{i} }{dt} \right)_{i} = \left( \frac{d \vec{v}_{r} }{dt} \right)_{r} + \frac{d}{dt}(\vec{\omega} \times \vec{r}_{r}) + \vec{\omega} \times \vec{v}_{r} + \vec{\omega} \times (\vec{\omega} \times \vec{r}_{r})

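For a constant \vec{\omega}, the second term is \frac{d}{dt}(\vec{\omega} \times \vec{r}_{r}) = \vec{\omega} \times \vec{v}_{r}, so it combines with the third term to give

\vec{a}_{i} = \vec{a}_{r} + 2\vec{\omega} \times \vec{v}_{r} + \vec{\omega} \times (\vec{\omega} \times \vec{r}_{r})

Multiplying the acceleration by the mass m to get a force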

m\vec{a}_{i} = m\vec{a}_{r} + 2m\vec{\omega} \times \vec{v}_{r} + m\vec{\omega} \times (\vec{\omega} \times \vec{r}_{r})

So, writing the force in the rotating frame in terms of the force in the inertial frame, we have

\boxed{ m\vec{a}_{r} = m\vec{a}_{i} - 2m\vec{\omega} \times \vec{v}_{r} - m\vec{\omega} \times (\vec{\omega} \times \vec{r}_{r}) }


\boxed{\vec{F}_{r} = \vec{F}_{i} - 2m\vec{\omega} \times \vec{v}_{r} - m\vec{\omega} \times (\vec{\omega} \times \vec{r}_{r}) }

Notice that there are two extra terms (Term A and Term B) in the equation on the right, I have highlighted them below.

If we compare the force in a rotating frame to an inertial frame, two extra terms (Term A and Term B) arise. Term A is the Coriolis force, Term B is the centrifugal force


Term A is what we call the Coriolis force, which depends on the velocity in the rotating frame v_{r}. It is often said to be the force which causes water going down a plughole to rotate anti-clockwise in the northern hemisphere and clockwise in the southern hemisphere, although on the scale of a sink the effect is far too small to be noticed in practice. It is the force which determines the direction of rotation of low pressure systems in the atmosphere. I will discuss the Coriolis force more in a future blog.

Term B is the centrifugal force, the force we were aiming to derive in this blogpost. The strength of the centrifugal force depends on the position of the object in the rotating frame – r_{r}.
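As a rough numerical check of the boxed result, here is a minimal Python sketch. It assumes a particle with no real force acting on it (so F_i = 0), moving in the plane at right angles to the rotation axis, and a frame rotating anti-clockwise at a constant rate. The rotation rate, starting position and velocity are arbitrary illustrative values, not anything taken from the derivation. If the result is right, the acceleration measured in the rotating frame should equal Term A plus Term B.

```python
import math

OMEGA = 0.5  # rad/s, arbitrary illustrative rotation rate about the z-axis


def inertial_position(t):
    """Force-free particle: a straight line at constant velocity (made-up values)."""
    return (1.0 + 0.3 * t, -0.5 + 0.8 * t)


def rotating_position(t):
    """The same particle seen from a frame rotating anti-clockwise at OMEGA."""
    x, y = inertial_position(t)
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (x * c + y * s, -x * s + y * c)


def compare_accelerations(t, h=1e-3):
    """Return (measured, predicted) accelerations in the rotating frame at time t."""
    xm, ym = rotating_position(t - h)
    x0, y0 = rotating_position(t)
    xp, yp = rotating_position(t + h)

    # Central finite differences for velocity and acceleration in the rotating frame
    vx, vy = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    ax, ay = (xp - 2 * x0 + xm) / h**2, (yp - 2 * y0 + ym) / h**2

    # Term A: -2 omega x v_r, with omega along z
    coriolis = (2 * OMEGA * vy, -2 * OMEGA * vx)
    # Term B: -omega x (omega x r_r), which in this plane is +omega^2 r_r
    centrifugal = (OMEGA**2 * x0, OMEGA**2 * y0)

    predicted = (coriolis[0] + centrifugal[0], coriolis[1] + centrifugal[1])
    return (ax, ay), predicted


measured, predicted = compare_accelerations(2.0)
print("measured :", measured)
print("predicted:", predicted)
```

The two printed vectors should agree to within the small error of the finite-difference approximation.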

What is the direction of the centrifugal force?

The direction of (\vec{\omega} \times \vec{r}_{r}) can be found using the right-hand rule for the vector product, which I blogged about here. Remembering that the direction of \vec{r}_{r} is radially outwards from the centre of the Earth, and the direction of \vec{\omega} is the direction of the Earth’s axis (pointing north), then the direction of \vec{\omega} \times \vec{r}_{r} is towards the east (right if looking at the Earth with the North pole up).

We now need to take the vector product of \vec{\omega} with a vector in this eastwards direction, and again using the right-hand rule gives us that the direction of \vec{\omega} \times (\vec{\omega} \times \vec{r}_{r}) is inwards (not radially towards the centre of the Earth, but at right angles to the axis of the Earth). But, notice the centrifugal force has a minus sign in front of it, so the direction of the centrifugal force is outwards, away from and at right angles to the Earth’s axis.

The direction of the centrifugal force is away from the axis of rotation, as shown in this diagram


This means that it acts to reduce the force of gravity which keeps us on the Earth’s surface. It also depends on the angle between where you are and the Earth’s axis, so is greatest at the equator and goes to zero at the pole. It means that you will weigh slightly less than if the Earth were not rotating, but the effect is quite small and you would not notice such a difference going from the pole to the equator.
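The direction and the dependence on latitude can be checked with a short Python sketch. It assumes a perfectly spherical Earth with a radius of 6,378.1 km and a rotation period of 24 hours, with the rotation axis along z (pointing north) and the observer in the x-z plane. The function names are just illustrative.

```python
import math

OMEGA = 2 * math.pi / 86400.0  # rad/s, assuming one rotation every 24 hours
R_EARTH = 6378.1e3             # m, assumed radius of a spherical Earth


def cross(a, b):
    """Vector (cross) product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def centrifugal_acceleration(latitude_deg):
    """Return -omega x (omega x r) for a point at the given latitude."""
    lat = math.radians(latitude_deg)
    omega = (0.0, 0.0, OMEGA)                                  # along the axis, pointing north
    r = (R_EARTH * math.cos(lat), 0.0, R_EARTH * math.sin(lat))
    eastward = cross(omega, r)                                 # omega x r, points east (+y here)
    inward = cross(omega, eastward)                            # omega x (omega x r), towards the axis
    return tuple(-component for component in inward)           # minus sign flips it outwards


for latitude in (0, 45, 90):
    ax, ay, az = centrifugal_acceleration(latitude)
    print(f"latitude {latitude:2d} deg: a = ({ax:.5f}, {ay:.5f}, {az:.5f}) m/s/s")
```

In this set-up the only non-zero component is along +x, that is, away from the axis and at right angles to it, and it shrinks towards zero as the latitude approaches 90 degrees.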

What is the strength of the centrifugal acceleration due to Earth’s rotation?

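Let us calculate the centrifugal acceleration at the Earth’s equator, where it is at its greatest.

At the equator, the angle \theta between your position vector and the Earth’s axis is 90^{\circ}, so the magnitude of the centrifugal acceleration, \omega^{2} r \sin\theta, has a value of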

\omega^{2} r \text{ as } \theta = 90^{\circ}

We can calculate \omega for the Earth by remembering that it takes 24 hours to rotate once, and \omega is related to the period T of rotation via

\omega = \frac{2 \pi }{ T}

We need to convert the period T to seconds, so T = 24 \times 60 \times 60 = 86400 \; s. This gives that

\omega = 7.272 \times 10^{-5} \text{ rad/s }

If we take the Earth’s radius to be 6,378.1 km (this is the radius at the equator), then we have that

\omega^{2} r = 0.0337 \text{ m/s/s}

Compare this to the acceleration due to gravity which pulls us towards the Earth’s surface, which is 9.81 m/s/s, and we can see that the centrifugal acceleration at its greatest is only 0.34% of the acceleration due to gravity. Tiny.
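The arithmetic above is easy to reproduce. Here is a minimal Python sketch using the same inputs as the text (a 24-hour period, an equatorial radius of 6,378.1 km and g = 9.81 m/s/s); the variable names are my own.

```python
import math

period_s = 24 * 60 * 60   # rotation period T in seconds
radius_m = 6378.1e3       # Earth's equatorial radius in metres
g = 9.81                  # acceleration due to gravity in m/s/s

omega = 2 * math.pi / period_s        # angular velocity in rad/s
a_centrifugal = omega**2 * radius_m   # centrifugal acceleration at the equator
percent_of_g = 100 * a_centrifugal / g

print(f"omega         = {omega:.4e} rad/s")
print(f"omega^2 r     = {a_centrifugal:.4f} m/s/s")
print(f"fraction of g = {percent_of_g:.2f} %")
```

The printed values should match the figures quoted above to the number of significant figures given.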

It is, however, noticeable when you are on a roundabout, and is used on fairground rides where you spin inside a drum and the floor moves away leaving you pinned to the wall of the drum. The force you feel pushing against this wall is the centrifugal force, and it is very real for you in that rotating frame!

So, there we have it, centrifugal force does exist in a rotating frame of reference, but does not exist from the perspective of someone in an inertial frame of reference.


There has been quite a bit of mention in the media this last week or so that it is 100 years since Albert Einstein published his ground-breaking theory of gravity – the general theory of relativity. Yet, there seems to be some confusion as to when this theory was first published; in some places you will see 1915, in others 1916. So, I thought I would try and clear up this confusion by explaining why both dates appear.


Albert Einstein in Berlin circa 1915/16 when his General Theory of Relativity was first published

From equivalence to the field equations

Everyone knew that Einstein was working on a new theory of gravity. As I blogged about here, he had his insight into the equivalence between acceleration and gravity in 1907, and ever since then he had been developing his ideas to create a new theory of gravity.

He had come up with his principle of equivalence when he was asked in the autumn of 1907 to write a review article on his special theory of relativity (his 1905 theory) for Jahrbuch der Radioaktivität und Elektronik (the Yearbook of Radioactivity and Electronics). That paper appeared in 1908 as Über das Relativitätsprinzip und die aus demselben gezogenen Folgerungen (On the Relativity Principle and the Conclusions Drawn from It) (Jahrbuch der Radioaktivität, 4, 411–462).

In 1908 he got his first academic appointment, and did not return to thinking about a generalisation of special relativity until 1911. In 1911 he published a paper Einfluss der Schwerkraft auf die Ausbreitung des Lichtes (On the Influence of Gravitation on the Propagation of Light) (Annalen der Physik (ser. 4), 35, 898–908), in which he calculated for the first time the deflection of light produced by massive bodies. But, he also realised that, to properly develop his ideas of a new theory of gravity, he would need to learn some mathematics which was new to him. In 1912, he moved to Zurich to work at the ETH, his alma mater. He asked his friend Marcel Grossmann to help him learn this new mathematics, saying “You’ve got to help me or I’ll go crazy.”

Grossmann gave Einstein a book on non-Euclidean geometry. Euclidean geometry, the geometry of flat surfaces, is the geometry we learn in school. The geometry of curved surfaces, so-called Riemannian geometry, had first been developed in the 1820s by the German mathematician Carl Friedrich Gauss. In the 1850s another German mathematician, Bernhard Riemann, developed this geometry of curved surfaces even further, and it was a textbook on this Riemannian geometry which Grossmann gave to Einstein in 1912. Mastering this new mathematics proved very difficult for Einstein, but he knew that he needed to master it to be able to develop the equations for general relativity.

These equations were not ready until late 1915. Everyone knew Einstein was working on them, and in fact he was offered and accepted a job in Berlin in 1914, as Berlin wanted him on their staff when the new theory was published. The equations of general relativity were first presented on the 25th of November 1915, to the Prussian Academy of Sciences. The lecture Feldgleichungen der Gravitation (The Field Equations of Gravitation) was the fourth and last lecture that Einstein gave to the Prussian Academy on his new theory (Preussische Akademie der Wissenschaften, Sitzungsberichte, 1915 (part 2), 844–847). The previous three lectures, given on the 4th, 11th and 18th of November, had been leading up to this. But, in fact, Einstein did not have the field equations ready until the last few days before the fourth lecture!

The peer-reviewed paper of the theory (which also contains the field equations) did not appear until 1916, in volume 49 of Annalen der Physik, as Grundlage der allgemeinen Relativitätstheorie (The Foundation of the General Theory of Relativity) (Annalen der Physik (ser. 4), 49, 769–822). The paper was submitted by Einstein on the 20th of March 1916.


The beginning of Einstein’s first peer-reviewed paper on general relativity, which was received by Annalen der Physik on the 20th of March 1916

In a future blog, I will discuss Einstein’s field equations, but hopefully I have cleared up the confusion as to why some people refer to 1915 as the year of publication of the General Theory of Relativity, and some people choose 1916. Both are correct, which allows us to celebrate the centenary twice!

You can read more about Einstein’s development of the general theory of relativity in our book 10 Physicists Who Transformed Our Understanding of Reality.
