I am getting ready for my spring class on statistical graphics.  One idea that we will talk about in that course is the importance of logs in helping us understand patterns.  We have also seen in our EDA class that it is useful to reexpress data by taking logs.

Here are some data that deserve a log reexpression.   Nowadays everyone owns a cell phone, but it wasn’t always that way.   I’m interested in studying the growth of cell phones, so I found some data.  The data file cellphones.txt (found at http://bayes.bgsu.edu/eda/data/cellphones.txt) gives the number of cell phones in the United States for each year in the 25-year period from 1985 to 2010.

I plot the data.

Obviously this is not a straight-line pattern, so it is not appropriate to fit a line.   Actually, it looks exponential, and we can (hopefully) straighten the plot by applying an appropriate reexpression to Number.
Any type of log (base 10, base e, base 2, etc.) will help in straightening the graph, since logs in different bases differ only by a constant multiple, but there is some advantage to taking log base 2.
The following graph plots log2(Number) against Year.  The graph looks much straighter, especially for the years 1985 through 1995.
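Here is a minimal sketch of the reexpression step in Python.  It does not use the cellphones.txt data; the series below is made up (a count that starts at 100 and doubles every two years), and the function names log2_reexpress and fit_line are my own choices for illustration.  The point is only that an exponential pattern becomes a straight line after taking log base 2, so a least-squares line fits the logged values with essentially no residual.

```python
import math

def log2_reexpress(counts):
    """Return log base 2 of each count (counts must be positive)."""
    return [math.log2(c) for c in counts]

def fit_line(xs, ys):
    """Least-squares slope and intercept of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic, made-up series (NOT the cellphones.txt data):
# a count that starts at 100 in 1985 and doubles every two years.
years = list(range(1985, 1996))
counts = [100 * 2 ** ((y - 1985) / 2) for y in years]

log_counts = log2_reexpress(counts)
slope, intercept = fit_line(years, log_counts)

# Residuals from the fitted line; near zero means the logged data are straight.
residuals = [lc - (intercept + slope * y) for y, lc in zip(years, log_counts)]

print("slope on the log2 scale:", round(slope, 3))
print("largest residual:", max(abs(r) for r in residuals))
```

For this made-up series the slope on the log 2 scale should come out very close to 0.5, which says the count doubles about every two years.  With real data the residuals will not vanish, and looking at them is a good way to see where the straight-line pattern breaks down.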

Let’s focus on the interpretation of the growth.  One advantage of taking logs base 2 is that an increase of 1 on the log 2 scale is equivalent to doubling the value.  Similarly, an increase of 2 on the log 2 scale is equivalent to multiplying the value by a factor of 4.
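The arithmetic behind this interpretation can be sketched in a few lines of Python.  The helper names log2_change and growth_factor are hypothetical ones I chose for illustration, and the numbers passed to them are arbitrary, not values from the data file.

```python
import math

def log2_change(start, end):
    """Change on the log 2 scale when a value goes from start to end."""
    return math.log2(end) - math.log2(start)

def growth_factor(change_in_log2):
    """Multiplicative growth implied by a given change on the log 2 scale."""
    return 2 ** change_in_log2

# A change of 1 on the log 2 scale is a doubling; a change of 2 is a four-fold increase.
print(growth_factor(1), growth_factor(2))

# Going from 5 to 20 (arbitrary numbers) is a four-fold increase,
# so the change on the log 2 scale is about 2.
print(log2_change(5, 20))
```

So reading a vertical change of k off the log 2 graph tells me the count was multiplied by 2 to the power k over that stretch of years.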

So I can tell from the graph that

• The number of cell phones increased four-fold between 1986 and 1989
• The number of cell phones increased again four-fold between 1990 and 1994
• The number of cell phones again increased by a factor of 4 between 1998 and 2010
• Actually it appears that we have reached a saturation point in the number of cell phones.  Maybe we can start counting the number of iPads in the U.S.