Setting up a histogram in Excel is not too difficult but you might be surprised to find that there are several options to allow you to do that. In this post I will demonstrate three ways of setting up a histogram and mention a fourth.

I think the most important part of setting up a histogram is in setting up the bin and frequency ranges. If you set up the bin range manually, you might find it is a bit difficult to do so there are two or three ways in which Excel can help with this.

Setting a Bin Range

John Walkenbach, in his Excel Formulas books gives us this array formula that prepares a bin range automatically:

=MIN(data)+(ROW(INDIRECT(“1:10”))*(MAX(data)-MIN(data)+1)/10)-1

The value of 10 in this formula tells you that the bin range is 10 rows deep: change from 10 to 7 and you get a bin range 7 rows deep. The screenshot below is from an Excel file I prepared that uses this formula so you can see what it does.

My personal approach to setting up a bin range is to use =MAX() and =MIN() to find the extreme points in the data series and then I manually set the bin range from there: again, you can see what I did in the screenshot below.

Take a look at the screenshot below and you will see that the two bin creation methods can given different looking results as the boundaries or class intervals will usually vary from method to method.

Finding the Frequencies

There are several ways of finding the frequencies for a histogram: let’s look at two methods and I will mention a third.

I tend to use the array entered FREQUENCY() function to find the frequencies in my histograms; and in the screenshot you will see the effects of using this formula:

=FREQUENCY(B4:B103,G4:G13)

To use this formula, you need to select the correct range size before entering the formula.

Alternatively, to use the following array entered formula you can enter it in just the top cell in the frequency range and then drag it down to fill the rest of the range:

=SUM((data>=D20)*(data<=E20))

The Other Method!

I have mentioned a fourth method, which is to use the ToolPak Add-In Histogram function and that can do everything for you, even the bin range. I do not demonstrate the use of the ToolPak method here but it is a very useful thing to use. However, it has one major drawback: it is not volatile. What that means is that if you change any of the raw data points, the histogram will not be updated unless you run the ToolPak again. On the other hand, the other methods are volatile.

Duncan Williamson

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: