I don't know if it is the average. I had always thought that if you took the three channel histograms and laid them on top of each other you would get the normal histogram. But now I would like to know.
I was intrigued, so I spent the last 20 minutes playing in Photoshop, and here's what I've come up with. A histogram is a count of pixels at each brightness level. The red, green, and blue channel histograms, which are the simplest ones to understand, may still surprise you.
Assume you have an image with 3 stripes: one pure red, one pure green, one pure blue, each taking up a third of the image. Here is what the histograms show:
The R, G, and B channel histograms will each have a peak at 255 at half the height of the peak at 0. This may sound surprising at first, but you need to realise that in the red channel there is twice as much black (the green and blue stripes) as there is pure red, and the same goes for the green and blue channels.
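To make that concrete, here's a quick Python sketch using a toy 9-pixel version of the striped image (the variable names are mine, and this is just an illustration, not anything Photoshop does internally):

```python
from collections import Counter

# Toy 9-pixel striped image: three pure-red pixels, three pure-green,
# three pure-blue, stored as (R, G, B) tuples.
pixels = [(255, 0, 0)] * 3 + [(0, 255, 0)] * 3 + [(0, 0, 255)] * 3

red_hist = Counter(p[0] for p in pixels)
green_hist = Counter(p[1] for p in pixels)
blue_hist = Counter(p[2] for p in pixels)

# In every channel the peak at 0 (6 pixels) is twice the peak at 255 (3 pixels).
print(red_hist[0], red_hist[255])  # 6 3
```

The same two numbers come out for the green and blue channels, which is exactly the 2:1 black-to-colour ratio described above.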
This should instantly show why the channel histograms can't give the distribution of tone in an image: there is no black in the image, yet every channel histogram shows far more black than colour.
Now the RGB, or composite colour histogram, is the average of the three channel histograms, level by level. Again it shows twice as much pure dark as pure light, which is misleading, but it tells us the counts are directly proportional to those in each channel. So yes, if you lay each channel on top of the others and average them you end up with a histogram (the RGB histogram), but not a normal one. It doesn't take into account that the three stripes don't look equally bright to me.
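If I've understood the averaging right, it can be sketched like this (again a toy 9-pixel striped image, one third each of pure red, green, and blue):

```python
from collections import Counter

# Toy 9-pixel striped image as (R, G, B) tuples.
pixels = [(255, 0, 0)] * 3 + [(0, 255, 0)] * 3 + [(0, 0, 255)] * 3

# One histogram per channel (index 0 = R, 1 = G, 2 = B).
channel_hists = [Counter(p[c] for p in pixels) for c in range(3)]

# Average the three channel counts at each brightness level.
rgb_hist = {level: sum(h[level] for h in channel_hists) / 3 for level in (0, 255)}

print(rgb_hist)  # {0: 6.0, 255: 3.0} -- twice as much "dark" as "light"
```

The composite keeps the same misleading 2:1 dark-to-light ratio as each individual channel.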
The luminosity channel, however, does not concern itself with the pixel value of a single channel. Instead it looks at the perceived (note that word, it's very important) brightness of each individual pixel. In the histogram for this channel you will see 3 distinct peaks, all distributed towards the middle and low ends of the histogram. None is at white and none is at black, since none of the 3 colours is perceived as pure white or pure black, which is exactly what you'd expect.
What may not be expected is that blue is assigned a value of 28, red 76, and green 150. This is easily explained: the eye perceives the brightness of each colour differently. There are a few standards for colour-to-brightness conversion used by various television and media formats, some more accurate than others (I feel for you Americans and your NTSC), but these standards typically supply weighting coefficients to use in the calculation.
Anyway, each pixel is counted by the weighted sum Y = 0.299*R + 0.587*G + 0.114*B, reflecting how each channel has a different perceived brightness. It may come as no surprise that 0.299 + 0.587 + 0.114 = 1, so pure white (255, 255, 255) gives Y = 255 for the histogram, and any other colour gives its perceived value.
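Plugging the three pure stripe colours into that formula reproduces the peaks (my rounding lands on 29 for blue where Photoshop showed me 28, which I put down to a rounding or colour-space quibble):

```python
def luma(r, g, b):
    # The weighted sum from the text: Y = 0.299*R + 0.587*G + 0.114*B
    return 0.299 * r + 0.587 * g + 0.114 * b

print(round(luma(255, 0, 0)))      # 76  (pure red)
print(round(luma(0, 255, 0)))      # 150 (pure green)
print(round(luma(0, 0, 255)))      # 29  (pure blue)
print(round(luma(255, 255, 255)))  # 255 (pure white)
```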
In a quick google I also found out that this method is inaccurate: the actual perceived brightness is supposed to be the square root of a weighted sum of squares. However, since that involves 3 squares and a square root per pixel, no one would want to wait for Photoshop or their camera to calculate that histogram on a 15 MP image.
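For the record, the costlier formula I found is commonly quoted with the same weights under the square root (I can't vouch that this is the exact variant every source means):

```python
import math

def perceived_brightness(r, g, b):
    # Square root of a weighted sum of squares: the 3 squares and one
    # square root per pixel mentioned above, reusing the 0.299/0.587/0.114 weights.
    return math.sqrt(0.299 * r**2 + 0.587 * g**2 + 0.114 * b**2)

print(round(perceived_brightness(255, 255, 255)))  # 255 -- white still maps to white
```

Note that pure red comes out around 139 here instead of 76, so the two formulas disagree quite a lot for saturated colours.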
In addition, the above formula is what Photoshop uses. Cameras are much more likely to use yet another approximation: Y = 0.375*R + 0.5*G + 0.125*B. This is quite a bit less accurate but has the distinct advantage of involving very, very basic maths. It can be written as Y = (3*R + 4*G + B) / 8, which in micro-controller code is Y = (R+R+R+G+G+G+G+B) >> 3 and is a VERY efficient integer operation compared to the more accurate method above.
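Here's that shift trick as a minimal sketch (Python for illustration; on a micro-controller it would be the same expression in C):

```python
def fast_luma(r, g, b):
    # (3R + 4G + B) / 8 via a right shift: weights 0.375, 0.5, 0.125.
    # Eight additions and one shift -- no multiplies, no floating point.
    return (r + r + r + g + g + g + g + b) >> 3

print(fast_luma(255, 255, 255))  # 255 -- white is still full scale
print(fast_luma(0, 255, 0))      # 127 (vs 150 from the more accurate weights)
```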
You probably didn't want to know half of that. I need a life.