On Wed, 3 Dec 2003, Charlie Younghusband wrote:
> basic.rptm.hist:
> # bin min max count % acc%
> 817 1216 1391 8644 1.28 1.28
> 861 1392 1399 5795 0.86 2.15
>
> The min, max, count and accuracy are easy to understand on their own, for
> example from these two rows I can say that 2.15% of the transactions took less
> than 1400ms. I don't understand why the number of bins varies, the number
> used for the bin, and why the count percentage varies so much from bin to
> bin. The college stats books I have don't cover it. :(
The number of bins used depends on the range that needs to be covered by
the histogram. When there are holes in the bin range, it is because no
results had values that would go into the missing bins.
A histogram for the same value range always has the same bins, but the
distribution within the bins varies with the test data.
> What I would really like to do is to graph the histogram itself for
> visualization. I tried fooling around with the numbers in Excel a while ago
> to get it to represent it properly and failed.
I think it is best plotted as a diagram with the value, or the log of the
value (*), on the X scale and, on the Y scale, the average count per value
covered by the bin: (count or % +- accuracy) / (max - min). Then fill in
one block per bin, covering the area from (min, 0) to
(max, count / (max - min)). This should give you an approximation of the
count per value.
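As a rough sketch of the block calculation described above, assuming Python
and using only the two rows quoted from basic.rptm.hist: the function name
bin_blocks and the handling of a zero-width bin are assumptions made for
illustration, not part of any tool.

```python
# Rows copied from the quoted basic.rptm.hist excerpt: (bin, min, max, count).
rows = [
    (817, 1216, 1391, 8644),
    (861, 1392, 1399, 5795),
]

def bin_blocks(rows):
    """Return one (x_left, x_right, height) block per histogram bin.

    The height is count / (max - min), i.e. the average count per value
    covered by the bin, so a block spans (min, 0) to (max, height).
    """
    blocks = []
    for _bin, lo, hi, count in rows:
        width = hi - lo
        if width <= 0:
            # Assumption: treat a bin covering a single value as width 1
            # to avoid dividing by zero.
            width = 1
        blocks.append((lo, hi, count / width))
    return blocks

for left, right, height in bin_blocks(rows):
    print(f"{left}..{right} ms: {height:.1f} transactions per ms")
# prints:
# 1216..1391 ms: 49.4 transactions per ms
# 1392..1399 ms: 827.9 transactions per ms
```

Each tuple can be handed to whatever plotting tool you use as a rectangle
from (x_left, 0) to (x_right, height).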
*) The X scale to use depends on the distribution of the value type and
should match the distribution scale used for the bins when the histogram
was collected. If you want to use another X scale you may need to first
rescale the histogram to plot it correctly.
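One possible reading of that rescaling, sketched in Python under my own
assumptions: measure each bin's width on the X scale you actually plot, so
that the area of a block still equals the bin's count. The function name
blocks_on_scale and the choice of log10 are hypothetical, and this may not
match how the bins were chosen when the histogram was collected.

```python
import math

def blocks_on_scale(rows, scale=math.log10):
    """Return (x_left, x_right, height) per bin, measured on the given X scale.

    rows holds (min, max, count) tuples. The height is the count divided by
    the bin width on the transformed axis, so width * height == count.
    Assumes min < max and that scale is defined and increasing on the range.
    """
    blocks = []
    for lo, hi, count in rows:
        left, right = scale(lo), scale(hi)
        blocks.append((left, right, count / (right - left)))
    return blocks

# (min, max, count) taken from the two quoted histogram rows.
rows = [(1216, 1391, 8644), (1392, 1399, 5795)]
for left, right, height in blocks_on_scale(rows):
    print(f"{left:.4f}..{right:.4f}: {height:.0f} per log10 unit")
```

Passing scale=float gives the plain linear case, count / (max - min).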
Regards
Henrik