Re: Zipf vs. Uniform

From: Alex Rousskov (rousskov@ircache.net)
Date: Tue Nov 30 1999 - 20:30:43 MST


On Wed, 1 Dec 1999, Lincoln Dale wrote:

> unfortunately, we don't believe that the 'unif' object distribution model
> is realistic.
> in fact, Alex and Duane have said as much .. that it serves the purpose of
> 'preventing artificially high memory hit ratios'.
>
> all we're pointing out is that it _isn't_ realistic - and giving some
> pointers to what we believe _are_ realistic.

(a) We are not saying that the Uniform model is realistic. It is not.
    When choosing between artificially high and artificially low
    hit ratios, we chose the latter. Polygraph now allows for a
    trade-off, and that compromise is what we are trying to discuss.

(b) Unfortunately, so far there have been no pointers that would help us
    determine the right value for a more realistic Zipf distribution.
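
To make the trade-off in (a) concrete, here is a rough sketch (not
Polygraph code) that compares the hit ratio of a small LRU cache under
uniform and Zipf-like popularity. Everything in it is an assumption made
for illustration: the function names, the LRU policy, the object and
cache sizes, and the exponent 0.8. It is a toy model, not a statement
about what the right Zipf value is.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def zipf_cum_weights(n_objects, alpha):
    """Cumulative weights where rank r has weight proportional to 1 / r**alpha."""
    return list(accumulate(1.0 / (r ** alpha) for r in range(1, n_objects + 1)))


def lru_hit_ratio(requests, cache_size):
    """Replay a request stream through an LRU cache and return hits / requests."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for obj in requests:
        total += 1
        if obj in cache:
            hits += 1
            cache.move_to_end(obj)          # mark as most recently used
        else:
            cache[obj] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)   # evict least recently used
    return hits / total if total else 0.0


def simulate(n_objects=10_000, n_requests=100_000, cache_size=500,
             alpha=0.8, seed=1):
    """Return (uniform_hit_ratio, zipf_hit_ratio) for hypothetical parameters."""
    rng = random.Random(seed)
    population = range(n_objects)
    uniform = rng.choices(population, k=n_requests)
    zipf = rng.choices(population,
                       cum_weights=zipf_cum_weights(n_objects, alpha),
                       k=n_requests)
    return lru_hit_ratio(uniform, cache_size), lru_hit_ratio(zipf, cache_size)


if __name__ == "__main__":
    uniform_hr, zipf_hr = simulate()
    print(f"uniform: {uniform_hr:.3f}  zipf(alpha=0.8): {zipf_hr:.3f}")
```

Varying alpha between 0 (which is the uniform case) and larger values
shows how the hit ratio moves between the two extremes discussed here.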

> indeed, yes. we've spent considerable effort researching what real object
> distribution patterns are.

Then you should have no trouble identifying the Zipf value, validating and
possibly adjusting it against your products _under Polygraph workload_, and
then sharing the final number with the rest of us.

Our group has the luxury of being able to make our research public. The Zipf
parameter derived from NLANR logs has been available (for the last few years) at
        http://www.ircache.net/Cache/Statistics/Popularity-Index/
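
For readers who want to derive a comparable number from their own logs,
a Zipf exponent is commonly estimated by fitting a straight line to
log(request count) versus log(popularity rank). The sketch below assumes
the log has already been reduced to a list of requested object names;
the function name is hypothetical, and this is not necessarily the
method used to produce the NLANR figures.

```python
import math
from collections import Counter


def estimate_zipf_alpha(requests):
    """Least-squares slope of log(count) vs. log(rank), negated.

    `requests` is an iterable of object identifiers (e.g. URLs).
    Returns the estimated exponent alpha in count ~ 1 / rank**alpha.
    """
    counts = sorted(Counter(requests).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct objects")

    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # Zipf-like data has a negative slope; report alpha as a positive number.
    return -sxy / sxx
```

A plain least-squares fit like this is sensitive to the long tail of
objects requested only once, so treat the result as a rough indication.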

Unfortunately, we cannot use our logs in this case because the resulting
memory hit ratio, without adjustments, is wrong (too high). Our logs are
indeed skewed, for many reasons irrelevant to this discussion.

Thanks,

Alex.

P.S. I certainly do not think Cisco is trying to fudge the workload
     to help their product. We all have the same primary goal here --
     realism in simulation.


