Re: Zipf vs. Uniform

From: Alex Rousskov (rousskov@ircache.net)
Date: Mon Nov 29 1999 - 23:59:13 MST


On Mon, 29 Nov 1999, Pei Cao wrote:

> I actually think that the 4hr worth of working set is not small. Assume
> the throughput is 1000 req/second, each object is 13KB. Then 4 hr worth
> of traffic under 80% cacheable, 55% hit ratio is about 48GB. It won't
> wrap around the disks but it surely will wrap around RAM many many times.
> So even with 0.6 alpha value I don't think you will get a memory hit ratio
> that is too high.

You are right that in this example, RAM cache would wrap a few times, but I
want to clarify that "small" is relative to the size of a cache that a
customer would buy given the request rate. For your example, a customer
would probably demand a 100-150GB cache (to cache at least a few days of
traffic).
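
For reference, the 48GB figure quoted above can be reproduced with a few
lines of arithmetic. This is only a sketch: it assumes that the cache fills
at the rate of cacheable misses (80% cacheable minus 55% hits, or 25% of all
requests) and that 1KB is 1024 bytes. Neither assumption is stated in the
original message, and the function name is made up for illustration.

```python
def fill_bytes(req_per_sec, object_bytes, hours, cacheable, hit_ratio):
    """Bytes of new data entering the cache over `hours` of traffic.

    Assumption (not from the thread): only cacheable misses add new
    data, so the fill fraction is (cacheable - hit_ratio).
    """
    seconds = hours * 3600
    return req_per_sec * object_bytes * seconds * (cacheable - hit_ratio)


# Numbers from the quoted example: 1000 req/sec, 13KB objects, 4 hours,
# 80% cacheable, 55% hit ratio.
working_set = fill_bytes(1000, 13 * 1024, 4, 0.80, 0.55)
print(f"4hr working set: {working_set / 1e9:.1f} GB")
```

With these inputs the script prints about 47.9 GB, which is consistent with
the 48GB estimate.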

So far, no vendor has brought an appropriately configured cache to the
tests. As you know, for this bake-off we are going to publish "cache
capacity in hours given the advertised request rate" figures to encourage
vendors to bring more reasonable configurations.

> If 0.6 leads to a memory hit ratio that is too high, we can always lower
> it a bit.

Sure. To reduce the number of painful iterations we would prefer to have
vendors tell us what settings they would be comfortable with. Then we can
see if there is any consensus and decide on the final value. And thanks
for encouraging others to provide their performance data!
 
> Actually, if a vendor indeed has more RAM, then their throughput would be
> higher, which means that the working set size of 4hrs will be larger, which
> in turn reduces memory hit ratio in the RAM. This is true for both uniform and
> Zipf-distributions. In addition, more RAM costs more and this will be
> reflected in the price of the box. So it is not so easy for a vendor to
> "cheat" by simply adding more RAM.

I will not bore you with a long discussion here. Suffice it to say that I
have repeated the actual argument that several vendors have used in
conversations with us...
 
> I strongly feel that uniform doesn't reflect real world experience of caches.
> I can only supply anecdotal data here, but I have seen caches with 100MB of
> RAM giving a memory hit ratio of 2-3% after serving about 1.3 million
> requests and caching 7GB worth of new data.

OK. So we are probably after a 4-6% memory hit ratio for a "typical" 1-2GB
RAM cache (in Zipf workloads, the memory hit ratio probably follows a log
law, so adding extra GBs gives relatively little after the first few
percent). We can certainly estimate Zipf parameters based on that
assumption if vendors do not supply better data.
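
To illustrate the diminishing returns mentioned above, here is a rough
sketch under strong assumptions: an idealized cache that always holds the k
most popular of N equal-sized objects, with request popularity following a
pure Zipf law (the i-th most popular object is requested with probability
proportional to 1/i^alpha). Real workloads and real replacement policies
differ, and N, alpha, and the cache sizes below are arbitrary, so only the
shape of the curve is of interest, not the absolute numbers.

```python
def zipf_hit_ratio(k, n, alpha):
    """Fraction of requests that hit the k most popular of n objects."""
    top = sum(i ** -alpha for i in range(1, k + 1))
    total = sum(i ** -alpha for i in range(1, n + 1))
    return top / total


N = 100_000   # hypothetical number of distinct objects
ALPHA = 1.0   # hypothetical Zipf exponent; try 0.6 as discussed in the thread
sizes = [1000, 2000, 4000, 8000]  # hypothetical cache sizes, in objects

ratios = [zipf_hit_ratio(k, N, ALPHA) for k in sizes]
for k, r in zip(sizes, ratios):
    print(f"cache of {k:5d} objects: hit ratio {r:.3f}")
```

For alpha = 1 the hit ratio grows roughly like log(k), so each doubling of
the cache adds about the same absolute gain and the gain per added object
keeps shrinking.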

Thanks,

Alex.


