Hi,
I have a question about a characteristic of the polymix-4 workload. The
fraction of the requested bytes that is due to large files seems low
compared to previously published measurements. I placed a graph comparing
the Polygraph-generated workload to the earlier measurements at
http://www-2.cs.cmu.edu/~mukesh/loadfrac.ps.gz
The data sets in the graph are:
Calgary
    One year's requests for the CS department webserver at the University of Calgary.
    From Arlitt and Williamson -- Sigmetrics 96
Clarknet
    Two weeks' requests for a Washington, DC ISP's web server.
    From Arlitt and Williamson -- Sigmetrics 96
WorldCup (busy)
    A couple of hours' worth of requests from the busiest day of the
    WorldCup '98 website.
WorldCup (last day)
    The entire day's requests for the last day of WorldCup '98.
Berkeley HomeIP
    The four-hour trace from Berkeley's HomeIP study.
Polygraph
    The workload from a Polygraph run.
The only trace that has a smaller fraction of its bytes due to files >100K
is the busy trace from the WorldCup site. Even the HomeIP trace, which I
would expect to be skewed towards small files (as the users are connected
via modems), has a larger fraction of its load from large files (files >100K
comprise ~23% of the load in HomeIP, versus 10% for the Polygraph run).
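
To make the metric concrete, here is a minimal sketch of how the fraction of
load due to large files can be computed from a list of response sizes. The
function name, the 100 * 1024 byte reading of "100K", and the sample sizes are
all assumptions made for illustration; they are not taken from Polygraph or
from any of the traces above.

```python
def large_file_byte_fraction(sizes, threshold=100 * 1024):
    """Return the fraction of all bytes carried by responses larger than threshold.

    sizes     -- iterable of response sizes in bytes, one entry per request
    threshold -- size cutoff in bytes (assumed here: 100K = 100 * 1024)
    """
    sizes = list(sizes)
    total = sum(sizes)
    if total == 0:
        # No bytes transferred at all, so no load is due to large files.
        return 0.0
    large = sum(s for s in sizes if s > threshold)
    return large / total


# Made-up sizes, for illustration only (not drawn from any trace).
example_sizes = [2_000, 8_000, 50_000, 150_000, 400_000]
print(large_file_byte_fraction(example_sizes))
```

Note that each request contributes its size once, so a large file requested
many times counts many times. That is what makes this a fraction of the load
rather than a fraction of the distinct files.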
So my question is: is the stock polymix-4 workload intended to accurately
model the fraction of the load due to large objects? If so, is the distribution
I'm seeing consistent with what is intended?
Thanks,
mukesh
-- public key: finger mukesh@cs.cmu.edu fingerprint: BDAB AB7A ADFB 9229 1BD8 45FD BE21 850C E36C D4AA