Re: object life cycles

From: Alex Rousskov (rousskov@measurement-factory.com)
Date: Fri Apr 19 2002 - 09:16:53 MDT


Mukesh,

        I think you are correct that most PolyMix-4 objects are not
modified during a run. It would be interesting to know what the actual
percent of modifications is, but I cannot think of a trivial way to
measure that without logging URLs of modified objects whenever they
are requested.

        However, I am not sure I can agree with your "minimum cycle"
analysis. It seems to me that since the object birthday is selected at
random, even objects with a 2-year cycle (olcStatic) can be modified
during a short test:

        For simplicity, consider zero variance, a one-year cycle, and
        a 24-hour test. A given object will be modified during the
        test with probability 1/365. The probability of two
        modifications happening to the same object is zero, because
        the cycle is much longer than the test.

Are you with me?
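
The zero-variance example above can be checked numerically. The sketch
below is not Polygraph code; it assumes only that the time from the
start of the test to the object's next modification is uniformly
distributed over one cycle, which is what a randomly selected birthday
implies. The function names are made up for illustration.

```python
import random


def modification_probability(cycle_days: float, test_days: float) -> float:
    """Chance that a cycle boundary falls inside the test window.

    Assumes zero variance in cycle length and a phase (birthday)
    chosen uniformly at random over one cycle.
    """
    return min(1.0, test_days / cycle_days)


def simulate(cycle_days: float, test_days: float,
             trials: int = 200_000, seed: int = 1) -> float:
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Time from the start of the test until the next modification.
        next_modification = rng.uniform(0.0, cycle_days)
        if next_modification < test_days:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # One-year cycle, 24-hour test (the example in the text).
    print(modification_probability(365.0, 1.0), simulate(365.0, 1.0))
    # Two-year cycle, 12-hour test.
    print(modification_probability(730.0, 0.5), simulate(730.0, 0.5))
```

The point is that the fraction of modified objects is small but not
zero, even when the cycle is far longer than the run.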

        I think that PolyMix content type-based life cycle settings
are reasonable. Ideally, we need to add a few "hot" servers with
content that does not follow "normal" or "average" life cycle
patterns. Call them CNN servers if you wish. Once those servers are
introduced into the mix, they should receive significantly more
traffic than an average server and have frequent content
modifications, at least for index.html-like objects (images on cnn.com
do not change often if at all, I guess).

        Both features (skewed origin server access pattern and
frequent content modification) are already supported. You can improve
your PolyMix workload by using them. Hopefully, we can add hot servers
to PolyMix-5. The difficulty is in automatically spreading the hot
server agents among available simulation hosts (PCs) so that no single
host gets overloaded.
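
To make the spreading problem concrete, here is a minimal sketch of one
possible approach: a greedy "heaviest first, least-loaded host"
assignment. It is not how Polygraph assigns agents; the function name,
the agent names, and the idea of a single numeric load per agent are all
assumptions made for illustration.

```python
import heapq


def spread_agents(agent_loads: dict[str, float], host_count: int) -> dict[str, int]:
    """Assign each agent to a host index, heaviest agents first.

    Each agent goes to whichever host currently has the smallest
    total load, so hot agents end up on different hosts when possible.
    """
    # Min-heap of (total load so far, host index).
    heap = [(0.0, host) for host in range(host_count)]
    heapq.heapify(heap)

    assignment: dict[str, int] = {}
    by_load = sorted(agent_loads.items(), key=lambda item: -item[1])
    for agent, load in by_load:
        total, host = heapq.heappop(heap)
        assignment[agent] = host
        heapq.heappush(heap, (total + load, host))
    return assignment


if __name__ == "__main__":
    # Two hypothetical hot servers and four average ones on two hosts.
    loads = {"hot1": 10, "hot2": 10, "s1": 1, "s2": 1, "s3": 1, "s4": 1}
    print(spread_agents(loads, host_count=2))
```

A greedy assignment like this does not guarantee an optimal balance,
and it needs a load estimate for every agent before the run starts,
which is part of what makes doing this automatically difficult.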

        The out-of-sync problems you pointed out are now fixed. Please
let me know if I missed anything.

Thank you,

Alex.

On Thu, 18 Apr 2002, mukesh agrawal wrote:

> I have a question on the object life cycles defined in the PolyMix
> 4 workload. I'm trying to understand how many objects are expected
> to be modified during the normal run of the benchmark (which is
> ~12 hours, if I remember correctly).
>
> Clearly, none of the olcStatic objects will change. For olcOther,
> we expect very few to change (1/365 will have a cycle time of 1
> day. As the variance is 50%, a small fraction may change within 12
> hours.)
>
> The content type that changes most frequently is olcHTML. This has
> a lognormal distribution with mean 7 days and sd 1 day. To
> understand what that meant, I followed the polygraph code's method
> of computing aMu and aSigmaSq from mean and sd. I plugged aMu and
> sqrt(aSigmaSq) into Matlab's lognpdf function. From this, it seems
> that the minimum cycle time for logn(7 days, 1 day) is 4 days.
> Factoring in the 33% variation, I figure that the minimum time for
> an HTML object to change is ~2.7 days.
>
> Is this correct? If so, does it mean that the polymix-4 workload
> effectively doesn't include news sites (like the CNN example in
> the object life cycle documentation) in its workload model?
>
> Small nit: the object life cycle documentation is out of sync with
> the code at the moment. (It talks about the birthday field, which
> is no longer used in polygraph 2.7.5.) The affected pages are
>
> http://www.web-polygraph.org/docs/reference/models/objlife.html#modification
> http://www.web-polygraph.org/docs/reference/pgl/types.html#type:docs/reference/pgl/types/distr
>
> Thanks!
>
> --
> public key: finger mukesh@cs.cmu.edu
> fingerprint: BDAB AB7A ADFB 9229 1BD8 45FD BE21 850C E36C D4AA
>
>
>
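
The lognormal calculation in the quoted message can be reproduced
without Matlab. The sketch below assumes the standard moment-matching
conversion from a mean and standard deviation to the parameters of the
underlying normal distribution; whether Polygraph computes aMu and
aSigmaSq in exactly this way is an assumption. The function names are
made up for illustration.

```python
import math
from statistics import NormalDist


def lognormal_params(mean: float, sd: float) -> tuple[float, float]:
    """Return (mu, sigma_sq) of the underlying normal distribution
    for a lognormal with the given mean and standard deviation."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, sigma_sq


def lognormal_cdf(x: float, mu: float, sigma_sq: float) -> float:
    """P(X <= x) for a lognormal X with parameters mu and sigma_sq."""
    return NormalDist(mu, math.sqrt(sigma_sq)).cdf(math.log(x))


# Cycle length in days: mean 7, standard deviation 1.
mu, sigma_sq = lognormal_params(7.0, 1.0)
p_below_4 = lognormal_cdf(4.0, mu, sigma_sq)

if __name__ == "__main__":
    print("mu =", mu, "sigma_sq =", sigma_sq)
    print("P(cycle < 4 days) =", p_below_4)
```

A lognormal distribution has no strict minimum, so a value read off a
density plot is better described as the point below which cycle
lengths become very unlikely than as a hard lower bound.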


