Re: several messages

From: Henrik Nordström (hno@marasystems.com)
Date: Fri Nov 16 2001 - 11:02:54 MST


On Friday 16 November 2001 17.45, Alex Rousskov wrote:

> Not that unrealistic, I think. The "fallen out" URLs could be pages
> with last month's news or bankrupt dot.com sites. Note that URL
> popularity within the working set is not constant and is controlled by
> the hot set setting of the popularity model.

Except that we are talking about a surrogate here, not a proxy. The pages
usually stay; only the popularity may drop. But this very much depends on the
type of site. Some sites have a sliding time window for their content, other
sites are more static in their organisation.

> A better approach would be to remove individual random URLs or chunks
> of URLs from the working set instead of sliding WS window. This
> approach would require too much RAM to store WS info in its current
> encoding though.

Maybe... but I don't think "random" removal of content is that usual. In my
experience the content is either a sliding window (periodical, only the N
latest available) or constantly increasing, where older content rapidly drops
in popularity after its "freshness/archival date" but is still there and
sometimes requested. But of course, when content is archived it is often
moved around, so random deletes make sense after all.

Then there is site restructuring, completely disturbing everything, but I am
not sure there is any point in simulating such events as part of a normal
workload. If needed, such events can be simulated by introducing a "site is
being restructured" phase that totally mucks up the WS for a while.

Thinking about it, a good approximation of both content models would probably
be a slowly sliding but increasing WS, with a constant size of "hot" objects.
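
A rough sketch of what such a model could look like, for illustration only.
Everything here is an assumption and not how any benchmark actually encodes
its working set: the class name, the parameters, and in particular the choice
to treat the newest objects as the "hot" ones. Objects are plain integer ids,
the WS is the id range [lo, hi), and each step adds more ids at the top than
it drops at the bottom, so the WS slides slowly while still growing.

```python
import random


class SlidingGrowingWorkingSet:
    """Hypothetical model: WS = ids in [lo, hi), hot set = newest hot_size ids."""

    def __init__(self, initial_size=1000, hot_size=100,
                 add_per_step=10, drop_per_step=4,
                 hot_fraction=0.8, seed=0):
        # drop_per_step < add_per_step makes the WS slide and grow at once.
        assert 0 <= drop_per_step < add_per_step
        assert 0 < hot_size <= initial_size
        self.lo = 0
        self.hi = initial_size
        self.hot_size = hot_size
        self.add_per_step = add_per_step
        self.drop_per_step = drop_per_step
        self.hot_fraction = hot_fraction
        self.rng = random.Random(seed)

    def step(self):
        """Advance time: new content appears, the oldest content falls out."""
        self.hi += self.add_per_step
        self.lo += self.drop_per_step

    def size(self):
        return self.hi - self.lo

    def hot_range(self):
        """Constant-size hot set (assumed here to be the newest objects)."""
        return max(self.lo, self.hi - self.hot_size), self.hi

    def next_request(self):
        """Pick an object id: mostly from the hot set, otherwise anywhere in the WS."""
        if self.rng.random() < self.hot_fraction:
            lo, hi = self.hot_range()
        else:
            lo, hi = self.lo, self.hi
        return self.rng.randrange(lo, hi)
```

With these made-up defaults the WS gains a net 6 objects per step, while the
hot set stays at 100 objects and simply moves along with the newest content.
Older objects remain requestable, only much less often.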

Anyway, the slowly sliding WS is most likely quite a good approximation, and
the measured performance is unlikely to differ more than marginally with any
of the other working set approximations.

Regards
Henrik Nordström


