Re: polymix-2 successful test - how to get the info?

From: Alex Rousskov (rousskov@ircache.net)
Date: Sun Apr 23 2000 - 09:31:18 MDT


On Sun, 23 Apr 2000, Serge Ayoun wrote:

> How do you decide the experiment was a successful one?

1. Check whether the test has finished. That is, check that _all_
   clients and servers exited after completing their goals and not
   earlier.

2. Generate a Polygraph report based on the top2 phase. If traces in the
   report imply that other phases may have misbehaved, generate a
   report for those phases or just manually extract interesting
   parameters (e.g., error rate) for those phases.

        http://polygraph.ircache.net/doc/reportgen.html
        http://polygraph.ircache.net/UserManual/reportgen.html
        http://polygraph.ircache.net/doc/lxopts.html#--phase

3. Check all "possible problems" listed in the auto-generated report.

4. Check that trace plots look reasonable, and that all anomalies can be
   explained.

5. Compare the report with the results of other, known-to-be-successful,
   runs.

> 1. The log file reports the average reply and request rates and not the
> rates at the peak phases.

Report Generator and lx take care of this. Using the log_extractor_opts
(ReportGen) and --phase (lx) options, one can specify that report tables
should be based on a given phase.

> 2. Imagine the case where the request rate is supposed to be 400 req/s (1000
> robots) but somehow the reply and request rates during the top phases stay
> at 394 req/s. Obviously the experiment failed (am I wrong?). How do you get
> that info from the log file? By the way, the number of errors in this case
> can still be close to zero (see my next point).

The report generator will show whether you have an excessive number of
waiting transactions. If response time gets too high, the request rate may
be lower than desired due to the per-robot limit on the number of open
connections. You may recall the discussions about that on this list.

> 3. Each robot is supposed to send 0.4 req/s. However, the maximum number of
> connections per robot is limited to 4 (polymix-2 script), so 1000 robots
> do not necessarily produce a 400 req/s rate - this will happen when
> each robot has 4 open connections active all the time and still has not
> reached the 0.4 req/s rate.

Right. If the proxy response time is reasonable, the limit is sufficiently
high. If the proxy falls behind (or there is some other bottleneck in your
setup), the limit will prevent you from reaching the configured request
rate. Usually, it is clear from the report that something went wrong.
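
A rough way to see when the connection limit starts to bite is the sketch
below. It is a simplified model, not how Polygraph itself computes anything:
it assumes every connection carries one request at a time and applies
Little's law, so a robot with N open connections cannot exceed
N / response_time requests per second. It ignores embedded objects and any
other source of waiting. The function names are made up for illustration;
the numbers (1000 robots, 0.4 req/s each, 4 connections) are the ones
quoted above.

```python
# Simplified model (an assumption, not Polygraph's own logic): with at most
# max_conns requests in flight per robot, Little's law caps per-robot
# throughput at max_conns / response_time.

def max_request_rate(robots, max_conns, response_time_s):
    """Upper bound on aggregate req/s imposed by the connection limit."""
    return robots * max_conns / response_time_s


def achievable_rate(robots, per_robot_rate, max_conns, response_time_s):
    """Configured aggregate rate, capped by the connection-limit bound."""
    configured = robots * per_robot_rate
    return min(configured, max_request_rate(robots, max_conns, response_time_s))


def critical_response_time(per_robot_rate, max_conns):
    """Mean response time (seconds) above which the limit reduces the rate."""
    return max_conns / per_robot_rate


if __name__ == "__main__":
    # Hypothetical mean response times, in seconds.
    for rt in (2.0, 10.0, 12.5):
        print(rt, achievable_rate(1000, 0.4, 4, rt))
    print("critical response time:", critical_response_time(0.4, 4))
```

Under this model the limit of 4 only matters once the mean response time
exceeds about 10 seconds, which is why it is "sufficiently high" for a
proxy that keeps up.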

There will always be some transactions in a "wait" state due to embedded
objects. Successful runs will have a wait queue short enough that
the request rate is not affected.

> 4. Do you consider an experiment as a successful one even if the peak rate
> is not reached during phase top1 and 2?

No.
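
For what it is worth, here is a minimal sketch of such a check using the
394 vs. 400 req/s example from point 2. The 1% tolerance is purely a
hypothetical threshold picked for illustration; nothing in Polygraph or in
the bake-off rules is implied by it, and the function names are made up.

```python
def rate_shortfall(configured, measured):
    """Relative shortfall of the measured rate against the configured one."""
    return (configured - measured) / configured


def peak_rate_reached(configured, measured, tolerance=0.01):
    """True if the measured rate is within `tolerance` of the configured rate.

    The default tolerance is an arbitrary illustrative value.
    """
    return rate_shortfall(configured, measured) <= tolerance


if __name__ == "__main__":
    print(rate_shortfall(400, 394))     # relative shortfall for the example
    print(peak_rate_reached(400, 394))  # with the hypothetical 1% tolerance
```

The measured rate fed into such a check should come from the top phases,
not from the whole-run average.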

> 5. The request rates reported in the bakeoff2 document, are they average
> over the 14 hours or rather phase top1 and 2 peak rates?

The baseline presentation and the auto-generated reports use the top2 phase.

Alex.


