On Wed, 12 Jul 2000 brianok@us.ibm.com wrote:
> Since the PolyMix-2 test is a total of 14 hours, we were assuming that the
> various phases were pretty exact in their durations, which would easily
> allow two client machines to each throw 400 req/sec at a proxy at the exact
> same times ("top1" and "top2").
Polygraph is not able to synchronize phases (yet), but PolyMix workloads
should not need precise synchronization unless I am missing something
important.
> What we're seeing is that the "warm" phase takes longer on one
> client machine than on another client machine, such that the "inc1"
> and subsequent phases on one client machine are not synchronized
> with the corresponding phases on another client machine.
Yes, this is possible for both PolyMix-2 and PolyMix-3. How large a
difference in phase start times do you observe? I hope it is "small
enough".
> We thought the "warm" phase was a set amount of time, but it appears
> to be the time that it takes to start up all the servers. Is this
> true?
It is the maximum of the two: the configured phase duration and the time
it takes for all servers to become ready. The warm-up phase will not
finish until all servers are ready to be hit (see below).
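To illustrate the rule, here is a minimal Python sketch of how two
clients can drift apart. This is not Polygraph code: the function name,
its parameters, and the numbers are all made up for the example.

```python
def warm_phase_duration(configured_secs, secs_until_all_servers_ready):
    """Hypothetical model: the warm phase lasts for the configured time
    or until every server is ready to be hit, whichever is longer."""
    return max(configured_secs, secs_until_all_servers_ready)

# Two clients share the same configured warm phase (illustrative: 600 s)
# but need different amounts of time before all servers are ready.
client_a = warm_phase_duration(600, 450)   # readiness reached early
client_b = warm_phase_duration(600, 720)   # readiness reached late

# Later phases on client_b start this many seconds after client_a's.
skew = client_b - client_a
```

In this model, any client whose readiness scan outlasts the configured
duration starts "inc1" and all subsequent phases late by the difference.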
> We don't appear to be having basic network problems, because the
> rest of the testing seems to be running well. When the "warm" phase
> is waiting for all the servers to come up, what is it really doing?
Each polyclt process waits for all servers to enter a "ready to be hit"
state. Polyclt marks a server agent (not a polysrv process!) as ready
when all local robot agents can repeat a request to that server. This
ensures that each robot can generate a "hit" request when asked to.
The details of this process are boring and complicated, so I will not
describe them unless you must know everything. The complexity is due to
the public (shared by all robots in the test) and private (known to a
single robot) URL spaces that Polygraph maintains. In short, robots
submit requests until each knows enough about each server to generate a
hit on that server...
The servers are not doing anything special during this phase, just
answering robot requests.
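The readiness rule can be sketched roughly as follows. This is a
simplified Python model, not Polygraph's implementation: it ignores the
public/private URL space details, and all names here are hypothetical.

```python
def ready_servers(servers, known_by_robot):
    """Return the server agents that every local robot can already hit.

    known_by_robot maps a robot name to the set of servers for which
    that robot knows enough to repeat a request (an assumed model).
    """
    return {s for s in servers
            if all(s in known for known in known_by_robot.values())}

def warmup_done(servers, known_by_robot):
    """The warm-up scan ends only when every server agent is ready."""
    return ready_servers(servers, known_by_robot) == set(servers)

servers = ["srv1", "srv2"]
known_by_robot = {"robot1": {"srv1", "srv2"}, "robot2": {"srv1"}}

before = warmup_done(servers, known_by_robot)   # robot2 cannot hit srv2 yet
known_by_robot["robot2"].add("srv2")            # robot2 gets a reply from srv2
after = warmup_done(servers, known_by_robot)
```

A single robot that has not yet learned about a single server is enough
to keep the whole polyclt process in the warm phase.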
> We see the servers starting up slower and slower as the number
> increases. Perhaps this is due to the load on the server machines,
> but I thought that using 600 servers on each server machine was well
> within the capacity of a 400 MHz machine.
The server agents are not really starting up. All servers must be up and
running before you start polyclt processes (or Robots).
The warm-up progress indicator on the client side may indeed show
non-constant speed. We have made some improvements in 2.5.2 to speed up
the whole process though. The progress indicator has been polished as
well.
For S server agents, it should now take not much more than 2*S requests
to finish the scan. It used to be 8*S.
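As a rough worked example in Python, take S = 600, the per-machine
server count mentioned in the question. The helper name is hypothetical,
and the results are approximate request counts, not measured values.

```python
def scan_requests(num_server_agents, requests_per_agent):
    """Approximate number of requests needed to finish the warm-up scan."""
    return num_server_agents * requests_per_agent

S = 600                               # server agents, as in the question
new_estimate = scan_requests(S, 2)    # about 2*S after the 2.5.2 changes
old_estimate = scan_requests(S, 8)    # about 8*S before them
speedup = old_estimate / new_estimate
```

For 600 server agents this comes to roughly 1200 requests instead of
4800, about a fourfold reduction in scan length.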
Alex.
This archive was generated by hypermail 2b29 : Tue Jul 10 2001 - 12:00:14 MDT