On Thu, 27 Jul 2000 Craig_Lowery@Dell.com wrote:
> We had the same problem, but it came about when we ran the EXACT same
> experiment over again, the only difference being that we replaced a 100Mbit/s
> connection with a 1000Mbit/s connection.
That could be a big difference! TCP and NIC performance can affect
bench behavior a lot.
BTW, the original problem that Bjorn reported was indeed due to an
overload condition and waiting queues growing out of bounds (double
checked with Bjorn off-list).
Alex.
> -----Original Message-----
> From: Alex Rousskov [mailto:rousskov@ircache.net]
> Sent: Thursday, July 27, 2000 12:02 PM
> To: Bjorn Townsend
> Cc: 'polygraph@ircache.net'
> Subject: Re: virtual memory exceeded in 'new'
>
>
> On Thu, 27 Jul 2000, Bjorn Townsend wrote:
>
> > We were running an 800req/sec test on one of our caches last night.
> > Everything seemed to be going very well -- there were hardly any
> > errors, and the cache was sustaining the 800req/second quite
> > nicely... or so it seemed.
> >
> > In top1, without warning, the client machines gave a "virtual memory
> > exceeded in 'new'" error and terminated. As far as I can tell, the
> > cache itself rebooted shortly thereafter. The clients did not dump
> > core.
> >
> > Has anyone else seen this error? What does it mean?
>
> Yes. It probably means that you configured Polygraph robots to generate
> more transactions per second than they could complete, given the (also
> configured) limit on the number of open connections per robot.
>
> By default, each PolyMix-2,3 robot has a limit of 4 open connections. If
> you raise the per-robot request rate (0.4/sec by default) without raising
> open_conn_lmt, more and more transactions will get queued (waiting for
> a connection slot to become available). Eventually, Polygraph will
> simply run out of memory.
>
> Report Generator plots the "wait" queue length for successful tests. You
> can get a similar plot manually from an aborted run using ltrace + gnuplot.
> Make sure that the number of waiting transactions is always a lot
> smaller than the number of concurrent active transactions. See bake-off
> reports for "normal" behavior.
>
> BTW, using the "wait_xact_lmt" field, you can configure a robot to have
> finite waiting queues and to warn when they overflow.
> http://polygraph.ircache.net/doc/pgl/types.html#Pgl:Robot
> However, be careful not to create a best-effort workload. The above
> paragraph still applies.
>
> Alex.
>
>
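
To illustrate the queue growth described in the quoted message, here is a
rough sketch in Python. It is not Polygraph code. It is a deterministic
"fluid" model of a single robot, and it assumes that each open connection
carries one transaction at a time, so the robot can complete at most
open_conn_lmt / mean_resp_time transactions per second. The function name
simulate_wait_queue, the mean_resp_time parameter, and the 2-second response
time are assumptions made up for the example; only the 0.4/sec rate, the
limit of 4 connections, and the open_conn_lmt and wait_xact_lmt names come
from the thread.

```python
def simulate_wait_queue(req_rate, open_conn_lmt, mean_resp_time,
                        duration, wait_xact_lmt=None):
    """Fluid model of one robot's waiting queue (illustrative only).

    req_rate        offered transactions per second
    open_conn_lmt   maximum concurrent connections
    mean_resp_time  assumed mean response time in seconds (hypothetical)
    duration        number of one-second steps to simulate
    wait_xact_lmt   optional cap on the waiting queue; None means unbounded

    Returns (history, dropped): the queue length after each second and the
    total number of transactions discarded because the queue was full.
    """
    # Assumption: one transaction per connection at a time, so this is
    # the highest completion rate the robot can sustain.
    capacity = open_conn_lmt / mean_resp_time

    waiting = 0.0
    dropped = 0.0
    history = []
    for _ in range(duration):
        # Arrivals beyond capacity pile up; the queue cannot go negative.
        waiting = max(0.0, waiting + req_rate - capacity)
        if wait_xact_lmt is not None and waiting > wait_xact_lmt:
            # A finite queue discards the excess instead of growing.
            dropped += waiting - wait_xact_lmt
            waiting = float(wait_xact_lmt)
        history.append(waiting)
    return history, dropped


if __name__ == "__main__":
    # Default-like rate: well under capacity, the queue stays empty.
    print(simulate_wait_queue(0.4, 4, 2.0, 100)[0][-1])
    # Raised rate, same connection limit: the queue grows without bound.
    print(simulate_wait_queue(5.0, 4, 2.0, 100)[0][-1])
    # Same overload with a finite queue: bounded memory, dropped requests.
    print(simulate_wait_queue(5.0, 4, 2.0, 100, wait_xact_lmt=50))
```

In the unbounded case the queue grows by (req_rate - capacity) every second,
which is the memory growth that ends in the 'new' failure. With a cap, memory
stays bounded, but the discarded transactions mean the robot no longer offers
the configured load, which is the best-effort workload the quoted message
warns about.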