Re: polyclt crash

From: Alex Rousskov (rousskov@ircache.net)
Date: Wed Apr 12 2000 - 09:23:10 MDT


Dear Sung,

        Your setup generates about 1.8 million errors per 5 seconds with
no successful transactions after the first 82159 replies. Please note
the huge number of errors that Polygraph is reporting on the console.
Essentially, the test is not going anywhere because your OS does not
give Polygraph enough resources to support the specified load.

        It looks like your setup cannot handle the number of robots that
you configured Polygraph with. Look for the maximum number of usable
file descriptors that Polygraph reports when it starts (it is probably
around 1000 on your machine) and then either decrease the number of
robots or increase the number of file descriptors available to Polygraph.
The latter is an OS-specific task:
        http://polygraph.ircache.net/Tips/
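
        As a quick cross-check outside of Polygraph, the per-process
descriptor limit can also be read directly. The following is only a
minimal sketch using Python's standard `resource` module on a Unix-like
system; the function names are hypothetical and it is not part of
Polygraph. The number Polygraph reports at startup remains the
authoritative one.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file-descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit_to_hard():
    """Try to raise the soft limit up to the hard limit; return the new soft limit.

    Only affects the current process and its children. Raising the hard
    limit itself is the OS-specific step and is not attempted here.
    """
    soft, hard = fd_limits()
    if hard == resource.RLIM_INFINITY or soft == hard:
        return soft  # nothing safe or useful to change
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    except (ValueError, OSError):
        return soft  # the OS refused; keep the current limit
    return fd_limits()[0]


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft limit:", soft, "hard limit:", hard)
    print("soft limit after raise attempt:", raise_soft_limit_to_hard())
```

A process started from the same shell as polyclt would normally inherit
the same limits, which is why this kind of check can be informative.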

        Keep in mind that during a fill, each robot is allowed to open
up to 4 connections. I am not sure why you are getting ~1000 open
connections with 200 robots. Perhaps you use more than 200 robots? How
many robots does polyclt start?
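
        The arithmetic behind that question can be sketched as follows.
This is only an illustration: the limit of 4 connections per robot
during a fill comes from the paragraph above, while the function names
and the `reserved_fds` parameter (descriptors set aside for logs and
other non-connection use) are hypothetical.

```python
CONNS_PER_ROBOT_DURING_FILL = 4  # per-robot limit during a fill, as stated above


def max_fill_connections(robots, conns_per_robot=CONNS_PER_ROBOT_DURING_FILL):
    """Upper bound on simultaneously open connections during a fill phase."""
    return robots * conns_per_robot


def max_robots(fd_limit, conns_per_robot=CONNS_PER_ROBOT_DURING_FILL, reserved_fds=0):
    """Largest robot count whose worst-case connections fit under fd_limit.

    reserved_fds is an assumed allowance for descriptors not used by
    robot connections.
    """
    usable = max(fd_limit - reserved_fds, 0)
    return usable // conns_per_robot


if __name__ == "__main__":
    print(max_fill_connections(200))  # 800
    print(max_robots(1000))           # 250
```

With 200 robots the bound is 800 connections, which is why the 979 open
connections in the quoted log suggest that more than 200 robots may be
running.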

HTH,

Alex.

P.S. There is a [known] bug that causes the assertion you quoted
     when Polygraph is subjected to so many errors during a best-effort
     phase. Phase stats are not dumped when Polygraph terminates on
     an assertion.

On Wed, 12 Apr 2000, Sung Lee wrote:

> Has anyone seen the following messages before?
>
> "polyclt" is running on a Sun E-250 server with
> 1 GB RAM, a 400 MHz SparcII CPU, and a 9 GB disk.
> In the webaxe file, I set "size WSS = 1GB;" and 200 robots.
>
> I haven't reached the peak phases.
> After the core dump, I tried to run "label_results",
> but the error message was "no matching phases found".
>
> Thanks in advance for any information.
> ----------------------------------------
>
> 358.17| Connection.cc:172: error: 8191/262366505 (145) Connection timed out
> 358.18| i-fill 82159 3.28 -1 -1.00 1794199 979
> 363.01| i-fill 82159 0.00 -1 -1.00 1818571 979
> 363.01| Connection.cc:112: error: 268435455/268466321 (24) Too many open files
> 367.82| i-fill 82159 0.00 -1 -1.00 1807720 979
> 372.62| i-fill 82159 3.31 -1 -1.00 1802263 979
> 377.37| i-fill 82159 0.00 -1 -1.00 1798801 979
> 382.07| i-fill 82159 0.00 -1 -1.00 1778085 979
> 386.78| i-fill 82159 3.39 -1 -1.00 1761572 979
> 391.45| i-fill 82159 0.00 -1 -1.00 1760802 979
> 396.09| i-fill 82159 0.00 -1 -1.00 1747540 979
> 400.70| i-fill 82159 3.45 -1 -1.00 1738878 979
> 405.32| i-fill 82159 0.00 -1 -1.00 1727446 979
> 409.97| i-fill 82159 0.00 -1 -1.00 1731988 979
> 414.65| i-fill 82159 3.40 -1 -1.00 1741299 979
> 419.30| i-fill 82159 0.00 -1 -1.00 1752831 979
> 423.90| i-fill 82159 0.00 -1 -1.00 1737229 979
> 428.51| i-fill 82159 3.46 -1 -1.00 1726696 979
> 433.19| i-fill 82159 0.00 -1 -1.00 1725460 979
> 437.96| i-fill 82159 0.00 -1 -1.00 1751817 979
> 442.82| i-fill 82159 3.27 -1 -1.00 1783860 979
> AlarmClock.cc:74: assertion failed: 'thePendAlarmCnt >= 0'
> Abort - core dumped
> client #


