I'm running a slightly modified nolb-l4-4 test with 1000 robots per host,
and it looks like it's running out of file descriptors near the magic
1024 mark.

I RTFMed and searched the list archives, but the only thread I saw
didn't directly address this problem.
009.76| i-ramp 72306 244.58 160 0.00 0 954
009.85| i-ramp 73508 240.40 158 0.00 0 956
009.88| Connection.cc:96: error: 1/5 (s24) Too many open files
009.88| Client.cc:265: error: 1/6 (c53) failed to establish a connection
009.88| 10.1.3.62 failed to connect to 10.2.1.1:80
009.88| Connection.cc:96: error: 2/7 (s24) Too many open files
009.88| Client.cc:265: error: 2/8 (c53) failed to establish a connection
009.88| 10.1.3.62 failed to connect to 10.2.1.1:80
009.93| i-ramp 74710 239.93 162 0.00 3 954
My first guess would be that Polygraph is using select() instead of
poll(), but since the default settings use up to 1000 robots per host, I
assume this works for the PolyTeam. My ulimits appear to be set
correctly (see below). I am using Linux 2.4 instead of FreeBSD; maybe
this accounts for the difference.

Is ~1000 open connections a hard Polygraph limit (in which case I should
just use more machines), or is my system misconfigured?
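
A minimal sketch (not Polygraph code) of how the two limits in question could be compared from inside a process: the per-process open-file limit, and the FD_SETSIZE cap that applies to any program built around select(). The value 1024 for FD_SETSIZE is an assumption based on the usual glibc default on Linux; Python does not expose the constant, so it is hard-coded. The helper names open_file_limits and select_is_tighter are made up for illustration. Whether Polygraph 2.7.6 actually uses select() is not established here.

```python
import resource

# Assumption: glibc on Linux defines FD_SETSIZE as 1024. Python does not
# expose this constant, so it is hard-coded for illustration.
ASSUMED_FD_SETSIZE = 1024


def open_file_limits():
    """Return the (soft, hard) RLIMIT_NOFILE values of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def select_is_tighter(soft_limit, fd_setsize=ASSUMED_FD_SETSIZE):
    """Return True if a select()-based program would be capped by FD_SETSIZE
    before it reaches the given soft open-file limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit > fd_setsize


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"RLIMIT_NOFILE soft={soft} hard={hard}")
    print(f"assumed FD_SETSIZE={ASSUMED_FD_SETSIZE}")
    print(f"select() cap is the tighter limit: {select_is_tighter(soft)}")
```

With the "open files 32768" setting reported here, select_is_tighter(32768) is True: if the binary were select()-based, it could not watch descriptors numbered 1024 or above, whatever ulimit says. Note also that ulimit reports the shell's limits; a process only inherits them if it is started from that same shell.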
[root@----- root]# cat /proc/sys/fs/file-max
32768
[root@----- root]# ulimit -a
core file size (blocks) 0
data seg size (kbytes) unlimited
file size (blocks) unlimited
max locked memory (kbytes) unlimited
max memory size (kbytes) unlimited
open files 32768
pipe size (512 bytes) 8
stack size (kbytes) 8192
cpu time (seconds) unlimited
max user processes 2047
virtual memory (kbytes) unlimited
[root@---- root]# uname -a
Linux ----- 2.4.9-e.3 #1 Fri May 3 17:02:43 EDT 2002 i686 unknown
[root@---- root]# /usr/local/polygraph/bin/polyclt --version
Polygraph 2.7.6 Copyright (C) 1998-2002.
--
Wes Felter
System Software Department
IBM Austin Research Lab
11400 Burnet Rd., Austin, TX 78758
Tel 512-838-7933