Hi,
I've been doing numerous Polygraph runs to put our filtering
solution through its proxying paces. Everything works great except
for retrieving data from the client log files. The servers cleanly
detect the end of the run and correctly flush their logs to the
designated file. However, about 95% of the time the clients do NOT
realize that they should stop, and their logs aren't flushed.
Notes from the client side:
- The transition from the second (of three) phases to the
last phase occurs:
"fyi: local phase `phase_one' reached synchronization point"
- The end of the last phase is noticed:
"fyi: local phase `cool' reached synchronization point"
- Then, presumably because the servers have correctly shut
down, all kinds of client errors are generated, and the
clients keep looping somewhere. I have to kill them.
And, as mentioned, the client logs are incomplete; none
of the in-memory log data is written.
I poked around the code a little, trying to see what triggers the
log flush at the end of a normal run. I didn't see any signal
that would allow me to prod a running client process into flushing
its in-memory logs to disk. I also tried some of the other phase
goals (like xactions and fill_size) for the cool phase. No luck
(and it sure seems like duration=1min should work).
Has anyone else had this problem, and is there a recommended testing
method to get all clients and servers to correctly quit when the
test is over?
I've included some details below.
Thanks,
- Danno
=========================================================================
Danno Coppock danno@internetproducts.com
Internet Products, Inc. http://www.internetproducts.com
10350 Science Center Drive, Suite 100 San Diego, CA 92121
(858) 320-4871 Fax: (858) 320-4848
=========================================================================
-----------------------------------------
(I have a server side script that rsh's to the other clients and
servers to start them. Here are the command lines that get run.)
polysrv --verb_lvl 3 --config v2_perf.pg \
    --log logfile --log_size 10MB
polyclt --verb_lvl 3 --config v2_perf.pg \
    --log logfile --log_size 10MB --unique_world 0
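(For illustration only: a minimal sketch of what such a launcher could look like. This is not the actual script. The host names srv1, clt1 and clt2, the DRY_RUN switch, and the function names are all assumptions; only the polysrv/polyclt options are taken from the command lines above.)

```shell
#!/bin/sh
# Hypothetical launcher sketch. Host names are placeholders (assumptions).
SERVERS="srv1"
CLIENTS="clt1 clt2"
CONFIG="v2_perf.pg"

# Build the command line for one agent program (polysrv or polyclt),
# using the same options as the command lines quoted above.
build_cmd() {
    cmd="$1 --verb_lvl 3 --config $CONFIG --log logfile --log_size 10MB"
    if [ "$1" = "polyclt" ]; then
        cmd="$cmd --unique_world 0"
    fi
    echo "$cmd"
}

# Start one agent on one host. DRY_RUN defaults to 1, so by default this
# only prints what would be run; set DRY_RUN=0 to really call rsh.
start_agent() {
    host="$1"
    prog="$2"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "rsh $host $(build_cmd "$prog")"
    else
        # -n keeps the backgrounded rsh from reading the terminal
        rsh -n "$host" "$(build_cmd "$prog")" &
    fi
}

# Start the servers first, then the clients, and wait for all of them.
launch_all() {
    for h in $SERVERS; do start_agent "$h" polysrv; done
    for h in $CLIENTS; do start_agent "$h" polyclt; done
    wait
}

launch_all
```

(With the default DRY_RUN=1 this just prints one rsh line per host, which is handy for checking that servers and clients get exactly the options intended.)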
-----------------------------------------
(Here's one of the common config files that all clients and servers
use.)
/*
* $Id: simple.pg,v 1.1.1.1.2.1 2000/06/21 20:52:36 rousskov Exp $
*
* a very simple "Hello World!" workload
*
*/
// get the distributed wait and cool phases
#include "include/phases.pg"
// we start with defining content properties for our server to generate
Content SimpleContent = {
size = exp(13KB); // response sizes distributed exponentially
cachable = 80%; // 20% of content is uncachable
};
// a primitive server cleverly labeled "S101"
// normally, you would specify far more properties,
// but we will mostly rely on defaults for now
Server S = {
kind = "S101";
contents = [ SimpleContent ];
direct_access = [ SimpleContent ];
};
// a primitive robot
Robot R = {
kind = "R101";
public_interest = 50%;
pop_model = { pop_distr = pmUnif(); };
// pconn_use_lmt = zipf(128);
// open_conn_lmt = 8; // open connections limit
};
// recurrence is the probability that a robot revisits a URL
// set recurrence ratio as desired_DHR/cachability_ratio
R.recurrence = 55% / SimpleContent.cachable;
// for production tests, never use one host for clients and servers!
addr[] srv_ips = ['172.16.1.1:80' ];
addr[] rbt_ips = ['172.16.129.1', '172.16.130.1'];
R.origins = srv_ips; // tell our robot where the server is
// assign agents (servers and robots) to their hosts
S.hosts = srv_ips;
R.hosts = rbt_ips;
Phase phONE = {
name = "phase_one";
goal.duration = 5min;
log_stats = true;
};
schedule( phWait, phONE, phCool );
// commit to using these servers and robots
use(S, R);