Re: FreeBSD & Polygraph: Spontaneous reboots

From: Alex Rousskov (rousskov@measurement-factory.com)
Date: Mon Apr 05 2004 - 10:46:30 MDT


On Mon, 5 Apr 2004, Bryn Reeves wrote:

> I have made four attempts so far, and each time the client machine
> has spontaneously rebooted between 10 and 30 hours into testing.
> Nothing gets written to /var/log/messages and the console logs from
> polyclt look completely normal.
>
> We are using The Measurement Factory edition of FreeBSD 4.3 on the
> client/server nodes with polygraph 2.7.6. Would it be worth trying
> 2.8.0 as a first step?

While upgrading Polygraph is a good idea, I would not do it as the
first step since Polygraph is unlikely to be at fault here. Here is
what you can try:

        0. Run a memory test. Duane Wessels has a simple
           program you can use (and there are probably many
           fancier ones):
           http://www.life-gone-hazy.com/src/test_memory/

        1. Run a CPU benchmark, "make world", or any other
           large build that is likely to stress the system.
           I am not optimistic about this step, but
           it might help.

        2. Did you run a bi-directional netperf test as a
           part of your PolyMix preparation routine? Run
           it a bit longer (20 min?) to see if this is a
           networking-related issue (e.g., problems with
           a NIC driver).

        3. Run a no-proxy test at 500 req/sec or more.
           Does that crash the machine?

        4. Upgrade to FreeBSD 4.7. We can provide ISO
           images if you need them. Perhaps this should
           be the first step; I am not sure.

What CPU and NIC are you using? Feel free to privately e-mail me your
client and server Polygraph --console logs in case you missed some
clues there (I doubt you did).

Thanks,

Alex.


