Re: too many errors in fill phase

From: Alex Rousskov (rousskov@measurement-factory.com)
Date: Fri Jan 19 2001 - 08:19:46 MST


On Fri, 19 Jan 2001, 허남조 wrote:

> I am testing with polygraph-2.5.4. With two polysrv-polyclt
> pairs, I ran an 800 req/sec test. When I tested a direct
> connection between polysrv and polyclt, the performance was
> good. But when the cache was used, Polygraph reported several
> hundred errors per second. Those errors happened in the fill
> phase. In the top phase there were no errors, so it does not
> seem to be a problem with the cache.
>
> ReportGen reported the following errors:
>
> #errno   count  count%  explanation
>     32      82    0.01  "Broken pipe"
>     54  405644   70.14  "Connection reset by peer"
>     60    3021    0.52  "Operation timed out"
>     61   11623    2.01  "Connection refused"
>    263     241    0.04  "premature end of msg body"
>    267  157731   27.27  "unsupported HTTP status code"
>
> Has anybody experienced a similar problem?
> Please give me some advice.

I can see two distinct possibilities:

        - The cache was not able to *fill* at the offered
          rate, but was able to sustain the peak rate of the
          "normal" workload. Fill traffic is very different
          from the traffic in the main phases, so the cache
          may be able to survive one but not the other.
          Suggestion: try reducing the fill rate by 50% or
          more without changing the peak rate.

        - The same errors happen during the main phases, but
          you thought they did not because Polygraph uses an
          exponential back-off algorithm to report errors on
          the console.
          Suggestion: use the error graph (the last graph) in
          the generated report to see if errors have indeed
          disappeared; also look carefully at the console
          output to see if the errors were there (but were
          _reported_ less frequently).
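
To illustrate the second possibility, here is a minimal Python sketch of
exponential back-off reporting. It is not Polygraph's actual code: the
function name and the doubling rule (print only the 1st, 2nd, 4th, 8th,
... occurrence of an error) are assumptions chosen for illustration.

```python
def reported_occurrences(total_errors):
    """Return the occurrence numbers that reach the console when only
    the 1st, 2nd, 4th, 8th, ... occurrence of an error is printed.

    Hypothetical helper: the doubling rule is an assumption, not
    Polygraph's documented behavior.
    """
    reported = []
    next_report = 1  # occurrence number that triggers the next console line
    for n in range(1, total_errors + 1):
        if n == next_report:
            reported.append(n)
            next_report *= 2  # wait twice as long before reporting again
    return reported


if __name__ == "__main__":
    # 405644 is the "Connection reset by peer" count from the report above.
    shown = reported_occurrences(405644)
    print(len(shown), "console lines for 405644 errors")
    print("last reported occurrence:", shown[-1])
```

Under this assumed rule, the gap between console lines keeps doubling, so
errors that continue at a steady rate look like they are fading away on
the console. The error graph in the report does not have this effect.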

HTH,

Alex.


