Re: missing server host information in the request with webaxe-4

From: Alex Rousskov (rousskov@measurement-factory.com)
Date: Fri Aug 31 2001 - 11:26:31 MDT


On Sat, 1 Sep 2001, Eiji Kawai wrote:

> I tried to upgrade the polygraph (2.7.2->2.7.3) and found that the
> requests generated by polygraph-2.7.3 with webaxe-4 did not contain
> the information about server hosts.

WebAxe-4 request URLs do not contain host names because WebAxe robots
send their requests to what they believe is an origin server, not a
proxy. WebAxe-4 requests still carry a host name, passed via the
Host: header as required by HTTP. That host name is the name of the
origin server the robots think they are talking to.
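
To make the two request shapes concrete, here is a minimal Python
sketch. It is not Polygraph code: build_request() and its parameters
are hypothetical names used only for illustration. It assumes, as in
the traces quoted in this thread, that port 80 is left out of the URL
but kept in the Host: header.

```python
def build_request(path, host, port=80, via_proxy=False):
    """Return the text of a minimal HTTP/1.0 GET request.

    Hypothetical helper, for illustration only.
    host/port name the server the client believes is the origin.
    """
    if via_proxy:
        # Forward-proxy style: the request line carries an absolute URL.
        authority = host if port == 80 else f"{host}:{port}"
        target = f"http://{authority}{path}"
    else:
        # Origin-server (or surrogate) style: relative URL only.
        target = path
    # In both styles the host name travels in the Host: header.
    return f"GET {target} HTTP/1.0\r\nHost: {host}:{port}\r\n\r\n"


if __name__ == "__main__":
    path = "/w052e6e70.73aa019b:00000006/t02/_00000002.htm"
    print(build_request(path, "172.16.101.61", via_proxy=True))
    print(build_request(path, "172.16.101.32"))
```

The first call reproduces the shape of a request meant for a forward
proxy, the second the shape of a request sent to an origin server.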
 
> With polygraph-2.7.2, the requests sent to the proxy server are like
> the following one.
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> GET http://172.16.101.61/w052e6e70.73aa019b:00000006/t02/_00000002.htm HTTP/1.0
> Host: 172.16.101.61:80
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Of course, 172.16.101.61 is the address of the server host that
> executes the polysrv program, and not the address of the proxy server.

This 2.7.2 behavior was incorrect. Robots should be sending requests
to an origin server, not a proxy. Thus, they should use relative URLs.

> On the other hand, with polygraph-2.7.3, the requests sent to the
> proxy server are like
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> GET /w052e6e70.73aa019b:00000006/t02/_00000002.htm HTTP/1.0
> Host: 172.16.101.32:80
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Here, 172.16.101.32 is the address of the proxy server.

This is correct. While you think of 172.16.101.32 as a proxy, robots
think of that address as an origin server. This is the primary
difference between a forward caching proxy and a surrogate (reverse
proxy). The surrogate (172.16.101.32 in your case) represents origin
servers and is indistinguishable from the "real" origin servers from
the client's point of view.
 
> Is this behavior of the polyclt is correct? or do I set some wrong
> configuration?

The robot behavior in 2.7.3 is correct. Your PGL configuration is also
correct.

You just need to configure your proxy to accelerate Polygraph servers
as opposed to being a regular forward cache. In other words, you need
to configure your proxy to be a surrogate for 172.16.101.191-192:80
origin servers. Clients should never talk to those servers directly.
All clients know is that there is one [powerful] origin server known
as 172.16.101.32:80.
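
As a rough illustration of that arrangement, the Python sketch below
models a surrogate that accepts relative-URL requests addressed to its
own public address and hands each one to a hidden origin server. The
Surrogate class, its route() method, and the round-robin choice are
assumptions made for this example. A real proxy may pick servers
differently, and will usually answer from its cache when it can.

```python
from itertools import cycle


class Surrogate:
    """Hypothetical model of a surrogate; not any real proxy's API."""

    def __init__(self, public_addr, backends):
        self.public_addr = public_addr        # the only address clients know
        self._next_backend = cycle(backends)  # assumed round-robin policy

    def route(self, request_line, host_header):
        """Return (backend, path) for one request that missed the cache."""
        method, target, version = request_line.split()
        if not target.startswith("/"):
            # Absolute URLs are what a forward proxy would receive.
            raise ValueError("surrogate expects a relative URL")
        if host_header != self.public_addr:
            raise ValueError("request is not addressed to this surrogate")
        return next(self._next_backend), target


if __name__ == "__main__":
    surrogate = Surrogate("172.16.101.32:80",
                          ["172.16.101.191:80", "172.16.101.192:80"])
    print(surrogate.route("GET /t02/_00000002.htm HTTP/1.0",
                          "172.16.101.32:80"))
```

Clients only ever see 172.16.101.32:80. The two origin addresses
appear only inside the surrogate's own configuration.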

The above is designed to mimic common surrogate installations: a
surrogate representing one or more identical origin servers. There are
other accelerating models as well.
 
HTH,

Alex.


