Re: realistic content simulation & compression

From: Alex Rousskov (rousskov@measurement-factory.com)
Date: Thu May 15 2003 - 13:57:31 MDT


On Thu, 15 May 2003, Andrew Sundelin wrote:

> I'm thinking about using the "Realistic Content Simulation" feature
> of Polygraph over a link that has the capability of compressing
> traffic. However, the doc page
>
> http://www.web-polygraph.org/docs/userman/csm/
>
> seems to have all sorts of caveats about the images.
>
> Is the issue that the images just aren't always displayed properly
> in browsers or are there other issues with the image content? i.e.
> if I use the demo.cdb file am I going to get traffic that is
> basically representative of real web traffic in terms of its
> compressibility (or non-compressibility as the case may be)?

There are several related aspects here. First of all, Polygraph does
not really know what an "image" is. Polygraph robots simply follow
(fetch) URLs embedded in the responses they receive. The on-line demo,
on the other hand, assumes that you use a browser that does try to
display images.

Second, the Content Database (CDB) interface assumes markup content by
default: when you add a file to a CDB, the "cdb" tool parses that file
as XML-like markup and splits it into elements before inserting those
elements into the database. This is done to allow Polygraph servers to
generate new markup "pages" based on existing ("real") elements or
existing markup fragments.

Third, a "verbatim" format can be used to instruct "cdb" that no
parsing/splitting should be performed on the files, and that each file
is to be added to the database "as is". It is recommended that you use
the same "verbatim" setting for all files within one CDB. If you
follow that recommendation, then you can configure a Polygraph server to
serve those files "as is". The latter is done by omitting the "size"
distribution when specifying a Content object.

To use real content in a test, follow these steps:
        - create CDB(s) with markup content (e.g., HTML and XML)
        - create CDB(s) with verbatim content (e.g., images and sound)
        - specify one Content PGL object per CDB
        - configure Polygraph Servers to use an appropriate mix
          of Content objects

The above should give you a realistic traffic mix and should allow for
accurate compression tests (as long as both your mix and your CDB
sample are realistic).
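
Before running a test, it may help to get a rough idea of how
compressible a given mix is. The sketch below is not part of Polygraph;
it is a stand-alone illustration that assumes zlib is a reasonable
stand-in for the link's compressor, that the shares are fractions of
traffic bytes, and that repeated HTML and pseudo-random bytes can stand
in for markup and already-compressed verbatim content. The names
compression_ratio, mix_ratio, and the sample data are hypothetical.

```python
import random
import zlib


def compression_ratio(samples):
    """Return compressed bytes divided by original bytes for the samples."""
    original = sum(len(s) for s in samples)
    compressed = sum(len(zlib.compress(s)) for s in samples)
    return compressed / original


def mix_ratio(mix):
    """Weight each kind's ratio by its share of traffic bytes.

    mix maps a kind name to a (byte_share, samples) pair; the shares
    are assumed to add up to 1.0.
    """
    return sum(share * compression_ratio(samples)
               for share, samples in mix.values())


# Hypothetical stand-ins for real files taken from your CDBs.
rng = random.Random(0)
markup_samples = [b"<p>hello world</p>\n" * 200]
verbatim_samples = [bytes(rng.getrandbits(8) for _ in range(4096))]

# Hypothetical mix: 30% of bytes are markup, 70% are verbatim content.
mix = {
    "markup": (0.3, markup_samples),
    "verbatim": (0.7, verbatim_samples),
}

markup_ratio = compression_ratio(markup_samples)
verbatim_ratio = compression_ratio(verbatim_samples)
overall_ratio = mix_ratio(mix)

print("markup:   %.2f" % markup_ratio)
print("verbatim: %.2f" % verbatim_ratio)
print("overall:  %.2f" % overall_ratio)
```

To apply this to your own data, replace the two sample lists with the
bytes of the files you added to each CDB and set the shares to match
the Content mix you configured. A ratio near 1 means the content is
essentially incompressible.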

Please let me know if you need more information. The above is just a
brief summary.

Alex.

-- 
                            | HTTP performance - Web Polygraph benchmark
www.measurement-factory.com | HTTP compliance+ - Co-Advisor test suite
                            | all of the above - PolyBox appliance


