PolyMix-3

Here is PolyMix-3 at a glance.

Workload Name: PolyMix-3

Polygraph Version: 2.5.4

Configuration: workloads/polymix-3.pg

Workload Parameters: cache size, fill request rate, peak request rate

Results: available

Synopsis: workload designed specifically for the third cache-off

PolyMix-3 is based on our experience with using PolyMix-2 during the second cache-off and other tests. We have eliminated some of the known problems of the old workloads and added new features. The ultimate goal is, of course, to bring our model closer to the real world.

1. Feature overview
    1.1 Phase schedule
    1.2 Fill phase caveats
    1.3 Reply sizes
    1.4 Cachable and uncachable replies
    1.5 Life-cycle model
    1.6 Content types
    1.7 Latency and packet loss
    1.8 If-Modified-Since requests
    1.9 Cache hits and misses
    1.10 Object popularity
    1.11 Simulated robots and servers
    1.12 Persistent connections
2. Address allocation
    2.1 Allocation scheme
    2.2 Number of clients and servers
    2.3 Configuration example
    2.4 Other provisions

1. Feature overview

The PolyMix environment has been modeling the following Web traffic characteristics since PolyMix-2.

a mixture of content types

varying offered load, depending on the test phase

a working set of URLs that changes its content with time but can preserve its size

all distributed clients can share information about the global URL set

object life-cycles (expiration and last-modification times)

persistent connections

network packet loss

reply sizes

server-side latencies

a mixture of cache hits and cache misses

a mixture of cachable and uncachable responses

object popularity (recurrence)

request rates and interarrival times

embedded objects and browser behavior

virtually infinite number of different objects that are added to the working set as needed

These features were added for PolyMix-3.

integrated fill and measurement phases into a single run

cache validation (IMS requests)

forced cache validations (reloads)

hot subsets simulating flash crowds

improved URL working set handling

The following are still absent from the cache-off workload.

DNS-lookup latencies

aborted requests

real content (HTML, images, etc.)

client-side latencies, bandwidth limits

non-HTTP traffic

different popularity characteristics among servers

While the last four features are already supported in a Polygraph environment, they are prohibitively CPU intensive or require further improvement.

1.1 Phase schedule

The following table describes all the important phases in a PolyMix-3 test. Not counting the fill phase, the test takes about 12 hours. Filling the cache usually takes an additional 3-12 hours, depending on the product.

Phase Name	Duration	Activity
framp	30 min	The load is increased from zero to the peak fill rate.
fill	variable	The cache is filled twice, and the working set size is frozen.
fexit	30 min	The load is decreased to 10% of the peak fill rate. At the same time, recurrence is increased from 5% DHR to its maximum level.
inc1	30 min	The load is increased during the first hour to reach its peak level.
top1	4 hours	The period of peak ``daily'' load.
dec1	30 min	The load steadily goes down, reaching a period of relatively low load.
idle	30 min	The ``idle'' period with load level around 10% of the peak request rate.
inc2	30 min	The load is increased to reach its peak level again.
top2	4 hours	The second period of peak ``daily'' load.
dec2	30 min	The load steadily goes down to zero.

The most reliable and interesting measurements are usually taken from the top2 phase, when the proxy is more likely to be in a steady state.
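As a quick check of the "about 12 hours" figure, the fixed-length phases can be summed. The sketch below simply copies the durations from the phase table; the names PHASE_MINUTES and total_hours are illustrative and not part of Polygraph.

```python
# Durations of the fixed-length PolyMix-3 phases, in minutes, copied from the
# phase table. The fill phase has a variable length and is left out.
PHASE_MINUTES = {
    "framp": 30,
    "fexit": 30,
    "inc1": 30,
    "top1": 240,
    "dec1": 30,
    "idle": 30,
    "inc2": 30,
    "top2": 240,
    "dec2": 30,
}


def total_hours(phases):
    """Return the summed duration of the given phases in hours."""
    return sum(phases.values()) / 60


print(total_hours(PHASE_MINUTES))  # 11.5
```

The result, 11.5 hours, is consistent with the rounded figure quoted above.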

1.2 Fill phase caveats

As mentioned previously, PolyMix-3 combines the fill and measurement phases into a single workload. The benefit to this approach is that the device under test is more likely to have steady state conditions during the measurement phases. Also, a larger URL working set can now be formed without increasing the duration of a test. Under PolyMix-2, the fill phase was an isolated test. That meant that the measurement phase could not request objects used during the fill phase.

A downside to integrating the fill phase is that it is now difficult to skip the fill phase and go right to measuring. For some products, half of the testing time is spent in the fill phase. The total duration of the test remains similar to the PolyFill-2 plus PolyMix-2 sequence, decreasing for some products.

The old PolyFill-2 workload used Polygraph's best-effort request submission model, and vendors could choose how many robots to use for the fill. Some participants apparently found that a small number of robots left the disk system in a higher performing state than did a larger number. Now, PolyMix-3 uses the same number of robots during all of its phases, and the participants can specify fill rate directly just as they specify the peak request rate.

One of the rules of PolyMix-3 is that the request rate during the fill phase must not be greater than the peak rate (as used in top1 and top2). Otherwise, users are free to choose virtually any fill rate they like. Usually, the selected fill rate is at least 50% of the peak request rate.

PolyMix-3 limits the fill request rate to the peak request rate to prevent test participants from specifying a very high fill rate that causes the cache to reject or bypass some of the incoming fill traffic, effectively reducing the amount of content stored by the cache. Some products used this (now illegal) trick with PolyFill-2 to cache less data during the fill and optimize their data placement layout for the measurement phases. Ideally, Polygraph should check how much data is actually cached instead of imposing artificial request rate limits.

1.3 Reply sizes

Object reply size distributions are different for different content types (see the table below). Reply sizes range from 300 bytes to 5 MB with an overall mean of about 11 KB and a median of 5 KB. The reply size depends only on the object ID (oid). Thus, the same object always has the same reply size, regardless of the number of requests for that object.

1.4 Cachable and uncachable replies

Polygraph servers mark some of their responses as uncachable. The particular probability varies with content types (see the table below). Overall, the workload results in about 80% of all responses being cachable. The real world cachability varies from location to location. We have chosen 80% as a typical value that is close to many common environments.

A cachable response includes the following HTTP header field.
Cache-Control: public

An uncachable response includes the following HTTP header fields.
Cache-Control: private,no-cache
Pragma: no-cache

Object cachability depends only on the oid. The same oid is always cachable, or always uncachable.
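The idea that cachability is a fixed property of the oid can be illustrated with a deterministic mapping. The sketch below is hypothetical: it uses a CRC32 checksum as a stand-in for whatever per-object randomness Polygraph actually uses, and it applies a flat 80% rate rather than per-content-type probabilities. Only the header values come from the text above.

```python
import zlib


def is_cachable(oid, cachable_percent=80):
    """Deterministically map an object ID to a cachability decision.

    crc32 is an assumption of this sketch, not Polygraph's method; the
    point is only that the same oid always yields the same answer.
    """
    return zlib.crc32(str(oid).encode()) % 100 < cachable_percent


def cache_headers(oid):
    """Return the cache-related header fields for a response to this oid."""
    if is_cachable(oid):
        return ["Cache-Control: public"]
    return ["Cache-Control: private,no-cache", "Pragma: no-cache"]


# Repeated requests for one object always carry the same header fields.
print(cache_headers(42) == cache_headers(42))
```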

1.5 Life-cycle model

Web Polygraph is capable of simulating realistic (complex) object expiration and modification conditions using Expires: and Last-Modified: HTTP headers. Each object is assigned a ``birthday'' time. An object goes through modification cycles of a given length. Modification and expiration times are randomly selected within each cycle. The corresponding parameters for the model are drawn from the user-specified distributions.

The Life-cycle model configuration in PolyMix-3 does not utilize all the available features. We restrict the settings to reduce the possibility that a cache serves a stale response. While stale objects are common in real traffic, caching vendors strongly believe that allowing them into the benchmark sends the wrong message to buyers.

Consequently, all Polygraph responses in PolyMix-3 carry modification and expiration information, and that information is correct. The real-world settings would be significantly different, but it is difficult to accurately estimate the influence of these settings on cache performance.

1.6 Content types

PolyMix-3 defines a mixture of content types. Each content type has the following properties.

popularity

content size distribution

cachability percentage

life-cycle parameters

file name extensions distribution

The approximate parameters for the first four properties are given in the table below. For exact definitions, see the workload files.

Type	Portion	Reply Size	Cachability	Expiration
Image	65.0%	exp(4.5KB)	80%	logn(30day, 7day)
HTML	15.0%	exp(8.5KB)	90%	logn(7day, 1day)
Download	0.5%	logn(300KB,300KB)	95%	logn(0.5year, 30day)
Other	19.5%	logn(25KB,10KB)	72%	unif(1day, 1year)
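The per-type parameters can be cross-checked against the overall figures quoted earlier (a mean reply size of about 11 KB and about 80% cachable responses). The sketch below weights each type by its portion. It assumes that the first argument of exp() and logn() is the distribution mean; the exact definitions are in the workload files.

```python
# (portion, mean reply size in KB, cachable fraction) per content type,
# copied from the table. Reading the first argument of exp() and logn()
# as the mean is an assumption of this sketch.
CONTENT_TYPES = {
    "Image":    (0.650, 4.5,   0.80),
    "HTML":     (0.150, 8.5,   0.90),
    "Download": (0.005, 300.0, 0.95),
    "Other":    (0.195, 25.0,  0.72),
}


def weighted_mean(index):
    """Average column `index` over all types, weighted by portion."""
    return sum(row[0] * row[index] for row in CONTENT_TYPES.values())


mean_size_kb = weighted_mean(1)
cachable = weighted_mean(2)
print(f"mean reply size: {mean_size_kb:.1f} KB")
print(f"cachable responses: {cachable:.1%}")
```

Under that assumption the weighted mean size comes to roughly 10.6 KB and the weighted cachability to roughly 80%, in line with the overall figures.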

1.7 Latency and packet loss

PolyMix-3 uses the same latency and packet loss parameters that we used for PolyMix-2. The Polygraph client and server machines are configured to use FreeBSD's DummyNet feature.

We configure Polygraph servers with 40 millisecond delays (per packet, incoming and outgoing), and with a 0.05% probability of dropping a packet. Server think times are normally distributed with a 2.5 second mean and a 1 second standard deviation. Note that the server think time does not depend on the oid. Instead, it is randomly chosen for every request.

We do not use packet delays or packet loss on Polygraph clients.

1.8 If-Modified-Since requests

About 20% of PolyMix-3 requests contain an ``If-Modified-Since'' HTTP header. Polygraph robots cache object timestamps to generate those headers. When an IMS request has to be made but the corresponding timestamp is not available, the value of ``Thu Jan 1 00:00:00 UTC 1970'' is used. In fact, the majority of requests end up using that value so that the percentage of ``304 Not Modified'' responses from an ideal cache is only around 5%, far less than 20% of IMS requests.
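The fallback rule can be sketched as follows. This is an illustration rather than Polygraph code: the function name ims_header is made up, and the header value is rendered in the standard HTTP date format, which differs in notation from the value quoted above but denotes the same instant.

```python
from email.utils import formatdate


def ims_header(cached_timestamp=None):
    """Return an If-Modified-Since header line.

    cached_timestamp is seconds since the Unix epoch, or None when the
    robot has no timestamp for the object; the epoch itself is then used.
    """
    seconds = 0 if cached_timestamp is None else cached_timestamp
    return "If-Modified-Since: " + formatdate(seconds, usegmt=True)


print(ims_header())  # If-Modified-Since: Thu, 01 Jan 1970 00:00:00 GMT
```

Because no object can be older than the epoch, a request carrying the fallback value cannot be answered with ``304 Not Modified'' for a valid object, which is why the share of 304 responses stays well below the share of IMS requests.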

1.9 Cache hits and misses

The PolyMix-3 workload has a 58% offered hit ratio. In the workload definition, this is actually specified through the recurrence ratio (i.e., the probability of revisiting a Web object). The recurrence ratio must account for uncachable responses and special requests. In PolyMix-3, a recurrence ratio of 72% yields an offered hit ratio of 58%. Note that, to simplify analysis, only ``basic'' requests are counted when the hit ratio is computed; special requests (If-Modified-Since and Reload) are ignored because in many cases there is no reliable way to detect whether the response was served as a cache hit.

Polygraph enforces the desired hit ratio by requesting objects that have been requested before, and should have been cached. There is no guarantee, however, that the object is in the cache. Thus, our parameter (58%) is an upper limit. The hit ratio achieved by a proxy may be lower if a proxy does not cache some cachable objects, or purges previously cached objects before the latter are revisited. Various HTTP race conditions also make it difficult, if not impractical, to achieve ideal hit ratios.
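A rough way to see how a 72% recurrence ratio relates to a 58% offered hit ratio is to assume that a revisit can only be a hit when the object is cachable, and to use the overall 80% cachability figure. This first-order estimate is an assumption of the sketch, not the workload's definition, and it ignores special requests.

```python
def offered_hit_ratio(recurrence, cachable_fraction):
    """First-order estimate: a revisit can only hit if the object is cachable."""
    return recurrence * cachable_fraction


estimate = offered_hit_ratio(0.72, 0.80)
print(f"{estimate:.1%}")  # 57.6%
```

The estimate of 57.6% is close to the stated 58%.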

1.10 Object popularity

PolyMix-3 introduces a ``hot subset'' simulation into the popularity model. At any given time, a 1% subset of the URL working set is dedicated to receive 10% of all requests. As the working set slides with time, the hot subset may jump to a new location so that all hot objects stay within the working set. This model is designed to simulate realistic Internet conditions, including ``flash crowds.'' We have not yet fully analyzed the effect of this hot subset model.
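The 1%/10% split can be sketched as a two-step choice: first decide whether a request goes to the hot subset, then pick an object within the chosen group. The code below is a simplified illustration. It assumes a contiguous hot subset and uniform selection within each group, neither of which is stated in the text, and pick_object is a made-up name.

```python
import random


def pick_object(working_set_size, hot_start, rng,
                hot_fraction=0.01, hot_share=0.10):
    """Pick an object index from a working set with a contiguous hot subset.

    The hot subset covers indices [hot_start, hot_start + hot_size) and
    must lie entirely inside the working set.
    """
    hot_size = max(1, int(working_set_size * hot_fraction))
    if rng.random() < hot_share:
        # Request aimed at the hot subset.
        return hot_start + rng.randrange(hot_size)
    # Otherwise pick uniformly among the remaining (non-hot) objects,
    # skipping over the hot range.
    idx = rng.randrange(working_set_size - hot_size)
    return idx if idx < hot_start else idx + hot_size


rng = random.Random(1)
draws = [pick_object(10_000, 500, rng) for _ in range(20_000)]
hot_hits = sum(500 <= d < 600 for d in draws)
print(f"share of requests to the hot subset: {hot_hits / len(draws):.3f}")
```

Moving hot_start between calls would model the hot subset jumping to a new location as the working set slides.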

1.11 Simulated robots and servers

A single Polygraph client machine supports many simulated robots. A robot can emulate various types of Web clients, from a human surfer to a busy peer cache. All robots in PolyMix-3 are configured identically, except that each has its own IP address. We limit the number of robots (and hence IP aliases) to 1000 per client machine.

A PolyMix-3 robot requests objects using a Poisson-like stream, except for embedded objects (images on HTML pages) that are requested simulating cache-less browser behavior. A limit on the number of simultaneously open connections is also supported, and may affect the request stream.

PolyMix-3 servers are configured identically, except that each has its own IP address.

1.12 Persistent connections

Polygraph supports persistent connections on both client and server sides. PolyMix-3 robots close an ``active'' persistent connection right after receiving the N-th reply, where N is drawn from a Zipf(64) distribution. The robots will close an ``idle'' persistent connection if the per-robot connection limit has been reached and connections to other servers must be opened. The latter mimics browser behavior.

PolyMix-3 servers use a Zipf(16) distribution to close active connections. The servers also time out idle persistent connections after 15 sec of inactivity, just as many real servers do.

2. Address allocation

In theory, the algorithm for assigning IP addresses to servers and robots should not affect the results of the tests. However, knowing the IP allocation scheme may be important for those who rely on IP-based redirection capabilities of their network gear. Reachability of servers and robots is also an issue.

The IP allocation scheme for PolyMix-3 is the same as for PolyMix-2. However, Polygraph is now capable of automatically computing required IP addresses based on the bench configuration specified using PGL.

2.1 Allocation scheme

The following allocation scheme is used for the tests. Each robot or server within a testing Cluster is assigned a 10.C.x.y/N IP address.

The number of IP addresses per testing Cluster (and hence the number of robots and servers) is proportional to the maximum requested load. Enough physical hosts are provided to ensure Polygraph is not a bottleneck. At the time of writing, we expect to be able to handle at least 1,000 IP addresses per host.

The values of C, x, y, and N are defined below.

C is the testing Cluster identifier (constant for all IPs within a cluster).

For robots, x is in the [1,127] range.

For servers, x is in the [129,250] range.

For robots and servers, y is in the [1,250] range.

10.C.0.1 is the proxy address known to robots (if robots are configured to talk to a proxy). Participants can use other 10.C.0.0/24 and 10.C.128.0/24 addresses if needed.

Polyteam will use other IP addresses as needed for monitoring and other purposes.

Moreover, exactly two schemes are supported.

Switched network: For robots, servers, and a known proxy address (if any), the subnet /N is set to /16 (a class B network, 255.255.0.0 netmask).

Routed network: For robots, servers, and a known proxy address (if any), the subnet /N is set to /17 (a 255.255.128.0 netmask). Client side machines point their default routes to 10.C.0.253.

The routed network configuration may be useful for router-based redirection techniques such as WCCP.
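The practical difference between the two schemes is whether robots and servers share a subnet. The sketch below uses Python's standard ipaddress module to check this for one robot and one server address; the cluster id 100 and the helper name same_subnet are illustrative.

```python
import ipaddress


def same_subnet(addr_a, addr_b, prefix_len):
    """Return True if both addresses fall into the same /prefix_len network."""
    net_a = ipaddress.ip_interface(f"{addr_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{prefix_len}").network
    return net_a == net_b


robot, server = "10.100.1.1", "10.100.129.1"  # cluster id 100 is illustrative
print(same_subnet(robot, server, 16))  # True: switched network
print(same_subnet(robot, server, 17))  # False: routed network
print(ipaddress.ip_network("10.100.0.0/17").netmask)  # 255.255.128.0
```

With /17, the robot range (x in [1,127]) and the server range (x in [129,250]) land in different networks, so traffic between them has to cross a router.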

2.2 Number of clients and servers

Given the total request rate RR req/sec, we allocate R = RR/0.4 = 2.5*RR robot IP addresses (0.4 req/sec is the request rate of an individual robot). The number of server IP addresses is then 0.1*R+500. Robot and server IPs are distributed evenly and sequentially across Polygraph machines (see example).

We place a limit of 1000 robots per client machine. The number of server machines is equal to the number of client machines.

When the calculations produce a non-integer value V, we round up to the closest integer greater than V.
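The sizing rules above can be written down as a short calculation. This is a sketch under the stated rules only; the function name bench_size is made up, and exact fractions are used to avoid floating-point surprises when rounding up.

```python
import math
from fractions import Fraction

ROBOTS_PER_CLIENT_HOST = 1000      # limit stated in the text
RATE_PER_ROBOT = Fraction(4, 10)   # 0.4 req/sec per robot


def bench_size(peak_rate):
    """Return (robots, servers, client_hosts, server_hosts) for a peak rate."""
    robots = math.ceil(Fraction(peak_rate) / RATE_PER_ROBOT)
    servers = math.ceil(Fraction(robots, 10) + 500)
    client_hosts = math.ceil(Fraction(robots, ROBOTS_PER_CLIENT_HOST))
    # The number of server machines equals the number of client machines.
    return robots, servers, client_hosts, client_hosts


print(bench_size(800))  # (2000, 700, 2, 2)
```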

2.3 Configuration example

Here is an example of a possible configuration. Let's assume we want to test a product under 800 req/sec peak load.

An 800 req/sec setup requires 2,000 robots and 700 servers. We must utilize two machines running polyclt and two machines running polysrv. The robot and server IP addresses are allocated as follows (assuming test Cluster id 100).

Host	Switched network	Routed network
First client side host	10.100.1-4.1-250/16	10.100.1-4.1-250/17
Second client side host	10.100.5-8.1-250/16	10.100.5-8.1-250/17
First server side host	10.100.129-130.1-175/16	10.100.129-130.1-175/17
Second server side host	10.100.131-132.1-175/16	10.100.131-132.1-175/17

Thus, each client-side host is assigned 1,000 IP addresses while each server host gets 350 IPs.
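The address ranges in this example can be reproduced with a few lines of code. The sketch below assumes that each host spreads its addresses evenly over as few third-octet values as possible, with at most 250 addresses per value; that rule is inferred from the example, and Polygraph's own PGL-based address computation may differ. The name host_ranges is illustrative.

```python
import math


def host_ranges(cluster, first_x, per_host, hosts, prefix_len):
    """Yield one 'x-range.y-range/prefix' string per host.

    Assumes every third octet holds the same number of addresses and at
    most 250 of them, which reproduces the example table.
    """
    octets = math.ceil(per_host / 250)        # third-octet values per host
    per_octet = math.ceil(per_host / octets)  # addresses within each octet
    for h in range(hosts):
        lo = first_x + h * octets
        hi = lo + octets - 1
        yield f"10.{cluster}.{lo}-{hi}.1-{per_octet}/{prefix_len}"


print(list(host_ranges(100, 1, 1000, 2, 16)))
# ['10.100.1-4.1-250/16', '10.100.5-8.1-250/16']
print(list(host_ranges(100, 129, 350, 2, 17)))
# ['10.100.129-130.1-175/17', '10.100.131-132.1-175/17']
```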

2.4 Other provisions

All robots must be able to ``talk'' HTTP to all servers at all times, even if a proxy is not in the loop (a no-proxy test). No changes in robot or server configuration are allowed for a no-proxy test. Only unplugging the proxy cable is allowed. Consequently, if a proxy relies on robots and servers being on different subnets during performance tests, a no-proxy run must be feasible without changing the subnets of the robots and servers. Providing for intra-robot or intra-server communication is not required.

Participants must provide unlimited TCP/UDP connectivity from a single dedicated host (a monitoring station maintained by Polyteam, one per testing Cluster) to all robots and servers. The monitoring station has a single network interface.

There are no DNS servers or other global services reachable from a testing Cluster. There are no permanent inter-cluster connections.

Not all operating systems can [efficiently] support a large number of IP addresses per host. We patch the kernel to ensure that FreeBSD can support thousands of addresses.
