This page documents Polygraph functionality that lets a tester continue a test with the working set of an old test instead of starting from scratch. This functionality is available starting with Polygraph version 2.8.0.
Contents:

- Storing working set
- Loading working set
- Load, update, store, ...
- In-between-tests changes to workload
- Frozen working set size
- Does it really work?
When testing stateful devices such as Web caches, the device must be brought to a steady state before any reliable measurements can be collected. Thus, most stand-alone workloads have two parts: reaching a steady state and measuring sustained performance using that steady state. Depending on the environment, it may take many hours, if not days, to reach a steady state. Unfortunately, once the test is over, all state information is lost, and Polygraph has to start from scratch even though the device under test may still be in a steady state.
The Persistent Working Set feature described below lets Polygraph record and reuse the state information, so that you can reach a steady state once and then continue with many measurement tests as if they were a single, very long test with different measurement phases.
Please read the test recycling page for an alternative to the Persistent Working Set feature described below.
Imagine a situation where a device under test has reached a steady state but then crashed and lost most of the state information. Polygraph does not know about the loss. If instructed to save and continue using its internal representation of the state, it will do so. There is an implicit assumption that if you tell Polygraph to use its old state, then the device under test is in sync with that state. You are responsible for making sure that assumption is correct.
Use this feature for preliminary results only. Always verify estimates obtained using Persistent Working Set with a clean, from-scratch test.
Storing working set

Use the --store_working_set command line option to specify the name of a file where Polygraph should write its working set information. The set is process-specific:

clt1> polyclt ... --store_working_set clt1.pws
clt2> polyclt ... --store_working_set clt2.pws
...
srv1> polysrv ... --store_working_set srv1.pws
srv2> polysrv ... --store_working_set srv2.pws
...
The working set is stored at the very end of the test. A Polygraph process must quit nicely in order to store the working set. If you have to terminate a process prematurely, send it an INT signal (the same as stopping a foreground process by pressing Control-C).
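To illustrate why an INT signal (rather than, say, KILL) allows state to be saved, here is a toy script of our own, not Polygraph itself: it traps INT and saves its "state" before exiting, the way a Polygraph process stores its working set when terminated nicely. The file name and messages are made up.

```shell
#!/bin/sh
# Write a toy long-running "server" that saves state on INT, then quits.
cat > /tmp/pws_demo.sh <<'EOF'
trap 'echo "state saved"; exit 0' INT   # save state, then quit nicely
( sleep 1; kill -INT $$ ) &             # helper stands in for Control-C
while :; do sleep 1; done               # pretend to do work
EOF

out=$(sh /tmp/pws_demo.sh)
echo "$out"                             # the "working set" survived
```

A SIGKILL, by contrast, gives the process no chance to run such a handler, and the state would be lost.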
Polygraph informs you that the working set has been stored and assigned a unique ID and a version number. The version starts at 1 (one) and is incremented every time you store a set with the same ID (you have to load the set to re-store it, see below). This numbering scheme lets you double-check that Polygraph is using the right working set info:

130.12| fyi: working set stored in srv1.pws; id: 074d90e5.3d44411e:00000002, version: 3
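If you automate your tests, the id and version can be pulled out of such a console line with standard tools. The line below is copied from the example above; the sed expressions are our own:

```shell
#!/bin/sh
# Extract the working-set id and version from a Polygraph console line.
line='130.12| fyi: working set stored in srv1.pws; id: 074d90e5.3d44411e:00000002, version: 3'
id=$(printf '%s\n' "$line" | sed 's/.*id: \([^,]*\),.*/\1/')
version=$(printf '%s\n' "$line" | sed 's/.*version: \([0-9]*\).*/\1/')
echo "$id"        # the set's unique ID
echo "$version"   # how many times this set has been stored
```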
A working set snapshot may reach several megabytes in size.
Loading working set

Use the --load_working_set command line option to specify the name of a file from which Polygraph should load its working set information. The set must first be stored using the --store_working_set option described above:

clt1> polyclt ... --load_working_set clt1.pws
clt2> polyclt ... --load_working_set clt2.pws
...
srv1> polysrv ... --load_working_set srv1.pws
srv2> polysrv ... --load_working_set srv2.pws
...
The working set is loaded at the very beginning of the test, before any requests are submitted. Random seeds are restored based on the information in the loaded working set.
Polygraph informs you that the working set has been loaded and displays the set ID and the version number:

000.02| fyi: working set loaded from srv1.pws; id: 074d90e5.3d44411e:00000002, version: 3
Load, update, store, ...

Once the working set is loaded, Polygraph proceeds as if no working set was specified. That is, it will update the working set according to the workload settings and can even freeze its size. Thus, at the end of each test, the working set will usually differ from the one that was loaded. In most cases, you will want to store that updated working set for the next test to use:

# the very first test to reach steady state
clt1> polyclt ... --store_working_set clt1-0.pws
# second test, continue using the previously stored state (0)
clt1> polyclt ... --load_working_set clt1-0.pws --store_working_set clt1-1.pws
# third test, continue using the previously stored state (1)
clt1> polyclt ... --load_working_set clt1-1.pws --store_working_set clt1-2.pws
# and so on
clt1> polyclt ... --load_working_set clt1-2.pws --store_working_set clt1-3.pws
Since set loading and storing are done at separate times, the same file name may be used, simplifying name management quite a bit:

clt1> polyclt ... --store_working_set clt1.pws
clt1> polyclt ... --load_working_set clt1.pws --store_working_set clt1.pws
clt1> polyclt ... --load_working_set clt1.pws --store_working_set clt1.pws
clt1> polyclt ... --load_working_set clt1.pws --store_working_set clt1.pws
Once again, Polygraph cannot check that the state it is loading is in sync with reality. Keeping things in sync is your responsibility.
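One way to catch an accidental mismatch is to compare the set id reported at store time in one test's console log with the id reported at load time in the next test's log. The helper below is our own sketch, not part of Polygraph; it assumes the console log format shown above, and the two printf lines fabricate sample logs in place of real ones:

```shell
#!/bin/sh
# Hypothetical sanity check: did test 2 load the set that test 1 stored?
printf '130.12| fyi: working set stored in clt1.pws; id: 074d90e5.3d44411e:00000002, version: 3\n' > test1.log
printf '000.02| fyi: working set loaded from clt1.pws; id: 074d90e5.3d44411e:00000002, version: 3\n' > test2.log

stored=$(grep -o 'id: [^,]*' test1.log | head -n 1)
loaded=$(grep -o 'id: [^,]*' test2.log | head -n 1)
if [ "$stored" = "$loaded" ]; then result="ids match"; else result="ID MISMATCH"; fi
echo "$result"
```

Note that matching ids confirm only that Polygraph's state is consistent between the two tests, not that the device under test is still in sync with it.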
In-between-tests changes to workload

The Persistent Working Set feature has been designed and implemented to minimize dependencies between the stored working set and workload specs. You should be able to change many workload parameters without breaking working set persistence. Parameters that should not be changed (or should be changed only with care) include:
- Parameters affecting URL string generation, such as visible server names or content types used by servers: the working set contains object identifiers (oids), not complete URLs. URLs are generated from oids based on relevant workload settings.
- Parameters affecting the number of robot agents: the working set consists of per-agent slices. If you increase the number of agents, some agents will not get any slices from the stored working set (and will start as if no working set was loaded for them). If you decrease the number of agents, some old (private) URLs will never be requested. Note that in many workloads, changing the peak request rate will change the number of agents.
The above list is probably not comprehensive.
Frozen working set size

Polygraph can be instructed to freeze the working set size (not to be confused with the set content) after a user-specified number of cachable objects have been requested (for details, see the working_set_length PGL call). When the set is stored or loaded, information about frozen slices (parts of the set) is stored or loaded too. Thus, if you store a working set with a frozen size, you get a frozen size when you load the set.
You can freeze the set size after the set has been loaded. Recall that Polygraph "forgets" that it loaded something immediately after the loading is complete. The following technique can be useful:
- test 1: reach steady state
- test 1: freeze working set size
- test 1: store working set
- test 2: load working set
- test 2: freeze working set (again)
- test 2: proceed with the measurement phase
- test 2: store working set
Freezing the working set after loading it ensures that any set slices whose sizes were not frozen when the working set was stored are frozen with zero size before the test continues. This guarantees that the working set does not grow in size after it is stored and loaded, which may or may not be what you want.
Does it really work?

The Persistent Working Set feature is easy to use, but it may be tricky to test. Here is how we tested the implementation.
We used the Squid caching proxy for our tests. Squid has a feature called no_cache that can be used to prevent Squid from caching new responses while still serving previously cached documents. Using no_cache, it is possible to verify that a previously stored working set is reused by monitoring the hit ratio reported by the Polygraph client-side process. Here are step-by-step instructions and an archive with configuration files.
- Set up Squid to cache everything (the default; see squid-1.conf for the exact configuration we used).
- Set up Polygraph to use 100% cachable content, repeat no URLs, and freeze the working set size at 5000 objects (test_pws-1.pg).
- Run the first test and save its working set (test_pws-1.sh). You should see a 0% hit ratio because Polygraph is not repeating any URLs.
- Set up Squid to cache nothing (squid-2.conf).
- Set up Polygraph to produce no new URLs (test_pws-2.pg).
- Run the second test, using the previously saved working set (test_pws-2.sh). You should see sustained 80-100% hit ratio levels. Squid may not have cached, or may have purged, a few URLs during the first test, which could prevent it from reaching a 100% hit ratio. You can compare Squid access logs to verify that no new URLs were introduced during the second test. We had to do the comparison (diff-logs.sh) because our test yielded only about 90% hit ratio.
You should be able to repeat the last step (the second test) many times with nearly identical results.
If you do not load the saved working set at the last step, you should see a 0% hit ratio, as Squid is instructed not to cache new objects.
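The access log comparison mentioned in the steps above can be sketched as follows. This is our own rough stand-in for the diff-logs.sh script (which is not reproduced here); it assumes Squid's native access.log format, where the URL is the seventh whitespace-separated field, and the sample log lines are made up:

```shell
#!/bin/sh
# List URLs requested during the second test that never appeared in the
# first one. The printf lines fabricate tiny stand-in access logs; in a
# real comparison you would use the logs Squid wrote during each test.
printf '1 2 3 4 5 GET http://s1/obj1\n1 2 3 4 5 GET http://s1/obj2\n' > access-1.log
printf '1 2 3 4 5 GET http://s1/obj1\n1 2 3 4 5 GET http://s1/obj3\n' > access-2.log

awk '{print $7}' access-1.log | sort -u > urls-1.txt
awk '{print $7}' access-2.log | sort -u > urls-2.txt
new=$(comm -13 urls-1.txt urls-2.txt)   # URLs seen only in the second test
echo "$new"
```

An empty result means the second test introduced no new URLs, i.e., the stored working set was fully reused.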
Please note that no_cache semantics in Squid may have changed since this page was written. We used Squid 2.4.DEVEL4. There were plans to prevent Squid from serving hits matching the no_cache ACL. Consult your Squid documentation.