Calls

PGL supports the following function and procedure calls.

int clientHostCount (Bench bench)

The clientHostCount() function calculates the number of client-side hosts (usually physical machines/PCs) required to support the peak request rate, given the bench configuration.

If the bench has the client_side.hosts field set, the clientHostCount() function simply returns the number of addresses in that array.
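
A minimal sketch of the second case, assuming TheBench comes from an included benches.pg file as in the other examples in this section. The variable name hostCount is made up for illustration:

#include "benches.pg"
TheBench.client_side.hosts = [ '10.13.0.1-15' ];
// hosts is set, so the result is the number of addresses in that array
int hostCount = clientHostCount(TheBench);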

int count (array arr)

The count function returns the number of items in the array arr.

TheBench.client_side.hosts = [ '10.13.0.1-15' ];
// set peak request rate to the maximum
TheBench.peak_req_rate = 
    count(TheBench.client_side.hosts) * TheBench.client_side.max_host_load;

DynamicName dynamicName (addr domain, float prob)

The dynamicName function creates DynamicName objects. It takes two parameters. The first parameter is an address mask of the form '*.example.com:9090'. Given such a mask, Robots will generate domain names using the 'wNNNNNN.example.com' pattern. The format of generated names may change without notice. If the address mask does not start with '*.', that prefix is added automatically.

The second parameter is the probability of a Robot using a new domain name when generating a miss request. If set to 1.0 or 100%, the robots will use a new domain name for each miss request.

Dynamic domain names are documented elsewhere.
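
As an illustration only, a call might look like the sketch below. The variable name and the 50% probability are arbitrary choices, not recommended values:

// a Robot would use a new domain name for a miss request
// with 50% probability
DynamicName name = dynamicName('*.example.com:9090', 50%);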

DynamicName [ ] dynamize (addr [ ] domains, float prob)

The dynamize function converts an array of static domain names into dynamic ones, using the supplied renewal percentage, much as the dynamicName() function does for a single domain name.

The dynamize() function has been available since Polygraph v4.1.0. Dynamic domain names are documented elsewhere.
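
A hedged sketch of a dynamize() call follows. The domain names, the 25% renewal percentage, and the variable names are hypothetical, and the DynamicName[] declaration assumes array types are written the same way as addr[] elsewhere in this section:

addr[] domains = [ 'a.example.com:9090', 'b.example.com:9090' ];
// every static name in domains gets a dynamic counterpart
DynamicName[] names = dynamize(domains, 25%);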

float float (sometype obj)

The float function attempts to convert the value of obj into a floating point number. The type of obj can be int or any other type that can be converted to float.

int int (sometype obj)

The int function attempts to convert the value of obj into an integer number. The type of obj can be float or any other type that can be converted to int.

int n2 = int(3.5/2);  // OK; n2 is equal to 1
int n1 = 3.5/2;       // Error: type mismatch

addr [ ] ipsToNames (addr [ ] ips, string template_or_zone)

The ipsToNames function takes an array of N IP addresses and returns an array of N host names. If the second parameter contains no ${macro} substitutions, it is interpreted as a constant zone suffix and an internal IP-to-name algorithm kicks in (see further below for details). Otherwise, the second parameter is interpreted as a name template which is applied to every IP address to generate a host name.

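Two macros are supported in the name generation template: ${dashed_ip}, which expands to the IP address with its dots replaced by dashes, and ${port}, which expands to the port number of the address.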

For example:

// if the first address is '10.0.0.1:8080', then
// the first domain will be 'h10-0-0-1.perf.tst:8080'
addr[] domains = ipsToNames(addresses, "h${dashed_ip}.perf.tst:${port}");

Note that if the template starts with a digit, Polygraph treats the resulting name as an IP address rather than a domain name. This behavior is likely to change.

Support for name templates has been available since Polygraph v4.1.0.

When the second ipsToNames parameter contains no macros, it is interpreted as a domain name suffix. This usage will be deprecated in favor of templates. IP-to-name conversion uses a simple, hard-coded 1:1 mapping. The algorithm generates domain names of the same length, regardless of the input IP addresses.

The second parameter in this case is appended as a constant string to each generated domain name, effectively placing all names in the same DNS "zone".

AddrMap map = {
    addresses = serverAddrs(asPolyMix4, TheBench);
    names = ipsToNames(addresses, "bench.tst");
};

A dns_cfg tool can be used to generate BIND-style configuration files for your domain name server based on a PGL workload file. More information on DNS-related configuration is available elsewhere.

addr [ ] tracedHosts (string trace)

The tracedHosts function loads a URL trace from the named file and extracts unique host names from traced URLs. The resulting array of addresses (in unspecified order) is returned. Port information is not extracted.

This function was created to simplify DNS configuration for trace replay:

AddrMap map = {
    zone = ".";
    addresses = serverAddrs(asPolyMix4, TheBench);
    names = tracedHosts("/tmp/test.urls");
};

The dns_cfg tool can be used to generate BIND-style configuration files for your domain name server based on a PGL workload file. More information on DNS-related configuration is available elsewhere.

sometype max (sometype obj, ...)

The max function converts its arguments to floating point numbers, finds the maximum, and returns a clone of the corresponding argument.

int n1 = max(1, 3, 2);       // OK; n1 is equal to 3
int n2 = max(2.5, 3.5);      // Error: type mismatch
int n3 = int(max(2.5, 3.5)); // OK; n3 is equal to 3

sometype min (sometype obj, ...)

The min function converts its arguments to floating point numbers, finds the minimum, and returns a clone of the corresponding argument. See the max() function for an example.
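
For convenience, here is how the max() example might carry over to min(). This sketch assumes min() follows the same typing rules as max() and that int() truncates, as the int() example suggests:

int n1 = min(1, 3, 2);        // OK; n1 is equal to 1
int n2 = min(2.5, 3.5);       // Error: type mismatch
int n3 = int(min(2.5, 3.5));  // OK; n3 is equal to 2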

note_substitutes (addr [ ] substitutes, ...)

The note_substitutes() call tells Polygraph that the specified addresses are semantically equal. For example, a substitutes array may include the IP addresses of simulated origin servers that are identically configured (except for their location on the network) and generate the same content.

Specific interpretation or use of substitutes may be affected by other PGL objects. See Robot's minimize_new_conn field for an example.

Note that the arguments of note_substitutes() are arrays, not individual addresses! Each argument is a substitute group, and groups are not merged.
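
The sketch below illustrates passing two groups in one call. The group names and addresses are made up for illustration:

// each array is one substitute group
addr[] siteA = [ '10.0.1.1:80', '10.0.1.2:80' ];
addr[] siteB = [ '10.0.2.1:80', '10.0.2.2:80' ];
// siteA and siteB stay separate groups; they are not merged
note_substitutes(siteA, siteB);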

proxyAddrs (AddrScheme addr_scheme, Bench bench)

The proxyAddrs() function applies a given addressing scheme to the specified bench configuration and returns a list of IP addresses for simulated proxies to use. See robotAddrs() for details and examples.

robotAddrs (AddrScheme addr_scheme, Bench bench)

The robotAddrs() function applies a given addressing scheme to the specified bench configuration and returns a list of IP addresses for robots to use. The resulting addresses satisfy the address allocation rules for the given addressing scheme and bench configuration.

The addressing scheme usually relies on global bench settings (e.g., peak_req_rate) and on the client side of the bench configuration. If the bench configuration is incomplete or inconsistent, the PGL interpreter will quit with an error.

#include "benches.pg"
Robot R = {
    hosts = robotAddrs(asPolyMix4, TheBench);
};

schedule (Phase phase, ...)

The schedule() procedure is to phases what use() is to agents. It iterates through the argument list and appends each phase to a global "schedule". Polygraph will follow that schedule and will stop execution when all phases are completed (subject to other conditions such as idle timeout).

#include "phases.pg"  // import some useful phases
Phase phMeas = { ... };
schedule(phWait, phMeas, phCool); // build a schedule

If you call schedule() with an array instead of an individual phase, the array gets interpolated as if all its items were explicitly listed as actual parameters.

The schedule() procedure can be called more than once. Each new call appends to the existing global schedule.
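
A rough sketch of both behaviors, assuming phWait, phMeas, and phCool are defined as in the example above. The Phase[] declaration is an assumption modeled on the addr[] syntax used elsewhere in this section, and mainPhases is a hypothetical name:

Phase[] mainPhases = [ phWait, phMeas ];
schedule(mainPhases);  // same as schedule(phWait, phMeas)
schedule(phCool);      // appended to the existing schedule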

serverAddrs (AddrScheme addr_scheme, Bench bench)

The serverAddrs() function applies a given addressing scheme to the specified bench configuration and returns a list of IP addresses for simulated servers to use. See robotAddrs() for details and examples.
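
By analogy with the robotAddrs() example, a server-side sketch might look as follows. It assumes that Server objects accept a hosts field, as the use() example in this section suggests, and that TheBench comes from benches.pg:

#include "benches.pg"
Server S = {
    hosts = serverAddrs(asPolyMix4, TheBench);
};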

system (string command)

The system() function accepts a single PGL string as an argument, treats that argument as a shell command, executes the command, and returns the standard output of the command as a PGL string value:

Phase phA;
phA.script = {
    every sampleS do {
        string health = system("cat /tmp/health.txt");
        if (float(health) < SomeThreshold) then {
            changePopulusFactorBy(-10%);
        }
    }
};

Outside Phase scripts, the system() function is equivalent to a back quoted `command` and is executed only once at startup. Inside Phase scripts, the system() function is executed every time the script is run, unlike back quotes which are executed only once at startup even in a Phase script context. See Run-time load adjustments for more details about Phase scripts.
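
A small sketch of the equivalence outside Phase scripts. The hostname command and the variable names are arbitrary illustrations:

// outside a Phase script, both forms run once at startup
string hostA = system("hostname");
string hostB = `hostname`;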

The system() function is available starting with Web Polygraph version 4.4.0.

sometype undef ()

The undef() function returns an "undefined" value that is compatible with any type. The only useful application of this function is to reset the value of some field to its "default".

// import useful objects, including myRobot
#include "my_objects.pg" 
// tell Polygraph to use default value for nagle option
// when myRobot is launched
myRobot.socket.nagle = undef();

Note that the value of myRobot.socket.nagle in the example above becomes "undefined" and not some default value (true or false). The default is substituted after the configuration file is interpreted. That is why the function is called undef() and not default().

myServer.accept_lmt = -1;
myServer.accept_lmt = undef();
        
// this is an error because myServer.accept_lmt is undefined:
otherServer.accept_lmt = myServer.accept_lmt + 10;

uniq_id uniqId ()

The uniqId function returns a unique identifier. A new unique identifier is returned with each call.
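
A minimal illustration, with hypothetical variable names:

uniq_id idA = uniqId();
uniq_id idB = uniqId(); // a new identifier, different from idA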

use (Agent agent, ...)
use (AddrMap map, ...)
use (Bench bench, ...)

The use() procedure iterates through the argument list, creates a copy of each argument, and places that copy on the global "use us" list. When interpreting a PGL configuration, Polygraph only cares about the objects on that list; other objects are ignored.

Server S1 = { .... };
Server S2 = { .... };
Robot R = { ... };
Bench bench = { ... };

S2.hosts = S1.hosts;
use(S2, R);
use(bench); 
// Server S1 will be ignored, but S2 will have S1's hosts

If you call use() with an array instead of an individual object, the array gets interpolated as if all its items were explicitly listed as actual parameters.

The use() procedure allows you to create a library of generally useful objects but use only some subset of those objects in a particular workload.

Because use() stores copies of its arguments, any modification made to those arguments after the call goes unnoticed. To avoid confusion, always call use() last and create copies of the objects (under different names) if you want to change them after the use() call.
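
One way to follow that advice is sketched below. It assumes that initializing one object from another produces an independent copy, and busyR is a hypothetical name:

Robot R = { ... };
Robot busyR = R;   // a copy under a different name
busyR.hosts = robotAddrs(asPolyMix4, TheBench);
use(R, busyR);     // call use() last, after all modifications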

The use() procedure can be called more than once.

working_set_length (time length)

Use the working_set_length() call to limit the size of the URL space (a.k.a. the "working set" size). Polygraph constantly introduces new objects into the working set to simulate cachable misses. By default, the working set will grow indefinitely. In other words, more and more URLs will have a non-zero probability of being re-visited. This unlimited growth may eventually decrease the memory hit ratio and even the disk hit ratio (both due to finite cache sizes).

The working_set_length() call essentially specifies how long a cache needs to store the fill stream to achieve an optimal hit ratio throughout the experiment. If a cache cannot hold all cachable traffic generated during working_set_length time, then it is almost guaranteed to have a sub-optimal hit ratio later, because some of the re-visited objects will not be in the cache.

time len = 4hour;
working_set_length(len);

Specifying a working_set_length larger than the experiment duration has no effect: the set will then grow during the entire run. One can think of the first working_set_length period of the experiment as a "warm-up" phase. The experiment enters its "steady state" only after the working set size is frozen.

It is often convenient and desirable to limit the size of the working set. We often specify that limit in terms of time (rather than number of objects or their total size) because that makes the limit independent of cache capacity and request rate. In other words, when two very different caches are subjected to the same time-based limit, the workload is "fair".

working_set_cap (int capacity)

The working_set_cap() procedure call is the same as working_set_length(), except that the working set limit is specified as a number of objects rather than as time.

Use working_set_cap() when the working set size does not depend on the request rate, for example, when creating an environment similar to a given origin server in a steady state.
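
A minimal sketch of such a call. The limit of 500000 objects is an arbitrary illustration, not a recommended value:

int cap = 500000;
working_set_cap(cap);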