Loadable modules

This page documents the loadable module interface for Web Polygraph. This functionality is supported starting with Polygraph version 2.8.0.

Table of Contents

1. Introduction
2. Availability
3. Enabling a module
4. Writing a module
    4.1 Session watchdog modules
    4.2 Header filter modules
    4.3 Non C++ modules
5. Compiling and linking
    5.1 Linking polyclt
    5.2 Compiling a module

1. Introduction

Loadable modules are pieces of object code that can be injected into a running program. The program source code is not modified and does not need to be recompiled. Once the module is loaded, it can manipulate host program data and call program functions. Usually, the host program delegates some optional processing to loadable modules so that users can modify and enhance the functionality of the program without modifying or even understanding much of the original source code.

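Polygraph uses loadable modules for two primary purposes: watching user sessions (session watchdog modules) and modifying HTTP request headers (header filter modules).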

The original motivation for adding module interfaces was to let Polygraph users support proprietary and closed authentication protocols, such as Microsoft NTLM, without exposing the authentication source code or modifying Polygraph. Other uses are certainly possible, and the Polygraph development team would be happy to talk to those who need to extend the interface.

Apart from the example sources referenced on this page, the Polygraph distribution does not include ready-to-use modules. If you need to fiddle with HTTP headers emitted by Polygraph, you have to write your own module to do that.

2. Availability

Loadable modules are supported starting with Polygraph version 2.8. At the time of writing, that version has not been released.

Loadable modules may work on any platform that supports the dlopen(3) family of function calls, which provide an interface to the dynamic linker. FreeBSD, Linux, and many other OSes support dlopen(3). Similar functionality is available on Microsoft Windows, but it is not supported in Polygraph (yet?).
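As a rough illustration of what the dlopen(3) interface looks like from a host program, here is a minimal sketch. The loadModule() helper and the flags it passes are assumptions made for this example and are not taken from Polygraph sources. Static initializers of a C++ shared object run before dlopen() returns, which is what makes module self-registration possible.

```cpp
#include <dlfcn.h>
#include <string>

// Hypothetical helper (not Polygraph code): tries to load a shared object.
// Returns the dlopen() handle, or a null pointer with the reason in `error`.
void *loadModule(const std::string &path, std::string &error) {
    dlerror(); // clear any stale error state

    // RTLD_NOW: resolve all symbols immediately so problems surface at load time
    // RTLD_GLOBAL: make module symbols visible to objects loaded later
    void *handle = dlopen(path.c_str(), RTLD_NOW | RTLD_GLOBAL);
    if (!handle) {
        const char *reason = dlerror();
        error = reason ? reason : "unknown dlopen() error";
    }
    return handle;
}
```

On systems where the dlopen(3) functions live in a separate library, the program also needs to be linked with -ldl.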

Polygraph's ./configure script will warn you if it is unable to detect the libraries and header files necessary to support dynamically loadable modules.

3. Enabling a module

If you already have a module, it is easy to add it to polyclt using the --loadable_modules option:

polyclt ... --loadable_modules /tmp/myfilter.so ...
...
000.01| loading dynamic module: /tmp/myfilter.so
000.01| dynamic modules loaded: 1
...
000.02| registered client-side data filters: 1
            MyFilter-0.1:      adds HTTP Foo-Bar header

The polyclt output above shows that Polygraph first loads the specified module (using /tmp/myfilter.so, a file with the module code) and then reports all registered client-side filters. Given the above output, we can assume that /tmp/myfilter.so contained a filter called "MyFilter" that adds a "Foo-Bar" header to polyclt HTTP headers.

It is best to use absolute filenames with --loadable_modules because the dynamic linker often does not search the current directory and because the actual current directory may differ from the one you expect when polyclt is started from, say, a shell script.

4. Writing a module

To write a friendly module, you need to know the interface that Polygraph uses to communicate with loaded modules. There are two such interfaces: a session watchdog interface and a header filter interface. We describe both below. The descriptions assume that the modules are written in C++. We end this section with an explanation of how to add a non-C++ module or code to Web Polygraph.

4.1 Session watchdog modules

All watchdog modules must inherit from the SessionWatch class declared in runtime/SessionWatch.h. They must implement all pure virtual methods of that class.

When polyclt loads a session watchdog module, the module initialization proceeds as if the module were a part of a program being started. As a part of the initialization process, the module should register itself by calling the TheSessionWatchRegistry().add() method declared in client/SessionWatchRegistry.h. The only parameter to the method is a pointer to the watchdog object being registered. See the example below for how this auto-registration can be implemented in C++.

When polyclt is ready to start a new user session (and before the first request is submitted within that session), polyclt notifies every registered session watchdog by calling its noteStart(client) method. Polygraph will also call session watchdogs every heartbeat interval and when the session ends. PGL Session type is documented elsewhere.

The watchdogs are called one-by-one, in registration order. There is no way to terminate the calling process prematurely. A watchdog may do nothing or may, for example, form and send a RADIUS packet on the network. Once all watchdogs are called, polyclt proceeds as usual.
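The registration and notification sequence described above can be sketched without any Polygraph headers. Every name in this sketch (DemoClient, DemoWatch, DemoWatchRegistry, DemoAnnouncer, TheDemoLog) is a hypothetical stand-in invented for illustration; Polygraph's real classes differ. The sketch only shows the mechanics: static initializers register the watchdogs at load time, and the registry later calls all of them in registration order with no early exit.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins; Polygraph's real Client and SessionWatch differ.
struct DemoClient { std::string host; };

class DemoWatch {
    public:
        virtual ~DemoWatch() {}
        virtual std::string id() const = 0;
        virtual void noteStart(const DemoClient &client) = 0;
};

class DemoWatchRegistry {
    public:
        // returns true so that the call can initialize a static flag
        bool add(DemoWatch *watch) { theWatches.push_back(watch); return true; }

        // calls every watchdog, in registration order, with no early exit
        void noteStart(const DemoClient &client) const {
            for (DemoWatch *watch : theWatches)
                watch->noteStart(client);
        }

        std::size_t count() const { return theWatches.size(); }

    private:
        std::vector<DemoWatch*> theWatches;
};

// a function-local static is ready before any static initializer uses it
DemoWatchRegistry &TheDemoWatchRegistry() {
    static DemoWatchRegistry registry;
    return registry;
}

// collects announcements so that the call order can be inspected
std::vector<std::string> &TheDemoLog() {
    static std::vector<std::string> log;
    return log;
}

class DemoAnnouncer: public DemoWatch {
    public:
        explicit DemoAnnouncer(const std::string &name): theName(name) {}
        virtual std::string id() const { return theName; }
        virtual void noteStart(const DemoClient &client) {
            TheDemoLog().push_back(theName + " saw " + client.host);
        }

    private:
        std::string theName;
};

// these run during static initialization, that is, when the program
// (or a shared object containing this code) is loaded
static bool registeredFirst =
    TheDemoWatchRegistry().add(new DemoAnnouncer("first"));
static bool registeredSecond =
    TheDemoWatchRegistry().add(new DemoAnnouncer("second"));
```

Calling TheDemoWatchRegistry().noteStart() with a DemoClient then reaches the "first" announcer before the "second" one, because static objects within one source file are initialized in definition order.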

Polygraph supplies a pointer to the Client class that represents the HTTP user or robot initiating the session. A watchdog may obtain details about the session using public Client methods. For example, robot IP address and configured credentials are accessible. The watchdog may also access Polygraph's global structures and functions, just like any other piece of Polygraph code.

Below is an example of a simple session watchdog that announces session states on the console along with robot IP address and credentials. See the src/client/SessionAnnouncer.cc file in the Polygraph distribution for a copy of this code.

#include "base/polygraph.h"

#include "runtime/LogComment.h"
#include "client/SessionWatchRegistry.h"
#include "client/Client.h"


class SessionAnnouncer: public SessionWatch<Client> {
    public:
        virtual String id() const;
        virtual void describe(ostream &os) const;

        virtual void noteStart(const Client *client);
        virtual void noteHeartbeat(const Client *client);
        virtual void noteEnd(const Client *client);

    protected:
        void announce(const Client *client, const String &state) const;
};


static bool registered = registered ||
    TheSessionWatchRegistry().add(new SessionAnnouncer);


String SessionAnnouncer::id() const {
    return "SessionAnnouncer-1.0";
}

void SessionAnnouncer::describe(ostream &os) const {
    os << "announces client-side sessions on the console";
}

void SessionAnnouncer::noteStart(const Client *client) {
    announce(client, "starts session");
    Comment(5) << "robot credentials: " << client->credentials() << endc;
}

void SessionAnnouncer::noteHeartbeat(const Client *client) {
    announce(client, "continues session");
}

void SessionAnnouncer::noteEnd(const Client *client) {
    announce(client, "ends session");
}

void SessionAnnouncer::announce(const Client *client, const String &action) const {
    Comment(5) << "robot " << client->id() << " @ " << client->host() <<
        ": " << action << endc;
}

4.2 Header filter modules

All loadable filters must inherit from the CltDataFilterRegistry::Filter class declared in client/CltDataFilterRegistry.h. They must implement all pure virtual methods of that class.

When polyclt loads a filter module, the module initialization proceeds as if the module were a part of a program being started. As a part of the initialization process, the module should register itself by calling the TheCltDataFilterRegistry().add() method declared in client/CltDataFilterRegistry.h. The only parameter to the method is a pointer to the filter object being registered. See the examples below for how this auto-registration can be implemented in C++.

When polyclt finishes building HTTP request headers (including the request line but excluding the terminating CRLF), it passes the buffer with the headers through all registered filters using their apply() method. The filters are applied one-by-one, in registration order. There is no way to terminate the filtering process prematurely. A filter may modify the buffer in any way or may choose to do nothing, depending on the headers being passed. Once all filters are applied, polyclt appends the terminating CRLF and eventually sends the headers to the wire.
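The filtering sequence above can be sketched with plain strings. DemoFilter, AddFooBar, TagGets, and finishHeaders() are hypothetical names invented for this illustration; the real interface works on an IOBuf and is not reproduced here. The sketch shows one filter that always appends a header and one that acts only on some requests.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a header filter; the real interface uses IOBuf.
class DemoFilter {
    public:
        virtual ~DemoFilter() {}
        virtual void apply(std::string &headers) = 0;
};

// unconditionally appends a fixed header
class AddFooBar: public DemoFilter {
    public:
        virtual void apply(std::string &headers) {
            headers += "Foo-Bar: XYZ\r\n";
        }
};

// does nothing unless the request line starts with GET
class TagGets: public DemoFilter {
    public:
        virtual void apply(std::string &headers) {
            if (headers.compare(0, 4, "GET ") == 0)
                headers += "X-Demo-Method: GET\r\n";
        }
};

// mimics the described sequence: every filter sees the headers, in
// registration order, and only then is the terminating CRLF appended
std::string finishHeaders(std::string headers,
    const std::vector<DemoFilter*> &filters) {
    for (DemoFilter *filter : filters)
        filter->apply(headers);
    headers += "\r\n";
    return headers;
}
```

Because the terminating CRLF is added after the last filter runs, a filter in this scheme only needs to append complete header lines and never has to search for the end of the header block.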

Polygraph passes each filter the CltXact object that represents the current HTTP transaction building the headers (the Producer argument of apply() in the examples below). A filter may obtain details about the transaction using CltXact methods. The filter may also access Polygraph's global structures and functions, just like any other piece of Polygraph code.

Below is an example of a simple filter. A more complex example of a filter that adds HTTP Basic authentication headers is available in Polygraph distribution as src/client/HttpBasicAuthenticator.cc.

#include "base/polygraph.h"

#include <iostream>

#include "runtime/IOBuf.h"
#include "client/CltDataFilterRegistry.h"

class MyFilter: public CltDataFilterRegistry::Filter {
    public:
        virtual String id() const { return "MyFilter-0.1"; }
        virtual void describe(ostream &os) const;

        virtual void apply(CltDataFilterRegistry::Producer &p, IOBuf &buf);
};

static bool registered = registered ||
    TheCltDataFilterRegistry().add(new MyFilter);


void MyFilter::describe(ostream &os) const {
    os << "adds HTTP Foo-Bar header";
}

void MyFilter::apply(CltDataFilterRegistry::Producer &, IOBuf &buf) {
    static const String header = "Foo-Bar: XYZ\r\n";
    buf.append(header.data(), header.len());
}

Also see an NTLM authentication filter below.

4.3 Non C++ modules

If you already have code written in C or any programming language other than C++, you can still use the module interface as long as you can call your routines from C++. Here is how one might wrap existing C code that implements NTLM authentication via a C function called AddNtlmAuthHeaders().

#include "base/polygraph.h"

#include <iostream>

#include "runtime/LogComment.h"
#include "runtime/IOBuf.h"
#include "client/CltDataFilterRegistry.h"
#include "client/UserCred.h"
#include "client/CltXact.h"

// this is the prototype of the C routine we will be calling;
// normally, it would be in some .h header we would include above,
// surrounded by extern "C" {}
extern "C" int AddNtlmAuthHeaders(
    const char *credentials, const char *domain,
    const char *hdrsRep, int hdrsRepSize,
    char *hdrsReq, int *hdrsReqSize);

// a C++ wrapper implementing Polygraph's filter interface
class NtmlFilter: public CltDataFilterRegistry::Filter {
    public:
        virtual String id() const { return "NtmlFilter-3.6d"; }
        virtual void describe(ostream &os) const;

        virtual void apply(CltDataFilterRegistry::Producer &p, IOBuf &buf);
};

static bool registered = registered ||
    TheCltDataFilterRegistry().add(new NtmlFilter);


void NtmlFilter::describe(ostream &os) const {
    os << "adds NTLM authentication headers via C routine";
}

void NtmlFilter::apply(CltDataFilterRegistry::Producer &p, IOBuf &buf) {
    // append authentication headers only if user credentials are
    // set; the latter implies that we received '407 Auth Required'
    const UserCred &credentials = p.credentials(); // p is a reference
    if (!credentials.image())
        return;

    // must have NTLM challenge info in 407 response that caused this xaction
    const CltXact *cause = p.cause();
    if (!cause)
        return; // XXX: report error here

    // Polygraph should have saved response header for us
    const IOBuf &repHeader = cause->savedRepHeader();
    if (repHeader.contSize() <= 0)
        return; // XXX: report error here

    int exchangeSize = buf.spaceSize();
    const int err = AddNtlmAuthHeaders(
        credentials.image().cstr(), "hardcoded-domain",
        repHeader.content(), repHeader.contSize(),
        buf.space(), &exchangeSize); // in: space size; out: content size

    if (err) {
        Comment << "error: AddNtlmAuthHeaders() error code: " << err << endc;
        return;
    }

    // let I/O buffer know that some content got appended
    if (Should(0 <= exchangeSize && exchangeSize <= buf.spaceSize()))
        buf.appended(exchangeSize);
}

Similarly, modules in any programming language can be used as long as you can call your routines from C++.
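The buffer hand-off used in the wrapper above (the caller passes the available space in, and the C routine reports the bytes written) can be tried out in isolation. In this sketch, DemoAddHeader() is a hypothetical C routine standing in for existing non-C++ code, and appendDemoHeader() is a hypothetical C++ adapter; neither is part of Polygraph.

```cpp
#include <cstring>
#include <string>

// Hypothetical C routine standing in for existing non-C++ code.
// in: *size is the space available in out; out: *size is the bytes written.
// Returns 0 on success and a non-zero error code otherwise.
extern "C" int DemoAddHeader(const char *user, char *out, int *size) {
    static const char prefix[] = "X-Demo-User: ";
    const std::size_t prefixLen = std::strlen(prefix);
    const std::size_t userLen = std::strlen(user);
    const int needed = static_cast<int>(prefixLen + userLen + 2);
    if (needed > *size)
        return 1; // not enough space; nothing is written
    std::memcpy(out, prefix, prefixLen);
    std::memcpy(out + prefixLen, user, userLen);
    std::memcpy(out + prefixLen + userLen, "\r\n", 2);
    *size = needed;
    return 0;
}

// C++ side: offers the routine a bounded scratch area and appends only
// what the routine reports having written
bool appendDemoHeader(std::string &headers, const std::string &user,
    int maxSpace) {
    std::string space(static_cast<std::size_t>(maxSpace > 0 ? maxSpace : 0), '\0');
    int exchangeSize = static_cast<int>(space.size());
    if (DemoAddHeader(user.c_str(), &space[0], &exchangeSize) != 0)
        return false; // the C routine reported an error
    if (exchangeSize < 0 || exchangeSize > static_cast<int>(space.size()))
        return false; // do not trust an out-of-range size
    headers.append(space, 0, static_cast<std::size_t>(exchangeSize));
    return true;
}
```

Validating the returned size before using it plays the same role as the Should() check in the wrapper above: a misbehaving foreign routine then cannot make the C++ side read or append past the space it offered.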

5. Compiling and linking

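To use a loadable module, one must both link polyclt so that it exports the symbols that modules need and compile the module as a shared library.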

This section talks about each step. Unless noted otherwise, the discussion is FreeBSD-specific. If there are no instructions for your operating system below, you will need to experiment; please send us commands for environments not covered here so that others can enjoy the fruits of your labor.

5.1 Linking polyclt

For loadable modules to work, the polyclt executable must have symbol tables with all code symbols used by a module, even if some of those symbols are not used by core polyclt code. The GNU C++ compiler from the GCC suite has a special option, -rdynamic, to build complete symbol tables. Documentation for -rdynamic is scarce, but the compiler converts it into the appropriate linker option for the build platform (--export-dynamic for the GNU ELF linker). Some platforms require no special linker options because they build complete dynamic symbol tables by default. Some GNU linkers use the --whole-archive option, which keeps every object of a static library in the executable even when core polyclt code does not reference it.

Polygraph's ./configure script configures Makefiles to use -rdynamic when linking the polyclt executable, provided that you are using a GCC compiler and that the compiler appears to produce executables when this option is used (i.e., the compiler does not quit with an error when the option is supplied).

If the configuration does not do the right thing in your environment, you will need to supply the correct linking options before configuring Polygraph (use the LDFLAGS environment variable for that) or manually re-link polyclt with the right set of options. Note that custom LDFLAGS affect the linking of all Polygraph executables.

5.2 Compiling a module

The compiler options required to compile a loadable module are OS- and compiler-specific. Essentially, you need to compile the module as a shared library. For example, the following command builds a loadable module called MyModule from module object code already compiled as MyModule.o.

c++ -shared -export-dynamic -Wl,-soname,MyModule.so -o MyModule.so \
    -I. -I.. -I../.. MyModule.o

The command works on FreeBSD and Linux with GCC when run from within the src/client directory of the Polygraph source distribution. It produces the MyModule.so file to be used as a loadable module.