Abstract Spider

The core functionality of PATHspider is implemented in pathspider.base.Spider. The documentation for this class is below:

class pathspider.base.Spider

A spider consists of a configurator (which alternates between two system configurations), a large number of workers (which perform some network action under each configuration), an Observer (which derives information from passively observed traffic), and a thread that merges results from the workers with flow records from the Observer.

activate(worker_count, libtrace_uri)

The activate function performs initialisation of a PATHspider plugin.

Parameters:	worker_count (int) – The number of workers to use.
	libtrace_uri (str) – The URI to pass to the Observer to describe the interface on which packets should be captured.
See also:	`pathspider.base.ISpider.activate()`

It is expected that plugins will override this function. The override should always call the activate() function of the abstract Spider class, as this initialises all of the base functionality:

super().activate(worker_count=worker_count,
                 libtrace_uri=libtrace_uri)

The plugin's activate() function can also be used to initialise any variables that the object requires. Do not initialise variables or perform any other operations in the __init__ method: all plugins must be instantiated in order to be loaded, so work done there causes unnecessary delays when PATHspider starts.
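The following is a minimal sketch of this pattern rather than code taken from PATHspider. So that it runs without PATHspider installed, it defines a small stand-in `Spider` class with the documented two-argument signature; the plugin name `ExamplePlugin`, the `connect_timeout` attribute and the `int:eth0` URI are assumptions chosen for illustration.

```python
class Spider:
    """Stand-in for pathspider.base.Spider (assumption, for illustration only)."""

    def activate(self, worker_count, libtrace_uri):
        # The real base class would set up its queues, locks and threads here.
        self.worker_count = worker_count
        self.libtrace_uri = libtrace_uri


class ExamplePlugin(Spider):  # hypothetical plugin
    def __init__(self):
        # Deliberately empty: plugins are instantiated simply to be loaded.
        pass

    def activate(self, worker_count, libtrace_uri):
        # Initialise the base functionality first.
        super().activate(worker_count=worker_count,
                         libtrace_uri=libtrace_uri)
        # Plugin-specific state belongs here rather than in __init__.
        self.connect_timeout = 5  # hypothetical setting, in seconds


plugin = ExamplePlugin()
plugin.activate(worker_count=4, libtrace_uri="int:eth0")
```

Keeping `__init__` empty means that loading the plugin costs almost nothing; the state is only created once the plugin is actually selected and activated.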

add_job(job)

Adds a job to the job queue.

If PATHspider is currently stopping, the job will not be added to the queue.
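The behaviour described above could be sketched as follows. This is an illustration only: the class `JobQueueSketch`, its `stopping` flag and `jobqueue` attribute, and the job keys `dip` and `dp` are assumed names, not PATHspider's actual internals.

```python
import queue


class JobQueueSketch:  # hypothetical class, not part of PATHspider
    def __init__(self):
        self.jobqueue = queue.Queue()
        self.stopping = False  # assumed flag set when shutdown begins

    def add_job(self, job):
        # Refuse new work once the spider has started stopping.
        if self.stopping:
            return
        self.jobqueue.put(job)


spider = JobQueueSketch()
spider.add_job({"dip": "192.0.2.1", "dp": 80})   # accepted
spider.stopping = True
spider.add_job({"dip": "192.0.2.2", "dp": 80})   # silently dropped
```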

config_one()

Changes the global state or system configuration for the experimental measurements.

config_zero()

Changes the global state or system configuration for the baseline measurements.

configurator()

Thread which synchronizes on a set of semaphores and alternates between two system states.
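One way such an alternation could be built from semaphores is sketched below. The scheme shown (one "ready" and one "done" semaphore per configuration, and a fixed number of rounds so that the example terminates) is an assumption made for illustration; PATHspider's actual semaphore arrangement may differ. Setting `state["config"]` stands in for calling config_zero() or config_one().

```python
import threading

WORKERS = 2   # hypothetical worker count
ROUNDS = 2    # hypothetical number of A/B cycles, so the sketch terminates

# One pair of semaphores per configuration (index 0 and 1).
ready = [threading.Semaphore(0), threading.Semaphore(0)]
done = [threading.Semaphore(0), threading.Semaphore(0)]
state = {"config": None}
log = []


def configurator():
    for _round in range(ROUNDS):
        for config in (0, 1):
            state["config"] = config      # stands in for config_zero()/config_one()
            log.append(("set", config))
            for _ in range(WORKERS):      # let every worker connect once
                ready[config].release()
            for _ in range(WORKERS):      # wait until all of them have finished
                done[config].acquire()


def worker():
    for _round in range(ROUNDS):
        for config in (0, 1):
            ready[config].acquire()       # wait for this configuration
            log.append(("connect", state["config"]))
            done[config].release()        # allow the configurator to move on


threads = [threading.Thread(target=configurator)]
threads += [threading.Thread(target=worker) for _ in range(WORKERS)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Because the configurator only changes state after every worker has signalled completion, each connection in `log` is recorded under the configuration that was most recently set.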

connect(job, pcs, config)

Performs the connection.

Parameters:	job (dict) – The job record.
	pcs (dict) – The result of the pre-connection operation(s).
	config (int) – The current state of the configurator (0 or 1).
Returns:	object – Any result of the connect operation to be passed to `pathspider.base.Spider.post_connect()`.

The connect function performs the connection operation and is run for both the A and B tests. This method is not implemented in the abstract pathspider.base.Spider class and must be implemented by any plugin.

Sockets created during this operation can be returned by the function for use in the post-connection phase, to minimise the time that the configurator is blocked from moving to the next configuration.
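As an illustration, a plugin's connect logic might look something like the sketch below. It is written as a standalone function rather than a method so that it can be run on its own, and the job keys `ip` and `port`, the two-second timeout and the shape of the returned dict are all assumptions, not part of the PATHspider API.

```python
import socket


def connect(job, pcs, config):
    """Attempt a TCP connection and return the open socket for post_connect."""
    try:
        # "ip" and "port" are hypothetical job keys.
        sock = socket.create_connection((job["ip"], job["port"]), timeout=2)
        return {"sock": sock, "connected": True}
    except OSError:
        # Timeouts and refused connections both end up here.
        return {"sock": None, "connected": False}
```

Returning the socket without closing it keeps the time spent in the synchronised connect phase short; the clean-up can then happen later, outside the configurator's critical path.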

create_observer()

Create a flow observer.

This function is called by the base Spider logic to get an instance of pathspider.observer.Observer configured with the function chains that are required by the plugin.

This method is not implemented in the abstract pathspider.base.Spider class and must be implemented by any plugin.

For more information on how to use the flow observer, see Observer.

merge(flow, res)

Merge a job record with a flow record.

Parameters:	flow (dict) – The flow record.
	res (dict) – The job record.
Returns:	tuple – Final record for job.

In order to create a final record for reporting on a job, the final job record must be merged with the flow record. This function should be implemented by any plugin to provide the logic for this merge as the keys used in these records cannot be known by PATHspider in advance.

This method is not implemented in the abstract pathspider.base.Spider class and must be implemented by any plugin.
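The core of such a merge could be sketched as below. The helper name `merge_records`, the record keys and the `observed` flag are hypothetical, and for readability the sketch returns a plain dict; how a real plugin packages the final record for reporting is left aside.

```python
def merge_records(flow, res):
    """Combine a flow record and a job record into a single dict."""
    record = dict(flow)               # copy, so the caller's flow record is untouched
    record.update(res)                # job fields win if a key appears in both
    record["observed"] = bool(flow)   # hypothetical flag: was any flow data seen?
    return record


merged = merge_records({"sp": 40000, "ecn_seen": True},
                       {"sp": 40000, "connstate": True})
```

Letting the job record take precedence on key collisions is a design choice of this sketch; a plugin whose two records share keys with different meanings would need to rename them instead.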

merger()

Thread to merge results from the workers and the observer.

post_connect(job, conn, pcs, config)

Performs post-connection operations.

Parameters:	job (dict) – The job record.
	conn (object) – The result of the connection operation(s).
	pcs (dict) – The result of the pre-connection operation(s).
	config (int) – The state of the configurator during `pathspider.base.Spider.connect()`.
Returns:	dict – Result of the post-connection operation(s).

The post_connect function can be used to perform any operations that must be performed after each connection. It will be run for both the A and the B configuration, and is not synchronised with the configurator.

Plugins to PATHspider can optionally implement this function. If this function is not overridden, it will be a no-op.

Any sockets or other file handles that were opened during pathspider.base.Spider.connect() should be closed in this function if they have not been already.
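A sketch of this clean-up step is shown below, again as a standalone function so that it can be run in isolation. It assumes that `conn` is a dict carrying hypothetical `sock` and `connected` keys; the documentation only says that `conn` is whatever connect() returned.

```python
import socket


def post_connect(job, conn, pcs, config):
    """Close any socket left open by connect and report the outcome."""
    sock = conn.get("sock")  # hypothetical key
    if sock is not None:
        try:
            sock.shutdown(socket.SHUT_RDWR)
        except OSError:
            pass  # the peer has already gone away or the socket never connected
        sock.close()
    return {"config": config, "connected": conn.get("connected", False)}


# Usage with a local socket pair standing in for a real connection.
left, right = socket.socketpair()
result = post_connect({}, {"sock": left, "connected": True}, {}, 1)
right.close()
```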

pre_connect(job)

Performs pre-connection operations.

Parameters:	job (dict) – The job record.
Returns:	dict – Result of the pre-connection operation(s).

The pre_connect function can be used to perform any operations that must be performed before each connection. It will be run only once per job, with the same result passed to both the A and B connect calls. This function is not synchronised with the configurator.

Plugins to PATHspider can optionally implement this function. If this function is not overridden, it will be a no-op.
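For example, a plugin might use this hook to derive values that both connections will need. The sketch below is a standalone function with an assumed job key `ip` and assumed result keys; it is not taken from any existing plugin.

```python
import ipaddress
import time


def pre_connect(job):
    """Compute per-job values once, before the A and B connections."""
    addr = ipaddress.ip_address(job["ip"])  # "ip" is a hypothetical job key
    return {
        "ip_version": addr.version,  # 4 or 6
        "started": time.time(),      # wall-clock time at which the job began
    }


pcs = pre_connect({"ip": "2001:db8::1"})
```

The returned dict is what would then be passed as `pcs` to both connect calls for the job.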

shutdown()

Shut down PATHspider in an orderly fashion, ensuring that all queued jobs complete and all available results are merged.

start()

This function starts a PATHspider plugin.

In order to run, the plugin must have first been activated by calling its activate() method. This function causes the following to happen:

Set the running flag

Create a pathspider.observer.Observer and start its process

Start the merger thread

Start the configurator thread

Start the worker threads

The number of worker threads to start was given when activating the plugin.

terminate()

Shut down PATHspider as quickly as possible, without any regard to completeness of results.

worker(worker_number)

This function provides the logic for the worker threads.

Parameters:	worker_number (int) – The unique number of the worker.

The workers operate as continuous loops:

Fetch next job from the job queue

Perform pre-connection operations

Acquire a lock for “config_zero”

Perform the “config_zero” connection

Release “config_zero”

Acquire a lock for “config_one”

Perform the “config_one” connection

Release “config_one”

Perform post-connection operations for config_zero and pass the result to the merger

Perform post-connection operations for config_one and pass the result to the merger

Do it all again

If the job fetched is the SHUTDOWN_SENTINEL, then the worker will terminate as this indicates that all the jobs have now been processed.
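The loop described above can be sketched roughly as follows. This is a simplified, self-contained illustration and not PATHspider's implementation: plain locks stand in for the configurator's synchronisation, the three plugin hooks are trivial stubs, and the names `jobqueue`, `resqueue`, `config_zero_lock` and `config_one_lock` are assumptions.

```python
import queue
import threading

SHUTDOWN_SENTINEL = object()  # stand-in for the real sentinel value

jobqueue = queue.Queue()
resqueue = queue.Queue()             # would be read by the merger
config_zero_lock = threading.Lock()  # hypothetical stand-ins for the
config_one_lock = threading.Lock()   # configurator's synchronisation


def pre_connect(job):                # stub plugin hooks
    return {}


def connect(job, pcs, config):
    return {"config": config}


def post_connect(job, conn, pcs, config):
    return {"job": job["id"], "config": config}


def worker(worker_number):
    while True:
        job = jobqueue.get()
        if job is SHUTDOWN_SENTINEL:
            break                    # all jobs have been processed
        pcs = pre_connect(job)
        with config_zero_lock:
            conn0 = connect(job, pcs, 0)
        with config_one_lock:
            conn1 = connect(job, pcs, 1)
        # Post-connection work happens outside the synchronised sections.
        resqueue.put(post_connect(job, conn0, pcs, 0))
        resqueue.put(post_connect(job, conn1, pcs, 1))


for job_id in range(3):
    jobqueue.put({"id": job_id})
jobqueue.put(SHUTDOWN_SENTINEL)

thread = threading.Thread(target=worker, args=(0,))
thread.start()
thread.join()
```

With a single worker, each job yields two results in order, one per configuration. With several workers, one sentinel per worker would be needed for this sketch to stop them all.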
