Abstract Spider

The core functionality of PATHspider plugins is implemented in two classes: pathspider.sync.SynchronizedSpider and pathspider.desync.DesynchronizedSpider. A third class, pathspider.forge.ForgeSpider, inherits from pathspider.desync.DesynchronizedSpider. All of these ultimately inherit from the base class pathspider.base.Spider, which provides a skeleton with the functions required by any plugin. The documentation for this base class is below:

pathspider.base

Basic framework for PATHspider: coordinate active measurements on large target lists with both system-level network stack state (sysctls, iptables rules, etc.) and information derived from flow-level passive observation of traffic at the sender.

class pathspider.base.PluggableSpider[source]
static register_args(subparsers)[source]
class pathspider.base.Spider(worker_count, libtrace_uri, args, server_mode)[source]

A spider consists of a configurator (which alternates between two system configurations), a large number of workers (each performing some network action for each configuration), an Observer which derives information from passively observed traffic, and a thread that merges results from the workers with flow records from the observer.

add_job(job)[source]

Adds a job to the job queue.

If PATHspider is currently stopping, the job will not be added to the queue.
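The behaviour described above can be illustrated with a minimal sketch. It is not the PATHspider implementation: the class name JobQueueSketch, the attribute names jobqueue and stopping, the boolean return value, and the job keys dip and dp are all assumptions made for illustration.

```python
import queue
import threading


class JobQueueSketch:
    """Hypothetical stand-in for a spider's job queue handling."""

    def __init__(self):
        self.jobqueue = queue.Queue()        # assumed attribute name
        self.stopping = threading.Event()    # assumed "stopping" flag

    def add_job(self, job):
        # Refuse new work once a stop has been requested.
        if self.stopping.is_set():
            return False
        self.jobqueue.put(job)
        return True


sketch = JobQueueSketch()
# The job keys used here are illustrative only.
accepted = sketch.add_job({"dip": "192.0.2.1", "dp": 80})
sketch.stopping.set()
rejected = sketch.add_job({"dip": "192.0.2.2", "dp": 80})
```

In this sketch the first job is queued and the second is silently dropped, because the stopping flag was set before it was submitted.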

chains = []
combine_flows(flows)[source]
configurator()[source]
create_observer()[source]

Create a flow observer.

This function is called by the base Spider logic to get an instance of pathspider.observer.Observer configured with the function chains that are required by the plugin.

exception_wrapper(target, *args, **kwargs)[source]
merge(flow, res)[source]

Merge a job record with a flow record.

Parameters:
  • flow (dict) – The flow record.
  • res (dict) – The job record.
Returns:

tuple – Final record for job.

In order to create a final record for reporting on a job, the final job record must be merged with the flow record. This function should be implemented by any plugin to provide the logic for this merge as the keys used in these records cannot be known by PATHspider in advance.

This method is not implemented in the abstract pathspider.base.Spider class and must be implemented by any plugin.
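One way a plugin might satisfy this requirement is sketched below. The class names BaseSketch and PluginSketch and the record keys are hypothetical, and a plain dict is returned here for readability, which is an assumption about the shape of the final record rather than a statement of what PATHspider expects.

```python
class BaseSketch:
    """Hypothetical stand-in for an abstract base class."""

    def merge(self, flow, res):
        # The base class leaves the merge logic to the plugin.
        raise NotImplementedError("plugins must implement merge()")


class PluginSketch(BaseSketch):
    """Hypothetical plugin that knows which keys its records use."""

    def merge(self, flow, res):
        # Start from a copy of the flow record so the input is not mutated,
        # then overlay the keys from the job record.
        record = dict(flow)
        record.update(res)
        return record


# All keys below are illustrative only.
flow = {"dip": "192.0.2.1", "pkt_count": 12}
res = {"dip": "192.0.2.1", "connected": True}
merged = PluginSketch().merge(flow, res)

try:
    BaseSketch().merge(flow, res)
    base_raised = False
except NotImplementedError:
    base_raised = True
```

Only the plugin knows which keys appear in its job and flow records, which is why the base class in this sketch raises NotImplementedError instead of guessing.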

merger()[source]

Thread to merge results from the workers and the observer.

post_connect(job, rec, config)[source]

Performs post-connection operations.

Parameters:
  • job (dict) – The job record.
  • rec (dict) – The result of the connection operation(s).
  • config (int) – The state of the configurator during pathspider.base.Spider.connect().

The post_connect function can be used to perform any operations that must be performed after each connection. It will be run for both the A and the B configuration, and is not synchronized with the configurator.

Plugins to PATHspider can optionally implement this function. If this function is not overridden, it will be a no-op.

Any sockets or other file handles that were opened during pathspider.base.Spider.connect() should be closed in this function if they have not been already.
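The cleanup responsibility described above could look roughly like the following sketch. The class name PostConnectSketch and the record key sock are assumptions; the real keys depend on what a plugin's connect function stores in its result.

```python
import socket


class PostConnectSketch:
    """Hypothetical plugin fragment showing socket cleanup."""

    def post_connect(self, job, rec, config):
        # "sock" is an assumed key under which connect() left an open socket.
        sock = rec.pop("sock", None)
        if sock is not None:
            sock.close()


# Create a socket without connecting it, purely to demonstrate the cleanup.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
rec = {"sock": sock, "connected": False}
PostConnectSketch().post_connect({"dip": "192.0.2.1"}, rec, 0)
```

Removing the socket from the record before closing it also keeps a non-serialisable object out of any later reporting step, which is a design choice of this sketch only.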

pre_connect(job)[source]

Performs pre-connection operations.

Parameters:
  • job (dict) – The job record.

The pre_connect function can be used to perform any operations that must be performed before each connection. It will be run only once per job, with the same result passed to both the A and B connect calls. This function is not synchronized with the configurator.

Plugins to PATHspider can optionally implement this function. If this function is not overridden, it will be a no-op.
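The "once per job, shared by both connections" behaviour can be sketched as follows. The driver function run_job, the connect signature, and the job keys are assumptions for illustration and are not taken from the PATHspider source.

```python
class PreConnectSketch:
    """Hypothetical plugin fragment."""

    def __init__(self):
        self.pre_connect_calls = 0

    def pre_connect(self, job):
        # Work done here happens once, regardless of configuration.
        self.pre_connect_calls += 1
        return {"label": "{}:{}".format(job["dip"], job["dp"])}

    def connect(self, job, pcs, config):
        # Assumed signature: pcs is the value returned by pre_connect().
        return {"config": config, "label": pcs["label"]}


def run_job(spider, job):
    """Hypothetical driver: one pre_connect, then the A and B connects."""
    pcs = spider.pre_connect(job)
    return [spider.connect(job, pcs, config) for config in (0, 1)]


spider = PreConnectSketch()
results = run_job(spider, {"dip": "192.0.2.1", "dp": 443})
```

Both connection results in this sketch carry the same label, because they were built from the single pre_connect result.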

shutdown()[source]

Shut down PATHspider in an orderly fashion, ensuring that all queued jobs complete and all available results are merged.

start()[source]

This function starts a PATHspider plugin by:

  • Setting the running flag
  • Creating and starting an observer
  • Starting the merger thread
  • Starting the configurator thread
  • Starting the worker threads

The number of worker threads to start was given when activating the plugin.
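The start-up order listed above can be sketched with plain threads. This is a simplified illustration under stated assumptions: the class StartSketch and its helpers are hypothetical, every component is modelled as an idle thread, and the real observer may not be a thread at all.

```python
import threading


class StartSketch:
    """Hypothetical stand-in that only records the start-up order."""

    def __init__(self, worker_count):
        self.worker_count = worker_count
        self.running = False
        self.threads = []

    def _idle(self, *args):
        # Placeholder body; real components would loop while running.
        pass

    def _spawn(self, name, *args):
        thread = threading.Thread(target=self._idle, args=args,
                                  name=name, daemon=True)
        thread.start()
        self.threads.append(thread)

    def start(self):
        self.running = True              # 1. set the running flag
        self._spawn("observer")          # 2. create and start an observer
        self._spawn("merger")            # 3. start the merger thread
        self._spawn("configurator")      # 4. start the configurator thread
        for number in range(self.worker_count):
            self._spawn("worker_{}".format(number), number)  # 5. workers


spider = StartSketch(worker_count=3)
spider.start()
for thread in spider.threads:
    thread.join()
names = [thread.name for thread in spider.threads]
```

The worker count is passed to the constructor in this sketch, mirroring the worker_count argument in the class signature shown earlier.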

terminate()[source]

Shut down PATHspider as quickly as possible, without any regard to completeness of results.
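The difference between an orderly shutdown and a fast termination can be sketched with a simple queue. The class StopSketch and its attributes are hypothetical, and the real methods coordinate several threads rather than draining a queue inline.

```python
import queue


class StopSketch:
    """Hypothetical stand-in contrasting the two ways of stopping."""

    def __init__(self, jobs):
        self.jobqueue = queue.Queue()
        for job in jobs:
            self.jobqueue.put(job)
        self.results = []

    def _drain(self, process):
        # Empty the queue, optionally processing each job on the way out.
        while True:
            try:
                job = self.jobqueue.get_nowait()
            except queue.Empty:
                break
            if process:
                self.results.append(job)

    def shutdown(self):
        # Orderly: every queued job is completed before stopping.
        self._drain(process=True)

    def terminate(self):
        # Fast: queued jobs are discarded and results stay incomplete.
        self._drain(process=False)


orderly = StopSketch(["job-a", "job-b", "job-c"])
orderly.shutdown()

abrupt = StopSketch(["job-a", "job-b", "job-c"])
abrupt.terminate()
```

In this sketch the orderly instance ends with all three jobs in its results, while the abrupt instance ends with none.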

worker(worker_number)[source]