Developing Plugins¶
PATHspider is written to be extensible and the plugins that are included in the PATHspider distribution are examples of the measurements that PATHspider can perform.
pathspider.plugins
is a namespace package. Namespace packages are a
mechanism for splitting a single Python package across multiple directories on
disk. One or more distributions may provide modules which exist inside the same
namespace package. The PATHspider distribution’s plugins are installed in
pathspider.plugins
, but also 3rd-party plugins can exist in this path
without being a part of the PATHspider distribution.
Quickstart¶
The directory layout and example plugin below can be found in the pathspider-example GitHub repository. You can get going quickly by forking this repository and using that as a basis for plugin development.
Directory Layout¶
To get started you will need to create the required directory layout for PATHspider plugins, in this case for the Example plugin:
pathspider-example
└── pathspider
├── __init__.py
└── plugins
├── __init__.py
└── example.py
Inside both __init__.py files, you will need to add the following (and only the following):
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
Your plugin will be written in example.py
and this plugin will be
discovered automatically when you run PATHspider.
Example Plugin¶
The following code can be found in the quickstart example as a starting point
for developing your plugin. If you are not using the quickstart example, you
may copy and paste this code into a Python file under pathspider/plugins/
in the directory structure. This example is explained in the following
sections.
import sys
import collections
import logging
from pathspider.base import SynchronizedSpider
from pathspider.base import PluggableSpider
from pathspider.base import NO_FLOW
from pathspider.observer import simple_observer
Connection = collections.namedtuple("Connection", ["host", "state"])
SpiderRecord = collections.namedtuple("SpiderRecord", ["ip", "rport", "port",
"host", "config",
"connstate"])
class Example(SynchronizedSpider, PluggableSpider):
"""
An example PATHspider plugin.
"""
def config_zero(self):
logger = logging.getLogger("example")
logger.debug("Configuration zero")
def config_one(self):
logger = logging.getLogger("example")
logger.debug("Configuration one")
def connect(self, job, pcs, config):
sock = tcp_connect(job)
return Connection(sock, 1)
def post_connect(self, job, conn, pcs, config):
rec = SpiderRecord(job[0], job[1], job[2], config, True)
return rec
def create_observer(self):
logger = logging.getLogger("example")
try:
return simple_observer()
except:
logger.error("Observer would not start")
sys.exit(-1)
def merge(self, flow, res):
if flow == NO_FLOW:
flow = {"dip": res.ip,
"sp": res.port,
"dp": res.rport,
"observed": False}
else:
flow['observed'] = True
self.outqueue.put(flow)
@staticmethod
def register_args(subparsers):
parser = subparsers.add_parser('example', help="Example starting point for development")
parser.set_defaults(spider=Example)
You will need to provide implementations for each of these functions, which are explained next. We’ll start with the connection logic.
Connection Logic¶
Configurator¶
These functions perform global changes that may be required between performing the baseline (A) and the experimental (B) configurations. The changes may be a call to sysctl, changes via netfilter or a call to a robot arm to reposition the satellite array. In the event that global state changes are not required, these can be implemented as no-ops.
An example implementation of these methods can be found in the ECN plugin:
(Pre-,Post-) Connection¶
The pre-connection function will run only once, and the result of the pre-connection operation will be available to both runs of the connection and post-connection functions.
If you require to pass different values depending on the configuration, you can perform two operations in the pre-connect function, returning a tuple, and selecting the value to use based on the configuration in the later functions.
An example implementation of these methods can be found in the ECN plugin:
Observer Functions¶
PATHspider’s observer will accept functions and pass python-libtrace dissected packets along with the associated flow record to them for every packet recieved.
The pathspider.observer
module provides
pathspider.observer.simple_observer()
which allows the creation of a very
simple Observer during development of the other portions of the plugin. There
are two simple examples of observer functions that are used in the observer
created by this function.
When you are ready to start working with your own Observer functions, you will
need to expand your create_observer()
function. You can use the following
example:
from pathspider.observer import Observer
from pathspider.observer import basic_flow
from pathspider.observer import basic_count
class Example(SynchronizedSpider, PluggableSpider):
[...]
def create_observer(self):
logger = logging.getLogger("example")
try:
return Observer(self.libtrace_uri,
new_flow_chain=[basic_flow],
ip4_chain=[basic_count],
ip6_chain=[basic_count])
except:
logger.error("Observer would not start")
sys.exit(-1)
Depending on the types of analysis you would like to do on the packets, you should pass your functions to the appropriate chain:
Function Chain | Description |
---|---|
new_flow_chain | Functions to initialise fields in the flow record for new flows. |
ip4_chain | Functions to record details from IPv4 headers. |
ip6_chain | Functions to record details from IPv6 headers. |
tcp_chain | Functions to record details from TCP headers. |
udp_chain | Functions to record details from UDP headers. |
l4_chain | Functions to record details from other layer 4 headers. |
Library Observer Functions¶
The pathspider.observer.basic_flow()
function simply creates the inital
state for the flow record, extracting the 5-tuple and initialising counters.
The counters are used by the pathspider.observer.basic_count()
function
that counts the number of packets and octets seen in each direction. These
combined will allow your plugin to produce the default output fields.
PATHspider also provides library observer functions for some protocols:
Writing Observer Functions¶
When you are ready to write functions for the observer, first identify which
data should be stored in the flow record. This is a dict
that is made
available for every call to an observer function for a particular flow and
not shared across flows. Once the flow is completed, this is the record that
will be returned to the merger.
The flow record should be initialised when a new flow has been identified. The
functions in the new_flow_chain
are called, in sequence, when a new flow
is identified by the Observer. These functions are passed two arguments:
rec
- the empty flow record, and ip
- the IP header.
You should familiarise yourself with the python-libtrace documentation. The analysis
functions all follow the same function prototype with rec
- the empty flow
record, x
- the header, and rev
- boolean value indicating the
direction the packet travelled (i.e. Was the packet in the reverse direction?).
The only difference in these functions is the header that is passed, as a python-libtrace object, to the function. The same flow record is always passed for each call for the same flow, regardless of which function chain the function is in.
If a function returns False, as it has identified the end of the flow, the
Observer will consider the flow to be finished and will pass it to be merged
with the job record after a short delay. This might occur for TCP flows when
both FIN packets have been seen using the
pathspider.observer.tcp.tcp_complete()
function.
Merging¶
The merge function will be called for every job and given the job record and the observer record. The merge function is then to return the final record to be recorded in the dataset for the measurement run.
Warning
It is possible for the Observer to return a NO_FLOW object in some circumstances, where the flow has not been observed. Any implementation must handle this gracefully.
An example implementation of this method can be found in the ECN plugin:
Running Your Plugin¶
In order to run your plugin, in the root of your plugin source tree run:
PYTHONPATH=. pathspider example </usr/share/doc/pathspider/examples.csv >results.txt
Unless you install your plugin, you will need to add the plugin tree to the
PYTHONPATH
to allow the plugin to be discovered.