Flow Analysis Chains
PATHspider’s flow observer accepts analysis chains and, for every packet
received, passes them the python-libtrace dissected packet along with the
associated flow record. Specify the chains you want in the chains attribute of
your class as a list of classes. If this list is empty, which is the default
unless your class overrides it, no flow analysis will be performed. This can be
useful during early development of your plugin while you work on the traffic
generation.
When you are ready to start working with flow analysis, you will need to expand
your chains attribute. You can see this in the following example:
    from pathspider.chains.basic import BasicChain

    class Example(SynchronizedSpider, PluggableSpider):
        name = "example"
        description = "An Example Plugin"
        version = "1.0"
        chains = [BasicChain, ...]
        ...
Depending on the types of analysis you would like to do on the packets, you should add additional chains to the chains attribute of your plugin class.
Library Flow Analysis Chains
The pathspider.chains.basic.BasicChain chain creates the initial state for the
flow record, extracting the 5-tuple and counting the number of packets and
octets in each direction. Unless you have good reason not to, include this
chain in your plugin, as its fields are used by the merger to match flow
records with their corresponding jobs.
PATHspider also provides library flow analysis chains for some protocols and extensions, which you can find on the Observer page.
Writing Flow Analysis Chains
When you are ready to write a chain for the observer, first identify which
data should be stored in the flow record. This is a dict that is made available
on every call to a chain function for a particular flow (identified by its
5-tuple) and is not shared across flows.
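As a rough mental model of the paragraph above (this is not PATHspider's observer code), the flow records can be pictured as a table of dicts keyed by 5-tuple. The names flow_table and record_for, the pkts_seen field, and the ordering of the 5-tuple elements are all hypothetical and chosen only for illustration.

```python
# Illustrative model only: flow_table and record_for are hypothetical names,
# not part of PATHspider.
flow_table = {}


def record_for(five_tuple):
    """Return the record dict for a flow, creating it on first use."""
    return flow_table.setdefault(five_tuple, {})


# Assumed ordering: (source IP, destination IP, source port, dest port, protocol)
flow_a = ("192.0.2.1", "198.51.100.7", 40000, 80, 6)
flow_b = ("192.0.2.1", "198.51.100.7", 40001, 80, 6)

record_for(flow_a)["pkts_seen"] = 1   # first packet of flow A
record_for(flow_a)["pkts_seen"] += 1  # the same dict is returned on the next call
record_for(flow_b)["pkts_seen"] = 1   # flow B gets its own, separate dict
```

Each chain function for a given flow sees the same dict, so state written while handling one packet is available when handling the next.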
Flow chains inherit from pathspider.chains.base.Chain and provide a series of
functions for handling different types of packet.
class pathspider.chains.base.Chain
    This is an abstract flow analysis chain. It is intended that all flow analysis chains will subclass this class and it is not intended for this class to be directly used by PATHspider plugins.
You should familiarise yourself with the python-libtrace documentation. The
analysis functions all follow similar function prototypes with rec: the flow
record, x: the protocol header, and rev: a boolean value indicating the
direction the packet travelled (i.e. was the packet in the reverse direction?).
The exceptions to this rule are new_flow(), which takes only rec and the IP
packet, and icmp4 and icmp6, which also take a q argument: the ICMP quotation
if the message was of a type that carries a quotation, otherwise None.
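The prototypes described above can be sketched as follows. TTLChain and its record fields are hypothetical, the methods are written as instance methods taking self (an assumption about how chains are invoked), and the ttl attribute on the packet object is assumed rather than taken from this page. SimpleNamespace objects stand in for python-libtrace packets so the sketch runs on its own.

```python
from types import SimpleNamespace

try:
    from pathspider.chains.base import Chain
except ImportError:
    class Chain:  # stand-in so the sketch runs without PATHspider installed
        pass


class TTLChain(Chain):
    """Hypothetical chain recording the last IPv4 TTL seen in each direction."""

    def new_flow(self, rec, ip):
        rec["ttl_fwd"] = None
        rec["ttl_rev"] = None
        rec["icmp_quoted"] = None
        return True

    def ip4(self, rec, ip, rev):
        # Assumes the packet object exposes a ``ttl`` attribute.
        rec["ttl_rev" if rev else "ttl_fwd"] = ip.ttl
        return True

    def icmp4(self, rec, ip, q, rev):
        # q is None when the ICMP message carries no quotation.
        rec["icmp_quoted"] = q is not None
        return True


# Stand-in packets (not real python-libtrace objects).
chain = TTLChain()
rec = {}
chain.new_flow(rec, SimpleNamespace(ttl=64))
chain.ip4(rec, SimpleNamespace(ttl=64), False)          # forward direction
chain.ip4(rec, SimpleNamespace(ttl=57), True)           # reverse direction
chain.icmp4(rec, SimpleNamespace(ttl=250), None, True)  # no quotation
```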
The only requirement for a flow analysis chain is that it provides a new_flow()
function. All other functions are optional. If the new_flow() function does not
return True, the flow will be discarded. All other functions must return True
unless they have identified that the flow is complete and should be passed on
to the merger. If this is not easily detectable, a timeout will pass the flow
on for merging after a fixed interval in which no new packets have been seen.
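A minimal sketch of the return-value contract described above, under stated assumptions: FinChain and the observe() driver are hypothetical, the driver is a toy stand-in for the real observer, and the fin_flag attribute on the TCP object is assumed. Treating a single FIN as the end of the flow is a deliberate simplification.

```python
from types import SimpleNamespace

try:
    from pathspider.chains.base import Chain
except ImportError:
    class Chain:  # stand-in so the sketch runs without PATHspider installed
        pass


class FinChain(Chain):
    """Hypothetical chain that stops observing once a TCP FIN is seen."""

    def new_flow(self, rec, ip):
        rec["tcp_segments"] = 0
        return True  # keep the flow

    def tcp(self, rec, tcp, rev):
        rec["tcp_segments"] += 1
        # Assumes the TCP object exposes a boolean ``fin_flag`` attribute.
        # Returning False signals that the flow is complete.
        return not tcp.fin_flag


def observe(chain, segments):
    """Toy driver: feed segments to chain.tcp() until it reports completion."""
    rec = {}
    if not chain.new_flow(rec, None):
        return None  # flow discarded
    for seg in segments:
        if not chain.tcp(rec, seg, False):
            break  # flow complete: the record would be passed on for merging
    return rec


segments = [
    SimpleNamespace(fin_flag=False),
    SimpleNamespace(fin_flag=True),
    SimpleNamespace(fin_flag=False),  # never reached in this toy driver
]
result = observe(FinChain(), segments)
```

In this toy run the third segment is never handed to the chain, because the second call returned False.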
You can find descriptions for each of the possible chain functions in
pathspider.chains.noop.NoOpChain:
class pathspider.chains.noop.NoOpChain
    This flow analysis chain does not perform any analysis and is present here for the purpose of documentation and testing.
icmp4(rec, ip, q, rev)
    This function is called for every new ICMPv4 packet seen. It can be used to record details for fields present in the ICMPv4 header or quotation.
    Note: The IP header is passed as the argument, not the ICMP header, as it may be desirable to access fields in the IP header, for instance to determine the router or host that sent the ICMP message.
    Parameters:
    - rec (dict) – the flow record
    - ip (plt.ip) – the IPv4 packet that was observed to be part of this flow and contained an ICMPv4 header
    - q (plt.ip) – the ICMP quotation of the packet that triggered this message (if any)
    - rev (bool) – True if the packet was in the reverse direction, False if in the forward direction
    Returns: True if flow should continue to be observed, False if the flow should be passed on for merging (i.e. the flow is complete)
    Return type: bool
icmp6(rec, ip6, q, rev)
    This function is called for every new ICMPv6 packet seen. It can be used to record details for fields present in the ICMPv6 header or quotation.
    Note: The IP header is passed as the argument, not the ICMP header, as it may be desirable to access fields in the IP header, for instance to determine the router or host that sent the ICMP message.
    Parameters:
    - rec (dict) – the flow record
    - ip6 (plt.ip6) – the IPv6 packet that was observed to be part of this flow and contained an ICMPv6 header
    - q (plt.ip6) – the ICMP quotation of the packet that triggered this message (if any)
    - rev (bool) – True if the packet was in the reverse direction, False if in the forward direction
    Returns: True if flow should continue to be observed, False if the flow should be passed on for merging (i.e. the flow is complete)
    Return type: bool
ip4(rec, ip, rev)
    This function is called for every new IPv4 packet seen. It can be used to record details for fields present in the IPv4 header.
    Parameters:
    - rec (dict) – the flow record
    - ip (plt.ip) – the IPv4 packet that was observed to be part of this flow
    - rev (bool) – True if the packet was in the reverse direction, False if in the forward direction
    Returns: True if flow should continue to be observed, False if the flow should be passed on for merging (i.e. the flow is complete)
    Return type: bool
ip6(rec, ip6, rev)
    This function is called for every new IPv6 packet seen. It can be used to record details for fields present in the IPv6 header.
    Parameters:
    - rec (dict) – the flow record
    - ip6 (plt.ip6) – the IPv6 packet that was observed to be part of this flow
    - rev (bool) – True if the packet was in the reverse direction, False if in the forward direction
    Returns: True if flow should continue to be observed, False if the flow should be passed on for merging (i.e. the flow is complete)
    Return type: bool
new_flow(rec, ip)
    This function is called for every new flow to initialise a flow record with the fields that will be used by this chain. It is recommended to initialise all fields to None until other functions have set values for them, to make clear which fields are set by this chain and to avoid key errors later.
    Parameters:
    - rec (dict) – the flow record
    - ip (plt.ip or plt.ip6) – the IP or IPv6 packet that triggered the creation of a new flow record
    Returns: True if flow should be kept, False if flow should be discarded
    Return type: bool
tcp(rec, tcp, rev)
    This function is called for every new TCP packet seen. It can be used to record details for fields present in the TCP header.
    Parameters:
    - rec (dict) – the flow record
    - tcp – the TCP segment that was observed to be part of this flow
    - rev (bool) – True if the packet was in the reverse direction, False if in the forward direction
    Returns: True if flow should continue to be observed, False if the flow should be passed on for merging (i.e. the flow is complete)
    Return type: bool
udp(rec, udp, rev)
    This function is called for every new UDP packet seen. It can be used to record details for fields present in the UDP header.
    Parameters:
    - rec (dict) – the flow record
    - udp – the UDP segment that was observed to be part of this flow
    - rev (bool) – True if the packet was in the reverse direction, False if in the forward direction
    Returns: True if flow should continue to be observed, False if the flow should be passed on for merging (i.e. the flow is complete)
    Return type: bool