Flow Analysis Chains

PATHspider’s flow observer accepts analysis chains and, for every packet received, passes them the python-libtrace dissected packet along with the associated flow record. Specify the chains you want in the chains attribute of your class as a list of classes. If this list is empty, which is the default unless you override it in your class, no flow analysis is performed. This can be useful during early development of your plugin while you work on traffic generation.

When you are ready to start working with flow analysis, you will need to expand your chains attribute. You can see this in the following example:

from pathspider.base import PluggableSpider
from pathspider.chains.basic import BasicChain
from pathspider.sync import SynchronizedSpider

class Example(SynchronizedSpider, PluggableSpider):

    name = "example"
    description = "An Example Plugin"
    version = "1.0"
    chains = [BasicChain, ...]

    ...

Depending on the types of analysis you would like to do on the packets, you should add additional chains to the chains attribute of your plugin class.

Library Flow Analysis Chains

The pathspider.chains.basic.BasicChain chain creates initial state for the flow record, extracting the 5-tuple and counting the number of packets and octets in each direction. Unless you have good reason not to, include this chain in your plugin, as its fields are used by the merger to match flow records with their corresponding jobs.

PATHspider also provides library flow analysis chains for some protocols and extensions which you can find in the Observer page.

Writing Flow Analysis Chains

When you are ready to write a chain for the observer, first identify which data should be stored in the flow record. This is a dict that is made available for every call to a chain function for a particular flow (identified by its 5-tuple) and not shared across flows.
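The idea of one record per flow can be sketched as follows. This is only an illustration of the data model, not the observer's actual implementation: the flow_table dict, the record_for helper and the pkts field are hypothetical names, and the sketch ignores matching of reverse-direction packets to the same flow.

```python
# Hypothetical flow table: one dict per flow, keyed by the 5-tuple
# (source address, destination address, source port, destination port, protocol).
flow_table = {}


def record_for(five_tuple):
    """Return the record for this flow, creating an empty one on first use."""
    return flow_table.setdefault(five_tuple, {})


flow_a = ("192.0.2.1", "198.51.100.7", 49152, 80, 6)
flow_b = ("192.0.2.1", "198.51.100.7", 49153, 80, 6)

# Two packets for flow_a and one for flow_b; "pkts" is an illustrative field.
record_for(flow_a)["pkts"] = 1
record_for(flow_a)["pkts"] += 1
record_for(flow_b)["pkts"] = 1
```

Each flow sees only its own record, so a chain can store state in rec without worrying about other flows overwriting it.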

Flow chains inherit from pathspider.chains.base.Chain and provide a series of functions for handling different types of packet.

class pathspider.chains.base.Chain

This is an abstract flow analysis chain. It is intended that all flow analysis chains will subclass this class and it is not intended for this class to be directly used by PATHspider plugins.

You should familiarise yourself with the python-libtrace documentation. The analysis functions all follow a similar prototype with three arguments: rec, the flow record; x, the protocol header; and rev, a boolean indicating the direction the packet travelled (True if the packet was in the reverse direction). The exceptions are icmp4 and icmp6, which also take a q argument: the ICMP quotation if the message was of a type that carries one, otherwise None.

The only requirement for a flow analysis chain is that it provides a new_flow() function. All other functions are optional. If the new_flow() function does not return True, the flow will be discarded. All other functions must return True unless they have identified that the flow is complete and should be passed on to the merger. If this is not easily detectable, a timeout will pass the flow for merging after a fixed interval where no new packets have been seen.
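A minimal sketch of this contract might look like the following. To keep it runnable without PATHspider installed, Chain here is a local stand-in for pathspider.chains.base.Chain, TTLChain and its ttl_fwd and ttl_rev fields are hypothetical, and the packets are simple stand-in objects. The ttl attribute is assumed to match the one python-libtrace exposes on IPv4 objects. The methods take self because they are defined on a class.

```python
from types import SimpleNamespace


class Chain:
    """Local stand-in for pathspider.chains.base.Chain (assumption)."""


class TTLChain(Chain):
    """Hypothetical chain: remembers the first TTL seen in each direction."""

    def new_flow(self, rec, ip):
        # Initialise every field this chain owns; None means "not seen yet".
        rec["ttl_fwd"] = None
        rec["ttl_rev"] = None
        return True  # keep the flow

    def ip4(self, rec, ip, rev):
        key = "ttl_rev" if rev else "ttl_fwd"
        if rec[key] is None:
            rec[key] = ip.ttl
        # This chain cannot tell when the flow ends, so it always asks to
        # keep observing and relies on the timeout described above.
        return True


chain = TTLChain()
rec = {}
kept = chain.new_flow(rec, SimpleNamespace(ttl=64))
chain.ip4(rec, SimpleNamespace(ttl=64), False)   # forward packet
chain.ip4(rec, SimpleNamespace(ttl=52), True)    # reverse packet
chain.ip4(rec, SimpleNamespace(ttl=51), True)    # later packets do not overwrite
```

Only new_flow() is mandatory here; ip4() is one of the optional handlers, and the chain simply omits the ones it does not need.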

You can find descriptions for each of the possible chain functions in pathspider.chains.noop.NoOpChain:

class pathspider.chains.noop.NoOpChain

This flow analysis chain does not perform any analysis and is present here for the purpose of documentation and testing.

icmp4(rec, ip, q, rev)

This function is called for every new ICMPv4 packet seen. It can be used to record details for fields present in the ICMPv4 header or quotation.

Note

The IP header is passed as the argument, not the ICMP header, as it may be desirable to access fields in the IP header, for instance to determine the router or host that sent the ICMP message.

Parameters:
  • rec (dict) – the flow record
  • ip (plt.ip) – the IPv4 packet that was observed to be part of this flow and contained an ICMPv4 header
  • q (plt.ip) – the ICMP quotation of the packet that triggered this message (if any)
  • rev (bool) – True if the packet was in the reverse direction, False if in the forward direction
Returns:

True if flow should continue to be observed, False if the flow should be passed on for merging (i.e. the flow is complete)

Return type:

bool
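As an illustration of using both the IP header and the quotation, the sketch below records which host or router reported a destination as unreachable. It is not a PATHspider library chain: Chain is a local stand-in for pathspider.chains.base.Chain, UnreachableChain and the unreachable_from field are hypothetical, and the icmp.type and src_prefix attributes are assumed to match those python-libtrace exposes. The packets are stand-in objects so the example runs on its own.

```python
from types import SimpleNamespace

ICMP4_UNREACHABLE = 3  # ICMPv4 type 3: Destination Unreachable (RFC 792)


class Chain:
    """Local stand-in for pathspider.chains.base.Chain (assumption)."""


class UnreachableChain(Chain):
    """Hypothetical chain: notes who sent a Destination Unreachable."""

    def new_flow(self, rec, ip):
        rec["unreachable_from"] = None
        return True

    def icmp4(self, rec, ip, q, rev):
        # q is None when the ICMP message type carries no quotation.
        if q is not None and ip.icmp.type == ICMP4_UNREACHABLE:
            # The sender is read from the outer IP header, not the quotation.
            rec["unreachable_from"] = str(ip.src_prefix)
        return True  # keep observing the flow


chain = UnreachableChain()
rec = {}
chain.new_flow(rec, ip=None)  # the triggering packet is unused in this sketch

echo_reply = SimpleNamespace(icmp=SimpleNamespace(type=0), src_prefix="198.51.100.7")
chain.icmp4(rec, echo_reply, None, True)
after_echo = rec["unreachable_from"]

unreachable = SimpleNamespace(icmp=SimpleNamespace(type=3), src_prefix="192.0.2.254")
quotation = SimpleNamespace()  # stand-in for the quoted original packet
chain.icmp4(rec, unreachable, quotation, True)
```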

icmp6(rec, ip6, q, rev)

This function is called for every new ICMPv6 packet seen. It can be used to record details for fields present in the ICMPv6 header or quotation.

Note

The IP header is passed as the argument, not the ICMP header, as it may be desirable to access fields in the IP header, for instance to determine the router or host that sent the ICMP message.

Parameters:
  • rec (dict) – the flow record
  • ip6 (plt.ip6) – the IPv6 packet that was observed to be part of this flow and contained an ICMPv6 header
  • q (plt.ip6) – the ICMP quotation of the packet that triggered this message (if any)
  • rev (bool) – True if the packet was in the reverse direction, False if in the forward direction
Returns:

True if flow should continue to be observed, False if the flow should be passed on for merging (i.e. the flow is complete)

Return type:

bool

ip4(rec, ip, rev)

This function is called for every new IPv4 packet seen. It can be used to record details for fields present in the IPv4 header.

Parameters:
  • rec (dict) – the flow record
  • ip (plt.ip) – the IPv4 packet that was observed to be part of this flow
  • rev (bool) – True if the packet was in the reverse direction, False if in the forward direction
Returns:

True if flow should continue to be observed, False if the flow should be passed on for merging (i.e. the flow is complete)

Return type:

bool

ip6(rec, ip6, rev)

This function is called for every new IPv6 packet seen. It can be used to record details for fields present in the IPv6 header.

Parameters:
  • rec (dict) – the flow record
  • ip6 (plt.ip6) – the IPv6 packet that was observed to be part of this flow
  • rev (bool) – True if the packet was in the reverse direction, False if in the forward direction
Returns:

True if flow should continue to be observed, False if the flow should be passed on for merging (i.e. the flow is complete)

Return type:

bool

new_flow(rec, ip)

This function is called for every new flow to initialise a flow record with the fields that will be used by this chain. It is recommended to initialise all fields to None until other functions have set values for them to make clear which fields are set by this chain and to avoid key errors later.

Parameters:
  • rec (dict) – the flow record
  • ip (plt.ip or plt.ip6) – the IPv4 or IPv6 packet that triggered the creation of a new flow record
Returns:

True if flow should be kept, False if flow should be discarded

Return type:

bool

tcp(rec, tcp, rev)

This function is called for every new TCP packet seen. It can be used to record details for fields present in the TCP header.

Parameters:
  • rec (dict) – the flow record
  • tcp – the TCP segment that was observed to be part of this flow
  • rev (bool) – True if the packet was in the reverse direction, False if in the forward direction
Returns:

True if flow should continue to be observed, False if the flow should be passed on for merging (i.e. the flow is complete)

Return type:

bool
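The return value is what lets a chain end observation early. The sketch below returns False once a FIN has been seen in both directions, which is a simplification of real TCP teardown. Chain is a local stand-in for pathspider.chains.base.Chain, FinWatchChain and its fin_fwd and fin_rev fields are hypothetical, and fin_flag is assumed to match the attribute python-libtrace exposes on TCP objects. The segments are stand-in objects so the example runs on its own.

```python
from types import SimpleNamespace


class Chain:
    """Local stand-in for pathspider.chains.base.Chain (assumption)."""


class FinWatchChain(Chain):
    """Hypothetical chain: completes the flow once both sides have sent FIN."""

    def new_flow(self, rec, ip):
        rec["fin_fwd"] = False
        rec["fin_rev"] = False
        return True

    def tcp(self, rec, tcp, rev):
        if tcp.fin_flag:
            rec["fin_rev" if rev else "fin_fwd"] = True
        # False means "flow complete, pass it on for merging".
        return not (rec["fin_fwd"] and rec["fin_rev"])


chain = FinWatchChain()
rec = {}
chain.new_flow(rec, ip=None)  # the IP packet is unused in this sketch

# (segment, rev) pairs: data forward, FIN forward, FIN reverse.
segments = [
    (SimpleNamespace(fin_flag=False), False),
    (SimpleNamespace(fin_flag=True), False),
    (SimpleNamespace(fin_flag=True), True),
]
results = [chain.tcp(rec, seg, rev) for seg, rev in segments]
```

In this run the chain asks to keep observing for the first two segments and signals completion on the third.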

udp(rec, udp, rev)

This function is called for every new UDP packet seen. It can be used to record details for fields present in the UDP header.

Parameters:
  • rec (dict) – the flow record
  • udp – the UDP segment that was observed to be part of this flow
  • rev (bool) – True if the packet was in the reverse direction, False if in the forward direction
Returns:

True if flow should continue to be observed, False if the flow should be passed on for merging (i.e. the flow is complete)

Return type:

bool