DIstributed Firewall and Flow-shaper Using
Statistical Evidence (DIFFUSE)
Overview
In recent years a body of research has emerged around the
identification and classification of traffic flows based on
statistical properties (features), and in particular around the
application of Machine Learning (ML) techniques to generate such
classifiers. Statistical properties, such as the distributions of
packet sizes or inter-packet arrival times, can be calculated
without accessing packet payloads (that is, without packet
inspection). Such techniques help Internet Service Providers (ISPs)
work within any legal or technical limitations on direct payload
inspection. Potential new applications include characterising
traffic for Lawful Interception, automated ‘market research’ and
automated prioritisation of real-time traffic.
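As a rough illustration of what such payload-free features can look like, the following Python sketch computes a few summary statistics for one flow from packet sizes and arrival timestamps only. The function name, the input layout and the chosen feature set are assumptions made for this example; they are not the feature set used by DIFFUSE.

```python
from statistics import mean, pstdev


def flow_features(packets):
    """Summarise one flow without looking at any payload bytes.

    packets: list of (timestamp_seconds, size_bytes) tuples in arrival
    order. This input layout is an assumption made for the example.
    """
    if not packets:
        raise ValueError("flow has no packets")
    sizes = [size for _, size in packets]
    times = [ts for ts, _ in packets]
    # Inter-packet arrival times: gaps between consecutive timestamps.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "pkt_count": len(sizes),
        "size_mean": mean(sizes),
        "size_std": pstdev(sizes),
        "size_min": min(sizes),
        "size_max": max(sizes),
        "iat_mean": mean(gaps) if gaps else 0.0,
        "iat_std": pstdev(gaps) if gaps else 0.0,
    }


features = flow_features([(0.0, 100), (0.5, 200), (1.5, 300)])
```

A feature dictionary of this kind is what an ML classifier would typically be trained on; only header-level information (lengths and timing) is needed to build it.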
For many of these new applications a de-coupling between
flow classification and treatment (the actions performed
on flows, such as blocking or shaping) is highly desirable. For
example, a single high performance classifier near the core of an
ISP network may control multiple low-power nodes near the network
edge (perhaps embedded within ADSL or Cable modem gateways) so
that centralised traffic classification can automatically modify
the Quality of Service (QoS) treatment experienced by packets at
the network edge. This de-coupling also enables potentially
computationally intensive per-flow statistics calculations to be
offloaded from the packet forwarding path.
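The de-coupling described above can be sketched in a few lines of Python. This is a minimal in-process model under stated assumptions, not the DIFFUSE implementation: the class names, the action strings and the threshold rule in `toy_classify` (a stand-in for a trained ML model) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowKey:
    """5-tuple identifying a flow."""
    src: str
    dst: str
    sport: int
    dport: int
    proto: str


class ActionNode:
    """Edge node: runs no classifier, only applies classes pushed to it."""

    def __init__(self, policy, default_action="best_effort"):
        self.policy = policy              # class label -> local action
        self.default_action = default_action
        self.flow_class = {}              # FlowKey -> class label

    def update(self, key, label):
        self.flow_class[key] = label

    def action_for(self, key):
        # Unknown flows and unknown classes fall back to the default.
        label = self.flow_class.get(key)
        return self.policy.get(label, self.default_action)


class ClassifierNode:
    """Core node: classifies flows and informs every action node."""

    def __init__(self, classify, action_nodes):
        self.classify = classify
        self.action_nodes = list(action_nodes)

    def observe(self, key, features):
        label = self.classify(features)
        for node in self.action_nodes:
            node.update(key, label)
        return label


def toy_classify(features):
    """Hypothetical rule standing in for an ML model."""
    if features["size_mean"] < 300 and features["iat_std"] < 0.005:
        return "realtime"
    return "bulk"


edge_a = ActionNode({"realtime": "priority_queue", "bulk": "shape_1mbit"})
edge_b = ActionNode({"realtime": "priority_queue"})
core = ClassifierNode(toy_classify, [edge_a, edge_b])
voip = FlowKey("192.0.2.10", "198.51.100.7", 5004, 5004, "udp")
core.observe(voip, {"size_mean": 160, "iat_std": 0.001})
```

Each edge node keeps its own policy table, so the same class can map to different treatments at different points in the network while the statistics are computed only once, at the classifier.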
However, common open-source packet filters that combine firewall
and traffic shaping functions (such as ipfw, pf and netfilter)
currently do not use traffic statistics, instead relying on direct
inspection of packets passing through the filtering node’s local
interfaces. Furthermore, these filters tightly couple flow
classification and treatment: once a flow is classified, the
corresponding action is executed locally and immediately.
In this project we will design and develop extensions for existing
packet filters that provide ML-based classification based on
statistical properties and de-couple flow classification from
treatment, and we will analyse the accuracy, performance and
scalability of such a distributed system. We will further
explore whether automatic (re)training of classifiers can be
practically achieved using live IP traffic passing particular
points inside an ISP network, and the degree to which noise (packet
loss and jitter) in the live traffic feed impairs the system's
ability to recognise the same class of traffic in the future.
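To make the noise question concrete, the following Python sketch applies random packet loss and random added delay to a synthetic flow and compares a timing feature before and after. The function names, the uniform jitter model and the 20 ms synthetic flow are assumptions chosen for illustration; they say nothing about how the project itself models or measures noise.

```python
import random


def add_noise(packets, loss_rate, jitter_s, seed=0):
    """Return a copy of a flow with random loss and added delay.

    packets: list of (timestamp_seconds, size_bytes) tuples.
    Each packet is dropped with probability loss_rate; survivors are
    delayed by a uniform random amount in [0, jitter_s] seconds.
    """
    rng = random.Random(seed)
    noisy = []
    for ts, size in packets:
        if rng.random() < loss_rate:
            continue  # packet lost before reaching the observation point
        noisy.append((ts + rng.uniform(0.0, jitter_s), size))
    # Jitter can reorder packets, so restore arrival order.
    noisy.sort(key=lambda p: p[0])
    return noisy


def mean_gap(packets):
    """Mean inter-packet arrival time of a flow, in seconds."""
    times = [ts for ts, _ in packets]
    gaps = [b - a for a, b in zip(times, times[1:])]
    return sum(gaps) / len(gaps) if gaps else 0.0


# Synthetic flow: 500 packets of 160 bytes, one every 20 ms.
clean = [(i * 0.02, 160) for i in range(500)]
noisy = add_noise(clean, loss_rate=0.5, jitter_s=0.005, seed=1)
shift = mean_gap(noisy) - mean_gap(clean)
```

Here loss stretches the mean inter-arrival time of the observed flow, which is one way a classifier trained on clean traffic could see features drift away from what it learned.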
Figure 1: DIFFUSE allows
the use of traffic statistics to augment traditional packet
filtering and traffic shaping decisions
Project Goals
- Design packet filter extensions that
allow ML-based classification and the de-coupling of flow
classification and treatment.
- Design a protocol to transport information
about flow classes and actions from classifiers to nodes
enforcing actions.
- Develop extensions for existing packet
filters that implement the developed approach and can be used as
a demonstrator.
- Evaluate the accuracy, performance and
scalability of a distributed classification system and
characterise the various trade-offs.
- Investigate methods for dynamic (re)training
of classifiers and investigate the impact of noise on the
performance of these methods.
As part of this project we will develop and publicly release
software that allows the classification of flows based on
statistical properties and de-couples the classification from the
actions undertaken, and publish interim results and papers on our
website. The links at the top will take you to additional
information.
News
April 13th, 2012: We have
released DIFFUSE for OpenWRT, a version of
DIFFUSE that works on embedded devices such as home Internet
gateways. Our current prototype is based on the DIFFUSE 0.4
distribution running on the Attitude Adjustment (r29537) version of
OpenWRT (an embedded Linux
operating system). DIFFUSE for OpenWRT enables
automatic and dynamic QoS for home networks.
Project Members
This project began in June 2010 and has been made possible in
part by a gift from The
Cisco University Research Program Fund, a corporate advised
fund of Silicon Valley Community Foundation, for a project titled
"Exploring the efficacy of
distributed statistical traffic classification using modified open
source packet filters".