netAI Classifier

NAME

netAI_CL - Network traffic based Application Identification (Command line interface jarfile)

SYNOPSIS

netAI_CL <algorithm> [-c <classindex>] [-C <flowstatusindex>] [-l <model-file>] [-t <training-file>] [-A <attributes-index>] [-Y <output-attributes>] [-s <stats-file>] [-o <output-file>] [-P <port-number>] [-e <delimiter>] [-a] [-u] [-- <algorithm-parameters>]

DESCRIPTION

The Network Traffic based Application Identification (netAI) tool identifies the end-host applications responsible for traffic flows in a network. Unlike earlier solutions that identify applications from port numbers or packet payload information (through protocol decoding or signatures), netAI computes a variety of payload-independent features (e.g. packet length statistics) for each traffic flow and uses machine learning (ML) techniques to identify the application that generated the flow. Before netAI can classify a particular application, it must be trained on a representative set of traffic flows. netAI can be used offline (reading packet data from trace files) or online (capturing live on network interfaces).

netAI_CL is the command line version of the software, useful for testing and experimentation. You must specify either a training file or a model file at startup. Model files are recommended over large training files, because loading a saved model is faster than building a classifier from training data.

By default the classifier listens on port 4837 for TCP connections carrying flow statistics. Statistics can also be accepted via UDP or read from a file, although not all of these sources can be used at once.

The program can be run using the supplied scripts (see the netAI manpage) or standalone, in which case Weka must be included in the classpath.

OPTIONS

-c class index

Index of the class attribute, according to its position in the ARFF header or in the -A list (starting from 1). For example, when using the attributes "-A 3,4,5,6" where attribute "3" is the class attribute, the class index is: -c 1
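The relationship between the -A list and the -c value in the example above can be checked with a few lines of Python. This is only an illustrative sketch and not part of netAI; the function name class_index is hypothetical, and it assumes the -A list is a plain comma-separated string of integers.

```python
def class_index(used_attributes: str, class_attribute: int) -> int:
    """Return the 1-based -c value for a class attribute within an -A list.

    used_attributes is the comma-separated string passed to -A, and
    class_attribute is the attribute index that holds the class label.
    """
    indexes = [int(field) for field in used_attributes.split(",")]
    if class_attribute not in indexes:
        # The -A list must include the class attribute index.
        raise ValueError("class attribute %d is not in the -A list" % class_attribute)
    # -c counts positions within the -A list starting from 1.
    return indexes.index(class_attribute) + 1


print(class_index("3,4,5,6", 3))
```

For the example above ("-A 3,4,5,6" with class attribute 3) this gives 1, so the matching option is -c 1.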

-C command parameter index

Index of NetMate command parameter for active/complete flag (default is 1)

-l model-file

Load a saved Weka classifier model.

-t training file

ARFF file containing training data. Use with -B to build a model file. Must be included in all tests.

-A used attributes index

Comma-separated list of the indexes of the attributes used in testing. The index of each attribute is its position in the instance data as exported from NetMate (starting from 0). The list must include the class attribute index.

-Y output attributes index

Comma-separated list of attributes to be printed alongside the prediction. By default only the prediction and probability of each instance are printed. We recommend printing at least the destination port and hosts of each flow instance.

-s flow statistics file

Read data from a flow statistics file generated by NetMate. Flow statistics files should not be confused with ARFF test datasets (-T), as they contain no header information. Flow statistics files contain the same data as sent by NetMate over TCP/UDP.

-o output file

The file into which predictions are written as text. By default results are printed to stdout.

-P port number

Specify an alternative port on which to accept incoming test data. (default is 4837)
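Because flow statistics files are described as containing the same data that NetMate sends over TCP, one way to exercise the listening port without NetMate is to replay such a file over a TCP connection. The sketch below is an assumption-laden illustration, not a documented netAI feature: the function name replay_flow_stats is hypothetical, and it assumes the classifier accepts the file contents as an unmodified byte stream on a single connection.

```python
import socket

DEFAULT_PORT = 4837  # default netAI_CL listening port, change if -P is used


def replay_flow_stats(path, host="localhost", port=DEFAULT_PORT):
    """Send a flow statistics file unchanged over one TCP connection.

    Returns the number of bytes sent. Assumes the receiver treats the
    stream exactly like data arriving from NetMate (not verified here).
    """
    sent = 0
    with open(path, "rb") as source, socket.create_connection((host, port)) as conn:
        # Read and forward the file in fixed-size chunks until EOF.
        for chunk in iter(lambda: source.read(65536), b""):
            conn.sendall(chunk)
            sent += len(chunk)
    return sent
```

With a classifier already listening, a call such as replay_flow_stats("flows.stats", port=4837) would forward the file; "flows.stats" is a placeholder file name.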

-e delimiter

Delimiter used when outputting results. (default single whitespace)

-a

Include interim (active) flow predictions in the output. We recommend outputting the ’flowID’ and ’active/complete flag’ when using this feature.

-u

Accept data from NetMate using UDP. Packet loss can occur when sending UDP data from NetMate at high bit rates, so we recommend TCP, which is the default.

-- <Algorithm parameters>

Any further algorithm-specific parameters to be used. See the Weka documentation for usage details. These must be the last arguments on the command line, e.g. -- -U for an unpruned tree in J48
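The ordering rules described in this section (an algorithm first, either a training file or a model file, and the algorithm parameters last after "--") can be captured in a small helper that assembles an argument list. This is a hedged sketch rather than part of netAI: the helper name build_netai_args is hypothetical, it covers only a subset of the options, and the algorithm string used in the example (the Weka class name for J48) is an assumption about how <algorithm> is written.

```python
def build_netai_args(algorithm, model_file=None, training_file=None,
                     port=None, output_file=None, include_active=False,
                     udp=False, algorithm_params=()):
    """Assemble a netAI_CL argument list from the options on this page."""
    if model_file is None and training_file is None:
        # The description requires one of the two at startup.
        raise ValueError("specify a training file (-t) or a model file (-l)")
    args = [algorithm]
    if model_file is not None:
        args += ["-l", model_file]
    if training_file is not None:
        args += ["-t", training_file]
    if port is not None:
        args += ["-P", str(port)]
    if output_file is not None:
        args += ["-o", output_file]
    if include_active:
        args.append("-a")
    if udp:
        args.append("-u")
    if algorithm_params:
        # Algorithm parameters must come last, after a "--" separator.
        args += ["--", *algorithm_params]
    return args


# Assumed algorithm name and placeholder file name, for illustration only.
print(build_netai_args("weka.classifiers.trees.J48",
                       training_file="train.arff",
                       algorithm_params=["-U"]))
```

The resulting list can be appended to whatever launcher is in use (the supplied scripts or a direct java invocation with Weka on the classpath).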

FILES

netAI_CL.jar

The netAI JAR file containing the program

EXAMPLES

See the getting started document supplied with this software for some usage examples.

AUTHOR

Nigel Williams <niwilliams@swin.edu.au>