Code Structure - Overview
Netsniff has an object-oriented design and is implemented in C++.
It uses the underlying PCAP library
to capture all network traffic, which is then passed to a class
hierarchy that parses it and logs the relevant information.
The hierarchical layout allows a packet header to be processed for information
by a single class which then decides whether to pass the enclosed payload to
another class for processing. Stream-based protocols such as TCP are handled
slightly differently: Netsniff uses a
TCPStream instance to reconstruct an entire
TCP stream, gathering TCP-level statistics in the process, and passes the
TCP bitstream to an application-level parser for further processing.
Captured packets are processed both at the packet level and at the stream
level where appropriate.
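As an illustration of the stream-level handling described above, here is a minimal sketch of in-order reassembly. It is not Netsniff's TCPStream class: the StreamReassembler name, its AddSegment() and SegmentCount() methods, and the callback type are all hypothetical, and the sketch ignores sequence-number wraparound, overlapping segments, and connection setup and teardown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a TCPStream-like class: buffers out-of-order
// segments and hands contiguous bytes to an application-level callback.
class StreamReassembler {
public:
    using Sink = std::function<void(const std::string&)>;

    StreamReassembler(std::uint32_t initialSeq, Sink sink)
        : nextSeq_(initialSeq), sink_(std::move(sink)) {
        assert(static_cast<bool>(sink_));  // a callback is required
    }

    // Record one segment and deliver any bytes that are now in order.
    void AddSegment(std::uint32_t seq, const std::string& payload) {
        ++segments_;  // a simple TCP-level statistic
        if (payload.empty() || seq < nextSeq_) {
            return;   // simplification: empty or already-delivered data is dropped
        }
        pending_.emplace(seq, payload);  // keeps the first copy of a duplicate
        auto it = pending_.find(nextSeq_);
        while (it != pending_.end()) {
            sink_(it->second);
            nextSeq_ += static_cast<std::uint32_t>(it->second.size());
            pending_.erase(it);
            it = pending_.find(nextSeq_);
        }
    }

    std::size_t SegmentCount() const { return segments_; }

private:
    std::uint32_t nextSeq_;
    Sink sink_;
    std::map<std::uint32_t, std::string> pending_;
    std::size_t segments_ = 0;
};
```

If the segment at sequence number 105 arrives before the one at 100, nothing is delivered until the gap is filled, after which both payloads reach the callback in order.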
The current implementation of packet-level processing is shown in Figure 1,
where an arrow indicates which packet types are currently checked for within
a particular packet type. The PCAPDev class
is not a packet in its own right, but it contains the function called by the
pcap library and creates the first instance of the packet class used to process
the captured data.

Figure 1. Collaboration diagram of packet classes
Example of parsing an ICMP Packet
An ICMP packet is captured on an Ethernet device:
- The PCAPDev class constructs an
EtherPacket instance.
- The EtherPacket instance processes
the Ethernet headers and constructs an IPPacket
instance.
- The IPPacket instance processes the IP
header and constructs an ICMPPacket instance.
- The ICMPPacket instance processes the payload
of the ICMP Packet.
This class-based design allows ICMP packets to be parsed over a variety of
underlying protocols, as long as the parsers for those protocols are complete.
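The chain of constructors in the ICMP example can be sketched as follows. The class names come from the text, but the constructor signatures, the Child() and Parent() accessors, and the member names are assumptions made for illustration; the real Netsniff classes may differ. The header offsets used are the standard ones: a 14-byte Ethernet header with the EtherType in bytes 12 and 13 (0x0800 for IPv4), the IPv4 header length in the low nibble of byte 0, the protocol number in byte 9 (1 for ICMP), and the ICMP type and code in the first two payload bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical simplified base class: each packet knows its encapsulating
// packet and may own one encapsulated packet.
class Packet {
public:
    explicit Packet(const Packet* parent) : parent_(parent) {}
    virtual ~Packet() = default;
    const Packet* Parent() const { return parent_; }
    const Packet* Child() const { return child_.get(); }
protected:
    std::unique_ptr<Packet> child_;
private:
    const Packet* parent_;
};

class ICMPPacket : public Packet {
public:
    ICMPPacket(const std::uint8_t* data, std::size_t len, const Packet* parent)
        : Packet(parent) {
        if (len >= 2) { type_ = data[0]; code_ = data[1]; }
    }
    int Type() const { return type_; }
    int Code() const { return code_; }
private:
    int type_ = -1;
    int code_ = -1;
};

class IPPacket : public Packet {
public:
    IPPacket(const std::uint8_t* data, std::size_t len, const Packet* parent)
        : Packet(parent) {
        if (len < 20) return;                        // shorter than a minimal IPv4 header
        std::size_t headerLen = (data[0] & 0x0F) * 4u;
        protocol_ = data[9];
        if (protocol_ == 1 && headerLen >= 20 && headerLen <= len) {
            child_ = std::make_unique<ICMPPacket>(data + headerLen, len - headerLen, this);
        }
    }
    int Protocol() const { return protocol_; }
private:
    int protocol_ = -1;
};

class EtherPacket : public Packet {
public:
    EtherPacket(const std::uint8_t* data, std::size_t len) : Packet(nullptr) {
        assert(data != nullptr);
        if (len < 14) return;                        // shorter than an Ethernet header
        etherType_ = static_cast<std::uint16_t>((data[12] << 8) | data[13]);
        if (etherType_ == 0x0800) {
            child_ = std::make_unique<IPPacket>(data + 14, len - 14, this);
        }
    }
    std::uint16_t EtherType() const { return etherType_; }
private:
    std::uint16_t etherType_ = 0;
};
```

Each layer inspects only its own header and decides whether to hand the remaining bytes to another class, so a parser for a different link layer could construct the same IPPacket.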
Application streams are currently processed using the approach shown in Figure
2, indicating which applications running over a TCP session are currently processed
and logged. Any packets that are not processed as part of a supported application
are automatically shortened to 68 bytes and written to a log file in
tcpdump format.

Figure 2. Stream level protocols
Physical and link-layer protocol implementation
For each physical and link-layer protocol, a class instance is created to handle
the protocol under consideration. The instance is created by the class that
processes the immediately enclosing packet type. Each packet is constructed with
a pointer to its encapsulating packet, allowing classes further down the
protocol tree to backtrack and request information from parent classes (protocols).
All information within the packet headers and payload must be processed within
the constructor. Once the class is constructed, the memory allocated to store the
packet contents is discarded by the pcap
library. If the packet and/or header contents are required for other purposes they
must be copied or extracted during class construction. The destructor frees any
resources allocated in the constructor. For TCPPacket,
a ParseStream() method constructs and maintains
the TCP streams. The Output() method is called to
log all collected information to the provided file handle.
At run time, the packet headers and payload are parsed in the
main Packet class and subsequent constructors,
which store the information in their member variables. Once the captured information
has been parsed, the Output() method on the main
Packet class is automatically called with the
correct output stream to output all stored information. The
Output() method on any encapsulated packets is
also called, so that all processed packet information is output.
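The two rules above (copy everything needed during construction, then let Output() walk the encapsulated packets) can be sketched as follows. This is only an illustration under stated assumptions: HeaderPacket, PayloadPacket, the OutputSelf() hook, and Demo() are hypothetical names, and a std::ostream stands in for the file handle that Netsniff passes to Output().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

class Packet {
public:
    explicit Packet(const Packet* parent) : parent_(parent) {}
    virtual ~Packet() = default;

    // Log this layer first, then every encapsulated layer.
    void Output(std::ostream& out) const {
        OutputSelf(out);
        if (child_) child_->Output(out);
    }
protected:
    virtual void OutputSelf(std::ostream& out) const = 0;  // hypothetical hook
    std::unique_ptr<Packet> child_;
    const Packet* parent_;
};

class PayloadPacket : public Packet {
public:
    PayloadPacket(const std::uint8_t* data, std::size_t len, const Packet* parent)
        : Packet(parent),
          bytes_(reinterpret_cast<const char*>(data), len) {}  // copy, do not keep the pointer
protected:
    void OutputSelf(std::ostream& out) const override {
        out << "payload=" << bytes_ << '\n';
    }
private:
    std::string bytes_;
};

class HeaderPacket : public Packet {
public:
    HeaderPacket(const std::uint8_t* data, std::size_t len) : Packet(nullptr) {
        assert(data != nullptr);
        if (len == 0) return;
        tag_ = data[0];  // extract the one-byte "header" during construction
        child_ = std::make_unique<PayloadPacket>(data + 1, len - 1, this);
    }
protected:
    void OutputSelf(std::ostream& out) const override {
        out << "tag=" << tag_ << '\n';
    }
private:
    int tag_ = -1;
};

// Usage: the capture buffer is wiped before Output() runs, yet the logged
// text is still "tag=7\npayload=abc\n" because the data was copied.
std::string Demo() {
    std::uint8_t buf[] = {7, 'a', 'b', 'c'};
    HeaderPacket pkt(buf, sizeof(buf));
    std::memset(buf, 0, sizeof(buf));  // simulate the buffer being discarded
    std::ostringstream os;
    pkt.Output(os);
    return os.str();
}
```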
Packet-based application layer protocol implementation
Applications that fall into this category perform all their communications
at a packet level rather than at a stream level. To parse a new application
type, subclass the Packet class and follow these rules:
- All information contained within the packet payload must be processed or
copied by the class constructor as the memory allocated to the payload will
be freed once the constructor terminates.
- The DumpFileName() method should be
overridden to return the name of the file that packet information is dumped to.
An output grammar specifies the data to be
- The Output() method should be overridden
to output the stored payload information. All data logged by encapsulating
packets (e.g. the IP layer) is automatically output before the application
layer Output() method is called.
- If necessary, the boolean OmmitOutputFromParent()
method allows the encapsulating packet's output to be excluded.
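A packet-based application parser following these rules might look roughly like the sketch below. The three method names are taken from the list above, but their signatures, the simplified Packet base class, the SyslogPacket and UdpStub classes, and the Log() helper standing in for the framework's logging step are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical, simplified base class; the real Packet interface may differ.
class Packet {
public:
    explicit Packet(const Packet* parent) : parent_(parent) {}
    virtual ~Packet() = default;
    virtual std::string DumpFileName() const { return "packets.dump"; }
    virtual void Output(std::ostream& out) const = 0;
    virtual bool OmmitOutputFromParent() const { return false; }
    const Packet* Parent() const { return parent_; }
private:
    const Packet* parent_;
};

// Stand-in for an encapsulating transport-layer packet.
class UdpStub : public Packet {
public:
    UdpStub() : Packet(nullptr) {}
    void Output(std::ostream& out) const override { out << "udp layer\n"; }
};

// Hypothetical application parser for a packet-based protocol.
class SyslogPacket : public Packet {
public:
    SyslogPacket(const char* payload, std::size_t len, const Packet* parent)
        : Packet(parent) {
        assert(payload != nullptr);
        message_.assign(payload, len);  // copy: the payload buffer will be freed
    }
    std::string DumpFileName() const override { return "syslog.dump"; }
    void Output(std::ostream& out) const override {
        out << "syslog: " << message_ << '\n';
    }
    bool OmmitOutputFromParent() const override { return true; }
private:
    std::string message_;
};

// Stand-in for the framework: parent output first, unless it is omitted.
std::string Log(const Packet& app) {
    std::ostringstream os;
    if (app.Parent() != nullptr && !app.OmmitOutputFromParent()) {
        app.Parent()->Output(os);
    }
    app.Output(os);
    return os.str();
}
```

Because SyslogPacket returns true from OmmitOutputFromParent(), Log() writes only the application line; a parser that keeps the default would have the encapsulating packet's output written first.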
Stream-based application layer protocol implementation
Currently only TCP-based stream applications are processed by netsniff.
Netsniff will automatically reconstruct the TCP stream and extract TCP level
information for logging, passing the reconstructed TCP bitstream to a
registered application parser. This allows us to construct a parser that is
concerned only with the application layer information. Netsniff currently supports
the HTTP, FTP, HTTPS (TLS), POP3, SMTP and IMAP4 application protocols.
Each parser is implemented in a class subclassed from the APPParser
class. Port numbers for the application are registered in APPParserFactory.
Data from the TCP bitstream is collected a portion at a time, so the parser must be able to process this
data in blocks as it is provided. The procedure is:
- The underlying TCPStream class
reconstructs the bitstream in the correct order and passes portions to
the parser via the ParseClient() and
ParseServer() methods. The parser
processes these data blocks for any relevant information and stores it
for later output.
- The parser must honour the global anonymisation flag to anonymise data if required.
- The boolean Parsed() method should return
whether the TCP stream has been parsed, allowing non-parsed streams to be logged
in the notparsed.dump file.
- The DumpFileName() method should return the file
name to dump all logged information to.
- The Output() method is overridden to output
the stored information. Prepended to this output (for each individual TCP stream)
are the TCP stream statistics logged while the TCP bitstream was reconstructed.
- The static Create() method is called by the
APPParserFactory to dynamically create an instance
of the required parser.
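The procedure above can be sketched with a toy parser that extracts the user name from a POP3 session. The method names (ParseClient, ParseServer, Parsed, DumpFileName, Output, Create) and the APPParser and APPParserFactory class names come from the text; everything else is an assumption for illustration, including the method signatures, the factory's Register() method, the g_anonymise flag name, the Pop3UserParser class, and the Demo() helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

class APPParser {
public:
    virtual ~APPParser() = default;
    virtual void ParseClient(const char* data, std::size_t len) = 0;
    virtual void ParseServer(const char* data, std::size_t len) = 0;
    virtual bool Parsed() const = 0;
    virtual std::string DumpFileName() const = 0;
    virtual void Output(std::ostream& out) const = 0;
};

class APPParserFactory {
public:
    using Creator = std::unique_ptr<APPParser> (*)();
    void Register(std::uint16_t port, Creator create) {  // hypothetical method
        assert(create != nullptr);
        creators_[port] = create;
    }
    std::unique_ptr<APPParser> Create(std::uint16_t port) const {
        auto it = creators_.find(port);
        if (it == creators_.end()) return nullptr;  // no parser for this port
        return it->second();
    }
private:
    std::map<std::uint16_t, Creator> creators_;
};

bool g_anonymise = false;  // hypothetical name for the global anonymisation flag

class Pop3UserParser : public APPParser {
public:
    static std::unique_ptr<APPParser> Create() {
        return std::make_unique<Pop3UserParser>();
    }
    void ParseClient(const char* data, std::size_t len) override {
        clientBuf_.append(data, len);  // blocks may split a line anywhere
        std::size_t eol;
        while ((eol = clientBuf_.find("\r\n")) != std::string::npos) {
            HandleLine(clientBuf_.substr(0, eol));
            clientBuf_.erase(0, eol + 2);
        }
    }
    void ParseServer(const char*, std::size_t len) override { serverBytes_ += len; }
    bool Parsed() const override { return !user_.empty(); }
    std::string DumpFileName() const override { return "pop3.dump"; }
    void Output(std::ostream& out) const override {
        std::string shown = g_anonymise ? std::string("<anonymised>") : user_;
        out << "user=" << shown << " server_bytes=" << serverBytes_ << '\n';
    }
private:
    void HandleLine(const std::string& line) {
        if (line.compare(0, 5, "USER ") == 0) user_ = line.substr(5);
    }
    std::string clientBuf_;
    std::string user_;
    std::size_t serverBytes_ = 0;
};

// Usage: the USER command arrives split across two blocks.
std::string Demo() {
    APPParserFactory factory;
    factory.Register(110, &Pop3UserParser::Create);
    std::unique_ptr<APPParser> parser = factory.Create(110);
    parser->ParseClient("USER al", 7);
    parser->ParseClient("ice\r\n", 5);
    parser->ParseServer("+OK\r\n", 5);
    std::ostringstream os;
    parser->Output(os);
    return os.str();  // "user=alice server_bytes=5\n"
}
```

Buffering partial lines inside the parser is one way to cope with data that arrives in arbitrary blocks; a real parser would also bound the buffer size.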
Anonymisation tools implementation
Netsniff can anonymise potentially sensitive information. To allow for
correlation between applications, sessions, usernames and hosts, we would
like to be able to determine if email addresses, IP addresses and other
identifying information have been repeated. To support this, netsniff
includes a series of classes:
- IP Address Anonymisation - The IPAddressMap
class anonymises addresses using a similar algorithm to that used by
tcpdpriv
running with the -A 50 flag. The class maintains a map of seen
IP addresses to anonymised IP addresses so that the same IP address will
be mapped to the same anonymised IP address. This algorithm also
maintains network locality information between two IP addresses in their
anonymised forms.
- String Hashing - Strings are anonymised using a secure hash with
a random key (which is regenerated for each run of netsniff).
The output of the string anonymisation is a hex string which uniquely
represents the original input string. Comparison of similar input strings
is not possible since the hash function has the required property of equally
dispersing strings in the input-space into the output-space.
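To make the two properties of the IP address mapping concrete (repeatable results and preserved network locality), here is a minimal sketch of a prefix-preserving map. It is neither the tcpdpriv algorithm nor Netsniff's actual IPAddressMap: the Anonymise() and FlipFor() methods and the seed parameter are hypothetical, and std::mt19937 is used only for brevity and is not cryptographically secure.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <random>
#include <utility>

class IPAddressMap {
public:
    explicit IPAddressMap(std::uint32_t seed) : rng_(seed) {}

    // Map an IPv4 address (host byte order) to its anonymised form.
    std::uint32_t Anonymise(std::uint32_t addr) {
        auto seen = cache_.find(addr);
        if (seen != cache_.end()) return seen->second;  // same input, same output

        std::uint32_t out = 0;
        for (int i = 0; i < 32; ++i) {
            // The top i bits of the address decide whether bit i is flipped,
            // so addresses sharing a prefix share the anonymised prefix too.
            std::uint32_t prefix = (i == 0) ? 0u : (addr >> (32 - i));
            std::uint32_t bit = (addr >> (31 - i)) & 1u;
            out = (out << 1) | (bit ^ FlipFor(i, prefix));
        }
        cache_[addr] = out;
        return out;
    }

private:
    // Lazily choose, once per (prefix length, prefix), a random flip bit.
    std::uint32_t FlipFor(int len, std::uint32_t prefix) {
        assert(len >= 0 && len < 32);
        auto key = std::make_pair(len, prefix);
        auto it = flips_.find(key);
        if (it == flips_.end()) {
            std::uint32_t flip = static_cast<std::uint32_t>(rng_() & 1u);
            it = flips_.emplace(key, flip).first;
        }
        return it->second;
    }

    std::mt19937 rng_;
    std::map<std::pair<int, std::uint32_t>, std::uint32_t> flips_;
    std::map<std::uint32_t, std::uint32_t> cache_;
};
```

With this scheme, 192.168.1.1 and 192.168.1.2 map to different anonymised addresses that still share their first 24 bits, while the original addresses cannot be read from the output without the flip table.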
For more details on the anonymisation features of netsniff, please see
here.
More Information
For further information on developing modules for the netsniff architecture, please
see the netsniff documentation page.