netAI - Example Results
Overview
This page describes some of the experiments we have conducted using
netAI and the results obtained. Only key results are presented here;
please follow the links to the tech reports for more detailed
information.
Detecting Game Traffic
Overview
The Internet is experiencing an increase in the use and
commercialisation of interactive applications such as telephony and
online gaming. Online gaming in particular is expected to become a
large source of income, through either subscription-based games or
dedicated gaming services. Internet Service Providers may also
charge a premium for Quality of Service (QoS)-enhanced accounts
targeted at gamers.
Highly interactive online games, such as First Person Shooter
(FPS) games, have a narrow tolerance for network impairments such as
delay, jitter and packet loss, necessitating stricter QoS than
the best-effort service used for traditional Internet
applications such as web or email. For QoS to be effective,
however, an accurate and timely method of identifying and
classifying network gaming flows is required. As it is unlikely
that game applications will ever explicitly signal their QoS
demands to the network, the network itself must identify game flows
and establish adequate QoS for them. Once highly interactive
game traffic can be identified, it can be given higher priority
than other traffic in the network.
We evaluate the performance of several machine learning algorithms
for separating network games from generic (i.e. common) network
traffic. Although it is not the main focus, we also investigate
how effectively different games can be separated from each
other.
We find that some algorithms are able to separate the different
games from each other and other traffic with very high (>99%)
accuracy. We also find that all of the ML techniques seem to
be fast enough for real-time classification of a fairly large
number of simultaneous flows (at least several thousands per
second). Furthermore, most of the algorithms can train fast enough
to allow for frequent updates of the classifier (training took no
more than half an hour).
Tech Report
A much more detailed tech report will soon be
available for download here.
Features
We classify packets to flows based on source IP and source port,
destination IP and destination port. Flows are bidirectional and
the first packet seen by the classifier determines the forward
direction.
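The bidirectional flow mapping described above can be sketched as follows. This is an illustrative helper, not netAI's actual code: the two endpoints are ordered so that packets travelling in either direction map to the same flow, while the first packet seen defines the forward direction.

```python
# Illustrative sketch of a bidirectional flow key (assumed helper,
# not from netAI). Endpoints are sorted so both directions of a
# conversation yield the same key.

def flow_key(src_ip, src_port, dst_ip, dst_port):
    """Return a direction-independent key for a packet's addresses and ports."""
    a = (src_ip, src_port)
    b = (dst_ip, dst_port)
    # Order the endpoints canonically so direction does not matter.
    return a + b if a <= b else b + a

# Both directions of the same conversation share one key:
k_fwd = flow_key("10.0.0.1", 27015, "192.168.1.5", 50211)
k_rev = flow_key("192.168.1.5", 50211, "10.0.0.1", 27015)
assert k_fwd == k_rev
```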
Flows have limited duration. UDP flows are terminated by a 60
second flow timeout, while TCP flows are terminated upon proper
connection teardown (TCP state machine) or after a 60 second
timeout (whichever occurs first). We consider only UDP and TCP
flows that have at least 1 packet in each direction and transport
at least 1 byte of payload.
We distinguish active and idle periods of flows by using an idle
threshold, which is 1 second by default. Periods where no packets
are observed for 1 second or more are treated as idle periods.
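The active/idle splitting can be sketched as below, assuming a flow is represented as a sorted list of packet timestamps; the function name and representation are illustrative, not netAI's API.

```python
# Sketch of splitting a flow's packet timestamps into active periods
# (sub-flows) using the 1-second idle threshold described above.
# Hypothetical helper for illustration only.

IDLE_THRESHOLD = 1.0  # seconds

def split_active_periods(timestamps, idle_threshold=IDLE_THRESHOLD):
    """Group packet timestamps into active periods; a gap of
    idle_threshold seconds or more starts a new period."""
    periods = []
    current = [timestamps[0]]
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev >= idle_threshold:
            periods.append(current)  # close the current active period
            current = []
        current.append(cur)
    periods.append(current)
    return periods

# A 1.5-second gap splits this flow into two active periods:
print(split_active_periods([0.0, 0.2, 0.4, 1.9, 2.0]))
# [[0.0, 0.2, 0.4], [1.9, 2.0]]
```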
We compute the following features: protocol; duration; volume in
bytes and packets; average sub-flow volume in bytes and packets per
active period (sub-flows are the active parts of a flow, as described
above); number of packets with the push flag set (TCP flows only,
always 0 for UDP); packet length (minimum, mean, maximum, standard
deviation); inter-arrival times (minimum, mean, maximum, standard
deviation); and active and idle times (minimum, mean, maximum,
standard deviation). Aside from protocol and duration, all features
are computed separately for each direction of a flow. Packet-length
features are based on the IP length, excluding link-layer
overhead. Inter-arrival times are computed with microsecond
precision.
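The four statistics used for the distribution-based features can be computed as in this sketch, shown here for packet lengths; inter-arrival times and active/idle times use the same four statistics. The helper name is an assumption for illustration.

```python
# Sketch of the per-flow statistical features listed above:
# minimum, mean, maximum and standard deviation of a series.
import statistics

def four_stats(values):
    """Return (min, mean, max, population standard deviation)."""
    return (min(values),
            statistics.fmean(values),
            max(values),
            statistics.pstdev(values))

# Example: IP packet lengths in bytes for one direction of a flow.
pkt_lengths = [60, 60, 1500, 60, 120]
print(four_stats(pkt_lengths))
```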
Dataset
For our evaluation we use gaming data captured by members of
CAIA and some Command and
Conquer traffic captured by
Mark Claypool.
‘Generic’ traffic examples were taken from several
publicly available traffic traces. We predominantly focus on First
Person Shooter (FPS) games as these fast-paced games have the most
stringent QoS requirements. However, we have also included some
data of a Real Time Strategy (RTS) game. The games tested were from
the PC and Xbox platforms. It is important to include Xbox traffic
as current and next-generation console devices such as the Xbox 360
and PlayStation 3 are expected to produce a significant share of
online gaming traffic. The following table summarises the different
games.
We also use a large number of other (non-game) flows taken from
different public trace files available from
NLANR. The other class consists
mostly of web, peer-to-peer (eDonkey, Kazaa), mail (SMTP, POP) and
DNS traffic.
Table 1: Traffic classes used in experiments

Class  | Description                                   | Genre/platform
-------|-----------------------------------------------|---------------
CCG    | Command and Conquer: Generals                 | RTS / PC
HL1    | Half-Life: Death Match                        | FPS / PC
HL2-DM | Half-Life 2: Death Match                      | FPS / PC
HL2-CS | Half-Life 2: Counter Strike                   | FPS / PC
Q3     | Quake 3 Death Match                           | FPS / PC
TS     | Time Splitters                                | FPS / Xbox
HALO   | Halo: Death Match                             | FPS / Xbox
HALO2  | Halo 2: Death Match                           | FPS / Xbox
Other  | Common network protocols, e.g. HTTP, DNS, P2P |
For further details please refer to the
tech
report.
Results
We use the standard metrics of precision and recall (see the
definitions in the
tech report). We compute
precision and recall based not only on the number of flow instances
but also on the byte volume, in order to evaluate the classification
performance for large traffic flows.
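The difference between instance-based and byte-weighted metrics can be illustrated with a small sketch, assuming each flow is a (true class, predicted class, bytes) record; this record format is an assumption for illustration, not the tech report's code.

```python
# Sketch of instance-based vs byte-weighted recall for one class.
# A single large misclassified flow barely hurts instance-based
# recall but dominates the byte-weighted figure.

def recall(flows, cls, weight_by_bytes=False):
    """Fraction of class `cls` (by flows or by bytes) correctly labelled."""
    relevant = [(pred == cls, size if weight_by_bytes else 1)
                for true, pred, size in flows if true == cls]
    total = sum(w for _, w in relevant)
    hit = sum(w for ok, w in relevant if ok)
    return hit / total if total else 0.0

flows = [("game", "game", 10_000),
         ("game", "other", 90_000),  # one large misclassified flow
         ("game", "game", 5_000)]
print(recall(flows, "game"))                        # 2 of 3 instances
print(recall(flows, "game", weight_by_bytes=True))  # 15,000 of 105,000 bytes
```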
First we evaluate whether different algorithms are able to separate
game traffic from non-game traffic. We used the following
algorithms: C4.5, Naive Bayes, Bayesian Networks and NBTree. Figure
1 shows the precision and recall averaged across the two
classes, for both number of instances and number of bytes. The
results show very high precision and recall (>99%) for C4.5,
NBTree and the Bayesian Network (except for byte volume). Naive Bayes
performs worst.
Figure 1: Mean
precision and recall for game vs. non-game classes
We also evaluate whether the same algorithms can distinguish between
the different games (and the non-game traffic). Figure 2 shows the
mean precision and recall for both number of instances and number of
bytes. Mean precision and recall are slightly lower but still high
(~95%) for C4.5 and NBTree. The Bayesian Network does not perform as
well (~91%), and Naive Bayes again performs far worse than any of the
other algorithms.
Figure 2: Mean precision and recall for
individual game traffic classes vs. non-game traffic
Figure 3 shows precision and recall
for all classes based on the number of instances (top) and byte
volume (bottom) for the C4.5 classification algorithm.
Instance-based metrics are very good for all classes except for
Time Splitters (which is the smallest class). Byte-based
performance is also very good although there is a reduction for
some classes. The reduced precision and recall for the HL2-based
games is due to a number of HL2-CS in-game flows being
misclassified as HL2-DM, and a number of larger-volume Q3 flows were
also misclassified as HL2-DM.
Figure 3: Precision and recall for
individual game traffic classes based on number of instances (top)
and byte volume (bottom)
Figure 4 shows the classification performance in number of
instances per second for all algorithms. It shows the performance
of classifiers trained for two classes (game vs. non-game traffic)
and of classifiers trained with each game as a separate class. As
expected, classification performance is better with fewer classes for
all algorithms. All classifiers are reasonably fast for only two
classes, but only C4.5 and Naive Bayes can classify a large number of
flows per second when trained on all individual games separately.
C4.5 is the fastest algorithm; even the classifier trained on
individual games can classify ~90,000 flows per second.
Figure 4: Classification
performance of the different algorithms
For further results please have a look at the
tech
report.