Recap of Machine Learning For Network-Based IDS Study


Robin Sommer and Vern Paxson’s paper “Outside the Closed World: On Using Machine Learning for Network Intrusion Detection” provides an in-depth view of machine learning in network security. Although the paper was written a few years back, the topic is very relevant today because CDNs and cloud security companies are starting to position themselves as machine learning platforms.

A Network Intrusion Detection System (NIDS) monitors a single computer or a network of computers for suspicious activity, which could be an attack or unauthorized use. A large NIDS server can be placed on a backbone network to monitor all traffic, or smaller systems can be set up to monitor traffic for a particular server, switch, gateway, or router.

As a method of security management, a NIDS is installed only at specific points, such as servers that sit between the outside environment and the network segment to be protected. It analyzes information from various areas within a computer or network to identify possible threats posed by attackers.

However, a major weakness of IDSs is that they are passive and often unreliable as security safeguards. An IDS only needs to detect threats, so it is placed out-of-band on the network infrastructure, meaning that it is not in the real-time communication path between the sender and receiver. The IDS monitors traffic and reports its results to an administrator, but it cannot automatically take action to prevent a detected exploit from taking over the system.

The true function of an IDS is only to monitor and to notify in case of an impending threat. IDSs were originally developed this way because, at the time, the depth of analysis required for intrusion detection could not be performed at a speed that could keep pace with components on the direct communications path of the network infrastructure.

Without strong identification and authentication mechanisms in place on the network, attackers can exploit vulnerabilities very quickly once they get in, manipulating network traffic and leaving the IDS inadequate as a means of prevention.

Most techniques used in today’s IDSs cannot deal with the dynamic and complex nature of cyber attacks on computer networks. Traditional intrusion detection and prevention techniques, such as firewalls, access control mechanisms, and encryption, have several limitations in fully protecting networks and systems from increasingly sophisticated attacks such as DDoS.

To address the ever-changing threat landscape facing computers and networks, machine learning techniques, used efficiently, can deliver higher detection rates, fewer false positives, and better adaptability to challenging attacks.

Misuse Detection vs. Anomaly Detection

An IDS generally has to deal with large volumes of noisy network traffic, uneven data distribution, the difficulty of distinguishing normal from abnormal network behavior, and the need to adapt to a constantly changing environment. Bad packets generated by software bugs, corrupt DNS data, and local packets that escaped can create a significantly high false-alarm rate.

Understanding how a NIDS recognizes network activity helps determine the core weaknesses behind the system. Strategies for classifying network behavior typically fall under two categories: misuse detection and anomaly detection.

Misuse detection techniques examine both network and system activity for known instances of misuse using signature matching algorithms. This technique is effective at detecting attacks that are already known.

Alerts may be generated by the IDS, but reacting to every alert wastes time and resources and can destabilize the system. To overcome this problem, an IDS should not start its elimination procedure as soon as the first symptom is detected; rather, it should be patient enough to collect alerts and decide based on how they correlate. Typically, misuse detection rules act like a firewall that looks out for:

  • Use of one of the many SMTP/SSH exploits
  • A port scan
  • Abuse found by parsing user commands
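
The misuse rules above can be sketched in a few lines of Python. This is only an illustration, not how any particular IDS implements them: the two regular expressions, the rule names, the port threshold of 10, and the "two distinct rules" escalation policy are all assumptions made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical signatures: rule name -> pattern searched for in a log line or payload.
SIGNATURES = {
    "ssh-failed-login": re.compile(r"Failed password for (invalid user )?\w+"),
    "smtp-vrfy-probe": re.compile(r"^VRFY\s+\S+", re.IGNORECASE),
}

def match_signatures(text):
    """Return the names of all signatures that match the given text."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(text)]

def detect_port_scans(connections, port_threshold=10):
    """Flag sources that contacted at least `port_threshold` distinct ports.

    `connections` is an iterable of (source_ip, destination_port) pairs.
    """
    ports_by_source = defaultdict(set)
    for source_ip, port in connections:
        ports_by_source[source_ip].add(port)
    return {ip for ip, ports in ports_by_source.items() if len(ports) >= port_threshold}

def sources_to_escalate(alerts, min_distinct_rules=2):
    """Collect alerts first, then escalate only sources that tripped several distinct rules.

    `alerts` is an iterable of (source_ip, rule_name) pairs.
    """
    rules_by_source = defaultdict(set)
    for source_ip, rule_name in alerts:
        rules_by_source[source_ip].add(rule_name)
    return {ip for ip, rules in rules_by_source.items() if len(rules) >= min_distinct_rules}
```

The last function reflects the "be patient and correlate" idea: a single matched rule is recorded, but action is only considered once the same source shows up under more than one rule.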

Anomaly detection, on the other hand, proceeds by comparing every instance to what is “normal” for the network. Such a system needs a profile of the network, which can be a problem because it takes time and resources to train an anomaly detection sensor to build a profile that reflects normal network usage. Anomalies include, for instance:

  • Excessive bandwidth usage
  • Excessive system calls from a process
  • More than one entity using a service
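
As a rough sketch of the profile idea, take the first item, excessive bandwidth usage. One very simple profile is the mean and standard deviation of past readings, with anything too many standard deviations away treated as an anomaly. The function names and the threshold of 3 are assumptions for illustration; real sensors use far richer profiles.

```python
from statistics import mean, stdev

def build_profile(baseline_readings):
    """Summarize normal usage as (mean, sample standard deviation)."""
    return mean(baseline_readings), stdev(baseline_readings)

def is_anomalous(value, profile, z_threshold=3.0):
    """Return True when `value` is more than `z_threshold` deviations from the mean."""
    mu, sigma = profile
    if sigma == 0:
        # A perfectly flat baseline: any different value is a deviation.
        return value != mu
    return abs(value - mu) / sigma > z_threshold

# Hypothetical bandwidth readings in Mbit/s collected during normal operation.
profile = build_profile([100, 110, 95, 105, 90, 100])
print(is_anomalous(104, profile))  # within the usual range
print(is_anomalous(400, profile))  # far outside the usual range
```

The cost the paragraph above mentions shows up directly here: the profile is only as good as the baseline readings, and collecting a representative baseline takes time.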

Benefits of Machine Learning to NIDS

The challenge is to efficiently capture and classify various behaviors in a computer network, since they cannot be categorized under a single umbrella. A few modern machine learning approaches can be used to solve the problem of finding malicious activity within the network:

Supervised/Unsupervised Learning

One popular strategy is to monitor a network’s activity for anomalies, or anything that deviates from normal network behavior. Anomaly detection creates models of normal behavior for networks, systems, applications, end users and other devices and then looks for deviations from those patterns of behavior, at a much faster pace than manual analysis.

This means putting machine learning in place and using a variety of supervised (classification) and unsupervised (clustering) algorithms to detect anomalous patterns of user behavior, as gleaned from sources such as server logs, Active Directory entries, and virtual private network (VPN) logs.

Without human intervention, unsupervised machine learning does all of the processing work in order to identify potential security issues. It does this by processing millions of data points each minute and automatically identifying anomalous behavior. It then correlates anomalies across multiple data sources to determine their potential impact.
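
To make "automatically identifying anomalous behavior" a little more concrete, here is a minimal unsupervised sketch. Note that it is not a clustering algorithm: it swaps in a simpler nearest-neighbor distance score, where each point is scored by its average distance to its k closest neighbors and points with unusually large scores are flagged. No labels are needed. The feature choice, `k=2`, and the "5 times the median score" cutoff are assumptions for the example.

```python
import math
from statistics import median

def knn_outlier_scores(points, k=2):
    """Score each point by its mean distance to its k nearest neighbors."""
    scores = []
    for i, point in enumerate(points):
        distances = sorted(
            math.dist(point, other) for j, other in enumerate(points) if j != i
        )
        scores.append(sum(distances[:k]) / k)
    return scores

def flag_outliers(points, k=2, factor=5.0):
    """Return indexes of points whose score exceeds `factor` times the median score."""
    scores = knn_outlier_scores(points, k)
    cutoff = factor * median(scores)
    return [i for i, score in enumerate(scores) if score > cutoff]

# Hypothetical, already-normalized features per host:
# (requests per minute, distinct destinations contacted).
hosts = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0), (1.0, 1.2), (9.0, 8.0)]
print(flag_outliers(hosts))  # the last host sits far from all the others
```

This brute-force version compares every pair of points, so it would not keep up with millions of data points per minute; it is only meant to show the shape of the idea.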

Artificial Neural Networks (ANN)

A neural network consists of a collection of processing units called neurons that are highly interconnected, like those in a human brain. ANNs have the ability to learn by example, to generalize from limited, noisy, and incomplete data, and to make better predictions the more they “learn” from the network. They have been successfully employed in a broad spectrum of data-intensive applications. If ANNs are fed raw network traffic data, they can learn over time what counts as an anomaly or outlier on the network and detect it faster.
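
"Learning by example" can be shown with the smallest possible case: a single artificial neuron, the building block that a real network stacks by the hundreds or thousands. The sketch below trains one sigmoid neuron on labeled examples with plain gradient descent. The two features, the toy data, the learning rate, and the epoch count are all assumptions picked for illustration, and a single neuron is far simpler than the ANNs discussed above.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(model, features):
    """Return the neuron's output, read here as the probability of an attack."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

def train_neuron(samples, labels, learning_rate=0.5, epochs=2000):
    """Fit one sigmoid neuron with per-sample gradient descent on the log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            error = predict((weights, bias), features) - label
            # Move each weight against the gradient of the loss.
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

# Hypothetical normalized features: (packet rate, failed-login ratio).
samples = [(0.1, 0.0), (0.2, 0.1), (0.15, 0.05), (0.3, 0.1),
           (0.9, 0.8), (0.8, 0.9), (0.95, 0.7), (0.7, 0.85)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = normal, 1 = attack

model = train_neuron(samples, labels)
print(predict(model, (0.1, 0.05)))  # low score for normal-looking traffic
print(predict(model, (0.9, 0.9)))   # high score for attack-looking traffic
```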


Machine learning has potential, but it takes a lot of thought to apply well. Because machine learning has been a hot topic throughout the tech industry, there is a lot of vagueness about how effective vendors’ solutions actually are, and the same applies to NIDS. Network traffic is composed of many individual sessions, which adds up to an enormous amount of variety and unpredictable behavior. What qualifies as “normal” is hard to pin down for a network, which could result in higher false positive rates.

Human activity, application behavior, and network traffic are all heavily autocorrelated, making it hard to understand what activity is normal. This gives malicious actors plenty of opportunity to “hide in plain sight” and even an opportunity to train the system that malicious activity is normal. This is the disconnect between what the system reports and what the operator wants, and the root cause for too many false positives.
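
A little arithmetic shows why false positives hurt so much when attacks are rare. The sketch below uses Bayes’ rule to work out what fraction of alerts are real attacks. The numbers are hypothetical and chosen only to illustrate the effect.

```python
def alert_precision(base_rate, detection_rate, false_positive_rate):
    """Fraction of raised alerts that correspond to real attacks.

    base_rate:           share of all sessions that are malicious
    detection_rate:      share of malicious sessions that raise an alert
    false_positive_rate: share of benign sessions that raise an alert
    """
    true_alerts = detection_rate * base_rate
    false_alerts = false_positive_rate * (1.0 - base_rate)
    total_alerts = true_alerts + false_alerts
    return true_alerts / total_alerts if total_alerts else 0.0

# Hypothetical numbers: 1 in 1,000 sessions is malicious, the detector catches
# 99% of them, and it misfires on 1% of benign sessions.
precision = alert_precision(base_rate=0.001, detection_rate=0.99, false_positive_rate=0.01)
print(f"{precision:.1%} of alerts are real attacks")  # prints "9.0% of alerts are real attacks"
```

Even with a detector that sounds very accurate, roughly nine out of ten alerts in this scenario are false alarms, simply because benign sessions vastly outnumber malicious ones.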

The best solution is to train the model properly on a well-understood dataset and to use machine learning models for fraud detection and analysis. A NIDS alone cannot detect all malicious activity with its signature-based approach. Jumping straight into classification or regression is not recommended; first take the time to look at and analyze the data, understanding the features, their relation to each other, and their relation to the output.

This step gives a lot of insight into the problem. It can also answer questions that arise from odd behavior of the learning model. The learning model can then be trained as an effective predictor of what lies in network traffic.
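
One simple way to take that first look at the data is to check how strongly each feature moves with the label before training anything. The sketch below computes the Pearson correlation between every feature and the output. The feature names and rows are made up for the example, and correlation is only one of many possible checks.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return covariance / (spread_x * spread_y)

def feature_label_correlations(rows, labels):
    """Map each feature name to its correlation with the labels.

    `rows` is a list of dicts that all share the same feature names.
    """
    return {
        name: pearson([row[name] for row in rows], labels)
        for name in rows[0]
    }

# Hypothetical sessions: 0 = normal, 1 = malicious.
rows = [
    {"failed_logins": 0, "packet_size": 500},
    {"failed_logins": 1, "packet_size": 500},
    {"failed_logins": 8, "packet_size": 500},
    {"failed_logins": 9, "packet_size": 500},
]
labels = [0, 0, 1, 1]
print(feature_label_correlations(rows, labels))
```

Here `failed_logins` tracks the label closely while the constant `packet_size` carries no information at all, which is the kind of thing worth knowing before a model is asked to learn from either.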
