Cyberdefense mechanisms such as Network Intrusion Detection Systems predominantly use signature-based approaches to detect known malicious activities in network traffic. Unfortunately, constructing a signature database is very time-consuming, and this approach can only find previously seen variants. Machine learning algorithms are known to be effective at detecting known or unrelated novel intrusions, but whether they can also detect unseen variants has not been studied. In this research, we study to what extent binary classification models can accurately detect novel variants of cyberattacks targeting the application layer. More precisely, we focus on detecting variants of two types of intrusions, namely (Distributed) Denial-of-Service and Web attacks, targeting the Hypertext Transfer Protocol of a web server. We mathematically describe how two selected datasets are adjusted in three different experimental setups, and the results of the classification models deployed in these setups are benchmarked against the Dutch Draw baseline method. The contributions of this research are as follows. We provide a procedure to create intrusion detection datasets that combine information from the transport, network, and application layers and can be used directly for machine learning purposes. We show that specific variants are successfully detected by classification models trained to distinguish benign interactions from those of another variant. Despite this result, we demonstrate that the performance of the selected classifiers is not symmetric: the test score of a classifier trained on A and tested on B is not necessarily similar to the score of a classifier trained on B and tested on A. Finally, we show that increasing the number of different variants in the training set does not necessarily lead to a higher detection rate of unseen variants.
Selecting the right combination of a machine learning model with a (small) set of known intrusions included in the training data can result in a higher novel intrusion detection rate.

International Journal on Advances in Security
Stochastics

van de Bijl, E., Klein, J., Pries, J., van der Mei, R., & Bhulai, S. (2022). Detecting novel application layer cybervariants using supervised learning. International Journal on Advances in Security, 15, 75–85.