Cross-Datasets Evaluation of Machine Learning Models for Intrusion Detection Systems

Said Al-Riyami*, Alexei Lisitsa, Frans Coenen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The conventional way to evaluate the performance of machine learning models for intrusion detection systems (IDS) is to train and test on the same dataset. This approach can bias the results toward the computer network on which the traffic was generated, so the applicability of the learned models may not be adequately evaluated. We argued in Al-Riyami et al. (ACM, pp 2195-2197 [1]) that a better way is cross-dataset evaluation, in which two different datasets, generated on different networks, are used for training and testing. As shown in Al-Riyami et al. (ACM, pp 2195-2197 [1]), this method may lead to a significant drop in the performance of the learned model, which indicates that the models learn very little knowledge about intrusions that is transferable from one setting to another. The reasons for this behaviour were not fully understood in Al-Riyami et al. (ACM, pp 2195-2197 [1]). In this paper, we investigate the problem and show that the main reason is that the same feature is defined differently in the two datasets. We propose a correction and further empirically investigate cross-dataset evaluation for various machine learning methods. We also explore cross-dataset evaluation in the multiclass classification of attacks, and we show that, for most models, learning traffic normality is more robust than learning intrusions.
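The evaluation protocol described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the two features (`duration`, `rate`), the seconds-versus-milliseconds mismatch, and the nearest-centroid classifier are hypothetical stand-ins, and none of them are the datasets, features, or models used in the paper. The sketch only shows the shape of the comparison: same-dataset accuracy, cross-dataset accuracy, and cross-dataset accuracy after the feature definition is aligned.

```python
import random


def make_dataset(n, seed, duration_scale=1.0):
    """Synthetic flows as ((duration, rate), label); label 0 = benign, 1 = attack.

    duration_scale mimics a second dataset that records the same feature
    under a different definition (here: a different unit). Hypothetical.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Assumed pattern: attacks are short and bursty, benign flows are not.
        duration = max(0.0, rng.gauss(2.0 if label else 10.0, 1.0))
        rate = rng.gauss(50.0 if label else 10.0, 5.0)
        rows.append(((duration * duration_scale, rate), label))
    return rows


def fit_centroids(rows):
    """Nearest-centroid classifier: store one mean feature vector per class."""
    sums, counts = {}, {}
    for x, y in rows:
        s = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}


def predict(centroids, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def accuracy(centroids, rows):
    return sum(predict(centroids, x) == y for x, y in rows) / len(rows)


# "Network A": used for training and for the conventional same-dataset test.
train_a = make_dataset(500, seed=0)
test_a = make_dataset(500, seed=1)

# "Network B": same traffic pattern, but duration is recorded in milliseconds
# instead of seconds (an assumed definition mismatch).
test_b = make_dataset(500, seed=2, duration_scale=1000.0)

# Correction: convert B's duration back to the definition used in A.
test_b_fixed = [((d / 1000.0, r), y) for (d, r), y in test_b]

model = fit_centroids(train_a)
same = accuracy(model, test_a)
cross = accuracy(model, test_b)
cross_fixed = accuracy(model, test_b_fixed)

print(f"same-dataset accuracy:            {same:.3f}")
print(f"cross-dataset accuracy:           {cross:.3f}")
print(f"cross-dataset accuracy (aligned): {cross_fixed:.3f}")
```

In this toy setting the drop in cross-dataset accuracy comes entirely from the mismatched feature definition, so aligning the definition restores it. Real datasets differ in many other ways, and the sketch makes no claim about how large either effect is in practice.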

Original language: English
Title of host publication: Proceedings of 6th International Congress on Information and Communication Technology, ICICT 2021
Editors: Xin-She Yang, Simon Sherratt, Nilanjan Dey, Amit Joshi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 815-828
Number of pages: 14
ISBN (Print): 9789811621017
DOIs
Publication status: Published - 2022
Externally published: Yes
Event: 6th International Congress on Information and Communication Technology, ICICT 2021 - Virtual, Online
Duration: Feb 25 2021 to Feb 26 2021

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 217
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: 6th International Congress on Information and Communication Technology, ICICT 2021
City: Virtual, Online
Period: 2/25/21 to 2/26/21

Keywords

  • Machine learning
  • Model evaluation
  • Network intrusion detection system
  • Network security
  • Security and privacy

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Signal Processing
  • Computer Networks and Communications
