Abstract
Network Intrusion Detection Systems (NIDS) observe a network environment and aim to identify intrusions: malicious behaviors that compromise the integrity, confidentiality or availability of either the network data or the systems. NIDS can be classified into signature-based NIDS, which identify known intrusions by comparing the traffic with a knowledge base, and anomaly-based NIDS (AIDS), which aim to detect unknown intrusions as deviations from a model of normal traffic. AIDS mostly rely on machine learning techniques. Detecting rare events such as intrusions in an ever-changing network environment with a learned AIDS raises several major challenges. First, gathering representative network data with accurate labels is costly. Second, these data are highly imbalanced, since intrusions are rare events.
Finally, there is no guarantee that an AIDS trained on a network intrusion detection dataset will be useful for inference in a real NIDS. This thesis explores the capabilities of the Tangled Program Graphs (TPG) framework to act as an AIDS probe. TPG is a form of machine learning based on genetic programming that offers lightweight and versatile learning capabilities.