I have read a lot about intrusion detection systems, but I am now confused about where to start. I don't know whether any reusable open source code exists, but I want to build an intrusion detection and prevention system using a neural network.
From a developer's point of view, my question is: where should I begin? Kindly guide me on this topic.
I am also currently working with and analysing the KDD Cup 1999 dataset, and I am looking for more datasets like it.
Kindly tell me which algorithms are best suited to building an intrusion detection system.
Thanks in advance to whoever reads or replies.
Most intrusion detection systems that use neural networks rely on supervised training, i.e. the system prompts you for an opinion when certain changes are requested on its host. I suggest you start by finding out how to hook change requests. On Windows, that could involve using a system hook to filter certain actions requested by applications. This gives your app the option of prompting you for a response, and over time those responses are fed into the neural net. The resulting dataset can then be used to optimize the recognition of certain patterns and of your responses to those patterns. There are obviously more things to consider when building a system like this, but this should get you off to a good start.
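To make the feedback loop concrete, here is a minimal sketch of the learning half only. It assumes the hook already exists and hands you one feature vector per request together with your allow/deny answer. The three features, the example events, and all function names are hypothetical, and a single perceptron stands in for a real neural network.

```python
# Hypothetical sketch: learn from the user's allow/deny answers.
# A perceptron is a one-neuron network; a real IDS would need far more.

def predict(weights, bias, features):
    """Return 1 (deny / suspicious) or 0 (allow / benign)."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0

def update(weights, bias, features, label, lr=1.0):
    """One perceptron step: weights change only when the prediction is wrong."""
    error = label - predict(weights, bias, features)
    new_weights = [w + lr * error * x for w, x in zip(weights, features)]
    return new_weights, bias + lr * error

def train(feedback, n_features, epochs=20):
    """Replay the recorded (features, answer) pairs several times."""
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        for features, label in feedback:
            weights, bias = update(weights, bias, features, label)
    return weights, bias

# Assumed features: [writes_system_dir, modifies_autostart, is_signed]
# Label: 1 if the user denied the request, 0 if the user allowed it.
feedback = [
    ([1, 1, 0], 1),
    ([1, 0, 0], 1),
    ([0, 1, 0], 1),
    ([0, 0, 1], 0),
    ([1, 0, 1], 0),
    ([0, 0, 0], 0),
]

weights, bias = train(feedback, n_features=3)
predictions = [predict(weights, bias, f) for f, _ in feedback]
```

Once the predictions agree with your past answers often enough, the app can stop prompting for the cases it is confident about and only ask about unfamiliar ones.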
I study the same subject: intrusion detection and machine learning. It is a rather broad subject. I will answer mostly from the data pre-processing and feature construction point of view. The neural network part is a different story altogether.
First of all, this area is heavily commercialized, so there are almost no open source code examples. A lot of the work is done commercially in a closed ecosystem.
From an academic perspective, there is a big dataset problem. DK99C (the DARPA / KDD99 dataset) exists, but it is very old. The KDD99 dataset was constructed from the DARPA tcpdumps; the features were built using the Bro IDS and the tcpdump API. In my experience, creating features from raw tcpdump output is a lot harder than running machine learning algorithms (such as neural networks) on ready-made features.
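To give a feel for what feature construction means here, the sketch below derives two time-window features in the spirit of KDD99's count and srv_count from a list of already-parsed connection records. It assumes the hard part (turning raw packets into connection records) is done, counts only earlier connections inside a 2-second window, and uses field names I made up; the exact KDD99 definitions may differ.

```python
from collections import deque

def add_window_features(connections, window=2.0):
    """Add 'count' and 'srv_count' to each connection record.

    connections: list of dicts with 'time' (seconds), 'dst_host' and
    'service', sorted by time. These field names are assumptions.
    """
    recent = deque()  # earlier connections still inside the time window
    out = []
    for conn in connections:
        # Forget connections that started more than `window` seconds ago.
        while recent and conn["time"] - recent[0]["time"] > window:
            recent.popleft()
        # Earlier connections in the window to the same destination host.
        count = sum(1 for r in recent if r["dst_host"] == conn["dst_host"])
        # Earlier connections in the window to the same service.
        srv_count = sum(1 for r in recent if r["service"] == conn["service"])
        out.append({**conn, "count": count, "srv_count": srv_count})
        recent.append(conn)
    return out

connections = [
    {"time": 0.0, "dst_host": "10.0.0.5", "service": "http"},
    {"time": 0.5, "dst_host": "10.0.0.5", "service": "http"},
    {"time": 1.0, "dst_host": "10.0.0.9", "service": "http"},
    {"time": 1.5, "dst_host": "10.0.0.5", "service": "ssh"},
    {"time": 4.0, "dst_host": "10.0.0.5", "service": "http"},
]

featured = add_window_features(connections)
counts = [c["count"] for c in featured]          # [0, 1, 0, 2, 0]
srv_counts = [c["srv_count"] for c in featured]  # [0, 1, 2, 0, 0]
```

Even this toy version needs time-ordered, correctly reassembled connections as input, which is where most of the effort goes when you start from raw dumps.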
Read this article to learn more about how KDD99 was constructed.
Read this article and its presentation to learn why this subject is a hard problem to study.
Read this article to see how most academics work on this subject. A bit disappointing, really.
Read this to see why DK99C is considered harmful. It is harmful, but no other credible dataset exists.
Read this about the taxonomy of IDS data pre-processing.
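As a small illustration of what pre-processing involves before KDD99-style records can be fed to a neural network, the sketch below one-hot encodes a symbolic field and min-max scales numeric fields. protocol_type, duration and src_bytes are real KDD99 field names, but the three records and the helper functions are made up for the example, and this is only one of many possible pre-processing choices.

```python
def one_hot(value, categories):
    """Encode a symbolic value as a 0/1 vector over the known categories."""
    return [1.0 if value == c else 0.0 for c in categories]

def min_max_scale(column):
    """Scale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def preprocess(records, categorical, numeric):
    """Turn a list of dict records into purely numeric feature rows."""
    # Category lists are built from the data and sorted for a stable order.
    vocab = {name: sorted({r[name] for r in records}) for name in categorical}
    scaled = {
        name: min_max_scale([float(r[name]) for r in records])
        for name in numeric
    }
    rows = []
    for i, record in enumerate(records):
        row = []
        for name in categorical:
            row.extend(one_hot(record[name], vocab[name]))
        for name in numeric:
            row.append(scaled[name][i])
        rows.append(row)
    return rows

# Made-up records using three KDD99 field names.
records = [
    {"protocol_type": "tcp", "duration": 0, "src_bytes": 200},
    {"protocol_type": "udp", "duration": 5, "src_bytes": 100},
    {"protocol_type": "icmp", "duration": 10, "src_bytes": 0},
]

rows = preprocess(records, ["protocol_type"], ["duration", "src_bytes"])
# Column order: icmp, tcp, udp, duration, src_bytes
```

In practice the category lists and the min/max values should be computed on the training set only and then reused unchanged for test data.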