I have worked recently with the DARPA network traffic packets and the derived version of it used in KDD99 for intrusion detection evaluation.
Please excuse my limited domain knowledge of computer networks: I could only derive 9 features from the DARPA packet headers, not the 41 features used in KDD99.
I intend to continue my work on the UNB ISCX Intrusion Detection Evaluation DataSet. However, I want to derive the 41 features used in KDD99 from the pcap files and save them in CSV format. Is there a fast and easy way to achieve this?
Since this has already been done for KDD99, is there a library or converter that can do it for me? If not, is there a guide on how to derive these features from a pcap file?
Be careful with this data set.
http://www.kdnuggets.com/news/2007/n18/4i.html
Some excerpts:
As for the feature extraction used: IIRC, the majority of the features were simply attributes of the parsed IP/TCP/UDP headers, such as the port number, the last octet of the IP address, and some packet flags.
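To make the kind of header-level attribute mentioned above concrete, here is a minimal sketch using only the Python standard library. It is not the KDD99 extraction procedure and does not produce the 41 KDD99 features. It assumes you already have the raw bytes of an IPv4 packet with the link-layer header stripped (reading the pcap file itself is not shown), and the names `header_features`, `features_to_csv` and `FIELDS` are made up for this illustration.

```python
import csv
import io
import struct

# Column order for the CSV output (names are illustrative, not KDD99 names).
FIELDS = ["protocol", "total_length", "src_last_octet", "dst_last_octet",
          "src_port", "dst_port", "syn", "ack", "fin", "rst"]


def header_features(packet: bytes) -> dict:
    """Extract a few header fields from a raw IPv4 packet (no Ethernet header)."""
    if len(packet) < 20:
        raise ValueError("too short for an IPv4 header")
    (version_ihl, _tos, total_length, _ident, _frag, _ttl,
     protocol, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    if version_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    ihl = (version_ihl & 0x0F) * 4  # IP header length in bytes

    feats = dict.fromkeys(FIELDS)  # every field starts as None
    feats.update(protocol=protocol, total_length=total_length,
                 src_last_octet=src[3], dst_last_octet=dst[3])

    if protocol == 6 and len(packet) >= ihl + 14:  # TCP
        src_port, dst_port, _seq, _ack_no, off_flags = struct.unpack(
            "!HHIIH", packet[ihl:ihl + 14])
        feats.update(src_port=src_port, dst_port=dst_port,
                     fin=off_flags & 0x01,
                     syn=(off_flags >> 1) & 0x01,
                     rst=(off_flags >> 2) & 0x01,
                     ack=(off_flags >> 4) & 0x01)
    elif protocol == 17 and len(packet) >= ihl + 4:  # UDP
        src_port, dst_port = struct.unpack("!HH", packet[ihl:ihl + 4])
        feats.update(src_port=src_port, dst_port=dst_port)
    return feats


def features_to_csv(rows) -> str:
    """Serialise a list of feature dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

You would call `header_features` once per packet and pass the collected dicts to `features_to_csv`. Fields that do not apply to a packet (for example TCP flags on a UDP packet) are left as `None` and end up as empty CSV cells.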
As such, these findings no longer reflect realistic attacks anyway. Today's TCP/IP stacks are much more robust than they were when the data set was created, when a "ping of death" would instantly lock up a Windows host. Every developer of a TCP/IP stack should by now be aware of the risk of such malformed packets and stress-test the stack against them.
With this, these features have become pretty much meaningless. Incorrectly set SYN flags and the like are no longer used in network attacks. Current attacks are much more sophisticated and most likely no longer target the TCP/IP stack, but the services running on the next layer. So I would not bother finding out which low-level packet flags were used in that flawed '99 simulation of attacks that worked in the early '90s...