I need to read a PCAP file and modify some fields (specifically the IPv4 source and destination addresses and the Ethernet source and destination addresses). The PCAP is pre-filtered to include only IPv4-over-Ethernet packets.
Up to now I have tried to do this with scapy, which however has a severe memory problem. My 16 GB of RAM is completely filled when reading a ~350 MB PCAP file. That is just from reading; I had not done anything else with the file yet. I have also found this answer, and with these changes reading is very fast. But as soon as I start to modify the packets, memory usage balloons again. Scapy is in fact not usable in this context.
I also thought about using other tools such as tcprewrite, but it cannot serve my purposes. The source MAC is the same for every packet, so that part could be done with tcprewrite. The source IP, however, should be random within a given subnet range, for example uniformly distributed in 10.0.0.0/16, which is not so easy. Even more complicated is the destination IP, which needs to be calculated from a given traffic matrix.
So the question is: how can I read in a PCAP file, modify four basic fields (Ethernet src+dst, IP src+dst) with a custom function, and write it back to (another) PCAP file?
The rest of my framework is written in Python, so I would prefer a Python-based solution. However, as I could simply call other scripts, this is not mandatory. Thank you!
I don't know if there is a way to do that with scapy, but you could also use the very simple PcapFile.py library that lets you read/write pcap files packet by packet (disclaimer: I'm one of the authors). If your needs aren't too complicated (e.g. you don't need to re-generate checksums) you could simply modify the frame's bytestring using Python slicing and Python's struct module.
But I think it should also be possible to get scapy to parse the frame using p = Ether(packet_bytes) and convert it back to a bytestring for PcapFile.py using str(p) (bytes(p) on Python 3). This way you can let scapy re-calculate a valid checksum for you.
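The slicing approach described above can be sketched with the standard library alone. This is only an illustration under stated assumptions: the frame is untagged Ethernet II carrying IPv4 (as in the question), and the names rewrite_frame and ipv4_header_checksum are made up for this example, not part of PcapFile.py or scapy. It rewrites the four address fields and recomputes the IPv4 header checksum; TCP/UDP checksums, which also cover the IP addresses through the pseudo-header, are left untouched and would become stale.

```python
import socket
import struct

# Assumption: untagged Ethernet II, i.e. dst MAC (6) + src MAC (6) + EtherType (2).
ETH_HEADER_LEN = 14


def ipv4_header_checksum(header: bytes) -> int:
    """One's-complement checksum over the 16-bit words of an IPv4 header."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_frame(frame: bytes, eth_src: bytes, eth_dst: bytes,
                  ip_src: str, ip_dst: str) -> bytes:
    """Return a copy of the frame with new MAC and IPv4 addresses."""
    buf = bytearray(frame)
    buf[0:6] = eth_dst   # destination MAC comes first on the wire
    buf[6:12] = eth_src
    ip = ETH_HEADER_LEN
    ihl = (buf[ip] & 0x0F) * 4  # IPv4 header length in bytes
    buf[ip + 12:ip + 16] = socket.inet_aton(ip_src)
    buf[ip + 16:ip + 20] = socket.inet_aton(ip_dst)
    buf[ip + 10:ip + 12] = b"\x00\x00"  # zero the checksum before recomputing
    struct.pack_into("!H", buf, ip + 10,
                     ipv4_header_checksum(bytes(buf[ip:ip + ihl])))
    # Note: TCP/UDP checksums are NOT updated here.
    return bytes(buf)
```

Each frame read from the capture would be passed through rewrite_frame and the result written to the output file, so only one packet is held in memory at a time.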
Scapy seems to have "a severe memory problem", as you state, probably because you read the whole PCAP file into memory with rdpcap(), then modify it (still in memory), and then write it back to another file, all at once, from memory, with wrpcap().
But the most "Pythonic" and "Scapyist" way to do such a thing would be to use generators (PcapReader and PcapWriter), which process the capture one packet at a time. Here is an example:
from scapy.all import *

# Map original addresses to their replacements.
ETHER_ADDR_TRANSLATION = {
    "orig_mac_1": "new_mac_1",
    # [...]
}
IP_ADDR_TRANSLATION = {
    "orig_ip_1": "new_ip_1",
    # [...]
}

def addr_translation_pcap(source, destination):
    out = PcapWriter(destination)
    # PcapReader yields one packet at a time, so memory use stays flat.
    for pkt in PcapReader(source):
        # In case we have complex encapsulations, like IP-in-IP, etc.,
        # we have to do something like this. If we know for sure that's
        # not the case, there's no need for such a (time-consuming) code.
        layer = pkt
        while not isinstance(layer, NoPayload):
            if isinstance(layer, Ether):
                for field in ['src', 'dst']:
                    fval = getattr(layer, field)
                    if fval in ETHER_ADDR_TRANSLATION:
                        setattr(layer, field, ETHER_ADDR_TRANSLATION[fval])
            # Let's not forget IP-in-ICMP-error
            elif isinstance(layer, (IP, IPerror)):
                for field in ['src', 'dst']:
                    fval = getattr(layer, field)
                    if fval in IP_ADDR_TRANSLATION:
                        setattr(layer, field, IP_ADDR_TRANSLATION[fval])
            elif isinstance(layer, ARP):
                fields = {}
                if layer.hwtype == 1:  # Ethernet
                    fields.update({'hwsrc': ETHER_ADDR_TRANSLATION,
                                   'hwdst': ETHER_ADDR_TRANSLATION})
                if layer.ptype == 2048:  # 0x0800, IPv4
                    fields.update({'psrc': IP_ADDR_TRANSLATION,
                                   'pdst': IP_ADDR_TRANSLATION})
                # dict.items() works on Python 2 and 3 (iteritems() is Python 2 only)
                for field, translator in fields.items():
                    fval = getattr(layer, field)
                    if fval in translator:
                        setattr(layer, field, translator[fval])
            layer = layer.payload
        out.write(pkt)
    out.close()
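The dictionary lookups above can be swapped for any custom function. For the random source address mentioned in the question, a minimal sketch using only the standard library could look like this; random_ip_in_subnet is a hypothetical helper name, and the draw includes the network and broadcast addresses of the subnet, which may or may not be what you want.

```python
import ipaddress
import random


def random_ip_in_subnet(cidr, rng=random):
    """Return a uniformly drawn address from the given subnet as a string."""
    net = ipaddress.ip_network(cidr)
    # Indexing a network by offset yields the address at that position.
    return str(net[rng.randrange(net.num_addresses)])


# Possible use inside the IP branch of the loop (assumption, not tested
# against the capture in the question):
#     layer.src = random_ip_in_subnet("10.0.0.0/16")
#     del layer.chksum  # clear the stored value so scapy recomputes it on write
```

Packets read from a capture carry their original checksum values, so after changing an address the stored checksum generally has to be cleared for scapy to rebuild it; the same applies to the TCP and UDP checksums, which also cover the IP addresses.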