I'm new to pyshark. I'm trying to write a parser for custom UDP packets. I'm using the FileCapture
object to read packets from a file.
>>> cap = pyshark.FileCapture('sample.pcap')
>>> pkt = cap.next()
>>> pkt
<UDP/DATA Packet>
>>> pkt.data.data
'01ca00040500a4700500a22a5af20f830000b3aa000110da5af20f7c000bde1a000006390000666e000067f900000ba7000026ce000001d00000000100001726000100000000000000000000000017260500a4700500a22a608600250500a8c10500a22a608601310500a8c10500a22b608601200500a8cc0500a22a6086000c'
>>> dir(pkt.udp)
['DATA_LAYER', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__format__', '__getattr__', '__getattribute__', '__getstate__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_all_fields', '_field_prefix', '_get_all_field_lines', '_get_all_fields_with_alternates', '_get_field_or_layer_repr', '_get_field_repr', '_layer_name', '_sanitize_field_name', 'checksum', 'checksum_status', 'dstport', 'field_names', 'get', 'get_field', 'get_field_by_showname', 'get_field_value', 'layer_name', 'length', 'port', 'pretty_print', 'raw_mode', 'srcport', 'stream']
I need a way to access just the packet's UDP payload. The only way I have found to access raw packet data is pkt.data.data, but this returns the entire content of the packet, while I'm only interested in the UDP portion. I'm looking for something like pkt.udp.data. Is there a simple way to do that, or do I need to use pkt.data.data and calculate the offset at which my data is placed?
pyshark_parser might help you out:
https://github.com/jlents/pyshark_parser/blob/master/pyshark_parser/
I was looking at their code, and what you might be looking for is here:
https://github.com/jlents/pyshark_parser/blob/master/pyshark_parser/packet_util.py
def get_all_field_names(packet, layer=None):
    '''
    Builds a unique list of field names, that exist in the packet,
    for the specified layer.
    If no layer is provided, all layers are considered.
    Args:
        packet: the pyshark packet object the fields will be gathered from
        layer: the string name of the layer that will be targeted
    Returns:
        a set containing all unique field names
        or None, if packet is None
    '''
    if not packet:
        return None
    field_names = set()
    for current_layer in packet.layers:
        if not layer or layer == current_layer.__dict__['_layer_name']:
            for field in current_layer.__dict__['_all_fields']:
                field_names.add(field)
    return field_names
and
def get_value_from_packet_for_layer_field(packet, layer, field):
    '''
    Gets the value from the packet for the specified 'layer' and 'field'
    Args:
        packet: The packet where you'll be retrieving the value from
        layer: The layer that contains the field
        field: The field that contains the value
    Returns:
        the value at packet[layer][key] or None
        or None, if any of the arguments are None
    '''
    if not packet or not layer or not field:
        return None
    for current_layer in packet.layers:
        if layer == current_layer.__dict__['_layer_name'] and \
                current_layer.__dict__['_all_fields']:
            return current_layer.__dict__['_all_fields'][field]
    return None
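Those helpers reach into each layer's __dict__. The dir() output in the question also lists the public attributes layer_name and get_field_value, so a similar lookup can probably be written without touching private state. Here is a minimal sketch of that idea. The function name get_layer_field_value and the FakeLayer/FakePacket stand-ins are hypothetical, made up only so the snippet runs without a capture file; with a real capture you would pass the pyshark packet instead.

```python
def get_layer_field_value(packet, layer_name, field_name):
    """Return field_name from the first layer called layer_name, else None.

    Hypothetical helper: it relies only on `packet.layers`,
    `layer.layer_name` and `layer.get_field_value`.
    """
    if packet is None:
        return None
    for layer in packet.layers:
        if layer.layer_name == layer_name:
            return layer.get_field_value(field_name)
    return None


# Stand-ins that mimic only the attributes used above (not pyshark classes).
class FakeLayer:
    def __init__(self, layer_name, fields):
        self.layer_name = layer_name
        self._fields = fields

    def get_field_value(self, name):
        return self._fields.get(name)


class FakePacket:
    def __init__(self, layers):
        self.layers = layers


pkt = FakePacket([FakeLayer("udp", {"srcport": "5000", "dstport": "12345"})])
print(get_layer_field_value(pkt, "udp", "srcport"))  # 5000
print(get_layer_field_value(pkt, "tcp", "srcport"))  # None
```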
The only method I found to access raw packet data is pkt.data.data,
Correct.
but this returns the entire content of the packet while I'm only interested to UDP portion.
Incorrect. The .data.data attribute is a hex-string representation of just the UDP payload itself.
For example, if your UDP payload is the ASCII string "hello", you can retrieve it as such with: bytearray.fromhex(pkt.data.data).decode()
(You can easily test this yourself from a Bash console, e.g., with echo -n hello >/dev/udp/localhost/12345 while running a pyshark capture on the lo interface, port 12345.)
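For a custom binary protocol you will usually want bytes rather than text. The sketch below shows the hex-to-bytes conversion on literal strings, so it runs without pyshark. The helper name udp_payload_bytes is hypothetical. Stripping colons is a defensive assumption (the output in the question contains none), and reading the first four bytes as two big-endian 16-bit integers is only an illustration, not the actual layout of the asker's protocol.

```python
import struct


def udp_payload_bytes(hex_payload: str) -> bytes:
    """Convert a hex string such as pkt.data.data into raw bytes."""
    # Assumption: tolerate colon-separated hex ("68:65:6c...") just in case.
    return bytes.fromhex(hex_payload.replace(":", ""))


# The "hello" example from above, as the hex string pyshark would report.
print(udp_payload_bytes("68656c6c6f").decode("ascii"))  # hello

# First four bytes of the payload shown in the question.
header = udp_payload_bytes("01ca0004")
# Illustrative layout only: two unsigned 16-bit fields in network byte order.
first, second = struct.unpack("!HH", header)
print(first, second)  # 458 4
```

With a real capture, the call would be udp_payload_bytes(pkt.data.data).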