I recently began to use the scapy library for Python 2.x I found there to be minimal documentation on the sniff() function. I began to play around with it and found that I can veiw TCP packets at a very low level. So far I have only found informational data. For example:
Here is what I put in the scapy terminal:
A = sniff(filter="tcp and host 216.58.193.78", count=2)
This is a request to google.com asking for the homepage:
<Ether dst=e8:de:27:55:17:f3 src=00:24:1d:20:a6:1b type=0x800 |<IP version=4L ihl=5L tos=0x0 len=60 id=46627 flags=DF frag=0L ttl=64 proto=tcp chksum=0x2a65 src=192.168.0.2 dst=216.58.193.78 options=[] |<TCP sport=54036 dport=www seq=2948286264 ack=0 dataofs=10L reserved=0L flags=S window=29200 chksum=0x5a62 urgptr=0 options=[('MSS', 1460), ('SAckOK', ''), ('Timestamp', (389403, 0)), ('NOP', None), ('WScale', 7)] |>>>
Here is the response:
<Ether dst=00:24:1d:20:a6:1b src=e8:de:27:55:17:f3 type=0x800 |<IP version=4L ihl=5L tos=0x0 len=60 id=42380 flags= frag=0L ttl=55 proto=tcp chksum=0x83fc src=216.58.193.78 dst=192.168.0.2 options=[] |<TCP sport=www dport=54036 seq=3087468609 ack=2948286265 dataofs=10L reserved=0L flags=SA window=42540 chksum=0xecaf urgptr=0 options=[('MSS', 1430), ('SAckOK', ''), ('Timestamp', (2823173876, 389403)), ('NOP', None), ('WScale', 7)] |>>>
Using this function, is there a way that I can extract HTML code from the response?
Also, what do those packets look like?
And finaly, Why are both of these packets nearly identical?
Have you tried using scapy-http? It's a great scapy extension that helps with this exact issue
The segments in your example are "nearly identical" because they are the TCP SYN and SYN-ACK segments which are part of the TCP handshake, HTTP request and response comes after that during the connection (usually when in ESTABLISHED state except when TCP Fast Open option is used) so you need to look at segments after the handshake to get the data you are interested in.
You can use Scapy's
Raw
layer to extract data above TCP in your case (e.g.pkt[Raw]
)