Capturing performance with pcap vs raw socket

Posted 2019-03-14 02:28

Question:

When capturing network traffic for debugging, there seem to be two common approaches:

  1. Use a raw socket.

  2. Use libpcap.

Performance-wise, is there much difference between these two approaches? libpcap seems a nice compatible way to listen to a real network connection or to replay some canned data, but does that feature set come with a performance hit?

Answer 1:

This answer is intended to explain more about how libpcap works.

On Linux, libpcap uses a PF_PACKET socket to capture packets on an interface. See the kernel documentation: https://www.kernel.org/doc/Documentation/networking/packet_mmap.txt

From the above link:

In Linux 2.4/2.6/3.x if PACKET_MMAP is not enabled, the capture process is very inefficient. It uses very limited buffers and requires one system call to capture each packet, it requires two if you want to get packet's timestamp (like libpcap always does). In the other hand PACKET_MMAP is very efficient. PACKET_MMAP provides a size  configurable circular buffer mapped in user space that can be used to either send or receive packets. This way reading packets just needs to wait for them, most of the time there is no need to issue a single system call. Concerning transmission, multiple packets can be sent through one system call to get the highest bandwidth. By using a shared buffer between the kernel and the user also has the benefit of minimizing packet copies.
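To make the quoted description concrete, here is a minimal sketch of setting up a PACKET_MMAP receive ring directly on an AF_PACKET socket. The helper names `ring_geometry` and `open_rx_ring` are my own (hypothetical), and the sizes in the usage note are only illustrative. Opening the socket needs root or CAP_NET_RAW, so treat this as an outline under those assumptions rather than what libpcap literally does.

```c
#include <arpa/inet.h>        /* htons */
#include <assert.h>
#include <linux/if_ether.h>   /* ETH_P_ALL */
#include <linux/if_packet.h>  /* struct tpacket_req, PACKET_RX_RING */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: fill in the ring geometry.
 * The kernel expects tp_frame_nr == frames_per_block * tp_block_nr.
 * (It also wants block_size to be a multiple of the page size and
 * frame_size to be a multiple of TPACKET_ALIGNMENT; not checked here.)
 * Returns 0 on success, -1 if frame_size does not divide block_size. */
int ring_geometry(struct tpacket_req *req, unsigned block_size,
                  unsigned block_nr, unsigned frame_size)
{
    assert(req != NULL);
    if (frame_size == 0 || block_size % frame_size != 0)
        return -1;
    req->tp_block_size = block_size;
    req->tp_block_nr   = block_nr;
    req->tp_frame_size = frame_size;
    req->tp_frame_nr   = (block_size / frame_size) * block_nr;
    return 0;
}

/* Hypothetical helper: open an AF_PACKET socket and map its RX ring.
 * Returns the socket fd (and the mapping in *ring_out), or -1 on error,
 * for example when the process lacks CAP_NET_RAW. */
int open_rx_ring(const struct tpacket_req *req, void **ring_out)
{
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return -1;

    /* Ask the kernel to allocate the circular buffer. */
    if (setsockopt(fd, SOL_PACKET, PACKET_RX_RING, req, sizeof *req) < 0) {
        close(fd);
        return -1;
    }

    /* Map the buffer into user space; packets then arrive without a
     * per-packet system call. */
    size_t len = (size_t)req->tp_block_size * req->tp_block_nr;
    void *ring = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ring == MAP_FAILED) {
        close(fd);
        return -1;
    }
    *ring_out = ring;
    return fd;
}
```

A caller would typically do `ring_geometry(&req, 4096, 4, 2048)` followed by `open_rx_ring(&req, &ring)`, then poll() the fd and walk the frames in the mapping, handing each frame back to the kernel when done.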

The performance improvement may vary depending on which PF_PACKET implementation (for example, which TPACKET version) is used.

From https://www.kernel.org/doc/Documentation/networking/packet_mmap.txt:

It is said that TPACKET_V3 brings the following benefits:

  *) ~15 - 20% reduction in CPU-usage
  *) ~20% increase in packet capture rate
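If you drive PF_PACKET yourself, the TPACKET version is chosen with the PACKET_VERSION socket option before the ring is set up. The sketch below tries a caller-supplied preference list in order; `select_tpacket_version` is a hypothetical name of mine, and this is not libpcap's actual code.

```c
#include <assert.h>
#include <linux/if_packet.h>  /* TPACKET_V2, TPACKET_V3, PACKET_VERSION */
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: try each TPACKET version in `candidates` in
 * order and keep the first one the kernel accepts.
 * Returns the accepted version, or -1 if none could be set
 * (old kernel, bad fd, ...). Call this before PACKET_RX_RING. */
int select_tpacket_version(int fd, const int *candidates, size_t n)
{
    assert(candidates != NULL);
    for (size_t i = 0; i < n; i++) {
        int v = candidates[i];
        if (setsockopt(fd, SOL_PACKET, PACKET_VERSION, &v, sizeof v) == 0)
            return v;
    }
    return -1;
}
```

For example, passing `{ TPACKET_V3, TPACKET_V2 }` prefers V3 and falls back to V2. Note that with TPACKET_V3 the ring is described by struct tpacket_req3 rather than struct tpacket_req.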

The downsides of using libpcap:

  1. If an application needs to hold on to a packet, it has to make a copy of the incoming packet.

    See the pcap_next_ex man page:

    pcap_next_ex() reads the next packet and returns a success/failure indication. If the packet was read without problems, the pointer pointed to by the pkt_header argument is set to point to the pcap_pkthdr struct for the packet, and the pointer pointed to by the pkt_data argument is set to point to the data in the packet. The struct pcap_pkthdr and the packet data are not to be freed by the caller, and are not guaranteed to be valid after the next call to pcap_next_ex(), pcap_next(), pcap_loop(), or pcap_dispatch(); if the code needs them to remain valid, it must make a copy of them.

  2. There is a performance penalty if the application is only interested in incoming packets.

    PF_PACKET works as a tap in the kernel, i.e. all incoming and outgoing packets are delivered to the PF_PACKET socket. This results in an expensive call to packet_rcv for every outgoing packet. Because libpcap uses PF_PACKET, it captures both incoming and outgoing packets. If the application is only interested in incoming packets, the outgoing ones can be discarded by calling pcap_setdirection on the libpcap handle. Internally, libpcap discards the outgoing packets by checking the flags in the packet metadata. So in essence, outgoing packets are still seen by libpcap, only to be discarded later. This is a performance penalty for an application that is interested in incoming packets only.
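Both downsides can be sketched in one small function. The check on `sll_pkttype` is roughly the kind of user-space test that decides whether a frame was outgoing, and it only runs after the kernel has already delivered the frame. The copy is what you need if the packet must outlive the next read. `struct owned_packet` and `keep_incoming` are hypothetical names of mine; with libpcap you would pass in `hdr->ts`, `hdr->caplen`, `hdr->len` and the data pointer returned by pcap_next_ex().

```c
#include <assert.h>
#include <linux/if_packet.h>  /* struct sockaddr_ll, PACKET_OUTGOING */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical owned copy of one captured packet. */
struct owned_packet {
    struct timeval ts;      /* capture timestamp */
    uint32_t caplen;        /* bytes actually captured */
    uint32_t len;           /* original length on the wire */
    unsigned char data[];   /* caplen bytes of packet data */
};

/* Returns a heap copy of an incoming packet, or NULL if the packet was
 * outgoing or the allocation failed. The caller releases it with free(). */
struct owned_packet *keep_incoming(const struct sockaddr_ll *from,
                                   const struct timeval *ts,
                                   uint32_t caplen, uint32_t len,
                                   const unsigned char *data)
{
    assert(from != NULL && ts != NULL);

    /* The kernel has already done the work of delivering this frame;
     * discarding it here saves nothing on the kernel side. */
    if (from->sll_pkttype == PACKET_OUTGOING)
        return NULL;

    struct owned_packet *p = malloc(sizeof *p + caplen);
    if (p == NULL)
        return NULL;
    p->ts = *ts;
    p->caplen = caplen;
    p->len = len;
    if (caplen > 0)
        memcpy(p->data, data, caplen);  /* deep copy: source buffer may be reused */
    return p;
}
```

The copy is one allocation and one memcpy per retained packet, which is the cost the man page is warning about when it says the data is not guaranteed to stay valid.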



Answer 2:

A raw socket works at the IP level (OSI layer 3), pcap at the data link layer (OSI layer 2). So it's less a performance issue and more a question of what you want to capture. If performance is your main concern, look into PF_RING and similar; that's what current IDSs use for capturing.

Edit: raw sockets can work at either the IP level (AF_INET) or the data link layer (AF_PACKET), and pcap might actually use raw sockets itself; see "Does libpcap use raw sockets underneath them?"
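The difference between the two kinds of raw socket comes down to the address family passed to socket(). A minimal sketch, assuming Linux and root or CAP_NET_RAW; `open_raw` is a hypothetical helper, and IPPROTO_TCP is just an example protocol:

```c
#include <arpa/inet.h>       /* htons */
#include <assert.h>
#include <linux/if_ether.h>  /* ETH_P_ALL */
#include <netinet/in.h>      /* IPPROTO_TCP */
#include <sys/socket.h>

/* Hypothetical helper: open a raw socket at OSI layer 2 or 3.
 * Returns the fd, or -1 on error (typically EPERM without CAP_NET_RAW). */
int open_raw(int osi_layer)
{
    assert(osi_layer == 2 || osi_layer == 3);

    if (osi_layer == 3) {
        /* IP level: reads start at the IP header, one IP protocol only. */
        return socket(AF_INET, SOCK_RAW, IPPROTO_TCP);
    }
    /* Data link level: reads include the link-layer header, and
     * ETH_P_ALL selects every protocol. */
    return socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
}
```

The layer 2 variant is the one that corresponds to what the first answer describes for libpcap on Linux.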