Path of UDP packet in linux kernel

2019-01-21 11:59发布

I want to find the path of a UDP packet in the Linux kernel. For this, I want to read up on some documentation ( I have this so far, which is for TCP) and then have some printk statements in the relevant kernel functions to confirm that. I will do this by recompiling the kernel code.

Is this the way to go about it? Do you have any suggestions/references?

3条回答
Fickle 薄情
2楼-- · 2019-01-21 12:26

If you prefer a more visual way, try flame-grahps. Here is an example of UDP transmit flow (using netperf to transmit UDP packets): enter image description here

And here is the same graph zoomed-in on udp_send_skb: enter image description here

You can do the same for any relevant flow in the kernel. You can also search for specific functions or key-words and zoom in/out. This also gives you an idea of the heavier functions in the flow.

Hope this helps.

查看更多
Juvenile、少年°
3楼-- · 2019-01-21 12:32

Specifically answering your question, to understand UDP processing for IPv4 you can use ftrace, as is done here:

At the ingress (receiving side):

 96882  2)               |                                ip_local_deliver_finish() {
 96883  2)   0.069 us    |                                  raw_local_deliver();
 96884  2)               |                                  udp_rcv() {
 96885  2)               |                                    __udp4_lib_rcv() {
 96886  2)   0.087 us    |                                      __udp4_lib_lookup();
 96887  2)               |                                      __skb_checksum_complete_head() {
 96888  2)               |                                        skb_checksum() {
 96889  2)               |                                          __skb_checksum() {
 96890  2)               |                                            csum_partial() {
 96891  2)   0.161 us    |                                              do_csum();
 96892  2)   0.536 us    |                                            }
 96893  2)               |                                            csum_partial() {
 96894  2)   0.167 us    |                                              do_csum();
 96895  2)   0.523 us    |                                            }
 96896  2)               |                                            csum_partial() {
 96897  2)   0.158 us    |                                              do_csum();
 96898  2)   0.513 us    |                                            }
 96899  2)               |                                            csum_partial() {
 96900  2)   0.154 us    |                                              do_csum();
 96901  2)   0.502 us    |                                            }
 96902  2)               |                                            csum_partial() {
 96903  2)   0.165 us    |                                              do_csum();
 96904  2)   0.516 us    |                                            }
 96905  2)               |                                            csum_partial() {
 96906  2)   0.138 us    |                                              do_csum();
 96907  2)   0.506 us    |                                            }
 96908  2)   5.462 us    |                                          }
 96909  2)   5.840 us    |                                        }
 96910  2)   6.204 us    |                                      }

Another part of the tracing show below:

 98212  2)               |                              ip_rcv() {
 98213  2)               |                                ip_rcv_finish() {
 98214  2)   0.109 us    |                                  udp_v4_early_demux();
 98215  2)               |                                  ip_route_input_noref() {
 98216  2)               |                                    fib_table_lookup() {
 98217  2)   0.079 us    |                                      check_leaf.isra.8();
 98218  2)   0.492 us    |                                    }

And for egress of networking code, some snippets are extracted below:

 4)   0.547 us    |  udp_poll();
 4)               |  udp_sendmsg() {
 4)               |    udp_send_skb() {
 4)   0.387 us    |      udp_error [nf_conntrack]();
 4)   0.185 us    |      udp_pkt_to_tuple [nf_conntrack]();
 4)   0.160 us    |      udp_invert_tuple [nf_conntrack]();
 4)   0.151 us    |      udp_get_timeouts [nf_conntrack]();
 4)   0.145 us    |      udp_new [nf_conntrack]();
 4)   0.160 us    |      udp_get_timeouts [nf_conntrack]();
 4)   0.261 us    |      udp_packet [nf_conntrack]();
 4)   0.181 us    |      udp_invert_tuple [nf_conntrack]();
 4)   0.195 us    |      udp_invert_tuple [nf_conntrack]();
 4)   0.170 us    |      udp_invert_tuple [nf_conntrack]();
 4)   0.175 us    |      udp_invert_tuple [nf_conntrack]();
 4)               |      udp_rcv() {
 4) + 15.021 us   |        udp_queue_rcv_skb();
 4) + 18.829 us   |      }
 4) + 82.100 us   |    }
 4) + 92.415 us   |  }
 4)               |  udp_sendmsg() {
 4)               |    udp_send_skb() {
 4)   0.226 us    |      udp_error [nf_conntrack]();
 4)   0.150 us    |      udp_pkt_to_tuple [nf_conntrack]();
 4)   0.146 us    |      udp_get_timeouts [nf_conntrack]();
 4)   1.098 us    |      udp_packet [nf_conntrack]();
 4)               |      udp_rcv() {
 4)   1.314 us    |        udp_queue_rcv_skb();
 4)   3.282 us    |      }
 4) + 20.646 us   |    }

The above is called function graph in ftrace:

How to make a linux kernel function available to ftrace function_graph tracer?

And my bashscript for tracing udp are as follows (to be run as root):

#!/bin/bash

mkdir /debug
mount -t debugfs nodev /debug
mount -t debugfs nodev /sys/kernel/debug
echo udp_* >/debug/tracing/set_ftrace_filter
echo function_graph >/debug/tracing/current_tracer
echo 1 >/debug/tracing/tracing_on
sleep 20
echo 0 >/debug/tracing/tracing_on
cat /debug/tracing/trace > /tmp/tracing.out$$

Now the output file is locate inside the /tmp/tracing.out where is the shell script process. The purpose of 20 seconds is to allow userspace activities to happen - just starts lots of UDP activities at this point. You can also remove "echo udp_* >/debug/tracing/set_ftrace_filter" from above script, because the default is to trace everything.

查看更多
放我归山
4楼-- · 2019-01-21 12:44

The linux networking stack is a big piece of the kernel and you need to spend some time studying it. I think that this books may help (Focused on older kernels 2.4 and 2.6, but the logic remain the same for the latest kernels 3.x):

Understanding Linux Network Internals

The Linux Networking Architecture - Design and Implementation of Network Protocols in the Linux Kernel

You can also checkout this links:

http://e-university.wisdomjobs.com/linux/chapter-189-277/sending-the-data-from-the-socket-through-udp-and-tcp.html

http://www.linuxfoundation.org/collaborate/workgroups/networking/kernel_flow

http://wiki.openwrt.org/doc/networking/praxis

http://www.ibm.com/developerworks/linux/library/l-linux-networking-stack/?ca=dgr-lnxw01lnxNetStack

http://gicl.cs.drexel.edu/people/sevy/network/Linux_network_stack_walkthrough.html

You need also to browse the kernel source :

http://lxr.linux.no/#linux+v3.7.3/

Begin your road to the network sub-system with this function : ip_rcv which is called when a packet is received. other functions are then called (ip_rcv_finish, ip_local_deliver and ip_local_deliver_finish=> This function is responsible for choosing the good transport layer)

查看更多
登录 后发表回答