How can I measure the performance and TCP RTT of m

I created a basic TCP server that reads incoming binary data in protocol buffer format and writes a binary message as the response. I would like to benchmark the round-trip time.

I tried iperf, but could not make it send the same input file multiple times. Is there another benchmark tool that can send a binary input file repeatedly?

Answer 1:

If you have access to a Linux or Unix machine¹, you should use tcptrace. All you need to do is loop through your binary traffic test while capturing the traffic to a file with Wireshark or tcpdump.

After you have that .pcap file², analyze with tcptrace -xtraffic <pcap_filename>³. This will generate two text files, and the average RTT stats for all connections in that pcap are shown at the bottom of the one called traffic_stats.dat.

[mpenning@Bucksnort tcpperf]$ tcptrace -xtraffic willers.pcap
mod_traffic: characterizing traffic
1 arg remaining, starting with 'willers.pcap'
Ostermann's tcptrace -- version 6.6.1 -- Wed Nov 19, 2003

16522 packets seen, 16522 TCP packets traced
elapsed wallclock time: 0:00:00.200709, 82318 pkts/sec analyzed
trace file elapsed time: 0:03:21.754962
Dumping port statistics into file traffic_byport.dat
Dumping overall statistics into file traffic_stats.dat
Plotting performed at 15.000 second intervals
[mpenning@Bucksnort tcpperf]$
[mpenning@Bucksnort tcpperf]$ cat traffic_stats.dat


Overall Statistics over 201 seconds (0:03:21.754962):
4135308 ttl bytes sent, 20573.672 bytes/second
4135308 ttl non-rexmit bytes sent, 20573.672 bytes/second
0 ttl rexmit bytes sent, 0.000 bytes/second
16522 packets sent, 82.199 packets/second
200 connections opened, 0.995 conns/second
11 dupacks sent, 0.055 dupacks/second
0 rexmits sent, 0.000 rexmits/second
average RTT: 67.511 msecs        <------------------
[mpenning@Bucksnort tcpperf]$
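
If you repeat the capture several times, the average can be pulled out of traffic_stats.dat with a few lines of Python. This is only a minimal sketch: it assumes the "average RTT: ... msecs" line format shown in the output above, and parse_average_rtt_ms is a hypothetical helper name, not something tcptrace provides.

```python
import re


def parse_average_rtt_ms(stats_text):
    """Return the 'average RTT' value in milliseconds, or None if absent."""
    # Assumes the line looks like "average RTT: 67.511 msecs"
    match = re.search(r"average RTT:\s*([0-9.]+)\s*msecs", stats_text)
    return float(match.group(1)) if match else None


# Sample text copied from the tail of the traffic_stats.dat output above
sample = """\
11 dupacks sent, 0.055 dupacks/second
0 rexmits sent, 0.000 rexmits/second
average RTT: 67.511 msecs
"""
print(parse_average_rtt_ms(sample))
```

In practice you would pass in the contents of traffic_stats.dat instead of the sample string.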

The .pcap file used in this example was a capture I generated when I looped through an expect script that pulled data from one of my servers. This was how I generated the loop...

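#!/usr/bin/env python3
from subprocess import Popen, PIPE, DEVNULL
import time

# Launch the test script 200 times, one second apart
for ii in range(200):
    # willers.exp is an expect script; its output is discarded so that
    # an unread pipe can never fill up and stall the child process
    Popen(['./willers.exp'], stdin=PIPE, stdout=DEVNULL, stderr=DEVNULL)
    time.sleep(1)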

You can adjust the sleep time between loops based on your server's accept() performance and the duration of your tests.

END NOTES:

1. A Knoppix Live-CD will do
2. Filtered to only capture test traffic
3. tcptrace is capable of very detailed per-socket stats if you use other options...

================================
[mpenning@Bucksnort tcpperf]$ tcptrace -lr willers.pcap
1 arg remaining, starting with 'willers.pcap'
Ostermann's tcptrace -- version 6.6.1 -- Wed Nov 19, 2003

16522 packets seen, 16522 TCP packets traced
elapsed wallclock time: 0:00:00.080496, 205252 pkts/sec analyzed
trace file elapsed time: 0:03:21.754962
TCP connection info:
200 TCP connections traced:
TCP connection 1:
        host c:        myhost.local:44781
        host d:        willers.local:22
        complete conn: RESET    (SYNs: 2)  (FINs: 1)
        first packet:  Tue May 31 22:52:24.154801 2011
        last packet:   Tue May 31 22:52:25.668430 2011
        elapsed time:  0:00:01.513628
        total packets: 73
        filename:      willers.pcap
   c->d:                              d->c:
     total packets:            34           total packets:            39
     resets sent:               4           resets sent:               0
     ack pkts sent:            29           ack pkts sent:            39
     pure acks sent:           11           pure acks sent:            2
     sack pkts sent:            0           sack pkts sent:            0
     dsack pkts sent:           0           dsack pkts sent:           0
     max sack blks/ack:         0           max sack blks/ack:         0
     unique bytes sent:      2512           unique bytes sent:     14336
     actual data pkts:         17           actual data pkts:         36
     actual data bytes:      2512           actual data bytes:     14336
     rexmt data pkts:           0           rexmt data pkts:           0
     rexmt data bytes:          0           rexmt data bytes:          0
     zwnd probe pkts:           0           zwnd probe pkts:           0
     zwnd probe bytes:          0           zwnd probe bytes:          0
     outoforder pkts:           0           outoforder pkts:           0
     pushed data pkts:         17           pushed data pkts:         33
     SYN/FIN pkts sent:       1/1           SYN/FIN pkts sent:       1/0
     req 1323 ws/ts:          Y/Y           req 1323 ws/ts:          Y/Y
     adv wind scale:            6           adv wind scale:            1
     req sack:                  Y           req sack:                  Y
     sacks sent:                0           sacks sent:                0
     urgent data pkts:          0 pkts      urgent data pkts:          0 pkts
     urgent data bytes:         0 bytes     urgent data bytes:         0 bytes
     mss requested:          1460 bytes     mss requested:          1460 bytes
     max segm size:           792 bytes     max segm size:          1448 bytes
     min segm size:            16 bytes     min segm size:            32 bytes
     avg segm size:           147 bytes     avg segm size:           398 bytes
     max win adv:           40832 bytes     max win adv:           66608 bytes
     min win adv:            5888 bytes     min win adv:           66608 bytes
     zero win adv:              0 times     zero win adv:              0 times
     avg win adv:           14035 bytes     avg win adv:           66608 bytes
     initial window:           32 bytes     initial window:           40 bytes
     initial window:            1 pkts      initial window:            1 pkts
     ttl stream length:      2512 bytes     ttl stream length:        NA
     missed data:               0 bytes     missed data:              NA
     truncated data:            0 bytes     truncated data:            0 bytes
     truncated packets:         0 pkts      truncated packets:         0 pkts
     data xmit time:        1.181 secs      data xmit time:        1.236 secs
     idletime max:          196.9 ms        idletime max:          196.9 ms
     throughput:             1660 Bps       throughput:             9471 Bps

     RTT samples:              18           RTT samples:              24
     RTT min:                43.8 ms        RTT min:                 0.0 ms
     RTT max:               142.5 ms        RTT max:                 7.2 ms
     RTT avg:                68.5 ms        RTT avg:                 0.7 ms
     RTT stdev:              35.8 ms        RTT stdev:               1.6 ms

     RTT from 3WHS:          80.8 ms        RTT from 3WHS:           0.0 ms

     RTT full_sz smpls:         1           RTT full_sz smpls:         3
     RTT full_sz min:       142.5 ms        RTT full_sz min:         0.0 ms
     RTT full_sz max:       142.5 ms        RTT full_sz max:         0.0 ms
     RTT full_sz avg:       142.5 ms        RTT full_sz avg:         0.0 ms
     RTT full_sz stdev:       0.0 ms        RTT full_sz stdev:       0.0 ms

     post-loss acks:            0           post-loss acks:            0
     segs cum acked:            0           segs cum acked:            9
     duplicate acks:            0           duplicate acks:            1
     triple dupacks:            0           triple dupacks:            0
     max # retrans:             0           max # retrans:             0
     min retr time:           0.0 ms        min retr time:           0.0 ms
     max retr time:           0.0 ms        max retr time:           0.0 ms
     avg retr time:           0.0 ms        avg retr time:           0.0 ms
     sdv retr time:           0.0 ms        sdv retr time:           0.0 ms
================================

Answer 2:

You can always stick a shell loop around a program like iperf. Also, assuming iperf (or a program like ttcp) can read its input from a file, and thus from stdin, a shell loop could cat the file N times into iperf/ttcp.

If you want a program which sends a file, waits for your binary response, and then sends another copy of the file, you probably are going to need to code that yourself.
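
A minimal sketch of such a client in Python follows. It makes several assumptions that are not in the question: the response has a known fixed length (response_len), all requests share one persistent connection, and the names recv_exact and measure_rtts are made up for illustration. If your server closes the connection after each reply or uses length-prefixed framing, the loop needs to change accordingly.

```python
import socket
import time


def recv_exact(sock, nbytes):
    """Read exactly nbytes from sock, or raise if the peer closes early."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def measure_rtts(host, port, payload, response_len, count):
    """Send payload count times on one connection; return RTTs in seconds."""
    rtts = []
    with socket.create_connection((host, port)) as sock:
        # Disable Nagle's algorithm so small requests are sent immediately
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            start = time.perf_counter()
            sock.sendall(payload)              # send one copy of the file
            recv_exact(sock, response_len)     # wait for the full response
            rtts.append(time.perf_counter() - start)
    return rtts
```

To use it, read the input file in binary mode into payload and call, for example, measure_rtts("localhost", 1234, payload, 100, 200); the host, port, and sizes here are placeholders. Note that the timings are application-level round trips measured at the client, so they include the server's processing time as well as the network.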

Answer 3:

To get the round-trip time, you will need to measure the time in the client application, or monitor the network traffic leaving and arriving at the client to get the complete time interval. Measuring the time at the server will exclude any kernel-level delays in the server and all the network transmission times.

Answer 4:

Note that TCP performance will go down as the load goes up. If you're going to test under heavy load, you need professional tools that can scale to thousands (or even millions in some cases) of new connections per second or concurrently established TCP connections.

I wrote an article about this on my blog (feel free to remove if this is considered advertisement, but I think it's relevant to this thread): http://synsynack.wordpress.com/2012/04/09/realistic-latency-measurement-in-the-application-layers

Answer 5:

As a very simple high-level tool, netcat comes to mind. Something like time (nc hostname 1234 < input.binary | head -c 100) should work, assuming the response is 100 bytes long.