Getting TTFB (time till first byte) for an HTTP Re

Posted 2019-03-30 04:21

Question:

Here is a Python script that loads a URL and captures the response time:

import urllib2
import time

opener = urllib2.build_opener()
request = urllib2.Request('http://example.com')

start = time.time()
resp = opener.open(request)
resp.read()
ttlb = time.time() - start

Since my timer is wrapped around the whole request/response (including read()), this will give me the TTLB (time to last byte).

I would also like to get the TTFB (time to first byte), but am not sure where to start/stop my timing. Is urllib2 granular enough for me to add TTFB timers? If so, where would they go?

Answer 1:

You should use pycurl rather than urllib2.

  1. Install pycurl:
    You can use pip or easy_install, or install it from source.

    easy_install pyCurl

    You may need superuser privileges for this.

  2. Usage:

    import json
    import sys

    import pycurl

    URL = sys.argv[1]  # URL to measure, passed on the command line

    def main():
        c = pycurl.Curl()
        c.setopt(pycurl.URL, URL)                     # set the URL
        c.setopt(pycurl.FOLLOWLOCATION, 1)            # follow redirects
        # Discard the body; by default pycurl writes it to stdout.
        c.setopt(pycurl.WRITEFUNCTION, lambda data: None)
        c.perform()                                   # execute the request (returns None)
        # All timers are in seconds, measured from the start of the transfer.
        dns_time = c.getinfo(pycurl.NAMELOOKUP_TIME)  # DNS lookup finished
        conn_time = c.getinfo(pycurl.CONNECT_TIME)    # TCP three-way handshake finished
        starttransfer_time = c.getinfo(pycurl.STARTTRANSFER_TIME)  # time to first byte
        total_time = c.getinfo(pycurl.TOTAL_TIME)     # whole transfer (time to last byte)
        c.close()

        data = json.dumps({'dns_time': dns_time,
                           'conn_time': conn_time,
                           'starttransfer_time': starttransfer_time,
                           'total_time': total_time})
        return data

    if __name__ == "__main__":
        print(main())
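
The four timers collected above are cumulative: libcurl measures each one from the start of the transfer. If per-phase durations are more useful than running totals, they can be derived by subtraction. The helper below is only a sketch; the function name `phase_durations` and the phase labels are made up for illustration, and the "wait" phase as computed here also includes any TLS handshake and the time spent sending the request.

```python
def phase_durations(dns_time, conn_time, starttransfer_time, total_time):
    """Split cumulative libcurl timers (seconds) into per-phase durations.

    The phase names are illustrative labels, not libcurl terminology.
    """
    return {
        'dns': dns_time,                               # name lookup
        'connect': conn_time - dns_time,               # TCP connect after DNS
        'wait': starttransfer_time - conn_time,        # connect done -> first byte
        'download': total_time - starttransfer_time,   # first byte -> last byte
    }
```

For example, timers of 0.25, 0.5, 1.0 and 1.5 seconds would break down into 0.25 s of DNS, 0.25 s of connecting, 0.5 s of waiting for the first byte and 0.5 s of downloading.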
    


Answer 2:

Using your current open / read pair there's only one other timing point possible - between the two.

The open() call should be responsible for actually sending the HTTP request, and should (AFAIK) return as soon as that has been sent, ready for your application to actually read the response via read().

Technically it's probably the case that a long server response would make your application block on the call to read(), in which case this isn't TTFB.

However, if the amount of data is small, there won't be much difference between TTFB and TTLB anyway. For a large amount of data, just measure how long it takes for read() to return the smallest possible first chunk.
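
To see where the time actually goes, one option is to take a reading at each of the points discussed here: after open() returns, after the first one-byte read, and after the final read. The sketch below assumes Python 3, where urllib2's functionality lives in `urllib.request`; the function name `time_request` and the dictionary keys are hypothetical.

```python
import time
import urllib.request


def time_request(url, timeout=10):
    """Return elapsed seconds at three points of a single HTTP GET."""
    start = time.perf_counter()
    resp = urllib.request.urlopen(url, timeout=timeout)
    after_open = time.perf_counter() - start        # open() has returned
    first = resp.read(1)                            # smallest possible chunk
    after_first_byte = time.perf_counter() - start
    rest = resp.read()                              # drain the rest of the body
    after_last_byte = time.perf_counter() - start
    resp.close()
    return {
        'after_open': after_open,
        'after_first_byte': after_first_byte,
        'after_last_byte': after_last_byte,
        'body_bytes': len(first) + len(rest),
    }
```

Comparing `after_open` with `after_first_byte` against a slow server shows which of the two calls is really waiting for the response to start.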



Answer 3:

By default, the implementation of HTTP opening in urllib2 has no callbacks when a read is performed. The out-of-the-box opener for the HTTP protocol is urllib2.HTTPHandler, which uses httplib.HTTPResponse to do the actual reading via a socket.

In theory, you could write your own subclasses of HTTPResponse and HTTPHandler, and install them as the default opener in urllib2 using install_opener. This would be non-trivial, but not excruciatingly so if you basically copy and paste the current HTTPResponse implementation from the standard library and tweak its begin() method to perform some processing or a callback when reading from the socket begins.
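
A rough sketch of that subclassing idea follows. It assumes Python 3, where the relevant classes are `http.client.HTTPResponse` and `urllib.request.HTTPHandler`, and it overrides begin() rather than copying the whole implementation. All `Timed...` class names, the `first_byte_at` attribute and `fetch_with_ttfb` are hypothetical. Because the timestamp is taken once begin() has finished reading the status line and headers, it is an approximation of first-byte arrival, not an exact reading.

```python
import http.client
import time
import urllib.request


class TimedHTTPResponse(http.client.HTTPResponse):
    """HTTPResponse that notes when the status line and headers were read."""

    def begin(self):
        # begin() blocks until the status line and headers have arrived.
        super().begin()
        self.first_byte_at = time.perf_counter()


class TimedHTTPConnection(http.client.HTTPConnection):
    # getresponse() instantiates this class for every response.
    response_class = TimedHTTPResponse


class TimedHTTPHandler(urllib.request.HTTPHandler):
    def http_open(self, req):
        # Same as the stock handler, but with the timing connection class.
        return self.do_open(TimedHTTPConnection, req)


def fetch_with_ttfb(url, timeout=10):
    """Return (ttfb, ttlb, body) for a plain-HTTP GET of url."""
    opener = urllib.request.build_opener(TimedHTTPHandler())
    start = time.perf_counter()
    resp = opener.open(url, timeout=timeout)
    ttfb = resp.first_byte_at - start
    body = resp.read()
    ttlb = time.perf_counter() - start
    resp.close()
    return ttfb, ttlb, body
```

The opener is used directly here; passing it to `urllib.request.install_opener` would make it the default for `urlopen` instead. HTTPS URLs would need an equivalent pair of subclasses for the HTTPS connection and handler.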



Answer 4:

To get a good approximation, call read(1) and measure the time.

It works pretty well for me. The only thing to keep in mind: Python might load more than one byte on the call to read(1), depending on its internal buffers. But I think most tools are similarly inaccurate.

import urllib2
import time

opener = urllib2.build_opener()
request = urllib2.Request('http://example.com')

start = time.time()
resp = opener.open(request)
# read one byte
resp.read(1)
ttfb = time.time() - start
# read the rest
resp.read()
ttlb = time.time() - start