I am writing a Python application that queries social media APIs via cURL. With most of the servers I query (Google+, Reddit, Twitter, Facebook, and others), cURL complains:
additional stuff not fine transfer.c:1037: 0 0
The unusual thing is that when the application first starts, each service's response triggers this line once or twice. After a few minutes, the line appears several times per response. Obviously cURL is spotting something it doesn't like. After about half an hour, the servers begin to time out and the line is repeated many tens of times, so it is pointing at a real problem.
How might I diagnose this? I tried using Wireshark to capture the request and response headers to search for anomalies that might cause cURL to complain, but for all Wireshark's complexity there does not seem to be a way to isolate and display only the headers.
Here is the relevant part of the code:
import cStringIO  # imports the snippet needs to run
import pycurl

output = cStringIO.StringIO()
c = pycurl.Curl()
c.setopt(c.URL, url)
c.setopt(c.USERAGENT, 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) Gecko/20100101 Firefox/17.0')
c.setopt(c.WRITEFUNCTION, output.write)  # collect the response body in memory
c.setopt(c.CONNECTTIMEOUT, 10)
c.setopt(c.TIMEOUT, 15)
c.setopt(c.FAILONERROR, True)
c.setopt(c.NOSIGNAL, 1)  # so timeouts don't rely on signals in threaded code
try:
    c.perform()
    toReturn = output.getvalue()
    output.close()
    return toReturn
except pycurl.error, error:
    errno, errstr = error  # pycurl.error unpacks to (errno, message)
    print 'The following cURL error occurred: ', errstr
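In case it's useful for diagnosis, pycurl can also hand you the response headers and libcurl's own informational output directly, so Wireshark isn't strictly needed; a minimal sketch, reusing the same url variable as above:

import cStringIO
import pycurl

header_output = cStringIO.StringIO()
body_output = cStringIO.StringIO()

def debug(debug_type, debug_msg):
    # Type 0 is libcurl's informational text; a debug-enabled build should
    # deliver lines like "additional stuff not fine" here rather than stderr.
    if debug_type == 0:
        print 'curl info: %s' % debug_msg.strip()

c = pycurl.Curl()
c.setopt(c.URL, url)
c.setopt(c.WRITEFUNCTION, body_output.write)
c.setopt(c.HEADERFUNCTION, header_output.write)  # response headers only
c.setopt(c.VERBOSE, 1)                           # DEBUGFUNCTION requires VERBOSE
c.setopt(c.DEBUGFUNCTION, debug)
c.perform()
print header_output.getvalue()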
I disagree with this - I get the same message when attempting to call a website via a BIG-IP LTM external VIP address.
For example:
I call the website http://115.10.10.10/index.html (the IP address is made up in this case). The F5 BIG-IP should be load balancing the traffic to two internal web servers (172.20.0.10 and 172.20.0.11) via a pool associated with the virtual server.
In this case, the request coming from the external source (Internal Client) to the VIP address on TCP 80 should round robin between the two web servers. What I find is that each server receives the initial SYN packet, but a SYN-ACK never comes back.
If I sit on a terminal within the local subnet where the real servers reside, I can wget the index.html page fine, running from 172.20.0.11 against http://172.20.0.10/index.html.
Coming from external, I get the "additional stuff not fine transfer.c:1037: 0 0" message.
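A quick way to confirm where the handshake dies, without a full packet capture, is a plain TCP connect test run from each vantage point; a rough Python sketch, using the made-up addresses from above:

import socket

# If connect() returns, the three-way handshake completed; if it times out,
# the SYN-ACK never came back. The addresses are the placeholders above.
for host in ('115.10.10.10', '172.20.0.10', '172.20.0.11'):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(5)
    try:
        s.connect((host, 80))
        print '%s: handshake completed' % host
    except socket.error, e:
        print '%s: no connection (%s)' % (host, e)
    s.close()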
You are right in saying that it's a built-in debug mechanism for cURL in older revisions of the libcurl library, but I disagree with the statement below:
Whatever is causing this is down to one of two things: either a networking issue within the environment (i.e. the web servers cannot return the traffic to the original source, hence the error), or something wrong with the request header and the response back from the web server.
In this case I would say the first explanation is more likely: when I performed a curl using different URIs from a test host in the local subnet, I could retrieve the index.html web page fine. This implies that the server is listening and accepting connections under both the FQDN and the short name of the server.
I believe this error suggests that curl received a response it is unsure about, and it therefore produces the message above. Without developing curl or reading the source code, I cannot comment further.
Any additional response that questions this logic would be welcome - I'm always up for learning new things.
Andy
I'm 99.99% sure this is not actually in any HTTP headers, but is rather being printed to stderr by libcurl. Possibly this happens in the middle of you logging the headers, which is why you were confused.
Anyway, a quick search for "additional stuff not fine" curl transfer.c turned up a recent change in the source, whose description explains that the message was debug-only output that has since been removed. So, this is basically harmless, and the only reason you're seeing it is that you got a build of libcurl (probably from your linux distro) that had full debug logging enabled (despite the curl author thinking that's a bad idea). So you have three options:
1. Ignore it.
2. Upgrade to a later libcurl (you can check which build you have, as sketched below).
3. Rebuild libcurl without debug info.
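A quick way to see which libcurl build your pycurl is actually bound against, without leaving Python (the sample output in the comment is illustrative only):

import pycurl

# Prints something like "PycURL/7.19.0 libcurl/7.26.0 OpenSSL/1.0.1 zlib/1.2.7",
# which tells you whether your distro's libcurl is the debug-enabled revision.
print pycurl.version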
You can look at the libcurl source for transfer.c (as linked above) to try to understand what curl is complaining about, and possibly look for threads on the mailing list from around the same time, or just email the list and ask.
However, I suspect that may actually not be relevant to the real problem at all, given that you're seeing this even right from the start.
There are three obvious things that could be going wrong here:
1. There is a bug in your code, and the requests you make eventually go wrong.
2. There is a problem on your machine or network (e.g., a router, proxy, or NAT table failing under sustained load).
3. The services are throttling or blocking you for making too many requests.
The first one actually seems the least likely. If you want to rule it out, just capture all of the requests you make, and then write a trivial script that uses some other library to replay the exact same requests, and see if you get the same behavior. If so, the problem obviously can't be in the implementation of how you make your requests.
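A throwaway replay script could look something like this, with urllib2 standing in as the "other library" (the URL list and User-Agent are placeholders for whatever your app actually sends):

import urllib2

# Placeholders: substitute the exact URLs and headers your application sends.
urls = ['http://example.com/some/api/call']
ua = 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:17.0) Gecko/20100101 Firefox/17.0'

for url in urls:
    req = urllib2.Request(url, headers={'User-Agent': ua})
    try:
        resp = urllib2.urlopen(req, timeout=15)
        print url, '->', resp.getcode(), len(resp.read()), 'bytes'
    except Exception, e:
        print url, '-> failed:', e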
You may be able to distinguish between cases 2 and 3 based on the timing. If all of the services time out at once—especially if they all do so even when you start hitting them at different times (e.g., you start hitting Google+ 15 minutes after Facebook, and yet they both time out 30 minutes after you hit Facebook), it's definitely case 2. If not, it could be case 3.
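Collecting that timing evidence only takes a timestamp per service; a small sketch (the service labels are whatever names you use internally):

import time

first_hit = {}  # service name -> time of its first request

def record(service, succeeded):
    # Distinguishes "fails N minutes after I start hitting it" (per-service,
    # case 3) from "everything fails at the same wall-clock moment" (case 2).
    now = time.time()
    first_hit.setdefault(service, now)
    if not succeeded:
        print '%s failed %.0fs after its first request' % (service, now - first_hit[service])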
If you rule out all three of these, then you can start looking for other things that could be wrong, but I'd start here.
Or, if you tell us more about exactly what your app does (e.g., do you try to hit the servers over and over as fast as you can? do you try to connect on behalf of a slew of different users? are you using a dev key or an end-user app key? etc.), it might be possible for someone else with more experience with those services to guess.
Confirming this.
System info: Linux alt 3.2.0-4-amd64 #1 SMP Debian 3.2.63-2+deb7u1 x86_64 GNU/Linux
I've updated the curl library, and the continuous messages (which I was seeing while testing the Twitter REST API) have disappeared.
My newly updated curl --version data: