I'm trying to debug an HTTP request that produces an error on our server. However, I'm struggling to reproduce the exact request somehow.
In my nginx logs, I see this entry, which generated the error:
157.55.33.20 - - [22/Nov/2013:04:06:22 +0000] "GET /en/library/search?utf8=\xE2\x9C\x93&q=something HTTP/1.1" 500 0 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" "-"
However, using curl or httpie with the same string produces log entries like:
GET /en/library/search?utf8=\x5CxE2\x5Cx9C\x5Cx93&q=something
or
GET /en/library/search?utf8=xE2x9Cx93&q=something
or
GET /en/library/search?utf8=%5CxE2%5Cx9C%5Cx93&q=something
I just can't seem to reproduce the exact same request. I tried with various command-line parameters, but can't figure this one out.
Any suggestions on how to reproduce the exact same request?
You can create HTTP requests with greater freedom with telnet. To start a connection, type this at the command line:
telnet www.example.com 80
(Here, 80 corresponds to the port number for HTTP requests). Once you're connected, you'll see a message like this:
Trying 12.34.56.78...
Connected to www.example.com.
Escape character is '^]'.
You can then type in your request, e.g.:
GET /en/library/search?utf8=\xE2\x9C\x93&q=something HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Finish with two carriage returns to mark the end of the header.
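If typing into telnet by hand gets tedious, the same thing can be scripted. Below is a minimal sketch in Python, assuming the server speaks plain HTTP on port 80 and that you want the three bytes 0xE2 0x9C 0x93 sent unencoded, as the log entry suggests. The names build_request, send_request and TARGET are made up for illustration, and the Connection: close header is my own addition so the server closes the socket once it has replied.

```python
import socket


def build_request(host: str, raw_target: bytes) -> bytes:
    """Assemble a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [
        b"GET " + raw_target + b" HTTP/1.1",
        b"Host: " + host.encode("ascii"),
        b"User-Agent: Mozilla/5.0 (compatible; bingbot/2.0; "
        b"+http://www.bing.com/bingbot.htm)",
        b"Connection: close",  # assumption: not part of the logged request
    ]
    # Header lines end with CRLF; an empty line marks the end of the header.
    return b"\r\n".join(lines) + b"\r\n\r\n"


def send_request(host: str, raw_target: bytes, port: int = 80) -> bytes:
    """Send the request over a plain TCP socket and return the raw response."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(host, raw_target))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# The query string carries the raw bytes, not the literal text "\xE2".
TARGET = b"/en/library/search?utf8=\xe2\x9c\x93&q=something"
```

Calling send_request("www.example.com", TARGET) should then put the raw bytes on the wire without any client-side escaping. Nothing is sent until you call it.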
EDIT: lcornea's suggestion is also worth considering. If your command-line terminal accepts UTF-8 (type echo $LANG to find out), then type (or paste) the character ✓ (check mark) instead of the three escaped bytes. Or, on a Windows Latin-1 terminal, type âœ“ instead.
It looks to me as if \xE2\x9C\x93 are actual UTF-8 bytes that your server escaped so it could write them to the log. Maybe look up which character they represent in UTF-8 and put that character in the URL.
Hope it helps
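To check which character those bytes stand for, here is a small Python sketch. It assumes the log uses nginx's default escaping, where bytes outside printable ASCII are written as \xHH. The function name unescape_nginx_log is hypothetical.

```python
import re


def unescape_nginx_log(text: str) -> bytes:
    """Turn \\xHH escapes from an access-log entry back into raw bytes."""

    def to_byte(match):
        # match.group(1) holds the two hex digits, e.g. b"E2"
        return bytes([int(match.group(1), 16)])

    return re.sub(rb"\\x([0-9A-Fa-f]{2})", to_byte, text.encode("ascii"))


logged = r"utf8=\xE2\x9C\x93&q=something"
raw = unescape_nginx_log(logged)
print(raw.decode("utf-8"))  # utf8=✓&q=something
```

So the three escapes are the UTF-8 encoding of U+2713, the check mark.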
Instead of \xYZ use %YZ, and the server will decode it to the same byte. For example, replace \xE2 with %E2 in the curl request.
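The substitution can be done mechanically. This is a rough Python sketch, with a made-up helper name log_escapes_to_percent, that rewrites the logged path into its percent-encoded form. Note the assumption: the request line on the wire will contain %E2%9C%93 rather than the raw bytes, so the access log will not look identical, but the decoded parameter should be.

```python
import re
from urllib.parse import unquote_to_bytes


def log_escapes_to_percent(text: str) -> str:
    """Rewrite every \\xHH escape in a logged URL as %HH."""
    return re.sub(r"\\x([0-9A-Fa-f]{2})", r"%\1", text)


logged_path = r"/en/library/search?utf8=\xE2\x9C\x93&q=something"
encoded_path = log_escapes_to_percent(logged_path)
print(encoded_path)  # /en/library/search?utf8=%E2%9C%93&q=something

# Decoding the percent-escapes gives back the three original bytes.
decoded = unquote_to_bytes("%E2%9C%93")
```

The printed path can be pasted after the host name in a curl command.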
Have you tried: curl -s -o /dev/null "http://yoursite.com/en/library/search?utf8=\xE2\x9C\x93&q=something"?