My goal is to parse HTML/XML data from a password-protected page and then, based on that data (a timestamp), send XML commands to another device. The page I am trying to access is served by the web server built into an IP device.
If this would be easier to accomplish in another language, please let me know.
I have very little programming experience (one C programming class).
I have tried using Requests with both Basic and Digest auth. I still can't get authenticated, which is stopping me from getting any further.
Here is my attempt:
import requests
from requests.auth import HTTPDigestAuth
url='http://myUsername:myPassword@example.com/cgi/metadata.cgi?template=html'
r = requests.get(url, auth=HTTPDigestAuth('myUsername', 'myPassword'))
r.status_code
print(r.headers)
print(r.status_code)
Output:
401
CaseInsensitiveDict({'Content-Length': '0', 'WWW-Authenticate': 'Digest realm="the realm of device", nonce="23cde09025c589f05f153b81306928c8", qop="auth"', 'Server': 'Device server name'})
I have also tried Basic auth with Requests and get the same output. I have tried both with and without user:pass@ in the URL. When I enter that URL in my browser, though, it works.
I thought that Requests handled the header data for Digest/Basic auth, but maybe I need to include headers as well?
I used Live HTTP Headers (Firefox) and got this:
GET /cgi/metadata.cgi?template=html HTTP/1.1
Host: [Device IP]
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
Connection: keep-alive

HTTP/1.1 401 Unauthorized
WWW-Authenticate: Digest realm="Device Realm", nonce="a2333eec4cce86f78016343c48382d21", qop="auth"
Server: Device Server
Content-Length: 0
The two requests are independent:
r = requests.get(url, auth=HTTPDigestAuth('user', 'pass'))
response = requests.get(url) #XXX <-- DROP IT
The second request does not send any credentials, so it is not surprising that it receives a 401 Unauthorized HTTP response status.
To fix it:
- Use the same url as you use in your browser. Drop digest-auth/auth/user/pass at the end; it is just an example in the requests docs.
- Print r.status_code instead of response.status_code to see whether it succeeded.
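As background on why a bare 401 with a WWW-Authenticate header is not an error in itself: with Digest auth the server first answers with a challenge (realm, nonce, qop), and the client then repeats the request with an Authorization header computed from that challenge. HTTPDigestAuth does this retry for you. Below is a minimal sketch of the qop="auth" calculation with MD5 as described in RFC 2617. The function names are mine, and the input values are the worked example from the RFC, not values from your device.

```python
import hashlib


def md5_hex(text):
    """Return the hex MD5 digest of a text string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, password, realm, nonce,
                    method, uri, nc, cnonce, qop="auth"):
    """Compute the 'response' field of a Digest Authorization header.

    Covers only the MD5 / qop="auth" case (RFC 2617); other algorithms
    and qop values are not handled in this sketch.
    """
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # identity part
    ha2 = md5_hex(f"{method}:{uri}")                 # request part
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


# Input values taken from the worked example in RFC 2617, section 3.5.
example = digest_response(
    username="Mufasa",
    password="Circle Of Life",
    realm="testrealm@host.com",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    method="GET",
    uri="/dir/index.html",
    nc="00000001",
    cnonce="0a4f113b",
)
print(example)
```

You should not need to build this header by hand; the point is only that the nonce comes from the 401 response, so the first unauthenticated round trip is expected.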
Why would you put the username/password both in the URL and in the auth parameter? Drop the username/password from the URL. To see the request that is sent and the response headers, you can enable logging/debugging:
import logging
import requests
from requests.auth import HTTPDigestAuth

# These two lines enable debugging at the httplib level (requests -> urllib3 -> httplib).
# You will see the REQUEST, including HEADERS and DATA,
# and the RESPONSE with HEADERS but without DATA.
# The only thing missing will be the response body, which is not logged.
try:
    import httplib  # Python 2
except ImportError:
    import http.client as httplib  # Python 3
httplib.HTTPConnection.debuglevel = 1
logging.basicConfig(level=logging.DEBUG)  # you need to initialize logging,
                                          # otherwise you will not see anything from requests

# make request
url = 'https://example.com/cgi/metadata.cgi?template=html'
r = requests.get(url, auth=HTTPDigestAuth('myUsername', 'myPassword'),
                 timeout=10)
print(r.status_code)
print(r.headers)
import requests
from requests.auth import HTTPDigestAuth
url='https://example.com/cgi/metadata.cgi?template=html'
r = requests.get(url, auth=HTTPDigestAuth('myUsername', 'myPassword'), verify=False, stream=True)
print(r.headers)
print(r.status_code)
Fixed by adding stream=True, since the page streams XML/HTML data continuously. My next question is: how do I store/parse a constant stream of data?
I tried reading r.content, but it seems to run indefinitely (the same problem I had before).
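On the follow-up: r.content tries to read the whole body, which never finishes on an endless stream. One possible approach is to read the response in chunks with r.iter_content() and cut the bytes into records yourself. The sketch below assumes the device separates records with a newline, which I have not verified; iter_records, stream_records, the delimiter and max_records are all hypothetical names and values for illustration.

```python
def iter_records(chunks, delimiter=b"\n"):
    """Yield complete delimiter-terminated records from an iterable of byte chunks.

    A trailing partial record stays in the buffer and is not yielded.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while delimiter in buffer:
            record, buffer = buffer.split(delimiter, 1)
            if record:  # skip empty records (e.g. blank lines)
                yield record


def stream_records(url, username, password, max_records=10):
    """Collect the first max_records records from a streaming endpoint."""
    # Imported here so iter_records can be used without Requests installed.
    import requests
    from requests.auth import HTTPDigestAuth

    records = []
    with requests.get(url, auth=HTTPDigestAuth(username, password),
                      stream=True, timeout=10) as r:
        r.raise_for_status()
        # iter_content yields raw byte chunks as they arrive.
        for record in iter_records(r.iter_content(chunk_size=1024)):
            records.append(record)
            if len(records) >= max_records:
                break  # stop reading; the with-block closes the connection
    return records


# Offline demonstration with fake chunks in place of a live response.
demo_chunks = [b"<a>1</a>\n<b>", b"2</b>\n<c>3"]
print(list(iter_records(demo_chunks)))
```

Each complete record can then be handed to an XML/HTML parser to pull out the timestamp. If the device does not use newlines between records, the delimiter would need to change to whatever actually marks the end of one record.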