可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

My goal here is to be able to parse html/xml data from a password protected page then based on that data (a timestamp) I need to send xml commands to another device. The page I am trying to access is a webserver generated by an IP device. Also, if this would be easier to accomplish in another language please let me know. I have very little experience programming (one C programming class)

I have tried using Requests for Basic and Digest Auth. I still can't get authenticated, which is stopping me from getting any further.

Here are my attempts:

import requests
from requests.auth import HTTPDigestAuth

url='http://myUsername:myPassword@example.com/cgi/metadata.cgi?template=html'
r = requests.get(url, auth=HTTPDigestAuth('myUsername', 'myPassword'))        
r.status_code

print(r.headers) 
print(r.status_code)

Output:

401 
CaseInsensitiveDict({'Content-Length': '0', 'WWW-Authenticate': 'Digest realm="the realm of device", nonce="23cde09025c589f05f153b81306928c8", qop="auth"', 'Server': 'Device server name'})

I have also tried BasicAuth with Requests and get the same output. I have tried both including the user:pass@ within the url and not. Although when I put that input that into my browser it works.

I thought that requests handled header data for Digest/BasicAuth but maybe I need to include headers also?

I used Live HTTP Headers(firefox) and got this:

GET /cgi/metadata.cgi?template=html
HTTP/1.1 
Host: [Device IP] 
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8 Accept-Language: en-US,en;q=0.5 
Accept-Encoding: gzip, deflate DNT: 1 Connection: keep-alive
HTTP/1.1 401 Unauthorized WWW-Authenticate: Digest realm="Device Realm", nonce="a2333eec4cce86f78016343c48382d21", 
qop="auth" 
Server: Device Server Content-Length: 0

回答1:

The two requests are independent:

r = requests.get(url, auth=HTTPDigestAuth('user', 'pass')) 
response = requests.get(url) #XXX <-- DROP IT

The second request does not send any credentials. Therefore it is not surprising that it receives 401 Unauthorized http response status.

To fix it:

Use the same url as you use in your browser. Drop digest-auth/auth/user/pass at the end. It is just an example in the requests docs
Print r.status_code instead of response.status_code to see whether it's succeeded.

Why would you use username/password in the url and in auth parameter? Drop username/password from the url. To see the request that is sent and the response headers, you could enable logging/debugging:

import logging
import requests
from requests.auth import HTTPDigestAuth

# these two lines enable debugging at httplib level (requests->urllib3->httplib)
# you will see the REQUEST, including HEADERS and DATA, 
# and RESPONSE with HEADERS but without DATA.
# the only thing missing will be the response.body which is not logged.
try:
    import httplib
except ImportError:
    import http.client as httplib

httplib.HTTPConnection.debuglevel = 1

logging.basicConfig(level=logging.DEBUG) # you need to initialize logging, 
                      # otherwise you will not see anything from requests

# make request
url = 'https://example.com/cgi/metadata.cgi?template=html'
r = requests.get(url, auth=HTTPDigestAuth('myUsername', 'myPassword'),
                 timeout=10)
print(r.status_code)
print(r.headers)

回答2:

import requests
from requests.auth import HTTPDigestAuth

url='https://example.com/cgi/metadata.cgi?template=html'
r = requests.get(url, auth=HTTPDigestAuth('myUsername', 'myPassword'), verify=False,  stream=True)        


print(r.headers) 
print(r.status_code)

fixed with adding stream=True since the page is streaming xml/html data. My next questions is, how do I store/parse a constant stream of data?

I tried storing in r.content but it seems to run indefinitely (the same problem I had before)